Agent Context — GSR Transformer
Read this first before doing any work in bankofus / GSR Transformer.
Repository
- Path: /Users/levander/coding/scharge/bankofus
- Branch: main
- Build: tsup, ESM output
- Standalone ingest entrypoint (for K8s CronJob) is wired in tsup config
Key Files
| Concern | File |
|---|---|
| Datasource definitions | src/targets.ts |
| Generic CSV to Prisma writer | src/datasource/tracked-ds.ts |
| Prisma schema | prisma/schema.prisma |
| Prisma generated client | src/prisma/ (output dir) |
| Reflection helper | src/util/prisma.ts (dynDispatchPrismaCall) |
| Env loader | src/util/env-loader.ts (loadEnv) |
| Logger | src/util/logger.ts (pino) |
| Helm chart | helm/gsr-transformer/ |
Core Abstraction: TrackedTabularDataSource
The central CSV to Prisma writer. Generic over Models[T] from Prisma’s typemap.
Constructor signature:
new TrackedTabularDataSource(
  tablenameAndId: T,                  // e.g. "monthlyPerformanceDs"
  path: string | (file => boolean),   // fixed path or filter fn
  headerMapping: Record<DBField, [CSVHeader, HeaderType]>,
  dataSourceConfig: { type: "static" | "timeseries", deletePrevious?: boolean, routes?: ... }
)
HeaderType is one of: Int, Float, DateTime, String, Boolean, Int?, Float?, DateTime?, String?, Boolean?. The trailing ? is parsed at runtime via endsWith("?") in coerceCell().
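As a rough illustration of how the HeaderType strings could be modeled, the sketch below defines the union and splits off the trailing ? at runtime. The names BaseType and parseHeaderType are hypothetical and are not taken from src/datasource/tracked-ds.ts; the CSV header names in the example mapping are invented as well.

```typescript
// Hypothetical modeling of HeaderType; names are illustrative only.
type BaseType = "Int" | "Float" | "DateTime" | "String" | "Boolean";
type HeaderType = BaseType | `${BaseType}?`;

interface ParsedHeaderType {
  base: BaseType;
  nullable: boolean;
}

// Split the trailing "?" off at runtime, mirroring the endsWith("?") check.
function parseHeaderType(t: HeaderType): ParsedHeaderType {
  const nullable = t.endsWith("?");
  const base = (nullable ? t.slice(0, -1) : t) as BaseType;
  return { base, nullable };
}

// Example mapping in the shape Record<DBField, [CSVHeader, HeaderType]>.
// The CSV header strings here are made up for illustration.
const exampleMapping: Record<string, [string, HeaderType]> = {
  fundName: ["Fund Name", "String"],
  oneMonth: ["1 Month", "String?"],
  asOfDate: ["As Of Date", "DateTime"],
};
```

Typing the mapping values as HeaderType lets the compiler reject typos such as "Strng?" before they reach the runtime check.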
Ingest flow (track())
- loadEnv() — re-read .env
- expandPaths(ftp, baseDir) — resolve glob/filter to real paths
- For each file:
  - download as string, trim, SHA-256 hash
  - Papa.parse(file, { skipEmptyLines: 'greedy' })
  - dedup check: skip if any row in target table already has this fileHash
  - map each row through coerceCell (per field) into a prismaArgObject
  - createMany via dynDispatchPrismaCall
  - optional deletePrevious: deleteMany({ where: { NOT: { fileHash } } })
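The hash, dedup and write steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in tracked-ds.ts: RowStore, hashTrimmed, ingestFile and the toRows callback are hypothetical stand-ins for the Prisma delegate, the hashing step, track() and the PapaParse + coerceCell mapping.

```typescript
import { createHash } from "node:crypto";

// Hypothetical store interface standing in for the Prisma model delegate.
interface RowStore<Row extends { fileHash: string }> {
  count(where: { fileHash: string }): Promise<number>;
  createMany(rows: Row[]): Promise<void>;
  deleteManyNotHash(fileHash: string): Promise<void>;
}

// SHA-256 of the already-trimmed file contents, hex encoded.
function hashTrimmed(trimmed: string): string {
  return createHash("sha256").update(trimmed, "utf8").digest("hex");
}

async function ingestFile<Row extends { fileHash: string }>(
  raw: string,
  // Stand-in for "parse CSV, then coerce each cell"; supplied by the caller.
  toRows: (trimmed: string, fileHash: string) => Row[],
  store: RowStore<Row>,
  deletePrevious = false,
): Promise<"skipped" | "ingested"> {
  const trimmed = raw.trim();
  const fileHash = hashTrimmed(trimmed);

  // Dedup check: any existing row with this hash means the file was seen.
  if ((await store.count({ fileHash })) > 0) return "skipped";

  await store.createMany(toRows(trimmed, fileHash));

  // Optional cleanup of rows that came from earlier files.
  if (deletePrevious) await store.deleteManyNotHash(fileHash);
  return "ingested";
}
```

Because the contents are trimmed before hashing, two downloads that differ only in leading or trailing whitespace count as the same file.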
Coercion (coerceCell)
- Trim raw value
- Empty string or "-" returns null if nullable, throws if required
- Int|Float: Number(trimmed.replace("%", ""))
- DateTime: new Date(trimmed)
- Boolean: ["Y", "y", "1"].includes(trimmed)
- else: return trimmed string
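Put together, the rules above amount to something like the following. It is a reconstruction from the bullet list, not a copy of the real coerceCell(): the parameter names, the Coerced type and the error message wording are assumptions.

```typescript
type BaseType = "Int" | "Float" | "DateTime" | "String" | "Boolean";
type HeaderType = BaseType | `${BaseType}?`;
type Coerced = number | Date | boolean | string | null;

// Sketch of the coercion rules; the signature is assumed, not the real one.
function coerceCell(raw: string | undefined, type: HeaderType, field: string): Coerced {
  const nullable = type.endsWith("?");
  const base = nullable ? type.slice(0, -1) : type;
  const trimmed = (raw ?? "").trim();

  // Empty string and the "-" sentinel both mean "no data".
  if (trimmed === "" || trimmed === "-") {
    if (nullable) return null;
    throw new Error(`empty value for required field ${field}`);
  }

  switch (base) {
    case "Int":
    case "Float":
      // Strips the first "%" only; invalid input yields NaN, not an error.
      return Number(trimmed.replace("%", ""));
    case "DateTime":
      // Invalid input yields an Invalid Date object, not an error.
      return new Date(trimmed);
    case "Boolean":
      return ["Y", "y", "1"].includes(trimmed);
    default:
      return trimmed;
  }
}
```

For example, coerceCell("1.23%", "Float", "oneYear") yields the number 1.23, and coerceCell("-", "Float?", "oneYear") yields null (the field name oneYear is made up here).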
Latent issues in coerceCell
- Number("abc") returns NaN, new Date("abc") returns Invalid Date; both pass through silently
- parsed.errors from PapaParse is never inspected
- Missing CSV column means findIndex returns -1, so row[-1] is undefined and the failure surfaces as a misleading "empty value for required field" error
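One possible way to close these gaps is a set of small guards that fail loudly. The helper names below (parseNumberStrict, parseDateStrict, requireColumnIndex, assertNoParseErrors) are hypothetical and do not exist in the repository; the ParseErrorLike shape covers only the message and row fields that PapaParse reports on each entry of parsed.errors.

```typescript
// Reject NaN instead of letting it flow into createMany.
function parseNumberStrict(trimmed: string, field: string): number {
  const n = Number(trimmed.replace("%", ""));
  if (Number.isNaN(n)) throw new Error(`field ${field}: "${trimmed}" is not a number`);
  return n;
}

// Reject Invalid Date, whose getTime() is NaN.
function parseDateStrict(trimmed: string, field: string): Date {
  const d = new Date(trimmed);
  if (Number.isNaN(d.getTime())) throw new Error(`field ${field}: "${trimmed}" is not a date`);
  return d;
}

// Fail with a precise message when a mapped header is absent from row 0.
function requireColumnIndex(headerRow: string[], headerName: string): number {
  const idx = headerRow.findIndex((x) => x === headerName);
  if (idx === -1) throw new Error(`CSV header "${headerName}" not found`);
  return idx;
}

// Minimal view of a PapaParse error entry (subset of its fields).
interface ParseErrorLike {
  message: string;
  row?: number;
}

// Surface parser errors instead of ignoring parsed.errors.
function assertNoParseErrors(errors: ParseErrorLike[]): void {
  if (errors.length === 0) return;
  const detail = errors.map((e) => `row ${e.row ?? "?"}: ${e.message}`).join("; ");
  throw new Error(`CSV parse errors: ${detail}`);
}
```

Resolving every column index once, before the row loop, also turns a missing header into a single up-front failure rather than one misleading error per row.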
CSV Conventions (US Bank GSR feed)
- Header row is row 0 (referenced via parsed.data[0].findIndex(x => x === headerName))
- Trailing blank rows can be all-empty fields (,,,,,,,,,,,), handled by skipEmptyLines: 'greedy'
- Missing-data sentinel is "-": coerceCell() maps it to null (nullable) or throws (required)
- Percentage strings like "1.23%" are parsed numerically by stripping % then Number(…)
Prisma 7 Notes
- Provider: prisma-client (newer generator) outputting to src/prisma/
- engineType = "client" (Rust-free client that runs queries through a driver adapter; no native query engine binary or library)
- Adapter: @prisma/adapter-pg
- Migrations live in prisma/migrations/ — first migration created on 2026-05-04 (20260504165350_fix_one_month_string). Before that, schema was applied via db push. Switching to migration-based deploys requires baselining.
Deployment
- Helm chart at helm/gsr-transformer/
- Two workloads:
- HTTP API: Fastify Deployment
- Ingest: Kubernetes CronJob (uses standalone ingest entrypoint added recently — see commits f060200 and 1a5f947)
- Recent simplifications: scheduler.ts and node-cron removed (commit 8e8c2f0); sync endpoints stripped from API (commit 13dc372) — ingest is now exclusively driven by the CronJob.
Common Pitfalls
Schema vs parser config drift
The header mapping in src/targets.ts (HeaderType) and the Prisma schema field types must agree. They are not enforced to match. The oneMonth field was wrong in both places (DateTime? instead of String?) until 2026-05-04 — see 2026-05-04 Performance CSV ingest fix.
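A lightweight guard against this drift could compare each mapping entry with a description of the schema fields. The sketch below is an assumption-heavy illustration: findTypeDrift and SchemaField are hypothetical, and how the schema description is obtained (hand-written or derived from prisma/schema.prisma) is left open. Such a check only catches disagreement between the two sides; it would not have caught the oneMonth bug, where both agreed on the wrong type.

```typescript
// Hypothetical description of one Prisma scalar field.
interface SchemaField {
  type: string; // Prisma scalar name, e.g. "String", "DateTime"
  isRequired: boolean;
}

// Returns a human-readable problem per mismatched field (empty if none).
function findTypeDrift(
  mapping: Record<string, [string, string]>, // DBField -> [CSVHeader, HeaderType]
  schemaFields: Record<string, SchemaField>,
): string[] {
  const problems: string[] = [];
  for (const [dbField, [, headerType]] of Object.entries(mapping)) {
    const schema = schemaFields[dbField];
    if (!schema) {
      problems.push(`${dbField}: not present in schema`);
      continue;
    }
    const nullable = headerType.endsWith("?");
    const base = nullable ? headerType.slice(0, -1) : headerType;
    if (base !== schema.type) {
      problems.push(`${dbField}: mapping says ${base}, schema says ${schema.type}`);
    }
    // A nullable mapping should pair with an optional schema field.
    if (nullable === schema.isRequired) {
      problems.push(`${dbField}: nullability differs between mapping and schema`);
    }
  }
  return problems;
}
```

Run at startup or in a unit test, a non-empty result can fail fast before any file is downloaded.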
PapaParse skipEmptyLines
skipEmptyLines: true only drops zero-character lines. To drop rows like ,,,,,,,,,,, (all-empty fields), use skipEmptyLines: 'greedy'. We use 'greedy'.
deletePrevious: true is non-atomic
Ingest runs createMany and then deleteMany, with no transaction wrapping the two. If createMany throws mid-batch (e.g. one row fails Prisma validation), the new file's rows are not inserted and the previous rows are not deleted, so the table silently keeps stale data for that run. If deleteMany fails after createMany succeeds, rows from both the old and the new file remain side by side.
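One way to make the pair atomic would be to run both calls inside a Prisma interactive transaction ($transaction with an async callback). The sketch below is not wired into dynDispatchPrismaCall: replaceAtomically, ModelDelegate, TxCapableClient and the pickModel callback are hypothetical, and the interfaces describe only the slice of the client this example touches.

```typescript
// Minimal shape of the model delegate used here (hypothetical subset).
interface ModelDelegate<Row> {
  createMany(args: { data: Row[] }): Promise<{ count: number }>;
  deleteMany(args: { where: { NOT: { fileHash: string } } }): Promise<{ count: number }>;
}

// Minimal shape of a client that supports interactive transactions.
interface TxCapableClient<Tx> {
  $transaction<R>(fn: (tx: Tx) => Promise<R>): Promise<R>;
}

// Insert the new file's rows and drop older rows as one unit of work.
async function replaceAtomically<Row, Tx>(
  client: TxCapableClient<Tx>,
  pickModel: (tx: Tx) => ModelDelegate<Row>, // e.g. (tx) => tx.monthlyPerformanceDs
  rows: Row[],
  fileHash: string,
): Promise<void> {
  await client.$transaction(async (tx) => {
    const model = pickModel(tx);
    await model.createMany({ data: rows });
    // If this throws, the transaction rolls back the createMany above.
    await model.deleteMany({ where: { NOT: { fileHash } } });
  });
}
```

With this shape, a failure in either step leaves the table exactly as it was before the run, at the cost of holding a transaction open for the duration of the batch insert.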