Agent Context — GSR Transformer

Read this before doing any work in bankofus / GSR Transformer.

Repository

  • Path: /Users/levander/coding/scharge/bankofus
  • Branch: main
  • Build: tsup, ESM output
  • Standalone ingest entrypoint (for K8s CronJob) is wired in tsup config

Key Files

Concern                      | File
Datasource definitions       | src/targets.ts
Generic CSV to Prisma writer | src/datasource/tracked-ds.ts
Prisma schema                | prisma/schema.prisma
Prisma generated client      | src/prisma/ (output dir)
Reflection helper            | src/util/prisma.ts (dynDispatchPrismaCall)
Env loader                   | src/util/env-loader.ts (loadEnv)
Logger                       | src/util/logger.ts (pino)
Helm chart                   | helm/gsr-transformer/

Core Abstraction: TrackedTabularDataSource

The central CSV to Prisma writer. Generic over Models[T] from Prisma’s typemap.

Constructor signature:

new TrackedTabularDataSource(
  tablenameAndId: T,                  // e.g. "monthlyPerformanceDs"
  path: string | (file => boolean),   // fixed path or filter fn
  headerMapping: Record<DBField, [CSVHeader, HeaderType]>,
  dataSourceConfig: { type: "static" | "timeseries", deletePrevious?: boolean, routes?: ... }
)

HeaderType is one of: Int, Float, DateTime, String, Boolean, Int?, Float?, DateTime?, String?, Boolean?. The trailing ? marks the field as nullable and is detected at runtime via endsWith("?") in coerceCell().
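As a minimal sketch of the nullable-suffix convention, the split into base type and nullability could look like the following. The helper name parseHeaderType and the BaseType alias are assumptions for illustration; the real logic lives inline in coerceCell().

```typescript
// Base types listed above; nullable variants are the same names with "?" appended.
type BaseType = "Int" | "Float" | "DateTime" | "String" | "Boolean";
type HeaderType = BaseType | `${BaseType}?`;

// Hypothetical helper: splits a HeaderType into its base type and nullability.
function parseHeaderType(headerType: HeaderType): { base: BaseType; nullable: boolean } {
  const nullable = headerType.endsWith("?");
  const base = (nullable ? headerType.slice(0, -1) : headerType) as BaseType;
  return { base, nullable };
}
```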

Ingest flow (track())

  1. loadEnv() — re-read .env
  2. expandPaths(ftp, baseDir) — resolve glob/filter to real paths
  3. For each file:
    • download as string, trim, SHA-256 hash
    • Papa.parse(file, { skipEmptyLines: 'greedy' })
    • dedup check: skip if any row in target table already has this fileHash
    • map each row: run coerceCell per field and collect the results into a prismaArgObject
    • createMany via dynDispatchPrismaCall
    • optional deletePrevious: deleteMany({ where: { NOT: { fileHash } } })
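The hash and dedup steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in tracked-ds.ts: hashFileContents and shouldIngest are hypothetical names, and the in-memory set stands in for the Prisma query that checks whether any row already carries the fileHash.

```typescript
import { createHash } from "node:crypto";

// Hash the trimmed file contents, matching the "trim, SHA-256 hash" step.
function hashFileContents(contents: string): string {
  return createHash("sha256").update(contents.trim(), "utf8").digest("hex");
}

// Hypothetical stand-in for "does any row in the target table have this fileHash?".
// The real check would query the target table through Prisma.
function shouldIngest(fileHash: string, knownHashes: ReadonlySet<string>): boolean {
  return !knownHashes.has(fileHash);
}
```

Because the contents are trimmed before hashing, two downloads that differ only in leading or trailing whitespace produce the same hash and the second one is skipped.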

Coercion (coerceCell)

  • Trim raw value
  • Empty string or "-" returns null if nullable, throws if required
  • Int|Float: Number(trimmed.replace("%", ""))
  • DateTime: new Date(trimmed)
  • Boolean: ["Y", "y", "1"].includes(trimmed)
  • else: return trimmed string
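Put together, the rules above amount to roughly the following. This is a reconstruction from the bullet list, not a copy of the actual function; the Coerced type, the exact parameter list and the error message wording are assumptions.

```typescript
type HeaderType =
  | "Int" | "Float" | "DateTime" | "String" | "Boolean"
  | "Int?" | "Float?" | "DateTime?" | "String?" | "Boolean?";

type Coerced = number | Date | boolean | string | null;

// Sketch of the coercion rules; the real coerceCell may differ in signature.
function coerceCell(raw: string | undefined, headerType: HeaderType): Coerced {
  const trimmed = (raw ?? "").trim();
  const nullable = headerType.endsWith("?");
  const base = nullable ? headerType.slice(0, -1) : headerType;

  // Empty string and the "-" sentinel both mean "no data".
  if (trimmed === "" || trimmed === "-") {
    if (nullable) return null;
    throw new Error(`empty value for required field (type ${headerType})`);
  }

  switch (base) {
    case "Int":
    case "Float":
      return Number(trimmed.replace("%", "")); // NaN is not rejected here
    case "DateTime":
      return new Date(trimmed); // Invalid Date is not rejected here
    case "Boolean":
      return ["Y", "y", "1"].includes(trimmed);
    default:
      return trimmed;
  }
}
```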

Latent issues in coerceCell

  • Number("abc") returns NaN and new Date("abc") returns Invalid Date; both pass through silently
  • parsed.errors from PapaParse is never inspected
  • A missing CSV column makes findIndex return -1, so row[-1] is undefined and the failure surfaces as a misleading "empty value for required field" error
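One possible way to close these gaps is to validate at each step instead of letting bad values through. The helpers below are hypothetical (none of these names exist in the repository) and only sketch the checks; the parse-error helper assumes each entry has a message and an optional row number, as PapaParse's error objects do.

```typescript
// Rejects NaN and Infinity instead of passing them to Prisma.
function parseNumberStrict(trimmed: string, field: string): number {
  const value = Number(trimmed.replace("%", ""));
  if (!Number.isFinite(value)) {
    throw new Error(`field "${field}": "${trimmed}" is not a number`);
  }
  return value;
}

// Rejects Invalid Date, whose getTime() is NaN.
function parseDateStrict(trimmed: string, field: string): Date {
  const value = new Date(trimmed);
  if (Number.isNaN(value.getTime())) {
    throw new Error(`field "${field}": "${trimmed}" is not a valid date`);
  }
  return value;
}

// Fails with an explicit message when the CSV lacks an expected column.
function columnIndex(header: string[], csvHeader: string): number {
  const index = header.findIndex((x) => x === csvHeader);
  if (index === -1) {
    throw new Error(`CSV column "${csvHeader}" not found in header row`);
  }
  return index;
}

// Surfaces parser-level problems before any row is mapped.
function assertNoParseErrors(errors: { message: string; row?: number }[]): void {
  if (errors.length === 0) return;
  const summary = errors.map((e) => `row ${e.row ?? "?"}: ${e.message}`).join("; ");
  throw new Error(`CSV parse errors: ${summary}`);
}
```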

CSV Conventions (US Bank GSR feed)

  • Header row is row 0 (referenced via parsed.data[0].findIndex(x => x === headerName))
  • Trailing blank rows can be all-empty fields (,,,,,,,,,,,); these are handled by skipEmptyLines: 'greedy'
  • Missing-data sentinel is "-"; coerceCell() maps it to null for nullable fields and throws for required ones
  • Percentage strings like "1.23%" are parsed numerically by stripping % then Number(…)
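To make the header-row convention concrete, here is a small sketch that turns parsed output (an array of string arrays with the header in row 0) into one raw-string record per data row. The function name, the simplified mapping shape (DB field to CSV header only, without HeaderType) and the sample column names in the usage note are all assumptions.

```typescript
// Hypothetical helper: looks up each mapped column once in the header row,
// then reads that index from every data row. Coercion is left out on purpose.
function rowsToRecords(
  data: string[][],
  mapping: Record<string, string>, // DB field name -> CSV header name
): Record<string, string | undefined>[] {
  const [header, ...rows] = data;
  if (!header) return [];

  const indexes = Object.entries(mapping).map(
    ([dbField, csvHeader]) => [dbField, header.findIndex((x) => x === csvHeader)] as const,
  );

  return rows.map((row) => {
    const record: Record<string, string | undefined> = {};
    for (const [dbField, index] of indexes) {
      record[dbField] = row[index]; // undefined when the column is missing (index -1)
    }
    return record;
  });
}
```

For example, with a header row of Fund and 1 Mo and a mapping of fundName to Fund and oneMonth to 1 Mo, a data row of A and 1.2% becomes a record whose oneMonth is the raw string 1.2%, still to be coerced.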

Prisma 7 Notes

  • Provider: prisma-client (newer generator) outputting to src/prisma/
  • engineType = "client" (Prisma's Rust-free client engine, which requires a driver adapter; not the library or binary query engine)
  • Adapter: @prisma/adapter-pg
  • Migrations live in prisma/migrations/ — first migration created on 2026-05-04 (20260504165350_fix_one_month_string). Before that, schema was applied via db push. Switching to migration-based deploys requires baselining.

Deployment

  • Helm chart at helm/gsr-transformer/
  • Two workloads:
    • HTTP API: Fastify Deployment
    • Ingest: Kubernetes CronJob (uses standalone ingest entrypoint added recently — see commits f060200 and 1a5f947)
  • Recent simplifications: scheduler.ts and node-cron removed (commit 8e8c2f0); sync endpoints stripped from API (commit 13dc372) — ingest is now exclusively driven by the CronJob.

Common Pitfalls

Schema vs parser config drift

The HeaderType declared for each field in the header mapping (src/targets.ts) must agree with that field's type in the Prisma schema. Nothing enforces this. The oneMonth field was wrong in both places (DateTime? instead of String?) until 2026-05-04; see 2026-05-04 Performance CSV ingest fix.
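A lightweight guard against this kind of drift could compare the two type maps in a test or at startup. The sketch below is hypothetical: findTypeDrift does not exist in the repository, and both inputs are assumed to be plain field-to-type maps that something else has already extracted from prisma/schema.prisma and src/targets.ts.

```typescript
type FieldTypes = Record<string, string>; // field name -> type such as "String?"

// Reports every parser-mapped field whose type is absent from, or differs
// from, the schema side.
function findTypeDrift(schema: FieldTypes, parser: FieldTypes): string[] {
  const problems: string[] = [];
  for (const [field, parserType] of Object.entries(parser)) {
    const schemaType = schema[field];
    if (schemaType === undefined) {
      problems.push(`${field}: mapped in parser but missing from schema`);
    } else if (schemaType !== parserType) {
      problems.push(`${field}: schema says ${schemaType}, parser says ${parserType}`);
    }
  }
  return problems;
}
```

A check like this only catches disagreement between the two sides. It would not have caught the oneMonth case, where both sides agreed on the wrong type.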

PapaParse skipEmptyLines

skipEmptyLines: true only drops zero-character lines. To drop rows like ,,,,,,,,,,, (all-empty fields), use skipEmptyLines: 'greedy'. We use 'greedy'.

deletePrevious: true is non-atomic

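Ingest runs createMany and then deleteMany, with no explicit transaction around the pair. If createMany throws mid-batch (e.g. one row fails Prisma validation), the new file's rows are not inserted and the previous data is not deleted, so the table stays on the previous file's data for the current run. Conversely, if deleteMany fails after createMany succeeds, old and new rows coexist until a later run cleans them up.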
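One way to make the pair atomic is to run both calls inside a single transaction, so that a failure in either step rolls back both. The sketch below uses minimal structural types to stay self-contained; RowStore, TransactionRunner and replaceWithFile are hypothetical names. With Prisma, the transaction method would correspond to an interactive transaction (prisma.$transaction(async (tx) => ...)) and RowStore to the model delegate on tx; how that fits with dynDispatchPrismaCall is not covered here.

```typescript
// Minimal shapes of the two calls the ingest makes on the target table.
interface RowStore<Row> {
  createMany(args: { data: Row[] }): Promise<unknown>;
  deleteMany(args: { where: { NOT: { fileHash: string } } }): Promise<unknown>;
}

// Anything that can run a callback atomically: commit on success,
// roll back if the callback rejects.
interface TransactionRunner<Row> {
  transaction<T>(fn: (tx: RowStore<Row>) => Promise<T>): Promise<T>;
}

// Inserts the new file's rows and removes all rows from other files as one unit.
async function replaceWithFile<Row extends { fileHash: string }>(
  db: TransactionRunner<Row>,
  rows: Row[],
  fileHash: string,
): Promise<void> {
  await db.transaction(async (tx) => {
    await tx.createMany({ data: rows });
    await tx.deleteMany({ where: { NOT: { fileHash } } });
  });
}
```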