
Plan: Database Sync and Backup Scripts

February 17, 2026 · 7:25 AM – 8:09 AM (44 min) · 6 turns
Model: claude-opus-4-6
Read 15 · TaskUpdate 6 · Bash 4 · TaskCreate 3 · Write 3 · Edit 3 · Task 3 · Glob 1 · Grep 1 · AskUserQuestion 1 · EnterPlanMode 1 · ExitPlanMode 1
7:25 AM

Implement the following plan:

Plan: Database Sync and Backup Scripts

Context

The project has no way to pull production data into the local development database, and no database backup mechanism beyond K8s persistent volume replication. We need:

  1. An on-demand script to sync production data into the local Docker postgres
  2. A manual backup script that pushes pg_dump output to Box via rclone

New Files

File                    Purpose
scripts/db-sync.ts      Pull prod/dev database into local Docker postgres
scripts/db-backup.ts    Backup prod/dev databases to Box via rclone

Task 1: scripts/db-sync.ts

Command: bun run db:sync (defaults to prod), bun run db:sync dev

Approach: Plain SQL dump piped from kubectl exec to local Docker psql. Skip embedding tables (documents, chunks) by default since they can be regenerated with bun run embed.

Flow:

  1. Parse arg (prod or dev, default prod)
  2. Safety: block if NODE_ENV=production
  3. Verify local Docker postgres container is running (same pattern as packages/db/scripts/reset.ts:15)
  4. Confirmation prompt using ask() pattern from scripts/deploy.ts:43
  5. Drop and recreate local schema + vector extension (same as packages/db/scripts/reset.ts:25-34)
  6. Run kubectl exec postgres-0 -- pg_dump with --no-owner --no-privileges and --exclude-table=documents --exclude-table=chunks, pipe stdout to docker exec psql -U postgres -d usingandunderstanding
  7. Verify: query pg_tables to show restored tables
  8. Print reminder about bun run embed to regenerate embeddings if needed
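The core of steps 5–6 can be sketched as a spawned pipeline. This is a hypothetical sketch, not the actual implementation: the namespace, pod name, container name, and database name are assumptions standing in for the real values in scripts/lib/k8s.ts, and the --exclude-table flag follows the plan as written here (it is corrected to --exclude-table-data later in this session):

```typescript
// Sketch of the dump-and-restore pipeline (step 6). All names below are
// assumed placeholders for the real values in scripts/lib/k8s.ts.
import { spawn } from "node:child_process";

const NAMESPACE = "usingandunderstanding"; // assumed
const EXCLUDED = ["documents", "chunks"]; // embedding tables, regenerated via `bun run embed`

// Build the pg_dump argument list separately so it can be inspected and tested.
export function pgDumpArgs(db: string, excluded: string[]): string[] {
  return [
    "exec", "-n", NAMESPACE, "postgres-0", "--",
    "pg_dump", "-U", "postgres", "-d", db,
    "--no-owner", "--no-privileges",
    ...excluded.map((t) => `--exclude-table=${t}`),
  ];
}

// Pipe kubectl's stdout straight into psql inside the local Docker container.
export function syncToLocal(db: string): Promise<void> {
  const dump = spawn("kubectl", pgDumpArgs(db, EXCLUDED), {
    stdio: ["ignore", "pipe", "inherit"],
  });
  const restore = spawn(
    "docker",
    ["exec", "-i", "postgres", "psql", "-U", "postgres", "-d", "usingandunderstanding"],
    { stdio: ["pipe", "inherit", "inherit"] },
  );
  dump.stdout.pipe(restore.stdin);
  return new Promise((resolve, reject) => {
    restore.on("exit", (code) =>
      code === 0 ? resolve() : reject(new Error(`psql exited with code ${code}`)),
    );
  });
}
```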

Key patterns to reuse:

  • NAMESPACE from scripts/lib/k8s.ts:5
  • dbName() from scripts/lib/k8s.ts:19
  • Container detection from packages/db/scripts/reset.ts:15
  • Schema reset from packages/db/scripts/reset.ts:25-34
  • step() and ask() helpers from scripts/deploy.ts:23,43

Task 2: scripts/db-backup.ts

Command: bun run db:backup (backs up both), bun run db:backup prod, bun run db:backup dev

Approach: Pipeline kubectl exec pg_dump | gzip | rclone rcat — streams directly to Box with no local temp file.

Flow:

  1. Parse arg (prod, dev, or all, default all)
  2. Verify rclone is installed and Box remote is configured (rclone about box: --json)
  3. For each environment to backup:
    • Generate filename: {env}-{YYYY-MM-DD}-{HHMMSS}.sql.gz
    • Remote path: box:usingandunderstanding-backups/{YYYY-MM-DD}/{env}-{YYYY-MM-DD}-{HHMMSS}.sql.gz
    • Spawn pipeline: kubectl exec pg_dump | gzip | rclone rcat {remotePath}
    • Check all exit codes, report success/failure
  4. Print summary with remote paths
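The streaming pipeline in step 3 could look roughly like this (a hypothetical sketch: the pod name, user, and helper names are illustrative, and the remote path pattern is taken from this plan):

```typescript
// Sketch of the step-3 pipeline: kubectl exec pg_dump | gzip | rclone rcat.
// Nothing is written to local disk. Pod and user names are assumptions.
import { spawn } from "node:child_process";

// Timestamped object name, e.g. prod-2026-02-17-074912.sql.gz
export function backupName(env: string, now: Date): string {
  const date = now.toISOString().slice(0, 10); // YYYY-MM-DD
  const time = now.toISOString().slice(11, 19).replace(/:/g, ""); // HHMMSS
  return `${env}-${date}-${time}.sql.gz`;
}

export function remotePath(env: string, now: Date): string {
  const date = now.toISOString().slice(0, 10);
  return `box:usingandunderstanding-backups/${date}/${backupName(env, now)}`;
}

export async function backup(env: string, db: string): Promise<void> {
  const now = new Date();
  const dump = spawn("kubectl", ["exec", "postgres-0", "--", "pg_dump", "-U", "postgres", db]);
  const gzip = spawn("gzip", ["-c"]);
  const rcat = spawn("rclone", ["rcat", remotePath(env, now)]);
  dump.stdout.pipe(gzip.stdin);
  gzip.stdout.pipe(rcat.stdin);
  // Step 3 requires checking *all* exit codes, not just the last process.
  const codes = await Promise.all(
    [dump, gzip, rcat].map(
      (p) => new Promise<number>((res) => p.on("exit", (c) => res(c ?? 1))),
    ),
  );
  if (codes.some((c) => c !== 0)) {
    throw new Error(`backup pipeline failed, exit codes: ${codes.join(", ")}`);
  }
}
```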

Key patterns to reuse:

  • Same NAMESPACE, dbName() from scripts/lib/k8s.ts
  • step() helper from scripts/deploy.ts

Package.json Changes

Add to root package.json scripts (after db:studio):

"db:sync": "bun run scripts/db-sync.ts",
"db:backup": "bun run scripts/db-backup.ts"

CLAUDE.md Updates

Add db:sync and db:backup to the Database section documentation.

Verification

  1. db:sync: Run bun run db:sync and confirm local database has production tables (check with bun run db:studio)
  2. db:backup: Run bun run db:backup prod and verify the file appears in Box under usingandunderstanding-backups/ folder
  3. Confirm bun run db:sync dev also works for syncing from the dev database

If you need specific details from before exiting plan mode (like exact code snippets, error messages, or content you generated), read the full transcript at: ~/.claude/projects/-Users-challen-www-usingandunderstanding-ai/c6b53239-d4ad-4dc5-8e2a-09374840c3f6.jsonl

I'll start by reading the key files referenced in the plan to understand the existing patterns.

Now let me check the existing scripts directory for additional context.

Let me also quickly look at another simple script to see patterns for spawn/pipe usage.

Now I have all the context I need. Let me implement both scripts, update package.json, and update CLAUDE.md.

Now let me update the root package.json and CLAUDE.md.

Now update CLAUDE.md — let me find the right section.

Now let me find the Database section to add the new commands.

I'll add new subsections after the Production Database section.

Now let me verify everything looks right by doing a quick check on the created files.

Both scripts parse arguments correctly and show usage for invalid args. Let me also verify the package.json scripts work.

All done. Here's what was created:

scripts/db-sync.ts — Syncs prod or dev database into local Docker postgres:

  • Defaults to prod, accepts dev as argument
  • Blocks if NODE_ENV=production
  • Verifies local Docker postgres is running
  • Confirmation prompt before destructive reset
  • Drops/recreates local schema with vector extension
  • Runs kubectl exec pg_dump (excluding documents and chunks tables) and pipes to local docker exec psql
  • Verifies restored tables and prints a reminder about bun run embed

scripts/db-backup.ts — Backs up databases to Box via rclone:

  • Defaults to both envs, accepts prod or dev
  • Verifies rclone is installed and Box remote is configured
  • Streams pg_dump | gzip | rclone rcat with no temp files
  • Organizes backups as box:usingandunderstanding-backups/{date}/{env}-{date}-{time}.sql.gz
  • Checks all pipeline exit codes and reports summary

Updated files:

  • package.json — Added db:sync and db:backup scripts
  • CLAUDE.md — Added "Sync and Backup" subsection in the Database section
Read 9 · TaskUpdate 6 · Bash 4 · TaskCreate 3 · Write 2 · Edit 2 · Glob 1 · Grep 1
7:49 AM

Cool! Can we add a package.json script to run the script, maybe bun db:sync?

Those already exist — I added them as part of the implementation:

"db:sync": "bun run scripts/db-sync.ts",
"db:backup": "bun run scripts/db-backup.ts"

So bun run db:sync and bun run db:backup both work (verified above).

7:50 AM

I ran db:sync, but afterwards embed does not work:

[07:50:09 challen@blueshift:~/www/usingandunderstanding.ai] $ bun embed
$ bun run --cwd tools/embedding src/embed.ts
Discovering MDX files...
Found 18 MDX file(s)
36 | async queryWithCache(queryString, params, query) {
37 |   if (this.cache === void 0 || is(this.cache, NoopCache) || this.queryMetadata === void 0) {
38 |     try {
39 |       return await query();
40 |     } catch (e) {
41 |       throw new DrizzleQueryError(queryString, params, e);
             ^
error: Failed query: select "id", "title", "source", "source_url", "type", "content", "metadata", "uploaded_by", "created_at", "updated_at" from "documents" where "documents"."source" = $1
params: mdx
query: "select "id", "title", "source", "source_url", "type", "content", "metadata", "uploaded_by", "created_at", "updated_at" from "documents" where "documents"."source" = $1", params: [ "mdx" ],

The issue is --exclude-table skips the table definition entirely, so the documents and chunks tables don't exist at all after sync. We need --exclude-table-data instead, which creates the empty tables but skips the (large) row data.

That's the fix. --exclude-table omits the table entirely (DDL + data), while --exclude-table-data keeps the table structure but skips the row data. Now after db:sync, the documents and chunks tables will exist (empty) and bun run embed can populate them.
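In code terms the fix is a one-flag swap when building the pg_dump arguments (a sketch with the surrounding argument list abbreviated; the helper name is illustrative):

```typescript
// --exclude-table=NAME       omits the table entirely: no CREATE TABLE, no rows
// --exclude-table-data=NAME  keeps the CREATE TABLE, skips only the row data
export function excludeFlags(tables: string[]): string[] {
  return tables.map((t) => `--exclude-table-data=${t}`); // was `--exclude-table=${t}`
}
```

After this change, `bun run embed` finds the (empty) documents and chunks tables and can repopulate them.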

Edit1
7:52 AM

Nice! That fix works.

Great, glad it's working!

7:54 AM

Great! And it looks like the backup also works. However, I do want this to run automatically in both the dev and prod environments, meaning we probably need to add a cron task to one of the containers, unless we want to do this with k8s directly. However, rclone requires a token, which we'll need to add to the container.

A K8s CronJob is the cleaner approach here — no need to modify existing containers, and it gets its own schedule, restart policy, and logging. The job container would connect directly to postgres via the in-cluster service (no kubectl needed), pipe through gzip, and rclone to Box.
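As a rough shape of what that could look like (hypothetical throughout — the image, secret, schedule, and service host are all assumptions, not decisions made in this session):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: db-backup          # hypothetical name
spec:
  schedule: "0 3 * * *"    # assumed daily schedule; not decided here
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: backup
              image: backup-image:latest   # hypothetical image with pg_dump + rclone
              command: ["/bin/sh", "-c"]
              args:
                - >
                  pg_dump -h postgres -U postgres "$PGDATABASE" |
                  gzip |
                  rclone rcat "box:usingandunderstanding-backups/$(date +%F)/prod-$(date +%F-%H%M%S).sql.gz"
              env:
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgres-credentials   # hypothetical secret
                      key: password
```

Note the job connects to postgres over the in-cluster service hostname, so no kubectl is needed inside the container.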

A few questions before I plan this out:

Now let me design the plan.

Task 2 · AskUserQuestion 1 · EnterPlanMode 1
8:00 AM

Note that I'm fairly sure that rclone needs to pull a refresh token as well to continue to work unattended over time. If the backup fails due to rclone issues, we should probably send an email, probably to [email], although it's annoying that that's hardcoded. Can our mail system support a [email] alias (and [email]) and point that to [email]?

Good catch on the rclone refresh token — that's a real issue since Box OAuth tokens expire and rclone needs to write the refreshed token back. A read-only K8s secret mount won't work for that.

Let me explore the mail system to understand alias support before continuing with the plan.

Good points on both the rclone refresh token and the notification email. Let me look at the mail list configuration to understand what's possible.

The mail system already supports explicit email lists. Adding backup@ is straightforward. Now let me read the existing deploy.sh to understand manifest application order.

Now let me check the postgres.yaml for the storage class pattern (needed for the rclone PVC).

Good, I have everything I need. Let me also quickly check the processor.ts to understand how the mail system sends outgoing mail — specifically how the backup job could send a failure notification through it.

Now I have all the context needed. Let me write the plan.

Read 6 · Task 1 · Write 1 · ExitPlanMode 1