Plan: Database Sync and Backup Scripts
Implement the following plan:
Context
The project has no way to pull production data into the local development database, and no database backup mechanism beyond K8s persistent volume replication. We need:
- An on-demand script to sync production data into the local Docker postgres
- A manual backup script that pushes pg_dump output to Box via rclone
New Files
| File | Purpose |
|---|---|
| `scripts/db-sync.ts` | Pull prod/dev database into local Docker postgres |
| `scripts/db-backup.ts` | Back up prod/dev databases to Box via rclone |
Task 1: scripts/db-sync.ts
Command: `bun run db:sync` (defaults to prod), `bun run db:sync dev`
Approach: Plain SQL dump piped from `kubectl exec` to local Docker `psql`. Skip embedding tables (`documents`, `chunks`) by default since they can be regenerated with `bun run embed`.
Flow:
- Parse arg (`prod` or `dev`, default `prod`)
- Safety: block if `NODE_ENV=production`
- Verify local Docker postgres container is running (same pattern as `packages/db/scripts/reset.ts:15`)
- Confirmation prompt using `ask()` pattern from `scripts/deploy.ts:43`
- Drop and recreate local schema + vector extension (same as `packages/db/scripts/reset.ts:25-34`)
- Run `kubectl exec postgres-0 -- pg_dump` with `--no-owner --no-privileges` and `--exclude-table=documents --exclude-table=chunks`, pipe stdout to `docker exec psql -U postgres -d usingandunderstanding`
- Verify: query `pg_tables` to show restored tables
- Print reminder about `bun run embed` to regenerate embeddings if needed
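The first two steps of the flow above (argument parsing and the production guard) can be sketched as follows. This is a hypothetical sketch: `parseSyncTarget` and `assertNotProduction` are illustrative names, not taken from the codebase.

```typescript
type SyncTarget = "prod" | "dev";

// Parse the CLI argument, defaulting to prod.
function parseSyncTarget(argv: string[]): SyncTarget {
  const arg = argv[0] ?? "prod";
  if (arg === "prod" || arg === "dev") return arg;
  throw new Error(`Usage: bun run db:sync [prod|dev] (got "${arg}")`);
}

// Safety guard: never run a destructive local sync from a production shell.
function assertNotProduction(env: Record<string, string | undefined>): void {
  if (env.NODE_ENV === "production") {
    throw new Error("db:sync is blocked when NODE_ENV=production");
  }
}
```

Keeping these as small pure functions makes the destructive parts of the script easy to gate and easy to test in isolation.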
Key patterns to reuse:
- `NAMESPACE` from `scripts/lib/k8s.ts:5`
- `dbName()` from `scripts/lib/k8s.ts:19`
- Container detection from `packages/db/scripts/reset.ts:15`
- Schema reset from `packages/db/scripts/reset.ts:25-34`
- `step()` and `ask()` helpers from `scripts/deploy.ts:23,43`
Task 2: scripts/db-backup.ts
Command: `bun run db:backup` (backs up both), `bun run db:backup prod`, `bun run db:backup dev`
Approach: Pipeline `kubectl exec pg_dump | gzip | rclone rcat` — streams directly to Box with no local temp file.
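A streaming pipeline like this can be wired up with Node's `child_process` streams. This is a sketch under the assumption that the script chains subprocesses itself; `runPipeline` is my name for the helper, not the codebase's.

```typescript
import { spawn } from "node:child_process";

// Chain commands so each stage's stdout feeds the next stage's stdin,
// then resolve with every stage's exit code so no failure is silent.
// runPipeline is a hypothetical helper, not from the actual script.
function runPipeline(stages: string[][]): Promise<number[]> {
  const procs = stages.map((cmd, i) =>
    spawn(cmd[0], cmd.slice(1), {
      stdio: [i === 0 ? "ignore" : "pipe", "pipe", "inherit"],
    })
  );
  for (let i = 0; i < procs.length - 1; i++) {
    procs[i].stdout!.pipe(procs[i + 1].stdin!);
  }
  procs[procs.length - 1].stdout!.resume(); // drain the final stage
  return Promise.all(
    procs.map(
      (p) => new Promise<number>((resolve) => p.on("close", (code) => resolve(code ?? 1)))
    )
  );
}
```

Called with the kubectl, gzip, and rclone stages, any non-zero code in the resulting array marks the backup as failed; this matters because a shell `a | b | c` without `pipefail` would only report the last stage's status.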
Flow:
- Parse arg (`prod`, `dev`, or `all`, default `all`)
- Verify `rclone` is installed and the Box remote is configured (`rclone about box: --json`)
- For each environment to back up:
  - Generate filename: `{env}-{YYYY-MM-DD}-{HHMMSS}.sql.gz`
  - Remote path: `box:usingandunderstanding-backups/{YYYY-MM-DD}/` (unknown)
  - Spawn pipeline: `kubectl exec pg_dump | gzip | rclone rcat {remotePath}`
  - Check all exit codes, report success/failure
- Print summary with remote paths
Key patterns to reuse:
- Same `NAMESPACE`, `dbName()` from `scripts/lib/k8s.ts`
- `step()` helper from `scripts/deploy.ts`
Package.json Changes
Add to root `package.json` scripts (after `db:studio`):
"db:sync": "bun run scripts/db-sync.ts",
"db:backup": "bun run scripts/db-backup.ts"
CLAUDE.md Updates
Add `db:sync` and `db:backup` to the Database section documentation.
Verification
- db:sync: Run `bun run db:sync` and confirm the local database has production tables (check with `bun run db:studio`)
- db:backup: Run `bun run db:backup prod` and verify the file appears in Box under the `usingandunderstanding-backups/` folder
- Confirm `bun run db:sync dev` also works for syncing from the dev database
If you need specific details from before exiting plan mode (like exact code snippets, error messages, or content you generated), read the full transcript at: ~/.claude/projects/-Users-challen-www-usingandunderstanding-ai/c6b53239-d4ad-4dc5-8e2a-09374840c3f6.jsonl
I'll start by reading the key files referenced in the plan to understand the existing patterns.
Now let me check the existing scripts directory for additional context.
Let me also quickly look at another simple script to see patterns for spawn/pipe usage.
Now I have all the context I need. Let me implement both scripts, update package.json, and update CLAUDE.md.
Now let me update the root package.json and CLAUDE.md.
Now update CLAUDE.md — let me find the right section.
Now let me find the Database section to add the new commands.
I'll add new subsections after the Production Database section.
Now let me verify everything looks right by doing a quick check on the created files.
Both scripts parse arguments correctly and show usage for invalid args. Let me also verify the package.json scripts work.
All done. Here's what was created:
`scripts/db-sync.ts` — Syncs prod or dev database into local Docker postgres:
- Defaults to `prod`, accepts `dev` as argument
- Blocks if `NODE_ENV=production`
- Verifies local Docker postgres is running
- Confirmation prompt before destructive reset
- Drops/recreates local schema with vector extension
- Runs `kubectl exec pg_dump` (excluding `documents` and `chunks` tables) and pipes to local `docker exec psql`
- Verifies restored tables and prints a reminder about `bun run embed`
`scripts/db-backup.ts` — Backs up databases to Box via rclone:
- Defaults to both envs, accepts `prod` or `dev`
- Verifies rclone is installed and the Box remote is configured
- Streams `pg_dump | gzip | rclone rcat` with no temp files
- Organizes backups as `box:usingandunderstanding-backups/{date}/{env}-{date}-{time}.sql.gz`
- Checks all pipeline exit codes and reports a summary
Updated files:
- `package.json`: added `db:sync` and `db:backup` scripts
- `CLAUDE.md`: added a "Sync and Backup" subsection in the Database section
Cool! Can we add a package.json script to run the script, maybe bun db:sync?
Those already exist — I added them as part of the implementation:
"db:sync": "bun run scripts/db-sync.ts",
"db:backup": "bun run scripts/db-backup.ts"
So `bun run db:sync` and `bun run db:backup` both work (verified above).
I ran db:sync, but `bun embed` does not work afterwards:

[07:50:09 challen@blueshift:~/www/usingandunderstanding.ai] $ bun embed
$ bun run --cwd tools/embedding src/embed.ts
Discovering MDX files...
Found 18 MDX file(s)
36 | async queryWithCache(queryString, params, query) {
37 |   if (this.cache === void 0 || is(this.cache, NoopCache) || this.queryMetadata === void 0) {
38 |     try {
39 |       return await query();
40 |     } catch (e) {
41 |       throw new DrizzleQueryError(queryString, params, e);
            ^
error: Failed query: select "id", "title", "source", "source_url", "type", "content", "metadata", "uploaded_by", "created_at", "updated_at" from "documents" where "documents"."source" = $1
params: [ "mdx" ]
The issue is that `--exclude-table` skips the table definition entirely, so the `documents` and `chunks` tables don't exist at all after sync. We need `--exclude-table-data` instead, which creates the empty tables but skips the (large) row data.
That's the fix. `--exclude-table` omits the table entirely (DDL + data), while `--exclude-table-data` keeps the table structure but skips the row data. Now after `db:sync`, the `documents` and `chunks` tables will exist (empty) and `bun run embed` can populate them.
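The distinction can be made concrete in the dump-argument builder. This is a sketch: `buildDumpArgs`, `EMBEDDING_TABLES`, and the exact argument order are assumptions, not the actual script's code; only the pg_dump flags themselves come from the discussion above.

```typescript
// Tables whose rows can be regenerated with `bun run embed`.
const EMBEDDING_TABLES = ["documents", "chunks"];

// Hypothetical sketch of the corrected pg_dump argument list.
function buildDumpArgs(db: string): string[] {
  return [
    "--no-owner",
    "--no-privileges",
    // --exclude-table=<t> would omit the DDL too, so the tables would not
    // exist after restore; --exclude-table-data=<t> keeps the (empty)
    // table definitions and skips only the row data.
    ...EMBEDDING_TABLES.map((t) => `--exclude-table-data=${t}`),
    db,
  ];
}
```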
Nice! That fix works.
Great, glad it's working!
Great! And it looks like the backup also works. However, I do want this to run automatically in both the dev and prod environments, which means we probably need to add a cron task to one of the containers, unless we want to do this with K8s directly. However, rclone requires a token, which we'll need to add to the container.
A K8s CronJob is the cleaner approach here — no need to modify existing containers, and it gets its own schedule, restart policy, and logging. The job container would connect directly to postgres via the in-cluster service (no kubectl needed), pipe through gzip, and rclone to Box.
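A minimal sketch of such a CronJob follows. The schedule, image, secret name, and environment-variable names are all assumptions for illustration; none are taken from the actual manifests.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: db-backup
spec:
  schedule: "0 3 * * *"        # assumed: daily at 03:00 UTC
  concurrencyPolicy: Forbid    # never overlap backup runs
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: backup
              image: backup-image:latest   # placeholder: needs pg_dump, gzip, rclone
              command: ["/bin/sh", "-c"]
              args:
                - pg_dump "$DATABASE_URL" | gzip | rclone rcat "box:usingandunderstanding-backups/$(date +%F)/prod-$(date +%F-%H%M%S).sql.gz"
              envFrom:
                - secretRef:
                    name: db-backup-secrets   # assumed: DATABASE_URL plus rclone credentials
```

One caveat for a real job: a plain `sh -c` pipeline reports only the last stage's exit status, so it should either check `PIPESTATUS` under bash with `set -o pipefail` or chain the processes explicitly, so rclone failures surface and can trigger the alert email discussed below.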
A few questions before I plan this out:
Now let me design the plan.
Note that I'm fairly sure rclone needs to pull a refresh token as well to continue working unattended over time. If the backup fails due to rclone issues, we should probably send an email, probably to [email], although it's annoying that that's hardcoded. Can our mail system support a [email] alias (and [email]) and point that to [email]?
Good catch on the rclone refresh token — that's a real issue since Box OAuth tokens expire and rclone needs to write the refreshed token back. A read-only K8s secret mount won't work for that.
Let me explore the mail system to understand alias support before continuing with the plan.
Good points on both the rclone refresh token and the notification email. Let me look at the mail list configuration to understand what's possible.
The mail system already supports explicit email lists. Adding backup@ is straightforward. Now let me read the existing deploy.sh to understand manifest application order.
Now let me check the postgres.yaml for the storage class pattern (needed for the rclone PVC).
Good, I have everything I need. Let me also quickly check the processor.ts to understand how the mail system sends outgoing mail — specifically how the backup job could send a failure notification through it.
Now I have all the context needed. Let me write the plan.