Per-Message Context Window Control for Chat
Implement the following plan:
Per-Message Context Window Control for Chat
Goal
Add a slider to the chat UI that controls how many prior messages are included when sending a message to the model. This enables students to experiment with context window sizes and observe how models behave with limited history.
Design
- Control: HTML range slider with label showing current value
- Scope: Per-message (not persisted to DB) — the slider value is sent with each API request
- Range: 0 to 50, plus an "All" option (represented as `null`)
- Default: "All" (slider at max position, no truncation)
- Display: "Context: All" or "Context: 4 messages" shown inline
- Placement: Rendered alongside the model selector above the textarea in ChatInput
When contextWindow is a number, only the last N messages from conversation history are sent to the model. The system prompt is always included regardless of window size. 0 means only the current user message is sent with no history.
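That windowing rule can be sketched as a pure function (illustrative names like `applyContextWindow` are mine, not the actual route code):

```typescript
// Sketch of the windowing rule: keep the last N history messages,
// then append the current user message. null means "All".
type ChatMessage = { role: "user" | "assistant"; content: string };

function applyContextWindow(
  history: ChatMessage[],
  current: ChatMessage,
  contextWindow: number | null,
): ChatMessage[] {
  if (contextWindow === null) {
    return [...history, current]; // "All": send the full history
  }
  // slice(-0) would return the whole array, so treat 0 explicitly:
  // only the current user message, no history.
  const kept = contextWindow === 0 ? [] : history.slice(-contextWindow);
  return [...kept, current];
}
```

For example, with three prior messages and a window of 2, only the last two plus the current message are sent.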
Files to Change
1. New component: apps/web/app/chat/components/context-window-control.tsx
A compact slider control:
- Native HTML `<input type="range">` (no need for a new UI primitive)
- Range: min=0, max=50, with an extra tick at 51 representing "All"
- Label updates reactively: "Context: All" or "Context: N messages" (or "Context: 0 messages" for no history)
- Props: `value: number | null`, `onChange: (value: number | null) => void`, `disabled?: boolean`
- Data-testids: `context-window-control`, `context-window-slider`, `context-window-label`
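The position-to-value mapping and label text could be factored as pure helpers along these lines (helper names are hypothetical, and the singular form for exactly 1 is an assumption beyond the spec):

```typescript
// Hypothetical helpers for the control: slider position 51 (one past
// max=50) maps to null, meaning "All".
const ALL_POSITION = 51;

function sliderToValue(position: number): number | null {
  return position >= ALL_POSITION ? null : position;
}

function contextLabel(value: number | null): string {
  if (value === null) return "Context: All";
  // Singular for exactly 1 is an assumption; the plan only specifies
  // "Context: N messages" and "Context: 0 messages".
  return `Context: ${value} message${value === 1 ? "" : "s"}`;
}
```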
2. Update apps/web/app/chat/client.tsx
- Add state: `const [contextWindow, setContextWindow] = useState<number | null>(null)`
- Pass `contextWindow` in the API request body (for every message, not just new conversations)
- Render `<ContextWindowControl>` alongside the `<ModelSelector>` — both always visible in a flex row above the textarea
- The context window control should always be visible (not just for new conversations), since it's per-message
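One reason `null` (rather than `undefined`) works well for "All": `null` survives `JSON.stringify`, so the field is sent explicitly on every request. A sketch, with field names assumed from the plan rather than verified against the actual client:

```typescript
// Assumed request-body shape; the real client likely includes more fields.
interface ChatRequestBody {
  message: string;
  model: string;
  // null = "All". An undefined value would be dropped by JSON.stringify,
  // so the server could not distinguish "All" from "field omitted".
  contextWindow: number | null;
}

function buildRequestBody(
  message: string,
  model: string,
  contextWindow: number | null,
): ChatRequestBody {
  return { message, model, contextWindow };
}
```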
3. Update apps/web/app/chat/components/chat-input.tsx
- Rename the `modelSelector` prop to `controls` (or add a second prop) to accept the combined model selector + context window controls
- Minor: the controls container should use flex row layout for side-by-side display
4. Update apps/api/src/routes/chat.ts
- Add `contextWindow: z.number().int().min(0).nullish()` to `chatRequestSchema`
- After building `chatMessages` from history (line ~100-103), truncate:

  ```typescript
  if (data.contextWindow !== null && data.contextWindow !== undefined) {
    // Keep only the last N messages from history, then append the current message.
    const historyMessages = chatMessages.slice(0, -1); // all but current
    const currentMessage = chatMessages[chatMessages.length - 1];
    // Note: slice(-0) returns the whole array, so 0 needs explicit handling
    // to mean "no history at all".
    const truncated =
      data.contextWindow === 0 ? [] : historyMessages.slice(-data.contextWindow);
    chatMessages = [...truncated, currentMessage];
  }
  ```

- This applies to both streaming and non-streaming paths since `chatMessages` is built once before branching
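The `nullish()` schema accepts `null`, `undefined`, or a non-negative integer. A hand-rolled check mirroring it, for illustration only (the route itself uses Zod):

```typescript
// Mirrors z.number().int().min(0).nullish() without the Zod dependency.
function isValidContextWindow(v: unknown): boolean {
  if (v === null || v === undefined) return true;
  return typeof v === "number" && Number.isInteger(v) && v >= 0;
}
```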
5. No database changes
The context window is per-message and not persisted. It's a transient UI control.
Verification
```bash
# TypeScript check
bun run tsc

# Lint
bun run lint

# Manual verification: open chat page, adjust slider, send messages
# with different context window sizes, observe model behavior changes
```
I'll implement this plan step by step. Let me start by reading the relevant files to understand the current code structure.
Now let me also look at the model selector component for reference on styling.
Good. I have all the context I need. Let me now create the task list and implement all the changes.
Now let me implement. Starting with the new component.
Now update ChatInput to rename modelSelector to controls.
Now update the ChatClient to add contextWindow state, pass it in requests, and render both controls.
Now update the fetch call to include contextWindow:
Update the useCallback dependency array to include contextWindow:
Now update the ChatInput rendering to use controls prop and include both the model selector and context window control:
Now update the API chat route to accept and use contextWindow.
Now add the truncation logic after building chatMessages and before building the system prompt. The truncation should happen after the user message is pushed, right before the agent loading:
Now let me run the type check and lint.
Both pass cleanly.
All done. Here's a summary of the changes:
New file:
`apps/web/app/chat/components/context-window-control.tsx` — Compact slider component with range 0–51 (where 51 = "All"). Displays "Context: All" or "Context: N messages". Uses `data-testid` attributes for testing.
Modified files:
- `apps/web/app/chat/components/chat-input.tsx` — Renamed the `modelSelector` prop to `controls`, changed the container to flex row layout with `gap-4` for side-by-side display.
- `apps/web/app/chat/client.tsx` — Added `contextWindow` state (default `null` = all), imported and rendered `ContextWindowControl` always visible alongside the model selector, included `contextWindow` in every API request body.
- `apps/api/src/routes/chat.ts` — Added `contextWindow: z.number().int().min(0).nullish()` to the schema. After building `chatMessages`, truncates history to the last N messages when `contextWindow` is set, keeping only the current user message plus N prior messages. The system prompt is always included (it's passed separately to the LLM).
Nice! Let's make 52 the default now for chat.
Done. The slider now ranges from 0 to 52 messages, with the next tick (53) representing "All".
No, sorry! I meant the 52 model.
Reverted the slider max back to 50 and changed the default model to gpt-5.2.
Nice. Now, that raises a question though. How do we handle context overflow in the chat?
Like I know that, when using Claude, it will compact history at some point. Should we do that or just history gradually walk off the edge?
Given the educational purpose of this tool — students learning how context windows work — I'd lean toward letting history walk off the edge (truncation) rather than compaction. A few reasons:
- Pedagogical value: Students should experience what happens when the model "forgets" earlier messages. Compaction would hide that effect, which is exactly what you want them to observe.
- Simplicity: Compaction requires an extra LLM call to summarize, adds latency, and introduces complexity around when to trigger it and how to store the summary.
- The slider already handles this: The context window control you just added is essentially manual truncation. Students can see the difference between sending 5 messages vs 50 vs all.
The remaining question is what to do as a safety net when someone leaves the slider on "All" and the conversation gets long enough to exceed the model's token limit. Options there:
- Silent truncation server-side: Automatically drop oldest messages to fit. Simple, but the student might not realize it happened.
- Error with a message: Let the API error and tell the student to reduce context. More visible, but worse UX.
- Do nothing: Azure OpenAI will return a 400 error, which currently shows as "[Error: Failed to generate response]".
Right now you're effectively doing option 3. Would you want to add automatic server-side truncation as a fallback, or is the current behavior (error if too long) fine for now?
Right. The message limit is a bit deceptive in a way, since this is always done by tokens. Do we have a way of computing the tokenization for a message? We might want this later for explanatory purposes, but also to use it to figure out when to drop messages. We might also want to have dropped messages marked in the UI somehow.
Currently there's no way to count tokens before sending a request. The codebase tracks token usage after API responses — `promptTokens`, `completionTokens`, and `totalTokens` are extracted from the API response and stored in message metadata — but there's no local tokenizer installed.
For pre-request token counting (which you'd need for both UI display and server-side truncation), you'd need a library like:
- `tiktoken` — OpenAI's official tokenizer (Rust/WASM, most accurate for GPT models)
- `gpt-tokenizer` — Pure JS port, no WASM dependency, easier to bundle
- `js-tiktoken` — Lighter JS wrapper around tiktoken
Any of these would let you count tokens per message, which enables:
- UI display: Show token count next to each message or as a running total
- Smart truncation: Drop oldest messages when total tokens approach the model's limit, rather than using a fixed message count
- Dropped message indicators: Mark messages in the UI that were excluded from the current request
The message-count slider you just built is a good teaching tool on its own, but token-based truncation would be needed for a proper safety net since a single long message could blow the context window regardless of message count.
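A sketch of what token-based truncation might look like, using a rough chars/4 estimate as a stand-in for a real tokenizer (with one of the libraries above, `estimateTokens` would be replaced by an actual encode-and-count call; the function names here are illustrative):

```typescript
type Msg = { role: string; content: string };

// Crude heuristic (~4 characters per token for English text); a real
// implementation would count with a tokenizer library instead.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walk newest-to-oldest, keeping messages while the running total fits
// the budget; everything older "walks off the edge".
function fitToBudget(history: Msg[], budgetTokens: number): Msg[] {
  const kept: Msg[] = [];
  let total = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i].content);
    if (total + cost > budgetTokens) break;
    kept.unshift(history[i]);
    total += cost;
  }
  return kept;
}
```

Returning the kept list (rather than mutating in place) also makes it easy to compute which messages were dropped, which is what a "dropped message" indicator in the UI would need.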
Want me to plan out adding token counting, or is this more of a future consideration?
I want to avoid the 400 errors, so I think at minimum we need this on the backend. It should do per-message counting, add this to the database, and expose the information to the frontend. At some point we might also want to add some API endpoint to the backend to explore tokenization, returning counts and token boundaries, but I don't think we need that yet.
Good context. Let me design this properly.
Let me verify a couple of specifics before finalizing the plan.
Now let me check the latest gpt-tokenizer version for the dependency pin.
I have all the context I need. Let me write the plan.