Memory and Context (Why It Forgets Everything)
I spent 45 minutes explaining our entire codebase to Claude. Architecture, patterns, conventions, the works. Twenty minutes later, it suggested using MongoDB. We use PostgreSQL. I told it this at the beginning. It apologized and said "of course, PostgreSQL." Three messages later: "Have you considered switching to MongoDB?" I stared at my screen. The AI had forgotten. Not like a human forgets, where the memory is still there somewhere. Gone. Erased. Like I never said it. I checked the token count. We'd hit the limit. Everything from the first 30 minutes was just... gone.
Table of Contents
- The Goldfish Problem
- Why Context Windows Are a Lie
- The Conversation That Ate Itself
- What the AI Actually Remembers
- The Agentic Breakthrough
- Teaching the AI to Keep a Diary
- The Task List Pattern
- Building Your Project Memory
- When to Start Over (And How to Not Lose Everything)
- The Future That Actually Arrived
The Goldfish Problem
You’re 45 minutes into a debugging session with Claude. You’ve been building context, explaining your architecture, sharing code snippets. The AI understands your problem. You’re making progress.
Then it suggests something that completely contradicts what you told it 20 minutes ago.
You: “Wait, didn’t I tell you we’re using PostgreSQL, not MongoDB?”
Claude: “You’re right, I apologize for the confusion. Let me revise that for PostgreSQL…”
What you think: Did it just… forget? How did it forget something I told it in THIS conversation?
What happened: The conversation got too long. Your early messages fell off the edge of the context window. The AI literally can’t see them anymore. To the AI, you never said you were using PostgreSQL. That information no longer exists in its universe.
The realization: You’re not having a conversation with something that has memory. You’re having a conversation with something that has a very long notepad, and when the notepad fills up, it tears off the top page and throws it away.
And it doesn’t tell you it’s doing this. It just keeps talking like nothing happened, confidently answering based on incomplete information.
This is insane. This is also how every AI conversation works. Welcome to the future.
Why Context Windows Are a Lie
“Claude has a 200k token context window!” the marketing says. “That’s 150,000 words! That’s 300 pages!”
Technically true. Practically bullshit.
Why it’s bullshit:
Reason 1: You Never Get the Full Window
That 200k includes:
- Your system prompt (if you have one)
- Every message you’ve sent
- Every response the AI generated
- Any files you’ve uploaded
- Any conversation history
So your “200k window” is really maybe 150k after overhead. Then you paste your codebase. Now it’s 100k. Then you have a 30-message conversation. Now it’s 70k. You’re not getting 200k of useful space.
Reason 2: The AI Gets Dumber in the Middle
Context windows work like this: the AI pays the most attention to recent stuff (the last few messages) and the beginning stuff (your initial prompt). Everything in the middle? It’s technically there, but the AI’s attention is weaker.
It’s like trying to remember a conversation you had an hour ago while someone’s talking to you right now. You remember the gist, maybe, but not the details.
Real test: Have a 50-message conversation. In message 51, reference something specific from message 5. Watch the AI fumble. It’s there in the context, but the AI’s attention has moved on.
Reason 3: Tokens Add Up Faster Than You Think
You think: “200k tokens is huge, I’ll never hit that.”
Reality:
- Your initial prompt: 500 tokens
- A typical code file: 1,000-3,000 tokens
- Your explanation of the problem: 300 tokens
- AI’s response: 800 tokens
- Back-and-forth for 20 messages: 15,000 tokens
You’re at 20k already. Share a few more files, have a longer conversation, and suddenly you’re at 100k. Another 20 messages and you’re dropping context.
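Want to watch the arithmetic happen? Here's a rough sketch in TypeScript. The characters-divided-by-four heuristic is a common approximation for English prose; real tokenizers count differently (code is usually denser), so treat the output as an estimate, not a guarantee:

```typescript
// Rough token budgeting for a conversation. ~4 characters per token is a
// common approximation for English text; real tokenizers and code-heavy
// content will differ, so these are estimates only.

interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function remainingBudget(messages: Message[], windowSize = 200_000): number {
  const used = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  return windowSize - used;
}

// Example: warn before the window starts silently dropping early messages.
const history: Message[] = [
  { role: "system", content: "You are a coding assistant." },
  { role: "user", content: "Here is our UserService..." }, // plus pasted files
];

const left = remainingBudget(history);
if (left < 20_000) {
  console.warn(`Only ~${left} tokens left. Time to summarize and restart.`);
}
```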
The pattern: Context windows are technically large but practically limited. You’ll hit the limits faster than you expect, and when you do, the AI starts forgetting without telling you.
The Conversation That Ate Itself
Let me tell you about the worst debugging session of my life.
I’m working on a complex bug. Something to do with our caching layer and database synchronization. I start a conversation with Claude:
Message 1: “I’m working on a bug where our cache and database are out of sync…”
Messages 2-15: Explaining the architecture, sharing code, trying solutions.
Message 16: Claude suggests something. Doesn’t work. I explain why.
Messages 17-30: More back and forth. We’re making progress. The AI seems to understand the system now.
Message 31: AI suggests checking the cache invalidation in the UserService.
Me: “I already showed you the UserService code in message 8. The cache invalidation is there. That’s not the issue.”
Claude: “You’re right, my apologies. Looking at the UserService code you shared…”
Except here’s the thing: message 8 is probably gone now. The context window has been eating the early messages. The AI is pretending to look at code it can no longer see.
Message 45: We’re deep in the weeds. The AI suggests something that would have made sense if it remembered the architecture I explained at the start. But it doesn’t remember. So the suggestion is nonsense.
I realize what’s happening. The conversation has eaten itself. The early context—the architecture explanation, the first code samples, my initial problem description—is all gone. The AI is working with fragments.
I start over. New conversation. Summarize everything we learned. Get the solution in 10 more messages.
Total time wasted: 90 minutes on a conversation that forgot its own beginning.
The lesson: Long conversations don’t work. The AI’s memory is a sliding window, and you’re coding against that window’s limits whether you know it or not.
But here’s where it gets interesting: modern AI tools have gotten smarter about this. They can actually help manage their own memory now.
The Agentic Breakthrough
Something changed in late 2024: AI coding tools became agentic. They can now do things during conversations, not just respond to you.
What “agentic” means in practice:
- The AI can create and edit files in your project
- It can read existing files to understand context
- It can write notes to itself about important information
- It can update task lists and track progress
- It can maintain its own memory between messages
The shift: Instead of you manually managing the AI’s memory, the AI can now manage its own memory. You just need to show it how.
Example of what the AI can do now:
You: “We’re starting a new feature. Here’s what we need to build…”
AI: “Let me create a working notes file to track this.
[Creates WORKING_NOTES.md with project context, tech stack, and current objectives]
I’ve documented our starting point. As we work through this, I’ll update these notes with important findings and decisions. If we start a new conversation, I can read this file to understand where we left off.”
What just happened: The AI gave itself persistent memory. Not through magic, but through files.
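Under the hood, "agentic" is less magical than it sounds. Here's a minimal sketch of the kind of file tools an assistant gets handed. The tool names and shapes are illustrative (Cursor, Cline, and friends each define their own), but the mechanism is the same: the model asks to read or write a file, and the tool harness executes it against your project:

```typescript
// Illustrative file tools for an agentic assistant. The names are made up
// for this sketch; real tools define their own schemas.

import { existsSync, readFileSync, writeFileSync } from "node:fs";

const tools = {
  read_file(path: string): string {
    return existsSync(path) ? readFileSync(path, "utf-8") : "";
  },
  write_file(path: string, content: string): void {
    writeFileSync(path, content, "utf-8");
  },
};

// When the model decides to "create a working notes file", the harness
// receives a tool call like this and runs it on your real project:
tools.write_file(
  "WORKING_NOTES.md",
  "# Working Notes\n\n## Tech Stack\n- Node.js + Express + PostgreSQL\n"
);

// Next conversation, the model calls read_file("WORKING_NOTES.md") and
// gets its own notes back: persistent memory, implemented as a file.
console.log(tools.read_file("WORKING_NOTES.md"));
```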
Teaching the AI to Keep a Diary
Here’s the pattern that changes everything: Let the AI write down what it learns.
The setup:
Create a file in your project: AI_CONTEXT.md or WORKING_NOTES.md or whatever you want to call it.
Tell the AI:
As we work on this project, maintain notes in WORKING_NOTES.md about:
- Important architectural decisions
- Tech stack and patterns we're using
- Problems we've solved and how
- Current blockers and context
- Things that didn't work and why
Update this file as we go. If I start a new conversation, you can read it to understand the current state.
What the AI will do:
First conversation:
# Working Notes - TaskManager Pro
## Last Updated: 2025-01-09
## Tech Stack
- Frontend: React 18 + TypeScript
- Backend: Node.js + Express + PostgreSQL
- Currently working on: User authentication system
## Important Decisions
- Using JWT tokens with 15min expiry + refresh tokens
- Passwords hashed with bcrypt (12 rounds)
- All database ops go through Repository pattern
## Current Status
- Login endpoint working
- Registration endpoint needs email validation
- Password reset flow: TODO
## Gotchas Discovered
- Cache invalidation: Must happen in service layer, not controllers
- Database transactions: Always use Prisma's $transaction for multi-step ops
Second conversation (days later):
You: “I need to work on the password reset feature”
AI: [Reads WORKING_NOTES.md]
“I see we’re working on TaskManager Pro with Node/Express/PostgreSQL. The authentication system uses JWT tokens. Password reset is marked as TODO. Let me update the notes and we can start implementing…”
[Updates WORKING_NOTES.md with current work]
What changed: The AI has memory. Not real memory, but functional memory through files. It can remember your tech stack, your decisions, your patterns. Across conversations. Across days. As long as the file exists.
The beautiful part: You don’t maintain this file. The AI maintains it. You just tell it to, and it does.
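If you want to nudge the AI toward a consistent structure, you can seed the skeleton yourself. A small sketch using the filename and headings from this post (any conventions work, as long as you and the AI agree on them):

```typescript
// Seed WORKING_NOTES.md with an empty skeleton if it doesn't exist yet.
// Filename and headings follow the conventions used in this post.

import { existsSync, writeFileSync } from "node:fs";

const NOTES = "WORKING_NOTES.md";

if (!existsSync(NOTES)) {
  const skeleton = [
    "# Working Notes",
    `## Last Updated: ${new Date().toISOString().slice(0, 10)}`,
    "## Tech Stack",
    "## Important Decisions",
    "## Current Status",
    "## Gotchas Discovered",
  ].join("\n\n");
  writeFileSync(NOTES, skeleton + "\n", "utf-8");
  console.log(`Created ${NOTES}. Now tell the AI to maintain it.`);
}
```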
The Task List Pattern
Here’s another agentic pattern that works incredibly well: Let the AI manage its own task list.
The setup:
Create TASKS.md:
# TaskManager Pro - Development Tasks
## Authentication System
- [x] Implement login endpoint
- [x] Implement registration endpoint
- [ ] Add email validation to registration
- [ ] Implement password reset flow
  - [ ] Generate reset token
  - [ ] Send reset email
  - [ ] Validate reset token
  - [ ] Update password
- [ ] Add rate limiting to auth endpoints
## Cache Layer
- [x] Set up Redis connection
- [x] Implement cache service
- [ ] Fix cache invalidation on user updates (BLOCKER)
- [ ] Add cache warming for frequently accessed data
Tell the AI:
Track our progress in TASKS.md. As we complete tasks, check them off.
As we discover new tasks, add them. If we hit blockers, mark them as BLOCKER.
Use this to maintain continuity across conversations.
What happens:
During work, the AI updates the file:
## Authentication System
- [x] Implement login endpoint
- [x] Implement registration endpoint
- [x] Add email validation to registration ✅ (completed 2025-01-09)
- [ ] Implement password reset flow
  - [x] Generate reset token ✅
  - [x] Send reset email ✅
  - [ ] Validate reset token (IN PROGRESS)
New conversation next day:
You: “Continue working on the password reset feature”
AI: [Reads TASKS.md]
“I see we’ve completed the reset token generation and email sending. The next step is validating the reset token. Let me continue from there…”
The power: The AI knows exactly where you left off. What’s done. What’s in progress. What’s blocked. No re-explaining. No “where were we?” No lost context.
The irony: We gave the AI a to-do list and checkbox discipline, and now it’s more organized than most developers.
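A side benefit: because those checkboxes are standard GitHub-flavored Markdown, you can script against them. A quick sketch that reports progress without opening the file:

```typescript
// Count checkboxes in TASKS.md to get a progress summary.
// Works on any file that uses standard "- [ ]" / "- [x]" Markdown syntax.

import { readFileSync } from "node:fs";

const lines = readFileSync("TASKS.md", "utf-8").split("\n");

const done = lines.filter((l) => /^\s*- \[x\]/i.test(l)).length;
const open = lines.filter((l) => /^\s*- \[ \]/.test(l)).length;
const blocked = lines.filter((l) => l.includes("BLOCKER")).length;

console.log(`${done}/${done + open} tasks done, ${blocked} marked BLOCKER`);
```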
Building Your Project Memory
Let’s put this together into a system that actually works.
File 1: Project Context (The Permanent Memory)
PROJECT_CONTEXT.md:
# TaskManager Pro - Project Context
## Overview
Task management app with teams, assignments, and deadlines.
## Tech Stack
- Frontend: React 18 + TypeScript + Tailwind CSS
- Backend: Node.js 18 + Express + TypeScript
- Database: PostgreSQL 14 with Prisma ORM
- Cache: Redis 7
- Auth: JWT (access + refresh tokens)
- Deployment: Docker on AWS ECS
## Architecture
/src
  /components    (React components)
  /pages         (Page-level components)
  /services      (Business logic)
  /repositories  (Database layer)
  /middleware    (Express middleware)
  /utils         (Shared utilities)
## Key Patterns
- All API calls through ApiClient service
- All database ops through Repository classes
- Business logic in Service classes
- Controllers are thin routing layer
- Custom error classes from /errors
## Database Schema
[Tables: Users, Tasks, Teams, TeamMembers]
[Key relationships and constraints]
## Environment Setup
- Local: Docker Compose
- Staging: AWS ECS
- Prod: AWS ECS + RDS
This file doesn’t change much. It’s the reference manual. The AI reads it at the start of every conversation.
File 2: Working Notes (The Short-Term Memory)
WORKING_NOTES.md:
# Working Notes
## Last Updated: 2025-01-09 14:30
## Current Work
Implementing password reset functionality in authentication system.
## Recent Changes
- Added email validation to registration (2025-01-09)
- Fixed cache invalidation bug in UserService (2025-01-08)
- Implemented JWT refresh token rotation (2025-01-07)
## Active Problems
- Password reset emails not sending in production (works locally)
- Possible issue with SMTP configuration in AWS
- Need to check environment variables in ECS task definition
## Important Findings
- Cache keys must use format `user:${id}` not `user_${id}`
- Prisma transactions timeout after 5 seconds by default
- JWT tokens must include both userId and email for our middleware
- User.passwordHash field is non-nullable (by design for security)
## Next Steps
1. Debug email sending in production
2. Implement reset token validation
3. Add rate limiting to prevent abuse
4. Write tests for complete password reset flow
This file changes constantly. The AI updates it every conversation. It’s the journal of what’s happening now.
File 3: Task List (The Progress Tracker)
TASKS.md:
# Development Tasks
## 🔥 High Priority
- [ ] Fix password reset emails in production (BLOCKER)
- [ ] Implement rate limiting on auth endpoints (SECURITY)
## In Progress
- [ ] Password reset flow
  - [x] Generate reset token
  - [x] Send reset email (works locally, broken in prod)
  - [ ] Validate reset token
  - [ ] Update password
  - [ ] Add expiry check (tokens valid 1 hour)
## Completed Recently
- [x] Email validation on registration ✅ 2025-01-09
- [x] Cache invalidation fix ✅ 2025-01-08
- [x] JWT refresh tokens ✅ 2025-01-07
## Backlog
- [ ] Add 2FA support
- [ ] Implement session management
- [ ] Add OAuth providers (Google, GitHub)
This file tracks everything. What’s done, what’s happening, what’s next. The AI checks things off as you work.
How This Works in Practice
Start of day, new conversation:
You: “Continue working on the authentication system”
AI:
[Reads PROJECT_CONTEXT.md to understand tech stack]
[Reads WORKING_NOTES.md to see current state]
[Reads TASKS.md to see what's in progress]
I see we're working on the password reset flow. The email sending is working locally but failing in production. Based on the working notes, it might be an SMTP configuration issue in AWS.
Let me check the environment configuration...
During work:
AI updates WORKING_NOTES.md with findings. AI checks off completed items in TASKS.md. AI adds new tasks as they’re discovered.
End of day:
All context is saved in files. Tomorrow, start a fresh conversation, and the AI picks up exactly where you left off.
The magic: The AI manages its own memory through files. You don’t copy-paste context. You don’t summarize conversations. The AI does it.
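One caveat: this assumes your tool can read project files on its own. If you're stuck in a plain chat interface with no file access, you can still get most of the benefit by building the kickoff prompt yourself. A sketch of what that might look like:

```typescript
// Concatenate the three memory files into a kickoff prompt for a fresh
// conversation. Useful for chat interfaces that can't read files directly.

import { existsSync, readFileSync } from "node:fs";

const CONTEXT_FILES = ["PROJECT_CONTEXT.md", "WORKING_NOTES.md", "TASKS.md"];

function buildKickoffPrompt(): string {
  const sections = CONTEXT_FILES.filter(existsSync).map(
    (file) => `--- ${file} ---\n${readFileSync(file, "utf-8")}`
  );
  return [
    "Continue working on this project. Current state:",
    ...sections,
    "Pick up from the Next Steps in the working notes.",
  ].join("\n\n");
}

console.log(buildKickoffPrompt()); // paste the output into a new conversation
```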
When to Start Over (And How to Not Lose Everything)
Even with memory files, you’ll still hit the context window limit in very long conversations. But now, starting over doesn’t lose everything.
Signs you should start a new conversation:
Signal 1: The Conversation is Unwieldy
You’re past 50 messages. The chat is hard to scroll through. Starting fresh will be cleaner.
Signal 2: Topic Change
You finished the password reset feature and want to work on the cache layer. Different topic, different conversation.
Signal 3: The AI is Confused
It’s giving contradictory advice or forgetting recent decisions. Fresh start will help.
Signal 4: You Want to Clean Up
The conversation has debugging rabbit holes and false starts. A clean conversation will be easier to work with.
How to start over without losing context:
Old way (terrible): Open new conversation. Re-explain everything. Lose 15 minutes rebuilding context.
New way (smart):
You: “Let’s start fresh. First, update WORKING_NOTES.md with our current status and findings from this conversation.”
AI: [Updates WORKING_NOTES.md with everything important]
You: [Start new conversation]
New You: “Continue working on TaskManager Pro. Read PROJECT_CONTEXT.md, WORKING_NOTES.md, and TASKS.md to understand the current state.”
New AI: [Reads all three files] “I understand. We’re working on the authentication system. Password reset emails are failing in production. Let me continue from there…”
What just happened: Zero context loss. The new conversation knows everything the old conversation knew, because it’s all in files.
Time to transition: 2 minutes.
Context preserved: 100%.
This is the breakthrough. Memory through files. The AI can be amnesiac between conversations, but the files remember. And the AI can read files.
The Instruction File (Teaching the AI to Manage Itself)
You can make this automatic by putting the instructions in a .cursorrules or .clinerules file (or whatever your tool uses):
# Project Management Instructions
## Memory Management
- Maintain working notes in WORKING_NOTES.md
  - Update at the end of each session with key findings
  - Include current blockers and context
  - Note what worked and what didn't
- Track progress in TASKS.md
  - Check off completed tasks with date
  - Add new tasks as they're discovered
  - Mark blockers as (BLOCKER)
  - Keep completed tasks visible for context
- Read both files at the start of new conversations
  - Use them to understand current state
  - Continue from where we left off
  - No need for user to re-explain context
## Working Style
- Ask before making significant architectural changes
- Update task list as work progresses
- Document important findings in working notes
- Be explicit about what you're doing to files
## Code Style
[Your project-specific patterns]
Now the AI knows:
- Manage your own memory through files
- Update task lists as you work
- Read context at conversation start
- Keep notes about important findings
You don’t have to remind it. It just does it. Every conversation. Automatically.
The Future That Actually Arrived
Remember earlier when I said we wished for:
- AI that remembers across conversations
- AI that knows your codebase
- AI that learns from corrections
- AI with persistent memory
Turns out, we have it. Not through magic neural network memory, but through the AI being agentic enough to manage files.
What modern AI can do:
- Read files to understand context
- Write files to save information
- Update files to track progress
- Manage its own memory through documentation
The trick: It’s not real memory. It’s just files. But functionally? It works the same.
The AI reads its own notes from yesterday and continues where it left off. That’s memory, even if it’s implemented through Markdown files.
The beautiful part: This is better than neural network memory would be because:
- You can read the notes (transparency)
- You can edit the notes (control)
- You can version control the notes (history)
- You can share the notes with teammates (collaboration)
The AI’s “memory” is just project documentation that both humans and AI can read. That’s elegant. That’s actually better than black-box memory.
What This Means Practically
After adopting this workflow:
Time spent on context management: 5 minutes at end of each session (AI updates notes)
Time saved on context rebuilding: 15-30 minutes at start of each new conversation
Net gain: 10-25 minutes per session. Times 5 sessions per week. That’s 50-125 minutes per week.
More importantly:
- No more “where were we?”
- No more re-explaining architecture
- No more lost context from long conversations
- Continuity across days and weeks
The workflow:
Monday morning:
- Start new conversation
- “Read PROJECT_CONTEXT.md, WORKING_NOTES.md, and TASKS.md”
- AI knows exactly where you left off Friday
- Continue working seamlessly
During work:
- AI updates notes with findings
- AI checks off completed tasks
- AI adds new tasks as discovered
End of day:
- “Update WORKING_NOTES.md with today’s progress”
- AI summarizes the day
- Walk away, everything is saved
Next day:
- Fresh conversation
- Full context restored in 30 seconds
- Keep working
The shift: From fighting the AI’s amnesia to working with it. The AI forgets between conversations, but the files remember. And the AI can read files.
The Uncomfortable Truth
Here’s what nobody wants to admit: the AI’s lack of native memory might be a feature, not a bug.
Why file-based memory is better:
You Can Audit It
Neural network memory is a black box. File-based memory? You can read it. You know what the AI “remembers.” No mysteries.
You Can Edit It
The AI got something wrong and saved it to notes? Edit the file. Fix it. The AI will read the corrected version.
You Can Version Control It
Git tracks your AI’s memory. You can see how understanding evolved. You can rollback if needed.
You Can Share It
New team member? They read the same files. Everyone has the same context. No “but the AI told me something different.”
It Forces Documentation
The AI’s memory IS your documentation. You’re documenting as you work. That helps everyone, not just the AI.
The downside: You’re one layer removed from “true” AI memory. The AI is managing files, not remembering natively.
The upside: That layer is visible, editable, versionable, and shareable. Which makes it better.
The verdict: We don’t need neural network memory when we have agentic file management. This might actually be the better solution.
What This Actually Looks Like
Here’s a real Friday → Monday transition:
Friday 5pm, end of conversation:
Me: “Update working notes with what we learned today”
AI: [Updates WORKING_NOTES.md]
## Last Updated: 2025-01-09 17:00
## Today's Work
- Fixed password reset email issue
- Root cause: SMTP port misconfigured in production (was 25, should be 587)
- Solution: Updated ECS task definition with correct SMTP_PORT env var
- Deployed and verified working in production
## Current Status
- Password reset flow fully working
- All auth endpoints have rate limiting (10 req/min)
- Ready to move to next feature
## Next Steps
- Start work on 2FA implementation
- Will need to add `twoFactorSecret` field to User table
- Consider using speakeasy library for TOTP
Monday 9am, new conversation:
Me: “Read the context files and let’s continue”
AI: [Reads PROJECT_CONTEXT.md, WORKING_NOTES.md, TASKS.md]
“Good morning. I see we completed the password reset flow on Friday. The email issue is resolved (SMTP port fix), and all auth endpoints now have rate limiting.
The working notes indicate next step is 2FA implementation. We’ll need to add a twoFactorSecret field to the User table and you’re considering the speakeasy library for TOTP generation.
Should we start by updating the database schema?”
What happened: Zero context rebuilding. The AI knows exactly where we are. I didn’t re-explain anything. The files did all the work.
Time to resume work: 30 seconds.
This is the workflow. This is what actually works.