Memory Fundamentals

Understanding the core concepts of how AI agents store, retrieve, and use information over time.

ELI5: Agent Memory

Think of AI agent memory like a human brain during a conversation. You remember what was said earlier (short-term memory), you can recall facts you learned years ago (long-term memory), and you can look things up in books when needed (external memory). AI agents work similarly - they keep track of recent interactions, store important information for later use, and can search databases or documents when they need specific facts.

Short-Term Memory
Immediate context and recent interactions
  • Context window (tokens in the current conversation)
  • Recent user messages and AI responses
  • Current task state and progress
  • Temporary variables and calculations

Example: ChatGPT remembering your previous questions in the same conversation
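
Below is a minimal sketch of a short-term buffer in Python. The `ShortTermMemory` class name and the fixed turn limit are illustrative assumptions, not a specific library's API; the point is that old turns simply fall out of working memory once the buffer is full.

```python
from collections import deque

class ShortTermMemory:
    """Rolling buffer of recent conversation turns (illustrative sketch).

    Keeps only the most recent `max_turns` exchanges, mimicking how an
    agent's working memory is bounded by its context window.
    """

    def __init__(self, max_turns: int = 20):
        self.turns = deque(maxlen=max_turns)  # oldest turns fall off automatically
        self.task_state = {}                  # scratch space for the current task

    def add_turn(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

    def as_messages(self) -> list[dict]:
        return list(self.turns)

memory = ShortTermMemory(max_turns=4)
memory.add_turn("user", "What's the capital of France?")
memory.add_turn("assistant", "Paris.")
print(memory.as_messages())
```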

Long-Term Memory
Persistent storage across sessions
  • User preferences and patterns
  • Historical conversation summaries
  • Learned facts and relationships
  • Domain-specific knowledge

Example: GitHub Copilot learning your coding style over time
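
A toy sketch of session-spanning persistence, assuming a JSON file as the backing store (real agents typically use a database or vector store). The schema with `preferences` and `summaries` keys is invented here for illustration.

```python
import json
from pathlib import Path

class LongTermMemory:
    """Persists user preferences and conversation summaries across sessions.

    Backed by a JSON file for simplicity; production systems typically
    use a database or vector store instead.
    """

    def __init__(self, path: str = "agent_memory.json"):
        self.path = Path(path)
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {
            "preferences": {}, "summaries": []
        }

    def remember_preference(self, key: str, value: str) -> None:
        self.data["preferences"][key] = value
        self._save()

    def add_summary(self, summary: str) -> None:
        self.data["summaries"].append(summary)
        self._save()

    def _save(self) -> None:
        self.path.write_text(json.dumps(self.data, indent=2))

ltm = LongTermMemory()
ltm.remember_preference("language", "Python")
ltm.add_summary("User asked about agent memory architectures.")
```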

Types of Agent Memory
Different approaches to storing and retrieving information

Parametric Memory

Knowledge encoded in the model's weights during training. This is like your brain's built-in knowledge.

  • GPT-4 knowledge
  • Claude training data
  • Model parameters

Contextual Memory

Information held in the current context window. Limited by token limits but immediately accessible.

  • Conversation history
  • System prompts
  • Current documents
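
The sketch below shows one common way to assemble a context that respects a token budget: always keep the system prompt, then add turns newest-first until the budget runs out. The ~4-characters-per-token estimate is a deliberate simplification; a real implementation would use the model's tokenizer.

```python
def build_context(system_prompt: str, history: list[dict],
                  budget_tokens: int = 8000) -> list[dict]:
    """Assemble a prompt that fits a fixed token budget, dropping the
    oldest turns first. Token counts are crudely estimated."""

    def est_tokens(text: str) -> int:
        return len(text) // 4 + 1  # rough heuristic, not a real tokenizer

    remaining = budget_tokens - est_tokens(system_prompt)
    kept: list[dict] = []
    for turn in reversed(history):          # walk newest -> oldest
        cost = est_tokens(turn["content"])
        if cost > remaining:
            break                           # older turns no longer fit
        kept.append(turn)
        remaining -= cost
    return [{"role": "system", "content": system_prompt}] + kept[::-1]
```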

External Memory

Information stored outside the model that can be retrieved when needed. Like having access to a library.

  • Vector databases
  • Knowledge graphs
  • File systems
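
A tiny in-memory vector store illustrating retrieval by cosine similarity. The `embed` parameter is a placeholder for any text-to-vector function (for example, an embedding-model API call); production systems would use FAISS, pgvector, or a managed vector database instead.

```python
import numpy as np

class VectorStore:
    """Minimal in-memory vector store (illustrative sketch)."""

    def __init__(self, embed):
        self.embed = embed                  # any text -> fixed-size vector function
        self.texts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, text: str) -> None:
        self.texts.append(text)
        self.vectors.append(self.embed(text))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = self.embed(query)
        sims = [
            float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
            for v in self.vectors
        ]
        top = np.argsort(sims)[::-1][:k]    # highest cosine similarity first
        return [self.texts[i] for i in top]
```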

Episodic Memory

Memories of specific events and experiences, often with temporal and contextual information.

  • Conversation logs
  • Task histories
  • User interactions
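
A sketch of an episodic log: each entry records what happened, when, and in what context, and recall filters by time. The `Episode` and `EpisodicMemory` names are illustrative, not from any particular framework.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Episode:
    """One recorded event, stamped with when and in what context it happened."""
    what: str
    context: dict
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class EpisodicMemory:
    def __init__(self):
        self.episodes: list[Episode] = []

    def record(self, what: str, **context) -> None:
        self.episodes.append(Episode(what, context))

    def recall(self, since: datetime) -> list[Episode]:
        """Return episodes recorded after a given time, newest first."""
        return sorted(
            (e for e in self.episodes if e.when >= since),
            key=lambda e: e.when, reverse=True,
        )

log = EpisodicMemory()
log.record("user asked about RAG", session="abc123", topic="retrieval")
```
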
Video Resources
Recent videos explaining memory concepts

Attention Is All You Need - Explained

Deep dive into the transformer architecture that powers modern AI memory

Yannic Kilcher · 1:08:00 · 2024

RAG vs Long Context: When to Use What

Practical comparison of retrieval vs context window approaches

AI Explained · 18:32 · 2024

Building Memory-Enabled AI Agents

Hands-on tutorial for implementing agent memory systems

LangChain · 45:20 · 2024

Research Papers
Key papers on memory systems and architectures

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

The foundational RAG paper that started the retrieval revolution

Lewis et al., NeurIPS 2020

MemGPT: Towards LLMs as Operating Systems

Novel approach to managing memory hierarchies in LLMs

Packer et al., arXiv 2023

Lost in the Middle: How Language Models Use Long Contexts

Critical analysis of how models actually use long context windows

Liu et al., TACL 2024

The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

Important findings about bidirectional memory in language models

Berglund et al., arXiv 2023

Key Challenges in Agent Memory
Common problems and limitations to understand

Context Window Limits

Even large models have finite context windows. GPT-4 Turbo's 128k tokens sounds generous, but it still runs out in long conversations or when working over large documents.
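
One common mitigation, sketched below, is to compact older turns into a summary once the history grows too long. The `summarize` callable stands in for an LLM call here and is an assumption, not a specific API.

```python
def compact_history(history: list[dict], summarize,
                    max_turns: int = 50) -> list[dict]:
    """When the history grows past `max_turns`, fold the oldest turns
    into a single summary turn and keep the recent ones verbatim."""
    if len(history) <= max_turns:
        return history
    old, recent = history[:-max_turns], history[-max_turns:]
    summary = summarize("\n".join(t["content"] for t in old))
    return [{"role": "system",
             "content": f"Summary of earlier conversation: {summary}"}] + recent
```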

Memory Consistency

Ensuring that stored memories remain accurate and don't contradict each other over time.

Retrieval Accuracy

Finding the right information at the right time. Vector similarity doesn't always match semantic relevance.
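
One frequently used mitigation is hybrid scoring: blend embedding similarity with a lexical signal so purely topical matches don't crowd out literal ones. The sketch below uses naive keyword overlap as a stand-in for BM25, and the 0.7 weighting is an illustrative default, not a tuned value.

```python
def hybrid_score(query: str, doc: str, vec_sim: float,
                 alpha: float = 0.7) -> float:
    """Blend vector similarity with simple keyword overlap.

    Pure embedding similarity can surface topically related but
    irrelevant text; mixing in a lexical score is one common fix.
    """
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    keyword_overlap = len(q_terms & d_terms) / max(len(q_terms), 1)
    return alpha * vec_sim + (1 - alpha) * keyword_overlap
```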

Privacy & Security

Protecting sensitive information while maintaining useful memory capabilities across sessions.