Street Learner

Author

Last Updated: a year ago

LangGraph Part 3: Checkpointing, Memory Persistence & State Snapshots in Production AI Systems

In the previous two parts, we built a strong foundation of LangGraph fundamentals—nodes, edges, message states, conditional routing, reducers, summarization loops, and graph orchestration.

Now we move to a more advanced topic: adding checkpointing and persistence so that your LangGraph pipeline becomes:

resumable
stateful
multi-session (one isolated thread per conversation)
fault tolerant
database-backed

This shift moves LangGraph from a learning tool to a production-ready workflow engine.

We will cover five major concepts:

1️⃣ Understanding Checkpointing in LangGraph

A checkpoint is a saved execution state.

When an LLM graph runs, each step produces:

updated state values
updated messages
summary text
metadata
execution progress

Without checkpointing, this information disappears after execution.

Checkpointing allows:

pause & resume
step back in time
multiple independent threads (conversations) on one graph
fault recovery
parallel user sessions

In real applications—customer chatbots, research agents, retrieval pipelines—you must preserve state between runs.
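To make the idea concrete, here is a minimal plain-Python sketch of checkpoint-and-resume. It is a conceptual model only, not how LangGraph is implemented: `CheckpointStore`, `run_steps`, and the two step functions are hypothetical names invented for this illustration.

```python
from copy import deepcopy
from typing import Callable, Optional

State = dict
Step = Callable[[State], State]


class CheckpointStore:
    """Keeps every saved state per thread ID (illustrative, in-memory only)."""

    def __init__(self) -> None:
        self._history: dict = {}

    def save(self, thread_id: str, state: State) -> None:
        # Deep-copy so later mutations cannot change a saved checkpoint.
        self._history.setdefault(thread_id, []).append(deepcopy(state))

    def latest(self, thread_id: str) -> Optional[State]:
        saved = self._history.get(thread_id)
        return deepcopy(saved[-1]) if saved else None


def run_steps(steps: list, store: CheckpointStore, thread_id: str) -> State:
    """Run steps in order, skipping any that were already checkpointed."""
    state = store.latest(thread_id) or {"done": 0, "log": []}
    for step in steps[state["done"]:]:
        state = step(state)
        state["done"] += 1
        store.save(thread_id, state)  # one checkpoint per completed step
    return state


def step_a(state: State) -> State:
    state["log"].append("a")
    return state


attempts = {"count": 0}


def flaky_step_b(state: State) -> State:
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("simulated crash")  # fails on the first attempt only
    state["log"].append("b")
    return state


store = CheckpointStore()
steps = [step_a, flaky_step_b]

try:
    run_steps(steps, store, thread_id="demo")
except RuntimeError:
    pass  # step_a's checkpoint survived the crash

# The second call resumes after step_a instead of starting over.
final_state = run_steps(steps, store, thread_id="demo")
print(final_state)
```

The point of the sketch is the shape of the behavior: work finished before the failure is not repeated, because the saved state records how far execution got.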

LangGraph supports multiple checkpointing backends:

InMemorySaver
SqliteSaver (from the separate langgraph-checkpoint-sqlite package)

We will build both today.

2️⃣ Threads in LangGraph

Threads allow multiple isolated executions to run through the same graph logic.

Why does this matter?

Imagine a SaaS AI product where:

user A chats → summary saved
user B chats → separate context
user C resumes days later

Each thread is identified by a thread_id passed in the run config:

config = {"configurable": {"thread_id": "unique_id"}}

For every thread, LangGraph:

maintains independent message history
assigns checkpoints to different IDs
prevents state leakage

This is production architecture for:

multi-user chatbot platforms
per-customer knowledge models
agent swarms
cloud hosted AI SaaS
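In a multi-user application you will usually build that config from something you already have, such as a user or session ID. A small sketch follows; only the `configurable` / `thread_id` shape comes from the config shown above, while the helper names and the `user:suffix` naming scheme are assumptions made for illustration.

```python
import uuid


def thread_config(thread_id: str) -> dict:
    """Build the config dict that selects one conversation thread."""
    return {"configurable": {"thread_id": thread_id}}


def new_thread_config(user_id: str) -> dict:
    """Start a fresh thread for a user by appending a random suffix."""
    return thread_config(f"{user_id}:{uuid.uuid4().hex}")


# A returning user reuses a stable ID; a new conversation gets a fresh one.
resume_cfg = thread_config("user-42")
fresh_cfg = new_thread_config("user-42")
print(resume_cfg)
```

Reusing the same thread_id is what lets a user resume days later. Generating a new one is how you deliberately start from a clean slate.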

3️⃣ Short-Term Memory using InMemorySaver

When you do not need database persistence and want fast runtime memory, use:

from langgraph.checkpoint.memory import InMemorySaver

When to use it:

testing
prototyping
short workflows
ephemeral execution

Short-term memory stores:

the checkpointed state for each step of a run
messages
summary
metadata

The trade-offs:

checkpoints are lost when the script restarts
a conversation cannot be resumed across sessions

That is acceptable for early builds.

4️⃣ The StateSnapshot Class

Once execution runs with checkpointing enabled, LangGraph stores a snapshot for each step.

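Each snapshot is a StateSnapshot object that exposes:

the full state values at that point, including the stored messages and summary
the next node(s) to run
the config that identifies the thread and checkpoint
metadata, including the step number
a creation timestamp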

Why is this valuable?

debugging
visualization
inspecting conversations
retrieving intermediate values
analytics
version control of LLM decisions

This becomes essential when auditing real-world applications.
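As a rough sketch of how those fields can be read for debugging, the helper below pulls a few values out of one snapshot. It relies only on the `values`, `next`, and `metadata` attributes. `describe_snapshot` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in used so the example runs without a graph or an API key.

```python
from types import SimpleNamespace


def describe_snapshot(snapshot) -> dict:
    """Collect the fields most useful for debugging from one snapshot."""
    values = snapshot.values
    metadata = snapshot.metadata or {}
    return {
        "step": metadata.get("step"),
        "next": tuple(snapshot.next),
        "message_count": len(values.get("messages", [])),
        "summary_preview": values.get("summary", "")[:60],
    }


# Hypothetical stand-in with the same attribute names as a real snapshot.
fake_snapshot = SimpleNamespace(
    values={"messages": ["hi", "hello"], "summary": "Greeting exchanged."},
    next=("summarize_messages",),
    metadata={"step": 2},
)
info = describe_snapshot(fake_snapshot)
print(info)
```

With a compiled graph and a checkpointer, the same helper should accept the object returned by `graph.get_state(config)`.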

5️⃣ Long-Term Memory with SqliteSaver

SQLite persistence is one of LangGraph’s most powerful features.

It allows your graph to store state across:

app restarts
server crashes
user sessions
long-term deployment

SqliteSaver creates its own tables in the database and stores:

serialized state (messages and summary)
checkpoint records
execution metadata
a thread ID on every row, which keeps threads separate

It transforms LangGraph workflows from demo code into real AI applications.
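The difference from in-memory storage is easiest to see with the standard library alone. The sketch below is not SqliteSaver and does not use its schema: the `demo_state` table and the helper names are assumptions made for illustration. It only shows why state written to a SQLite file is still there after the original connection is gone.

```python
import json
import os
import sqlite3
import tempfile


def _open(db_path: str) -> sqlite3.Connection:
    """Open the database and make sure the demo table exists."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS demo_state "
        "(thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return con


def save_state(db_path: str, thread_id: str, state: dict) -> None:
    """Store one thread's latest state as JSON, replacing any older row."""
    con = _open(db_path)
    try:
        with con:  # commits on success, rolls back on error
            con.execute(
                "INSERT OR REPLACE INTO demo_state (thread_id, state) VALUES (?, ?)",
                (thread_id, json.dumps(state)),
            )
    finally:
        con.close()


def load_state(db_path: str, thread_id: str):
    """Return the saved state for a thread, or None if nothing was saved."""
    con = _open(db_path)
    try:
        row = con.execute(
            "SELECT state FROM demo_state WHERE thread_id = ?", (thread_id,)
        ).fetchone()
    finally:
        con.close()
    return json.loads(row[0]) if row else None


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "demo.db")
    # "Session 1": save, then close the connection (as if the app shut down).
    save_state(db_path, "user-a", {"summary": "Talked about the internet."})
    # "Session 2": a brand-new connection still finds the saved state.
    restored = load_state(db_path, "user-a")
    missing = load_state(db_path, "user-b")

print(restored, missing)
```

SqliteSaver gives you the same durability for full checkpoints, with serialization and thread separation handled for you.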

FULL TECHNICAL IMPLEMENTATION

Below is the complete implementation, explained section by section.

SECTION 0 – Setup & Graph Definitions

What happens here?

We:

Import libraries
Load API keys
Initialize model
Create shared state
Define processing nodes
Build a graph structure

import os
import sqlite3

from dotenv import load_dotenv
from langchain_core.messages import AIMessage, HumanMessage, RemoveMessage, SystemMessage
from langchain_openai.chat_models import ChatOpenAI
from langgraph.checkpoint.memory import InMemorySaver
# SqliteSaver ships in the separate langgraph-checkpoint-sqlite package.
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import END, START, MessagesState, StateGraph

print(f"{'='*30}\nSECTION 0: Setup & Graph Definitions\n{'='*30}")

# Load OPENAI_API_KEY (and any other settings) from a local .env file.
load_dotenv()

We now load the model:

chat = ChatOpenAI(
    model="gpt-4o",
    seed=365,                   # fixed seed for more repeatable output
    temperature=0,
    max_completion_tokens=100,  # caps every reply, including summaries, at 100 tokens
)

Now define a shared application State:

class State(MessagesState):
    summary: str

Meaning:

messages (inherited from MessagesState) stores the chat messages
summary stores the running conversation summary

We now define the nodes.

Node 1 — ask_question()

This node feeds a new human question into the conversation:

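def ask_question(state: State) -> State:
    print("\n-------> ENTERING ask_question:")
    question = "What is your question?"
    print(question)

    # Simulate the user: an opening question on the first turn,
    # a follow-up once a summary from an earlier turn exists.
    if not state.get("summary"):
        user_input = "Tell me about the history of the internet."
    else:
        user_input = "That's cool. Who invented the web?"

    print(f"(Simulated Input): {user_input}")

    # Return only the new message; the messages reducer appends it to the history.
    return {"messages": [HumanMessage(user_input)]}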

Node 2 — chatbot()

This node generates the response:

def chatbot(state: State) -> State:
    print("\n-------> ENTERING chatbot:")

    # Inject the running summary so the model keeps earlier context
    # even after the raw messages have been removed.
    summary = state.get("summary", "")
    system_message = f'''
    Here's a quick summary of what's been discussed so far:
    {summary}

    Keep this in mind as you answer the next question.
    '''

    messages = [SystemMessage(system_message)] + state["messages"]
    response = chat.invoke(messages)
    response.pretty_print()

    return {"messages": [response]}

Node 3 — summarize_messages()

This node compresses the conversation into a running summary:

def summarize_messages(state: State) -> State:
    print("\n-------> ENTERING summarize_messages:")

    # Flatten the current messages into "role: content" lines for the prompt.
    new_conversation = ""
    for message in state["messages"]:
        new_conversation += f"{message.type}: {message.content}\n\n"

    summary_instructions = f'''
    Update the ongoing summary by incorporating the new lines of conversation below.
    Build upon the previous summary rather than repeating it,
    so that the result reflects the most recent context and developments.
    Respond only with the summary.

    Previous Summary:
    {state.get("summary", "")}

    New Conversation:
    {new_conversation}
    '''

    summary = chat.invoke([HumanMessage(summary_instructions)])
    print(f"--- Updated Summary: {summary.content[:50]}... ---")

    # RemoveMessage entries tell the messages reducer to delete these messages by ID.
    remove_messages = [RemoveMessage(id=message.id) for message in state["messages"]]

    return {"messages": remove_messages, "summary": summary.content}

This ensures that:

the message list stays short
the summary carries the long-term context forward
raw messages are removed once they have been summarized

Summarize-then-trim is a common pattern in production chatbots.

Build Graph Function

The same graph is reused across all persistence backends:

def build_graph():
    graph = StateGraph(State)
    graph.add_node("ask_question", ask_question)
    graph.add_node("chatbot", chatbot)
    graph.add_node("summarize_messages", summarize_messages)

    graph.add_edge(START, "ask_question")
    graph.add_edge("ask_question", "chatbot")
    graph.add_edge("chatbot", "summarize_messages")
    graph.add_edge("summarize_messages", END)
    return graph

SECTION 1 – Short-Term Memory using InMemorySaver

print(f"\n{'='*30}\nSECTION 1: InMemorySaver (Short-Term)\n{'='*30}")

memory_checkpointer = InMemorySaver()

graph_memory = build_graph().compile(checkpointer=memory_checkpointer)

config1 = {"configurable": {"thread_id": "1"}}
config2 = {"configurable": {"thread_id": "2"}}

print("--- Thread 1 Execution ---")
graph_memory.invoke(State(messages=[], summary=""), config1)

print("\n--- Thread 2 Execution (Independent) ---")
graph_memory.invoke(State(messages=[], summary=""), config2)

Results:

threads 1 and 2 run independently
checkpoints are held in RAM
no database file is created

SECTION 2 – Inspecting History with StateSnapshot

print(f"\n{'='*30}\nSECTION 2: State Snapshots (Inspecting History)\n{'='*30}")

# get_state_history yields snapshots newest-first.
graph_states = list(graph_memory.get_state_history(config1))

print(f"Number of snapshots found: {len(graph_states)}")

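From each snapshot you can read:

the intermediate state at that step
the messages stored so far (and therefore their count)
the summary value
the node that runs next, which reveals the execution order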

This is debugging gold.
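Two small helpers make a history like this easier to work with. They assume the list is ordered newest-first and that each snapshot has a `next` tuple. `snapshot_before` and `chronological` are hypothetical names, and the stand-in objects exist only so the sketch runs without a graph.

```python
from types import SimpleNamespace
from typing import Iterable


def snapshot_before(history: Iterable, node_name: str):
    """Return the newest snapshot that is about to run `node_name`, if any."""
    for snapshot in history:  # assumed to be ordered newest-first
        if node_name in snapshot.next:
            return snapshot
    return None


def chronological(history: Iterable) -> list:
    """Reverse a newest-first history so it reads oldest to newest."""
    return list(reversed(list(history)))


# Hypothetical stand-ins, ordered newest-first like a real history.
fake_history = [
    SimpleNamespace(next=(), metadata={"step": 3}),
    SimpleNamespace(next=("summarize_messages",), metadata={"step": 2}),
    SimpleNamespace(next=("chatbot",), metadata={"step": 1}),
    SimpleNamespace(next=("ask_question",), metadata={"step": 0}),
]

before_chatbot = snapshot_before(fake_history, "chatbot")
ordered_steps = [s.metadata["step"] for s in chronological(fake_history)]
print(before_chatbot.metadata["step"], ordered_steps)
```

With the real `graph_states` list, a snapshot found this way also carries a `config`. Invoking the graph with `None` as input and that config is the usual way to replay from that point, though it is worth verifying the exact behavior against the LangGraph version you run.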

SECTION 3 – Long-Term Memory with SqliteSaver

print(f"\n{'='*30}\nSECTION 3: Long-Term Persistence (SQLite)\n{'='*30}")

db_path = "langgraph_memory.db"
con = sqlite3.connect(database=db_path, check_same_thread=False)

sqlite_checkpointer = SqliteSaver(con)

graph_sqlite = build_graph().compile(checkpointer=sqlite_checkpointer)

config_sqlite = {"configurable": {"thread_id": "persistence_demo_1"}}

# Pass only new input: sending summary="" here would overwrite the saved summary on every rerun.
graph_sqlite.invoke({"messages": []}, config_sqlite)

This gives you:

persistent agent memory across reboots
stored summaries
stored messages history
resumable agent state

Your .db file now contains:

state value history
message IDs
summaries
metadata
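If you want to see for yourself what ended up in the file, the standard library is enough. The helper below lists every user table with its row count. It makes no assumption about which tables the checkpointer creates. `table_row_counts` is a hypothetical helper, and the `example` table exists only for this demo.

```python
import sqlite3


def table_row_counts(con: sqlite3.Connection) -> dict:
    """Map each user table in the database to its row count."""
    names = [
        row[0]
        for row in con.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    # Names come from sqlite_master itself, so quoting them is sufficient here.
    return {
        name: con.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


# Demo on an in-memory database with a made-up table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, note TEXT)")
demo.executemany("INSERT INTO example (note) VALUES (?)", [("first",), ("second",)])
counts = table_row_counts(demo)
demo.close()
print(counts)
```

Calling `table_row_counts(con)` with the connection from this section should show the row counts growing as you run more turns. The table names you see depend on the installed checkpointer version.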

⭐ FINAL FULL REAL-WORLD EXAMPLE ⭐

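This script:

initializes the model
builds the graph
stores checkpoints in SQLite
runs one turn on a fixed thread each time it is executed
picks up the saved context on later runs
prints the resulting summary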

import sqlite3

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import END, START, MessagesState, StateGraph

# SQLite connection; the .db file is created on first run and reused afterwards
con = sqlite3.connect("memory_persist_demo.db", check_same_thread=False)

# LLM (reads OPENAI_API_KEY from the environment)
llm = ChatOpenAI(model="gpt-4o")

class State(MessagesState):
    summary: str

def ask(state: State):
    return {"messages": [HumanMessage("Explain AI agents")]}

def answer(state: State):
    messages = state["messages"]
    res = llm.invoke(messages)
    return {"messages": [res]}

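def summarize(state: State):
    # Naive "summary": the full transcript so far. Messages are never removed
    # in this example, so it gets longer on every run.
    summary = "\n".join([m.content for m in state["messages"]])
    return {"summary": summary}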

graph = StateGraph(State)
graph.add_node("ask", ask)
graph.add_node("answer", answer)
graph.add_node("summarize", summarize)
graph.add_edge(START, "ask")
graph.add_edge("ask", "answer")
graph.add_edge("answer", "summarize")
graph.add_edge("summarize", END)

compiled = graph.compile(checkpointer=SqliteSaver(con))

config = {"configurable": {"thread_id": "A123"}}

# Pass only new input so the state saved for this thread is not reset.
out = compiled.invoke({"messages": []}, config)
print("SUMMARY SAVED:", out["summary"])

Run this file multiple times.

You will notice that:

the summary grows with each run
each response can see the earlier messages
the previous state is loaded from the database

This is real AI persistence.

Conclusion

With Part 3 complete, you now know how to build production-oriented LangGraph pipelines with:

isolated execution per thread
checkpointed state
short-term, in-RAM memory with InMemorySaver
durable SQLite storage with SqliteSaver
snapshot inspection for debugging

This is production engineering for:

conversational AI
customer support bots
autonomous agents
knowledge workers
data analytics

