LangGraph Part 3: Checkpointing, Memory Persistence & State Snapshots in Production AI Systems
In the previous two parts, we built a strong foundation of LangGraph fundamentals—nodes, edges, message states, conditional routing, reducers, summarization loops, and graph orchestration.
Now we move to a more advanced topic: integrating checkpointing and persistence so that your LangGraph pipeline becomes:
resumable
stateful
multi-threaded (many independent conversation threads, each identified by its own thread_id)
fault tolerant
database-backed
This shift moves LangGraph from a learning tool to a production-ready workflow engine.
We will cover four major concepts:
1️⃣ Understanding Checkpointing in LangGraph
A checkpoint is a saved execution state.
When an LLM graph runs, each step produces:
updated state values
updated messages
summary text
metadata
execution progress
Without checkpointing, this information disappears after execution.
Checkpointing allows:
pause & resume
step back in time
multi-threading (separate conversation threads, each with its own saved state)
fault recovery
parallel user sessions
In real applications—customer chatbots, research agents, retrieval pipelines—you must preserve state between runs.
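To make the idea concrete, here is a minimal pure-Python sketch of checkpointing keyed by a thread ID. It is a toy model under stated assumptions, not LangGraph's implementation: `ToyCheckpointer`, `run_steps`, and the two step functions are hypothetical names invented for illustration, and a real checkpointer stores richer data. The sketch shows a run that crashes partway, resumes from the last saved step, and leaves a second thread untouched.

```python
import copy
from collections import defaultdict


class ToyCheckpointer:
    """Hypothetical in-memory store: one list of snapshots per thread ID."""

    def __init__(self):
        self._history = defaultdict(list)

    def save(self, thread_id, step, state):
        # Deep-copy so later mutations cannot alter a saved snapshot.
        self._history[thread_id].append(
            {"step": step, "state": copy.deepcopy(state)}
        )

    def latest(self, thread_id):
        snapshots = self._history.get(thread_id)
        return snapshots[-1] if snapshots else None


def run_steps(steps, checkpointer, thread_id, initial_state):
    """Run steps in order, resuming after the last checkpointed step."""
    last = checkpointer.latest(thread_id)
    state = copy.deepcopy(last["state"]) if last else dict(initial_state)
    start = last["step"] + 1 if last else 0
    for index in range(start, len(steps)):
        state = steps[index](state)
        checkpointer.save(thread_id, index, state)
    return state


def add_greeting(state):
    return {**state, "log": state["log"] + ["greeted"]}


calls = {"count": 0}


def flaky_lookup(state):
    # Fails on its first call only, to simulate a crash mid-run.
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("simulated crash")
    return {**state, "log": state["log"] + ["looked up"]}


steps = [add_greeting, flaky_lookup]
store = ToyCheckpointer()

try:
    run_steps(steps, store, "user-a", {"log": []})
except RuntimeError:
    pass  # step 0 was checkpointed before the crash

# Second attempt on the same thread skips step 0 and resumes at step 1.
resumed = run_steps(steps, store, "user-a", {"log": []})

# A different thread ID starts from its own initial state.
other = run_steps(steps, store, "user-b", {"log": ["hello from b"]})

print(resumed["log"])
print(other["log"])
```

The design choice to note is that the thread ID is the only key: the same step functions serve every session, and all per-session data lives in the store.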
The first node, ask_question(), feeds a new human question into the conversation:
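from langchain_core.messages import HumanMessage, RemoveMessage, SystemMessage

# State (the graph's state schema) and chat (the chat model) are assumed to be
# defined as in the earlier parts of this series.

def ask_question(state: State) -> State:
    print("\n-------> ENTERING ask_question:")
    question = "What is your question?"
    print(question)
    # Simulate user input: open with a first question, then follow up
    # once a summary of the earlier exchange exists.
    if not state.get("summary"):
        user_input = "Tell me about the history of the internet."
    else:
        user_input = "That's cool. Who invented the web?"
    print(f"(Simulated Input): {user_input}")
    # Return only the new message; the state's message reducer merges it in.
    return {"messages": [HumanMessage(user_input)]}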
Node 2 — chatbot()
This node generates the response:
def chatbot(state: State) -> State:
    print("\n-------> ENTERING chatbot:")
    summary = state.get("summary", "")
    system_message = f'''
    Here's a quick summary of what's been discussed so far:
    {summary}
    Keep this in mind as you answer the next question.
    '''
    messages = [SystemMessage(system_message)] + state["messages"]
    response = chat.invoke(messages)
    response.pretty_print()
    return {"messages": [response]}
Node 3 — summarize_messages()
This node compresses the conversation into a running summary:
def summarize_messages(state: State) -> State:
    print("\n-------> ENTERING summarize_messages:")
    # Flatten the current messages into a plain-text transcript.
    new_conversation = ""
    for i in state["messages"]:
        new_conversation += f"{i.type}: {i.content}\n\n"
    summary_instructions = f'''
    Update the ongoing summary by incorporating the new lines of conversation below.
    Build upon the previous summary rather than repeating it,
    so that the result reflects the most recent context and developments.
    Respond only with the summary.
    Previous Summary:
    {state.get("summary", "")}
    New Conversation:
    {new_conversation}
    '''
    summary = chat.invoke([HumanMessage(summary_instructions)])
    print(f"--- Updated Summary: {summary.content[:50]}... ---")
    # Mark every summarized message for deletion by ID.
    remove_messages = [RemoveMessage(id=i.id) for i in state["messages"]]
    return {"messages": remove_messages, "summary": summary.content}
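This ensures:
the message list stays short, because summarized messages are removed
the summary carries long-term context forward from turn to turn
short-term history is cleared after each summarization
This is a common pattern for keeping long-running chatbot conversations within the model's context window.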
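The removal step works because the message reducer treats RemoveMessage entries as deletions by ID rather than as new messages to append. The sketch below is a simplified pure-Python model of that behavior, not LangGraph's actual reducer: `Msg`, `Remove`, and `merge_messages` are hypothetical stand-ins, and the real reducer handles more cases (for example, replacing a message that reuses an existing ID).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Msg:
    id: str
    role: str
    content: str


@dataclass(frozen=True)
class Remove:
    id: str


def merge_messages(existing, updates):
    """Apply updates: Remove entries delete by ID, Msg entries are appended."""
    merged = list(existing)
    for update in updates:
        if isinstance(update, Remove):
            merged = [m for m in merged if m.id != update.id]
        else:
            merged.append(update)
    return merged


state = {
    "messages": [
        Msg("1", "human", "Tell me about the history of the internet."),
        Msg("2", "ai", "(model answer)"),
    ],
    "summary": "",
}

# What a summarization node would return: removals plus the new summary.
update = {
    "messages": [Remove(m.id) for m in state["messages"]],
    "summary": "The user asked about the history of the internet.",
}

state = {
    "messages": merge_messages(state["messages"], update["messages"]),
    "summary": update["summary"],
}

# The next turn starts from an empty message list plus the summary.
next_messages = merge_messages(
    state["messages"], [Msg("3", "human", "That's cool. Who invented the web?")]
)
print(len(state["messages"]), len(next_messages))
```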
Build Graph Function
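The same graph is reused across all persistence backends. Once it has been compiled with a checkpointer and run, the snapshots saved for a thread can be listed:
print(f"\n{'='*30}\nSECTION 2: State Snapshots (Inspecting History)\n{'='*30}")
# graph_memory and config1 come from code not shown here; the names suggest a graph
# compiled with an in-memory checkpointer and the config of the first thread.
graph_states = list(graph_memory.get_state_history(config1))
print(f"Number of snapshots found: {len(graph_states)}")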
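get_state_history yields one snapshot per saved checkpoint for the given thread, most recent first. The toy model below illustrates that ordering in pure Python and shows how an earlier snapshot can be picked out, for example the last one taken before summarization cleared the messages. It is a sketch under stated assumptions: `Snapshot`, `ToyHistory`, and their methods are hypothetical and are not LangGraph's API.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    step: int
    values: dict


class ToyHistory:
    """Hypothetical per-thread snapshot log (not LangGraph's API)."""

    def __init__(self):
        self._threads = {}

    def record(self, thread_id, values):
        snapshots = self._threads.setdefault(thread_id, [])
        snapshots.append(
            Snapshot(step=len(snapshots), values=copy.deepcopy(values))
        )

    def history(self, thread_id):
        # Yield snapshots newest first.
        yield from reversed(self._threads.get(thread_id, []))


log = ToyHistory()
log.record("thread-1", {"messages": ["question 1"], "summary": ""})
log.record("thread-1", {"messages": ["question 1", "answer 1"], "summary": ""})
log.record("thread-1", {"messages": [], "summary": "Summary of turn 1."})

snapshots = list(log.history("thread-1"))
print(f"Number of snapshots found: {len(snapshots)}")

latest = snapshots[0]
# Walk back in time to the newest snapshot that still holds messages.
before_summary = next(s for s in snapshots if s.values["messages"])
print(latest.step, before_summary.step)
```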