Building a Multi-Agent RAG System with LangGraph

A deep dive into how I architected Ledger Lens — an AI-powered financial intelligence platform using LangGraph, Pinecone, and hallucination detection.

Overview

Building Ledger Lens taught me a lot about the practical challenges of production RAG systems. In this post I'll walk through the architecture decisions and the lessons learned along the way.

Why Multi-Agent?

A single LLM call isn't enough for complex financial queries. You need specialized agents for retrieval, reasoning, and validation. LangGraph makes it straightforward to wire these together as a directed graph where each node has a clear responsibility.

The graph looks roughly like this:

retrieval → reasoning → validation → response
               ↑                        |
               └──── retry if flagged ──┘

The Architecture

The system consists of three core agents:

  1. Retrieval Agent — queries Pinecone for semantically relevant financial documents
  2. Reasoning Agent — synthesizes retrieved context into structured answers
  3. Validation Agent — runs hallucination detection before returning results

Each agent is a LangGraph node. State flows between them as a typed dict, so every value a node reads or writes sits under a named key that can be inspected when something goes wrong.

Hallucination Detection

This was the hardest part. We use a cross-encoder to score the faithfulness of each claim against the retrieved sources. Anything below a confidence threshold gets flagged and either regenerated or returned with a disclaimer.

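def validate_response(state: AgentState) -> AgentState:
    # cross_encoder, CONFIDENCE_THRESHOLD and AgentState are defined elsewhere.
    # Score the answer against the retrieved sources rather than the query.
    # "context" is an assumed state key holding the retrieved text.
    score = float(cross_encoder.predict([
        (state["context"], state["response"])
    ])[0])
    # Set the flag on every pass so a stale True never survives a retry.
    state["needs_retry"] = score < CONFIDENCE_THRESHOLD
    return state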
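The function above scores the response as a whole. Scoring claim by claim, as described earlier, might look roughly like the sketch below. It assumes one sentence equals one claim and uses a toy token-overlap scorer as a stand-in for the cross-encoder so that it runs without a model. The names split_claims, flag_unsupported, and token_overlap_scorer are hypothetical.

```python
import re
from typing import Callable, List, Sequence, Tuple

# (source_text, claim) -> support score in [0, 1]
Scorer = Callable[[str, str], float]


def token_overlap_scorer(source: str, claim: str) -> float:
    """Toy stand-in for a cross-encoder: share of claim tokens found in the source."""
    claim_tokens = re.findall(r"\w+", claim.lower())
    source_tokens = set(re.findall(r"\w+", source.lower()))
    if not claim_tokens:
        return 0.0
    return sum(t in source_tokens for t in claim_tokens) / len(claim_tokens)


def split_claims(response: str) -> List[str]:
    """Naive sentence split; each sentence is treated as one claim."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def flag_unsupported(
    response: str,
    sources: Sequence[str],
    scorer: Scorer,
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (claim, best_score) for claims that no source supports above threshold."""
    flagged = []
    for claim in split_claims(response):
        # A claim counts as supported if at least one source scores high enough.
        best = max((scorer(src, claim) for src in sources), default=0.0)
        if best < threshold:
            flagged.append((claim, best))
    return flagged


sources = ["Revenue rose 12 percent in the third quarter."]
response = "Revenue rose 12 percent. Margins doubled year over year."
flagged = flag_unsupported(response, sources, token_overlap_scorer, threshold=0.5)
```

In a real pipeline the scorer argument would wrap the cross-encoder call. Keeping it as a parameter means the flagging logic can be tested without loading a model.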

Lessons Learned

  • Chunk size matters enormously for financial data — smaller chunks (256 tokens) outperformed larger ones
  • Hybrid search (dense + sparse) significantly improves recall on numerical queries
  • Always validate before returning — users trust financial data implicitly
  • LangGraph's checkpointing is invaluable for debugging long agent chains
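
Two of the lessons above lend themselves to a short sketch: fixed-size chunking and merging dense and sparse results. The code below is illustrative only. It treats pre-split tokens as a stand-in for a real tokenizer, the 32-token overlap is an assumed value, and reciprocal rank fusion is one common way to combine two rankings, not necessarily the method Ledger Lens uses.

```python
from typing import Dict, List, Sequence


def chunk_tokens(
    tokens: Sequence[str], size: int = 256, overlap: int = 32
) -> List[List[str]]:
    """Split a token sequence into windows of `size` tokens sharing `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Merge ranked ID lists; each list adds 1 / (k + rank) to a document's score."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


tokens = [f"t{i}" for i in range(600)]
chunks = chunk_tokens(tokens)

dense_hits = ["a", "b", "c"]    # hypothetical IDs from vector search
sparse_hits = ["c", "a", "d"]   # hypothetical IDs from keyword search
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Rank fusion sidesteps the problem that dense similarity scores and sparse keyword scores live on different scales, because only each document's position in a list is used.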