Building AI Agents at Scale

When we set out to build JP Morgan's premier PowerPoint Generative AI chat service, we knew we were entering uncharted territory. Serving 200,000+ employees globally with AI-powered document insights in a regulated environment presented unique challenges that pushed us to rethink traditional AI application architecture.

The Challenge

Financial institutions operate under strict compliance requirements that make traditional RAG (Retrieval-Augmented Generation) architectures problematic. Persisting document chunks raises data governance concerns, while maintaining accuracy at scale requires sophisticated evaluation frameworks.

Our Solution: Stateless Architecture

We pioneered a map-reduce inspired architecture that chunks and reduces context without persistence:

\\\python


def process_document(document):
    # Fan-out: Parallel processing of document chunks
    chunks = chunk_document(document)
    processed_chunks = parallel_process(chunks)
    
    # Reduce: Combine insights without storing
    return reduce_context(processed_chunks)

\\\

This stateless approach eliminated compliance issues while maintaining performance. By avoiding document chunk persistence, we satisfied regulatory requirements without sacrificing functionality.

Accuracy Through Evaluation

We implemented OpenAI evals to ensure generation accuracy met defined requirements:

- Factual accuracy: 95% threshold for document-based responses

- Relevance scoring: Context-aware response evaluation

- Hallucination detection: Fine-tuned Llama 3 models on AWS SageMaker

Cost Optimization

Restructuring prompts to leverage OpenAI's Prompt Caching reduced costs by 35%. Key strategies included:

- Template standardization: Consistent prompt structures for caching

- Context optimization: Minimal viable context for accurate responses

- Batch processing: Grouping similar requests for efficiency

Key Takeaways

1. Compliance-first design enables innovation within regulatory constraints

2. Stateless architectures can solve complex data governance challenges

3. Rigorous evaluation is essential for enterprise AI deployment

4. Cost optimization requires architectural thinking, not just prompt engineering

Building AI at scale in regulated environments demands creative solutions that balance innovation with compliance. Our experience shows that thoughtful architecture can unlock AI's potential while meeting the strictest requirements.