Turning a Generative AI Court-Case Analysis System from 90–95% Error Rates to Production-Ready
A leading litigation finance firm launched an initiative to modernize its approach to assessing U.S. federal court cases using AI. The aim was to identify relevant cases from court case-detail pages, download associated PDF files, and ingest this data into large language models (LLMs) to extract key facts that speed up human review and funding decisions.
The concept worked on paper, but the first implementation did not hold up during actual operations. The system was producing extremely high failure rates across data downloads, processing, and interpretation, which prevented downstream teams from using the solution.
SPR stepped in to evolve the initial implementation. The goal was to stabilize and rebuild the solution so the platform could become an enterprise-ready, scalable, and usable system.
The opportunity
The platform’s workflow focused on three key steps: retrieving a list of newly created U.S. federal cases, obtaining the case documents, and then extracting usable findings from them.
First, the system navigated to the U.S. federal court website to find case details, identify cases of interest, and then download the corresponding PDF documents, which could range from just a few pages to hundreds.
The PDF files were stored in Azure Blob Storage and then routed into an Azure OpenAI LLM workflow for interpretation and categorization. From there, the findings were used to decide whether a case deserved further human review and potential funding.
The challenge
When SPR joined the effort, the issue wasn’t just a single bug or a fragile integration. The platform faced real-world conditions at every stage, from rapid automation on unpredictable court sites to downstream AI processing that relied on incomplete or inconsistent documents.
Automation moved faster than the real world: The system clicked through pages in milliseconds without reliably waiting for pages, links, or content to load. In practice, pages were not ready, links were unavailable, and documents were sometimes not yet posted, leading to failures and incomplete or incorrect information.
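The fix for this class of failure is a wait-and-retry discipline at every navigation step. Below is a minimal, illustrative sketch of such a gate; the function name, timeout, and polling interval are assumptions for illustration, not the client's actual code:

```python
import time
from typing import Callable


def wait_until_ready(check: Callable[[], bool],
                     timeout: float = 30.0,
                     poll_interval: float = 0.5) -> bool:
    """Poll `check` until it returns True or the timeout elapses.

    Gates each navigation step (page rendered, link present, document
    posted) instead of clicking immediately and hoping the page is ready.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(poll_interval)
    return False  # caller decides: retry later or record as unavailable
```

A caller would wrap each page-readiness condition (for example, "the document link is present in the DOM") in a `check` callable and proceed only on a `True` result, turning "clicked too fast" from a silent data error into an explicit, recoverable timeout.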
“Document not posted yet” was mistaken for a system failure: The pipeline assumed that if a case existed, the corresponding PDF file should be available immediately. In reality, there can be a noticeable delay between case creation and document availability, and that “PDF not available” status needed to be tracked as a follow-up, not treated as an error.
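Separating "not posted yet" from a genuine failure comes down to an explicit status model. The sketch below is illustrative (the status names and the two-flag interface are assumptions, not the production schema):

```python
from enum import Enum


class DocumentStatus(Enum):
    DOWNLOADED = "downloaded"
    PENDING_POSTING = "pending_posting"  # PDF not posted yet: follow up later
    FAILED = "failed"                    # genuine error: investigate


def classify_download(pdf_found: bool, site_error: bool) -> DocumentStatus:
    """Distinguish 'document not posted yet' from a real failure.

    A missing PDF on an otherwise healthy case page is a tracked
    follow-up, not an error; only a site or transport problem fails.
    """
    if site_error:
        return DocumentStatus.FAILED
    return DocumentStatus.DOWNLOADED if pdf_found else DocumentStatus.PENDING_POSTING
```

With a status like `PENDING_POSTING` persisted per case, a scheduled job can revisit those cases days later rather than counting them as errors on first contact.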
Dead-letter queues accumulated repeated failures: When cases were queued for analysis via Azure Service Bus and processed through Azure Functions, many messages ended up in the dead-letter queue whenever the function failed. The system would then requeue DLQ messages back to the main queue without fixing the underlying issue, causing the same errors to occur repeatedly.
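The alternative to blindly requeuing is to triage each dead-lettered message before deciding its fate. This is a simplified sketch of that decision; the reason strings, delivery cap, and return values are illustrative assumptions rather than the actual Service Bus handler:

```python
# Illustrative transient-failure reasons; real dead-letter reasons would
# come from the message's dead-letter metadata.
TRANSIENT_REASONS = {"timeout", "page_not_ready", "throttled"}
MAX_DELIVERIES = 5


def triage_dead_letter(reason: str, delivery_count: int) -> str:
    """Return 'retry' only for transient failures under the delivery cap.

    Everything else is 'park' for investigation, so the same message
    never loops endlessly between the DLQ and the main queue.
    """
    if reason in TRANSIENT_REASONS and delivery_count < MAX_DELIVERIES:
        return "retry"
    return "park"
```

Parked messages surface the root cause to an engineer instead of silently re-entering the pipeline, which is what made the original error counts grow unbounded.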
A foundation that needed rebuilding, not patching: What started as an effort to “reduce errors” grew into a deeper understanding that the system required a complete re-architecture to effectively handle real-world scenarios, edge cases, and exceptions in an enterprise-ready manner. This involved replacing an overly simplified, denormalized database with a properly designed schema, rewriting much of the application code to systematically manage these scenarios, tracking case status throughout the processing lifecycle, making sure the court website navigation waited for pages and content to fully load, and preventing Service Bus messages from repeatedly entering the dead-letter queue (DLQ) by addressing the root causes.
SPR’s approach
SPR rebuilt the platform from the ground up, prioritizing stability and proper lifecycle management.
- Re-architect the acquisition workflow to enhance reliability. SPR redesigned the court-site navigation to imitate a careful human user, waiting at each step until pages loaded properly before proceeding. This reduced fragile “click too fast” failures that were causing large volumes of download errors.
- Treat “PDF not available” as a tracked status, not an error. Instead of assuming every case will have an immediately downloadable PDF file, the system was updated to explicitly mark when a PDF file was not yet available, recognizing it may take days or longer for documents to be prepared and uploaded.
- Handle real case life-cycle scenarios. SPR introduced logic to detect and manage case statuses and situations that should not go through the same pipeline, including cases like self-representation, duplicates, and cases that were closed. The work included identifying many distinct case status patterns before reaching the subset of cases the client needed to process further.
- Stabilize LLM processing and DLQ behavior. On the analysis side, SPR addressed situations where the LLM was instructed to extract specific sections from PDFs, but those sections were missing. Instead of allowing these conditions to crash processing and send messages to the DLQ, the workflow now includes proper error handling and structured outcomes. This helps teams understand what happened and respond appropriately.
- Lead, coach, and transfer capability. The SPR team functioned as the de facto head of software engineering for the initiative, guiding and mentoring the client team while personally handling core architecture, design, and hands-on implementation. Once the foundation was rebuilt, the client team continued feature development on top of the new base.
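The "structured outcomes" idea from the LLM-processing step above can be sketched as follows. The section names and the dataclass shape are hypothetical, chosen only to show the pattern of returning a result instead of raising when extracted sections are missing:

```python
from dataclasses import dataclass, field

# Illustrative required sections; the client's actual extraction
# targets are not part of this sketch.
REQUIRED_SECTIONS = ("parties", "claims", "relief_sought")


@dataclass
class ExtractionOutcome:
    complete: bool
    findings: dict = field(default_factory=dict)
    missing_sections: list = field(default_factory=list)


def build_outcome(extracted: dict) -> ExtractionOutcome:
    """Wrap LLM extraction results in a structured outcome.

    Missing sections are recorded rather than raised, so the message
    completes processing instead of landing in the dead-letter queue.
    """
    missing = [s for s in REQUIRED_SECTIONS if not extracted.get(s)]
    return ExtractionOutcome(complete=not missing,
                             findings=extracted,
                             missing_sections=missing)
```

A reviewer can then filter on `complete` and see exactly which sections were absent from the source PDF, turning a crash into an actionable finding.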
Results
Before SPR’s involvement, the platform’s full workflow was generating error rates of approximately 90–95% in downloading, processing, and interpretation. With nearly 400 cases arriving each day, most ended up in the dead-letter queue due to errors, rendering the system unusable for downstream analysis, review, and funding workflows.
After SPR’s re-architecture and rebuild, error rates decreased to approximately 10–15%, and the remaining issues were mainly in the LLM analysis stage when source PDFs lacked necessary data or sections. With the platform stabilized, it was deployed to production, allowing the client’s content analysis team and other groups to review findings and advance their work.
Conclusion
SPR’s efforts transformed a promising idea into a reliable operational platform that the client could trust and use daily. By rebuilding the foundation, adjusting lifecycle assumptions about document availability, and enhancing how the pipeline handled exceptions and downstream processing, SPR helped the team shift from frequent failures to a stable, production-ready system that supported real review workflows. The outcome was not just a reduction in errors but a system that teams could confidently use and a solid technical foundation for the client to build upon.
Technology highlights
- Application Code - Python and supporting Python libraries
- Data Storage - PostgreSQL
- PDF File Storage - Azure Blob Storage
- Extraction and Interpretation - Azure OpenAI LLM integration
- Case Queuing and Orchestration - Azure Service Bus
- Message Processing - Azure Functions
- Source Control and CI/CD Pipelines - Azure DevOps
- AI Tools as Part of Delivery - Microsoft 365 Copilot and Cursor IDE with AI Agents