May 28, 2026

The Framework for Trusted LLMs

By Andy Leichtle · 3 minute read

The firms that have successfully deployed AI in financial services didn't get lucky with their model selection. They built something specific before the model ever touched their data.

There's a framework underlying every AI deployment that actually works in this industry. It has three layers. Most firms are working on layer one, haven't thought about layer two, and are completely unprepared for layer three. Understanding the full stack and knowing where your firm stands in it is what separates an AI deployment that delivers from one that gets quietly shelved.

Why the Same Deployments Fail

"AI doesn't fail because organizations lack advanced models. It fails when the foundations, governance, and measurement needed to scale are missing." The model is rarely the problem. The infrastructure beneath it is.

Before laying out the framework, it helps to understand the failure pattern. Most AI deployments in financial services follow the same arc: a firm licenses a tool (or builds a wrapper around an LLM), connects it to whatever data is accessible, and pilots it with a small group of analysts. The pilot produces mixed results. Some queries work. Others return confidently wrong answers. The team troubleshoots. When the answers don't improve, the project stalls.

The root cause is architectural, not a flaw in the model. The firm connected the model to data that wasn't structured for retrieval, with no framework underneath it.

Layer 1: The Data Foundation

The first layer is the most foundational and the most frequently skipped in the rush to deploy.

A governed data foundation means: every dataset the AI will be asked to reason over has been ingested into a unified, access-controlled environment. The data is versioned. It has consistent metadata and naming conventions. Definitions are standardized across business units. The ownership and lineage of each dataset are documented.
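
The checklist above can be expressed as a simple completeness check. This is a minimal Python sketch, not a description of any particular catalog product: `DatasetRecord`, `governance_gaps`, and every field name and sample value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry for one dataset in the governed layer."""
    name: str
    version: str = ""
    owner: str = ""
    lineage: list[str] = field(default_factory=list)       # upstream sources
    allowed_roles: set[str] = field(default_factory=set)   # who may read it


def governance_gaps(record: DatasetRecord) -> list[str]:
    """List the governance attributes that are still missing for a dataset."""
    gaps = []
    if not record.version:
        gaps.append("version")
    if not record.owner:
        gaps.append("owner")
    if not record.lineage:
        gaps.append("lineage")
    if not record.allowed_roles:
        gaps.append("allowed_roles")
    return gaps


# Illustrative records only; the names and values are made up.
holdings = DatasetRecord(
    name="holdings_daily",
    version="v1",
    owner="portfolio-data",
    lineage=["custodian_feed"],
    allowed_roles={"portfolio_manager", "cco"},
)
research = DatasetRecord(name="research_notes")

print(governance_gaps(holdings))
print(governance_gaps(research))
```

A dataset with any remaining gaps would be held back from the AI's corpus until the gaps are closed.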

For financial services firms, this typically means unifying data from Bloomberg, FactSet, SEI, internal research systems, and document repositories into a single governed layer, usually Snowflake. This isn't about moving everything into one place for its own sake. It's about giving the AI a corpus it can reliably retrieve from, rather than a collection of islands that don't speak the same language.

One $146B AUM asset manager we work with spent six months establishing this foundation before deploying AI on top of it. The payoff: data item requests that used to take six months through an IT backlog now take one week. The foundation made the AI work. Without it, the AI would have produced the same inconsistent results everyone else was getting.

The foundation also establishes the access controls the retrieval layer will enforce later. Defining who can see what at the data layer before the AI layer is added ensures the AI inherits governance rather than bypassing it.

Layer 2: The Retrieval Architecture

The second layer determines how the AI finds information when a user asks a question.

Keyword search finds documents that contain the words in a query. Retrieval architecture for LLMs does something different: it uses vector embeddings to find documents that are conceptually relevant regardless of whether they use the exact same terminology.
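
To make the difference concrete, here is a toy Python sketch. It assumes hand-written three-dimensional vectors in place of a real embedding model, so the numbers are invented purely for illustration. The function names (`keyword_hits`, `semantic_rank`) and the sample documents are hypothetical as well.

```python
import math


def keyword_hits(query: str, docs: dict[str, str]) -> list[str]:
    """Return ids of documents that share at least one word with the query."""
    query_words = set(query.lower().split())
    return [
        doc_id
        for doc_id, text in docs.items()
        if query_words & set(text.lower().split())
    ]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def semantic_rank(query_vec: list[float], doc_vecs: dict[str, list[float]]) -> list[str]:
    """Order document ids from most to least similar to the query vector."""
    return sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)


docs = {
    "holdings_note": "subordinated notes issued by community lenders",
    "cafeteria_menu": "lunch options for this week",
}
# Made-up vectors standing in for the output of an embedding model.
doc_vecs = {
    "holdings_note": [0.8, 0.9, 0.0],
    "cafeteria_menu": [0.0, 0.1, 0.9],
}
query = "regional bank debt exposure"
query_vec = [0.9, 0.8, 0.1]

print(keyword_hits(query, docs))           # no document shares a word with the query
print(semantic_rank(query_vec, doc_vecs))  # the holdings note ranks first
```

The keyword match comes back empty because the relevant document uses different vocabulary, while the vector comparison still ranks it first.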

“What was our exposure to regional bank debt during the 2022 rate cycle?” is not a keyword search question. The documents that answer it probably don’t contain the phrase “exposure to regional bank debt.” They contain holdings data, transaction logs, and analyst commentary, all of which need to be retrieved and synthesized. A retrieval architecture built on semantic search and Snowflake Cortex can find and reason across these sources. A keyword search cannot.

The retrieval layer also handles structured vs. unstructured data differently. Natural language queries against compliance documents require a different retrieval path than natural language queries against holdings tables. Cortex Analyst handles structured data in Snowflake natively. Document retrieval requires a separate vector-indexed corpus. A complete retrieval architecture handles both and can blend them in a single query response.
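
One way to picture the blending step is a thin function that calls both retrieval paths and tags each result with where it came from. The sketch below is an assumption about shape only: it does not call Cortex Analyst or any Snowflake API. `blended_retrieve` and both stub retrievers are hypothetical placeholders for whatever the real structured and document paths are.

```python
from typing import Callable

# A retriever takes a question and returns a list of result dicts.
Retriever = Callable[[str], list[dict]]


def blended_retrieve(
    question: str, structured: Retriever, documents: Retriever
) -> list[dict]:
    """Query both retrieval paths and label each result with its path."""
    results = []
    for path, retriever in (("structured", structured), ("documents", documents)):
        for item in retriever(question):
            results.append({**item, "path": path})
    return results


# Stub retrievers with placeholder content, standing in for real back ends.
def stub_structured(question: str) -> list[dict]:
    return [{"source": "holdings_table", "text": "placeholder holdings row"}]


def stub_documents(question: str) -> list[dict]:
    return [{"source": "compliance_memo", "text": "placeholder memo passage"}]


results = blended_retrieve("placeholder question", stub_structured, stub_documents)
for r in results:
    print(r["path"], r["source"])
```

Keeping the path label on each result lets the later layers cite and audit structured and unstructured evidence separately.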

Layer 3: Output Governance

The third layer is where trust is built or broken.

Output governance means: every response the AI returns is grounded in the retrieved corpus, cites its sources, is access-controlled by user role, and is logged for audit purposes.

Grounding is the most critical element. An LLM without grounding will fill gaps in its knowledge with synthesized content from training data, which may be accurate in a general sense and completely wrong for your firm’s specific situation. A grounded model is constrained to answer from what it retrieved, and to say “I don’t have reliable information on that” when it can’t.
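
A minimal sketch of that constraint in Python follows. It assumes each retrieved passage carries a relevance `score` and a `source` id, and that `llm` is any callable that takes a prompt and returns text. All of these names, and the 0.5 threshold, are hypothetical. Prompt instructions alone do not guarantee the model complies, so the sketch also refuses in code when nothing usable was retrieved.

```python
from typing import Callable, Optional

REFUSAL = "I don't have reliable information on that."


def build_grounded_prompt(
    question: str, passages: list[dict], min_score: float = 0.5
) -> Optional[str]:
    """Build a prompt restricted to retrieved passages, or None if none qualify."""
    usable = [p for p in passages if p["score"] >= min_score]
    if not usable:
        return None
    context = "\n".join(f"[{p['source']}] {p['text']}" for p in usable)
    return (
        "Answer using only the sources below and cite them by id. "
        f"If they do not contain the answer, reply exactly: {REFUSAL}\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


def grounded_answer(
    question: str,
    passages: list[dict],
    llm: Callable[[str], str],
    min_score: float = 0.5,
) -> str:
    """Refuse without calling the model when no passage meets the threshold."""
    prompt = build_grounded_prompt(question, passages, min_score)
    if prompt is None:
        return REFUSAL
    return llm(prompt)


# Echo the prompt back so the example runs without a real model.
echo_llm = lambda prompt: prompt
weak = [{"source": "doc-1", "text": "placeholder passage", "score": 0.1}]
strong = [{"source": "doc-1", "text": "placeholder passage", "score": 0.9}]

print(grounded_answer("placeholder question", weak, echo_llm))
print(grounded_answer("placeholder question", strong, echo_llm))
```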

Source citation serves two purposes: it allows analysts to verify answers before acting on them, and it creates an audit trail that compliance teams can review. Every answer should be traceable back to a specific document, data point, or dataset.
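
As an illustration of the audit side, each answer can be written out as one structured log line that records who asked, what was returned, and which sources were cited. The record layout below is an assumption, not a compliance standard, and `audit_record` and its field names are hypothetical.

```python
import json
from datetime import datetime, timezone


def audit_record(
    user: str, role: str, question: str, answer: str, sources: list[str]
) -> str:
    """Serialize one question/answer exchange as a JSON line for the audit log."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "question": question,
            "answer": answer,
            "sources": sources,
            # An answer with no cited source cannot be traced or verified.
            "traceable": bool(sources),
        }
    )


line = audit_record(
    user="analyst_1",
    role="portfolio_manager",
    question="placeholder question",
    answer="placeholder answer",
    sources=["holdings_table", "compliance_memo"],
)
print(line)
```

Answers flagged as not traceable are natural candidates for compliance review.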

Access control at the output layer enforces the governance decisions made at the data foundation layer. A portfolio manager and a CCO asking the same question should receive answers drawn from the data each of them is authorized to see, not from the full corpus.
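
In code, this amounts to filtering retrieved passages by the asking user's role before anything reaches the model. The sketch below assumes each passage carries an `allowed_roles` set inherited from the data foundation. The function name, role names, and sample passages are hypothetical.

```python
def visible_passages(passages: list[dict], role: str) -> list[dict]:
    """Keep only the passages the given role is authorized to see."""
    return [p for p in passages if role in p["allowed_roles"]]


# Illustrative passages; sources and role assignments are made up.
passages = [
    {"source": "holdings_q3", "allowed_roles": {"portfolio_manager", "cco"}},
    {"source": "compliance_review", "allowed_roles": {"cco"}},
]

pm_view = [p["source"] for p in visible_passages(passages, "portfolio_manager")]
cco_view = [p["source"] for p in visible_passages(passages, "cco")]

print(pm_view)   # ['holdings_q3']
print(cco_view)  # ['holdings_q3', 'compliance_review']
```

Because the filter runs before generation, the model never sees text the user is not entitled to, so it cannot leak it into an answer.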

Firms that build all three layers produce something most financial services organizations don’t have: AI output their analysts trust, their compliance teams can audit, and their governance frameworks can support.
