Chain of custody is a concept borrowed from evidence law. It refers to the documented sequence of who handled a piece of evidence, when, and what they did with it. The purpose is to defend the evidence against claims that it was altered, contaminated, or fabricated. If the chain is intact, the evidence can be authenticated and relied on. If the chain is broken, the evidence is suspect regardless of whether the underlying facts are true.
The same logic applies to AI-extracted data in commercial real estate (CRE). A rent figure extracted from a lease and used in an underwriting model is evidence of an asset's economic value. The investment committee (IC), the lender, and the future buyer all rely on that figure. If the chain from the source document to the model cannot be reconstructed, the figure is suspect. The decision built on it is suspect. The deal is suspect.
Most CRE workflows do not maintain chain of custody for extracted data. Values flow from documents to abstracts to models to memos with no record of the path. When a value is questioned, the questioning team has to retrace the path manually. Often the path is lost.
What a Broken Chain Looks Like
A broken chain of custody is the default state for most CRE extracted data. The break can occur at any of several points.
| Break Point | What Goes Wrong |
|---|---|
| Document to abstract | No record of which document or page produced the value |
| Abstract to model | No record of which abstract field populated the model cell |
| Model to memo | No record of which model output produced the memo number |
| Memo to lender package | No record of which memo statement supports the lender claim |
| Acquisition to asset management | No record of who verified what at handoff |
Each break is a place where a value can drift, be edited without trace, or be assumed without basis. The downstream user has no way to know whether the value they are reading is the value the source document produced.
The drift problem is underappreciated. A rent figure extracted from a lease may be modified in the abstract to reflect a free rent schedule, then modified in the model to reflect a stabilized assumption, then rounded in the memo for presentation. Each modification is reasonable in context. Without a chain of custody, the IC reads a stabilized, rounded figure and assumes it is the lease rent. The decision is made on a number that is three transformations removed from the source.
What a Chain of Custody Requires
An intact chain of custody for CRE extracted data has three properties.
The first is provenance. Every value at every stage carries a reference back to its source. The model cell references the abstract field. The abstract field references the document page. The document page references the source PDF. A reviewer can trace any value to its origin without leaving the system.
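
The reference chain described above can be sketched as a linked structure in which each stage points to the stage before it. This is a minimal illustration, not a prescribed data model: the `Node` class, the `kind` labels, and identifiers such as `Rent!C12` and `lease_suite_200.pdf` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """One stage in the chain: a model cell, abstract field, page, or PDF."""
    kind: str                        # hypothetical label, e.g. "model_cell"
    ref: str                         # hypothetical identifier within that stage
    source: Optional["Node"] = None  # upstream reference; None at the origin


def trace(node: Node) -> list[str]:
    """Walk upstream references and return the path from value to origin."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(f"{current.kind}:{current.ref}")
        current = current.source
    return path


def is_traceable(node: Node, origin_kind: str = "source_pdf") -> bool:
    """The chain is intact only if the walk ends at a source document."""
    return trace(node)[-1].startswith(origin_kind + ":")


# Model cell -> abstract field -> document page -> source PDF
pdf = Node("source_pdf", "lease_suite_200.pdf")
page = Node("document_page", "p14", source=pdf)
abstract_field = Node("abstract_field", "base_rent_psf", source=page)
cell = Node("model_cell", "Rent!C12", source=abstract_field)

print(" -> ".join(trace(cell)))
print(is_traceable(cell))
```

A value whose walk ends anywhere other than a source document, such as a model cell typed in by hand, fails `is_traceable` and marks the point where the chain is broken.
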
The second is transformation logging. Every modification to a value is recorded with the modifier, the modification, and the rationale. When a lease rent of fifty dollars per square foot becomes a stabilized rent of forty-eight dollars per square foot in the model, the log shows who made the adjustment, when, and why.
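
One way to make the log unavoidable is to allow changes only through a method that records them. The sketch below uses the fifty-to-forty-eight-dollar adjustment from the paragraph above. The class names, the `analyst_a` identifier, and the rule that an empty rationale is rejected are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class Transformation:
    modifier: str     # who made the change
    before: Decimal   # value prior to the change
    after: Decimal    # value after the change
    rationale: str    # why the change was made
    at: datetime      # when the change was made (UTC)


@dataclass
class TrackedValue:
    name: str
    value: Decimal
    log: list[Transformation] = field(default_factory=list)

    def modify(self, new_value: Decimal, modifier: str, rationale: str) -> None:
        """Apply a change and record who made it, when, and why."""
        if not rationale.strip():
            raise ValueError("a rationale is required for every modification")
        self.log.append(
            Transformation(modifier, self.value, new_value, rationale,
                           datetime.now(timezone.utc))
        )
        self.value = new_value

    def original(self) -> Decimal:
        """Recover the source value, however many edits followed."""
        return self.log[0].before if self.log else self.value


rent = TrackedValue("base_rent_psf", Decimal("50.00"))
rent.modify(Decimal("48.00"), "analyst_a", "stabilized rent assumption")

for t in rent.log:
    print(f"{t.modifier}: {t.before} -> {t.after} ({t.rationale})")
print("lease rent was", rent.original())
```

Because the first log entry keeps the pre-change value, a reader looking at the stabilized figure can still recover the lease rent it started from.
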
The third is reviewer attribution. Every verification action is attributed. When a value is confirmed, corrected, or escalated, the system records who took the action. The accountability is durable across the deal lifecycle.
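
Attribution can be sketched as an append-only ledger of verification events. The three action names come from the paragraph above. The `ReviewLedger` class, the value identifier, and the actor names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Action(Enum):
    CONFIRMED = "confirmed"
    CORRECTED = "corrected"
    ESCALATED = "escalated"


@dataclass(frozen=True)
class ReviewEvent:
    value_id: str   # hypothetical identifier for the value under review
    actor: str      # who took the action
    action: Action  # what they did
    at: datetime    # when they did it (UTC)


class ReviewLedger:
    """Append-only record: events are added, never edited or removed."""

    def __init__(self) -> None:
        self._events: list[ReviewEvent] = []

    def record(self, value_id: str, actor: str, action: Action) -> ReviewEvent:
        if not actor.strip():
            raise ValueError("every verification action needs a named actor")
        event = ReviewEvent(value_id, actor, action, datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def history(self, value_id: str) -> list[ReviewEvent]:
        """All actions taken on one value, oldest first."""
        return [e for e in self._events if e.value_id == value_id]


ledger = ReviewLedger()
ledger.record("suite_200.base_rent_psf", "analyst_a", Action.CORRECTED)
ledger.record("suite_200.base_rent_psf", "deal_lead_b", Action.CONFIRMED)

for event in ledger.history("suite_200.base_rent_psf"):
    print(event.actor, event.action.value, event.at.isoformat())
```
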
| Property | What It Provides |
|---|---|
| Provenance | Trace any value to its source document |
| Transformation logging | Reconstruct every modification with rationale |
| Reviewer attribution | Identify who verified what and when |
A chain that has all three properties is defensible against any reasonable challenge. A chain that is missing one of them has a gap that a sophisticated reviewer will exploit.
Where the Chain Has to Survive
The chain of custody is tested at predictable points in the deal lifecycle. Each test is a stress event that breaks weak chains and validates strong ones.
| Test Point | What Gets Tested |
|---|---|
| Internal QC | Whether the abstract reconciles to source |
| IC review | Whether the model assumptions trace to evidence |
| Lender diligence | Whether the rent roll matches the executed leases |
| Buyer diligence | Whether the seller's representations match the records |
| Audit | Whether the historical decisions can be reconstructed |
| Litigation | Whether the data is admissible as evidence |
Most chains hold up at internal QC. Many hold up at IC review. Fewer hold up at lender diligence, because the lender's diligence team is professionally adversarial and looks for the breaks. Fewer still hold up at buyer diligence in a sale, because the buyer's diligence team has the same motivation. The chains that hold up at litigation are the ones that were designed to survive every prior stage.
What AI Changes
AI extraction changes the chain of custody problem in two ways.
The first is that the chain becomes more important. AI introduces a new actor in the workflow. The actor produces values from documents at scale. The reliability of the values depends on the reliability of the actor, which is not directly verifiable. The chain of custody is the only mechanism that lets a reviewer verify the actor's output without redoing the work.
The second is that the chain becomes tractable. AI systems that are designed correctly produce the chain as a byproduct of extraction. Each value is tagged with its source document, page, and language. Each transformation is logged. Each reviewer action is attributed. The chain is built into the data model, not assembled after the fact from notes and emails.
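
As a rough sketch of what "tagged with its source document, page, and language" could look like, the record below carries a fingerprint of the source file, a page number, and the quoted text the value was read from. A reviewer can then check the citation without re-extracting anything. The field names, the use of SHA-256 as the fingerprint, the sample lease sentence, and the assumption that page text is available as plain strings are illustrative choices, not a description of any particular system.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedValue:
    field_name: str
    value: str
    document_sha256: str  # fingerprint of the source file's bytes
    page: int             # 1-based page number
    language: str         # quoted source text the value was read from


def fingerprint(document_bytes: bytes) -> str:
    return hashlib.sha256(document_bytes).hexdigest()


def _normalize(text: str) -> str:
    """Collapse whitespace and case so line wrapping does not break a match."""
    return " ".join(text.split()).lower()


def check_citation(record: ExtractedValue, document_bytes: bytes,
                   page_texts: dict[int, str]) -> bool:
    """True if the file is unchanged and the quoted language is on the cited page."""
    if fingerprint(document_bytes) != record.document_sha256:
        return False  # the document is not the one the value came from
    quote = _normalize(record.language)
    page_text = _normalize(page_texts.get(record.page, ""))
    return bool(quote) and quote in page_text


# Placeholder bytes and page text standing in for a real lease file.
lease_bytes = b"placeholder bytes standing in for a lease PDF"
pages = {14: "Base Rent shall be $50.00 per rentable\nsquare foot per annum."}

record = ExtractedValue(
    field_name="base_rent_psf",
    value="50.00",
    document_sha256=fingerprint(lease_bytes),
    page=14,
    language="Base Rent shall be $50.00 per rentable square foot",
)
print(check_citation(record, lease_bytes, pages))
```

The check confirms only that the cited language exists where the record says it does. Whether the value was read correctly from that language remains a reviewer's call.
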
The combination is what makes AI extraction defensible at scale. Without the chain, AI is an unverifiable actor producing values that can be questioned at any stage. With the chain, AI is an extractor whose output can be inspected at any stage by any reviewer. The role of the human shifts from producing the values to verifying the chain.
What Stays the Responsibility of the Human
The chain of custody does not eliminate the need for human judgment. It documents the judgment that was applied. Three categories of judgment remain explicitly human.
Materiality determinations. Whether a discrepancy between an extracted value and the source language matters is a judgment call. AI surfaces the discrepancy. A human decides whether it changes the deal.
Conflict resolution. When the rent roll, the executed lease, and the most recent amendment disagree, a human decides which value flows forward. The chain records the decision and the rationale. AI does not make the call.
Acceptance of risk. When an extracted value carries a confidence score below the threshold, a human decides whether to accept the value, escalate for manual review, or hold the deal pending clarification. The chain records the decision. AI does not absorb the risk.
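
The acceptance-of-risk step can be sketched as a gate that flags low-confidence values and refuses to record a decision without a named person and a rationale. The three outcomes mirror the options described above. The `0.90` threshold, the function names, and the identifiers are hypothetical, since no threshold is specified here.

```python
from dataclasses import dataclass
from enum import Enum

REVIEW_THRESHOLD = 0.90  # hypothetical value; set by the firm, not by this sketch


class Decision(Enum):
    ACCEPT = "accept"
    ESCALATE = "escalate_for_manual_review"
    HOLD = "hold_pending_clarification"


@dataclass(frozen=True)
class RiskDecision:
    value_id: str
    confidence: float
    decision: Decision
    decided_by: str   # always a person; the extractor never fills this in
    rationale: str


def needs_risk_decision(confidence: float,
                        threshold: float = REVIEW_THRESHOLD) -> bool:
    """Values below the threshold are routed to a human."""
    return confidence < threshold


def record_risk_decision(value_id: str, confidence: float, decision: Decision,
                         decided_by: str, rationale: str) -> RiskDecision:
    if not needs_risk_decision(confidence):
        raise ValueError("confidence is at or above the threshold; nothing to decide")
    if not decided_by.strip() or not rationale.strip():
        raise ValueError("a risk decision needs a named person and a rationale")
    return RiskDecision(value_id, confidence, decision, decided_by, rationale)


entry = record_risk_decision(
    "suite_310.expiration_date", 0.72, Decision.ESCALATE,
    decided_by="deal_lead_b",
    rationale="amendment scan is illegible on the cited page",
)
print(entry.decision.value, "by", entry.decided_by)
```
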
The chain of custody captures the human role rather than replacing it. The reviewer who confirms a value, the analyst who corrects a misextraction, and the deal lead who accepts a flagged exception all leave a trace. The trace is what makes the data defensible.
What "Done" Looks Like
An intact chain of custody for AI-extracted CRE data meets the following criteria:
Every value at every stage of the workflow can be traced to its source document and page.
Every transformation between source and final use is logged with the modifier, modification, and rationale.
Every reviewer action is attributed with the actor, action, and timestamp.
A reviewer at any downstream stage can verify any value without redoing the upstream work.
The chain survives across the acquisition-to-asset-management handoff and persists for the hold period.
If a value cannot be traced, the chain is broken at that point.
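
The first three criteria lend themselves to a mechanical check. The sketch below assumes each value is stored as a dictionary with hypothetical keys (`source_document`, `source_page`, `transformations`, `reviews`) and reports where the chain is broken. The sample records, including the timestamp, are invented for illustration.

```python
TRANSFORMATION_KEYS = ("modifier", "modification", "rationale")
REVIEW_KEYS = ("actor", "action", "timestamp")


def _missing(record: dict, keys: tuple) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [k for k in keys if not record.get(k)]


def find_breaks(value: dict) -> list[str]:
    """List every point at which this value's chain of custody is broken."""
    breaks = []
    if not value.get("source_document"):
        breaks.append("no source document")
    if value.get("source_page") is None:
        breaks.append("no source page")
    for i, t in enumerate(value.get("transformations", [])):
        gaps = _missing(t, TRANSFORMATION_KEYS)
        if gaps:
            breaks.append(f"transformation {i} missing {', '.join(gaps)}")
    for i, r in enumerate(value.get("reviews", [])):
        gaps = _missing(r, REVIEW_KEYS)
        if gaps:
            breaks.append(f"review {i} missing {', '.join(gaps)}")
    return breaks


intact = {
    "source_document": "lease_suite_200.pdf",
    "source_page": 14,
    "transformations": [{"modifier": "analyst_a",
                         "modification": "50.00 -> 48.00",
                         "rationale": "stabilized rent assumption"}],
    "reviews": [{"actor": "deal_lead_b", "action": "confirmed",
                 "timestamp": "2024-01-15T10:00:00Z"}],
}
broken = {
    "source_document": "lease_suite_310.pdf",
    "source_page": None,
    "transformations": [{"modifier": "analyst_a",
                         "modification": "rounded", "rationale": ""}],
    "reviews": [{"actor": "", "action": "confirmed",
                 "timestamp": "2024-01-15T10:05:00Z"}],
}

print(find_breaks(intact))
print(find_breaks(broken))
```

The last two criteria, downstream verifiability and persistence through the hold period, depend on how the records are stored and shared, which a check like this cannot establish.
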
Conclusion
Chain of custody is the discipline that turns AI extraction from an efficiency story into a defensibility story. The efficiency is the headline. The defensibility is what allows the efficiency to compound across deals, across lender relationships, and across audit cycles. Firms that maintain the chain produce data that can be relied on without re-verification. Firms that do not will find that every AI-extracted value gets re-verified anyway, and the efficiency they expected never materializes. The chain is not an artifact of the workflow. It is the workflow.