Jan 2, 2026

Data Provenance in CRE: Why Source Citations Matter

Commercial real estate underwriting depends on data extracted from dozens of documents: rent rolls, leases, amendments, operating statements, loan documents, environmental reports, and offering memorandums. By the time an investment committee reviews a deal, the numbers on the screen have passed through multiple hands and systems. The tenant roster came from somewhere. The NOI figure originated in a specific document. The cap rate assumption was based on particular comparables.

Data provenance is the practice of tracking where each data point came from. It answers a simple question: "Where did this number originate?" In a discipline where investment decisions hinge on the accuracy of underlying data, and where investors, lenders, and regulators routinely ask hard questions, provenance is not a nice-to-have feature. It is a fundamental requirement for defensible underwriting.

What Data Provenance Means in CRE

Provenance, in its simplest form, is a citation. When a rent roll shows that Tenant A pays $25.00 per square foot in base rent, provenance tells you that this figure was extracted from page 4 of the executed lease dated January 15, 2022, Section 3.1, "Base Rent."

Comprehensive provenance includes several elements:

  • Source document. The specific file from which the data was extracted (e.g., "Acme Corp Lease - Executed 2022-01-15.pdf").

  • Location within document. The page number, section, table, or paragraph where the value appears.

  • Extraction timestamp. When the data was captured, which matters when documents are updated or superseded.

  • Extraction method. Whether the value was extracted automatically, manually entered, or derived through calculation.

  • Modification history. Any changes made after initial extraction, including who made them and why.

Together, these elements create an audit trail that connects every figure in your underwriting model back to its documentary source.
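The five elements above map naturally onto a small record type. The following is a minimal sketch, not a reference schema: the class and field names (ProvenanceRecord, Modification, ExtractionMethod) are hypothetical, the pairing of the $25.00/SF figure with the Acme Corp file name is assumed for illustration, and the timestamp is a placeholder.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List


class ExtractionMethod(Enum):
    """How the value entered the model (hypothetical categories)."""
    AUTOMATIC = "automatic"
    MANUAL = "manual"
    DERIVED = "derived"


@dataclass(frozen=True)
class Modification:
    """One change made after initial extraction: who, when, and why."""
    changed_at: datetime
    changed_by: str
    old_value: str
    new_value: str
    reason: str


@dataclass
class ProvenanceRecord:
    """A single data point plus the citation that supports it."""
    field_name: str
    value: str
    source_document: str          # the specific file
    location: str                 # page, section, table, or paragraph
    extracted_at: datetime        # when the value was captured
    method: ExtractionMethod      # automatic, manual, or derived
    history: List[Modification] = field(default_factory=list)

    def cite(self) -> str:
        """Render a human-readable citation for the value."""
        return (f"{self.field_name} = {self.value} "
                f"({self.source_document}, {self.location})")


# Illustrative values only; the timestamp is a placeholder.
base_rent = ProvenanceRecord(
    field_name="base_rent_psf",
    value="25.00",
    source_document="Acme Corp Lease - Executed 2022-01-15.pdf",
    location='page 4, Section 3.1, "Base Rent"',
    extracted_at=datetime(2026, 1, 2, tzinfo=timezone.utc),
    method=ExtractionMethod.AUTOMATIC,
)
print(base_rent.cite())
```

Storing the value as a string keeps the figure exactly as extracted. A production schema would likely add typed values and a document identifier alongside it.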

Why Provenance Matters

The value of provenance becomes clear in four scenarios that occur routinely in CRE transactions.

1. Investor and Lender Due Diligence

When investors or lenders review an underwriting package, they do not accept figures at face value. They ask questions. "Where did this occupancy number come from?" "What document supports this expense assumption?" "How do you know the lease expires in 2027?"

Without provenance, answering these questions requires someone to hunt through the data room, locate the relevant document, and find the specific passage that supports the figure. This process is slow and error-prone, and it creates doubt about whether the underwriting was rigorous in the first place.

With provenance, every questioned figure links directly to its source. The reviewer clicks through to the cited page and verifies the value in seconds. This transparency builds confidence and accelerates the diligence process.

2. Error Tracing and Correction

Errors in underwriting models are inevitable. A tenant's square footage is wrong. An expense line item is miscategorized. A lease expiration date is off by a year. When errors surface, the first question is always: "Where did this come from?"

Provenance enables rapid error tracing. If the incorrect square footage was extracted from the rent roll, you check the rent roll. If the rent roll itself is wrong, you compare it against the executed lease. Without provenance, you cannot distinguish between an extraction error (the AI or analyst pulled the wrong value) and a source error (the document itself contains incorrect information). This distinction determines whether you fix your extraction process or go back to the seller for corrected documents.
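The distinction between an extraction error and a source error can be expressed as a three-way comparison. This is a rough sketch under stated assumptions: the function name classify_error and the square-footage values are hypothetical, and it presumes that a more authoritative document (such as the executed lease) is available to check against.

```python
from enum import Enum


class ErrorKind(Enum):
    NONE = "no error"
    EXTRACTION = "extraction error"   # the wrong value was pulled from the document
    SOURCE = "source error"           # the cited document itself is wrong


def classify_error(model_value, cited_source_value, authoritative_value):
    """Classify a suspected error using the provenance trail.

    model_value:          the figure in the underwriting model
    cited_source_value:   what the cited document (e.g., rent roll) actually says
    authoritative_value:  what the controlling document (e.g., executed lease) says
    """
    if model_value != cited_source_value:
        # The model disagrees with its own citation: fix the extraction process.
        return ErrorKind.EXTRACTION
    if cited_source_value != authoritative_value:
        # The extraction was faithful, but the cited document is wrong.
        return ErrorKind.SOURCE
    return ErrorKind.NONE


# Hypothetical square-footage figures for illustration.
print(classify_error(12500, 12500, 15200).value)  # rent roll copied faithfully, but rent roll is wrong
print(classify_error(12500, 15200, 15200).value)  # rent roll is right, extraction is wrong
```

When both comparisons fail, this sketch reports the extraction error first, since the model cannot be checked against the source until the extraction is corrected.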

3. Dispute Resolution

Disputes arise in CRE transactions. A buyer claims the seller misrepresented occupancy. A lender alleges that the borrower inflated NOI. A tenant disputes the landlord's calculation of pass-through expenses.

In these situations, provenance provides evidence. If your underwriting shows 94% occupancy, and you can demonstrate that this figure was extracted from the seller-provided rent roll dated March 15, 2024, page 1, row 47 (total occupied SF divided by total rentable SF), you have a defensible position. The dispute shifts to whether the source document was accurate, not whether your team fabricated numbers.

4. Regulatory and Compliance Requirements

CMBS issuers, institutional investors, and regulated lenders operate under compliance frameworks that require documentation of underwriting inputs. Provenance satisfies these requirements by providing the audit trail that regulators expect.

When an examiner asks how you determined the debt service coverage ratio, provenance lets you trace the NOI figure to the T-12, the debt service figure to the loan documents, and the calculation methodology to your underwriting policy. This traceability is not optional for regulated entities.
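One way to make a derived figure such as DSCR traceable is to have the result carry its inputs and formula with it. The sketch below assumes the standard definition (NOI divided by annual debt service). The type names, dollar amounts, and citation strings are hypothetical and are not drawn from any actual deal.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass(frozen=True)
class SourcedValue:
    """An input figure together with the citation that supports it."""
    name: str
    amount: Decimal
    citation: str


@dataclass(frozen=True)
class DerivedValue:
    """A calculated figure that remembers its formula and inputs."""
    name: str
    amount: Decimal
    formula: str
    inputs: Tuple[SourcedValue, ...]


def dscr(noi: SourcedValue, debt_service: SourcedValue) -> DerivedValue:
    """Compute DSCR = NOI / annual debt service, rounded to two decimals."""
    if debt_service.amount == 0:
        raise ValueError("annual debt service must be non-zero")
    ratio = (noi.amount / debt_service.amount).quantize(Decimal("0.01"))
    return DerivedValue("DSCR", ratio, "NOI / annual debt service",
                        (noi, debt_service))


# Hypothetical inputs for illustration only.
noi = SourcedValue("NOI", Decimal("2400000"), "T-12 operating statement")
debt = SourcedValue("Annual debt service", Decimal("1600000"), "Loan documents")

result = dscr(noi, debt)
print(result.name, result.amount, "=", result.formula)
for item in result.inputs:
    print(f"  {item.name}: {item.amount} (source: {item.citation})")
```

Because the inputs travel with the result, an examiner's question about the ratio can be answered from the record itself rather than reconstructed afterward.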

What Good Provenance Looks Like

Effective provenance implementation has several characteristics.

Field-level granularity. Provenance should attach to individual data points, not entire documents. Knowing that "data came from the lease" is insufficient. Knowing that "base rent of $25.00/SF came from Lease Section 3.1, page 4" is useful.

Direct linking. Users should be able to click from a data point to its source location, ideally with the relevant passage highlighted. This reduces friction in verification workflows.

Version awareness. When documents are updated (a new rent roll arrives, an amendment supersedes a lease provision), provenance should reflect which version of the document was used. A figure extracted from a March rent roll should not be confused with a figure from a June rent roll.

Preservation of original values. When data is normalized or transformed (dates reformatted, names standardized, units converted), provenance should preserve the original extracted value alongside the normalized version. This allows reviewers to verify that the transformation was correct.

Conflict documentation. When the same data point appears differently across documents (the lease says $25.00/SF, the rent roll says $26.00/SF), provenance should capture both values, their respective sources, and the resolution (which value was used and why).
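A conflict record along these lines might look like the following sketch. The names (Observation, ConflictRecord, resolve_conflict) are hypothetical, and the precedence rule that the executed lease controls over the rent roll is an assumed policy chosen for illustration; each team would set its own.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Observation:
    """One value for a field, as stated in one source document."""
    value: Decimal
    source: str


@dataclass(frozen=True)
class ConflictRecord:
    """Both conflicting values, plus which one was used and why."""
    field_name: str
    observations: Tuple[Observation, ...]
    chosen: Observation
    rationale: str


def resolve_conflict(field_name: str,
                     observations: Sequence[Observation],
                     precedence: Sequence[str],
                     rationale: str) -> ConflictRecord:
    """Pick the observation from the highest-precedence source.

    All observations are preserved in the record, not only the winner.
    """
    if len({obs.value for obs in observations}) < 2:
        raise ValueError("observations agree; there is no conflict to record")
    for source in precedence:
        for obs in observations:
            if obs.source == source:
                return ConflictRecord(field_name, tuple(observations),
                                      obs, rationale)
    raise ValueError("no observation matches a source in the precedence list")


record = resolve_conflict(
    field_name="base_rent_psf",
    observations=[
        Observation(Decimal("25.00"), "executed lease"),
        Observation(Decimal("26.00"), "rent roll"),
    ],
    precedence=["executed lease", "rent roll"],  # assumed policy
    rationale="Executed lease controls over the rent roll (assumed policy).",
)
print(record.chosen.value, "from", record.chosen.source)
```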

Real-World Provenance Scenarios

Consider how provenance functions in practice.

Scenario 1: NOI verification. An investor questions the $2.4M NOI figure in the underwriting model. With provenance, the analyst shows that NOI was calculated from the trailing 12-month operating statement dated September 30, 2024. Revenue of $3.8M came from line 1 (Gross Potential Rent) minus line 5 (Vacancy Loss) plus line 8 (Other Income). Expenses of $1.4M came from lines 12 through 28 (Operating Expenses). Each line item links to the specific cell in the source document. The investor verifies the figures in minutes and proceeds with confidence.
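The arithmetic in Scenario 1 can be checked mechanically once each total carries its citation. This is a small sketch using only the totals stated above; the dictionary layout, function names, and citation strings are assumptions, and the individual line-item amounts are omitted because the scenario does not give them.

```python
from decimal import Decimal

# Totals as stated in the scenario; citation strings are illustrative.
LINE_ITEMS = {
    "revenue": (Decimal("3800000"), "T-12 dated 2024-09-30, lines 1, 5, and 8"),
    "expenses": (Decimal("1400000"), "T-12 dated 2024-09-30, lines 12-28"),
}
REPORTED_NOI = Decimal("2400000")


def recompute_noi(items):
    """NOI = revenue - operating expenses."""
    return items["revenue"][0] - items["expenses"][0]


def verify_noi(items, reported):
    """Return True when the recomputed NOI matches the reported figure."""
    return recompute_noi(items) == reported


for name, (amount, citation) in LINE_ITEMS.items():
    print(f"{name}: {amount} (source: {citation})")
print("NOI matches reported figure:", verify_noi(LINE_ITEMS, REPORTED_NOI))
```

Using Decimal rather than floating point avoids rounding surprises when the comparison is an exact equality.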

Scenario 2: Lease term discrepancy. During due diligence, a lender notes that the rent roll shows Tenant B's lease expiring in December 2026, but the lease abstract shows December 2027. Provenance reveals that the rent roll figure was extracted from the seller's Excel file (cell F15), while the lease abstract figure was extracted from the executed lease (page 2, Section 2.1). Further investigation reveals an amendment extending the lease by one year. The rent roll was simply outdated. The underwriting model is updated to reflect the correct expiration, and the provenance record documents the correction with a reference to the amendment.
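The correction in Scenario 2 could be recorded roughly as follows. This is a sketch with hypothetical names (TrackedField, Change, correct). The amendment label and analyst name are placeholders, and expirations are kept as year-month strings because the scenario gives no exact dates.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Change:
    """One correction, with the prior value and source kept for audit."""
    old_value: str
    old_source: str
    new_value: str
    new_source: str
    changed_by: str
    reason: str
    changed_at: datetime


@dataclass
class TrackedField:
    """A model field whose corrections are appended to a history."""
    name: str
    value: str
    source: str
    history: List[Change] = field(default_factory=list)

    def correct(self, new_value: str, new_source: str,
                changed_by: str, reason: str) -> None:
        """Update the value and log what it replaced, rather than overwriting."""
        self.history.append(Change(
            old_value=self.value, old_source=self.source,
            new_value=new_value, new_source=new_source,
            changed_by=changed_by, reason=reason,
            changed_at=datetime.now(timezone.utc),
        ))
        self.value = new_value
        self.source = new_source


expiration = TrackedField(
    name="tenant_b_lease_expiration",
    value="2026-12",
    source="Seller rent roll (Excel), cell F15",
)
expiration.correct(
    new_value="2027-12",
    new_source="Lease amendment (placeholder reference)",
    changed_by="analyst",  # placeholder
    reason="Amendment extends the lease by one year; rent roll was outdated.",
)
print(expiration.value, "| prior:", expiration.history[0].old_value)
```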

Scenario 3: Expense audit. A tenant disputes a CAM reconciliation, claiming that certain expenses were misallocated. The landlord's underwriting system shows provenance for each expense line item, tracing figures back to invoices, service contracts, and the operating statement. The dispute is resolved by examining the source documents rather than debating whose spreadsheet is correct.

Consequences of Missing Provenance

When provenance is absent, several problems compound.

| Problem | Consequence |
| --- | --- |
| Unverifiable figures | Investors and lenders lose confidence in underwriting accuracy |
| Untraceable errors | Correction requires re-examining all source documents, not just the relevant one |
| Weak dispute position | Claims about data accuracy become "your word against theirs" |
| Compliance gaps | Audit requests trigger scrambles to reconstruct sourcing after the fact |
| Repeated mistakes | Without knowing where errors originated, the same extraction failures recur |

These consequences are avoidable. Missing provenance is a failure of workflow discipline, not a technological limitation: modern extraction systems can capture source citations automatically. The question is whether teams implement and enforce provenance standards.

Conclusion

Data provenance transforms CRE underwriting from an exercise in spreadsheet manipulation to a discipline grounded in auditable evidence. Every figure in an underwriting model should trace back to a specific location in a specific document, with a clear record of when it was extracted and how it was transformed. This traceability satisfies investors, protects against disputes, supports compliance requirements, and enables rapid error correction. In a transaction environment where data quality determines deal outcomes, provenance is the mechanism that makes quality verifiable.

Request a Free Trial

See how Eagle Eye brings clarity, accuracy, and trust to deal documents.
