What a Complete OM Extract Contains: The Field Standard for Inbound Deal Flow

An offering memorandum is a marketing document with a financial document buried inside it. The first half is narrative: location summary, demographic charts, tenant profiles, value-add story. The second half is the rent roll, the T-12, and the proforma. A broker spends a month producing the OM. A firm spends ninety seconds deciding whether to read it.

The compression problem at the firm is a structural mismatch between how OMs are written and how OMs are consumed. The OM is dense because the broker has to convince a buyer. The buyer wants the structured fields that drive a screening decision. Most of the dense content gets discarded by the buyer not because it is unimportant but because it is in the wrong format.

A complete OM extract restores the information without forcing a human to read the document. It produces a structured record with every field a screening decision could touch, each field cited back to the page and language in the source OM. The extract is not a summary. It is a translation.

Why Summary Extracts Fail

The default extraction approach is to read the OM and capture the headline fields: address, asset type, units, asking price, cap rate, broker. This produces a record that fits in a row of a spreadsheet. It also discards the information that determines whether the deal is actually a fit.

What Summary Extracts Capture	What They Discard
Asking price	Underlying assumptions in the proforma
Headline cap rate	Rent growth assumptions producing it
Unit count	Unit mix and floor plan distribution
Year built	Renovation history and capital plan
In-place NOI	Trailing twelve months of expense detail
Tenant count	Concentration, credit, and lease maturity ladder
Business plan label	The actual narrative and operator thesis

A firm screening on summary extracts is screening on the broker's headline numbers. Those numbers are designed to attract attention, not to support a decision. The deal that fails on the headline cap rate may pass on the actual yield-on-cost after capex. The deal that passes on the headline cap rate may fail when the proforma rent assumptions are pulled apart.
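The gap between the two measures is simple arithmetic. The sketch below uses made-up figures (the price, NOI, and capex values are illustrative assumptions, not drawn from any OM) to show how a deal with a thin headline cap rate can still clear a yield-on-cost hurdle once capex and stabilized NOI are included.

```python
def cap_rate(noi: float, price: float) -> float:
    """Headline cap rate: in-place NOI divided by asking price."""
    return noi / price


def yield_on_cost(stabilized_noi: float, price: float, capex: float) -> float:
    """Untrended yield on cost: stabilized NOI over total basis (price plus capex)."""
    return stabilized_noi / (price + capex)


# Hypothetical deal. Every figure here is an illustrative assumption.
price = 20_000_000
in_place_noi = 900_000
capex = 2_000_000
stabilized_noi = 1_430_000  # NOI after the renovation program, untrended

headline = cap_rate(in_place_noi, price)
yoc = yield_on_cost(stabilized_noi, price, capex)

print(f"headline cap rate: {headline:.2%}")  # 4.50%
print(f"yield on cost:     {yoc:.2%}")       # 6.50%
```

A screen that only sees the 4.50% figure rejects this deal. A screen that sees the capex and proforma fields can evaluate the 6.50% figure and then test whether the stabilized NOI assumption holds up.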

The Field Categories a Complete Extract Covers

A complete extract has to capture every field a buy box could query and every field an underwriter would touch in a first model. The categories below are the minimum for multifamily and commercial deals.

Category	Sub-fields
Identity	Property name, addresses, parcel IDs, MSA, submarket
Asset	Type, subtype, year built, year renovated, condition, building count
Size	Units, NRA, GBA, parking, land area
Tenancy	Tenant list, leased SF, lease maturities, credit, concentration
Rent	Current, market, growth assumed, in-place vs. proforma
Operating	Trailing-12 expenses by line, OpEx ratio, recovery method
Capex	Renovation history, planned capital, deferred maintenance
Debt	Existing loan, balance, rate, maturity, assumability, prepay
Pricing	Asking, whisper, cap rate, price per unit or SF
Returns	Untrended yield, levered IRR, equity multiple, hold period
Sponsor	Seller, broker, marketing process, timeline
Narrative	Business plan, value-add story, market thesis

The exact fields vary by asset class. A retail OM has tenant sales, occupancy cost, and co-tenancy provisions that a multifamily OM does not. An industrial OM has clear height, dock count, and trailer parking. The field standard has to be configurable by asset type without losing the cross-asset comparability that makes the database useful.
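One way to get both configurability and comparability is a shared core field set plus per-asset extensions. The sketch below is a minimal illustration, not a full schema: the field names and the `schema_for` and `comparable_fields` helpers are hypothetical, and the asset-specific fields are the ones named above.

```python
# Core fields shared by every asset class. These keep records comparable.
CORE_FIELDS = {
    "property_name", "address", "msa", "submarket",
    "asset_type", "year_built", "asking_price", "cap_rate", "in_place_noi",
}

# Asset-specific extensions (hypothetical names, based on the examples in the text).
ASSET_FIELDS = {
    "multifamily": {"units", "unit_mix"},
    "retail": {"tenant_sales", "occupancy_cost", "co_tenancy_provisions"},
    "industrial": {"clear_height", "dock_count", "trailer_parking"},
}


def schema_for(asset_type: str) -> set[str]:
    """Return the full field set for an asset type: core plus extensions."""
    if asset_type not in ASSET_FIELDS:
        raise ValueError(f"unknown asset type: {asset_type}")
    return CORE_FIELDS | ASSET_FIELDS[asset_type]


def comparable_fields(*asset_types: str) -> set[str]:
    """Fields that exist for all the given asset types and are safe to query across them."""
    schemas = [schema_for(t) for t in asset_types]
    return set.intersection(*schemas)


print(sorted(schema_for("industrial") - CORE_FIELDS))
print(comparable_fields("retail", "industrial") == CORE_FIELDS)
```

Adding an asset class means adding one entry to `ASSET_FIELDS`. Cross-asset queries stay on the core set, so the database remains comparable as the extensions grow.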

Extraction From Narrative, Not Just Tables

The fields that matter most for screening often live in narrative sections, not in tables. A broker explains in prose why the rents are below market, why the property is mismanaged, why the tenant base has more upside than the trailing data suggests. This narrative is the deal thesis. It is also the part most extraction systems ignore.

A complete extract captures the narrative as structured fields, not just as text. The "value-add thesis" field is not the entire two-page section. It is the extracted operator hypothesis: rents 12% below market, units in original condition, planned $8k per unit renovation, target rent premium of $180. The numbers come out of the narrative even when the narrative does not put them in a table.
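As a rough illustration of what "structured fields from narrative" means, the sketch below parses the example thesis above into a typed record. The regular expressions are a stand-in for whatever extraction method a firm actually uses, the `ValueAddThesis` and `parse_thesis` names are hypothetical, and treating the rent premium as a monthly per-unit figure is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValueAddThesis:
    rent_gap_pct: Optional[float]        # in-place rents below market, in percent
    reno_cost_per_unit: Optional[float]  # planned renovation spend, in dollars
    rent_premium: Optional[float]        # target premium in dollars (assumed monthly)
    source_text: str                     # narrative excerpt the values came from


def _first(pattern: str, text: str) -> Optional[float]:
    """Return the first captured number for a pattern, or None if absent."""
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return float(match.group(1)) if match else None


def parse_thesis(text: str) -> ValueAddThesis:
    """Pull the operator hypothesis out of a narrative excerpt."""
    reno_k = _first(r"\$(\d+(?:\.\d+)?)k\s+per unit", text)
    return ValueAddThesis(
        rent_gap_pct=_first(r"(\d+(?:\.\d+)?)%\s+below market", text),
        reno_cost_per_unit=reno_k * 1000 if reno_k is not None else None,
        rent_premium=_first(r"rent premium of \$(\d+(?:\.\d+)?)", text),
        source_text=text,
    )


excerpt = ("rents 12% below market, units in original condition, "
           "planned $8k per unit renovation, target rent premium of $180")
thesis = parse_thesis(excerpt)
print(thesis.rent_gap_pct, thesis.reno_cost_per_unit, thesis.rent_premium)
# 12.0 8000.0 180.0
```

Fields the narrative does not state come back as `None` rather than a guess, and the excerpt travels with the record so the values can be checked against the source language.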

The same applies to risk factors. A complete extract captures the risks the broker discloses (deferred maintenance, tenant concentration, environmental issue, zoning ambiguity) as discrete fields with the source language attached. The risks the broker does not disclose are a separate problem and require diligence, not extraction. But the disclosed risks should never be lost because they live in prose.

Source Citation Requirements

Every extracted field needs a citation back to the OM. The citation has to identify the page, the section, and the source language that produced the value. This is the same standard that applies to lease abstraction, and it applies here for the same reason: the extract is a record that has to be defensible when questioned.

Field	Citation Required
Asking price	Page, paragraph, exact phrasing
In-place rent	Source table, row, column
Proforma assumption	Section, paragraph, narrative excerpt
Risk factor	Section, paragraph, source language
Tenant credit	Source table or narrative claim

Without citations, the extract is a set of values that look authoritative but are not auditable. With citations, an analyst reviewing the extract can verify any field in seconds. A principal questioning a number in screening can see the source without opening the PDF.
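A citation-carrying field can be modeled as a value paired with its source location. The sketch below is one possible shape, assuming hypothetical `Citation` and `ExtractedField` types. The page number, section name, and quoted phrasing are invented for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Citation:
    page: int
    section: str
    source_text: str  # exact language or table cell that produced the value


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: Any
    citation: Citation

    def audit_line(self) -> str:
        """One-line rendering an analyst can check against the OM."""
        c = self.citation
        return f'{self.name} = {self.value} (p. {c.page}, {c.section}: "{c.source_text}")'


# Illustrative values only. Nothing here comes from a real OM.
asking = ExtractedField(
    name="asking_price",
    value=20_000_000,
    citation=Citation(page=3, section="Executive Summary",
                      source_text="offered at $20,000,000"),
)
print(asking.audit_line())
# asking_price = 20000000 (p. 3, Executive Summary: "offered at $20,000,000")
```

Because the citation is a required part of the field, a value without a source cannot be constructed in the first place.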

Confidence Scores and Exception Routing

Extraction is not perfect. The system has to know when it is uncertain and route those fields to human review. The confidence score is the mechanism.

A field extracted from a clearly labeled table at 98% confidence does not need review. A field extracted from a narrative section at 72% confidence does. A firm that treats every extraction as authoritative will accumulate errors. A firm that reviews only the low-confidence extractions will catch most errors with a fraction of the review time.

Confidence Tier	Treatment
High (95%+)	Auto-accept, no review
Medium (80% to under 95%)	Review on request or for material fields
Low (below 80%)	Mandatory review before screening decision

The tiering allows the firm to scale review effort with deal volume without losing accuracy. The principal sees a queue of deals where the high-confidence extractions are ready to score and the low-confidence ones are flagged for the analyst.
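The tier table translates directly into a routing rule. The sketch below is a minimal version, assuming scores are expressed as fractions between 0 and 1 and that "material" is a flag the firm sets per field. The function names are hypothetical, and the review-on-request path is left out.

```python
def confidence_tier(score: float) -> str:
    """Map a confidence score in [0, 1] to the tier from the table above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if score >= 0.95:
        return "high"
    if score >= 0.80:
        return "medium"
    return "low"


def needs_review(score: float, material: bool = False) -> bool:
    """Low tier always routes to review; medium tier only for material fields."""
    tier = confidence_tier(score)
    if tier == "low":
        return True
    if tier == "medium":
        return material
    return False


print(needs_review(0.98))                 # False: clearly labeled table
print(needs_review(0.72))                 # True: narrative extraction
print(needs_review(0.90, material=True))  # True: medium tier, material field
```

The boundaries are inclusive at the bottom of each tier, so a score of exactly 0.95 is high and exactly 0.80 is medium.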

What "Done" Looks Like

A complete OM extract meets the following criteria:

Every screening-relevant field is populated, not just the headline fields.
Narrative sections are parsed for embedded structured data, not just stored as text.
Every field carries a citation back to the source page and language.
Every field carries a confidence score.
Asset-class-specific fields are configurable without losing comparability.
Low-confidence fields route to human review automatically.

If the principal still has to open the PDF to make a screening decision, the extract is incomplete.
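Several of these criteria can be checked mechanically. The sketch below assumes a hypothetical record layout (a dict of field name to `value`, `citation`, `confidence`, and an optional `reviewed` flag) and reports every gap that would send the principal back to the PDF.

```python
REVIEW_THRESHOLD = 0.80  # below this, a field needs human review before screening


def completeness_gaps(record: dict, required: set[str]) -> list[str]:
    """List the reasons a record falls short of the criteria above."""
    gaps = []
    for name in sorted(required):
        field = record.get(name)
        if field is None or field.get("value") is None:
            gaps.append(f"{name}: not populated")
            continue
        if not field.get("citation"):
            gaps.append(f"{name}: missing citation")
        confidence = field.get("confidence")
        if confidence is None:
            gaps.append(f"{name}: missing confidence score")
        elif confidence < REVIEW_THRESHOLD and not field.get("reviewed", False):
            gaps.append(f"{name}: low confidence, awaiting review")
    return gaps


# Illustrative record. Field names and values are assumptions.
record = {
    "asking_price": {"value": 20_000_000, "citation": "p. 3", "confidence": 0.98},
    "rent_gap_pct": {"value": 12.0, "citation": "p. 14", "confidence": 0.72},
}
required = {"asking_price", "rent_gap_pct", "unit_mix"}

for gap in completeness_gaps(record, required):
    print(gap)
# rent_gap_pct: low confidence, awaiting review
# unit_mix: not populated
```

An empty list is the mechanical version of "done". Whether the required set itself covers every screening-relevant field is a judgment the firm still has to make.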

Conclusion

The OM contains far more than firms typically extract from it, and most of what gets discarded is exactly what the screening decision actually needs. A summary extract is faster than reading the OM and worse than not extracting at all, because it produces a false sense of structure around a small subset of fields. A complete extract turns the OM into a record that can be queried, scored, and audited. The work shifts from reading every OM to verifying the extractions that need verification, which is the only version of this workflow that scales with the volume of deal flow a serious firm sees.