MergeOn Substrate Specification
Governed enterprise intelligence infrastructure.
The models are not the moat. The governed substrate is the moat.
The LLM is the terminal. The substrate is the operating system.
Every claim traces to evidence. Every dependency is tracked. Every operation is replay-safe.
Implementation layers
- Intelligence Layer — Orchestration + reasoning substrate. The shared compute surface where unstructured corpora and structured systems of record resolve into one canonical, version-aware operational truth. Your teams build workflows on top of it; the substrate handles orchestration, retrieval, and reasoning composition.
- Intelligent Document Center — Governed document & evidence execution layer. The implementation surface for document-centric operations: ingestion, evidence extraction, human verification gates, replay-safe events. Your operations team executes against this layer; the substrate provides the policy and evidence guarantees.
- MIL — Policy, governance, and runtime mediation layer. The governed context firewall that mediates every model invocation against the systems of record. Models never touch raw data. Your security, compliance, and data teams define the policy; MIL enforces it at runtime and records every decision.
- THEMIS — Dependency, review, and audit intelligence layer. Deterministic reasoning infrastructure for high-stakes operational work: dependency graphs, obligation runtime, scenario simulation, audit-grade lineage. Your domain experts ship rule packs; THEMIS turns them into provably consistent operational truth.
Governed Context Firewall
MIL sits between any LLM and the systems of record an enterprise relies on. Models never touch raw data; every request is mediated, governed, and evidence-linked.
- Intent Detection. Resolve what the requester actually needs.
- Entity Linking. Bind references to canonical operational identity.
- Context Selection. Retrieve only evidence required by the request.
- Retrieval Policy. Enforce access, scope, and need-to-know boundaries.
- Redaction. Remove sensitive fields before any external model receives context.
- Context Shaping. Structure the bounded context the model is permitted to see.
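The six mediation stages above can be read as one pipeline that runs before any model call. The following is a minimal sketch under stated assumptions, not MIL's implementation: the record fields, the role names, the `POLICY` layout, and every function name are hypothetical, and intent detection is reduced to a keyword check purely for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical system-of-record rows; field names are illustrative only.
RECORDS = [
    {"entity": "lease-114", "scope": "leasing", "tenant": "A. Rivera",
     "ssn": "123-45-6789", "rent": 2400},
    {"entity": "lease-114", "scope": "finance", "ledger_balance": -310,
     "ssn": "123-45-6789"},
    {"entity": "lease-220", "scope": "leasing", "tenant": "B. Chen", "rent": 1900},
]

# Hypothetical declared policy: readable scopes per role, fields always redacted.
POLICY = {
    "allowed_scopes": {"leasing_agent": {"leasing"},
                       "controller": {"leasing", "finance"}},
    "redact_fields": {"ssn"},
}

@dataclass
class Request:
    role: str
    text: str

def detect_intent(req: Request) -> str:
    # Keyword stand-in for a real intent classifier.
    return "rent_lookup" if "rent" in req.text.lower() else "general"

def link_entity(req: Request):
    # Bind a textual reference to a canonical identifier, if one is present.
    match = re.search(r"lease-\d+", req.text)
    return match.group(0) if match else None

def mediate(req: Request):
    """Return (bounded_context, decisions) for a single request."""
    intent = detect_intent(req)
    entity = link_entity(req)
    decisions = [("intent", intent), ("entity", entity)]

    # Context selection: only records about the linked entity.
    selected = [r for r in RECORDS if r["entity"] == entity]

    # Retrieval policy: drop records outside the requester's scopes.
    allowed = POLICY["allowed_scopes"].get(req.role, set())
    permitted = [r for r in selected if r["scope"] in allowed]
    decisions.append(("denied_records", len(selected) - len(permitted)))

    # Redaction: strip sensitive fields before anything leaves the boundary.
    redacted = [{k: v for k, v in r.items() if k not in POLICY["redact_fields"]}
                for r in permitted]

    # Context shaping: the only structure a model would be allowed to see.
    context = {"intent": intent, "entity": entity, "evidence": redacted}
    return context, decisions

ctx, decisions = mediate(Request("leasing_agent", "What is the rent on lease-114?"))
```

In this sketch the leasing agent receives one leasing record with the `ssn` field removed, and the finance record is counted as a denial in `decisions` rather than silently dropped.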
Deterministic Reasoning Infrastructure
THEMIS turns multi-document corpora into provably consistent operational truth. Versioned knowledge, dependency-aware reasoning, deterministic execution.
- Document Intake. Multi-format ingestion with layout-aware parsing.
- Structure Parser. Hierarchical section detection and cross-reference linking.
- AST Generator. Abstract syntax tree for the contractual or operational language.
- Semantic Analyzer. Concept extraction and defined-term resolution.
- Entity Extractor. Party identification and canonical role assignment.
- Obligation Engine. Duty extraction, rights identification, condition mapping.
- Temporal Analyzer. Effective dates, supersession chains, deadline derivation.
- Dependency Graph. Cross-clause and cross-document references as a typed graph.
- Constraint Solver. Logical consistency verification and contradiction detection.
- State Machine. Lifecycle tracking, milestone resolution, transition validation.
- Risk Scorer. Clause-level and corpus-level risk quantification.
- Comparator. Version differencing against canonical baselines.
- Anomaly Detector. Unusual clauses and outlier obligations.
- Transaction Simulator. Outcome and closing-probability simulation.
- Self-Healing Engine. Deterministic fix proposals for surfaced contradictions.
- Amendment Generator. Constrained drafting that respects every dependency.
- Compliance Checker. Regulatory and policy alignment.
- Knowledge Integrator. External-data enrichment against the canonical record.
- Output Formatter. Report generation, structured exports, downstream API responses.
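Of the stages listed above, the Dependency Graph is the easiest to illustrate in a few lines. The sketch below assumes a simple in-memory representation: clause identifiers are strings, each reference carries an edge type, and a change to one clause is propagated to everything that depends on it by breadth-first traversal. The class name, the edge-type labels, and the clause identifiers are hypothetical and are not taken from THEMIS.

```python
from collections import defaultdict, deque

class DependencyGraph:
    """Typed references between clauses, stored as reverse edges for impact lookup."""

    def __init__(self):
        # target clause -> list of (dependent clause, edge type)
        self._dependents = defaultdict(list)

    def add_reference(self, source: str, target: str, edge_type: str) -> None:
        """Record that `source` depends on `target` through `edge_type`."""
        self._dependents[target].append((source, edge_type))

    def impacted_by(self, changed: str):
        """Return every clause reachable from `changed`, in breadth-first order.

        Each result is (clause, edge_type), where edge_type is the reference
        through which the clause was first reached.
        """
        seen = {changed}
        order = []
        queue = deque([changed])
        while queue:
            node = queue.popleft()
            for source, edge_type in self._dependents.get(node, ()):
                if source not in seen:
                    seen.add(source)
                    order.append((source, edge_type))
                    queue.append(source)
        return order

graph = DependencyGraph()
graph.add_reference("msa/4.2-payment", "msa/1.1-definitions", "uses_defined_term")
graph.add_reference("sow-3/2-fees", "msa/4.2-payment", "cross_document")
graph.add_reference("msa/9-termination", "msa/4.2-payment", "conditioned_on")

impacted = graph.impacted_by("msa/1.1-definitions")
```

Storing edges in the reverse direction is a design choice for this sketch: it makes the question "what must be re-reviewed if this clause changes" a single traversal.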
For governed agentic execution
- Governed context for agents. Every model invocation passes through MIL — scope, redaction, and access are policy-bounded, not prompt-bounded. Agents see only what they are entitled to see.
- Deterministic action execution. When an agent invokes a tool that touches a system of record, THEMIS evaluates the call against rule packs before any side-effect. Probabilistic answers do not become unilateral actions.
- Replay-safe operations. Every agent step — context received, policy decision, tool call, outcome — is logged with provenance. The decision can be reconstructed end-to-end for any auditor or counterparty.
- Multi-engine coordination. Agents coordinate through the substrate, not through ad-hoc agent-to-agent calls. Reasoning, retrieval, and policy engines compose under one governance boundary.
- Provider-agnostic infrastructure. Switching the reasoning model does not invalidate the agent system. Knowledge stays external and versioned; the agent contract is the substrate, not the vendor.
- Auditable autonomy. Agents can act. Every action carries provenance, every action is replayable, every action is bounded by policy. Autonomy without governance is not enterprise-ready.
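The deterministic action execution point above amounts to a gate: a tool call is evaluated against a rule pack, and the side effect runs only if no rule objects. The following sketch assumes a rule is a plain function that returns a violation message or `None`; the `ToolCall` shape, the two example rules, and the `issue_refund` tool are hypothetical stand-ins, not the substrate's rule-pack format.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class ToolCall:
    agent: str
    tool: str
    args: dict

# A rule returns a violation message, or None when the call is acceptable.
Rule = Callable[[ToolCall], Optional[str]]

def known_tools_only(call: ToolCall) -> Optional[str]:
    if call.tool not in {"issue_refund"}:
        return f"tool {call.tool!r} is not declared in policy"
    return None

def refund_ceiling(call: ToolCall) -> Optional[str]:
    if call.tool == "issue_refund" and call.args.get("amount", 0) > 500:
        return "refund above 500 requires human approval"
    return None

RULE_PACK = [known_tools_only, refund_ceiling]

def execute_governed(call, tools, rule_pack, log):
    """Evaluate every rule first; invoke the tool only if none object."""
    reasons = [msg for msg in (rule(call) for rule in rule_pack) if msg is not None]
    if reasons:
        log.append({"call": call, "decision": "deny",
                    "reasons": reasons, "outcome": None})
        return None
    outcome = tools[call.tool](**call.args)  # the side effect happens only here
    log.append({"call": call, "decision": "allow",
                "reasons": [], "outcome": outcome})
    return outcome

# Hypothetical system of record and tool.
ledger = []

def issue_refund(amount):
    ledger.append(amount)
    return f"refunded {amount}"

tools = {"issue_refund": issue_refund}
log = []
denied = execute_governed(ToolCall("agent-7", "issue_refund", {"amount": 900}),
                          tools, RULE_PACK, log)
allowed = execute_governed(ToolCall("agent-7", "issue_refund", {"amount": 120}),
                           tools, RULE_PACK, log)
```

Both the denial and the approval land in `log` with the call and the reasons, so the decision can be reconstructed later; only the approved call reaches `ledger`.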
What enterprises should not rebuild
- Context governance. Shaping, scoping, and redacting context before any model receives it — under policy, with evidence.
- Evidence lineage. Every value the system surfaces resolves back to source coordinates and a verifiable hash.
- Deterministic reasoning. Operational answers that are computed against rule packs, not improvised by a language model.
- Replay-safe orchestration. Every decision can be reconstructed from logged context, inputs, and policy at the moment of the decision.
- Dependency intelligence. Cross-document and cross-system references tracked as a typed graph; cascades resolved automatically.
- Survivability. Knowledge stays external and versioned. A change of vendor, model, or jurisdiction does not invalidate the system.
- Policy mediation. Access, redaction, scope, and tool invocation all evaluated against declared policy — not against developer intent.
- Auditability. Auditors read a record, not a guess. Every query, response, redaction, and denial is logged with provenance.
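The evidence lineage item above describes values that resolve to source coordinates and a verifiable hash. One way to picture that, assuming coordinates are character offsets into a versioned document and the hash is SHA-256 over the exact cited span, is sketched below. The `Evidence` fields and the `cite` and `verify` helpers are hypothetical; the source does not specify the coordinate system or the hash algorithm.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Evidence:
    document_id: str
    version: int
    start: int    # character offset where the cited span begins (assumed scheme)
    end: int      # character offset where the cited span ends (exclusive)
    sha256: str   # hex digest of the exact cited span

def _digest(span: str) -> str:
    return hashlib.sha256(span.encode("utf-8")).hexdigest()

def cite(document_id: str, version: int, text: str, start: int, end: int) -> Evidence:
    """Create an evidence record for text[start:end]."""
    return Evidence(document_id, version, start, end, _digest(text[start:end]))

def verify(evidence: Evidence, text: str) -> bool:
    """Check that the span at the recorded coordinates still matches the hash."""
    return _digest(text[evidence.start:evidence.end]) == evidence.sha256

source = "Rent is due on the first day of each month."
start = source.index("first day")
evidence = cite("lease-114", 3, source, start, start + len("first day"))

tampered = source.replace("first", "fifth")
```

Here `verify(evidence, source)` holds for the original text and fails for `tampered`, which is the property an auditor relies on when tracing a surfaced value back to its source.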
Vertical backbones
- Real Estate — Relationship intelligence. Proof: AI-LTOR — MergeOn’s first real estate vertical operating system, built on the MergeOn substrate.
- Logistics — Shipment state and exception flow.
- Healthcare — Authorization and care-pathway intelligence.
- Finance — Transaction, risk, and approval governance.
- Insurance — Claims, policy, and exposure intelligence.
- Legal — Agreement, obligation, and matter runtime.
Supported reasoning providers
OpenAI, Anthropic, Google, Azure OpenAI, AWS Bedrock, Cohere, and self-hosted models.
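Provider independence can be pictured as a narrow interface that every reasoning backend satisfies, with the bounded context supplied from outside the provider. The sketch below is an assumption about how such a contract might look: the `ReasoningProvider` protocol and both stub providers are hypothetical and call no vendor SDK.

```python
from typing import Protocol

class ReasoningProvider(Protocol):
    """Hypothetical contract: a provider only ever sees a bounded context."""
    name: str

    def complete(self, bounded_context: dict) -> str: ...

class CountingStub:
    # Stand-in for one vendor; performs no network call.
    name = "stub-a"

    def complete(self, bounded_context: dict) -> str:
        n = len(bounded_context["evidence"])
        return f"{n} evidence item(s) for {bounded_context['entity']}"

class ShoutingStub:
    # Stand-in for a second vendor with a different output style.
    name = "stub-b"

    def complete(self, bounded_context: dict) -> str:
        return f"entity {bounded_context['entity']}".upper()

def answer(provider: ReasoningProvider, bounded_context: dict) -> dict:
    """Run one provider and record which context version it was given."""
    return {
        "provider": provider.name,
        "context_version": bounded_context["version"],
        "text": provider.complete(bounded_context),
    }

context = {"entity": "lease-114", "version": 3, "evidence": [{"rent": 2400}]}
first = answer(CountingStub(), context)
second = answer(ShoutingStub(), context)
```

Swapping the provider changes the generated text but not the context or its version, which stay under the caller's control.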
Compliance posture
Aligned to: SOC 2, HIPAA, GDPR.
- Bring Your Own Storage. Documents and records remain in customer infrastructure.
- Customer-Managed Keys. AES-256 encryption with customer-held key material.
- Regional Data Residency. Per-region deployment and policy enforcement.
- Evidence Chain. Every operation links to source coordinates and a verifiable hash.
- Replay & Audit. Every query, response, and policy decision is logged and replayable.
- PII Redaction by Policy. Sensitive fields removed before any external model receives context.
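The Replay & Audit item above requires that logged decisions can be trusted when they are read back. One common technique for making a log tamper-evident is a hash chain, where each entry commits to the one before it. The sketch below uses that technique as an assumption; the source does not state how the substrate protects its logs, and the entry layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the entry before the first one

def _entry_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys) so the same event always hashes the same way.
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_entry(log: list, event: dict) -> None:
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": _entry_hash(prev_hash, event)})

def verify_chain(log: list) -> bool:
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if _entry_hash(prev_hash, entry["event"]) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True

audit_log = []
append_entry(audit_log, {"step": "query", "role": "leasing_agent"})
append_entry(audit_log, {"step": "policy", "decision": "deny", "field": "ssn"})
append_entry(audit_log, {"step": "response", "evidence_count": 1})

intact = verify_chain(audit_log)

# Simulate someone rewriting a past policy decision.
audit_log[1]["event"]["decision"] = "allow"
after_tampering = verify_chain(audit_log)
```

In this sketch `intact` is true and `after_tampering` is false: changing a past decision invalidates the stored hash of that entry.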
Machine-readable specification. This document contains structured data for automated systems.
Human-readable substrate overview: mergeon.com/what-we-do