MAINTENANCE_CHECKLIST.md

Atlas Maintenance Checklist

This file documents the current maintenance contract so that the implementation stays aligned with it.

1) Property policy

  • Unitary / strict
    • MID
    • Wikidata QID
    • birth date
    • coordinates
  • Lenient for now
    • VIAF (many-to-one / cluster-style)
    • ISNI
    • other ambiguous authority identifiers
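
The policy above could be encoded as a small lookup table. This is a minimal sketch, not the project's actual code: the property keys, the `Policy` enum, and the `accepts` helper are all hypothetical, and it assumes "strict" means at most one distinct value per entity while "lenient" lets several values coexist.

```python
from enum import Enum


class Policy(Enum):
    STRICT = "strict"    # assumption: at most one distinct value per entity
    LENIENT = "lenient"  # assumption: several values may coexist


# Hypothetical property keys; the real field names may differ.
PROPERTY_POLICY = {
    "mid": Policy.STRICT,
    "wikidata_qid": Policy.STRICT,
    "birth_date": Policy.STRICT,
    "coordinates": Policy.STRICT,
    "viaf": Policy.LENIENT,
    "isni": Policy.LENIENT,
}


def policy_for(prop: str) -> Policy:
    # Assumption: identifiers not listed above are treated as lenient,
    # mirroring the "other ambiguous authority identifiers" bullet.
    return PROPERTY_POLICY.get(prop, Policy.LENIENT)


def accepts(prop: str, existing: set, candidate: str) -> bool:
    """Return True if candidate can be added without creating a conflict."""
    if policy_for(prop) is Policy.LENIENT:
        return True
    # Strict: fine if nothing is stored yet or the value is unchanged.
    return not existing or existing == {candidate}
```

A second, different QID for the same entity would be rejected here and would have to go through the supersession rules instead, whereas a second VIAF cluster ID would simply be accepted.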

2) Source responsibilities

  • resolve_entity
    • searches / resolves
    • marks wikidata_status = hit
    • does not persist the full Wikidata payload by default
  • maintain_entities.py
    • downloads the full Wikidata entity object
    • persists it
    • performs type-specific enrichment
    • supersedes conflicting claims when appropriate

3) Type inference

  • Use Wikidata P31 as the primary type signal.
  • Reason over the Wikidata ontology hierarchy first.
  • If the ontology cannot decide, ask the LLM.
  • If the type is already clear, skip to enrichment + write-back.
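
The ontology-first, LLM-second order could look roughly like the sketch below. The function names, the `ROOT_TYPES` table, and the `parents` mapping (subclass links, P279 on Wikidata) are illustrative assumptions, and `ask_llm` stands in for whatever LLM call the project uses.

```python
# Assumed root classes for the three types in this checklist.
ROOT_TYPES = {"Q5": "Person", "Q2221906": "Location", "Q43229": "Organization"}


def infer_from_ontology(p31, parents, max_depth=10):
    """Walk subclass links upward from the P31 values.

    Returns a type name if exactly one root type is reachable,
    otherwise None (nothing found, or the result is ambiguous).
    """
    found, seen = set(), set()
    frontier = list(p31)
    depth = 0
    while frontier and depth <= max_depth:
        next_frontier = []
        for qid in frontier:
            if qid in seen:
                continue  # guards against cycles in the hierarchy
            seen.add(qid)
            if qid in ROOT_TYPES:
                found.add(ROOT_TYPES[qid])
            next_frontier.extend(parents.get(qid, []))
        frontier = next_frontier
        depth += 1
    return found.pop() if len(found) == 1 else None


def infer_type(p31, parents, ask_llm):
    """Prefer the ontology; fall back to the LLM only when it cannot decide."""
    decided = infer_from_ontology(p31, parents)
    return decided if decided is not None else ask_llm(p31)
```

For example, with a toy hierarchy `{"Q515": ["Q2221906"]}` (simplified, not the real Wikidata chain), `infer_type(["Q515"], ...)` resolves to a location without ever calling the LLM callback.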

4) Type-specific enrichment

  • Person
    • birth date (strict)
    • birth place
    • citizenship
  • Location
    • latitude (strict)
    • longitude (strict)
    • country / region when available
  • Organization
    • inception
    • headquarters
    • industry
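
One way to keep this list and the code in sync is a declarative enrichment plan. The sketch below is an assumption about how that might look: the `Field` class and field names are hypothetical, and the Wikidata property IDs are the ones that appear to match each bullet (P569 date of birth, P19 place of birth, P27 country of citizenship, P625 coordinate location, P17 country, P131 located in the administrative territorial entity, P571 inception, P159 headquarters location, P452 industry), not a mapping confirmed by the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    wikidata_property: str  # assumed mapping, see the paragraph above
    strict: bool = False


ENRICHMENT_PLAN = {
    "Person": (
        Field("birth_date", "P569", strict=True),
        Field("birth_place", "P19"),
        Field("citizenship", "P27"),
    ),
    "Location": (
        # Latitude and longitude both come from the coordinate property.
        Field("latitude", "P625", strict=True),
        Field("longitude", "P625", strict=True),
        Field("country", "P17"),
        Field("region", "P131"),
    ),
    "Organization": (
        Field("inception", "P571"),
        Field("headquarters", "P159"),
        Field("industry", "P452"),
    ),
}


def strict_fields(entity_type: str) -> list:
    """Names of the fields marked strict for this type (empty if unknown)."""
    return [f.name for f in ENRICHMENT_PLAN.get(entity_type, ()) if f.strict]
```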

5) Supersession rules

  • Never silently overwrite a claim.
  • If a stronger claim replaces a weaker one:
    • mark the old claim as superseded
    • create the new claim as active
  • Normal JSON output must show only active claims.
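
A minimal sketch of these rules follows. The `Claim` shape and its field names are hypothetical, and deciding whether a claim is "stronger" is left to the caller; the sketch only shows that the old claim is kept and flagged rather than overwritten.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    prop: str
    value: str
    source: str
    status: str = "active"            # "active" or "superseded"
    superseded_by: Optional[str] = None


def supersede(claims: list, new_claim: Claim) -> list:
    """Add new_claim as active and flag conflicting active claims.

    Nothing is deleted: the old claim stays in the list with
    status "superseded" and a pointer to the replacing source.
    """
    for claim in claims:
        conflicting = (
            claim.prop == new_claim.prop
            and claim.status == "active"
            and claim.value != new_claim.value
        )
        if conflicting:
            claim.status = "superseded"
            claim.superseded_by = new_claim.source
    claims.append(new_claim)
    return claims


def active_claims(claims: list) -> list:
    """The view that normal JSON output would be built from."""
    return [c for c in claims if c.status == "active"]
```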

6) JSON output policy

  • Normal resolve_entity output should stay compact and human-readable.
  • Use payloads=true for raw payload snapshots.
  • Use debug=true for full claim/provenance detail.
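
The three output levels could be implemented along these lines. This is only a sketch: the `render` function, the entity dictionary layout, and the output key names are assumptions, and only the `payloads` and `debug` flags come from the policy above.

```python
import json


def render(entity: dict, payloads: bool = False, debug: bool = False) -> str:
    """Serialize an entity; extra detail appears only when requested."""
    out = {
        "id": entity["id"],
        "type": entity.get("type"),
        # Compact default: active claims only, without provenance.
        "claims": [
            {"prop": c["prop"], "value": c["value"]}
            for c in entity["claims"]
            if c["status"] == "active"
        ],
    }
    if payloads:
        out["payloads"] = entity.get("payloads", {})  # raw snapshots
    if debug:
        out["all_claims"] = entity["claims"]  # includes superseded claims
    return json.dumps(out, indent=2, sort_keys=True)
```

Keeping the default branch free of provenance fields is what keeps the normal output short; both flags only ever add keys, so the compact shape is always a subset of the verbose one.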

7) Persistence policy

  • Persist the full Wikidata object once, then reuse it.
  • Refresh only on explicit command.
  • Maintenance is the only path that should write enriched payloads back.
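
The "persist once, refresh only on explicit command" rule amounts to a get-or-fetch cache with a manual override. The sketch below assumes a one-file-per-entity JSON cache on disk; `load_wikidata_entity`, `cache_dir`, and the injected `fetch` callable are hypothetical names, not the project's actual storage layer.

```python
import json
from pathlib import Path


def load_wikidata_entity(qid: str, cache_dir, fetch, refresh: bool = False) -> dict:
    """Return the stored payload for qid, downloading it at most once.

    fetch is any callable that takes a QID and returns the full entity
    object; it is invoked only on a cache miss or when refresh=True.
    """
    path = Path(cache_dir) / f"{qid}.json"
    if path.exists() and not refresh:
        return json.loads(path.read_text(encoding="utf-8"))
    payload = fetch(qid)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload), encoding="utf-8")
    return payload
```

Passing `fetch` in as a parameter keeps the network call out of the cache logic, which makes the reuse behavior easy to test with a stub.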

8) Verification

For a single entity, the test loop is:

  1. resolve
  2. maintain
  3. resolve again
  4. verify that type, identifiers, and type-specific enrichment are correct
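
The loop above can be wrapped in a small helper. This is a sketch under assumptions: `verify_entity` is a hypothetical name, `resolve` and `maintain` are passed in as callables because their real signatures are not given here, and the resolve result is assumed to be a dictionary with an `id` key.

```python
def verify_entity(name, resolve, maintain, expected: dict) -> list:
    """Run resolve -> maintain -> resolve and report mismatches.

    expected maps output keys (for example "type") to the values the
    second resolve should return. An empty list means all checks passed.
    """
    first = resolve(name)        # step 1: resolve
    maintain(first["id"])        # step 2: maintain
    second = resolve(name)       # step 3: resolve again
    problems = []
    for key, want in expected.items():  # step 4: verify
        got = second.get(key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, got {got!r}")
    return problems
```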