
Atlas Internal Ontology Draft

This file sketches the internal canonical ontology for Atlas.

Atlas is not a facts store and not a domain app. It is the semantic resolver and enricher that normalizes entities into a stable internal model, then persists those mappings and related graph data in Virtuoso via the MCP server.

The ontology below is deliberately small at first. It is meant to support:

  • repeated entity resolution without re-querying external services
  • provenance tracking
  • canonical labeling
  • external identifier mapping
  • future enrichment outputs

Core idea

A single real-world thing may have many aliases and many external identifiers. Atlas should resolve all of those into one canonical internal entity record, then attach graph evidence around it.

Example:

  • input alias: Trump
  • canonical entity: a single Atlas entity node
  • external identifiers: Wikidata QID, Google Knowledge Graph MID, provider-specific IDs
  • provenance: where the mapping came from
  • derived representation: the bundle consumed by facts-mcp / news-mcp
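
To make the core idea concrete, here is a minimal Python sketch of one entity record with several aliases and one external identifier, plus an index that resolves any of them to the same entity. The class names, the field names, the `build_alias_index` helper, and the `entity_trump` id are illustrative assumptions, not a settled Atlas schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExternalIdentifier:
    value: str    # e.g. "Q22686"
    source: str   # e.g. "wikidata"
    id_type: str  # e.g. "wikidata-qid"


@dataclass
class Entity:
    entity_id: str        # hypothetical internal id format
    canonical_label: str
    aliases: set[str] = field(default_factory=set)
    external_ids: set[ExternalIdentifier] = field(default_factory=set)


def build_alias_index(entities):
    """Map every case-folded alias and identifier value to its entity id."""
    index = {}
    for entity in entities:
        for alias in entity.aliases:
            index[alias.casefold()] = entity.entity_id
        for ext in entity.external_ids:
            index[ext.value.casefold()] = entity.entity_id
    return index


trump = Entity(
    entity_id="entity_trump",
    canonical_label="Donald Trump",
    aliases={"Trump", "Donald Trump"},
    external_ids={ExternalIdentifier("Q22686", "wikidata", "wikidata-qid")},
)
index = build_alias_index([trump])
```

Looking up `index["trump"]` or `index["q22686"]` yields the same entity id, which is the property the ontology is meant to guarantee.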

Proposed classes

atlas:Entity

The canonical internal entity node.

Purpose

  • the stable internal identity for one real-world referent

Typical fields

  • atlas:entityId
  • atlas:canonicalLabel
  • atlas:entityType
  • atlas:createdAt
  • atlas:updatedAt

atlas:Alias

A surface form, nickname, variant label, or query label that can resolve to an entity.

Typical fields

  • atlas:aliasLabel
  • atlas:aliasLanguage
  • atlas:aliasSource

atlas:ExternalIdentifier

An identifier from another system.

Examples

  • Wikidata QID
  • Google Knowledge Graph MID
  • provider-specific IDs

Typical fields

  • atlas:identifierValue
  • atlas:identifierSource
  • atlas:identifierType

atlas:Provenance

Where a mapping or claim came from.

Typical fields

  • atlas:provenanceSource
  • atlas:retrievedAt
  • atlas:retrievalMethod
  • atlas:confidence

atlas:ResolvedMapping

A record that says: this alias or external identifier points to this canonical entity.

Typical fields

  • atlas:sourceRef
  • atlas:targetEntity
  • atlas:provenance
  • atlas:status

atlas:EnrichmentDataset

A computed set of related entities, relations, and evidence produced by Atlas.

This is a function result, not the final domain-facing bundle.

Typical fields

  • atlas:seedEntity
  • atlas:relatedEntity
  • atlas:relatedRelation
  • atlas:queryContext
  • atlas:generatedAt

atlas:EntityType

Canonical type nodes owned by Atlas.

Purpose

  • represent the stable internal class (Person, Organization, Instrument, etc.)
  • map external type labels/URIs onto Atlas types (via owl:sameAs- or skos:exactMatch-style links)

atlas:ExternalType

Raw type evidence from sources such as Google Trends and Wikidata.

Purpose

  • capture the literal strings we received (e.g. "46th U.S. President")
  • keep provenance about where/when we saw them
  • allow later mapping to canonical Atlas types
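
The split between raw external types and canonical types could look roughly like the sketch below. The lookup table, both function names, and the record keys are assumptions made for illustration. In Atlas the mappings would presumably live in the graph rather than in a Python dict.

```python
# Hypothetical mapping from (source, external label) to a canonical Atlas type.
EXTERNAL_TO_CANONICAL = {
    ("wikidata", "human"): "Person",
    ("schema.org", "Person"): "Person",
    ("schema.org", "Organization"): "Organization",
}


def canonical_type(source, label):
    """Return the canonical type for an external label, or None if unmapped."""
    return EXTERNAL_TO_CANONICAL.get((source, label))


def record_external_type(source, label, retrieved_at):
    """Keep the literal label and its provenance, mapped or not."""
    return {
        "externalTypeLabel": label,
        "provenanceSource": source,
        "retrievedAt": retrieved_at,
        "canonicalType": canonical_type(source, label),  # None until mapped
    }


mapped = record_external_type("wikidata", "human", "2026-04-03")
unmapped = record_external_type("google-trends", "46th U.S. President", "2026-04-03")
```

An unmapped label is still recorded with its provenance, so it can be mapped to a canonical type later without re-querying the source.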

atlas:DomainProjection

The conceptual bundle consumed by domain apps.

This is the representation a domain-specific app uses after Atlas has resolved and enriched the entity.

Typical fields

  • atlas:projectionFor
  • atlas:sourceEntity
  • atlas:projectionPayload
  • atlas:projectionContext

Key predicates

Identity and naming

  • atlas:canonicalLabel
  • atlas:aliasLabel
  • atlas:entityType (literal fallback when canonical type is unknown)

Type system

  • atlas:hasCanonicalType (Entity → EntityType)
  • atlas:hasExternalType (Entity → ExternalType)
  • atlas:externalTypeLabel
  • atlas:equivalentType / owl:sameAs links to external ontologies

Mapping

  • atlas:hasAlias
  • atlas:hasExternalIdentifier
  • atlas:resolvedTo
  • atlas:preferredIdentifier

Provenance

  • atlas:hasProvenance
  • atlas:provenanceSource
  • atlas:retrievedAt
  • atlas:confidence

Enrichment

  • atlas:hasEnrichment
  • atlas:relatedEntity
  • atlas:relatedRelation
  • atlas:enrichmentDepth

Domain projection

  • atlas:hasDomainProjection
  • atlas:projectionFor
  • atlas:projectionPayload

Minimal resolution flow

  1. Receive alias/text.
  2. Check Virtuoso for an existing mapping.
  3. If found, return the canonical entity.
  4. If not found, query upstream resolution sources.
  5. Normalize the result into the Atlas ontology.
  6. Store the mapping and provenance in Virtuoso.
  7. Return the resolved entity.
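
The seven steps above can be sketched as a cache-first resolver. This is a minimal illustration under stated assumptions: an in-memory dict stands in for Virtuoso, the upstream source is an injected callable, and the `Resolver` class, the `fake_upstream` function, and the record keys are hypothetical names rather than the Atlas or MCP server API.

```python
from datetime import datetime, timezone


class Resolver:
    def __init__(self, upstream):
        self._store = {}          # stand-in for Virtuoso: alias key -> mapping
        self._upstream = upstream
        self.upstream_calls = 0   # counts external lookups, for illustration

    def resolve(self, alias):
        key = alias.strip().casefold()             # step 1: receive alias/text
        cached = self._store.get(key)              # step 2: check the store
        if cached is not None:
            return cached["targetEntity"]          # step 3: cache hit
        self.upstream_calls += 1
        raw = self._upstream(alias)                # step 4: query upstream
        if raw is None:
            return None                            # nothing to normalize
        mapping = {                                # step 5: normalize
            "sourceRef": alias,
            "targetEntity": raw["entity_id"],
            "provenance": {
                "provenanceSource": raw["source"],
                "retrievedAt": datetime.now(timezone.utc).isoformat(),
                "confidence": raw["confidence"],
            },
            "status": "resolved",
        }
        self._store[key] = mapping                 # step 6: persist
        return mapping["targetEntity"]             # step 7: return


def fake_upstream(alias):
    """Hypothetical upstream source with a single known alias."""
    known = {
        "trump": {
            "entity_id": "entity_trump",
            "source": "google-trends",
            "confidence": 0.93,
        }
    }
    return known.get(alias.strip().casefold())


resolver = Resolver(fake_upstream)
first = resolver.resolve("Trump")
second = resolver.resolve("trump ")
```

The second call is served from the store, so `upstream_calls` stays at 1. Unresolved aliases are not cached in this sketch. Whether to cache negative results is a design choice the draft leaves open.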

Minimal enrichment flow

  1. Receive canonical entity.
  2. Compute a related-entity dataset.
  3. Attach constraints, provenance, and depth.
  4. Return the enrichment dataset.
  5. Optionally build a domain projection from the result.
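
One possible reading of the enrichment flow is a depth-bounded breadth-first walk over known relations. In the sketch below, the toy graph, the `enrich` function, its parameters, and the output keys are assumptions for illustration. The draft does not specify how related entities are computed.

```python
from collections import deque
from datetime import datetime, timezone

# Hypothetical relation graph: entity id -> list of (relation, neighbor id).
TOY_GRAPH = {
    "entity_a": [("memberOf", "entity_b")],
    "entity_b": [("locatedIn", "entity_c"), ("hasMember", "entity_a")],
}


def enrich(seed, graph, max_depth=1, query_context="default"):
    """Collect entities reachable from seed within max_depth hops."""
    seen = {seed}
    related = []
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth constraint reached; do not expand further
        for relation, neighbor in graph.get(node, []):
            if neighbor in seen:
                continue  # avoid cycles and duplicates
            seen.add(neighbor)
            related.append({
                "relatedEntity": neighbor,
                "relatedRelation": relation,
                "enrichmentDepth": depth + 1,
            })
            queue.append((neighbor, depth + 1))
    return {
        "seedEntity": seed,
        "related": related,
        "queryContext": query_context,
        "generatedAt": datetime.now(timezone.utc).isoformat(),
    }


shallow = enrich("entity_a", TOY_GRAPH, max_depth=1)
deep = enrich("entity_a", TOY_GRAPH, max_depth=2)
```

Here `shallow` contains only `entity_b`, while `deep` also reaches `entity_c` at depth 2. The back-edge to `entity_a` is skipped because the seed is already marked as seen.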

Draft Turtle sketch

@prefix atlas: <http://world.eu.org/atlas_ontology#> .
@prefix atlas_data: <http://world.eu.org/atlas_data#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .

# Instance nodes use the atlas_data: prefix declared above. Local names use "_"
# because "/" must be escaped inside a Turtle prefixed name.
atlas_data:entity_trump a atlas:Entity ;
  atlas:canonicalLabel "Donald Trump" ;
  atlas:hasCanonicalType atlas_data:type_person ;
  atlas:entityType "person" ;  # literal fallback from resolver
  atlas:preferredIdentifier atlas_data:ext_wikidata_Q22686 ;
  atlas:hasAlias atlas_data:alias_trump ;
  atlas:hasProvenance atlas_data:prov_resolve_google-trends-2026-04-03 .

atlas_data:type_person a atlas:EntityType ;
  atlas:canonicalLabel "Person" ;
  atlas:equivalentType <http://schema.org/Person> .

atlas_data:alias_trump a atlas:Alias ;
  atlas:aliasLabel "Trump" ;
  atlas:aliasSource "query" .

atlas_data:ext_wikidata_Q22686 a atlas:ExternalIdentifier ;
  atlas:identifierValue "Q22686" ;
  atlas:identifierSource "wikidata" ;
  atlas:identifierType "wikidata-qid" .

atlas_data:prov_resolve_google-trends-2026-04-03 a atlas:Provenance ;
  atlas:provenanceSource "google-trends" ;
  atlas:retrievalMethod "entity-resolution" ;
  atlas:confidence "0.93"^^xsd:decimal .
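
Against data shaped like this sketch, the "check Virtuoso for an existing mapping" step of the resolution flow might be a SPARQL lookup by alias label. The helper below only builds the query text. The function name is hypothetical, the exact-match lookup is an assumption, and how the query is actually sent through the MCP server is not covered here.

```python
def alias_lookup_query(alias_label):
    """Build a SPARQL query that finds the entity owning an exact alias label."""
    # Escape backslashes and double quotes so the label stays a valid literal.
    escaped = alias_label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX atlas: <http://world.eu.org/atlas_ontology#>\n"
        "SELECT ?entity WHERE {\n"
        "  ?entity a atlas:Entity ;\n"
        "          atlas:hasAlias ?alias .\n"
        f'  ?alias atlas:aliasLabel "{escaped}" .\n'
        "}\n"
        "LIMIT 1"
    )


query = alias_lookup_query("Trump")
```

An empty result set corresponds to the "not found" branch, at which point the resolver would fall through to upstream sources.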

Open questions

  • Should atlas:Entity be one node per referent, with aliases and IDs attached as properties, or should aliases and identifiers be fully separate nodes?
  • Should the canonical label be unique or only preferred?
  • Which fields are required for the first cache hit path?
  • How much of the enrichment dataset should be persisted versus computed on demand?
  • What is the smallest useful domain projection for facts-mcp and news-mcp?

Working rule

If a property is unclear, keep the ontology small and make the implementation prove it later.