Lukas Goldschmidt d6b9da445a comments 1 month ago
app d6b9da445a comments 1 month ago
config 60894b8f23 refine atlas maintenance and stable claim identifiers 1 month ago
examples 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
ontology aa9c6789ad shaky enrichment steps 1 month ago
prompts aa9c6789ad shaky enrichment steps 1 month ago
scripts ba8f9b8601 feat: maintenance MID/type fixes (feature incomplete) 1 month ago
tests 36532ac5a5 release: v0.0.2 harmonize payload timestamps and entity/claim model 1 month ago
.env.example aa9c6789ad shaky enrichment steps 1 month ago
.gitignore d6b9da445a comments 1 month ago
ATLAS_ONTOLOGY.md 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
CLAIM_TRIPLE_MAPPING.md 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
MAINTENANCE_CHECKLIST.md aa9c6789ad shaky enrichment steps 1 month ago
MANIFEST.md 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
PROJECT.md 60894b8f23 refine atlas maintenance and stable claim identifiers 1 month ago
README.md aa9c6789ad shaky enrichment steps 1 month ago
RELEASE_NOTES_v0.0.1.md 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
RELEASE_NOTES_v0.0.2.md 36532ac5a5 release: v0.0.2 harmonize payload timestamps and entity/claim model 1 month ago
RESPONSE_SCHEMA.md 60894b8f23 refine atlas maintenance and stable claim identifiers 1 month ago
SPARQL_SNIPPETS.md 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
killserver.sh 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
maintain_entities.sh 60894b8f23 refine atlas maintenance and stable claim identifiers 1 month ago
requirements.txt 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
restart.sh 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
run.sh 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago
tests.sh 9c72d7b1ad Atlas v0.0.1 claim lifecycle and docs 1 month ago

README.md

Atlas MCP

Atlas-MCP implements the semantic intelligence tier for the existing MCP stack. It follows the manifest’s mandate that Atlas is the only layer that resolves and enriches entities, so for now it has exactly two public responsibilities: entity resolution and enrichment. The facts-mcp docs reinforce the same design pressure: keep the authoritative truth layer small, canonical, and explicit. Atlas should not blur into that role; it should cooperate with it through clean graph contracts.

Today’s goals

  1. Bootstrap the FastMCP + FastAPI service with the basic /health route and deployment scripts so the runtime mirrors our other MCP servers.
  2. Capture the goals for news-mcp integration: entity resolution and enrichment only for now; trend discovery, persistence, and caching remain design concerns but are not first-cut Atlas tools.
  3. Document the folder layout, plans, and dependencies so future contributors can extend it safely.

Architecture snapshot

  • FastMCP powers the service boundary and service registration.
  • FastAPI + Uvicorn provide the HTTP interface (including /health) on port 8550 by default.
  • Housekeeping scripts (run.sh, killserver.sh, restart.sh) mirror the operational pattern from other MCP projects.
  • Ontologies and enrichment: Atlas ingests external definitions, normalizes them into the canonical schema, runs expand(entity, constraints, depth) workflows, and emits resolved/enriched representations. Persistence and caching will come later.
  • Type intelligence: resolution now feeds an ordered pipeline of (1) local cache, (2) Virtuoso cache, (3) Google Trends evidence, (4) Wikidata wbsearchentities + Special:EntityData lookups (direct HTTP, with a proper user-agent/contact), and (5) optional LLM classification (Groq meta-llama/llama-4-scout-17b-16e-instruct by default, falling back to OpenAI gpt-4o-mini) with optional caller-provided context. If every stage fails, the entity is flagged for manual curation.
  • News-mcp / Virtuoso-mcp collaboration: news-mcp asks Atlas to resolve text into canonical IDs; Atlas performs enrichment. Trends may assist resolution, but the resolution logic belongs in Atlas, not news-mcp. When Atlas eventually stores or recalls triples it must do so via virtuoso-mcp only; direct connections to the underlying Virtuoso instance are off-limits.
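
The type-intelligence pipeline above can be read as an ordered fallback chain. The following is a minimal sketch of that idea only, not the code in app/: the names TypeGuess and resolve_type, the stage signature, and the choice to treat a raising stage as "no answer" are all assumptions made for illustration, and the stages are stubs rather than real cache, Trends, Wikidata, or LLM clients.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class TypeGuess:
    """Result of a type lookup (hypothetical shape, not Atlas's response schema)."""
    entity_type: Optional[str]
    source: str
    needs_curation: bool = False


# A stage takes (subject, context) and returns a type, or None when it has no answer.
Stage = Callable[[str, Optional[str]], Optional[str]]


def resolve_type(subject: str,
                 stages: List[Tuple[str, Stage]],
                 context: Optional[str] = None) -> TypeGuess:
    """Try each stage in order and return the first non-empty answer."""
    for name, stage in stages:
        try:
            entity_type = stage(subject, context)
        except Exception:
            # Assumption of this sketch: a failing source is skipped, not fatal.
            entity_type = None
        if entity_type:
            return TypeGuess(entity_type, name)
    # Nothing answered: flag the subject for manual curation.
    return TypeGuess(None, "none", needs_curation=True)


# Stub stages in the documented order; real ones would call external services.
local_cache = {"Berlin": "City"}
stages: List[Tuple[str, Stage]] = [
    ("local_cache", lambda subject, ctx: local_cache.get(subject)),
    ("virtuoso_cache", lambda subject, ctx: None),
    ("google_trends", lambda subject, ctx: None),
    ("wikidata", lambda subject, ctx: "Person" if subject == "Trump" else None),
    ("llm", lambda subject, ctx: None),
]

if __name__ == "__main__":
    print(resolve_type("Berlin", stages))
    print(resolve_type("Trump", stages, context="Short snippet for disambiguation"))
    print(resolve_type("Unknown thing", stages))
```

Keeping the stages as a plain ordered list makes the precedence explicit and lets tests swap in stubs without touching the network.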

Folder layout sketch

atlas-mcp/
├── README.md
├── PROJECT.md
├── requirements.txt
├── .gitignore
├── app/
│   ├── __init__.py
│   └── main.py
├── scripts/
│   ├── run.sh
│   ├── killserver.sh
│   ├── restart.sh
│   └── tests.sh
└── MANIFEST.md ← existing architecture manifesto (reference)

v0.0.2 status

Atlas v0.0.2 establishes the new entity-centered canonical model:

  • entity core fields are stable (atlas_id, canonical_label, entity_type)
  • claims are attached to the entity and carry per-claim provenance/timestamps
  • atlas_id is now opaque/hash-based (no semantic parsing)
  • MID/QID are represented as identifier claims (not encoded in IDs)
  • trends/wikidata payload timestamp semantics are harmonized around retrieved_at

See RELEASE_NOTES_v0.0.2.md for the full summary.
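
To make the entity-centered model concrete, here is a rough sketch of how an entity with attached claims might look in Python. Only atlas_id, canonical_label, entity_type, retrieved_at, and the status values come from this README; the Claim field names, the make_atlas_id helper, its hash input, and the "atlas_" prefix are assumptions for illustration. RESPONSE_SCHEMA.md remains the authority on the real shape.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Claim:
    """One statement about an entity, with its own provenance and timestamp."""
    predicate: str      # hypothetical predicate name, e.g. "identifier:qid"
    value: str
    source: str         # where the claim came from (provenance)
    retrieved_at: str   # ISO 8601 timestamp of retrieval
    status: str = "active"


@dataclass
class Entity:
    atlas_id: str
    canonical_label: str
    entity_type: str
    claims: List[Claim] = field(default_factory=list)


def make_atlas_id(seed: str) -> str:
    """Derive an opaque ID from a seed string (assumed scheme, for illustration).

    The result carries no MID/QID and is not meant to be parsed.
    """
    digest = hashlib.sha256(seed.encode("utf-8")).hexdigest()
    return "atlas_" + digest[:16]


now = datetime.now(timezone.utc).isoformat()
entity = Entity(
    atlas_id=make_atlas_id("person|douglas adams"),
    canonical_label="Douglas Adams",
    entity_type="Person",
    claims=[
        # External identifiers live in claims, not in the atlas_id itself.
        Claim("identifier:qid", "Q42", source="wikidata", retrieved_at=now),
        # Placeholder value, not a real MID.
        Claim("identifier:mid", "/m/EXAMPLE", source="google_trends", retrieved_at=now),
    ],
)

if __name__ == "__main__":
    print(entity.atlas_id, [c.predicate for c in entity.claims])
```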

Next steps

  • Keep the /health endpoint as a minimal service check on port 8550.
  • Wire in configuration placeholders for news-mcp and virtuoso-mcp credentials so Atlas can resolve entities and enrich them.
  • Continue tightening claim lifecycle handling and store/read roundtrips via virtuoso-mcp.

Maintenance / enrichment rules

  • Maintenance is graph-first and atlas_id-first.
  • Persisted rows are read before any refresh.
  • External payload refresh is explicit only (maintenance should not silently refetch).
  • Ontology-backed type inference uses Wikidata P31 (instance of) first; the LLM is consulted only to resolve ambiguity.
  • Superseded claims remain in the graph with status = "superseded"; normal JSON only shows active claims.
  • Normal resolve output is intentionally compact; add payloads=true when you want raw payload snapshots.
  • Debug output is where provenance and full claim metadata live.

See MAINTENANCE_CHECKLIST.md for the operational contract and test loop.
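
The output rules above (active claims only, compact by default, payloads on request) might be expressed roughly as follows. This is a sketch under assumptions: render_entity and the dictionary keys other than atlas_id, canonical_label, entity_type, and status are hypothetical, and the actual response format is defined in RESPONSE_SCHEMA.md.

```python
def render_entity(entity: dict, payloads: bool = False) -> dict:
    """Build the normal (compact) JSON view of a stored entity."""
    claims = []
    for claim in entity.get("claims", []):
        if claim.get("status", "active") != "active":
            continue  # superseded claims stay in the graph but are hidden here
        item = {"predicate": claim["predicate"], "value": claim["value"]}
        if payloads and "payload" in claim:
            item["payload"] = claim["payload"]  # raw snapshot only on request
        claims.append(item)
    return {
        "atlas_id": entity["atlas_id"],
        "canonical_label": entity["canonical_label"],
        "entity_type": entity["entity_type"],
        "claims": claims,
    }


# Illustrative stored record with one superseded and one active type claim.
stored = {
    "atlas_id": "atlas_0123456789abcdef",
    "canonical_label": "Example Entity",
    "entity_type": "Organization",
    "claims": [
        {"predicate": "type", "value": "Company", "status": "superseded"},
        {"predicate": "type", "value": "Organization", "status": "active",
         "payload": {"raw": "..."}},
    ],
}

if __name__ == "__main__":
    print(render_entity(stored))
    print(render_entity(stored, payloads=True))
```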

Ontology + graph URIs

  • Atlas’ current ontology file lives in ontology/atlas.ttl; load it into Protégé to inspect the classes and predicates.
  • The ontology uses the atlas: prefix for http://world.eu.org/atlas_ontology#, and stored data should use atlas_data: for http://world.eu.org/atlas_data#.
  • Override them via .env (ATLAS_PREFIX_IRI, ATLAS_GRAPH_IRI) if you need different URIs, but keep them consistent across Protégé, the ontology file, and the data graph.
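
As an illustration of the override mechanism, the sketch below reads the two variables with the documented defaults and expands prefixed names into full IRIs. It assumes that ATLAS_PREFIX_IRI maps to the atlas: prefix and ATLAS_GRAPH_IRI to atlas_data:, which this README does not state explicitly; load_iris, expand_curie, and the atlas:Entity class name are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

DEFAULT_PREFIX_IRI = "http://world.eu.org/atlas_ontology#"
DEFAULT_GRAPH_IRI = "http://world.eu.org/atlas_data#"


def load_iris(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the prefix table, honoring overrides from the environment."""
    env = os.environ if env is None else env
    return {
        # Assumed mapping of variables to prefixes (see the note above).
        "atlas": env.get("ATLAS_PREFIX_IRI", DEFAULT_PREFIX_IRI),
        "atlas_data": env.get("ATLAS_GRAPH_IRI", DEFAULT_GRAPH_IRI),
    }


def expand_curie(curie: str, prefixes: Mapping[str, str]) -> str:
    """Expand 'prefix:local' into a full IRI using the prefix table."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {curie!r}")
    return prefixes[prefix] + local


if __name__ == "__main__":
    prefixes = load_iris()
    print(expand_curie("atlas:Entity", prefixes))  # hypothetical class name
    print(expand_curie("atlas_data:atlas_0123456789abcdef", prefixes))
```

Resolving both prefixes in one place is one way to keep the ontology file, Protégé, and the data graph from drifting apart.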

Quick mcporter check

After restarting Atlas, you can test the server with your local config file like this:

mcporter --config "$CONFIG" call atlas.resolve_entity subject=Trump context="Short snippet for disambiguation"

If your config uses a different server name, swap atlas for that name. The optional context argument is forwarded to the LLM classifier when needed. The /health route is separate and should be checked with HTTP, not mcporter.
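
For the HTTP side, a small standard-library check along these lines could confirm that the service is up. It assumes the default port 8550 on localhost and only inspects the status code, since the body of the /health response is not described here; check_health is a hypothetical helper, not part of the repository.

```python
import urllib.error
import urllib.request


def check_health(base_url: str = "http://localhost:8550", timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as unhealthy.
        return False


if __name__ == "__main__":
    print("healthy" if check_health() else "unreachable or unhealthy")
```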