# Atlas-MCP Project Plan (Agent View)

## Context from the manifest

* Atlas is the **only** semantic intelligence layer; it resolves entities, mediates ontologies, and expands graphs strictly through `expand(entity, constraints, depth)` workflows.
* Atlas consumes input from domain-specific MCPs (news-mcp in the current ticket), resolves entities, and computes enrichment datasets. Resolution comes first; enrichment is secondary and may vary over time.
* Ontologies remain data; Atlas maps between external sources, the canonical layer, and the domain ontologies without embedding domain-specific logic.
* We should be ontology-first: model the representations before chasing enrichment details.
* The derived layer is the domain-specific representation used by facts-mcp, news-mcp, and similar applications; enrichment is the dataset that feeds it.
* Facts-mcp is a useful cautionary reference: the authoritative truth layer should stay small and explicit, while Atlas remains the semantic interpreter and never turns into a general fact store.

## Today’s mission

1. **Baseline service**
   * Stand up a FastAPI/Uvicorn app in `app/main.py` with the `/health` route that reports status, uptime, and FastMCP registration placeholders on port `8550` (a sketch follows the Immediate placeholders section).
   * Keep the web layer minimal so future MCPs call Atlas as the sole semantic brain, consistent with the manifest’s separation rule.
2. **Operational scripts**
   * `scripts/run.sh`: launch uvicorn (and FastMCP registration) with sensible logging on port `8550`.
   * `scripts/killserver.sh`: stop lingering uvicorn processes, report whether stale instances were found, and exit cleanly.
   * `scripts/restart.sh`: call `killserver.sh` then `run.sh` so restarts stay sequential.
   * `scripts/tests.sh`: probe `/health` and verify the expected contract before moving on to richer tests (a Python probe it could call is sketched below).
3. **Documentation lineage**
   * `README.md` (human-facing) summarizes the architecture, today’s goals, folder layout, and the news/virtuoso collaboration strategy.
   * `PROJECT.md` (this file) tracks agent priorities and reminders about the manifest’s hard rules.
4. **Dependencies & housekeeping**
   * `requirements.txt` lists FastAPI, uvicorn, fastmcp, rdflib, httpx, and any enrichment helpers we’ll need in the canonical layer.
   * `.gitignore` covers Python artifacts, FastAPI logs, and typical OS noise.

## Immediate placeholders

* The `/health` route should respond with `{"status": "ok"}`, `uptime_seconds`, and `fastmcp_registered` fields, with TODOs for wiring real service discovery (sketched below).
* Keep TODO comments in the code pointing to entity resolution, ontology mapping, and enrichment so the manifest’s strict responsibilities stay visible.
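A minimal sketch of `app/main.py` under today’s assumptions: the uptime clock starts at import time, and `fastmcp_registered` stays a hard-coded placeholder until real service discovery is wired.

```python
# app/main.py -- minimal baseline; FastMCP wiring is still a placeholder.
import time

from fastapi import FastAPI

app = FastAPI(title="atlas-mcp")
_started_at = time.monotonic()

# TODO: register with FastMCP and flip this via real service discovery.
FASTMCP_REGISTERED = False


@app.get("/health")
def health() -> dict:
    """Report liveness plus the placeholder FastMCP registration state."""
    # TODO: entity resolution, ontology mapping, and enrichment live in
    # dedicated modules; this route stays a thin status report.
    return {
        "status": "ok",
        "uptime_seconds": round(time.monotonic() - _started_at, 3),
        "fastmcp_registered": FASTMCP_REGISTERED,
    }


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8550)
```

`scripts/run.sh` would normally launch the same app with `uvicorn app.main:app --port 8550` plus logging flags rather than relying on the `__main__` block.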
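New code examples in this plan stay in Python, so the `/health` check behind `scripts/tests.sh` is sketched as a small httpx probe the script could invoke; the filename `scripts/check_health.py` is an assumption.

```python
# scripts/check_health.py -- a probe scripts/tests.sh could invoke (hypothetical name).
import sys

import httpx

EXPECTED_FIELDS = {"status", "uptime_seconds", "fastmcp_registered"}


def main() -> int:
    try:
        resp = httpx.get("http://127.0.0.1:8550/health", timeout=5.0)
    except httpx.HTTPError as exc:
        print(f"FAIL: /health unreachable: {exc}")
        return 1
    body = resp.json()
    missing = EXPECTED_FIELDS - body.keys()
    if resp.status_code != 200 or body.get("status") != "ok" or missing:
        print(f"FAIL: unexpected /health contract: {resp.status_code} {body}")
        return 1
    print("OK: /health matches the expected contract")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```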
## Follow-on goals

* Build modules for news-mcp bindings that leave Atlas as the interpreter, with news-mcp defining relevance/constraints while Atlas owns canonicalization and enrichment.
* Consider a Virtuoso-backed cache/knowledge-graph layer for resolved entities that carry MID, Wikidata ID, and source provenance.
* Atlas should not need a second database yet; **only** virtuoso-mcp talks to the underlying Virtuoso instance. Atlas must read/write triples exclusively through virtuoso-mcp, never directly.
* Add tests simulating news-mcp entity requests to assert Atlas returns canonical IDs plus a derived subgraph that flows into Virtuoso (a contract-shape test is sketched after this list).
* Even stubs should be tested; the shape of the contract matters before the implementation details are complete.
* If a known alias is resolved a hundred times, Atlas should reuse the stored mapping instead of asking upstream services again (see the cache sketch after this list).
* Add a dedicated integration-test layer for resolution and enrichment once the graph clients exist.
* If we port resolution in-house later, keep the external resolver behind a thin adapter so the internal contract does not change (adapter sketched after this list).
* Document `expand(entity, constraints, depth)` expectations, starting with rdflib-based stubs and SPARQL placeholders for future enrichment work (stub sketched after this list).
* Keep the implementation precise: no enrichment in news-mcp, no graph execution in Atlas, and no semantic interpretation in Virtuoso.
* Add maintenance routines (script or cron) to re-check entities with missing source data (especially missing Wikidata IDs), and to supersede stale claims without bloating the schema (sketched after this list).
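A contract-shape test in the spirit of the bullets above. The `/resolve` route, its payload, and the response fields are assumptions standing in for whatever resolution API lands; the point is pinning the canonical-ID-plus-subgraph shape before the implementation exists.

```python
# Hypothetical contract test; the /resolve route and payload are assumptions.
from fastapi.testclient import TestClient

from app.main import app

client = TestClient(app)


def test_resolve_returns_canonical_ids_and_subgraph():
    # Simulates a news-mcp entity request against a future resolution route.
    resp = client.post("/resolve", json={"surface_form": "ACME Corp", "depth": 1})
    assert resp.status_code == 200
    body = resp.json()
    assert "canonical_id" in body        # e.g. a MID or Wikidata QID
    assert "derived_subgraph" in body    # triples destined for Virtuoso
```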
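A sketch of the alias-reuse rule, assuming an in-memory dict stands in for the eventual Virtuoso-backed mapping store and `resolve_upstream` stands in for the external resolver.

```python
# Sketch of alias-resolution reuse; names and the upstream call are assumptions.
from typing import Callable, Optional


class AliasCache:
    """Reuse stored alias -> canonical-ID mappings before asking upstream."""

    def __init__(self, resolve_upstream: Callable[[str], Optional[str]]):
        self._resolve_upstream = resolve_upstream
        self._mappings: dict[str, str] = {}  # later: backed by the Virtuoso layer

    def resolve(self, alias: str) -> Optional[str]:
        if alias in self._mappings:
            return self._mappings[alias]  # the hundredth lookup never goes upstream
        canonical = self._resolve_upstream(alias)
        if canonical is not None:
            self._mappings[alias] = canonical
        return canonical
```

The dict is enough to prove the contract; swapping it for the Virtuoso-backed layer later should not change anything for `resolve()` callers.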
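The thin-adapter idea sketched as a `Protocol`, so a later in-house port swaps the implementation without touching the internal contract; the `EntityResolver` name is an assumption.

```python
# Thin resolver adapter; interface names are assumptions, not a settled API.
from typing import Optional, Protocol


class EntityResolver(Protocol):
    """Internal contract: callers never know which resolver backs it."""

    def resolve(self, surface_form: str) -> Optional[str]:
        """Return a canonical ID (e.g. MID or Wikidata QID), or None."""
        ...


class ExternalResolverAdapter:
    """Wraps today's external resolver behind the internal contract."""

    def resolve(self, surface_form: str) -> Optional[str]:
        # TODO: call the external service; an in-house implementation can
        # replace this class later without changing EntityResolver.
        raise NotImplementedError
```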
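An rdflib-based stub for the `expand(entity, constraints, depth)` contract; the request dataclass and the placeholder CONSTRUCT query are assumptions, and the real query would go out through virtuoso-mcp, never to Virtuoso directly.

```python
# Hypothetical stub for the expand(entity, constraints, depth) contract.
from dataclasses import dataclass, field

from rdflib import Graph, URIRef

# TODO: replace with a real query issued through virtuoso-mcp.
SPARQL_PLACEHOLDER = """
CONSTRUCT { ?s ?p ?o }
WHERE { ?s ?p ?o }
LIMIT 0
"""


@dataclass
class ExpandRequest:
    entity: URIRef
    constraints: dict = field(default_factory=dict)  # e.g. {"predicate_whitelist": [...]}
    depth: int = 1


def expand(request: ExpandRequest) -> Graph:
    """Return the derived subgraph around an already-resolved entity.

    Stub behaviour: returns an empty graph so callers can exercise the
    contract shape before enrichment exists.
    """
    graph = Graph()
    # TODO: entity resolution happens upstream; only canonical IDs arrive here.
    # TODO: walk `depth` hops, applying `constraints`, via SPARQL on virtuoso-mcp.
    _ = graph.query(SPARQL_PLACEHOLDER)  # placeholder query, yields nothing
    return graph
```

Even this stub is testable: asserting that `expand()` returns an empty `Graph` pins the return type before enrichment lands.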
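A sketch of the maintenance pass; `entities_missing`, `supersede`, and the resolver argument are all hypothetical helper names, and every storage access would route through virtuoso-mcp.

```python
# Hypothetical maintenance pass; every store/resolver helper named here is assumed.
def recheck_missing_wikidata(store, resolver) -> int:
    """Re-resolve entities whose stored mapping lacks a Wikidata ID.

    Meant to run from a maintenance script or cron job; all reads and
    writes would go through virtuoso-mcp, never against Virtuoso directly.
    """
    fixed = 0
    for entity in store.entities_missing("wikidata_id"):  # assumed query helper
        qid = resolver.resolve(entity.preferred_label)     # reuse the resolver adapter
        if qid:
            # Supersede the stale claim instead of piling up new schema fields.
            store.supersede(entity, wikidata_id=qid)
            fixed += 1
    return fixed
```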