News MCP Server — Project Vision & Status

Current version: v0.4.0 — see PROJECT.md for architecture details.

Core Design Principle

Raw news is useless to agents. Processed news is powerful.

  • ✅ Clusters are the unit of truth, not raw articles
  • ✅ 100 articles → 5–10 clusters, each with entities, sentiment, and importance
  • ✅ SQL-level filtering by time, entity, keyword — no full-table JSON parsing

Architecture (v0.4.0)

See PROJECT.md for full schema and architecture. Key points:

  • payload_ts generated column for indexed time-range queries
  • cluster_entities and cluster_keywords junction tables for O(log n) entity/keyword search
  • MCP tools and Dashboard REST API both query the same SQLite DB
  • Docker deployment on thinkcenter-2 (192.168.0.200:8506)
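
The payload_ts generated column can be illustrated with a minimal sketch. It assumes each cluster is stored as a JSON payload whose timestamp sits under a key named "ts", and that timestamps are ISO-8601 strings in one consistent format so they sort lexicographically. The table name, index name, JSON key, and helper functions are illustrative assumptions; the real schema is in PROJECT.md. Generated columns need SQLite 3.31 or newer.

```python
import json
import sqlite3

# Hypothetical minimal schema; the real one is documented in PROJECT.md.
SCHEMA = """
CREATE TABLE clusters (
    cluster_id TEXT PRIMARY KEY,
    payload    TEXT NOT NULL,  -- JSON document for the cluster
    -- Assumption: the cluster timestamp lives under the JSON key "ts".
    payload_ts TEXT GENERATED ALWAYS AS (json_extract(payload, '$.ts')) VIRTUAL
);
CREATE INDEX idx_clusters_payload_ts ON clusters (payload_ts);
"""


def open_store(path=":memory:"):
    """Open a database and create the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_cluster(conn, cluster_id, payload):
    """Insert a cluster; payload_ts is derived by SQLite, never written directly."""
    conn.execute(
        "INSERT INTO clusters (cluster_id, payload) VALUES (?, ?)",
        (cluster_id, json.dumps(payload)),
    )


def clusters_between(conn, start_ts, end_ts):
    """Return cluster ids with start_ts <= payload_ts < end_ts, newest first."""
    rows = conn.execute(
        "SELECT cluster_id FROM clusters "
        "WHERE payload_ts >= ? AND payload_ts < ? "
        "ORDER BY payload_ts DESC",
        (start_ts, end_ts),
    )
    return [row[0] for row in rows]
```

Because the filter is a plain range comparison on an indexed column, SQLite can answer it without parsing the JSON payload of every row.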

Tool Surface

| Tool | Status | Notes |
|---|---|---|
| get_latest_events | | Time-filtered via the payload_ts SQL index |
| get_events_for_entity | ✅ Fixed (May 2026) | SQL junction-table entity/keyword search via get_clusters_by_entity_or_keyword(); previously Python-side entity matching with a top-N limit |
| get_event_summary | | LLM-written narrative |
| detect_emerging_topics | | Entity/keyword/phrase signal types, velocity scoring |
| get_news_sentiment | ✅ Fixed (May 2026) | Same SQL junction-table search as get_events_for_entity; previously had the same Python-side matching limitation |
| get_related_recent_entities | | Co-occurrence + Google Trends blend |
| get_feeds / toggle_feed | | Feed management |
| detect_emerging_topics(around=...) | | Scopes results to an entity's neighborhood |
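
This document does not define the velocity scoring used by detect_emerging_topics. One plausible reading, shown here purely as a hedged sketch, compares how often a term appears in a recent window against an equally long baseline window. The function names, the smoothing constant, and the thresholds are all assumptions, not the project's actual formula.

```python
from collections import Counter


def velocity_scores(recent_mentions, baseline_mentions, smoothing=1.0):
    """Ratio of recent to baseline mention counts, smoothed to avoid division by zero.

    Assumption: both windows cover the same length of time.
    """
    recent = Counter(recent_mentions)
    baseline = Counter(baseline_mentions)
    return {
        term: (count + smoothing) / (baseline.get(term, 0) + smoothing)
        for term, count in recent.items()
    }


def emerging_terms(recent_mentions, baseline_mentions, min_score=2.0, limit=5):
    """Return up to `limit` (term, score) pairs scoring at least min_score, best first."""
    scores = velocity_scores(recent_mentions, baseline_mentions)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(term, score) for term, score in ranked if score >= min_score][:limit]
```

With five recent mentions of a term against one baseline mention, the smoothed score is (5 + 1) / (1 + 1) = 3.0, which clears the default threshold of 2.0.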

Known Design Issues

Two Stores (FIXED, May 2026)

DashboardStore was eliminated, and all of its methods moved to SQLiteClusterStore. MCP tools now use SQL-level junction-table entity/keyword search via get_clusters_by_entity_or_keyword(), which removes the row-limit blind spot.

MCP Tool Entity Search (FIXED, May 2026)

get_events_for_entity and get_news_sentiment now use SQLiteClusterStore.get_clusters_by_entity_or_keyword() with proper SQL-level filtering across the full time window via the cluster_entities and cluster_keywords junction tables.
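
The difference between the two approaches can be sketched as follows. This is not the SQLiteClusterStore implementation: the schema is reduced to the columns needed here, and find_clusters_by_entity_or_keyword and python_side_top_n_match are hypothetical stand-ins with assumed signatures.

```python
import sqlite3

# Reduced, assumed schema: just enough to show the two lookup strategies.
SCHEMA = """
CREATE TABLE clusters (cluster_id TEXT PRIMARY KEY, ts TEXT NOT NULL);
CREATE TABLE cluster_entities (
    entity TEXT NOT NULL, cluster_id TEXT NOT NULL,
    PRIMARY KEY (entity, cluster_id)
) WITHOUT ROWID;
CREATE TABLE cluster_keywords (
    keyword TEXT NOT NULL, cluster_id TEXT NOT NULL,
    PRIMARY KEY (keyword, cluster_id)
) WITHOUT ROWID;
"""


def find_clusters_by_entity_or_keyword(conn, term, since_ts):
    """SQL-level lookup: every matching cluster in the window, newest first."""
    rows = conn.execute(
        """
        SELECT c.cluster_id
        FROM clusters AS c
        WHERE c.ts >= ?
          AND c.cluster_id IN (
              SELECT cluster_id FROM cluster_entities WHERE entity = ?
              UNION
              SELECT cluster_id FROM cluster_keywords WHERE keyword = ?
          )
        ORDER BY c.ts DESC
        """,
        (since_ts, term, term),
    )
    return [row[0] for row in rows]


def python_side_top_n_match(conn, term, since_ts, top_n):
    """Older pattern: load only the newest top_n clusters, then filter in Python."""
    recent = [
        row[0]
        for row in conn.execute(
            "SELECT cluster_id FROM clusters WHERE ts >= ? ORDER BY ts DESC LIMIT ?",
            (since_ts, top_n),
        )
    ]
    matches = []
    for cluster_id in recent:
        entities = {
            row[0]
            for row in conn.execute(
                "SELECT entity FROM cluster_entities WHERE cluster_id = ?",
                (cluster_id,),
            )
        }
        if term in entities:
            matches.append(cluster_id)
    return matches
```

Any cluster older than the newest top_n rows is invisible to the second function, even when it falls inside the requested time window. The junction-table query has no such cutoff.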

Backfill Scripts

After deploying junction table schema changes:

docker exec -it news-mcp python3 scripts/backfill_junction_tables.py

For timestamp normalization (already run on the live server):

docker exec -it news-mcp python3 scripts/normalize_cluster_timestamps.py
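
The contents of scripts/backfill_junction_tables.py are not shown in this document. As a rough sketch of what such a backfill might do, the function below rebuilds junction rows from each cluster's JSON payload. It assumes the payload carries "entities" and "keywords" lists and that the tables have the columns shown; both are illustrative guesses.

```python
import json
import sqlite3

# Assumed minimal schema for the sketch; see PROJECT.md for the real one.
SCHEMA = """
CREATE TABLE clusters (cluster_id TEXT PRIMARY KEY, payload TEXT NOT NULL);
CREATE TABLE cluster_entities (
    entity TEXT NOT NULL, cluster_id TEXT NOT NULL,
    PRIMARY KEY (entity, cluster_id)
);
CREATE TABLE cluster_keywords (
    keyword TEXT NOT NULL, cluster_id TEXT NOT NULL,
    PRIMARY KEY (keyword, cluster_id)
);
"""

# (junction table, value column, assumed JSON key in the payload)
TARGETS = (
    ("cluster_entities", "entity", "entities"),
    ("cluster_keywords", "keyword", "keywords"),
)


def backfill_junction_tables(conn):
    """Populate junction tables from cluster payloads; returns rows added.

    INSERT OR IGNORE makes the backfill idempotent, so re-running it is safe.
    """
    rows = conn.execute("SELECT cluster_id, payload FROM clusters").fetchall()
    before = conn.total_changes
    with conn:  # commit on success, roll back on error
        for cluster_id, payload in rows:
            data = json.loads(payload)
            for table, column, key in TARGETS:
                for value in data.get(key, []):
                    conn.execute(
                        f"INSERT OR IGNORE INTO {table} ({column}, cluster_id) "
                        "VALUES (?, ?)",
                        (value, cluster_id),
                    )
    return conn.total_changes - before
```

A second run over unchanged data adds nothing, which is the property that makes a backfill safe to repeat after a partial or interrupted deployment.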

Future Directions (v0.5.0+)

"Emerging entity graph over time"

  • Collapse detect_emerging_topics() results into canonical entity nodes
  • Build weighted edges from co-occurrence in recent clusters
  • Infer communities (story neighborhoods)
  • Track graph evolution across refresh windows (node momentum, edge strength changes)
  • Agent tool: get_emerging_entity_graph(timeframe, limit)
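
The graph steps listed above can be sketched with the standard library alone. This is only an illustration of the idea: the function names are hypothetical, the input is assumed to be one list of canonical entity names per recent cluster, and connected components stand in for whatever community detection is eventually chosen.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence_edges(clusters, min_weight=1):
    """Weight each entity pair by the number of clusters in which both appear."""
    weights = Counter()
    for entities in clusters:
        # Sorting gives every pair one canonical (a, b) orientation.
        for a, b in combinations(sorted(set(entities)), 2):
            weights[(a, b)] += 1
    return {edge: w for edge, w in weights.items() if w >= min_weight}


def communities(edges):
    """Connected components, a simple stand-in for community detection."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for node in sorted(adjacency):
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            group.add(current)
            stack.extend(adjacency[current] - seen)
        groups.append(group)
    return groups


def edge_momentum(previous, current):
    """Change in edge weight between two refresh windows."""
    return {
        edge: current.get(edge, 0) - previous.get(edge, 0)
        for edge in set(previous) | set(current)
    }
```

For example, clusters mentioning ["Apple", "Nvidia"], ["Nvidia", "Apple", "TSMC"], and ["ECB", "Lagarde"] give the Apple/Nvidia edge a weight of 2 and split into two story neighborhoods.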