Current version: v0.4.0 — see PROJECT.md for architecture details.
Raw news is useless to agents. Processed news is powerful.
See PROJECT.md for full schema and architecture. Key points:
- payload_ts generated column for indexed time-range queries
- cluster_entities and cluster_keywords junction tables for O(log n) entity/keyword search

| Tool | Status | Notes |
|---|---|---|
| get_latest_events | ✅ | Time-filtered via payload_ts SQL index |
| get_events_for_entity | ⚠️ | MCP tool still uses Python-side entity matching (top-N limit). Dashboard uses SQL junction table. Known design flaw. |
| get_event_summary | ✅ | LLM-written narrative |
| detect_emerging_topics | ✅ | entity/keyword/phrase signal types, velocity scoring |
| get_news_sentiment | ⚠️ | Same Python-side entity matching limitation as get_events_for_entity |
| get_related_recent_entities | ✅ | Co-occurrence + Google Trends blend |
| get_feeds / toggle_feed | ✅ | Feed management |
| detect_emerging_topics(around=...) | ✅ | Scope to entity neighborhood |
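
As a rough illustration of the payload_ts key point above, here is a minimal sketch of a generated column backed by an index, using Python's built-in sqlite3 module. The table name `clusters`, the JSON `payload` column, the `$.ts` path, and the helper `events_between` are assumptions made for this example; the real schema is described in PROJECT.md. It needs SQLite 3.31 or newer with the JSON functions available.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical table: the payload is stored as JSON text.
    CREATE TABLE clusters (
        id INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,
        -- Generated column: derived from the payload, never written directly.
        payload_ts TEXT GENERATED ALWAYS AS (json_extract(payload, '$.ts')) VIRTUAL
    );
    -- The index lets range filters on payload_ts avoid a full table scan.
    CREATE INDEX idx_clusters_payload_ts ON clusters(payload_ts);
    """
)

sample = [
    (1, {"ts": "2024-01-01T00:00:00Z", "title": "a"}),
    (2, {"ts": "2024-01-02T12:00:00Z", "title": "b"}),
    (3, {"ts": "2024-01-03T00:00:00Z", "title": "c"}),
]
conn.executemany(
    "INSERT INTO clusters (id, payload) VALUES (?, ?)",
    [(cid, json.dumps(body)) for cid, body in sample],
)


def events_between(conn, start, end):
    """Return cluster ids with start <= payload_ts < end, newest first."""
    rows = conn.execute(
        "SELECT id FROM clusters "
        "WHERE payload_ts >= ? AND payload_ts < ? "
        "ORDER BY payload_ts DESC",
        (start, end),
    ).fetchall()
    return [row[0] for row in rows]


recent = events_between(conn, "2024-01-02T00:00:00Z", "2024-01-04T00:00:00Z")
print(recent)  # [3, 2]
```

String comparison only orders timestamps correctly when they share one normalized format (here ISO 8601 in UTC), which is presumably why a timestamp normalization script exists.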
DashboardStore was eliminated. All methods moved to SQLiteClusterStore. MCP tools now use SQL-level junction-table entity/keyword search via get_clusters_by_entity_or_keyword() — no row-limit blind spot.
get_events_for_entity and get_news_sentiment now use SQLiteClusterStore.get_clusters_by_entity_or_keyword() with proper SQL-level filtering across the full time window via the cluster_entities and cluster_keywords junction tables.
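
The idea behind SQL-level junction-table filtering can be sketched as follows. This is not the actual body of get_clusters_by_entity_or_keyword(), whose signature is not shown here; the function `find_clusters`, the column names, and the simplified `clusters` table are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-in schema; payload_ts is a plain column here.
    CREATE TABLE clusters (id INTEGER PRIMARY KEY, payload_ts TEXT NOT NULL);
    -- Junction tables keyed by (term, cluster_id) so a term lookup is an
    -- index search rather than a scan over every cluster.
    CREATE TABLE cluster_entities (
        cluster_id INTEGER NOT NULL REFERENCES clusters(id),
        entity TEXT NOT NULL,
        PRIMARY KEY (entity, cluster_id)
    ) WITHOUT ROWID;
    CREATE TABLE cluster_keywords (
        cluster_id INTEGER NOT NULL REFERENCES clusters(id),
        keyword TEXT NOT NULL,
        PRIMARY KEY (keyword, cluster_id)
    ) WITHOUT ROWID;
    """
)
conn.executemany(
    "INSERT INTO clusters VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-05"), (3, "2024-01-06")],
)
conn.executemany(
    "INSERT INTO cluster_entities VALUES (?, ?)",
    [(1, "openai"), (2, "openai")],
)
conn.executemany(
    "INSERT INTO cluster_keywords VALUES (?, ?)",
    [(2, "openai"), (3, "openai"), (3, "chips")],
)


def find_clusters(conn, term, since):
    """Ids of clusters tagged with `term` as entity or keyword, newest first."""
    rows = conn.execute(
        """
        SELECT c.id
        FROM clusters AS c
        JOIN (
            SELECT cluster_id FROM cluster_entities WHERE entity = :term
            UNION  -- UNION removes clusters matched by both tables
            SELECT cluster_id FROM cluster_keywords WHERE keyword = :term
        ) AS m ON m.cluster_id = c.id
        WHERE c.payload_ts >= :since
        ORDER BY c.payload_ts DESC
        """,
        {"term": term, "since": since},
    ).fetchall()
    return [row[0] for row in rows]


matches = find_clusters(conn, "openai", "2024-01-02")
print(matches)  # [3, 2]
```

Because the match happens inside the query, every cluster in the time window is considered, instead of only the top-N rows that were loaded into Python first.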
After deploying junction table schema changes:
docker exec -it news-mcp python3 scripts/backfill_junction_tables.py
For timestamp normalization (already run on live server):
docker exec -it news-mcp python3 scripts/normalize_cluster_timestamps.py
- detect_emerging_topics() results into canonical entity nodes
- get_emerging_entity_graph(timeframe, limit)
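
One way to picture folding topic signals into canonical entity nodes is sketched below. The signal dictionaries, the `aliases` map, and the function `collapse_to_entity_nodes` are all hypothetical. They do not reflect the real output format of detect_emerging_topics() or the implementation of get_emerging_entity_graph().

```python
from collections import defaultdict


def collapse_to_entity_nodes(signals, aliases):
    """Group signals under a canonical entity name and sum their velocity.

    `signals` is a list of dicts with hypothetical keys: type, label, velocity.
    `aliases` maps lowercase variant labels to a canonical name.
    """
    nodes = defaultdict(lambda: {"velocity": 0.0, "signals": []})
    for signal in signals:
        label = signal["label"].strip().lower()
        canonical = aliases.get(label, label)  # fall back to the label itself
        nodes[canonical]["velocity"] += signal["velocity"]
        nodes[canonical]["signals"].append((signal["type"], signal["label"]))
    # Highest combined velocity first.
    return sorted(
        ({"entity": name, **data} for name, data in nodes.items()),
        key=lambda node: node["velocity"],
        reverse=True,
    )


signals = [
    {"type": "entity", "label": "NVIDIA", "velocity": 2.0},
    {"type": "phrase", "label": "nvidia corp", "velocity": 1.5},
    {"type": "keyword", "label": "tariffs", "velocity": 1.0},
]
graph_nodes = collapse_to_entity_nodes(signals, {"nvidia corp": "nvidia"})
print([(node["entity"], node["velocity"]) for node in graph_nodes])
# [('nvidia', 3.5), ('tariffs', 1.0)]
```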