Project: news-mcp
Goal
Provide a signal-extraction MCP server that converts RSS into deduplicated, enriched news clusters that are easy for agents to use.
Current architecture (v1)
- FastMCP SSE server mounted at /mcp
- SQLite cache for clusters + Groq summary caches
- RSS fetch (breakingthenews.net)
- v1 dedup via fuzzy title similarity
- optional Ollama embeddings path for clustering (when NEWS_EMBEDDINGS_ENABLED=true)
- configurable embedding similarity threshold (NEWS_EMBEDDING_SIMILARITY_THRESHOLD)
- optional embeddings backfill script for precomputing cluster vectors in SQLite
- optional merge-analysis script for threshold experiments before any DB rewrite
- optional merge pass for destructive consolidation after threshold review
- optional article-dedup cleanup for repeated article variants inside a cluster
- Groq enrichment (topic/entities/sentiment/keywords)
- Tools expose semantic queries over cached clusters
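The v1 fuzzy-title dedup path can be sketched roughly as below. This is a minimal illustration using stdlib `difflib`; the actual similarity function, threshold value, and cluster representation in the repo may differ, and `assign_to_cluster` is a hypothetical helper name.

```python
from difflib import SequenceMatcher

# Assumed threshold for illustration; the real server reads its own config.
TITLE_SIMILARITY_THRESHOLD = 0.85

def titles_match(a: str, b: str, threshold: float = TITLE_SIMILARITY_THRESHOLD) -> bool:
    """v1 dedup check: whitespace-normalized, lowercased titles compared by ratio."""
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    return SequenceMatcher(None, a_norm, b_norm).ratio() >= threshold

def assign_to_cluster(title: str, clusters: list[list[str]]) -> None:
    """Append the title to the first cluster whose representative matches,
    otherwise start a new cluster with this title as representative."""
    for cluster in clusters:
        if titles_match(title, cluster[0]):
            cluster.append(title)
            return
    clusters.append([title])
```

When embeddings are enabled, the Ollama cosine-similarity path is tried first and this heuristic remains the fallback.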
MCP tools (current)
get_latest_events(topic, limit)
get_events_for_entity(entity, limit, timeframe)
get_event_summary(event_id)
detect_emerging_topics(limit)
get_related_entities(subject, timeframe, limit)
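The entity lookup's timeframe/limit semantics (timeframe bounds the scan window, limit caps the results) could look roughly like this over the SQLite cache. The `clusters` table layout and column names here are assumptions for illustration, not the project's actual schema.

```python
import sqlite3
import time

def get_events_for_entity(conn: sqlite3.Connection, entity: str,
                          limit: int = 10, timeframe_hours: int = 24) -> list[tuple]:
    """Scan only clusters updated inside the timeframe window; limit caps results."""
    cutoff = time.time() - timeframe_hours * 3600
    cur = conn.execute(
        # Illustrative schema: id, title, entities (comma-joined), updated_at (epoch).
        "SELECT id, title FROM clusters "
        "WHERE updated_at >= ? AND entities LIKE ? "
        "ORDER BY updated_at DESC LIMIT ?",
        (cutoff, f"%{entity}%", limit),
    )
    return cur.fetchall()
```

The key point is that timeframe filters before limit applies, so a small limit never hides recent matches behind stale ones.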
Future work (planned): entity graph over time
Instead of treating detect_emerging_topics() as a flat list, we want a higher-level representation:
- Convert emerging topic/entity co-occurrence signals into a weighted entity graph
- Group the graph into communities (story neighborhoods)
- Track time evolution across refresh windows:
  - node “momentum” (trend_score/count changes)
  - edge strength changes (relation tightening/weakening)
  - community emergence/disappearance
Eventual agent tool shape (later): get_emerging_entity_graph(timeframe, limit).
Refresh & caching
- Background refresh every NEWS_REFRESH_INTERVAL_SECONDS (default 900s)
- Feed-hash skipping to avoid redundant RSS+Groq work
- Cluster TTL (NEWS_CLUSTERS_TTL_HOURS via CLUSTERS_TTL_HOURS)
- Summary caching for get_event_summary
Definition of “committable”
- Tests pass offline (dedup/storage unit tests)
- Server exposes tool surface with valid schemas
- Caching prevents repeated Groq calls for unchanged clusters
- Embeddings remain optional: Ollama is tried first when enabled, otherwise the heuristic path stays active
- Embeddings backfill script exists to precompute vectors for older cluster rows before a server restart
- Merge-analysis script exists to inspect candidate cluster pairs at multiple thresholds
- Merge pass exists for destructive consolidation once thresholds look sane
- Article-dedup cleanup exists for fixing duplicated article records already in SQLite
- Entity lookup now respects timeframe as the scan window, with limit acting as a cap
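The planned entity-graph direction (see Future work above) might be prototyped like this: weighted edges from per-cluster entity co-occurrence, with naive connected components standing in for a real community-detection algorithm (e.g. Louvain). All names here are hypothetical, and momentum/edge-evolution tracking is not shown.

```python
from collections import defaultdict
from itertools import combinations

def build_entity_graph(clusters: list[list[str]]) -> dict[tuple[str, str], int]:
    """Weighted edges: co-occurrence count of each entity pair within a cluster."""
    edges: dict[tuple[str, str], int] = defaultdict(int)
    for entities in clusters:
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return dict(edges)

def communities(edges: dict[tuple[str, str], int], min_weight: int = 1) -> list[set[str]]:
    """Naive 'story neighborhoods': connected components over edges at/above min_weight."""
    adj: dict[str, set[str]] = defaultdict(set)
    for (a, b), w in edges.items():
        if w >= min_weight:
            adj[a].add(b)
            adj[b].add(a)
    seen: set[str] = set()
    comps: list[set[str]] = []
    for node in adj:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps
```

Raising min_weight across refresh windows is one cheap way to watch edges tighten or weaken before any real momentum scoring exists.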