1 week ago · f8677e48b5
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -53,9 +53,9 @@ This project spans two machines. **Always check which machine you're operating o
 
				 docker exec -it news-mcp python3 scripts/backfill_junction_tables.py
			
 
				 ```
			
 
				 
			
 
				-## Design Flaw: Two Stores
			
 
				+## Design Flaw: Two Stores (FIXED May 2026)
			
 
				 
			
 
				-`SQLiteClusterStore` and `DashboardStore` are parallel copies. Only `DashboardStore` was updated with junction-table entity search. MCP tools (`get_events_for_entity`, `get_news_sentiment`) still use `SQLiteClusterStore` Python-side entity matching with a row limit (top 200), missing entities in older clusters. See PROJECT.md for full analysis and proposed fix.
			
 
				+`DashboardStore` was eliminated. `SQLiteClusterStore` is the single data access layer with junction-table entity/keyword search. All MCP tools use the proper SQL methods.
			
 
				 
			
 
				 ## Docker / Live Server Details
			
 
				 - `docker-compose.yml` mounts `./:/app` with `working_dir: /app`
			
--- a/OUTLOOK.md
+++ b/OUTLOOK.md
@@ -33,11 +33,11 @@ See PROJECT.md for full schema and architecture. Key points:
 
				 
			
 
				 ## Known Design Issues
			
 
				 
			
 
				-### Two Stores (see PROJECT.md § "Design Flaw")
			
 
				-`SQLiteClusterStore` and `DashboardStore` are parallel copies. Only `DashboardStore` was updated with junction-table entity search. MCP tools still use Python-side entity matching with a row limit. Proposed fix: collapse into single data access layer.
			
 
				+### Two Stores (FIXED, May 2026)
			
 
				+`DashboardStore` was eliminated. All methods moved to `SQLiteClusterStore`. MCP tools now use SQL-level junction-table entity/keyword search via `get_clusters_by_entity_or_keyword()` — no row-limit blind spot.
			
 
				 
			
 
				-### MCP Tool Entity Search
			
 
				-`get_events_for_entity` and `get_news_sentiment` fetch top-N clusters by time then filter entities in Python. Entities in clusters beyond the limit are missed. Fix: use junction table `get_clusters_by_entity()`.
			
 
				+### MCP Tool Entity Search (FIXED, May 2026)
			
 
				+`get_events_for_entity` and `get_news_sentiment` now use `SQLiteClusterStore.get_clusters_by_entity_or_keyword()` with proper SQL-level filtering across the full time window via the `cluster_entities` and `cluster_keywords` junction tables.
			
 
				 
			
 
				 ## Backfill Scripts
			
 
				 
			
--- a/PROJECT.md
+++ b/PROJECT.md
@@ -82,20 +82,24 @@ Keywords extracted by the LLM are now first-class search signals:
 
				 - Dashboard Keywords panel with SQL frequency counts via junction table
			
 
				 - Topic labels (crypto/macro/regulation/ai/other) filtered from keywords at extraction time
			
 
				 
			
 
				+## Two-Store Collapse (done, May 2026)
			
 
				+
			
 
				+`DashboardStore` has been eliminated. All of its methods were moved into `SQLiteClusterStore` (the single data access layer), and the REST API routes now use the shared `SQLiteClusterStore` instance directly.
			
 
				+
			
 
				+All MCP tools (`get_events_for_entity`, `get_news_sentiment`, `get_latest_events` entity mode) now use `SQLiteClusterStore.get_clusters_by_entity_or_keyword()` which searches via junction-table SQL joins — no row-limit blind spot. The `cluster_entities` and `cluster_keywords` junction tables are indexed for O(log n) lookup across any time window.
			
 
				+
			
 
				 ## Timestamp Pipeline (May 2026)
			
 
				 1. **Write**: `sanitize_cluster_payload()` normalizes `timestamp`/`first_seen`/`last_updated` to `YYYY-MM-DDTHH:MM:SS+00:00`. If all three missing, falls back to `datetime.now()`.
			
 
				 2. **Generated column**: `payload_ts` auto-extracts from JSON on write. Indexed.
			
 
				 3. **Read**: All queries filter by `payload_ts >= ?` in SQL. No JSON parsing for time filtering.
			
 
				 4. **Backfill**: One-time `scripts/backfill_junction_tables.py` populated junction tables from existing payloads. `payload_ts` was auto-populated by SQLite.
			
 
				 
			
 
				-## Design Flaw: Two Stores (KNOWN, fix planned)
			
 
				-
			
 
				-**Problem:** `SQLiteClusterStore` and `DashboardStore` are parallel copies of the same data access layer. Methods were duplicated when DashboardStore was added, with the same JSON-parsing approach. When junction tables were implemented, only `DashboardStore` was updated. `SQLiteClusterStore` (used by MCP tools) still does full-table JSON parsing for entity/keyword search.
			
 
				-
			
 
				-**Current state:**
			
 
				-- `DashboardStore` — uses SQL `payload_ts` filter + junction tables ✓
			
 
				-- `SQLiteClusterStore` — uses SQL `payload_ts` filter for time ✓, but MCP tool entity search (`get_events_for_entity`, `get_news_sentiment`) still fetches top-N clusters by time then filters entities in Python
			
 
				+## Design Flaw: Two Stores (FIXED, May 2026)
			
 
				 
			
 
				-**Consequence:** `get_events_for_entity("Pete Hegseth", timeframe="72h")` fetches the 200 most recent clusters (via `payload_ts`), then loops in Python checking entities. If the entity appears in 34 clusters but only 15 are in the top 200, 19 are missed.
			
 
				+**What happened:** `DashboardStore` was a thin read-only query layer that wrapped `SQLiteClusterStore`. The MCP tools (`get_events_for_entity`, `get_news_sentiment`, `get_latest_events` entity mode) did Python-side entity matching by fetching top-N clusters via `payload_ts` then filtering in Python. Entities in clusters beyond the limit were silently missed.
			
 
				 
			
 
				-**Proposed fix:** Collapse both stores into one. `SQLiteClusterStore` should be the single data access layer with proper junction-table methods for entity/keyword search. `DashboardStore` should be a thin wrapper or removed entirely. MCP tools should call `SQLiteClusterStore.get_clusters_by_entity()` using junction tables instead of Python-side filtering.
			
 
				+**Fix applied:** 
			
 
				+- `DashboardStore` was deleted. All its methods are now in `SQLiteClusterStore`.
			
 
				+- All MCP tools use `SQLiteClusterStore.get_clusters_by_entity_or_keyword()` — SQL-level junction-table search with no row-limit blind spot.
			
 
				+- The combined method uses `LEFT JOIN` on `cluster_entities` and `cluster_keywords` with `WHERE (ce.entity IN (...) OR ck.keyword IN (...))`, which matches both named entities and thematic keywords across any time window.
			
 
				+- Exact matching (via `IN`) replaced substring matching — more correct, no false positives from partial name matches.
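
Reviewer sketch (not part of the commit): the combined junction-table search described in the PROJECT.md notes above might look roughly like the following. The body of `get_clusters_by_entity_or_keyword()` is not shown in this diff, so `search_clusters`, the simplified schema, the index names, and the sample rows are all assumptions made for illustration. Only the table names, the `payload_ts` filter, and the `LEFT JOIN ... IN (...)` shape come from the notes.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def search_clusters(conn, query_terms, hours, limit):
    """Hypothetical stand-in for get_clusters_by_entity_or_keyword()."""
    # Normalize terms the same way the REST routes do (strip + lowercase).
    terms = sorted({t.strip().lower() for t in query_terms if t.strip()})
    if not terms:
        return []
    cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
    marks = ",".join("?" for _ in terms)
    sql = f"""
        SELECT DISTINCT c.cluster_id, c.payload_ts
        FROM clusters c
        LEFT JOIN cluster_entities ce ON ce.cluster_id = c.cluster_id
        LEFT JOIN cluster_keywords ck ON ck.cluster_id = c.cluster_id
        WHERE c.payload_ts >= ?
          AND (ce.entity IN ({marks}) OR ck.keyword IN ({marks}))
        ORDER BY c.payload_ts DESC
        LIMIT ?
    """
    # The term list is bound twice: once for entities, once for keywords.
    rows = conn.execute(sql, [cutoff, *terms, *terms, limit]).fetchall()
    return [row[0] for row in rows]


# Simplified in-memory schema (assumed; the real tables carry more columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clusters (cluster_id TEXT PRIMARY KEY, payload_ts TEXT);
    CREATE TABLE cluster_entities (cluster_id TEXT, entity TEXT);
    CREATE TABLE cluster_keywords (cluster_id TEXT, keyword TEXT);
    CREATE INDEX idx_ce_entity ON cluster_entities(entity);
    CREATE INDEX idx_ck_keyword ON cluster_keywords(keyword);
""")

now = datetime.now(timezone.utc)


def hours_ago(h):
    return (now - timedelta(hours=h)).isoformat()


conn.executemany(
    "INSERT INTO clusters VALUES (?, ?)",
    [("c1", hours_ago(1)), ("c2", hours_ago(30)),
     ("c3", hours_ago(2)), ("c4", hours_ago(500))],
)
conn.executemany(
    "INSERT INTO cluster_entities VALUES (?, ?)",
    [("c1", "pete hegseth"), ("c2", "pete hegseth"),
     ("c3", "federal reserve"), ("c4", "pete hegseth")],
)
conn.executemany(
    "INSERT INTO cluster_keywords VALUES (?, ?)",
    [("c1", "pentagon"), ("c3", "tariffs")],
)

# c4 falls outside the 72h window; c3 matches through its keyword only.
matches = search_clusters(conn, ["Pete Hegseth", "tariffs"], hours=72, limit=10)
partial = search_clusters(conn, ["pete"], hours=72, limit=10)
print(matches, partial)
```

Because the filter runs in SQL before `LIMIT` is applied, the limit caps the number of matching clusters returned rather than the number of clusters scanned, which is the blind spot the old top-N-then-filter approach had.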
			
--- a/news_mcp/dashboard/__init__.py
+++ b/news_mcp/dashboard/__init__.py
--- a/news_mcp/dashboard/dashboard_store.py
+++ b/news_mcp/dashboard/dashboard_store.py
@@ -1,357 +0,0 @@
 
				-from __future__ import annotations
			
 
				-
			
 
				-import json
			
 
				-from datetime import datetime, timedelta, timezone
			
 
				-from typing import Any
			
 
				-
			
 
				-from news_mcp.config import (
			
 
				-    NEWS_PRUNE_INTERVAL_HOURS,
			
 
				-    NEWS_PRUNING_ENABLED,
			
 
				-    NEWS_REFRESH_INTERVAL_SECONDS,
			
 
				-    NEWS_RETENTION_DAYS,
			
 
				-    DEFAULT_TOPICS,
			
 
				-)
			
 
				-from news_mcp.storage.sqlite_store import SQLiteClusterStore
			
 
				-
			
 
				-
			
 
				-class DashboardStore:
			
 
				-    """Read-only query layer for the dashboard."""
			
 
				-
			
 
				-    def __init__(self, store=None):
			
 
				-        if store is not None:
			
 
				-            self._store = store
			
 
				-        else:
			
 
				-            from news_mcp.config import DB_PATH
			
 
				-            self._store = SQLiteClusterStore(DB_PATH)
			
 
				-
			
 
				-    # ── Health & Stats ──────────────────────────────────────────────
			
 
				-
			
 
				-    def get_dashboard_stats(self) -> dict[str, Any]:
			
 
				-        with self._store._conn() as conn:
			
 
				-            total_clusters = conn.execute("SELECT COUNT(*) FROM clusters").fetchone()[0]
			
 
				-            total_entities = conn.execute("SELECT COUNT(*) FROM entity_metadata").fetchone()[0]
			
 
				-            cluster_entities = conn.execute(
			
 
				-                "SELECT COUNT(DISTINCT e.value) "
			
 
				-                "FROM clusters, json_each(clusters.payload, '$.entities') AS e"
			
 
				-            ).fetchone()[0]
			
 
				-            topic_counts = dict(conn.execute(
			
 
				-                "SELECT topic, COUNT(*) FROM clusters GROUP BY topic"
			
 
				-            ).fetchall())
			
 
				-
			
 
				-        last_refresh = self._store.get_meta("last_refresh_at")
			
 
				-        last_prune = self._store.get_meta("last_prune_at")
			
 
				-        
			
 
				-        # Freshness: did a refresh happen recently? (within 2x the configured interval)
			
 
				-        fresh = False
			
 
				-        if last_refresh:
			
 
				-            try:
			
 
				-                dt = datetime.fromisoformat(last_refresh.replace("Z", "+00:00"))
			
 
				-                if dt.tzinfo is None:
			
 
				-                    dt = dt.replace(tzinfo=timezone.utc)
			
 
				-                age_hours = (datetime.now(timezone.utc) - dt).total_seconds() / 3600
			
 
				-                fresh = age_hours < max(1.0, NEWS_REFRESH_INTERVAL_SECONDS / 3600) * 2
			
 
				-            except Exception:
			
 
				-                pass
			
 
				-
			
 
				-        feeds = {}
			
 
				-        with self._store._conn() as conn:
			
 
				-            for row in conn.execute("SELECT feed_key, last_hash, last_item_count, enabled, updated_at FROM feed_state ORDER BY updated_at DESC"):
			
 
				-                feeds[row[0]] = {"last_hash": row[1], "last_item_count": row[2], "enabled": bool(row[3]), "updated_at": row[4]}
			
 
				-
			
 
				-        return {
			
 
				-            "total_clusters": total_clusters,
			
 
				-            "total_entities": total_entities,
			
 
				-            "cluster_entities": cluster_entities,
			
 
				-            "clusters_by_topic": topic_counts,
			
 
				-            "last_refresh_at": last_refresh,
			
 
				-            "last_prune_at": last_prune,
			
 
				-            "data_fresh": fresh,
			
 
				-            "feeds": feeds,
			
 
				-            "feed_count": len(feeds),
			
 
				-            "pruning": {
			
 
				-                "enabled": NEWS_PRUNING_ENABLED,
			
 
				-                "retention_days": NEWS_RETENTION_DAYS,
			
 
				-                "interval_hours": NEWS_PRUNE_INTERVAL_HOURS,
			
 
				-                "last_prune_at": last_prune,
			
 
				-            },
			
 
				-        }
			
 
				-
			
 
				-    # ── Clusters ────────────────────────────────────────────────────
			
 
				-
			
 
				-    def get_clusters_page(
			
 
				-        self,
			
 
				-        topic: str | None = None,
			
 
				-        hours: float = 24,
			
 
				-        limit: int = 20,
			
 
				-        offset: int = 0,
			
 
				-    ) -> dict[str, Any]:
			
 
				-        """Paginated cluster listing filtered by SQL payload_ts index.
			
 
				-
			
 
				-        Returns {"clusters": [...], "total": int}.
			
 
				-        """
			
 
				-        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				-
			
 
				-        query = "SELECT payload FROM clusters WHERE payload_ts >= ?"
			
 
				-        params: list = [cutoff]
			
 
				-        if topic and topic != "all":
			
 
				-            query += " AND topic = ?"
			
 
				-            params.append(topic)
			
 
				-        # Get total count before pagination
			
 
				-        total = self._store._conn().execute(
			
 
				-            f"SELECT COUNT(*) FROM ({query})", params
			
 
				-        ).fetchone()[0]
			
 
				-        query += " ORDER BY payload_ts DESC LIMIT ? OFFSET ?"
			
 
				-        params.extend([limit, offset])
			
 
				-
			
 
				-        with self._store._conn() as conn:
			
 
				-            rows = conn.execute(query, params).fetchall()
			
 
				-
			
 
				-        return {
			
 
				-            "clusters": [
			
 
				-                {
			
 
				-                    "cluster_id": c.get("cluster_id", ""),
			
 
				-                    "headline": c.get("headline", ""),
			
 
				-                    "topic": c.get("topic", ""),
			
 
				-                    "sentiment": c.get("sentiment", "neutral"),
			
 
				-                    "sentimentScore": c.get("sentimentScore"),
			
 
				-                    "importance": c.get("importance", 0),
			
 
				-                    "entities": c.get("entities", []),
			
 
				-                    "sources": c.get("sources", []),
			
 
				-                    "timestamp": c.get("timestamp", ""),
			
 
				-                    "keywords": c.get("keywords", []),
			
 
				-                    "article_count": len(c.get("articles", [])),
			
 
				-                }
			
 
				-                for c in [json.loads(r[0]) for r in rows]
			
 
				-            ],
			
 
				-            "total": total,
			
 
				-        }
			
 
				-
			
 
				-    def get_cluster_detail(self, cluster_id: str) -> dict[str, Any] | None:
			
 
				-        with self._store._conn() as conn:
			
 
				-            cur = conn.execute(
			
 
				-                "SELECT payload FROM clusters WHERE cluster_id = ?", (cluster_id,)
			
 
				-            )
			
 
				-            row = cur.fetchone()
			
 
				-            if not row:
			
 
				-                return None
			
 
				-            c = json.loads(row[0])
			
 
				-            summary = None
			
 
				-            if c.get("summary_payload"):
			
 
				-                try:
			
 
				-                    summary = json.loads(c["summary_payload"])
			
 
				-                except Exception:
			
 
				-                    pass
			
 
				-            return {
			
 
				-                "cluster_id": c.get("cluster_id"),
			
 
				-                "headline": c.get("headline", ""),
			
 
				-                "summary": c.get("summary", ""),
			
 
				-                "topic": c.get("topic", ""),
			
 
				-                "sentiment": c.get("sentiment", "neutral"),
			
 
				-                "sentimentScore": c.get("sentimentScore"),
			
 
				-                "importance": c.get("importance", 0),
			
 
				-                "entities": c.get("entities", []),
			
 
				-                "entityResolutions": c.get("entityResolutions", []),
			
 
				-                "keywords": c.get("keywords", []),
			
 
				-                "sources": c.get("sources", []),
			
 
				-                "timestamp": c.get("timestamp", ""),
			
 
				-                "first_seen": c.get("first_seen", ""),
			
 
				-                "last_updated": c.get("last_updated", ""),
			
 
				-                "article_count": len(c.get("articles", [])),
			
 
				-                "articles": c.get("articles", []),
			
 
				-                "summary_text": summary.get("mergedSummary", "") if summary else "",
			
 
				-                "key_facts": summary.get("keyFacts", []) if summary else [],
			
 
				-            }
			
 
				-
			
 
				-    # ── Sentiment Series ────────────────────────────────────────────
			
 
				-
			
 
				-    def get_sentiment_series(
			
 
				-            self,
			
 
				-            topic: str | None = None,
			
 
				-            hours: float = 24,
			
 
				-            bucket_hours: float = 1,
			
 
				-        ) -> list[dict[str, Any]]:
			
 
				-        """Sentiment score averaged per time bucket.
			
 
				-
			
 
				-        Filters by payload_ts SQL index.
			
 
				-        """
			
 
				-        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				-        query = "SELECT payload FROM clusters WHERE payload_ts >= ?"
			
 
				-        params: list = [cutoff]
			
 
				-        if topic and topic != "all":
			
 
				-            query += " AND topic = ?"
			
 
				-            params.append(topic)
			
 
				-        query += " ORDER BY payload_ts ASC"
			
 
				-
			
 
				-        with self._store._conn() as conn:
			
 
				-            rows = conn.execute(query, params).fetchall()
			
 
				-
			
 
				-        buckets: dict[datetime, list[float]] = {}
			
 
				-        for (payload_text,) in rows:
			
 
				-            c = json.loads(payload_text)
			
 
				-            ts_str = c.get("timestamp")
			
 
				-            score = c.get("sentimentScore")
			
 
				-            if not ts_str or score is None:
			
 
				-                continue
			
 
				-            dt = datetime.fromisoformat(str(ts_str).strip())
			
 
				-            if dt.tzinfo is None:
			
 
				-                dt = dt.replace(tzinfo=timezone.utc)
			
 
				-            dt = dt.astimezone(timezone.utc)
			
 
				-            bucket_key = dt.replace(minute=0, second=0, microsecond=0)
			
 
				-            if bucket_hours > 1:
			
 
				-                bucket_key = bucket_key.replace(
			
 
				-                    hour=(bucket_key.hour // int(bucket_hours)) * int(bucket_hours)
			
 
				-                )
			
 
				-            buckets.setdefault(bucket_key, []).append(float(score))
			
 
				-
			
 
				-        return [
			
 
				-            {
			
 
				-                "time": bucket_key.isoformat(),
			
 
				-                "avg_sentiment": round(sum(scores) / len(scores), 3),
			
 
				-                "count": len(scores),
			
 
				-                "min": round(min(scores), 3),
			
 
				-                "max": round(max(scores), 3),
			
 
				-            }
			
 
				-            for bucket_key, scores in sorted(buckets.items())
			
 
				-        ]
			
 
				-
			
 
				-    # ── Entity Frequencies ──────────────────────────────────────────
			
 
				-
			
 
				-    def get_entity_frequencies(
			
 
				-        self,
			
 
				-        hours: float = 24,
			
 
				-        limit: int = 30,
			
 
				-    ) -> list[dict[str, Any]]:
			
 
				-        """Top entities by mention count, using SQL junction table + payload_ts index."""
			
 
				-        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				-
			
 
				-        with self._store._conn() as conn:
			
 
				-            rows = conn.execute(
			
 
				-                """
			
 
				-                SELECT ce.entity, COUNT(*) as cnt
			
 
				-                FROM cluster_entities ce
			
 
				-                JOIN clusters c ON c.cluster_id = ce.cluster_id
			
 
				-                WHERE c.payload_ts >= ?
			
 
				-                GROUP BY ce.entity
			
 
				-                ORDER BY cnt DESC
			
 
				-                LIMIT ?
			
 
				-                """,
			
 
				-                (cutoff, limit),
			
 
				-            ).fetchall()
			
 
				-
			
 
				-        result: list[dict[str, Any]] = []
			
 
				-        for label, count in rows:
			
 
				-            meta = self._store.get_entity_metadata(label)
			
 
				-            result.append({
			
 
				-                "label": label,
			
 
				-                "count": count,
			
 
				-                "canonical_label": meta["canonical_label"] if meta else label,
			
 
				-                "mid": meta["mid"] if meta else None,
			
 
				-            })
			
 
				-        return result
			
 
				-
			
 
				-    # ── Keyword Frequencies ─────────────────────────────────────────
			
 
				-
			
 
				-    def get_keyword_frequencies(
			
 
				-        self,
			
 
				-        hours: float = 24,
			
 
				-        limit: int = 30,
			
 
				-    ) -> list[dict[str, Any]]:
			
 
				-        """Top keywords by mention count, using SQL junction table + payload_ts index.
			
 
				-
			
 
				-        Excludes DEFAULT_TOPICS labels (crypto, macro, regulation, ai, other).
			
 
				-        """
			
 
				-        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				-        _topic_labels = {t.lower() for t in DEFAULT_TOPICS}
			
 
				-
			
 
				-        with self._store._conn() as conn:
			
 
				-            rows = conn.execute(
			
 
				-                """
			
 
				-                SELECT ck.keyword, COUNT(*) as cnt
			
 
				-                FROM cluster_keywords ck
			
 
				-                JOIN clusters c ON c.cluster_id = ck.cluster_id
			
 
				-                WHERE c.payload_ts >= ?
			
 
				-                GROUP BY ck.keyword
			
 
				-                ORDER BY cnt DESC
			
 
				-                LIMIT ?
			
 
				-                """,
			
 
				-                (cutoff, limit),
			
 
				-            ).fetchall()
			
 
				-
			
 
				-        return [
			
 
				-            {"label": label, "count": count}
			
 
				-            for label, count in rows
			
 
				-            if label.lower() not in _topic_labels
			
 
				-        ]
			
 
				-
			
 
				-    # ── Entity/Keyword Cluster Search ────────────────────────────────
			
 
				-
			
 
				-    def get_clusters_by_entity(
			
 
				-        self,
			
 
				-        entity: str,
			
 
				-        hours: float = 168,
			
 
				-        limit: int = 50,
			
 
				-        offset: int = 0,
			
 
				-    ) -> dict[str, Any]:
			
 
				-        """Return clusters matching an entity, SQL-level filter via junction table."""
			
 
				-        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				-        entity_norm = entity.strip().lower()
			
 
				-
			
 
				-        with self._store._conn() as conn:
			
 
				-            # Total count
			
 
				-            total = conn.execute(
			
 
				-                "SELECT COUNT(DISTINCT c.cluster_id) FROM clusters c "
			
 
				-                "JOIN cluster_entities ce ON c.cluster_id = ce.cluster_id "
			
 
				-                "WHERE c.payload_ts >= ? AND ce.entity = ?",
			
 
				-                (cutoff, entity_norm),
			
 
				-            ).fetchone()[0]
			
 
				-
			
 
				-            # Paginated results
			
 
				-            rows = conn.execute(
			
 
				-                "SELECT DISTINCT c.payload FROM clusters c "
			
 
				-                "JOIN cluster_entities ce ON c.cluster_id = ce.cluster_id "
			
 
				-                "WHERE c.payload_ts >= ? AND ce.entity = ? "
			
 
				-                "ORDER BY c.payload_ts DESC LIMIT ? OFFSET ?",
			
 
				-                (cutoff, entity_norm, limit, offset),
			
 
				-            ).fetchall()
			
 
				-
			
 
				-        return {
			
 
				-            "entity": entity_norm,
			
 
				-            "clusters": [json.loads(r[0]) for r in rows],
			
 
				-            "total": total,
			
 
				-            "hours": hours,
			
 
				-        }
			
 
				-
			
 
				-    def get_clusters_by_keyword(
			
 
				-        self,
			
 
				-        keyword: str,
			
 
				-        hours: float = 168,
			
 
				-        limit: int = 50,
			
 
				-        offset: int = 0,
			
 
				-    ) -> dict[str, Any]:
			
 
				-        """Return clusters matching a keyword, SQL-level filter via junction table."""
			
 
				-        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				-        kw_norm = keyword.strip().lower()
			
 
				-
			
 
				-        with self._store._conn() as conn:
			
 
				-            total = conn.execute(
			
 
				-                "SELECT COUNT(DISTINCT c.cluster_id) FROM clusters c "
			
 
				-                "JOIN cluster_keywords ck ON c.cluster_id = ck.cluster_id "
			
 
				-                "WHERE c.payload_ts >= ? AND ck.keyword = ?",
			
 
				-                (cutoff, kw_norm),
			
 
				-            ).fetchone()[0]
			
 
				-
			
 
				-            rows = conn.execute(
			
 
				-                "SELECT DISTINCT c.payload FROM clusters c "
			
 
				-                "JOIN cluster_keywords ck ON c.cluster_id = ck.cluster_id "
			
 
				-                "WHERE c.payload_ts >= ? AND ck.keyword = ? "
			
 
				-                "ORDER BY c.payload_ts DESC LIMIT ? OFFSET ?",
			
 
				-                (cutoff, kw_norm, limit, offset),
			
 
				-            ).fetchall()
			
 
				-
			
 
				-        return {
			
 
				-            "keyword": kw_norm,
			
 
				-            "clusters": [json.loads(r[0]) for r in rows],
			
 
				-            "total": total,
			
 
				-            "hours": hours,
			
 
				-        }
			
 
				-
			
--- a/news_mcp/mcp_server_fastmcp.py
+++ b/news_mcp/mcp_server_fastmcp.py
@@ -28,7 +28,6 @@ from news_mcp.config import (
 
				 )
			
 
				 from news_mcp.jobs.poller import refresh_clusters
			
 
				 from news_mcp.storage.sqlite_store import SQLiteClusterStore
			
 
				-from news_mcp.dashboard.dashboard_store import DashboardStore
			
 
				 from news_mcp.enrichment.llm_enrich import summarize_cluster_llm
			
 
				 from news_mcp.trends_resolution import resolve_entity_via_trends
			
 
				 from news_mcp.llm import active_llm_config
			
@@ -369,16 +368,10 @@ async def get_latest_events(topic: str | None = None, limit: int = 5, include_ar
 
				             clusters = store.get_latest_clusters(topic=topic_norm, ttl_hours=DEFAULT_LOOKBACK_HOURS, limit=limit)
			
 
				     else:
			
 
				         # Entity-aware mode: search recent clusters across all topics and match by
			
 
				-        # raw entity, canonical label, or MID.
			
 
				-        clusters = store.get_latest_clusters_all_topics(ttl_hours=DEFAULT_LOOKBACK_HOURS, limit=limit * 8)
			
 
				-        filtered = []
			
 
				-        for c in clusters:
			
 
				-            haystack = _cluster_entity_haystack(c)
			
 
				-            if any(any(term in item for item in haystack) for term in query_terms):
			
 
				-                filtered.append(c)
			
 
				-            if len(filtered) >= limit:
			
 
				-                break
			
 
				-        clusters = filtered
			
 
				+        # raw entity, canonical label, or MID using SQL-level junction table search.
			
 
				+        clusters = store.get_clusters_by_entity_or_keyword(
			
 
				+            query_terms=query_terms, hours=DEFAULT_LOOKBACK_HOURS, limit=limit
			
 
				+        )
			
 
				 
			
 
				     out = []
			
 
				     for c in _sort_clusters_by_recency(clusters):
			
@@ -429,19 +422,8 @@ async def get_events_for_entity(entity: str, limit: int = 10, timeframe: str = "
 
				 
			
 
				     store = SQLiteClusterStore(DB_PATH)
			
 
				 
			
 
				-    def _match_clusters(clusters: list[dict]) -> list[dict]:
			
 
				-        hits: list[dict] = []
			
 
				-        for c in _sort_clusters_by_recency(clusters):
			
 
				-            haystack = _cluster_entity_haystack(c)
			
 
				-            if any(any(term in item for item in haystack) for term in query_terms):
			
 
				-                hits.append(c)
			
 
				-            if len(hits) >= limit:
			
 
				-                break
			
 
				-        return hits
			
 
				-
			
 
				     hours = _parse_timeframe_to_hours(timeframe)
			
 
				-    clusters = store.get_latest_clusters_all_topics(ttl_hours=hours, limit=max(200, limit * 10))
			
 
				-    hits = _match_clusters(clusters)
			
 
				+    hits = store.get_clusters_by_entity_or_keyword(query_terms=query_terms, hours=hours, limit=limit)
			
 
				 
			
 
				     out = []
			
 
				     for c in hits:
			
@@ -903,12 +885,8 @@ async def get_news_sentiment(entity: str, timeframe: str = "24h"):
 
				         hours = 24
			
 
				     hours = max(1, min(int(hours), 168))
			
 
				 
			
 
				-    clusters = store.get_latest_clusters_all_topics(ttl_hours=hours, limit=500)
			
 
				-    matched = []
			
 
				-    for c in clusters:
			
 
				-        haystack = _cluster_entity_haystack(c)
			
 
				-        if any(any(term in item for item in haystack) for term in query_terms):
			
 
				-            matched.append(c)
			
 
				+    clusters = store.get_clusters_by_entity_or_keyword(query_terms=query_terms, hours=hours, limit=500)
			
 
				+    matched = clusters
			
 
				 
			
 
				     if not matched:
			
 
				         return {
			
@@ -1101,7 +1079,7 @@ def _api_err(exc: Exception, ctx: str) -> JSONResponse:
 
				 def api_health():
			
 
				     """Extended health + dashboard stats."""
			
 
				     try:
			
 
				-        store = DashboardStore(_shared_store)
			
 
				+        store = _shared_store
			
 
				         stats = store.get_dashboard_stats()
			
 
				         stats["version"] = _VERSION_HASH
			
 
				         return stats
			
@@ -1117,7 +1095,7 @@ def api_clusters(
 
				 ):
			
 
				     """Paginated cluster listing."""
			
 
				     try:
			
 
				-        store = DashboardStore(_shared_store)
			
 
				+        store = _shared_store
			
 
				         result = store.get_clusters_page(topic=topic, hours=hours, limit=limit, offset=offset)
			
 
				         return {"clusters": result["clusters"], "total": result["total"], "topic": topic or "all", "hours": hours}
			
 
				     except Exception as e:
			
@@ -1131,7 +1109,7 @@ def api_sentiment_series(
 
				 ):
			
 
				     """Sentiment time-series for Chart.js."""
			
 
				     try:
			
 
				-        store = DashboardStore(_shared_store)
			
 
				+        store = _shared_store
			
 
				         series = store.get_sentiment_series(topic=topic, hours=hours, bucket_hours=bucket_hours)
			
 
				         return {"series": series, "topic": topic or "all"}
			
 
				     except Exception as e:
			
@@ -1144,7 +1122,7 @@ def api_entities(
 
				 ):
			
 
				     """Top entity frequencies."""
			
 
				     try:
			
 
				-        store = DashboardStore(_shared_store)
			
 
				+        store = _shared_store
			
 
				         entities = store.get_entity_frequencies(hours=hours, limit=limit)
			
 
				         return {"entities": entities, "hours": hours}
			
 
				     except Exception as e:
			
@@ -1157,7 +1135,7 @@ def api_keywords(
 
				 ):
			
 
				     """Top keyword frequencies (thematic descriptors, excluding terms already counted as entities)."""
			
 
				     try:
			
 
				-        store = DashboardStore(_shared_store)
			
 
				+        store = _shared_store
			
 
				         keywords = store.get_keyword_frequencies(hours=hours, limit=limit)
			
 
				         return {"keywords": keywords, "hours": hours}
			
 
				     except Exception as e:
			
@@ -1172,7 +1150,7 @@ def api_clusters_by_entity(
 
				 ):
			
 
				     """Return clusters matching an entity, filtered by event time via SQL junction table."""
			
 
				     try:
			
 
				-        store = DashboardStore(_shared_store)
			
 
				+        store = _shared_store
			
 
				         return store.get_clusters_by_entity(
			
 
				             entity=entity.strip().lower(),
			
 
				             hours=hours,
			
@@ -1191,7 +1169,7 @@ def api_clusters_by_keyword(
 
				 ):
			
 
				     """Return clusters matching a keyword, filtered by event time via SQL junction table."""
			
 
				     try:
			
 
				-        store = DashboardStore(_shared_store)
			
 
				+        store = _shared_store
			
 
				         return store.get_clusters_by_keyword(
			
 
				             keyword=keyword.strip().lower(),
			
 
				             hours=hours,
			
@@ -1205,7 +1183,7 @@ def api_clusters_by_keyword(
 
				 def api_cluster_detail(cluster_id: str):
			
 
				     """Full cluster detail for drill-down."""
			
 
				     try:
			
 
				-        store = DashboardStore(_shared_store)
			
 
				+        store = _shared_store
			
 
				         detail = store.get_cluster_detail(cluster_id)
			
 
				         if not detail:
			
 
				             return JSONResponse(status_code=404, content={"error": "Cluster not found", "id": cluster_id})
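
Reviewer sketch (not part of the commit): the MCP tool hunks above swap the Python `term in item` loop for the SQL `IN` filter, which changes matching from substring to exact. The snippet below illustrates that difference in isolation. The definition of `_cluster_entity_haystack` is not shown in this diff, so the haystack is assumed here to be a list of lowercase strings, and both helper functions are hypothetical.

```python
def substring_match(terms, haystack):
    # Mirrors the removed loop: `term in item` is a substring test on strings.
    return any(any(term in item for item in haystack) for term in terms)


def exact_match(terms, haystack):
    # Mirrors what a SQL `IN (...)` filter does: whole-value equality only.
    values = set(haystack)
    return any(term in values for term in terms)


haystack = ["united states", "federal reserve"]

old_partial = substring_match(["fed"], haystack)   # partial name hits
new_partial = exact_match(["fed"], haystack)       # partial name no longer hits
old_full = substring_match(["federal reserve"], haystack)
new_full = exact_match(["federal reserve"], haystack)
print(old_partial, new_partial, old_full, new_full)
```

The trade-off is that callers who relied on partial names matching now need to pass the full normalized entity or keyword.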
			
--- a/news_mcp/storage/sqlite_store.py
+++ b/news_mcp/storage/sqlite_store.py
@@ -12,6 +12,7 @@ from email.utils import parsedate_to_datetime
 
				 from news_mcp.config import (
			
 
				     NEWS_PRUNE_INTERVAL_HOURS,
			
 
				     NEWS_PRUNING_ENABLED,
			
 
				+    NEWS_REFRESH_INTERVAL_SECONDS,
			
 
				     NEWS_RETENTION_DAYS,
			
 
				 )
			
 
				 from news_mcp.entity_normalize import normalize_entities
			
@@ -671,13 +672,34 @@ class SQLiteClusterStore:
 
				         with self._conn() as conn:
			
 
				             total_clusters = conn.execute("SELECT COUNT(*) FROM clusters").fetchone()[0]
			
 
				             total_entities = conn.execute("SELECT COUNT(*) FROM entity_metadata").fetchone()[0]
			
 
				+            cluster_entities = conn.execute(
			
 
				+                "SELECT COUNT(DISTINCT e.value) "
			
 
				+                "FROM clusters, json_each(clusters.payload, '$.entities') AS e"
			
 
				+            ).fetchone()[0]
			
 
				             topic_counts = dict(conn.execute(
			
 
				                 "SELECT topic, COUNT(*) FROM clusters GROUP BY topic"
			
 
				             ).fetchall())
			
 
				             last_refresh = self.get_meta("last_refresh_at")
			
 
				             feeds = {}
			
 
				-            for row in conn.execute("SELECT feed_key, last_hash, last_item_count, updated_at FROM feed_state"):
			
 
				-                feeds[row[0]] = {"last_hash": row[1], "last_item_count": row[2], "updated_at": row[3]}
			
 
				+            for row in conn.execute(
			
 
				+                "SELECT feed_key, last_hash, last_item_count, enabled, updated_at FROM feed_state ORDER BY updated_at DESC"
			
 
				+            ):
			
 
				+                feeds[row[0]] = {
			
 
				+                    "last_hash": row[1], "last_item_count": row[2],
			
 
				+                    "enabled": bool(row[3]), "updated_at": row[4],
			
 
				+                }
			
 
				+            # Freshness: did a refresh happen within 2x the configured interval (interval floored at 1 hour)?
			
 
				+            fresh = False
			
 
				+            if last_refresh:
			
 
				+                try:
			
 
				+                    dt = datetime.fromisoformat(last_refresh.replace("Z", "+00:00"))
			
 
				+                    if dt.tzinfo is None:
			
 
				+                        dt = dt.replace(tzinfo=timezone.utc)
			
 
				+                    age_hours = (datetime.now(timezone.utc) - dt).total_seconds() / 3600
			
 
				+                    fresh = age_hours < max(1.0, NEWS_REFRESH_INTERVAL_SECONDS / 3600) * 2
			
 
				+                except Exception:
			
 
				+                    pass
			
 
				+
			
 
				             last_prune = self.get_meta(META_LAST_PRUNE_AT)
			
 
				             prune_state = self.get_prune_state(
			
 
				                 pruning_enabled=NEWS_PRUNING_ENABLED,
			
@@ -687,11 +709,14 @@ class SQLiteClusterStore:
 
				             return {
			
 
				                 "total_clusters": total_clusters,
			
 
				                 "total_entities": total_entities,
			
 
				+                "cluster_entities": cluster_entities,
			
 
				                 "clusters_by_topic": topic_counts,
			
 
				                 "last_refresh_at": last_refresh,
			
 
				                 "last_prune_at": last_prune,
			
 
				-                "prune_state": prune_state,
			
 
				+                "data_fresh": fresh,
			
 
				                 "feeds": feeds,
			
 
				+                "feed_count": len(feeds),
			
 
				+                "prune_state": prune_state,
			
 
				             }
			
 
				 
			
 
				     def get_cluster_detail(self, cluster_id: str) -> dict[str, Any] | None:
			
@@ -730,3 +755,269 @@ class SQLiteClusterStore:
 
				                 "summary_text": summary.get("mergedSummary", "") if summary else "",
			
 
				                 "key_facts": summary.get("keyFacts", []) if summary else [],
			
 
				             }
			
 
				+
			
 
				+    # ── Paginated Clusters ────────────────────────────────────────────
			
 
				+
			
 
				+    def get_clusters_page(
			
 
				+        self,
			
 
				+        topic: str | None = None,
			
 
				+        hours: float = 24,
			
 
				+        limit: int = 20,
			
 
				+        offset: int = 0,
			
 
				+    ) -> dict[str, Any]:
			
 
				+        """Paginated cluster listing filtered by SQL payload_ts index.
			
 
				+
			
 
				+        Returns {"clusters": [...], "total": int}.
			
 
				+        """
			
 
				+        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				+        query = "SELECT payload FROM clusters WHERE payload_ts >= ?"
			
 
				+        params: list = [cutoff]
			
 
				+        if topic and topic != "all":
			
 
				+            query += " AND topic = ?"
			
 
				+            params.append(topic)
			
 
				+        count_query = f"SELECT COUNT(*) FROM ({query})"
			
 
				+        page_query = query + " ORDER BY payload_ts DESC LIMIT ? OFFSET ?"
			
 
				+        page_params = [*params, limit, offset]
			
 
				+
			
 
				+        # Run the count and the page fetch on the same managed connection.
			
 
				+        with self._conn() as conn:
			
 
				+            total = conn.execute(count_query, params).fetchone()[0]
			
 
				+            rows = conn.execute(page_query, page_params).fetchall()
			
 
				+
			
 
				+        return {
			
 
				+            "clusters": [
			
 
				+                {
			
 
				+                    "cluster_id": c.get("cluster_id", ""),
			
 
				+                    "headline": c.get("headline", ""),
			
 
				+                    "topic": c.get("topic", ""),
			
 
				+                    "sentiment": c.get("sentiment", "neutral"),
			
 
				+                    "sentimentScore": c.get("sentimentScore"),
			
 
				+                    "importance": c.get("importance", 0),
			
 
				+                    "entities": c.get("entities", []),
			
 
				+                    "sources": c.get("sources", []),
			
 
				+                    "timestamp": c.get("timestamp", ""),
			
 
				+                    "keywords": c.get("keywords", []),
			
 
				+                    "article_count": len(c.get("articles", [])),
			
 
				+                }
			
 
				+                for c in [json.loads(r[0]) for r in rows]
			
 
				+            ],
			
 
				+            "total": total,
			
 
				+        }
			
 
				+
			
 
				+    # ── Sentiment Series ──────────────────────────────────────────────
			
 
				+
			
 
				+    def get_sentiment_series(
			
 
				+        self,
			
 
				+        topic: str | None = None,
			
 
				+        hours: float = 24,
			
 
				+        bucket_hours: float = 1,
			
 
				+    ) -> list[dict[str, Any]]:
			
 
				+        """Sentiment score averaged per time bucket.  Filters by payload_ts SQL index."""
			
 
				+        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				+        query = "SELECT payload FROM clusters WHERE payload_ts >= ?"
			
 
				+        params: list = [cutoff]
			
 
				+        if topic and topic != "all":
			
 
				+            query += " AND topic = ?"
			
 
				+            params.append(topic)
			
 
				+        query += " ORDER BY payload_ts ASC"
			
 
				+
			
 
				+        with self._conn() as conn:
			
 
				+            rows = conn.execute(query, params).fetchall()
			
 
				+
			
 
				+        buckets: dict[datetime, list[float]] = {}
			
 
				+        for (payload_text,) in rows:
			
 
				+            c = json.loads(payload_text)
			
 
				+            ts_str = c.get("timestamp")
			
 
				+            score = c.get("sentimentScore")
			
 
				+            if not ts_str or score is None:
			
 
				+                continue
			
 
				+            dt = datetime.fromisoformat(str(ts_str).strip().replace("Z", "+00:00"))
			
 
				+            if dt.tzinfo is None:
			
 
				+                dt = dt.replace(tzinfo=timezone.utc)
			
 
				+            dt = dt.astimezone(timezone.utc)
			
 
				+            bucket_key = dt.replace(minute=0, second=0, microsecond=0)
			
 
				+            if bucket_hours > 1:
			
 
				+                bucket_key = bucket_key.replace(
			
 
				+                    hour=(bucket_key.hour // int(bucket_hours)) * int(bucket_hours)
			
 
				+                )
			
 
				+            buckets.setdefault(bucket_key, []).append(float(score))
			
 
				+
			
 
				+        return [
			
 
				+            {
			
 
				+                "time": bucket_key.isoformat(),
			
 
				+                "avg_sentiment": round(sum(scores) / len(scores), 3),
			
 
				+                "count": len(scores),
			
 
				+                "min": round(min(scores), 3),
			
 
				+                "max": round(max(scores), 3),
			
 
				+            }
			
 
				+            for bucket_key, scores in sorted(buckets.items())
			
 
				+        ]
			
 
				+
			
 
				+    # ── Entity / Keyword Frequencies ──────────────────────────────────
			
 
				+
			
 
				+    def get_entity_frequencies(
			
 
				+        self,
			
 
				+        hours: float = 24,
			
 
				+        limit: int = 30,
			
 
				+    ) -> list[dict[str, Any]]:
			
 
				+        """Top entities by mention count, using SQL junction table + payload_ts index."""
			
 
				+        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				+        with self._conn() as conn:
			
 
				+            rows = conn.execute(
			
 
				+                """\
			
 
				+                SELECT ce.entity, COUNT(*) as cnt
			
 
				+                FROM cluster_entities ce
			
 
				+                JOIN clusters c ON c.cluster_id = ce.cluster_id
			
 
				+                WHERE c.payload_ts >= ?
			
 
				+                GROUP BY ce.entity
			
 
				+                ORDER BY cnt DESC
			
 
				+                LIMIT ?
			
 
				+                """,
			
 
				+                (cutoff, limit),
			
 
				+            ).fetchall()
			
 
				+
			
 
				+        result: list[dict[str, Any]] = []
			
 
				+        for label, count in rows:
			
 
				+            meta = self.get_entity_metadata(label)
			
 
				+            result.append({
			
 
				+                "label": label,
			
 
				+                "count": count,
			
 
				+                "canonical_label": meta["canonical_label"] if meta else label,
			
 
				+                "mid": meta["mid"] if meta else None,
			
 
				+            })
			
 
				+        return result
			
 
				+
			
 
				+    def get_keyword_frequencies(
			
 
				+        self,
			
 
				+        hours: float = 24,
			
 
				+        limit: int = 30,
			
 
				+    ) -> list[dict[str, Any]]:
			
 
				+        """Top keywords by mention count, using SQL junction table + payload_ts index.
			
 
				+
			
 
				+        Excludes DEFAULT_TOPICS labels (crypto, macro, regulation, ai, other).
			
 
				+        """
			
 
				+        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				+        _topic_labels = {"crypto", "macro", "regulation", "ai", "other"}
			
 
				+
			
 
				+        with self._conn() as conn:
			
 
				+            rows = conn.execute(
			
 
				+                """\
			
 
				+                SELECT ck.keyword, COUNT(*) as cnt
			
 
				+                FROM cluster_keywords ck
			
 
				+                JOIN clusters c ON c.cluster_id = ck.cluster_id
			
 
				+                WHERE c.payload_ts >= ?
			
 
				+                GROUP BY ck.keyword
			
 
				+                ORDER BY cnt DESC
			
 
				+                LIMIT ?
			
 
				+                """,
			
 
				+                (cutoff, limit + len(_topic_labels)),  # over-fetch: topic labels are filtered out below
			
 
				+            ).fetchall()
			
 
				+
			
 
				+        return [
			
 
				+            {"label": label, "count": count}
			
 
				+            for label, count in rows
			
 
				+            if label.lower() not in _topic_labels
			
 
				+        ][:limit]
			
 
				+
			
 
				+    # ── Junction-Table Entity / Keyword Cluster Search ────────────────
			
 
				+
			
 
				+    def get_clusters_by_entity(
			
 
				+        self,
			
 
				+        entity: str,
			
 
				+        hours: float = 168,
			
 
				+        limit: int = 50,
			
 
				+        offset: int = 0,
			
 
				+    ) -> dict[str, Any]:
			
 
				+        """Return clusters matching an entity, SQL-level filter via junction table.
			
 
				+
			
 
				+        Returns {"entity": ..., "clusters": [...], "total": int, "hours": float}.
			
 
				+        """
			
 
				+        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				+        entity_norm = entity.strip().lower()
			
 
				+
			
 
				+        with self._conn() as conn:
			
 
				+            total = conn.execute(
			
 
				+                "SELECT COUNT(DISTINCT c.cluster_id) FROM clusters c "
			
 
				+                "JOIN cluster_entities ce ON c.cluster_id = ce.cluster_id "
			
 
				+                "WHERE c.payload_ts >= ? AND ce.entity = ?",
			
 
				+                (cutoff, entity_norm),
			
 
				+            ).fetchone()[0]
			
 
				+
			
 
				+            rows = conn.execute(
			
 
				+                "SELECT DISTINCT c.payload FROM clusters c "
			
 
				+                "JOIN cluster_entities ce ON c.cluster_id = ce.cluster_id "
			
 
				+                "WHERE c.payload_ts >= ? AND ce.entity = ? "
			
 
				+                "ORDER BY c.payload_ts DESC LIMIT ? OFFSET ?",
			
 
				+                (cutoff, entity_norm, limit, offset),
			
 
				+            ).fetchall()
			
 
				+
			
 
				+        return {
			
 
				+            "entity": entity_norm,
			
 
				+            "clusters": [json.loads(r[0]) for r in rows],
			
 
				+            "total": total,
			
 
				+            "hours": hours,
			
 
				+        }
			
 
				+
			
 
				+    def get_clusters_by_keyword(
			
 
				+        self,
			
 
				+        keyword: str,
			
 
				+        hours: float = 168,
			
 
				+        limit: int = 50,
			
 
				+        offset: int = 0,
			
 
				+    ) -> dict[str, Any]:
			
 
				+        """Return clusters matching a keyword, SQL-level filter via junction table."""
			
 
				+        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				+        kw_norm = keyword.strip().lower()
			
 
				+
			
 
				+        with self._conn() as conn:
			
 
				+            total = conn.execute(
			
 
				+                "SELECT COUNT(DISTINCT c.cluster_id) FROM clusters c "
			
 
				+                "JOIN cluster_keywords ck ON c.cluster_id = ck.cluster_id "
			
 
				+                "WHERE c.payload_ts >= ? AND ck.keyword = ?",
			
 
				+                (cutoff, kw_norm),
			
 
				+            ).fetchone()[0]
			
 
				+
			
 
				+            rows = conn.execute(
			
 
				+                "SELECT DISTINCT c.payload FROM clusters c "
			
 
				+                "JOIN cluster_keywords ck ON c.cluster_id = ck.cluster_id "
			
 
				+                "WHERE c.payload_ts >= ? AND ck.keyword = ? "
			
 
				+                "ORDER BY c.payload_ts DESC LIMIT ? OFFSET ?",
			
 
				+                (cutoff, kw_norm, limit, offset),
			
 
				+            ).fetchall()
			
 
				+
			
 
				+        return {
			
 
				+            "keyword": kw_norm,
			
 
				+            "clusters": [json.loads(r[0]) for r in rows],
			
 
				+            "total": total,
			
 
				+            "hours": hours,
			
 
				+        }
			
 
				+
			
 
				+    def get_clusters_by_entity_or_keyword(
			
 
				+        self,
			
 
				+        query_terms: set[str],
			
 
				+        hours: float,
			
 
				+        limit: int,
			
 
				+    ) -> list[dict]:
			
 
				+        """Search clusters by matching ANY query term against entities OR keywords.
			
 
				+
			
 
				+        Uses SQL-level junction-table filtering — no row-limit blind spot.
			
 
				+        Returns clusters sorted by recency.
			
 
				+        """
			
 
				+        terms = [q.strip().lower() for q in query_terms if q.strip()]
			
 
				+        if not terms:
			
 
				+            return []
			
 
				+        cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
			
 
				+        placeholders = ",".join("?" for _ in terms)
			
 
				+
			
 
				+        with self._conn() as conn:
			
 
				+            rows = conn.execute(
			
 
				+                f"SELECT DISTINCT c.payload FROM clusters c "
			
 
				+                f"LEFT JOIN cluster_entities ce ON c.cluster_id = ce.cluster_id "
			
 
				+                f"LEFT JOIN cluster_keywords ck ON c.cluster_id = ck.cluster_id "
			
 
				+                f"WHERE c.payload_ts >= ? "
			
 
				+                f"  AND (ce.entity IN ({placeholders}) OR ck.keyword IN ({placeholders})) "
			
 
				+                f"ORDER BY c.payload_ts DESC LIMIT ?",
			
 
				+                (cutoff, *terms, *terms, limit),
			
 
				+            ).fetchall()
			
 
				+
			
 
				+        return [json.loads(r[0]) for r in rows]