docs: document blacklist enforcement script

Lukas Goldschmidt 1 month ago
parent
commit
8317f5afd9
1 changed file with 16 additions and 0 deletions

README.md

@@ -135,4 +135,20 @@ mcporter --config "$CONFIG" call news.detect_emerging_topics limit=10
 ```bash
 mcporter --config "$CONFIG" call news.get_news_sentiment entity=Bitcoin timeframe=24h
 mcporter --config "$CONFIG" call news.get_news_sentiment entity=Ethereum timeframe=72h
+
+## Blacklist enforcement (optional back-clean)
+
+If you change `ENTITY_BLACKLIST`, existing clusters in `news.sqlite` may still
+contain entities/keywords that would now be filtered at extraction time.
+
+For one-off cleanup, run:
+
+```bash
+./.venv/bin/python scripts/enforce_news_blacklist.py --dry-run --limit 200
+./.venv/bin/python scripts/enforce_news_blacklist.py --limit 1000
+```
+
+This enforces `ENTITY_BLACKLIST` inside stored clusters by removing matching
+entries from `payload.entities` and `payload.keywords` and (if needed) setting
+`payload.topic = "other"`.
 ```
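
The cleanup described in the added README section can be illustrated with a minimal sketch. This is not the code of `scripts/enforce_news_blacklist.py`; it assumes a cluster payload is a dict with list-valued `entities` and `keywords` and a string `topic`, that matching is a case-insensitive exact comparison, and that "if needed" means the topic itself is blacklisted. The function name `enforce_blacklist` is hypothetical.

```python
from typing import Any


def enforce_blacklist(payload: dict[str, Any], blacklist: set[str]) -> bool:
    """Strip blacklisted terms from one cluster payload, in place.

    Hypothetical helper: the payload shape and the matching rule are
    assumptions, not taken from the actual script.
    Returns True if the payload was modified.
    """
    blocked = {term.casefold() for term in blacklist}
    changed = False

    # Drop matching entries from both list-valued fields.
    for field in ("entities", "keywords"):
        values = payload.get(field) or []
        kept = [v for v in values if str(v).casefold() not in blocked]
        if len(kept) != len(values):
            payload[field] = kept
            changed = True

    # Assumption: the topic is reset only when it is itself blacklisted.
    topic = payload.get("topic")
    if isinstance(topic, str) and topic.casefold() in blocked:
        payload["topic"] = "other"
        changed = True

    return changed
```

Returning a flag lets a caller count how many clusters would change, which is roughly what a `--dry-run` pass would need to report before anything is written back to `news.sqlite`.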