1 week ago · b73e04cd73
--- a/prompts/extract_entities.prompt
+++ b/prompts/extract_entities.prompt
@@ -1,34 +1,33 @@
-Input cluster JSON:
-{cluster_json}
+Extract a news signal from the headline AND summary. Never return empty entities if names appear in the text.
+
+Return STRICT JSON with EXACT keys: { topic, entities, sentiment, sentimentScore, keywords }
 
-You MUST extract a news signal from the headline AND summary. Do not leave entities empty when the text mentions obvious names.
-Task:
-1) infer the best top-level topic
-2) extract concise entities from the cluster
-3) assign sentiment from the wording/context
-4) provide short keywords that justify the classification
+FIELDS:
+- topic: one of [crypto, macro, regulation, ai, other]
+- entities: named people, places, orgs, conflicts, and finance/crypto terms
+  (BTC, ETH, ETF, SEC, ECB, Fed, euro, inflation, rates). Canonical forms. 1–5 words each.
+- sentiment: "positive" | "negative" | "neutral"
+- sentimentScore: float –1.0 to 1.0, consistent with sentiment label
+- keywords: 2–4 thematic tags, 1–2 words each. Noun phrases only (e.g. "drone strikes",
+  "nuclear plant"). Not entity names. Not verb phrases. Not headline fragments.
 
-Entity rules (strict):
-- Use short strings (1-5 words).
-- Include all obvious named entities mentioned in headline or summary: people, countries, regions, organizations, ministries, presidents, leaders, wars/conflicts if named.
-- Also include finance/crypto entities when present: BTC, ETH, Bitcoin, Ethereum, ETF, SEC, ECB, Fed, euro, inflation, rates.
-- Prefer canonical entity forms over aliases when obvious (for example, use full organization or place names where helpful).
-- Do NOT return empty entities if any such names/places appear.
+TASKS:
+1. Infer the best topic.
+2. Extract all named entities from headline and summary.
+3. Assign sentiment from tone and wording.
+4. Choose keywords that capture themes, not entities.
 
-Keyword rules (strict):
-- Each keyword MUST be 1-2 words. Never 3+.
-- Keywords are thematic search tags, NOT headline restatements or verb phrases.
-- Good keywords: noun phrases or named concepts (e.g. "drone strikes", "energy infrastructure", "nuclear plant", "oil refinery").
-- Bad keywords: full headline fragments, verb-heavy phrases, or anything over 2 words.
-- Keywords should capture the *themes* of the story, not repeat entity names already in the entities list.
-- Return 2-4 keywords. Fewer is better than bad ones.
+EXAMPLE:
+Input:  { "headline": "ECB raises rates again as eurozone inflation stays elevated",
+          "summary": "The European Central Bank increased its benchmark rate by 25bps,
+                      citing persistent inflation across the eurozone." }
 
-Sentiment rules:
-- positive: clearly encouraging, improving, or supportive tone
-- negative: clearly alarming, worsening, severe, conflict, loss, risk, warning tone
-- neutral: factual, balanced, or mixed
-- sentimentScore must be a number from -1.0 to 1.0 and should reflect the sentiment label.
+Output: { "topic": "macro",
+          "entities": ["ECB", "European Central Bank", "eurozone", "inflation", "rates"],
+          "sentiment": "negative",
+          "sentimentScore": -0.4,
+          "keywords": ["rate hike", "monetary policy"] }
+
+INPUT:
+{cluster_json}
 
-Return STRICT JSON with EXACT keys only:
-{ topic, entities, sentiment, sentimentScore, keywords }
-where topic is one of [crypto, macro, regulation, ai, other].
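
A minimal sketch of how a caller might check a model reply against the rules stated in the new prompt (exact keys, allowed topics, entity and keyword length limits, score range). Nothing here comes from the repository: the name validate_signal, the error messages, and the reading of "consistent with sentiment label" as a sign check are assumptions made for illustration. The rule that entities must not be empty when names appear cannot be checked without the input text, so it is left out.

```python
import json

REQUIRED_KEYS = {"topic", "entities", "sentiment", "sentimentScore", "keywords"}
TOPICS = {"crypto", "macro", "regulation", "ai", "other"}
SENTIMENTS = {"positive", "negative", "neutral"}


def is_str_list(value) -> bool:
    """True when value is a list whose items are all strings."""
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def validate_signal(raw: str) -> list[str]:
    """Hypothetical helper: return rule violations; an empty list means the reply passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict) or set(data) != REQUIRED_KEYS:
        return [f"expected an object with exactly these keys: {sorted(REQUIRED_KEYS)}"]

    errors = []
    topic, label, score = data["topic"], data["sentiment"], data["sentimentScore"]
    entities, keywords = data["entities"], data["keywords"]

    if not isinstance(topic, str) or topic not in TOPICS:
        errors.append("topic is not one of the allowed values")
    if not isinstance(label, str) or label not in SENTIMENTS:
        errors.append("sentiment is not positive, negative, or neutral")

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)) or not -1.0 <= score <= 1.0:
        errors.append("sentimentScore must be a number from -1.0 to 1.0")
    # Assumption: "consistent with sentiment label" is read as a sign check only.
    elif (label == "positive" and score <= 0) or (label == "negative" and score >= 0):
        errors.append("sentimentScore sign disagrees with sentiment label")

    if not is_str_list(entities):
        errors.append("entities must be a list of strings")
    elif any(not 1 <= len(e.split()) <= 5 for e in entities):
        errors.append("each entity must be 1-5 words")

    if not is_str_list(keywords):
        errors.append("keywords must be a list of strings")
    else:
        if not 2 <= len(keywords) <= 4:
            errors.append("expected 2-4 keywords")
        if any(not 1 <= len(k.split()) <= 2 for k in keywords):
            errors.append("each keyword must be 1-2 words")
        # Keywords should be themes, not repeats of entity names.
        if is_str_list(entities):
            names = {e.lower() for e in entities}
            if any(k.lower() in names for k in keywords):
                errors.append("keywords must not repeat entity names")
    return errors
```

Passing the Output object from the EXAMPLE block of the new prompt through validate_signal returns an empty list. Returning every violation rather than raising on the first one is a design choice for this sketch; it makes it easier to log why a reply was rejected.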