- Atlas Coarse Types for LLM Extraction
- ======================================
- Use these 12 types when prompting a small, inexpensive LLM to suggest an entity type.
- The suggested type is only a hint that the entity resolver uses to rank candidates;
- it is not a final classification. Pass 2 (Wikidata QID lookup) promotes the hint to
- the fine-grained subtype from the full ontology.
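One way to read "hint for candidate ranking" is as a soft score boost rather than a filter. The sketch below assumes that reading. The `Candidate` fields, the `boost` value, and the placeholder IDs are hypothetical and not taken from the resolver itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    qid: str           # placeholder ID, not a real Wikidata QID
    label: str
    coarse_type: str   # coarse type the candidate maps to
    name_score: float  # hypothetical string-match score in [0, 1]


def rank_candidates(candidates: List[Candidate], hinted_type: str,
                    boost: float = 0.2) -> List[Candidate]:
    """Sort candidates by name score, nudging those that match the hint.

    The hint only adds a bonus; candidates of other types stay in the
    list, so a wrong hint can still be overruled by a strong name match.
    """
    def score(c: Candidate) -> float:
        return c.name_score + (boost if c.coarse_type == hinted_type else 0.0)

    return sorted(candidates, key=score, reverse=True)


candidates = [
    Candidate("Q_city", "Paris (city)", "Location", 0.80),
    Candidate("Q_person", "Paris Hilton", "Person", 0.75),
]
ranked_as_person = rank_candidates(candidates, "Person")
ranked_as_location = rank_candidates(candidates, "Location")
```

With the hint `Person`, the person candidate moves ahead of the city. With `Location`, or with an unhelpful hint such as `Other`, the original name-score order is kept.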
- COARSE TYPES
- ------------
- Person
- Organization
- Location
- CreativeWork
- Event
- Product
- FinancialInstrument
- Animal
- Disease
- Building
- FictionalCharacter
- Other
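As a minimal sketch of how the list above might be used in a prompt, the helper below builds a single-label prompt and normalizes the model's reply. The prompt wording, the function names, and the choice to fall back to `Other` on an unrecognized reply are assumptions made for illustration.

```python
COARSE_TYPES = [
    "Person", "Organization", "Location", "CreativeWork", "Event", "Product",
    "FinancialInstrument", "Animal", "Disease", "Building",
    "FictionalCharacter", "Other",
]


def build_type_prompt(mention: str, context: str) -> str:
    """Build a single-label prompt that lists the 12 coarse types.

    The wording is hypothetical; adapt it to the model in use.
    """
    options = ", ".join(COARSE_TYPES)
    return (
        f"Entity mention: {mention}\n"
        f"Context: {context}\n"
        f"Answer with exactly one of: {options}"
    )


def normalize_coarse_type(raw: str) -> str:
    """Map a raw model reply onto a coarse type.

    Matching is case-insensitive and ignores surrounding whitespace,
    quotes, and a trailing period. Anything unrecognized becomes "Other"
    (an assumption: the type is only a hint, so a safe default suffices).
    """
    cleaned = raw.strip().strip(".\"'").lower()
    for coarse in COARSE_TYPES:
        if coarse.lower() == cleaned:
            return coarse
    return "Other"
```

For example, `normalize_coarse_type(" person. ")` returns `"Person"`, and a reply outside the list such as `"Spaceship"` returns `"Other"`.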
- PASS 2 PROMOTION MAP
- --------------------
- Person              -> Person
- Organization        -> Organization
-                        PoliticalParty
-                        MilitaryUnit
-                        MediaOrganization
- Location            -> Location
-                        Continent
-                        Country
-                        Region
-                        PopulatedPlace
-                        Neighbourhood
-                        NaturalFeature
-                        AdministrativeArea
- CreativeWork        -> CreativeWork
-                        Film
-                        Book
-                        MusicAlbum
-                        TVSeries
-                        VideoGame
- Event               -> Event
- Product             -> Product
-                        Drug
-                        Food
- FinancialInstrument -> FinancialInstrument
-                        PublicCompany
-                        StockIndex
-                        Commodity
-                        Cryptocurrency
-                        Currency
- Animal              -> Animal
- Disease             -> Disease
- Building            -> Building
- FictionalCharacter  -> FictionalCharacter
- Other               -> Other
-                        Award
-                        Sport
-                        EthnicGroup
-                        Concept
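The promotion map above can be written as a plain dictionary, with an inverse index for checking whether a hint agreed with the promoted type. This is a sketch under assumptions: `promote` and `hint_agrees` are hypothetical names, and letting the pass 2 type override a disagreeing hint is an interpretation of "not a final classification", not confirmed behavior.

```python
from typing import Optional

# Coarse type -> fine-grained types it may be promoted to (from the map above).
PROMOTION_MAP = {
    "Person": ["Person"],
    "Organization": ["Organization", "PoliticalParty", "MilitaryUnit",
                     "MediaOrganization"],
    "Location": ["Location", "Continent", "Country", "Region",
                 "PopulatedPlace", "Neighbourhood", "NaturalFeature",
                 "AdministrativeArea"],
    "CreativeWork": ["CreativeWork", "Film", "Book", "MusicAlbum",
                     "TVSeries", "VideoGame"],
    "Event": ["Event"],
    "Product": ["Product", "Drug", "Food"],
    "FinancialInstrument": ["FinancialInstrument", "PublicCompany",
                            "StockIndex", "Commodity", "Cryptocurrency",
                            "Currency"],
    "Animal": ["Animal"],
    "Disease": ["Disease"],
    "Building": ["Building"],
    "FictionalCharacter": ["FictionalCharacter"],
    "Other": ["Other", "Award", "Sport", "EthnicGroup", "Concept"],
}

# Inverse index: fine-grained type -> coarse type.
FINE_TO_COARSE = {
    fine: coarse for coarse, fines in PROMOTION_MAP.items() for fine in fines
}


def promote(coarse_hint: str, qid_type: Optional[str]) -> str:
    """Return the final type after pass 2.

    A known fine-grained type from the QID lookup wins, even when it
    disagrees with the hint (assumption). Without one, keep the hint,
    or "Other" if the hint itself is not a known coarse type.
    """
    if qid_type in FINE_TO_COARSE:
        return qid_type
    return coarse_hint if coarse_hint in PROMOTION_MAP else "Other"


def hint_agrees(coarse_hint: str, fine_type: str) -> bool:
    """True if the promoted fine type falls under the hinted coarse type."""
    return FINE_TO_COARSE.get(fine_type) == coarse_hint
```

Here `promote("Other", "Award")` yields `"Award"`, while `promote("Location", None)` keeps `"Location"`. `hint_agrees` could be used to log cases where the small model's hint and the pass 2 type fall under different coarse types.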
- NOTES
- -----
- - Animal and Disease are kept separate because confusing them with Product
-   or Concept causes hard resolution failures.
- - Building is kept separate because landmarks (Eiffel Tower, White House)
-   resolve very differently from cities or countries.
- - FictionalCharacter is kept separate because confusing a fictional entity
-   with a real person is a hard failure, not a soft one.
- - Award, Sport, EthnicGroup and Concept fall into Other at the coarse level.
-   A small model will mis-classify these anyway; the QID lookup in pass 2
-   recovers the correct fine-grained type reliably.