TextChart Entity Types Reference

TextChart entity types are defined in the default TokenDefs.xml files under rsk-api-core/LxBase/conf/. This reference reflects those default settings.

Output Entity Types

These entities are defined in the entities section of TokenDefs.xml and are emitted by default:

PERSON - Named person entities.
ORG - Named organizations.
PLACE - Geographic places.
FACILITY - Facilities (buildings, installations, infrastructure).
ADDRESS - Postal addresses.
GEOCOORDINATE - Coordinates for geospatial mapping.
CONVEYANCE - Vehicles and other means of conveyance.
CRIME - Criminal offenses.
TIMESTAMP - Units of time greater than 24 hours (dates, date-like references).
TIMESPAN - Ranges of time or date ranges.
PRODUCT - Commercial product names.
IDNUM - Identification numbers (serial, SIM, ICCID, IBAN, etc.).
EVENT - Named or generic events.
PUBLICATION - Publications, film titles, awards.
WEAPON - Weapons.
DRUG - Drugs (legal and illicit).
EMAIL - Email addresses and social media account names.
SOCIAL - Social media usernames (e.g., handles).
PHONE - Phone numbers.
URL - Web addresses.
MONEY - Numeric amounts of money.
IMPLEMENT - Implements and instruments.
PUNITIVE_MEASURE - Punitive measures taken against an entity.
KEYWORD - Specific keywords or coded speech.
TRAUMA - Injuries or trauma descriptions.
POI - Person of interest (used in short or informal text sources).
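
When consuming extraction results downstream, it can be handy to keep the default output types as a constant and flag anything outside that set. The sketch below is illustrative only: the type names are copied from the list above, but the record shape (a dict with a "type" key), the sample records, and the helper name count_by_type are assumptions, not part of the TextChart API.

```python
from collections import Counter

# Default output entity types, copied from the list above.
DEFAULT_OUTPUT_TYPES = frozenset({
    "PERSON", "ORG", "PLACE", "FACILITY", "ADDRESS", "GEOCOORDINATE",
    "CONVEYANCE", "CRIME", "TIMESTAMP", "TIMESPAN", "PRODUCT", "IDNUM",
    "EVENT", "PUBLICATION", "WEAPON", "DRUG", "EMAIL", "SOCIAL", "PHONE",
    "URL", "MONEY", "IMPLEMENT", "PUNITIVE_MEASURE", "KEYWORD", "TRAUMA",
    "POI",
})


def count_by_type(entities):
    """Count entities per type, keeping non-default types in a separate tally.

    `entities` is a hypothetical list of dicts with a "type" key.
    """
    known = Counter()
    unexpected = Counter()
    for entity in entities:
        target = known if entity["type"] in DEFAULT_OUTPUT_TYPES else unexpected
        target[entity["type"]] += 1
    return known, unexpected


# Hypothetical sample records, for illustration only.
sample = [
    {"type": "PERSON", "text": "Jane Doe"},
    {"type": "ORG", "text": "Acme Corp"},
    {"type": "PERSON", "text": "John Roe"},
    {"type": "HASHTAG", "text": "#example"},
]
known, unexpected = count_by_type(sample)
```

A non-empty unexpected tally would suggest that the configuration in use differs from the defaults described here, for example because a non-output type has been enabled.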

Non-Output Entity Types

These entities are defined in the NoOutputEntities section of TokenDefs.xml and are not emitted by default. They may still be used for internal matching, and they can be enabled for output through rule customization:

AWARD
BIOMETRIC
CONTRACT_TYPE
CITATION
DNA
GENERIC
ALERT_TYPE
SalientPhrase
CONTROL
DISEASE
FILE
HASHTAG
IDEOLOGY
CHEMICAL
NON_SALIENT_WEB_CONTENT
MEASURE
MEDICAL_PROCEDURE
MISC
SCORE
PERCENT
NATIONALITY
TRANSIT
PROGRAM
PROFESSION
POLITICAL_AFFILIATION
GENE
RATING
QUOTE
USER_AGENT
FINANCIAL_INDEX
TICKER_SYMBOL
INFRASTRUCTURE
ANATOMICAL_TERM
FUNDS
CRYPTO
CLASSIFICATION_LEVEL
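
To check which types a given configuration emits, the two sections can be read programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The XML layout shown (an Entities element and a NoOutputEntities element, each holding Entity children with a name attribute) is an assumption made for illustration, as is the helper name list_entity_types. Inspect your actual TokenDefs.xml and adjust the element and attribute names to match.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt; the real TokenDefs.xml layout may differ.
SAMPLE_TOKENDEFS = """
<TokenDefs>
  <Entities>
    <Entity name="PERSON"/>
    <Entity name="ORG"/>
  </Entities>
  <NoOutputEntities>
    <Entity name="HASHTAG"/>
    <Entity name="NATIONALITY"/>
  </NoOutputEntities>
</TokenDefs>
"""


def list_entity_types(root, section):
    """Return entity names under the given section element (assumed layout)."""
    node = root.find(section)
    if node is None:
        # Section not present in this file.
        return []
    return [entity.get("name") for entity in node.findall("Entity")]


root = ET.fromstring(SAMPLE_TOKENDEFS)
output_types = list_entity_types(root, "Entities")
no_output_types = list_entity_types(root, "NoOutputEntities")
print("output:", output_types)
print("not output:", no_output_types)
```

To read a file on disk instead of the inline sample, replace the ET.fromstring call with ET.parse(path).getroot(), where path points at your copy of TokenDefs.xml.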

Attributes and Subtypes

Attributes and subtypes are defined per-entity in TokenDefs.xml. The list is extensive and varies by entity type. For the authoritative set of attributes and descriptions, consult:

rsk-api-core/LxBase/conf/TokenDefs.xml
rsk-api-core/LxBase/conf/TokenDefs_core.xml

Customizing Entity Extraction

You can customize which entity types are extracted and how they are identified by:

Modifying the LxBase: Edit linguistic rules to add or remove entity type extractions.
Creating Custom Rules: Define custom patterns specific to your domain.

See sdk-lxbase.md for instructions on customizing entity extraction.

Entity Type Relationships

Relationship extraction depends on rule configuration and is not hard-coded to specific entity pairs. If you need specific relationships (e.g., PERSON → ORG), add or adjust rules in the LxBase.