Curating extraction results

After i2 TextChart presents the results of processing a document in the Document View, you can review its output and make changes to the results. TextChart supports removing and modifying the records it finds, as well as identifying new records of your own.

Removing entities

If TextChart identifies an entity that you don't want to appear in the results, you can remove it from the Document View. For example, the following view contains a contrived entity extraction result for an "online resource", which has the Generic entity type.

Document View with result to remove

When you select the text in the Document View, TextChart displays information about the extracted entity. To remove this result, click Remove Entity.

Adding entities

If TextChart failed to identify an entity that you do want to appear in the results, you can highlight the text to be extracted, and then select Tag > Tag Entity.

Tag Entity command

TextChart displays the Tag Entity Tool for you to provide information about the entity, and populates the Original field with the text that you highlighted.

Tag Entity Tool adding

TextChart also populates the Norm field with the same text, but you can change that text to normalize it when a document uses different terminology to refer to the same piece of information.

For example, in the image below, "England, GB" and "United Kingdom, GB" have been identified as separate places.

Normalization candidates

If you decide that TextChart should treat these instances as the same place, you can edit the Norm field of one so that it matches the other.

Editing a normalized value
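The effect of matching Norm values can be illustrated with a minimal sketch. This is not TextChart's API or internal data model: the `Entity` class, its field names, and the `group_by_norm` helper are hypothetical, introduced only to show why two records with the same type and Norm value end up being treated as one item.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for an extracted record (not a TextChart type)."""
    original: str      # text exactly as it appears in the document
    norm: str          # normalized value, initially the same as the original
    entity_type: str   # for example "Place"


def group_by_norm(entities):
    """Group records so that those sharing a type and Norm value count as one."""
    groups = defaultdict(list)
    for entity in entities:
        groups[(entity.entity_type, entity.norm)].append(entity.original)
    return dict(groups)


england = Entity("England, GB", "England, GB", "Place")
uk = Entity("United Kingdom, GB", "United Kingdom, GB", "Place")

# With different Norm values, the two mentions are separate places.
before = group_by_norm([england, uk])

# Editing the Norm field of one record to match the other merges them.
england.norm = uk.norm
after = group_by_norm([england, uk])
```

In this sketch the Original text of each mention is left untouched; only the Norm value changes, which is what allows both spellings to resolve to a single place.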

The Gloss field, where you can provide an English gloss for the entity, can be helpful when you're processing documents in a language other than English and you come across an important term that TextChart does not understand.

Finally, you must choose a type for your new entity. Different entity types have different attributes, and you can fill in additional information as you see fit.

Selecting an entity type

Editing entities

If TextChart identified an entity successfully, but the extracted result isn't exactly how you want it, you can right-click the highlighted text in the Document View and select Edit Entity.

Edit extracted entity

The behavior of the Tag Entity Tool is the same when you're editing entities as it is when you're creating them. You can edit the Norm field, add an English Gloss, change the entity Type, and modify attribute information.

Tag Entity Tool editing

When you finish editing, the changes you made are reflected in the feature bar on the right of the Document View.

Document View feature bar

After you edit an extracted entity, you must reprocess the other documents in the collection so that they receive the same modifications.