Data in i2 Analyze records

i2 Analyze deployments use i2 Analyze records to realize the entity and link types that schemas define. i2 Analyze records contain the property data for entities and links, plus metadata that enhances the analysis that users can carry out.

An i2 Analyze schema defines the types that determine what i2 Analyze records can represent. Every i2 Analyze record has a type that an i2 Analyze schema defines. If a record has a link type, then it represents a link and is called a link record. If a record has an entity type, then it is an entity record.

This diagram shows how entity and link records compare, and how they are related to each other. It also introduces some other features of the data in i2 Analyze records.

Note: The diagram contains some simplifications:
  • Provenance is about the data sources that contributed to a particular i2 Analyze record. When the property values of an i2 Analyze record represent data from more than one source, that record can have more than one piece of provenance.
  • When an i2 Analyze record has more than one piece of provenance, it can contain all the data from all the contributing sources. In that case, the property values that the record presents are derived from the source data.
  • Metadata includes the following information:
    • Timestamps, which reflect when data in an i2 Analyze record was created or modified
    • Source references, which describe the sources that the data in a record came from
    • Notes, which users can write to add free-form text commentary to a record
    For link records, the metadata also includes information about the strength and direction of the link.

As an example of how to represent a simple entity that contains data from a single source, consider the following information about a person:

Full name Anna Harvey
Date of birth 5/5/74
Hair color Blonde
Eye color Blue

The following diagram shows one way to represent this information as an i2 Analyze record:



Note: An i2 Analyze entity record can contain properties that have any of the property types that the entity type defines. However, one record can contain only one property of each defined type.
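
To make the rule in the note concrete, here is a minimal sketch of the Anna Harvey record as a plain Python data structure. It is not the i2 Analyze API: the class `EntityRecord`, the method `set_property`, the type name `Person`, and the property type names are all hypothetical, and reading "5/5/74" as 5 May 1974 is an assumption. The sketch only illustrates that a record holds at most one property of each type that its entity type defines.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Dict, FrozenSet


@dataclass
class EntityRecord:
    """Hypothetical in-memory model of an entity record (not the i2 Analyze API)."""

    type_id: str                             # entity type that the schema defines
    allowed_property_types: FrozenSet[str]   # property types that the entity type defines
    properties: Dict[str, Any] = field(default_factory=dict)

    def set_property(self, property_type: str, value: Any) -> None:
        # A record can only carry properties whose types its entity type defines.
        if property_type not in self.allowed_property_types:
            raise ValueError(f"{property_type!r} is not defined for {self.type_id!r}")
        # A record can contain only one property of each defined type.
        if property_type in self.properties:
            raise ValueError(f"{property_type!r} is already set on this record")
        self.properties[property_type] = value


# Assumed schema fragment: a "Person" entity type with five property types.
PERSON_PROPERTY_TYPES = frozenset(
    {"Full name", "Date of birth", "Hair color", "Eye color", "Height"}
)

anna = EntityRecord("Person", PERSON_PROPERTY_TYPES)
anna.set_property("Full name", "Anna Harvey")
anna.set_property("Date of birth", date(1974, 5, 5))  # assumes "74" means 1974
anna.set_property("Hair color", "Blonde")
anna.set_property("Eye color", "Blue")
```

In this sketch, the record leaves "Height" unset, which mirrors the point that a record can contain any of the defined property types without having to contain all of them.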
The diagram also shows how the property types in the schema only partially determine the contents of an i2 Analyze record. Some of the other contents are due to the security schema, and others relate to identification:
  • All i2 Analyze records contain security dimension values, which i2 Analyze uses to determine the access level to the record that a particular user has.
  • When they enter the system (through ingestion to the Information Store, or through Analyst's Notebook Premium), i2 Analyze records receive a universally unique record identifier. This identifier is permanent for the lifetime of the record. Any user of the system who has the necessary access level can use the record identifier to refer to a particular record.
  • i2 Analyze records that began life in an external data source contain one or more pieces of provenance. Each piece has a source identifier that references the data for the record in its original source. One record can have provenance from more than one source.
    Note: For records in an Information Store that were loaded through ingestion, source identifiers have the additional feature of being unique within the store. These source identifiers are known as origin identifiers.
  • All i2 Analyze records can contain timestamps in their metadata that specify when source data for the record was created or edited.
  • i2 Analyze link records contain an indication of their direction. i2 Analyze considers links to go 'from' one entity 'to' another. The direction of a link can be with or against that flow, or it can run in both directions or none.
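
The items in the list above can be sketched together as one link record. Everything here is illustrative rather than taken from i2 Analyze itself: `LinkRecord`, `Provenance`, `LinkDirection`, and `describe` are hypothetical names, the dimension name "Security Level" and the source names are invented, and generating the record identifier with `uuid.uuid4()` is only an assumption about what "universally unique" might look like in practice.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class LinkDirection(Enum):
    """The four directions the text describes, relative to the 'from' -> 'to' flow."""
    WITH = "with"        # runs from the 'from' end to the 'to' end
    AGAINST = "against"  # runs from the 'to' end to the 'from' end
    BOTH = "both"
    NONE = "none"


@dataclass(frozen=True)
class Provenance:
    source_name: str         # hypothetical label for the external source
    source_identifier: str   # references the data for the record in that source


@dataclass
class LinkRecord:
    type_id: str
    from_end_id: str
    to_end_id: str
    direction: LinkDirection
    security_dimension_values: Dict[str, List[str]]
    provenance: List[Provenance] = field(default_factory=list)
    # Assigned once when the record is created and never changed afterward.
    record_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def describe(link: LinkRecord) -> str:
    """Render the link ends with an arrow that reflects the direction."""
    arrows = {
        LinkDirection.WITH: "->",
        LinkDirection.AGAINST: "<-",
        LinkDirection.BOTH: "<->",
        LinkDirection.NONE: "--",
    }
    return f"{link.from_end_id} {arrows[link.direction]} {link.to_end_id}"


link = LinkRecord(
    type_id="Associate",
    from_end_id="person-1",
    to_end_id="person-2",
    direction=LinkDirection.AGAINST,
    security_dimension_values={"Security Level": ["Restricted"]},
    provenance=[
        Provenance("crm", "contact/1001"),
        Provenance("hr", "employee/77"),
    ],
)
summary = describe(link)
```

The two `Provenance` entries show one record carrying provenance from more than one source, and the `AGAINST` direction shows a link that is stored 'from' person-1 'to' person-2 but runs the other way.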
When i2 Analyze records are stored in an Information Store, they contain a few extra pieces of data:
  • All i2 Analyze records retain timestamps in their metadata for when they were first created or uploaded to the Information Store, for when they were most recently uploaded, and for when they were last updated.
  • All i2 Analyze records can contain a correlation identifier. If two records have the same correlation identifier, the platform considers that they represent the same real-world object and might merge them together.
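
As a rough illustration of how a shared correlation identifier marks records as candidates for merging, the following sketch groups incoming rows by that identifier. The function `merge_candidates`, the row keys, and the identifier values are hypothetical, and the sketch says nothing about how the Information Store actually performs or decides on a merge.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def merge_candidates(
    rows: List[Tuple[str, Optional[str]]]
) -> Dict[str, List[str]]:
    """Group row keys by correlation identifier.

    Each row is a (row_key, correlation_id) pair. Rows without a correlation
    identifier are never grouped, and only identifiers shared by two or more
    rows are returned.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for row_key, correlation_id in rows:
        if correlation_id is not None:
            groups[correlation_id].append(row_key)
    return {cid: keys for cid, keys in groups.items() if len(keys) > 1}


# Hypothetical rows from two sources that describe the same person.
rows = [
    ("crm:1001", "person-anna-harvey"),
    ("hr:77", "person-anna-harvey"),
    ("crm:1002", "person-ben-cole"),
    ("crm:1003", None),  # no correlation identifier, so never a candidate
]
candidates = merge_candidates(rows)
```

Here only the two rows that share "person-anna-harvey" are reported as candidates; the other rows stay separate.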

Your data sources are likely to contain some, but not all, of the data that i2 Analyze records require. To enable an Information Store to ingest your data, or to develop a connector for the i2 Connect gateway, or to write an import specification, you must provide the extra information to i2 Analyze.