Define how property values of merged records are calculated
In the Information Store, each property of an i2 Analyze record can have only one value. When multiple pieces of source data contribute to an i2 Analyze record, the system must calculate a single value for each property type.
Intended audience
This information is intended for database administrators who are experienced in SQL. To define the property values of merged records, you must write complex SQL view definition statements.
Important: You must write and test the SQL view statements for defining the property values of merged records in a non-production deployment of i2 Analyze. If you create an incorrect view, you might have to clear all the data from the system. Before you implement your view definitions in a production system, you must complete extensive testing in your development and test environments.
Why define the property values
When a record contains more than one piece of provenance, it is a merged record. The properties for an i2 Analyze record are calculated when a merge or unmerge operation occurs, or when provenance is removed from a record.
By default, all of the property values for a record come from the contributing source data that has the most recent source-last-updated-time. If no source data has a source-last-updated-time, the system orders the source data by the origin identifier keys that are associated with the record, in ascending order, and uses the source data that is last in that order.
If the default source-last-updated-time behavior does not match the requirements of your deployment, you can define how the property values are calculated for merged i2 Analyze records. You might define your own rules when multiple data sources contain values for different properties of an item type, or when one data source is more reliable for a particular item or property type.
To demonstrate when it is useful to define how to calculate the property values for merged records, imagine that the following two pieces of source data contributed to an i2 Analyze record of type Person:
Origin identifier | Correlation identifier | Ingestion source name | Source last updated | First given name
---|---|---|---|---
DVLA1234 | II1 | DVLA | 12:20:22 09/10/2018 | John |
PNC5678 | II1 | PNC | 14:10:43 09/10/2018 | Jon |
With the default behavior, the property values come from the source data that has the most recent value in the Source last updated column. In this example, that is the row with the ingestion source name PNC, so the merged i2 Analyze record gets the value Jon for the first given name property.
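The default selection can be modeled in a few lines of Python. This is a minimal sketch for illustration only, not how i2 Analyze implements the behavior. The `SourceData` type, the `pick_default` and `parse_time` functions, and the day/month/year reading of the timestamps are all assumptions made for this example. The sketch also assumes that source data without a timestamp is ignored when at least one piece of source data has one, which the description above does not specify.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class SourceData(NamedTuple):
    """One piece of source data that contributed to a record (hypothetical model)."""
    origin_id: str
    ingestion_source: str
    last_updated: Optional[datetime]
    first_given_name: str


def parse_time(text: str) -> datetime:
    # Assumption: timestamps are time-of-day followed by day/month/year.
    return datetime.strptime(text, "%H:%M:%S %d/%m/%Y")


def pick_default(rows: list[SourceData]) -> SourceData:
    """Pick the contributing row whose property values the record would use."""
    dated = [row for row in rows if row.last_updated is not None]
    if dated:
        # The most recent source-last-updated-time wins.
        return max(dated, key=lambda row: row.last_updated)
    # No timestamps at all: sort ascending by origin identifier, take the last.
    return sorted(rows, key=lambda row: row.origin_id)[-1]


rows = [
    SourceData("DVLA1234", "DVLA", parse_time("12:20:22 09/10/2018"), "John"),
    SourceData("PNC5678", "PNC", parse_time("14:10:43 09/10/2018"), "Jon"),
]
winner = pick_default(rows)
print(winner.ingestion_source, winner.first_given_name)  # PNC Jon
```

Because both rows carry a timestamp, the origin identifier ordering is never consulted here. It applies only when every contributing row lacks a source-last-updated-time.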
If you know that data from the DVLA ingestion source is more reliable for this item type, you can define a rule in which source data whose ingestion source name is DVLA takes precedence. With this definition, the i2 Analyze record gets the value John for the first given name property.
After you define this rule for the Person entity type, all future updates to records of this item type take the property values from the DVLA ingestion source whenever source data from that ingestion source contributed to the merged i2 Analyze record.
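The effect of such a precedence rule can be sketched in Python as follows. This is an illustrative model only. In a real deployment the rule is expressed through SQL view definitions, and the names used here (`PREFERRED_SOURCE`, `pick_with_precedence`, the dictionary keys) are hypothetical. The fallback to the most recent row when no DVLA data is present, and the day/month/year reading of the dates, are assumptions for this example.

```python
from datetime import datetime

# Hypothetical setting for this sketch: the ingestion source to trust first.
PREFERRED_SOURCE = "DVLA"

source_data = [
    {"origin_id": "DVLA1234", "ingestion_source": "DVLA",
     "last_updated": datetime(2018, 10, 9, 12, 20, 22), "first_given_name": "John"},
    {"origin_id": "PNC5678", "ingestion_source": "PNC",
     "last_updated": datetime(2018, 10, 9, 14, 10, 43), "first_given_name": "Jon"},
]


def pick_with_precedence(rows, preferred=PREFERRED_SOURCE):
    """Prefer rows from the trusted source; otherwise fall back to the latest row."""
    trusted = [row for row in rows if row["ingestion_source"] == preferred]
    # An empty list is falsy, so all rows are considered when none is trusted.
    candidates = trusted or rows
    return max(candidates, key=lambda row: row["last_updated"])


print(pick_with_precedence(source_data)["first_given_name"])  # John
```

Even though the PNC row is more recent, the DVLA row is chosen because the rule filters on ingestion source before it compares timestamps.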
For more information about how to enable this function and create your own rules, see Defining the property values of merged i2 Analyze records.