Ingesting example correlation data

The correlation example provides sets of data that were passed through a matching engine so that each row of data in the set is associated with a correlation identifier. You can ingest the example data, and then inspect the items in the Information Store from Analyst's Notebook Premium to demonstrate the correlation behavior.

About this task

Use the ingestExampleData toolkit task to ingest example data sets that demonstrate the correlation behavior in i2 Analyze. For more information about the operations that occur, and how the i2 Analyze records are affected, see Correlation operations.
law-enforcement-data-set-2
This data set contains data with correlation identifiers. After you ingest this data set, the Information Store database contains i2 Analyze records with correlation identifiers. No correlation operations occur when you ingest this data set.
law-enforcement-data-set-2-merge
This data set contains data that causes a number of merge operations to occur with some of the i2 Analyze records that were created from the first data set.
law-enforcement-data-set-2-unmerge
This data set contains data that causes a number of unmerge operations to occur with some of the i2 Analyze records that were merged from the second data set.
You can find the example data sets in the toolkit\examples\data directory.

Procedure

  1. In a command prompt, navigate to the toolkit\scripts directory.
  2. Ingest the first example data set.
    1. Run the following command to ingest the first example data set:
      setup -t ingestExampleData -e law-enforcement-data-set-2

      The examples\data\law-enforcement-data-set-2 directory contains a set of CSV files with data that was passed through a matching engine and processed to meet the requirements of the staging tables. Each row of data contains a correlation identifier type and a unique key value.

      The Information Store now contains i2 Analyze records with correlation identifiers. However, no correlation operations occurred during the ingestion.
    2. In Analyst's Notebook Premium, search for "Julia Yochum" and add the returned entity to the chart.
  3. Ingest the law-enforcement-data-set-2-merge data to demonstrate merge operations.
    1. Run the following command to ingest the second example data set:
      setup -t ingestExampleData -e law-enforcement-data-set-2-merge
      The examples\data\law-enforcement-data-set-2-merge directory contains a set of CSV files with data that was passed through a matching engine and processed to meet the requirements of the staging tables. Each row of data contains a correlation identifier type and key value, and a unique value for source_id, which represents the origin identifier.

      In this scenario, the matching engine identified that some of the data represents the same real-world objects as data in the first data set. As part of this process, the correlation identifiers match with some of the existing data in the database, but the origin identifiers are different. These matches cause a number of merge operations to occur during the ingestion.

      For example, on line 2 of the person.csv file, the correlation identifier key is person_0, which matches the correlation identifier of the record for the person "Julia Yochum". This match causes a merge between the existing record and the incoming row of data. As part of the merge, the property values from the incoming row are used for the record, which changes the full name to "Julie Yocham". This is an example of the scenario that is described in Figure 1.

      In the ingestion reports, you can see the number and type of correlation operations that occurred during the ingestion. For more information about understanding the ingestion reports, see Understanding ingestion reports.

    2. In Analyst's Notebook Premium, select the chart item that represents Julia Yochum and click Get changes.
      The name changes to "Julie Yocham" because of the merge operation described previously.
  4. Ingest the law-enforcement-data-set-2-unmerge data to demonstrate unmerge operations.
    1. Run the following command to ingest the third example data set:
      setup -t ingestExampleData -e law-enforcement-data-set-2-unmerge
      The examples\data\law-enforcement-data-set-2-unmerge directory contains a set of CSV files with data that was passed through a matching engine and processed to meet the requirements of the staging tables.

      In this scenario, the matching engine identified that some of the data no longer represents the same real-world objects as it did previously. As part of this process, the correlation identifiers of previously merged data are changed. These changes cause a number of unmerge operations to occur during the ingestion.

      In this example, the correlation identifier of the ingested data changed, but the origin identifier remained the same. For example, on line 2 of the person.csv file, the correlation identifier key is now person_1101 for origin identifier PER:GEN\1101. Previously, this key was person_0. The change causes an unmerge operation on the record that the data is currently associated with. This is an example of the scenario that is described in Figure 1.

      You can see the number of unmerge operations that occurred in the ingestion reports.

    2. In Analyst's Notebook Premium, search for "Julie Yocham".
      There are now two entities for Julie Yocham.
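
The way the three data sets interact can be illustrated with a small model. The following Python sketch is not how the Information Store is implemented. It is a toy that only tracks which origin identifiers contribute to which correlation identifier key, and it ignores property values entirely. The ToyStore class, its ingest method, and the origin identifier ORIGIN-A are hypothetical names introduced for illustration. The keys person_0 and person_1101 and the origin identifier PER:GEN\1101 come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Toy stand-in for a record store (hypothetical, for illustration only)."""

    # correlation identifier key -> origin identifiers that contribute to that record
    records: dict = field(default_factory=dict)
    # origin identifier -> correlation identifier key it was last ingested with
    origin_to_key: dict = field(default_factory=dict)

    def ingest(self, origin_id: str, correlation_key: str) -> list:
        """Ingest one row and return the operations that this model triggers."""
        operations = []
        previous_key = self.origin_to_key.get(origin_id)

        if previous_key is not None and previous_key != correlation_key:
            # Same origin identifier, different correlation identifier:
            # the row leaves the record that it used to contribute to.
            old_contributors = self.records[previous_key]
            old_contributors.discard(origin_id)
            if old_contributors:
                operations.append("unmerge")
            else:
                del self.records[previous_key]

        contributors = self.records.setdefault(correlation_key, set())
        if contributors and origin_id not in contributors:
            # Matching correlation identifier, different origin identifier.
            operations.append("merge")
        contributors.add(origin_id)
        self.origin_to_key[origin_id] = correlation_key
        return operations


store = ToyStore()
# First data set: a new correlation identifier, so no correlation operation.
first = store.ingest("ORIGIN-A", "person_0")  # ORIGIN-A is a made-up origin identifier
# Second data set: a different origin identifier with the same correlation key.
second = store.ingest(r"PER:GEN\1101", "person_0")
# Third data set: the same origin identifier with a changed correlation key.
third = store.ingest(r"PER:GEN\1101", "person_1101")

print(first, second, third)
print(len(store.records), "records in the toy store")
```

In this model, the first row triggers no operation, the second triggers a merge, and the third triggers an unmerge that leaves two separate records, which mirrors the sequence in the procedure.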

What to do next

After you investigate the correlation behavior, you can clear the example data from your system. For more information, see Clearing data from the system.