Understanding ingestion reports

Every attempt to add, update, or delete data in the Information Store through the deployment or ETL toolkit adds rows to the IS_Public.Ingestion_Deletion_Reports view. You can use the contents of this view to track the history of all such operations, and to examine the impact of a particular operation.

About this task

Each time that you run a command that might change the contents of the Information Store, you create a job in the database. Each job acts on one or more batches of i2 Analyze records. There is always one batch per item type that the command affects, but there can also be several batches for the same type if the number of affected records is large.

For example, consider a command that processes updates for deleted Person entity data. The first batch in the resulting job is for Person records, and there might be more such batches if there are many records to be deleted. If the Person data has links, then the job has further batches for each type of link that might get deleted as a result of the entity deletion.

The IS_Public.Ingestion_Deletion_Reports view contains information about every batch from every toolkit operation that creates, updates, or deletes data in the Information Store. When you query the view, include ORDER BY job_id to keep the entries for the same job together.

Note: Deletion-by-rule operations also result in job and batch creation, and view population, according to the same rules. For more information, see the Deletion Guide.
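For example, you might review recent activity with a query along the following lines. This is a sketch that uses only the view and column names described in this topic; the exact syntax depends on your database platform:

```sql
-- List every batch, most recent jobs first.
-- Batches of the same job share a job_id, so ordering by it
-- keeps the entries for each job together.
SELECT job_id,
       ingestion_mode,
       batch_item_type,
       status
FROM   IS_Public.Ingestion_Deletion_Reports
ORDER BY job_id DESC, batch_start_time;
```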

The first few columns in the view have the same value for all batches within a job:

label
    The value that you passed in the importLabel parameter of a toolkit command, the value that a deletion-by-rule operation generates, or null.

job_id
    The server-assigned identifier for this ingestion or deletion job. If the job originated from a deletion-by-rule operation, this identifier is also a cross-reference to the Deletion_By_Rule_Log view.

ingestion_mode
    The value that you passed in the importMode parameter, or Delete for all deletion-by-rule operations.

validation_mode
    A description of how the job was configured to react to errors during the operation.

error_threshold
    The threshold that applies to some of the validation modes.

primary_item_type
    The i2 Analyze schema ID of the item type that was specified at job creation.

primary_record_count
    The number of records of the primary item type that the job affected. (Deleting entity data can affect link records too.)

start_time
    The start time of the job as a whole.

end_time
    The end time of the job as a whole.

The remaining columns can have different values for different batches of records:

batch_item_type
    The i2 Analyze schema ID of the item type that was acted on in this batch. For at least one batch, batch_item_type is the same as primary_item_type.

batch_start_time
    The start time of this batch, which is always later than the start time of the job.

batch_end_time
    The end time of this batch, which is always earlier than the end time of the job.

insert_count
    The number of rows of data from this batch that were inserted into the Information Store, resulting in new i2 Analyze records.

update_count
    The number of rows of data from this batch that updated existing records in the Information Store.

merge_count
    The number of merge operations that this batch caused in the Information Store.

unmerge_count
    The number of unmerge operations that this batch caused in the Information Store.

delete_count
    The number of pieces of provenance that were deleted from the Information Store as a result of this batch.

delete_record_count
    The number of records that were deleted from the Information Store as a result of this batch.

reject_count
    The number of rows from this batch that were rejected during processing because they were invalid.

status
    An indicator of the result of this batch, from success (all rows processed correctly) through partial success to failure (no rows processed).

reject_view
    The full name of the view that contains details of any rejected rows.

stack_trace
    The stack trace, if i2 Analyze generated one as a result of errors during ingestion or deletion.
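Taken together, these columns make it easy to find problem batches after an operation. For instance, a query like the following (a sketch; adapt the syntax to your database platform) lists every batch that rejected rows, together with the view that holds the details:

```sql
-- Find batches that rejected rows, and where to look for details.
-- The reject_view column names the view that describes each
-- rejected row from that batch.
SELECT job_id,
       batch_item_type,
       reject_count,
       status,
       reject_view
FROM   IS_Public.Ingestion_Deletion_Reports
WHERE  reject_count > 0
ORDER BY job_id;
```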

Examples

Ingest example

The (abbreviated) report for successful ingestion operations might look like this:

job_id                 1                       2
ingestion_mode         Standard                Standard
primary_item_type      ET10                    ET4
primary_record_count   62                      8
batch_item_type        ET10                    ET4
batch_start_time       2017-11-30 15:27:06.76  2017-11-30 15:27:09.45
batch_end_time         2017-11-30 15:27:09.87  2017-11-30 15:27:09.63
insert_count           57                      7
update_count           0                       0
merge_count            5                       1
unmerge_count          0                       6
delete_count           0                       0
delete_record_count    0                       0
reject_count           0                       0
status                 Succeeded               Succeeded

In this example, two commands to ingest entity records resulted in the creation of two jobs. Each job demonstrates different behavior that is possible during ingestion, including correlation operations:

  • JOB_ID 1

    This job demonstrates what the ingestion report can look like when data in the staging table causes merge operations. In this example, five merge operations are completed on the incoming rows of data, as shown in the merge_count column. As a result, the 62 rows of data (primary_record_count) produce 57 new i2 Analyze records (insert_count); the other five rows were merged with existing i2 Analyze records in the Information Store.

  • JOB_ID 2

    This job demonstrates what the ingestion report can look like when the data in the staging table causes unmerge and merge operations. In this example, six unmerge operations are completed on the incoming rows of data, as shown in the unmerge_count column, and one merge operation is completed, as shown in the merge_count column. As a result, seven i2 Analyze records are created from eight rows of data, as shown in the insert_count and primary_record_count columns. The primary_record_count value does not include the unmerge_count.

Delete example

The (abbreviated) report for a successful delete operation might look like this:

                      Batch 1                 Batch 2                 Batch 3                 Batch 4                 Batch 5
job_id                26                      26                      26                      26                      26
ingestion_mode        Delete                  Delete                  Delete                  Delete                  Delete
primary_item_type     ET5                     ET5                     ET5                     ET5                     ET5
primary_record_count  324                     324                     324                     324                     324
batch_item_type       ET5                     LAC1                    LAS1                    LEM1                    LIN1
batch_start_time      2017-11-30 15:27:06.76  2017-11-30 15:27:08.60  2017-11-30 15:27:08.60  2017-11-30 15:27:09.43  2017-11-30 15:27:09.45
batch_end_time        2017-11-30 15:27:09.87  2017-11-30 15:27:09.30  2017-11-30 15:27:09.29  2017-11-30 15:27:09.62  2017-11-30 15:27:09.63
insert_count          0                       0                       0                       0                       0
update_count          0                       0                       0                       0                       0
merge_count           0                       0                       0                       0                       0
unmerge_count         0                       0                       0                       0                       0
delete_count          324                     187                     27                      54                      33
delete_record_count   320                     187                     27                      54                      33
reject_count          0                       0                       0                       0                       0
status                Succeeded               Succeeded               Succeeded               Succeeded               Succeeded

In this example, a command to update the Information Store for deleted entity data (with item type ET5) resulted in the creation of a job with five batches. The first few columns of the Ingestion_Deletion_Reports view contain the same values for all batches in the same job. Later columns reveal how deleting entity records results in the deletion of connected link records (with item types LAC1, LAS1, LEM1, LIN1).

For Batch 1, the delete_record_count value (320) is less than the delete_count value (324). This is because some of the provenance to be deleted was associated with i2 Analyze records that had more than one piece of provenance. An i2 Analyze record is deleted only when the last piece of associated provenance is deleted.
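To spot batches where shared provenance affected the outcome in this way, you can compare the two counts directly. A sketch (adapt the syntax to your database platform):

```sql
-- Batches where fewer records than pieces of provenance were deleted.
-- A difference means that some affected records had more than one
-- piece of provenance, so they survived the batch.
SELECT job_id,
       batch_item_type,
       delete_count,
       delete_record_count
FROM   IS_Public.Ingestion_Deletion_Reports
WHERE  delete_count > delete_record_count
ORDER BY job_id;
```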