Understanding ingestion reports
Every attempt to add, update, or delete data in the Information Store through the deployment or ETL toolkit adds rows to the IS_Public.Ingestion_Deletion_Reports view. You can use the contents of this view to track the history of all such operations, and to examine the impact of a particular operation.
About this task
Each time that you run a command that might change the contents of the Information Store, you create a job in the database. Each job acts on one or more batches of i2 Analyze records. There is at least one batch for each item type that the command affects; if the number of affected records is large, the records of a single item type can be split across several batches.
For example, consider a command that processes updates for deleted Person entity data. The first batch in the resulting job is for Person records, and there might be more Person batches if many records are to be deleted. If the Person data has links, then the job has further batches for each type of link that might be deleted as a result of the entity deletion.
The IS_Public.Ingestion_Deletion_Reports view contains information about every batch from every toolkit operation to create or update data in the Information Store. When you query the view, include ORDER BY job_id to group entries for the same job.
Note: Deletion-by-rule operations also create jobs and batches, and populate the view, according to the same rules. For more information, see the Deletion Guide.
The first few columns in the view have the same value for all batches within a job:
Column name | Description |
---|---|
label | The value that you passed in the importLabel parameter of a toolkit command, or the value that a deletion-by-rule operation generates, or null. |
job_id | The server-assigned identifier for this ingestion or deletion job. This identifier is also a cross-reference to the Deletion_By_Rule_Log view if the job originated from a deletion-by-rule operation. |
ingestion_mode | The value that you passed in the importMode parameter, or Delete for all deletion-by-rule operations. |
validation_mode | A description of how the job was configured to react to errors during the operation. |
error_threshold | The threshold that applies to some of the validation modes. |
primary_item_type | The i2 Analyze schema ID of the item type that was specified at job creation. |
primary_record_count | The number of records of the primary item type that were affected by the job. (Deleting entity data can affect link records too.) |
start_time | The start time of the job as a whole. |
end_time | The end time of the job as a whole. |
The remaining columns can have different values for different batches of records:
Column name | Description |
---|---|
batch_item_type | The i2 Analyze schema ID of the item type that was acted on in this batch. For at least one batch, the batch_item_type is the same as the primary_item_type. |
batch_start_time | The start time of this batch, which is always later than the start time of the job. |
batch_end_time | The end time of this batch, which is always earlier than the end time of the job. |
insert_count | The number of rows of data from this batch that were inserted to the Information Store, resulting in new i2 Analyze records. |
update_count | The number of rows of data from this batch that updated existing records in the Information Store. |
merge_count | The number of merge operations that occurred in the Information Store from this batch. |
unmerge_count | The number of unmerge operations that occurred in the Information Store from this batch. |
delete_count | The number of pieces of provenance that were deleted from the Information Store as a result of this batch. |
delete_record_count | The number of records that were deleted from the Information Store as a result of this batch. |
reject_count | The number of rows from this batch that were rejected during processing because they were invalid. |
status | An indicator of the result of this batch, from success (all rows processed correctly) through partial success to failure (no rows processed). |
reject_view | The full name of the view that contains details of any rejected rows. |
stack_trace | If i2 Analyze generated a stack trace as a result of errors during ingestion or deletion, this column contains it. |
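Because the per-batch columns can differ within a job, you can aggregate them to see a job's total impact. The following sketch again uses an in-memory SQLite table as a stand-in for the view, with figures borrowed from the delete example later in this topic; the real query runs against the Information Store database.

```python
import sqlite3

# SQLite stand-in for the view; only the per-batch count columns are modeled.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Ingestion_Deletion_Reports (
        job_id INTEGER, batch_item_type TEXT,
        insert_count INTEGER, update_count INTEGER,
        delete_record_count INTEGER, reject_count INTEGER
    )
""")
conn.executemany(
    "INSERT INTO Ingestion_Deletion_Reports VALUES (?, ?, ?, ?, ?, ?)",
    [
        (26, "ET5", 0, 0, 320, 0),
        (26, "LAC1", 0, 0, 187, 0),
        (26, "LAS1", 0, 0, 27, 0),
    ],
)

# Total the per-batch counts to see the whole job's impact, and expose
# any rejected rows across all batches of the job.
totals = conn.execute("""
    SELECT job_id,
           SUM(insert_count), SUM(update_count),
           SUM(delete_record_count), SUM(reject_count)
    FROM Ingestion_Deletion_Reports
    GROUP BY job_id
""").fetchall()
print(totals)  # [(26, 0, 0, 534, 0)]
```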
Examples
Ingest example
The (abbreviated) report for successful ingestion operations might look like this:
job_id | 1 | 2 |
---|---|---|
ingestion_mode | Standard | Standard |
primary_item_type | ET10 | ET4 |
primary_record_count | 62 | 8 |
batch_item_type | ET10 | ET4 |
batch_start_time | 2017-11-30 15:27:06.76 | 2017-11-30 15:27:09.45 |
batch_end_time | 2017-11-30 15:27:09.87 | 2017-11-30 15:27:09.63 |
insert_count | 57 | 7 |
update_count | 0 | 0 |
merge_count | 5 | 1 |
unmerge_count | 0 | 6 |
delete_count | 0 | 0 |
delete_record_count | 0 | 0 |
reject_count | 0 | 0 |
status | Succeeded | Succeeded |
In this example, several commands to ingest entity records resulted in the creation of several jobs. Each job demonstrates different behavior that is possible during ingestion, including correlation operations:
JOB_ID 1
This job demonstrates what the ingestion report can look like when data in the staging table causes merge operations. Five merge operations were completed on the incoming rows, as shown in the merge_count column: five of the 62 rows of data (primary_record_count) merged with existing i2 Analyze records in the Information Store, and the remaining 57 rows created new records (insert_count).
JOB_ID 2
This job demonstrates what the ingestion report can look like when the data in the staging table causes unmerge and merge operations. Six unmerge operations were completed on the incoming rows, as shown in the unmerge_count column, and one merge operation was completed, as shown in the merge_count column. As a result, seven i2 Analyze records were created from the eight rows of data, as shown in the insert_count and primary_record_count columns. The primary_record_count value does not include the unmerge_count.
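In both jobs, the counts relate as insert_count = primary_record_count − merge_count. A quick check of that arithmetic, using the figures from the report above (the relationship is inferred from this example, not a documented invariant):

```python
# Figures from the ingest report above. The relationship
# insert_count = primary_record_count - merge_count holds for these two
# jobs; it is inferred from the example, not a documented invariant.
jobs = {
    1: {"primary_record_count": 62, "merge_count": 5, "insert_count": 57},
    2: {"primary_record_count": 8, "merge_count": 1, "insert_count": 7},
}

checks = [
    c["insert_count"] == c["primary_record_count"] - c["merge_count"]
    for c in jobs.values()
]
print(checks)  # [True, True]
```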
Delete example
The (abbreviated) report for a successful delete operation might look like this:
| Batch 1 | Batch 2 | Batch 3 | Batch 4 | Batch 5 |
---|---|---|---|---|---|
job_id | 26 | 26 | 26 | 26 | 26 |
ingestion_mode | Delete | Delete | Delete | Delete | Delete |
primary_item_type | ET5 | ET5 | ET5 | ET5 | ET5 |
primary_record_count | 324 | 324 | 324 | 324 | 324 |
batch_item_type | ET5 | LAC1 | LAS1 | LEM1 | LIN1 |
batch_start_time | 2017-11-30 15:27:06.76 | 2017-11-30 15:27:08.60 | 2017-11-30 15:27:08.60 | 2017-11-30 15:27:09.43 | 2017-11-30 15:27:09.45 |
batch_end_time | 2017-11-30 15:27:09.87 | 2017-11-30 15:27:09.30 | 2017-11-30 15:27:09.29 | 2017-11-30 15:27:09.62 | 2017-11-30 15:27:09.63 |
insert_count | 0 | 0 | 0 | 0 | 0 |
update_count | 0 | 0 | 0 | 0 | 0 |
merge_count | 0 | 0 | 0 | 0 | 0 |
unmerge_count | 0 | 0 | 0 | 0 | 0 |
delete_count | 324 | 187 | 27 | 54 | 33 |
delete_record_count | 320 | 187 | 27 | 54 | 33 |
reject_count | 0 | 0 | 0 | 0 | 0 |
status | Succeeded | Succeeded | Succeeded | Succeeded | Succeeded |
In this example, a command to update the Information Store for deleted entity data (with item type ET5) resulted in the creation of a job with five batches. The first few columns of the Ingestion_Deletion_Reports view contain the same values for all batches in the same job. Later columns reveal how deleting entity records results in the deletion of connected link records (with item types LAC1, LAS1, LEM1, LIN1).
For the ET5 batch, the delete_record_count value (320) is less than the delete_count value (324). This is because some of the deleted provenance was associated with i2 Analyze records that had more than one piece of provenance. An i2 Analyze record is deleted only when the last piece of provenance that is associated with it is deleted.
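The difference between the two counts can be sketched as follows. The record identifiers and the split of the 324 pieces of provenance are hypothetical; the point is that a record is counted in delete_record_count only when its last piece of provenance goes.

```python
# Each entry pairs a deleted piece of provenance with its owning record and
# the number of provenance pieces that record still has afterwards.
# Hypothetical data: 320 records lose their only piece of provenance, and
# 4 more pieces belong to records that keep at least one other piece.
deleted_provenance = (
    [("rec%d" % i, 0) for i in range(320)]     # last piece: record is deleted
    + [("shared%d" % i, 1) for i in range(4)]  # record keeps other provenance
)

delete_count = len(deleted_provenance)
delete_record_count = sum(
    1 for _, remaining in deleted_provenance if remaining == 0
)
print(delete_count, delete_record_count)  # 324 320
```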