Understanding ingestion reports

Every attempt to add, update, or delete data in the Information Store through the deployment or ETL toolkit adds rows to the `IS_Public.Ingestion_Deletion_Reports` view. You can use the contents of this view to track the history of all such operations, and to examine the impact of a particular operation.
About this task
Each time that you run a command that might change the contents of the Information Store, you create a job in the database. Each job acts on one or more batches of i2 Analyze records. There is always one batch per item type that the command affects, but there can also be several batches for the same type if the number of affected records is large.
For example, consider a command that processes updates for deleted Person entity data. The first batch in the resulting job is for Person records, and there might be more such batches if there are many records to be deleted. If the Person data has links, then the job has further batches for each type of link that might get deleted as a result of the entity deletion.
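The way a job is split into batches can be sketched as follows. This is an illustrative model only, not i2 Analyze code: the batch size, the type identifiers, and the ordering rule are invented for the example.

```python
from collections import defaultdict

BATCH_SIZE = 1000  # illustrative cap; the real server chooses its own batch sizes


def plan_batches(affected, primary_type, batch_size=BATCH_SIZE):
    """Split (item_type, record_id) pairs into per-type batches.

    One batch per item type, or several when a type has many records.
    Batches for the primary item type come first, mirroring how a
    deletion job processes the entity type before its link types.
    """
    by_type = defaultdict(list)
    for item_type, record_id in affected:
        by_type[item_type].append(record_id)

    ordered = sorted(by_type, key=lambda t: (t != primary_type, t))
    batches = []
    for item_type in ordered:
        ids = by_type[item_type]
        for start in range(0, len(ids), batch_size):
            batches.append((item_type, ids[start:start + batch_size]))
    return batches


# A deletion of entities (type "ET5") that also removes one link type:
affected = [("ET5", i) for i in range(1500)] + [("LAC1", i) for i in range(300)]
plan = plan_batches(affected, primary_type="ET5")
print([(t, len(ids)) for t, ids in plan])  # -> [('ET5', 1000), ('ET5', 500), ('LAC1', 300)]
```

Here the 1500 primary records exceed the batch size, so the job gets two batches for the same type, plus one batch for the affected link type.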
The `IS_Public.Ingestion_Deletion_Reports` view contains information about every batch from every toolkit operation that creates, updates, or deletes data in the Information Store. When you query the view, include `ORDER BY job_id` to group the entries for the same job.
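A query along those lines might look like the sketch below. It runs against an in-memory SQLite stand-in with only a few of the view's columns, because the real `IS_Public.Ingestion_Deletion_Reports` view lives in the Information Store database; adapt the connection details and schema name to your deployment.

```python
import sqlite3

# Stand-in for IS_Public.Ingestion_Deletion_Reports, reduced to three columns.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Ingestion_Deletion_Reports (
        job_id INTEGER,
        batch_item_type TEXT,
        insert_count INTEGER
    )
""")
conn.executemany(
    "INSERT INTO Ingestion_Deletion_Reports VALUES (?, ?, ?)",
    [(2, "ET4", 7), (1, "ET10", 57), (26, "ET5", 0)],
)

# ORDER BY job_id keeps all the batches of one job together in the output.
rows = conn.execute("""
    SELECT job_id, batch_item_type, insert_count
    FROM Ingestion_Deletion_Reports
    ORDER BY job_id
""").fetchall()
print(rows)  # -> [(1, 'ET10', 57), (2, 'ET4', 7), (26, 'ET5', 0)]
```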
The first few columns in the view have the same value for all batches within a job:
| Column name | Description |
|---|---|
| `label` | The value that you passed in the `importLabel` parameter of a toolkit command, or the value that a deletion-by-rule operation generates, or null. |
| `job_id` | The server-assigned identifier for this ingestion or deletion job. This identifier is also a cross-reference to the `Deletion_By_Rule_Log` view if the job originated from a deletion-by-rule operation. |
| `ingestion_mode` | The value that you passed in the `importMode` parameter, or `Delete` for all deletion-by-rule operations. |
| `validation_mode` | A description of how the job was configured to react to errors during the operation. |
| `error_threshold` | The threshold that applies to some of the validation modes. |
| `primary_item_type` | The i2 Analyze schema ID of the item type that was specified at job creation. |
| `primary_record_count` | The number of records of the primary item type that were affected by the job. (Deleting entity data can affect link records too.) |
| `start_time` | The start time of the job as a whole. |
| `end_time` | The end time of the job as a whole. |
The remaining columns can have different values for different batches of records:
| Column name | Description |
|---|---|
| `batch_item_type` | The i2 Analyze schema ID of the item type that was acted on in this batch. For at least one batch, the `batch_item_type` is the same as the `primary_item_type`. |
| `batch_start_time` | The start time of this batch, which is always later than the start time of the job. |
| `batch_end_time` | The end time of this batch, which is always earlier than the end time of the job. |
| `insert_count` | The number of rows of data from this batch that were inserted into the Information Store, resulting in new i2 Analyze records. |
| `update_count` | The number of rows of data from this batch that updated existing records in the Information Store. |
| `merge_count` | The number of merge operations that occurred in the Information Store as a result of this batch. |
| `unmerge_count` | The number of unmerge operations that occurred in the Information Store as a result of this batch. |
| `delete_count` | The number of pieces of provenance that were deleted from the Information Store as a result of this batch. |
| `delete_record_count` | The number of records that were deleted from the Information Store as a result of this batch. |
| `reject_count` | The number of rows from this batch that were rejected during processing because they were invalid. |
| `status` | An indicator of the result of this batch, from success (all rows processed correctly) through partial success to failure (no rows processed). |
| `reject_view` | The full name of the view that contains details of any rejected rows. |
| `stack_trace` | If i2 Analyze generated a stack trace as a result of errors during ingestion or deletion, this column contains it. |
Example

Ingest example

The (abbreviated) report for successful ingestion operations might look like this:
In this example, several commands to ingest entity records resulted in the creation of several jobs. Each job demonstrates different behavior that is possible during ingestion, including correlation operations:

| Column | JOB_ID 1 | JOB_ID 2 |
|---|---|---|
| `job_id` | 1 | 2 |
| `ingestion_mode` | Standard | Standard |
| `primary_item_type` | ET10 | ET4 |
| `primary_record_count` | 62 | 8 |
| `batch_item_type` | ET10 | ET4 |
| `batch_start_time` | 2017-11-30 15:27:06.76 | 2017-11-30 15:27:09.45 |
| `batch_end_time` | 2017-11-30 15:27:09.87 | 2017-11-30 15:27:09.63 |
| `insert_count` | 57 | 7 |
| `update_count` | 0 | 0 |
| `merge_count` | 5 | 1 |
| `unmerge_count` | 0 | 6 |
| `delete_count` | 0 | 0 |
| `delete_record_count` | 0 | 0 |
| `reject_count` | 0 | 0 |
| `status` | Succeeded | Succeeded |

JOB_ID 1
This job demonstrates what the ingestion report can look like when data in the staging table causes merge operations. In this example, five merge operations are completed on the incoming rows of data, as shown in the `merge_count` column. This results in 57 i2 Analyze records created from the 62 rows of data, as shown in the `insert_count` and `primary_record_count` columns. This includes merging five rows of data with existing i2 Analyze records in the Information Store.

JOB_ID 2
This job demonstrates what the ingestion report can look like when the data in the staging table causes unmerge and merge operations. In this example, six unmerge operations are completed on the incoming rows of data, as shown in the `unmerge_count` column. One merge operation is completed on the incoming rows, as shown in the `merge_count` column. This results in seven i2 Analyze records created from eight rows of data, as shown in the `insert_count` and `primary_record_count` columns. The `primary_record_count` value does not include the `unmerge_count`.
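The arithmetic behind these columns can be checked with a small sketch. It assumes, as a reading of the two jobs above rather than a documented rule, that each merge folds one staged row into another i2 Analyze record, so each merge yields one fewer new record than the staged row count.

```python
def expected_insert_count(staged_rows, merge_count):
    """Estimate insert_count from the number of staged rows and merges.

    Assumption (illustrative, based on the example jobs): each merge
    consumes one staged row, so it produces one fewer new record.
    """
    return staged_rows - merge_count


# JOB_ID 1: 62 staged rows, 5 merges -> 57 new records
print(expected_insert_count(62, 5))  # -> 57
# JOB_ID 2: 8 staged rows, 1 merge -> 7 new records
print(expected_insert_count(8, 1))   # -> 7
```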
Delete example

The (abbreviated) report for a successful delete operation might look like this:

| Column | Batch 1 | Batch 2 | Batch 3 | Batch 4 | Batch 5 |
|---|---|---|---|---|---|
| `job_id` | 26 | 26 | 26 | 26 | 26 |
| `ingestion_mode` | Delete | Delete | Delete | Delete | Delete |
| `primary_item_type` | ET5 | ET5 | ET5 | ET5 | ET5 |
| `primary_record_count` | 324 | 324 | 324 | 324 | 324 |
| `batch_item_type` | ET5 | LAC1 | LAS1 | LEM1 | LIN1 |
| `batch_start_time` | 2017-11-30 15:27:06.76 | 2017-11-30 15:27:08.60 | 2017-11-30 15:27:08.60 | 2017-11-30 15:27:09.43 | 2017-11-30 15:27:09.45 |
| `batch_end_time` | 2017-11-30 15:27:09.87 | 2017-11-30 15:27:09.30 | 2017-11-30 15:27:09.29 | 2017-11-30 15:27:09.62 | 2017-11-30 15:27:09.63 |
| `insert_count` | 0 | 0 | 0 | 0 | 0 |
| `update_count` | 0 | 0 | 0 | 0 | 0 |
| `merge_count` | 0 | 0 | 0 | 0 | 0 |
| `unmerge_count` | 0 | 0 | 0 | 0 | 0 |
| `delete_count` | 324 | 187 | 27 | 54 | 33 |
| `delete_record_count` | 320 | 187 | 27 | 54 | 33 |
| `reject_count` | 0 | 0 | 0 | 0 | 0 |
| `status` | Succeeded | Succeeded | Succeeded | Succeeded | Succeeded |

In this example, a command to update the Information Store for deleted entity data (with item type ET5) resulted in the creation of a job with five batches. The first few columns of the `Ingestion_Deletion_Reports` view contain the same values for all batches in the same job. Later columns reveal how deleting entity records results in the deletion of connected link records (with item types LAC1, LAS1, LEM1, LIN1).

In one case, the `delete_record_count` value is less than the `delete_count` value. This is because some of the provenance to be deleted was associated with an i2 Analyze record that had more than one piece of provenance. An i2 Analyze record is deleted only when the last associated provenance is deleted.
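The relationship between `delete_count` (pieces of provenance) and `delete_record_count` (records) can be sketched as follows. The record-to-provenance mapping here is invented to reproduce the ET5 batch, where 324 pieces of provenance map to 320 records because four of the records carry two pieces each; it also assumes every targeted record loses all of its provenance.

```python
from collections import Counter


def deletion_counts(provenance_rows):
    """provenance_rows: a record id per piece of provenance being deleted.

    delete_count is the number of provenance pieces removed.
    delete_record_count is the number of records removed; a record is
    deleted only when its last piece of provenance is deleted, which
    this sketch assumes happens for every record involved.
    """
    per_record = Counter(provenance_rows)
    delete_count = sum(per_record.values())
    delete_record_count = len(per_record)
    return delete_count, delete_record_count


# 316 records with one piece of provenance, 4 records with two pieces:
rows = [f"rec{i}" for i in range(316)] + [f"dup{i}" for i in range(4)] * 2
print(deletion_counts(rows))  # -> (324, 320), matching the ET5 batch
```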