Troubleshooting the ingestion process
The commands that you run during the ingestion process send information about their progress to the command line and a log file. If any command encounters errors or does not run to completion, you can read the output to help you to diagnose the problem.
When an ingestion process runs to completion, the final output from the command is a report of what happened to the Information Store. The reports appear on the command line and in the ingestion log at toolkit\configuration\logs\importer\i2_Importer.log. The three possible end states are success, partial success, and failure.
- Success
If the ingestion command processed all of the rows in the staging table without error, then the Information Store reflects the contents of the staging table. The command reports success like this example:
> INFO [IImportLogger] - Total number of rows processed: 54
> INFO [IImportLogger] - Number of records inserted: 0
> INFO [IImportLogger] - Number of records updated: 54
> INFO [IImportLogger] - Number of merges: 0
> INFO [IImportLogger] - Number of unmerges: 0
> INFO [IImportLogger] - Number of rows rejected: 0
> INFO [IImportLogger] - Duration: 5 s
> INFO [IImportLogger] -
> INFO [IImportLogger] - Result: SUCCESS
- Partial success
If you ran the command in record-based failure mode, and it processed some of the rows in the staging table without error, then it reports partial success like this example:
> INFO [IImportLogger] - Total number of rows processed: 34
> INFO [IImportLogger] - Number of records inserted: 0
> INFO [IImportLogger] - Number of records updated: 30
> INFO [IImportLogger] - Number of merges: 0
> INFO [IImportLogger] - Number of unmerges: 0
> INFO [IImportLogger] - Number of rows rejected: 4
> INFO [IImportLogger] - Duration: 4 s
> INFO [IImportLogger] -
> INFO [IImportLogger] - Result: PARTIAL SUCCESS
> INFO [IImportLogger] -
> INFO [IImportLogger] - Total number of errors: 4
> INFO [IImportLogger] - Error categories:
> INFO [IImportLogger] - ABSENT_VALUE: 4
> INFO [IImportLogger] -
> INFO [IImportLogger] - The rejected records and errors are recorded in the database. For details, use the following view:
> INFO [IImportLogger] - IS_Staging.S20171204122426717092ET5_Rejects_V
The records in the Information Store reflect the rows from the staging table that the command successfully processed. The report includes the name of a database view that you can examine to discover what went wrong with each failed row.
- Failure
If you ran the command in mapping-based failure mode, then any error you see is the first one that it encountered, and the report is of failure:
> INFO [IImportLogger] - Total number of rows processed: 1
> INFO [IImportLogger] - Number of records inserted: 0
> INFO [IImportLogger] - Number of records updated: 0
> INFO [IImportLogger] - Number of merges: 0
> INFO [IImportLogger] - Number of unmerges: 0
> INFO [IImportLogger] - Number of rows rejected: 0
> INFO [IImportLogger] - Duration: 0 s
> INFO [IImportLogger] -
> INFO [IImportLogger] - Result: FAILURE
When the process fails in this fashion, the next lines of output describe the error in more detail. In this event, the command does not change the contents of the Information Store.
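The Result line of the report is what distinguishes the three end states, so a script can read it from the console output or the importer log. The following is a minimal sketch, not part of the toolkit: the function name parse_import_report is hypothetical, and the sketch assumes that each report line contains the text INFO [IImportLogger] - followed by a label, a colon, and a value, as in the examples above.

```python
import re

# Matches report lines such as "INFO [IImportLogger] - Number of rows rejected: 4".
# search() is used so that any prefix before "INFO" (an assumption) is tolerated.
REPORT_LINE = re.compile(r"INFO \[IImportLogger\] - ([^:]+):\s*(.*)$")


def parse_import_report(text):
    """Collect the 'label: value' pairs from IImportLogger report lines."""
    report = {}
    for line in text.splitlines():
        match = REPORT_LINE.search(line)
        if match:
            report[match.group(1).strip()] = match.group(2).strip()
    return report


# Sample lines copied from the partial success report.
sample = """\
INFO [IImportLogger] - Total number of rows processed: 34
INFO [IImportLogger] - Number of records updated: 30
INFO [IImportLogger] - Number of rows rejected: 4
INFO [IImportLogger] - Result: PARTIAL SUCCESS
"""

report = parse_import_report(sample)
print(report["Result"], "-", report["Number of rows rejected"], "rows rejected")
```

Values are kept as strings so that entries such as the duration are preserved as they appear in the log.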
If the command reports partial success, you might be able to clean up the staging table by removing the rows that were ingested and fixing the rows that failed. However, the main benefit of record-based failure is that you can find out about multiple problems at the same time.
The most consistent approach to addressing failures of all types is to fix up the problems in the staging table and run the ingestion command again. The following sections describe how to react to some of the more common failures.
Link rows in the staging table refer to missing entity records
Link data in the staging table refers to missing entity records
This message is displayed if the entity record at either end of a link is not present in the Information Store. To resolve the error:
- Examine the console output for your earlier operations to check that the Information Store ingested all the entity records properly.
- Ensure that the link end origin identifiers are constructed correctly, and exist for each row in the staging table.
- Ensure that the link type and the entity types at the end of the links are valid according to the i2 Analyze schema.
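The second check in the list can be approximated before you populate the link staging table. The following is a minimal sketch under stated assumptions: it treats each origin identifier as a single string, and the field names from_id and to_id are hypothetical stand-ins for whichever columns hold the link end identifiers in your staging table.

```python
def find_dangling_link_ends(entity_ids, link_rows):
    """Return (row index, end name, identifier) for each link end
    that is empty or does not match a known entity identifier."""
    known = set(entity_ids)
    missing = []
    for index, row in enumerate(link_rows):
        # "from_id" and "to_id" are hypothetical field names.
        for end in ("from_id", "to_id"):
            identifier = row.get(end)
            if not identifier or identifier not in known:
                missing.append((index, end, identifier))
    return missing


# Identifiers of entities that were ingested earlier (illustrative data).
entities = ["P1", "P2"]
links = [
    {"from_id": "P1", "to_id": "P2"},
    {"from_id": "P1", "to_id": "P9"},   # P9 was never ingested
    {"from_id": None, "to_id": "P2"},   # identifier is missing
]

for problem in find_dangling_link_ends(entities, links):
    print(problem)
```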
Rows in the staging table have duplicate origin identifiers
Rows in the staging table have duplicate origin identifiers
This message is displayed when several rows in a staging table generate the same origin identifier. For example, more than one row might have the same value in the source_id column.
If more than one row in the staging table contains the same provenance information, you must resolve the issue and repopulate the staging table. Alternatively, you can separate the rows so that they are not in the same staging table at the same time.
This problem is most likely to occur during an update to the Information Store that attempts to change the same record (with the same provenance) twice in the same batch. It might be appropriate to combine the changes, or to process only the last change. After you resolve the problem, repopulate the staging table and rerun the ingestion command.
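Duplicates of this kind can be found, and reduced to the last change, before the staging table is repopulated. The following is a minimal sketch that assumes the rows are available as dictionaries and that the source_id value alone determines the origin identifier; the function names are hypothetical.

```python
from collections import Counter


def duplicate_source_ids(rows, key="source_id"):
    """Return each identifier that appears more than once, with its count."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


def keep_last_change(rows, key="source_id"):
    """Keep only the last row for each identifier, ordered by last occurrence."""
    latest = {}
    for row in rows:
        # Remove any earlier entry so the re-inserted row moves to the end.
        latest.pop(row[key], None)
        latest[row[key]] = row
    return list(latest.values())


# Illustrative batch in which record "a" is changed twice.
batch = [
    {"source_id": "a", "name": "first change"},
    {"source_id": "b", "name": "only change"},
    {"source_id": "a", "name": "second change"},
]

print(duplicate_source_ids(batch))
print(keep_last_change(batch))
```

Processing only the last change is one of the two options that the text mentions. If the changes need to be combined instead, the merge rule depends on your data.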
Geospatial data is in the incorrect format
During an ingestion procedure that contains geospatial data, you might see the following error messages in the console output:
On PostgreSQL:
ERROR: parse error - invalid geometry Hint: "FO" <-- parse error at position 2 within geometry.
On SQL Server:
System.FormatException: 24114: The label FOO(33.3 44.0) in the input well-known text (WKT) is not valid.
On Db2:
SQLERRMC=GSEGEOMFROMWKT;;GSE3052N Unknown type "FOO(33.3" in WKT.
These messages are displayed when data in a geospatial property column is not in the correct format.
Data in geospatial property columns must be in the POINT(longitude latitude) format. For more information, see Information Store property value ranges.
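Values can be checked against this format before they reach the staging table. The following is a minimal sketch, not the validation that the database performs: the function name parse_point is hypothetical, the pattern accepts only plain decimal numbers and the uppercase POINT label, and the longitude and latitude bounds are the conventional geographic ranges, which are an assumption here. The referenced topic is the authority on the permitted ranges.

```python
import re

# POINT(longitude latitude) with plain decimal numbers (assumed strict form).
POINT = re.compile(r"^POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)$")


def parse_point(value):
    """Return (longitude, latitude) or raise ValueError for a malformed value."""
    match = POINT.match(value.strip())
    if match is None:
        raise ValueError(f"not in POINT(longitude latitude) format: {value!r}")
    longitude, latitude = float(match.group(1)), float(match.group(2))
    # Conventional geographic bounds; assumed, not taken from the product.
    if not (-180 <= longitude <= 180 and -90 <= latitude <= 90):
        raise ValueError(f"coordinates out of range: {value!r}")
    return longitude, latitude


print(parse_point("POINT(33.3 44.0)"))

try:
    parse_point("FOO(33.3 44.0)")
except ValueError as error:
    print("rejected:", error)
```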
Error occurred during a correlation operation
An error occurred during a correlation operation. There might be some data in an unusable state.
This message is displayed if the connection to the database or Solr is interrupted during a correlation operation.
To resolve the problem, you must repair the connection that caused the error, and then run the syncInformationStoreCorrelation toolkit task. This task synchronizes the data in the Information Store with the data in the Solr index so that the data returns to a usable state.
After you run the syncInformationStoreCorrelation task, reingest the data that you were ingesting when the failure occurred. Any attempt to run an ingestion or a deletion command before you run syncInformationStoreCorrelation will fail.
Ingestion with correlated data is still in progress
You cannot ingest data because an ingestion with correlated data is still in progress, or because an error occurred during a correlation operation in a previous ingestion.
If another ingestion is still in progress, you must wait until it finishes. If a previous ingestion failed during a correlation operation, you must run the syncInformationStoreCorrelation toolkit task.
For more information about the syncInformationStoreCorrelation toolkit task, see Error occurred during a correlation operation.
Ingestion of the same item type is still in progress
You cannot ingest data for item type <ET5> because an ingestion is still in progress. You must wait until the process is finished before you can start another ingestion for this item type.
If another ingestion of the same item type is still in progress, you must wait until it finishes.
If you are sure that the ingestion is complete or not in progress, you can remove the file that is blocking the ingestion. To indicate that an ingestion is in progress, a file is created in the temporary directory on the server where the ingestion command was run (for example, AppData\Local\Temp). The file name is INGESTION_IN_PROGRESS_<item type ID>. After you remove the file, you can run the ingestion command again.
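Before you remove anything, it can help to confirm that the blocking file exists. The following is a minimal sketch that only reports on the file and never deletes it. The function name ingestion_marker is hypothetical, and the sketch assumes that it runs on the same server, as the same user who ran the ingestion command, so that the temporary directory of the current user is the right place to look.

```python
import tempfile
from pathlib import Path


def ingestion_marker(item_type_id, temp_dir=None):
    """Build the path of the INGESTION_IN_PROGRESS_<item type ID> file."""
    # Defaults to the temporary directory of the current user (an assumption).
    directory = Path(temp_dir) if temp_dir else Path(tempfile.gettempdir())
    return directory / f"INGESTION_IN_PROGRESS_{item_type_id}"


# ET5 is the item type ID from the example message.
marker = ingestion_marker("ET5")
if marker.exists():
    print("Blocking file found:", marker)
else:
    print("No blocking file at:", marker)
```

Deletion is deliberately left as a manual step, because removing the file while an ingestion is running defeats the check that the file provides.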
Bulk import mode error
The symptoms of this type of failure are a stack trace and a failure message in the console and the importer log. To recover from this type of failure:
- Identify the cause of the failure. You must use the SQL error codes to determine the cause of the failure.
You might see error messages about the following issues:
- Log size or connectivity issues.
- Invalid data in the staging table.
- Fix the problem that caused the failure. This might include ensuring connectivity to the database or increasing the log size.
- After you resolve the problem that caused the error, you can attempt the ingestion again. If any of the rows in the staging table were already ingested into the Information Store, you must remove them from the staging table before you can ingest in bulk mode.
- In the console or importer log, if the value for Number of rows accepted is 0, then run the ingestion command again.
- In the console or importer log, if the value for Number of rows accepted is greater than 0, you must ensure that these records are not ingested again. Before you run the ingestion command again, add the CheckExistingOriginIds=filter setting to the import configuration file. When this value is set, the ingestion process calculates whether the origin identifiers in the staging table already exist in the Information Store and does not attempt to ingest them again. When the setting is filter, the ingestion might take longer to complete. After the ingestion that previously failed runs to completion, you can remove the CheckExistingOriginIds setting from your import configuration file for future ingestion operations. For more information about creating an import configuration file, see References and system properties.
- In the console or importer log, if the value
for