Ingestion mapping files

An ingestion mapping file is an XML document whose structure is validated during the ingestion process. Each time you instruct the Information Store to ingest data, you specify both the mapping file to use and the ingestion mapping within it. You can put all your ingestion mappings in one file or spread them across several files.

Ingestion mappings have two complementary purposes. First, they associate an entity type or a link type in the i2 Analyze schema with a staging table in the database. Second, they provide any extra information that the Information Store requires but the staging tables do not contain.

For all record types, the extra information that an ingestion mapping can provide includes:

  • The type identifier of the entity or link type that the mapping applies to

  • The name of the data source that the data to be ingested comes from

  • How to create an origin identifier for data of this type

  • The security dimension values that all records of this type receive, if you do not use per-record security

Link type ingestion mappings provide further information that addresses the requirements of link records:

  • The Information Store must be able to test that it already contains the entity records at the ends of an incoming link. The link type mapping must describe how to create the origin identifiers that are associated with those records so that the Information Store can look them up.

  • To make the look-up more efficient, the link type mapping also contains the type identifiers of the entity records that appear at the "from" and "to" ends of the incoming links. A link type that can connect entities of several different types requires a separate mapping for each valid combination of end types.
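As a hedged sketch of the last point, suppose that a hypothetical link type can connect a person either to another person or to an address. Each combination of end types then gets its own mapping with its own identifier, using the <ingestionMapping> syntax that the rest of this topic describes. The element names <fromItemTypeId> and <toItemTypeId>, all of the type identifiers, and the mapping identifiers are assumptions for illustration and are not defined in this section. The elided content stands for the remaining elements that a link type mapping requires.

```xml
<!-- Links from a person (assumed type ET5) to another person -->
<ingestionMapping id="AssociatePersonToPerson">
    ...
    <itemTypeId>LAS1</itemTypeId>
    <fromItemTypeId>ET5</fromItemTypeId>
    <toItemTypeId>ET5</toItemTypeId>
    ...
</ingestionMapping>

<!-- Same link type, but the "to" end is an address (assumed type ET1) -->
<ingestionMapping id="AssociatePersonToAddress">
    ...
    <itemTypeId>LAS1</itemTypeId>
    <fromItemTypeId>ET5</fromItemTypeId>
    <toItemTypeId>ET1</toItemTypeId>
    ...
</ingestionMapping>
```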

Ingestion mapping syntax

The root element of an ingestion mapping file is an <ingestionMappings> element in the ingestion mappings namespace. In the following example, that namespace is bound to the ns2 prefix:

<ns2:ingestionMappings
    xmlns:ns2="http://www.i2group.com/Schemas/2016-08-12/IngestionMappings"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
...
</ns2:ingestionMappings>

Within the ingestion mapping file, you use an <ingestionMapping> element to define a mapping for a particular entity type or link type. Each <ingestionMapping> element has a mandatory id attribute that must be unique within the mapping file. You use the value to identify the mapping when you start ingestion. For example:

<ingestionMapping id="Person">
    ...
</ingestionMapping>

Note: For examples of complete ingestion mapping files, search for files with the name mapping.xml in the i2 Analyze deployment toolkit. All of those files contain definitions that are similar to the definitions here.
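To illustrate how one file can hold several mappings, the following sketch places two <ingestionMapping> elements under the same root element. The identifiers Person and Address are illustrative; the only requirement stated above is that each id is unique within the file.

```xml
<ns2:ingestionMappings
    xmlns:ns2="http://www.i2group.com/Schemas/2016-08-12/IngestionMappings"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <!-- Selected at ingestion time through the identifier "Person" -->
    <ingestionMapping id="Person">
        ...
    </ingestionMapping>

    <!-- A second mapping in the same file; its id must differ from the first -->
    <ingestionMapping id="Address">
        ...
    </ingestionMapping>

</ns2:ingestionMappings>
```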

Entity type ingestion mappings

When the mapping is for an entity type, the <ingestionMapping> element has the following children:

stagingArea

The <stagingArea> element specifies where the mapping gets its staged data from. In this version of i2 Analyze, the staged data is always in a staging table, and <stagingArea> always has a <tableName> child.

  • tableName

    The value of <tableName> is the name of the staging table that contains the data to be ingested.

For example:

...
<stagingArea xsi:type="ns2:databaseIngestionSource">
    <tableName>IS_Staging.E_Person</tableName>
</stagingArea>
...

itemTypeId

The value of the <itemTypeId> element is the identifier of the entity type (or the link type) to which the mapping applies, as defined in the i2 Analyze schema.

For example:

...
<itemTypeId>ET5</itemTypeId>
...

originId

The <originId> element contains a template for creating the origin identifier of each ingested row. <originId> has two mandatory child elements: <type> and <keys>.

For example:

...
<originId>
    <type>$(origin_id_type)</type>
    <keys>
        <key>$(origin_id_keys)</key>
    </keys>
</originId>
...

Here, $(origin_id_type) and $(origin_id_keys) are references to the columns named origin_id_type and origin_id_keys in the staging table to which this ingestion mapping applies. When the Information Store ingests the data, the values from the staging table become the origin identifier in the Information Store.
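A template does not necessarily have to take every part from a column. As a sketch, assuming that literal text and column references can be mixed here in the same way as for security dimension values, and that <keys> accepts more than one <key> child, an origin identifier might combine a fixed type with two key columns. The type OI.EXAMPLE and the column names source_system and source_id are hypothetical.

```xml
...
<originId>
    <!-- Literal value: every row ingested through this mapping gets the same type -->
    <type>OI.EXAMPLE</type>
    <keys>
        <!-- Column references: the values are read from each staging table row -->
        <key>$(source_system)</key>
        <key>$(source_id)</key>
    </keys>
</originId>
...
```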

For more information about generating origin identifiers during ingestion, see Origin identifiers.

dataSourceName

The value of the <dataSourceName> element identifies the data source from which the data in the staging table came. It must match the name of an ingestion source that you provide to the Information Store during the ingestion process.

For example:

...
<dataSourceName>EXAMPLE</dataSourceName>
...

createdSource and lastUpdatedSource

By default, the ingestion process automatically puts the values from the source_created and source_last_updated columns of the staging tables into the Information Store. If you want to use the same values for all ingested data, you can override that behavior by including the optional <createdSource> and <lastUpdatedSource> elements and specifying values in the date-time string format of your database management system.

For example:

...
<createdSource>2002-10-04 09:21:33</createdSource>
<lastUpdatedSource>2002-10-05 09:34:45</lastUpdatedSource>
...

securityDimensionValues

Every row that the Information Store ingests must have at least one security dimension value from each dimension in the security schema. The Information Store staging tables contain a column for each access security dimension that the security schema defines.

In your ingestion process, you can use the staging table columns to store dimension values on a per-row basis. Alternatively, you can specify that all the data that the Information Store ingests through the same mapping receives the same security dimension values.

In the ingestion mapping file, the <securityDimensionValues> element has <securityDimensionValue> children. For per-row security, use the value of each <securityDimensionValue> element to reference a security dimension column.

For example:

...
<securityDimensionValues>
    <securityDimensionValue>$(security_level)</securityDimensionValue>
    <securityDimensionValue>$(security_compartment)</securityDimensionValue>
</securityDimensionValues>
...

In the staging table, each referenced column can contain either a single dimension value or a comma-separated list of dimension values.

For per-mapping security, set the value of each <securityDimensionValue> element to a security dimension value.

For example:

...
<securityDimensionValues>
    <securityDimensionValue>HI</securityDimensionValue>
    <securityDimensionValue>UC</securityDimensionValue>
    <securityDimensionValue>OSI</securityDimensionValue>
</securityDimensionValues>
...

In either approach, the values that you specify must be present in the i2 Analyze security schema.
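Assembled from the fragments in this section, a complete mapping file for a single entity type might look like the following sketch. The element order follows the order in which this section describes the elements, which is an assumption about what the validation accepts. The optional <createdSource> and <lastUpdatedSource> elements are omitted, and the mapping uses per-row security.

```xml
<ns2:ingestionMappings
    xmlns:ns2="http://www.i2group.com/Schemas/2016-08-12/IngestionMappings"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <ingestionMapping id="Person">
        <!-- Staging table that holds the rows to ingest -->
        <stagingArea xsi:type="ns2:databaseIngestionSource">
            <tableName>IS_Staging.E_Person</tableName>
        </stagingArea>

        <!-- Entity type identifier from the i2 Analyze schema -->
        <itemTypeId>ET5</itemTypeId>

        <!-- Origin identifier built from two staging table columns -->
        <originId>
            <type>$(origin_id_type)</type>
            <keys>
                <key>$(origin_id_keys)</key>
            </keys>
        </originId>

        <!-- Must match an ingestion source known to the Information Store -->
        <dataSourceName>EXAMPLE</dataSourceName>

        <!-- Per-row security: values come from the security dimension columns -->
        <securityDimensionValues>
            <securityDimensionValue>$(security_level)</securityDimensionValue>
            <securityDimensionValue>$(security_compartment)</securityDimensionValue>
        </securityDimensionValues>
    </ingestionMapping>

</ns2:ingestionMappings>
```

To switch this sketch to per-mapping security, replace each column reference with a literal dimension value from the security schema.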