Merging two schemas

In this example of how to create type conversion mappings, an i2 Analyze deployment uses two connectors, each with its own gateway schema. The two schemas contain similar item types, so item type mappings are used to remove the duplicates.

It is not possible to define type conversion mappings in both directions between two schemas. That is, you cannot map item types from schema A to schema B, and also map item types from schema B to schema A. In a situation like this, you can create a third gateway schema that contains all the types you want from the two original schemas, and then define mappings from the originals to the new schema.
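
The one-direction rule can be illustrated with a minimal sketch. The `TypeMapping` record and the `find_bidirectional_pairs` helper below are hypothetical names invented for this illustration; they are not part of any i2 Analyze API. The sketch only shows why routing both original schemas through a third schema avoids mappings in opposite directions.

```python
from typing import NamedTuple


class TypeMapping(NamedTuple):
    """Hypothetical record of one item type mapping (illustration only)."""
    from_schema: str
    from_type: str
    to_schema: str
    to_type: str


def find_bidirectional_pairs(mappings):
    """Return the sorted schema pairs that are mapped in both directions."""
    directions = {(m.from_schema, m.to_schema) for m in mappings}
    return sorted(
        (a, b) for (a, b) in directions if a < b and (b, a) in directions
    )


# Mapping directly between schemas A and B needs both directions.
direct = [
    TypeMapping("B", "Type1", "A", "Type1"),
    TypeMapping("A", "Type2", "B", "Type2"),
]

# Mapping both originals to a third schema C only ever points one way.
via_third = [
    TypeMapping("A", "Type1", "C", "Type1"),
    TypeMapping("B", "Type1", "C", "Type1"),
    TypeMapping("A", "Type2", "C", "Type2"),
    TypeMapping("B", "Type2", "C", "Type2"),
]

print(find_bidirectional_pairs(direct))     # [('A', 'B')]
print(find_bidirectional_pairs(via_third))  # []
```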

Setting up the scenario

To follow this scenario, deploy i2 Analyze with the example NYPD connector and KCPD connector, both configured to use gateway schemas.

Reviewing the item types

Open the schema/nypd-complaint-data-schema.xml and schema/kcpd-crime-data-schema.xml schemas in Schema Designer to see the item types they define. Both schemas contain entity types that represent:

  • people

  • complaints/reports

  • locations

Both schemas also contain link types that represent:

  • a complaint having occurred at a location

  • a person being the victim of a complaint

  • a person being the suspect of a complaint

The KCPD-Crime schema contains additional link types that represent:

  • a person being complicit in a complaint

  • a person having been arrested for a complaint

  • a person having been charged for a complaint

Several of these item types are duplicates that can be resolved by defining item type mappings. Review the item types and decide which of the duplicate types you prefer. For the purposes of this example, the preferred item types are:

  • Person (NYPD-Complaints)

  • Report (KCPD-Crime)

  • Location (KCPD-Crime)

  • Located At (NYPD-Complaints)

  • Suspect Of (NYPD-Complaints)

  • Victim Of (NYPD-Complaints)

However, it is not possible to map the Person type in the KCPD-Crime schema to the Person type in the NYPD-Complaints schema, and also map the Complaint type in the NYPD-Complaints schema to the Report type in the KCPD-Crime schema, because those mappings run in opposite directions between the same two schemas. To resolve this, you can create a schema that contains the preferred types from both schemas, configure it as a gateway schema, and then map types from the NYPD-Complaints schema and the KCPD-Crime schema to it.
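
As a rough check of this reasoning, the sketch below derives the mapping directions implied by the preferred types listed above. The dictionary layout, the function names, and the placeholder schema name "Merged" are assumptions made for illustration only; none of them comes from i2 Analyze.

```python
# Preferred item types from the list above: type name -> schema that owns it.
PREFERRED = {
    "Person": "NYPD-Complaints",
    "Report": "KCPD-Crime",
    "Location": "KCPD-Crime",
    "Located At": "NYPD-Complaints",
    "Suspect Of": "NYPD-Complaints",
    "Victim Of": "NYPD-Complaints",
}
SCHEMAS = ("NYPD-Complaints", "KCPD-Crime")


def direct_mapping_directions(preferred):
    """Directions (from_schema, to_schema) needed when mapping directly."""
    directions = set()
    for owner in preferred.values():
        # The non-preferred duplicate lives in the other schema.
        other = next(s for s in SCHEMAS if s != owner)
        directions.add((other, owner))
    return directions


def merged_mapping_directions(merged_name):
    """Directions needed when both schemas map to a third, merged schema."""
    return {(schema, merged_name) for schema in SCHEMAS}


def has_both_directions(directions):
    """True if any pair of schemas is mapped in both directions."""
    return any((b, a) in directions for (a, b) in directions)


print(has_both_directions(direct_mapping_directions(PREFERRED)))  # True
print(has_both_directions(merged_mapping_directions("Merged")))   # False
```

Because the preferred types are split across both schemas, the direct approach always needs both directions, which is what the merged schema avoids.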

Creating a schema

Use Schema Designer to create a schema that contains types identical to the preferred item types you chose from the NYPD-Complaints and KCPD-Crime schemas. An example containing the types listed above is provided as schema/nypd-kcpd-merged-schema.xml in the analyze-connect repository.

Configure the new schema as a gateway schema in your deployment. Choose an appropriate short name for this schema, such as "NYPD-KCPD-Merged".

Configuring item type mappings

Follow the process outlined in Configuring item type mappings to define mappings of types from the NYPD-Complaints and KCPD-Crime schemas to types in the NYPD-KCPD-Merged schema. Examples of specific mappings you might define are outlined below, but the general idea is to map each pair of duplicate types in NYPD-Complaints and KCPD-Crime to the corresponding type in the new merged schema.

Start by opening the i2 Analyze Server Admin Console.

Person

The preferred Person type listed above is the one defined in the NYPD-Complaints schema. This is duplicated as the Person type in the new NYPD-KCPD-Merged schema, so you can define mappings from the Person types in the NYPD-Complaints and KCPD-Crime schemas to the Person type in the NYPD-KCPD-Merged schema.

First, create a mapping from Person (NYPD-Complaints) to Person (NYPD-KCPD-Merged). Since these two types are identical, all the properties will have mappings generated automatically, as shown below. You can just click OK to confirm the mapping.


NYPD Person to Merged Person

Second, create a mapping from Person (KCPD-Crime) to Person (NYPD-KCPD-Merged) and map the properties as if you were mapping to the NYPD-Complaints Person type. Example property mappings you might define are shown below.


KCPD Person to Merged Person

Location

Similarly, the Location (NYPD-Complaints) and Location (KCPD-Crime) types can both be mapped to Location (NYPD-KCPD-Merged).

The Location (NYPD-KCPD-Merged) type is identical to Location (KCPD-Crime), since that is the preferred Location type. Start by mapping Location (KCPD-Crime) to Location (NYPD-KCPD-Merged), making use of the automatically-generated property mappings.

Then, map Location (NYPD-Complaints) to Location (NYPD-KCPD-Merged). Examples of property mappings you might define are shown below.


NYPD Location to Merged Location

Report

Following the same process, map the Complaint (NYPD-Complaints) and Report (KCPD-Crime) types to the Report (NYPD-KCPD-Merged) type. The Report type in the merged schema is identical to the Report type in the KCPD-Crime schema, so once again all the property mappings will be populated for you in that case.

The mapping of Complaint (NYPD-Complaints) to Report (NYPD-KCPD-Merged) might be defined as follows.


NYPD Complaint to Merged Report

Links

The Located At, Suspect Of, and Victim Of link types in both the NYPD-Complaints and KCPD-Crime schemas can be mapped to the corresponding link types in the NYPD-KCPD-Merged schema, which are identical to the link types in NYPD-Complaints.

When all mappings are defined, you should see that all types - except the Complicit In, Arrested, and Charged links from the KCPD-Crime schema - have been mapped to the NYPD-KCPD-Merged schema.
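
The expected outcome can be sketched as a simple lookup table. This is a model of the result only, not a format that the Admin Console reads or writes. It assumes the type display names are exactly as they appear in this walkthrough, and the `unmapped_types` helper is a hypothetical name.

```python
# Item types per source schema, using the names from this walkthrough.
SOURCE_TYPES = {
    "NYPD-Complaints": [
        "Person", "Complaint", "Location",
        "Located At", "Suspect Of", "Victim Of",
    ],
    "KCPD-Crime": [
        "Person", "Report", "Location",
        "Located At", "Suspect Of", "Victim Of",
        "Complicit In", "Arrested", "Charged",
    ],
}

# (source schema, source type) -> type in the NYPD-KCPD-Merged schema.
MAPPINGS = {
    ("NYPD-Complaints", "Person"): "Person",
    ("KCPD-Crime", "Person"): "Person",
    ("NYPD-Complaints", "Location"): "Location",
    ("KCPD-Crime", "Location"): "Location",
    ("NYPD-Complaints", "Complaint"): "Report",
    ("KCPD-Crime", "Report"): "Report",
    ("NYPD-Complaints", "Located At"): "Located At",
    ("KCPD-Crime", "Located At"): "Located At",
    ("NYPD-Complaints", "Suspect Of"): "Suspect Of",
    ("KCPD-Crime", "Suspect Of"): "Suspect Of",
    ("NYPD-Complaints", "Victim Of"): "Victim Of",
    ("KCPD-Crime", "Victim Of"): "Victim Of",
}


def unmapped_types(source_types, mappings):
    """List the (schema, type) pairs that have no mapping defined."""
    return [
        (schema, item_type)
        for schema, types in source_types.items()
        for item_type in types
        if (schema, item_type) not in mappings
    ]


for schema, item_type in unmapped_types(SOURCE_TYPES, MAPPINGS):
    print(f"{item_type} ({schema}) is not mapped")
```

Only the three link types that exist solely in the KCPD-Crime schema are reported as unmapped.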


Merged schemas mapping summary

To test the mappings before you apply them to the live server:

  1. Click Apply in the top-right. This applies the mappings to the test environment that is available only through the Admin Console. It does not apply the mappings to the live server.

  2. Click Preview services to open a preview of how the services would behave with the mappings you have configured.

  3. Go back and make any changes to the mappings, repeating steps 1 and 2 until you are satisfied with the configuration.

Applying the item type mappings to the i2 Analyze server

To apply the mapping configuration that you created to the i2 Analyze server for all users, see Applying the mapping configuration to the i2 Analyze server.

The result

By creating a new gateway schema and defining item type mappings to it, you have mitigated the problems caused by duplicate or similar item types in the NYPD-Complaints and KCPD-Crime schemas, without being limited to defining mappings in just one direction between them.