Match rules syntax
A match rules file is an XML document whose structure is validated when i2 Analyze starts. The match rules syntax is the same for both system match and Find Matching Records match rules files.
Root element: matchRules
The root element of a match rules file is a <matchRules> element from the defined namespace. For example:
<tns:matchRules
xmlns:tns="http://www.i2group.com/Schemas/2019-05-14/MatchRules"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation=
"http://www.i2group.com/Schemas/2019-05-14/MatchRules MatchRules.xsd"
version="2"
enableSourceIdentifierMatching="true">
...
</tns:matchRules>
The <matchRules> element has the following customizable attributes:
Attribute | Description |
---|---|
enableSourceIdentifierMatching | For a deployment that contains the Information Store and the i2 Connect gateway, controls whether system matching uses source identifiers to determine whether records match each other, regardless of whether they have property matches. When this attribute is |
version | The version of the match rules file, which must be 2 at this release. |
matchRule
Inside the root element, each <matchRule> element defines a match rule for records of a particular entity or link type. The <matchRule> element has the following attributes:
Attribute | Description |
---|---|
id | An identifier for the match rule, which must be unique within the match rules file. |
itemTypeId | The identifier for the entity or link type to which the rule applies, as defined in the i2 Analyze schema.
Important: For item types that are defined in gateway or connector schemas, i2 Analyze appends the schema short name to the item type identifier. For example, if the gateway schema defines an item type with the identifier ET5 , then the identifier to use here might be ET5-external .
In the modified type identifier, the short name is always in lower-case letters and separated from the original identifier with a hyphen. Any whitespace or non-alphanumeric characters in the short name are converted to single hyphens. When you create or edit match rules through Analyst's Notebook Premium, the application handles these modifications to item type identifiers for you. When you edit the XML file yourself, you are responsible for specifying item type identifiers correctly. |
displayName | The name of the rule, which is displayed to analysts in Analyst's Notebook Premium in Find Matching Records. |
description | The description of the rule, which is displayed to analysts in Analyst's Notebook Premium in Find Matching Records. |
active | Defines whether the rule is active. A value of true means that the rule is active; a value of false means that the rule is not active. |
linkDirectionOperator | Determines whether two links must have the same direction in order to match. Mandatory for link type rules, where it must have the value EXACT_MATCH or ANY . Must be absent or null for entity type rules. |
version | In earlier versions, the version attribute was mandatory on <matchRule> elements. At this release, the per-rule version is optional, and any value is ignored. |
For example, an entity type rule:
<matchRule
id="4a2b9baa-e3c4-4840-a9fd-d204711af50e"
itemTypeId="ET3"
displayName="Match vehicles"
description="Match vehicles with the same license plate when
either the registered state or region are the same."
active="true">
...
</matchRule>
And a link type rule:
<matchRule
id="8aa5f6f4-a1a8-41de-b5c5-8701e44bcde7"
itemTypeId="LAS1"
displayName="Match duplicate links"
description="Match link records between the same pair of entity
records when the links are in the same direction."
active="true"
linkDirectionOperator="EXACT_MATCH">
...
</matchRule>
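As a sketch of the identifier rule for item types from gateway or connector schemas, the following fragment assumes a hypothetical connector schema whose short name is "Crime Data" and that defines an item type with the identifier ET5. Applying the rule from the table (lower-case letters, whitespace converted to a single hyphen, joined to the original identifier with a hyphen) would give ET5-crime-data. The rule id, display name, and short name are invented for illustration, and the rule contents are elided in the same way as in the examples above.

```xml
<!-- Hypothetical rule for an item type that a connector schema defines.
     Assumed schema short name: "Crime Data"; assumed original type ID: ET5. -->
<matchRule
    id="00000000-0000-0000-0000-000000000001"
    itemTypeId="ET5-crime-data"
    displayName="Match connector records"
    description="Illustrative rule for an item type from a connector schema."
    active="true">
  ...
</matchRule>
```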
matchAll and matchAny
To specify the behavior of a match rule, you can use the following children of the <matchRule> element:
matchAll
- The <matchAll> element specifies that all of the conditions within it must be met.
matchAny
- The <matchAny> element specifies that at least one of the conditions within it must be met.
The <matchRule> element must have both the <matchAll> and <matchAny> elements. For example:
<matchRule ... >
<matchAll>
...
</matchAll>
<matchAny />
</matchRule>
condition
All match rules must contain the <matchAll> and <matchAny> elements, although both can be empty for link type rules. It is valid to create a rule that makes all link records of the same type between the same pair of entity records match each other, regardless of any other considerations.
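A link type rule with no conditions might look like the following sketch. The id, display name, and description are placeholders, and the sketch assumes that an empty <matchAll /> element is written in the same way as the empty <matchAny /> element shown earlier. The linkDirectionOperator value of ANY is chosen here so that direction is not considered either.

```xml
<!-- Illustrative only: matches every LAS1 link between the same pair of
     entity records, because neither matchAll nor matchAny has conditions. -->
<matchRule
    id="00000000-0000-0000-0000-000000000002"
    itemTypeId="LAS1"
    displayName="Match all links of this type"
    description="Match link records between the same pair of entity records."
    active="true"
    linkDirectionOperator="ANY">
  <matchAll />
  <matchAny />
</matchRule>
```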
All entity type rules must contain at least one condition. Many link type rules contain conditions too. Each condition defines a comparison that takes place between values in different records, and specifies when those values are considered to match. Conditions can be refined by using operators, values, and normalizations.
To specify the conditions of a match rule, you use the <condition> element, which can be a child of the <matchAll> and <matchAny> elements. Each <condition> element has a mandatory propertyTypeId attribute, which is the identifier for the property type to which the condition applies, as defined in the i2 Analyze schema.
<condition propertyTypeId="VEH2">
...
</condition>
All conditions contain an <operator> element, most of them contain a <value> element, and many contain a <normalizations> element.
operator
- The <operator> element defines the type of comparison between the property values in different records, or between the property value and a static value specified within the rule. The possible operators are:
Operator | Description |
---|---|
EXACT_MATCH | The values that are compared must match each other exactly. |
EXACT_MATCH_START | A specified number of characters at the start of string values must match each other exactly. |
EXACT_MATCH_END | A specified number of characters at the end of string values must match each other exactly. |
EQUAL_TO | The property values must match each other, and the specified <value>. |
For example:
<condition propertyTypeId="VEH2">
<operator>EXACT_MATCH</operator>
...
</condition>
For more information about the operators that you can use, depending on the logical type of the property, see Table 1.
value
- The contents of the <value> element affect the behavior of the <operator> of the condition. Different operators require different value types.
  - If the operator is EXACT_MATCH_START or EXACT_MATCH_END, the value is an integer that specifies the number of characters to compare at the start or end of the property value:
<operator>EXACT_MATCH_START</operator>
<value xsi:type="xsd:int">3</value>
  - If the operator is EQUAL_TO, the value is a string to compare with the property value:
<operator>EQUAL_TO</operator>
<value xsi:type="xsd:string">red</value>
  - If the operator is EXACT_MATCH, it is not valid to specify a <value> element.
<operator>EXACT_MATCH</operator>
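To show how an operator and its value sit inside complete conditions, here is a minimal sketch. VEH2 is the property type used elsewhere on this page; VEH5 is a hypothetical string property type (for example, a vehicle color) that is invented for this illustration. The xsi and xsd prefixes in the xsi:type attributes resolve only because the root <matchRules> element declares those namespaces, as in the root element example.

```xml
<matchAll>
  <!-- The first three characters of the VEH2 values must be the same. -->
  <condition propertyTypeId="VEH2">
    <operator>EXACT_MATCH_START</operator>
    <value xsi:type="xsd:int">3</value>
  </condition>
  <!-- VEH5 is a hypothetical property type. The values must match
       each other and the static value "red". -->
  <condition propertyTypeId="VEH5">
    <operator>EQUAL_TO</operator>
    <value xsi:type="xsd:string">red</value>
  </condition>
</matchAll>
```

Because both conditions are children of <matchAll>, records would match under this sketch only when both comparisons succeed.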
normalizations
- The <normalizations> element contains child <normalization> elements that define how property values are compared with each other (and sometimes with the contents of the <value> element). The possible values for the <normalization> element are:
Normalization | Description |
---|---|
IGNORE_CASE | Ignores case during the comparison ('a' matches 'A') |
IGNORE_DIACRITICS | Ignores diacritic marks on characters ('Ã' matches 'A') |
IGNORE_WHITESPACE_BETWEEN | Ignores whitespace between characters ('a a' matches 'aa') |
IGNORE_WHITESPACE_AROUND | Ignores whitespace around a string (' a ' matches 'a') |
IGNORE_NUMERIC | Ignores numeric characters ('a50' matches 'a') |
IGNORE_ALPHABETIC | Ignores alphabetic characters ('a50' matches '50') |
IGNORE_NONALPHANUMERIC | Ignores non-alphanumeric characters ('a-a' matches 'aa') |
SIMPLIFY_LIGATURES | Simplifies ligatures ('æ' matches 'ae') |
For example, you might have the following normalizations for an EXACT_MATCH operator:
<condition ... >
<operator>EXACT_MATCH</operator>
<normalizations>
<normalization>IGNORE_CASE</normalization>
<normalization>IGNORE_NONALPHANUMERIC</normalization>
<normalization>IGNORE_WHITESPACE_BETWEEN</normalization>
</normalizations>
</condition>
In this example, the values "b m w xdrive" and "BMW x-drive" are considered a match.
Table 1. Operators and normalizations that you can use for each schema logical type
Schema logical type | Operators | Normalizations |
---|---|---|
SINGLE_LINE_STRING | All | All |
SELECTED_FROM | All | All |
SUGGESTED_FROM | All | All |
BOOLEAN | EXACT_MATCH | None |
INTEGER | EXACT_MATCH | None |
DECIMAL | EXACT_MATCH | None |
DOUBLE | EXACT_MATCH | None |
DATE_AND_TIME | EXACT_MATCH | None |
DATE | EXACT_MATCH | None |
TIME | EXACT_MATCH | None |
The following logical types are not listed in Table 1:
- GEOSPATIAL
- MULTIPLE_LINE_STRING
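As an illustration of the table, a condition on a property with a non-string logical type would use EXACT_MATCH with no <value> element and, in this sketch, no <normalizations> element. PER9 is a hypothetical property type that is assumed to have the DATE logical type; it does not come from this page.

```xml
<!-- PER9 is a hypothetical DATE property type (for example, a date of birth).
     Per Table 1, only EXACT_MATCH applies and no normalizations are available. -->
<condition propertyTypeId="PER9">
  <operator>EXACT_MATCH</operator>
</condition>
```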
The following XML is an example of a match rules file that contains a single entity match rule. The rule matches vehicle records that have the same values for the license plate property, and the same values for either the state or region properties.
<tns:matchRules
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation=
"http://www.i2group.com/Schemas/2019-05-14/MatchRules MatchRules.xsd"
version="2"
xmlns:tns="http://www.i2group.com/Schemas/2019-05-14/MatchRules">
<matchRule id="4a2b9baa-e3c4-4840-a9fd-d204711af50e"
itemTypeId="ET3"
displayName="Match vehicles"
description="Match vehicles with the same license plate,
when either the registered state or
region are the same."
active="true">
<matchAll>
<condition propertyTypeId="VEH2">
<operator>EXACT_MATCH</operator>
<normalizations>
<normalization>IGNORE_WHITESPACE_BETWEEN</normalization>
</normalizations>
</condition>
</matchAll>
<matchAny>
<condition propertyTypeId="VEH16">
<operator>EXACT_MATCH</operator>
<normalizations>
<normalization>IGNORE_CASE</normalization>
<normalization>IGNORE_DIACRITICS</normalization>
<normalization>IGNORE_WHITESPACE_BETWEEN</normalization>
<normalization>IGNORE_NONALPHANUMERIC</normalization>
</normalizations>
</condition>
<condition propertyTypeId="VEH15">
<operator>EXACT_MATCH</operator>
<normalizations>
<normalization>IGNORE_CASE</normalization>
<normalization>IGNORE_DIACRITICS</normalization>
<normalization>IGNORE_WHITESPACE_BETWEEN</normalization>
<normalization>IGNORE_NONALPHANUMERIC</normalization>
</normalizations>
</condition>
</matchAny>
</matchRule>
</tns:matchRules>
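For comparison, a match rules file that contains a single link type rule might look like the following sketch. It reuses the LAS1 item type from the earlier link example. The rule id, the display name, and the LNK1 property type (assumed to be a SINGLE_LINE_STRING property of LAS1) are invented for illustration and are not taken from any real schema.

```xml
<tns:matchRules
    xmlns:tns="http://www.i2group.com/Schemas/2019-05-14/MatchRules"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xsi:schemaLocation=
      "http://www.i2group.com/Schemas/2019-05-14/MatchRules MatchRules.xsd"
    version="2">
  <!-- Link type rules must specify linkDirectionOperator. -->
  <matchRule id="00000000-0000-0000-0000-000000000003"
             itemTypeId="LAS1"
             displayName="Match links with the same reference"
             description="Match links in either direction that have the same reference."
             active="true"
             linkDirectionOperator="ANY">
    <matchAll>
      <!-- LNK1 is a hypothetical string property type of LAS1. -->
      <condition propertyTypeId="LNK1">
        <operator>EXACT_MATCH</operator>
        <normalizations>
          <normalization>IGNORE_CASE</normalization>
          <normalization>IGNORE_WHITESPACE_AROUND</normalization>
        </normalizations>
      </condition>
    </matchAll>
    <matchAny />
  </matchRule>
</tns:matchRules>
```

The <matchAny /> element stays in place even though it is empty, because every rule must contain both elements.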