Example highlight queries

An installation and deployment of i2 Analyze provides two sets of example highlight queries that you can use alongside the documentation when you write your own queries.

The first set of example highlight queries is in the i2 Analyze deployment toolkit. The toolkit\examples\highlight-queries\highlight-queries-configuration.xml file contains an annotated set of highlight queries that are compatible with the example law enforcement schema. In an example deployment of i2 Analyze, these queries take the place of the automatically generated set.

The second set of example highlight queries is in the XML that you can fetch from a running i2 Analyze deployment. These automatically generated queries are compatible with the live Information Store schema.

A worked example

The following listing represents one of the more complicated highlight queries in the toolkit example:

<highlightQuery title="Money laundering?" automatic="false">
  <description>Organizations whose accounts transact with an account that also transacts with
    this person's accounts (and might therefore be an intermediary), sorted by value of the
    person's transactions.</description>
  <path>
    <segments>
      <segment>
        <Access_To-I/>
        <Account-I/>
      </segment>
      <segment>
        <Transaction-I>
          <conditions>
            <transaction_currency-P>
              <equalTo>
                <value>US Dollars</value>
              </equalTo>
            </transaction_currency-P>
          </conditions>
          <exportFields>
            <propertyField propertyType="transaction_value-P" aggregatingFunction="MAX"
              id="value"/>
            <propertyField propertyType="date_and_time-P" aggregatingFunction="MAX"
              id="dateTime"/>
            <countField id="transactioncount" />
          </exportFields>
        </Transaction-I>
        <Account-I/>
      </segment>
      <segment>
        <Transaction-I>
          <exportFields>
            <countField id="transactioncount2"/>
          </exportFields>
        </Transaction-I>
        <Account-I/>
      </segment>
      <segment>
        <AnyLinkType/>
        <Organization-I/>
      </segment>
    </segments>
    <outputs>
      <field label="# txns Per -- Inter" id="transactioncount"/>
      <field label="# txns Inter -- Org" id="transactioncount2"/>
      <field label="Largest txn value" id="value"/>
      <field label="Most recent txn date" id="dateTime"/>
    </outputs>
  </path>
  <sortBy>
    <field id="value" order="DESC"/>
  </sortBy>
</highlightQuery>

This highlight query is for subject records of type 'Person'. The results that it finds are of type 'Organization', because that is the entity type in the final segment. There are four segments in all, which has the potential to make the query resource-intensive. The enclosing <highlightQuery> element has its automatic attribute set to false so that the query only runs when a user requests it.

The first segment in the query finds all the 'Account' records in the Information Store to which the subject is connected through an 'Access To' link. There are no conditions on either part of the segment, and no values are exported for use elsewhere.
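
As a minimal sketch, the first segment could form a complete highlight query on its own. The fragment below reuses only elements that appear in the toolkit listing. The title, description, label, and the accesscount identifier are hypothetical, and it is an assumption that a <countField> can be exported from an 'Access To' link in the same way as from a 'Transaction' link. The fragment has not been validated against a running deployment.

```xml
<!-- Hypothetical one-segment query for subject records of type 'Person'. -->
<highlightQuery title="Accessible accounts" automatic="false">
  <description>Accounts to which this person has access.</description>
  <path>
    <segments>
      <segment>
        <!-- Link part of the segment: 'Access To' links from the subject. -->
        <Access_To-I>
          <exportFields>
            <!-- Assumed: counts the links found at this location in the path. -->
            <countField id="accesscount"/>
          </exportFields>
        </Access_To-I>
        <!-- Entity part of the segment: the results are 'Account' records. -->
        <Account-I/>
      </segment>
    </segments>
    <outputs>
      <field label="# access links" id="accesscount"/>
    </outputs>
  </path>
</highlightQuery>
```

Because this path has a single segment, its results would be of type 'Account' rather than 'Organization'.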

The second segment takes all the accounts to which the subject has access, and finds accounts with which they have exchanged transactions. However, it does not find all such accounts, because of the condition on the link type:
          <conditions>
            <transaction_currency-P>
              <equalTo>
                <value>US Dollars</value>
              </equalTo>
            </transaction_currency-P>
          </conditions>
The condition restricts the accounts that this segment finds to those where transactions have taken place in US dollars. ("US Dollars" is configured in the Information Store schema as a possible value for properties of this type.) The next part of the segment then exports information about these dollar transactions for later use:
          <exportFields>
            <propertyField propertyType="transaction_value-P" aggregatingFunction="MAX"
              id="value"/>
            <propertyField propertyType="date_and_time-P" aggregatingFunction="MAX"
              id="dateTime"/>
            <countField id="transactioncount" />
          </exportFields>

For each of the eventual results of the highlight query, these lines record the value of the largest transaction at this location in the path, and the date of the most recent transaction. These values might come from different transactions if the count, which is also exported here, is greater than one.

The inputs to the third segment, then, are all the accounts that have transacted in dollars with accounts to which the subject has access. The third segment goes on to find any accounts in the Information Store that have exchanged transactions with the inputs. It also exports the count of transactions at this location in the path:
          <exportFields>
            <countField id="transactioncount2"/>
          </exportFields>
The final segment takes these accounts that are twice-removed from accounts to which the subject has access, and finds organizations in the Information Store that are connected to them in any way. To do so, it makes use of an <AnyLinkType> element:
      <segment>
        <AnyLinkType/>
        <Organization-I/>
      </segment>
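
For comparison, the same final segment can be combined with only the first segment to give a much shorter path. The sketch below is a hypothetical variation, not part of the toolkit example. It would sit inside the <path> element of a query for 'Person' subjects, and would find organizations that are linked in any way to accounts to which the subject has access.

```xml
<!-- Hypothetical two-segment path: Person to Account to Organization. -->
<segments>
  <segment>
    <!-- Accounts to which the subject has access. -->
    <Access_To-I/>
    <Account-I/>
  </segment>
  <segment>
    <!-- Organizations connected to those accounts by a link of any type. -->
    <AnyLinkType/>
    <Organization-I/>
  </segment>
</segments>
```

With two segments instead of four, a path like this is likely to be less resource-intensive, at the cost of missing any organizations that are connected only through intermediary accounts.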

It is worth recounting what this means from an investigative point of view. For an organization to be found by this query, it must be linked to an account that has transacted with a second account, which in turn has exchanged dollar transactions with an account to which the subject has access. In simpler terms, it might be that accounts belonging to the person and the found organizations are exchanging money through third accounts. The query is certainly not conclusive, but the results might become targets for further investigation.

When the results of the query are presented to users, they include the values that were exported from the second and third segments:
    <outputs>
      <field label="# txns Per -- Inter" id="transactioncount"/>
      <field label="# txns Inter -- Org" id="transactioncount2"/>
      <field label="Largest txn value" id="value"/>
      <field label="Most recent txn date" id="dateTime"/>
    </outputs>

These lines show the challenges of displaying useful information in the relatively confined space of a highlight pane. In fact, only the first two fields appear in the pane; the others are displayed when the user clicks Show more, which displays more results and more property values from those results. Ideally, the labels work in harmony with the <description> of the query that you write to explain the results to your users.
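
Because only the first two fields appear in the pane, the order of the <field> elements matters. As a sketch, and assuming that the pane takes the fields in document order, the same outputs could be rearranged so that the transaction value and date are visible without clicking Show more:

```xml
<!-- Same four fields as the toolkit example, reordered. -->
<outputs>
  <!-- Assumed to appear in the highlight pane (first two fields). -->
  <field label="Largest txn value" id="value"/>
  <field label="Most recent txn date" id="dateTime"/>
  <!-- Assumed to appear only after the user clicks Show more. -->
  <field label="# txns Per -- Inter" id="transactioncount"/>
  <field label="# txns Inter -- Org" id="transactioncount2"/>
</outputs>
```

Each id must still match an identifier that was exported from a segment earlier in the path.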

The final part of the highlight query specifies how the application should sort the results. In general, you can request multiple sorts here that are applied in sequence. In this instance, the single criterion is the highest transaction value from the second segment. Setting the order to DESC means that numbers run from high to low, which is the more common requirement for numeric values. For text values, the opposite is usually true: setting order to ASC places the results in alphabetical sequence.
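
As an illustration of multiple sorts, the following sketch adds a second criterion to the example. It assumes that each sort is expressed as a further <field> element inside <sortBy>, and that the elements are applied in the order in which they appear; neither detail is confirmed by the listing above, which contains only one criterion.

```xml
<sortBy>
  <!-- First criterion: largest transaction value, from high to low. -->
  <field id="value" order="DESC"/>
  <!-- Assumed second criterion: most recent transaction date first. -->
  <field id="dateTime" order="DESC"/>
</sortBy>
```

Both identifiers are the ones exported from the second segment of the toolkit example.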