If your deployment includes logic that extracts, transforms, and loads data on a different server from the i2® Analyze application or the Information Store, consider deploying the ETL toolkit. The ETL logic can then run ETL toolkit commands to automate loading and ingesting data into the Information Store.
About this task
In an i2 Analyze deployment that uses data from an external source, the ETL logic is the processing that transforms source data for loading into the Information Store staging tables. In mature deployments, it is common for the ETL process to be automated so that loading and ingesting data happen in sequence, on a schedule.
When your ETL logic is colocated with the standard i2 Analyze deployment toolkit, the logic can use that toolkit to drive the ingestion process automatically. When those components are on separate servers, you can deploy the ETL toolkit to the server that hosts the ETL logic. The ETL toolkit provides the ingestion functions of the deployment toolkit in a stand-alone package.
Procedure
The ETL toolkit must be able to communicate with the Information Store with all the same credentials as the deployment toolkit. To enable this behavior, you use the deployment toolkit to create the ETL toolkit, and then copy it to the ETL logic server.
- On the server that has the deployment toolkit, open a command prompt and navigate to the toolkit\scripts directory.
- Run the createEtlToolkit command to generate the ETL toolkit:
  setup -t createEtlToolkit -p outputPath=output_path
  This command creates the ETL toolkit in a directory that is named etltoolkit in the output path that you specify.
- Copy the ETL toolkit to the server that hosts the ETL logic.
If the ETL logic and toolkit are on the same server as the database management system that hosts the Information Store, you do not need to modify the connection configuration. If the database management system is on a different server, then you must ensure that the ETL toolkit can communicate with the remote database.
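If your ETL logic is scripted, the generate-and-copy steps above can be driven from code. The following is a minimal sketch rather than a supported interface: the function names, the `setup_script` parameter (assumed to be the absolute path to the setup script in toolkit\scripts), and the `destination` share or mount are all hypothetical. Only the `setup -t createEtlToolkit -p outputPath=...` arguments and the etltoolkit directory name come from the procedure.

```python
import shutil
import subprocess
from pathlib import Path


def create_etl_toolkit_command(setup_script: str, output_path: str) -> list[str]:
    """Argument list for: setup -t createEtlToolkit -p outputPath=output_path."""
    return [setup_script, "-t", "createEtlToolkit", "-p", f"outputPath={output_path}"]


def generated_toolkit_dir(output_path: str) -> Path:
    """The task writes the toolkit to a directory named etltoolkit in the output path."""
    return Path(output_path) / "etltoolkit"


def create_and_copy(setup_script: str, output_path: str, destination: str) -> Path:
    """Generate the ETL toolkit, then copy it to `destination`.

    Assumptions: `setup_script` is the absolute path to the setup script in
    toolkit\\scripts, and `destination` is a directory that does not exist yet
    on a share or mount that the ETL logic server can read.
    """
    scripts_dir = str(Path(setup_script).parent)
    subprocess.run(
        create_etl_toolkit_command(setup_script, output_path),
        cwd=scripts_dir,  # run from the scripts directory, as in the manual step
        check=True,       # raise CalledProcessError if the task fails
    )
    return Path(shutil.copytree(generated_toolkit_dir(output_path), destination))
```

Keeping the command builder separate from the function that runs it makes it possible to log or review the exact command before anything is executed.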
- Depending on your database management system, install the PostgreSQL client tools, the Microsoft™ Command Line Utilities for SQL Server, or the Db2® client software on the server that hosts the ETL toolkit.
- Navigate to the classes directory of the ETL toolkit and open the Connection.properties file in a text editor.
- Ensure that the value of the db.installation.dir setting is the path to the PostgreSQL client tools, the Microsoft Command Line Utilities for SQL Server, or the Db2 client on the server that hosts this ETL toolkit. For example:
  db.installation.dir=C:/Program Files/IBM/SQLLIB
- If you are using Db2 to host the Information Store, you must catalog the remote Db2 database. Run the following commands to enable the ETL toolkit to communicate with the Information Store:
  db2 catalog tcpip node node-name remote host-name server port-number
  db2 catalog database instance-name at node node-name
  Here, host-name, port-number, and instance-name are the values that are specified in the topology.xml file. node-name can be any value that you choose, but you must use the same value in both commands.
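Because the node name must match in both catalog commands, it can help to generate the pair from a single set of values. The sketch below is illustrative only: the function name and the sample values in the usage lines are hypothetical, and in practice the host name, port number, and instance name would be read from your topology.xml file. The `remote` keyword before the host name follows the Db2 CATALOG TCPIP NODE syntax.

```python
def db2_catalog_commands(node_name: str, host_name: str,
                         port_number: int, instance_name: str) -> list[str]:
    """Build the two Db2 catalog commands with one shared node name."""
    return [
        # Register the remote server as a TCP/IP node.
        f"db2 catalog tcpip node {node_name} remote {host_name} server {port_number}",
        # Catalog the database against the node that was just registered.
        f"db2 catalog database {instance_name} at node {node_name}",
    ]


# Hypothetical values for illustration only.
for command in db2_catalog_commands("i2node", "dbhost.example.com", 50000, "ISTORE"):
    print(command)
```

Passing the node name once removes the risk of a typing mismatch between the two commands.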
If the database management system that hosts the Information Store is not using SSL, then the process is complete.
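Before you rely on the connection configuration, a quick check that db.installation.dir points to a directory that exists can catch a mistyped path early. The following sketch makes several assumptions: the helper names are hypothetical, and the parser handles only plain key=value lines and comment lines, not the full Java properties syntax (escapes, line continuations, or `:` separators).

```python
from pathlib import Path
from typing import Optional


def read_property(text: str, key: str) -> Optional[str]:
    """Return the value of `key` from simple key=value properties text, or None."""
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and properties-style comments.
        if not stripped or stripped.startswith(("#", "!")):
            continue
        name, separator, value = stripped.partition("=")
        if separator and name.strip() == key:
            return value.strip()
    return None


def client_dir_is_present(connection_properties: Path) -> bool:
    """True if db.installation.dir is set and names an existing directory."""
    # latin-1 decodes any byte sequence, so reading never fails on encoding.
    text = connection_properties.read_text(encoding="latin-1")
    value = read_property(text, "db.installation.dir")
    return value is not None and Path(value).is_dir()
```

For example, `client_dir_is_present(Path("etltoolkit/classes/Connection.properties"))` would be run on the server that hosts the ETL toolkit, where the relative path shown is an assumption about where you copied the toolkit.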
If the database management system is configured to use SSL, you must also enable the ETL toolkit to communicate by using SSL. The detail of this process depends on your choice of database management system.
- If you're using PostgreSQL, take the i2-database_management_system-certificate.cer certificate that you exported from the database management system when you configured SSL, and store it in a convenient location on the server that hosts the ETL toolkit.
- If you're using SQL Server or Db2:
  - On the server that hosts the ETL toolkit, register the i2-database_management_system-certificate.der certificate that you exported from the database management system when you configured SSL.
  - Create a truststore and import into the truststore the certificate that you exported from the database management system when you configured SSL. For example, run the following command:
    keytool -importcert -alias "dbKey" -file C:\i2\etltoolkit\i2-database_management_system-certificate.der -keystore "C:\i2\etltoolkit\i2-etl-truststore.jks" -storepass "password"
    Enter yes in response to the query, Trust this certificate?
- Navigate to the classes directory of the ETL toolkit and open the TrustStore.properties file in a text editor.
- If you're using PostgreSQL, populate the DBTrustStoreLocation property with the full path to the certificate that you stored earlier.
- If you're using SQL Server or Db2:
  - Populate the DBTrustStoreLocation and DBTrustStorePassword properties with the full path to the truststore that you created, and the password that is required to access it. For example:
    DBTrustStoreLocation=C:/i2/etltoolkit/i2-etl-truststore.jks
    DBTrustStorePassword=password
  - You can use the Liberty profile securityUtility command to encode the password for the truststore.
    - Navigate to the bin directory of the Open Liberty deployment that was configured by the deployment toolkit.
    - In a command prompt, run securityUtility encode password, which generates and displays the encoded password. Use the entire value, including the {xor} prefix, for the DBTrustStorePassword property value. For more information about using the security utility, see securityUtility encode.
Results
The ETL toolkit is ready for use by your ETL logic to modify the Information Store. At key points in the processes of preparing for and performing ingestion, you can use commands in the ETL toolkit in place of deployment toolkit functions.