SDK Usage Guide
This guide covers the most common use cases for the TextChart SDK with practical code examples.
Overview
The TextChart SDK provides a straightforward API for text processing. The typical workflow involves:
Initialize the Rosoka instance
Process a document or string
Extract entities from results
Access relationships and metadata
Basic Setup
Adding Dependencies
If you're using Maven, add the following to your pom.xml:
<dependency>
    <groupId>com.rosoka</groupId>
    <artifactId>RosokaAPI</artifactId>
    <version>10.1.2</version>
</dependency>
<dependency>
    <groupId>com.rosoka</groupId>
    <artifactId>RosokaJAXB</artifactId>
    <version>10.1.2</version>
</dependency>
Initializing Rosoka
The simplest way to initialize the TextChart SDK is to use the default ROSOKA_HOME location:
import com.rosoka.RosokaAPI.Rosoka;
import com.rosoka.JAXB.RosokaFullObject;
import java.io.FileNotFoundException;
// RosokaLicenseException and RosokaConfigurationException also need imports;
// their package is not shown in this guide.
try {
    // Initialize with the default ROSOKA_HOME
    Rosoka rosoka = Rosoka.getRosokaInstance();
    // Or specify a custom path instead (use one form or the other)
    // Rosoka rosoka = Rosoka.getRosokaInstance("/path/to/rosoka/home");
} catch (FileNotFoundException | RosokaLicenseException | RosokaConfigurationException e) {
    System.err.println("Failed to initialize Rosoka: " + e.getMessage());
    e.printStackTrace();
}
Processing Text
Process a String
Extract entities from a text string:
import com.rosoka.RosokaAPI.Rosoka;
import com.rosoka.JAXB.RosokaFullObject;
import com.rosoka.JAXB.Entitylist;
import com.rosoka.JAXB.Entity;
String inputText = "John Smith works at Acme Corporation in New York. " +
    "He can be reached at john.smith@acme.com or (555) 123-4567.";
try {
    // getRosokaInstance() can throw checked exceptions, so call it inside the try
    Rosoka rosoka = Rosoka.getRosokaInstance();
    // Process the string and get the full result object
    RosokaFullObject result = rosoka.processStringRosokaFullObject(inputText);
    // Get the extracted entities
    Entitylist entities = result.getEntities();
    // Iterate through the extracted entities
    for (Entity entity : entities.getEntity()) {
        System.out.println("Entity: " + entity.getValue());
        System.out.println("Type: " + entity.getEntitytype());
        System.out.println("Confidence: " + entity.getConfidence());
        System.out.println("---");
    }
} catch (Exception e) {
    e.printStackTrace();
}
Process a File
Extract entities from a file:
import com.rosoka.RosokaAPI.Rosoka;
import com.rosoka.JAXB.RosokaFullObject;
import java.io.File;
try {
    // getRosokaInstance() can throw checked exceptions, so call it inside the try
    Rosoka rosoka = Rosoka.getRosokaInstance();
    // Process file (supports 30+ formats)
    File inputFile = new File("document.pdf");
    RosokaFullObject result = rosoka.processFileRosokaFullObject(inputFile);
    // Process results...
} catch (Exception e) {
    e.printStackTrace();
}
Process Multiple Files
Process every regular file under a directory with a single Rosoka instance:
import com.rosoka.RosokaAPI.Rosoka;
import com.rosoka.JAXB.RosokaFullObject;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;
Path directory = Paths.get("documents/");
try {
    Rosoka rosoka = Rosoka.getRosokaInstance();
    // Files.walk keeps directory handles open; try-with-resources closes them
    try (Stream<Path> paths = Files.walk(directory)) {
        paths.filter(Files::isRegularFile)
            .forEach(filePath -> {
                try {
                    File file = filePath.toFile();
                    RosokaFullObject result = rosoka.processFileRosokaFullObject(file);
                    System.out.println("Processed: " + file.getName());
                    System.out.println("Entities found: " +
                        result.getEntities().getEntity().size());
                } catch (Exception e) {
                    // Report the failure and continue with the next file
                    System.err.println("Error processing " + filePath + ": " + e.getMessage());
                }
            });
    }
} catch (Exception e) {
    e.printStackTrace();
}
Working with Results
Accessing Entity Information
Entities contain detailed information about extracted values:
import com.rosoka.JAXB.Entity;
import com.rosoka.JAXB.atomic.KeyValue;
import java.util.List;
Entity entity = /* ... from results ... */;
// Basic properties
String value = entity.getValue(); // The extracted text
String type = entity.getEntitytype(); // Entity type (PERSON, ORG, etc.)
String confidence = entity.getConfidence(); // Confidence score
String offset = entity.getOffset(); // Position in original text
// Attributes (type-specific properties)
List<KeyValue> attributes = entity.getAttribute();
for (KeyValue attr : attributes) {
    System.out.println(attr.getKey() + ": " + attr.getValue());
}
Getting XML Output
Get the raw XML representation of results:
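import com.rosoka.RosokaAPI.Rosoka;
// RosokaObject also needs an import; its package is not shown in this guide.
String inputText = /* ... text to process ... */;
try {
    // getRosokaInstance() can throw checked exceptions, so call it inside the try
    Rosoka rosoka = Rosoka.getRosokaInstance();
    RosokaObject result = rosoka.processString(inputText);
    // Get the full result as an XML string
    String xmlString = rosoka.getPSOAsXMLstring(result);
    System.out.println(xmlString);
    // Get only the entities as XML
    String entityXml = rosoka.getEntitiesAsXMLstring(result);
    System.out.println(entityXml);
} catch (Exception e) {
    e.printStackTrace();
}
Getting JSON Output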
Convert results to JSON format:
import com.fasterxml.jackson.databind.ObjectMapper;
import com.rosoka.JAXB.RosokaFullObject;
RosokaFullObject result = /* ... from processing ... */;
try {
    ObjectMapper mapper = new ObjectMapper();
    String jsonString = mapper.writeValueAsString(result);
    System.out.println(jsonString);
} catch (Exception e) {
    e.printStackTrace();
}
Entity Types
The TextChart SDK can extract many entity types. Common types include:
PERSON: Names of people (John Smith, Mary Johnson)
ORGANIZATION: Company names (Acme Corp, Google Inc.)
LOCATION: Geographic locations (New York, United States)
EMAIL: Email addresses (user@example.com)
PHONE: Phone numbers ((555) 123-4567)
URL: Web addresses (https://example.com)
DATE: Dates (January 15, 2024)
CURRENCY: Money amounts ($1,000, €500)
PERCENT: Percentages (25%, 3.14%)
IDENTIFIER: ID numbers, SSN, etc.
For a complete list, see Entity Types Reference.
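Once entities are extracted, a common next step is to bucket them by type and drop low-confidence hits. The sketch below illustrates that idea without calling the SDK. ExtractedEntity is a hypothetical stand-in that holds the strings returned by getValue(), getEntitytype(), and getConfidence() in the earlier examples, and the sample values are made up. It also assumes the confidence string parses as a decimal number, which this guide does not confirm.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EntityGrouping {

    /** Hypothetical stand-in for the SDK's Entity; holds plain strings only. */
    public static final class ExtractedEntity {
        final String value;
        final String type;
        final String confidence;

        public ExtractedEntity(String value, String type, String confidence) {
            this.value = value;
            this.type = type;
            this.confidence = confidence;
        }
    }

    /**
     * Groups entity values by type, keeping only entities whose confidence
     * is at least minConfidence. Entities with a missing or non-numeric
     * confidence are skipped (an assumption made for this sketch).
     */
    public static Map<String, List<String>> groupByType(
            List<ExtractedEntity> entities, double minConfidence) {
        // LinkedHashMap keeps types in the order they were first seen
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (ExtractedEntity entity : entities) {
            double score;
            try {
                score = Double.parseDouble(entity.confidence);
            } catch (NumberFormatException | NullPointerException e) {
                continue; // confidence is not usable as a number
            }
            if (score >= minConfidence) {
                grouped.computeIfAbsent(entity.type, k -> new ArrayList<>())
                       .add(entity.value);
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Illustrative values only, not real SDK output
        List<ExtractedEntity> sample = List.of(
            new ExtractedEntity("John Smith", "PERSON", "0.95"),
            new ExtractedEntity("Acme Corporation", "ORGANIZATION", "0.90"),
            new ExtractedEntity("New York", "LOCATION", "0.40"),
            new ExtractedEntity("Mary Johnson", "PERSON", "n/a"));
        System.out.println(groupByType(sample, 0.5));
    }
}
```

With real results, each ExtractedEntity would be built from an Entity in the list returned by getEntities().getEntity(). The 0.5 threshold is arbitrary and should be tuned against your own data.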
Working with Relationships
Extract and analyze relationships between entities:
import com.rosoka.JAXB.RosokaFullObject;
import com.rosoka.JAXB.Relationshiplist;
import com.rosoka.JAXB.Relationship;
RosokaFullObject result = /* ... from processing ... */;
Relationshiplist relationships = result.getRelationshiplist();
if (relationships != null) {
    for (Relationship rel : relationships.getRelationship()) {
        System.out.println("From: " + rel.getFromvalue());
        System.out.println("Type: " + rel.getRelationshiptype());
        System.out.println("To: " + rel.getTovalue());
        System.out.println("Confidence: " + rel.getConfidence());
        System.out.println("---");
    }
}
Error Handling
The SDK can throw several exception types:
import com.rosoka.RosokaAPI.Rosoka;
import com.rosoka.RosokaAPI.RosokaException;
import com.rosoka.EntityExtraction.genericrules.RosokaRuleParserException;
import com.rosoka.JAXB.RosokaFullObject;
import java.io.FileNotFoundException;
// RosokaLicenseException and RosokaConfigurationException also need imports;
// their package is not shown in this guide.
String text = /* ... text to process ... */;
try {
    Rosoka rosoka = Rosoka.getRosokaInstance();
    RosokaFullObject result = rosoka.processStringRosokaFullObject(text);
} catch (FileNotFoundException e) {
    // Configuration or license file not found
    System.err.println("Setup error: " + e.getMessage());
} catch (RosokaLicenseException e) {
    // License validation failed
    System.err.println("License error: " + e.getMessage());
} catch (RosokaConfigurationException e) {
    // Configuration error
    System.err.println("Configuration error: " + e.getMessage());
} catch (RosokaRuleParserException e) {
    // Rule parsing error
    System.err.println("Rule error: " + e.getMessage());
} catch (Exception e) {
    // Any other failure during processing
    System.err.println("Processing error: " + e.getMessage());
    e.printStackTrace();
}
Advanced Topics
Custom Rules
Customize entity extraction by modifying rule files in the LxBase. See LxBase Configuration for details.
Performance Optimization
For high-volume processing:
Use thread pools for parallel processing
Configure document caching
Adjust JVM heap size appropriately
Process documents in batches
See Properties Reference for configuration details.
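Two of the suggestions above, thread pools and batching, can be combined by submitting a batch of documents to a fixed-size pool and collecting the results in input order. The sketch below is a generic java.util.concurrent pattern, not SDK code. The processor function is a hypothetical stand-in for a call such as processFileRosokaFullObject. Whether one Rosoka instance can be shared safely across threads is not stated in this guide, so confirm that before adopting this approach.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelBatch {

    /**
     * Applies processor to every input on a fixed-size thread pool and
     * returns the results in input order. A failure in any task surfaces
     * as an ExecutionException when its result is collected.
     */
    public static <I, O> List<O> processAll(
            List<I> inputs, Function<I, O> processor, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<O>> futures = new ArrayList<>();
            for (I input : inputs) {
                Callable<O> task = () -> processor.apply(input);
                futures.add(pool.submit(task));
            }
            List<O> results = new ArrayList<>();
            for (Future<O> future : futures) {
                results.add(future.get()); // blocks until this task finishes
            }
            return results;
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload: "processing" a document name yields its length
        List<String> docs = List.of("a.txt", "report.pdf", "notes.docx");
        List<Integer> lengths = processAll(docs, String::length, 2);
        System.out.println(lengths); // prints [5, 10, 10]
    }
}
```

A fixed pool caps the number of documents in flight, which helps keep heap usage predictable. The pool size of 2 is only a placeholder; a reasonable starting point is the number of available cores, adjusted after measuring.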
Examples
See the examples/ directory in your TextChart SDK installation for complete working examples including:
Basic text processing
File processing
Batch processing
Custom rule configuration
Integration examples
Web service examples
Getting Help
Review the included documentation
Check the examples directory
Contact i2 support for assistance