Designing a Quantexa streaming solution that addresses both functional and non-functional requirements can be challenging, especially when specific use cases require capabilities beyond the out-of-the-box (OOB) functionality of the Quantexa platform. While some extensions may be necessary to meet these requirements, it is crucial to avoid unnecessary customizations and instead adhere to Quantexa's best practices to ensure the solution remains:
- Future-proof: Ready for platform upgrades and changes.
- Aligned with industry standards: Compliant with recognized guidelines and practices.
- Optimally performing: Delivering high efficiency and responsiveness.
- Scalable: Easily adaptable to growing data and user demands.
- Seamlessly integrated: Enhancing operational efficiency within the Quantexa ecosystem.
This article explores how to design Kafka streaming solutions that address unique requirements while staying aligned with Quantexa's best practices, minimizing customizations, and leveraging the platform’s capabilities effectively.
Introduction
The Quantexa Streaming tier provides robust capabilities for near-real-time data ingestion and processing. While it is designed to address a wide range of use cases, some unique requirements, such as non-standard Kafka message formats, unsupported schemas, or complex integrations with a Kafka instance, may necessitate additional extensions.
Rather than customizing Quantexa’s core streaming components, the recommended approach is to extend the platform by building bespoke applications that integrate seamlessly with the Quantexa ecosystem. This ensures adherence to Quantexa’s supported configurations, which is vital for maintaining the long-term performance, reliability, and scalability of your Kafka streaming solutions.
This article offers practical recommendations for designing and implementing solutions that extend the capabilities of the Quantexa platform while aligning with best practices and avoiding unsupported customizations.
Design and build a Data Mapper application
In cases where the upstream system cannot generate events that conform to the Record Extraction schema (see the Record Extraction Schema Documentation), or cannot produce Kafka events consumable by the Record Extraction service, first explore adapting the upstream system to conform to the required schema and standards. Where this is not feasible, we recommend designing a bespoke application, positioned between the upstream system and the Record Extraction service, that translates incoming messages into a format aligned with the Record Extraction service's standards.
The diagram below illustrates how to embed a Data Mapper service between the upstream data source and the Quantexa Record Extraction service.
- The upstream system publishes raw data messages to the Raw Data Input topic.
- The Data Mapper application reads the raw data messages from the Raw Data Input topic and maps (translates) the raw data message schema to the document model schema, as sketched in the code after this list.
- The Data Mapper publishes the document model message to the Record Extraction Input topic.
- The Record Extraction service reads the document model message from the Record Extraction Input topic.
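The loop below is a minimal sketch of steps 1–3 in Scala, using the standard Apache Kafka client. The topic names, consumer group, bootstrap address, and the trivial `toDocumentModel` wrapper are illustrative assumptions; a real mapper must emit messages that conform to the Record Extraction schema.

```scala
import java.time.Duration
import java.util.Properties
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object DataMapperApp {

  // Illustrative topic names; align these with your environment's conventions.
  val RawInputTopic         = "raw-data-input"
  val RecordExtractionTopic = "record-extraction-input"

  // Purely illustrative mapping: the real implementation must produce
  // messages that conform to the Record Extraction schema.
  def toDocumentModel(rawJson: String): String =
    s"""{"documentType":"customer","data":$rawJson}"""

  def main(args: Array[String]): Unit = {
    val stringDeserializer = "org.apache.kafka.common.serialization.StringDeserializer"
    val stringSerializer   = "org.apache.kafka.common.serialization.StringSerializer"

    val consumerProps = new Properties()
    consumerProps.put("bootstrap.servers", "kafka:9092") // hypothetical address
    consumerProps.put("group.id", "data-mapper")
    consumerProps.put("key.deserializer", stringDeserializer)
    consumerProps.put("value.deserializer", stringDeserializer)
    consumerProps.put("enable.auto.commit", "false") // commit manually after producing

    val producerProps = new Properties()
    producerProps.put("bootstrap.servers", "kafka:9092")
    producerProps.put("key.serializer", stringSerializer)
    producerProps.put("value.serializer", stringSerializer)

    val consumer = new KafkaConsumer[String, String](consumerProps)
    val producer = new KafkaProducer[String, String](producerProps)
    consumer.subscribe(java.util.List.of(RawInputTopic))

    while (true) {
      val records = consumer.poll(Duration.ofMillis(500)).asScala
      records.foreach { record =>
        // Step 2: translate the raw message schema to the document model schema.
        val documentModel = toDocumentModel(record.value())
        // Step 3: publish the document model message for Record Extraction.
        producer.send(new ProducerRecord(RecordExtractionTopic, record.key(), documentModel))
      }
      producer.flush()
      consumer.commitSync() // commit offsets only once the mapped batch is produced
    }
  }
}
```

Committing offsets only after the mapped batch is flushed gives at-least-once delivery into the Record Extraction Input topic, which is generally preferable to losing raw messages on a mapper restart.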
To create and integrate bespoke services as a new application within the Quantexa Application tier, adhere to the following key principles:
- Additional application: Follow the outlined steps to add the new application to the Quantexa Application tier using the Quantexa Helm Chart.
- Configuration definition: Utilize the Quantexa quanfiguration service to define configurations for the bespoke application, ensuring consistency with existing standards and formats.
- Routing: If applicable, specify the routing settings for the bespoke application within the Gateway service to ensure proper traffic management and integration.
Deploy a dedicated Kafka instance for the Quantexa platform
If an existing Kafka instance is being used as an enterprise messaging bus, it may not always be suitable for handling the internal communication needs of the Quantexa platform. Architectural considerations, such as ensuring flexibility for adding topics, maintaining message format consistency, and preserving architectural cleanliness, often make it more practical to provision a separate Kafka instance dedicated to the Quantexa platform.
In scenarios where the primary Kafka instance cannot be configured to meet the Quantexa platform's supported configurations (e.g., it carries encrypted messages or unsupported formats such as XML), bespoke translator services can be developed and deployed. These services act as intermediaries, translating messages between the primary Kafka instance and the Quantexa-dedicated Kafka instance. By managing this integration, messages can be mapped and aligned with the formats and schemas required by both systems while maintaining compatibility.
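As a concrete illustration, the sketch below translates a hypothetical XML payload from the enterprise bus into JSON for a Quantexa-side topic. The `<customer>` layout and the target field names are assumptions, and the snippet relies on the scala-xml module:

```scala
import scala.xml.XML

// Hypothetical example: the enterprise bus carries XML customer records,
// while the Quantexa-dedicated instance expects JSON. The <customer> layout
// and the target field names are assumptions.
def xmlToJson(xmlPayload: String): String = {
  val doc  = XML.loadString(xmlPayload)
  val id   = (doc \ "id").text
  val name = (doc \ "name").text
  // Production code should escape JSON special characters and validate input.
  s"""{"customerId":"$id","fullName":"$name"}"""
}

// xmlToJson("<customer><id>42</id><name>Acme Ltd</name></customer>")
//   => {"customerId":"42","fullName":"Acme Ltd"}
```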
Where a dedicated Kafka instance is allocated to the Quantexa platform, a bespoke Translator or Acknowledgment service can be developed and deployed, similar to the Mapper service described earlier. This service translates the outcome messages that Quantexa streaming services, such as Document Ingest, publish to the Document Ingest Success topic, and routes them back to a designated topic in the primary Kafka instance, ensuring they are available for consumption by other systems in their required message formats.
The diagram below illustrates how to design a streaming solution when the supported configurations of the Quantexa platform's streaming services cannot be used to integrate with an existing Kafka instance.
- The upstream system publishes raw data to the Raw Data Input topic in the primary Kafka instance.
- The Data Mapper service reads the raw data messages from the Raw Data Input topic and maps (translates) the raw data message schema to the document model schema.
- The Data Mapper publishes the document model message to the Record Extraction Input topic in the Kafka instance dedicated to the Quantexa platform.
- The Record Extraction service reads the document model message from the Record Extraction Input topic.
- The Record Extraction service publishes compounds and cleansed document model messages to the Document Ingest Notify topic.
- The Document Ingest service reads the cleansed data and compounds from the Document Ingest Notify topic.
- The Document Ingest service publishes a success message to the Document Ingest Success topic.
- The bespoke Translator / Acknowledgment service reads the success messages from the Document Ingest Success topic and translates them into the format required by the primary Kafka instance.
- The Translator / Acknowledgment service publishes the translated success messages to the Data Ingest Success topic in the primary Kafka instance, as sketched in the code after this list.
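The final two steps amount to a small bridge between the two clusters: a consumer against the dedicated instance and a producer against the primary one. A minimal sketch follows; the bootstrap addresses, topic names, consumer group, and the `toPrimaryFormat` translation are all illustrative assumptions.

```scala
import java.time.Duration
import java.util.Properties
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object AcknowledgmentService {

  // Small helper for building client properties.
  private def props(pairs: (String, String)*): Properties = {
    val p = new Properties()
    pairs.foreach { case (k, v) => p.put(k, v) }
    p
  }

  // Hypothetical translation of a Document Ingest success message into the
  // format the primary Kafka instance's consumers expect.
  def toPrimaryFormat(successMessage: String): String =
    s"""{"source":"quantexa","acknowledgment":$successMessage}"""

  def main(args: Array[String]): Unit = {
    val strDeser = "org.apache.kafka.common.serialization.StringDeserializer"
    val strSer   = "org.apache.kafka.common.serialization.StringSerializer"

    // Consumer on the Quantexa-dedicated instance (reads Document Ingest Success).
    val consumer = new KafkaConsumer[String, String](props(
      "bootstrap.servers"  -> "quantexa-kafka:9092", // hypothetical address
      "group.id"           -> "ack-service",
      "key.deserializer"   -> strDeser,
      "value.deserializer" -> strDeser,
      "enable.auto.commit" -> "false"))

    // Producer on the primary enterprise instance (writes Data Ingest Success).
    val producer = new KafkaProducer[String, String](props(
      "bootstrap.servers" -> "enterprise-kafka:9092", // hypothetical address
      "key.serializer"    -> strSer,
      "value.serializer"  -> strSer))

    consumer.subscribe(java.util.List.of("document-ingest-success"))
    while (true) {
      consumer.poll(Duration.ofMillis(500)).asScala.foreach { record =>
        producer.send(new ProducerRecord(
          "data-ingest-success", record.key(), toPrimaryFormat(record.value())))
      }
      producer.flush()
      consumer.commitSync() // commit on the dedicated instance only after forwarding
    }
  }
}
```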
Align your bespoke streaming service configurations with Quantexa application standards
For implementations requiring a bespoke service, such as one leveraging the Graph API for entity resolution, graph building, and scoring, Quantexa's best practice is to add the new service as an additional component in the Quantexa Application tier. It is highly recommended to follow a configuration structure similar to that of the Quantexa Kafka streaming services, ensuring consistency between Quantexa-supported services and bespoke services tailored to specific use cases.
Following this pattern allows bespoke microservices to integrate with existing Kafka streams in the same manner as Quantexa-supported Kafka services, using the same configuration pattern and approach.
For more information on building a bespoke (standalone) Kafka microservice that follows the same service design pattern as Quantexa Kafka streaming services, please refer to the project-example standalone app kafka expand score alert.
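One way to keep a bespoke service's configuration layout consistent with the other streaming services is to externalize its Kafka settings under a single namespace, for example with Typesafe Config. The sketch below is an assumption about structure, not the actual Quantexa configuration schema; the key names are illustrative.

```scala
import com.typesafe.config.ConfigFactory

// Illustrative settings for a bespoke streaming service; the key names are
// assumptions, not the actual Quantexa configuration schema.
final case class StreamingServiceConfig(
  bootstrapServers: String,
  inputTopic: String,
  outputTopic: String,
  errorTopic: String,
  consumerGroup: String,
  connectRetries: Int
)

object StreamingServiceConfig {
  // Reads keys such as bespoke-service.kafka.bootstrap-servers from
  // application.conf, keeping the layout uniform across services.
  def load(): StreamingServiceConfig = {
    val c = ConfigFactory.load().getConfig("bespoke-service")
    StreamingServiceConfig(
      bootstrapServers = c.getString("kafka.bootstrap-servers"),
      inputTopic       = c.getString("kafka.input-topic"),
      outputTopic      = c.getString("kafka.output-topic"),
      errorTopic       = c.getString("kafka.error-topic"),
      consumerGroup    = c.getString("kafka.consumer-group"),
      connectRetries   = c.getInt("kafka.connect-retries")
    )
  }
}
```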
Data mapper error handling
Quantexa's best practice is to apply the same error handling mechanisms as those used by the Record Extraction service. In particular, consider handling the following error types (a combined sketch follows the list):
Service connectivity to the Kafka stream
- Error: The service is unable to connect to the Kafka stream to consume messages from the topic or commit the offset to the topic, due to an unresponsive Kafka stream or connection issues.
- Behavior: The service retries connecting to Kafka a specified number of times. If it remains unsuccessful, the service terminates.
Schema mapping
- Error: Any exception that prevents the service from mapping the input event schema to the schema required by the Record Extraction service, for example a malformed or incomplete input message.
- Behavior: The service publishes an error message to the designated Error topic, commits the offset for the Input topic, and attempts to consume the next message.
Intermittent errors
- Error: An intermittent JVM issue or a defect within the Mapper service.
- Behavior: The service terminates.
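The sketch below shows how the three behaviors might combine in the Mapper's poll loop. The error topic name, retry bound, and error-message shape are illustrative assumptions; only Kafka's retriable exceptions are caught, so anything else propagates and terminates the service, as described above.

```scala
import java.time.Duration
import scala.jdk.CollectionConverters._
import scala.util.{Failure, Success, Try}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.errors.RetriableException

object MapperErrorHandling {

  // Hypothetical topic names and retry bound; align with your environment.
  val OutputTopic       = "record-extraction-input"
  val ErrorTopic        = "data-mapper-error"
  val MaxConnectRetries = 5

  def runLoop(
      consumer: KafkaConsumer[String, String],
      producer: KafkaProducer[String, String],
      mapToDocumentModel: String => String // the schema mapping to apply
  ): Unit = {
    var connectFailures = 0
    while (true) {
      try {
        consumer.poll(Duration.ofMillis(500)).asScala.foreach { record =>
          // Schema mapping errors: publish to the Error topic and carry on.
          Try(mapToDocumentModel(record.value())) match {
            case Success(doc) =>
              producer.send(new ProducerRecord(OutputTopic, record.key(), doc))
            case Failure(err) =>
              val msg = s"""{"offset":${record.offset()},"reason":"${err.getMessage}"}"""
              producer.send(new ProducerRecord(ErrorTopic, record.key(), msg))
          }
        }
        producer.flush()
        consumer.commitSync() // offset committed either way, so the next message is consumed
        connectFailures = 0   // a healthy cycle resets the retry counter
      } catch {
        // Connectivity errors: retry a bounded number of times, then terminate.
        case e: RetriableException =>
          connectFailures += 1
          if (connectFailures >= MaxConnectRetries)
            sys.error(s"Kafka unreachable after $MaxConnectRetries attempts: ${e.getMessage}")
        // Intermittent errors (JVM issues, service defects) are deliberately
        // not caught: they propagate and terminate the service.
      }
    }
  }
}
```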
What you learn
By implementing Quantexa’s recommended best practices, you can:
- Design solutions that bridge gaps between upstream systems and Quantexa’s Streaming tier services.
- Deploy bespoke Kafka streams to address complex requirements without compromising system integrity.
- Maintain alignment with Quantexa’s core architecture, ensuring long-term reliability and scalability.
Conclusion
Meeting functional and non-functional requirements in Quantexa Kafka streaming use cases may sometimes exceed the default capabilities of Quantexa’s Streaming tier. In such scenarios, leveraging Quantexa’s flexible architecture allows for customized solutions while maintaining adherence to supported configurations.
By following the Quantexa best practices outlined here, you can design solutions that mitigate risks, enhance performance, and support future scalability. Collaborating with Quantexa's experts ensures your deployment is robust, reliable, and ready for the challenges of tomorrow.
The following article will help you further understand the principles of data streaming for enriching information in a Quantexa streaming solution.