Advanced Language Parsers Release
We are excited to announce the release of our new Advanced Language Parsers, designed to support the accurate parsing of non-Latin scripts natively in the Quantexa platform. This new capability will enable our customers to build contextual insights from across their data estate and expand Quantexa’s use in a wider range of geographies. In this first release we support Japanese language parsing.

Parsers in the Quantexa Platform

Quantexa is well known for its best-in-class Entity Resolution, and Parsers play a significant role in making it as accurate as it is. Parsing is the process of extracting relevant information from ingested data and transforming it into a structured format that can be easily analyzed. For example, a customer system would typically hold a record such as ‘Mrs. Jane Doe’. Parsing splits it into manageable pieces: Title: Mrs.; GivenName: Jane; FamilyName: Doe. It does the same for a record in a different format, such as ‘Jane Doe, Mrs.’, because it identifies the individual components. The more complicated the data, the more processing is needed to prepare it for high-quality Entity Resolution: for example, translation, transliteration and normalization.

Quantexa’s existing Standard Parsers are proven to parse data with high accuracy, and they can be tailored to incorporate cultural differences and increase parsing accuracy for data from specific geographies. However, they work best with data in Latin character sets. For more information about Quantexa’s Parsers see our documentation.

To process data in non-Latin scripts out of the box, we have created ML-powered Advanced Language Parsers, starting with the first release of the Advanced Japanese Parser (more Advanced Parsers are on the roadmap for later this year). This will significantly streamline Data Ingestion and result in far more accurate Entity Resolution for these non-Latin languages.
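To make the ‘Mrs. Jane Doe’ example above concrete, here is a minimal Python sketch of that kind of name parsing. It is a toy illustration only, not how Quantexa’s Parsers are implemented: the `TITLES` set and the `parse_name` function are hypothetical names invented for this example, and it assumes simple Western-style names with a single given name and a single family name.

```python
import re

# Hypothetical title list for illustration; a real parser relies on far richer lexicons.
TITLES = {"mr", "mrs", "ms", "miss", "dr"}

def parse_name(raw: str) -> dict:
    """Split a simple Western-style name into Title, GivenName and FamilyName."""
    # Tokenize on whitespace and commas so both input orders produce the same tokens.
    tokens = [t for t in re.split(r"[,\s]+", raw.strip()) if t]
    title = None
    rest = []
    for token in tokens:
        # The first token that looks like a known title is taken as the Title.
        if title is None and token.rstrip(".").lower() in TITLES:
            title = token
        else:
            rest.append(token)
    given = rest[0] if rest else None
    family = rest[-1] if len(rest) > 1 else None
    return {"Title": title, "GivenName": given, "FamilyName": family}

print(parse_name("Mrs. Jane Doe"))
print(parse_name("Jane Doe, Mrs."))
```

Both calls return the same structured record, which is the point of the example: the components are identified regardless of the order in which they appear in the input.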
By the way, now you can explore our roadmap and give feedback on our features and functionality in our Product Roadmap & Ideas Portal! Be a part of our product development!

What are we working with?

Japanese text can be written in three different scripts:
- Kanji (logographic characters adopted from Chinese)
- Hiragana (phonetic lettering system, used for words not covered by Kanji and for grammatical inflections)
- Katakana (phonetic lettering system, used for transcription of foreign-language words into Japanese)

Apart from using different character sets, data in Japanese has a lot of interesting characteristics. For example, Japanese addresses are typically formatted from big to small values (country > city > street > house number), while Western addresses are usually formatted from small to big (house number > street > city > country).

Transliteration vs Translation

Japanese words can be transliterated to create a Romanized version in Latin script, known as Romaji. Or they can be translated, so that the English equivalent of a word is used where one exists.

Japanese | Romaji | English
ソニーグループ株式会社 | Sonī Gurūpu Kabushiki-gaisha | Sony Group Corporation

What is included in Advanced Parsers?

The Advanced Japanese Parsers include Individual, Business and Address parsers.
Individual Parser
- Based on a library provided by the CJK institute which tokenizes and transliterates characters representing Japanese names
- The library consists of code and a database to be distributed
- The code makes calls to the database to retrieve the most likely transliterations of Japanese names based on combinations of input characters

Business Parser
- Uses the existing business parser architecture with Japanese standardizations
- Translates using lookup from JMdict
- Transliterates using two third-party tools

Address Parser
- Uses an AI model, a ‘Mixed Field Parser’ trained on Japanese data, for parsing addresses
- Transliterates only (no translation) using two third-party transliterators
- Can produce enriched variants using publicly available address postcode information

Also, a new configuration of the Email Parser was created to handle emails with Unicode characters (including Japanese).

What is needed to configure Japanese Parsers?

To create entities with Japanese data, you will need to take the following steps:
1. Add data sources that contain Japanese names, addresses and businesses.
2. For the data sources with Japanese data, update the parse method to use the Advanced Japanese parsers:
   - The Advanced Parser is applied if the input address contains Japanese characters.
   - If the input contains Latin characters only, the data is parsed using the standard (composite) parsers.
3. Modify the entity files to use the new compound groups.
4. Add custom Japanese resolution templates and compounds to the resolver config.
5. Run ETL with the Advanced Parsers set up correctly, including the CJK and MFP (Mixed Field Parser) files and Spark config.

Important to know

For now, Advanced Language Parsers are an experimental release in Parsers version 4.2.1. Advanced parsers include a few tools (including an ML model) that are targeted at increasing the accuracy of data processing and, subsequently, of Entity Resolution. The trade-off for this accuracy is performance.
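The routing rule in step 2 above is also what bounds this cost: only inputs that contain Japanese characters go through the Advanced Parser. As a rough sketch of how such a check could look in Python (this is an assumption for illustration, not the product’s actual detection logic; `JAPANESE_RANGES`, `contains_japanese` and `choose_parser` are hypothetical names):

```python
# Basic Unicode blocks for the three Japanese scripts. Limiting the check to these
# blocks is an assumption made to keep the sketch short; extension blocks are ignored,
# and the ideograph block is shared with Chinese text.
JAPANESE_RANGES = (
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs (Kanji)
)

def contains_japanese(text: str) -> bool:
    """Return True if any character falls inside one of the blocks above."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in JAPANESE_RANGES)

def choose_parser(text: str) -> str:
    """Route an input to a (hypothetical) parser label based on its characters."""
    return "advanced_japanese" if contains_japanese(text) else "standard"

print(choose_parser("ソニーグループ株式会社"))
print(choose_parser("Sony Group Corporation"))
```

Under these assumptions the first input is routed to the advanced parser and the second to the standard one, so Latin-only records never pay the extra processing cost.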
Users can expect an increase in runtime (compared to the standard parsers) and, on average, a 2x increase in Elastic index sizes. The good news is that these estimates apply only to the share of the data that is in Japanese characters and do not affect the figures for data processed by standard parsers. For more information about performance and testing, check the Release Notes.

How can I get Advanced Japanese Parser for my project?

Full information about the Advanced Japanese Parser is available on the Doc site. However, since it is an experimental release of the functionality, please reach out to Anastasia Petrovskaia if you feel that the parser is applicable for your project. We are working on adding this capability to the Demo environment, targeting March 2025.

What is next?

Adoption and feedback from users will be a big part of maturing the Advanced Parsers, so there are no immediate plans to move the capability straight to EA/GA. You can provide feedback directly on the Advanced Language Parsers for Non-Latin Scripts using the Product Roadmap & Ideas Portal. The next Parser release will focus on improvements to the Standard Parsers. More Advanced Language Parsers for different languages/countries (e.g. Chinese, Arabic) are expected in H2 2025. For more information reach out to Anastasia Petrovskaia.

Explore the New Search Demo 🔎
Our enhanced Search functionality (aka "Search 2") is here as of version 2.7.7! With a redesigned, entity-focused UI, faster setup, and better performance, it’s built to make using Search better than ever.

💬 We want your feedback!
- Explore the full release details in the New & Improved Search Functionality release announcement
- Check out the New Search Demo on the Product Demos page (🔒 member exclusive)
- Go to the Search tile under the General Availability tab in the Product Roadmap & Ideas Portal to share your feedback (🔒 member exclusive)

Try out the demo now and see how the new and improved Search can enhance your Quantexa experience.

New and Improved Search Functionality
We’re excited to announce that our new and improved Search capability has now moved to General Availability as of 2.7.7 (this is referred to as Search 2 within our technical documentation).

Let’s face it: the current Search experience came out with the first versions of the Quantexa Platform and needed a facelift. Although it did the job, and quite a good one at that, we received feedback from Quantexa users that gave us ideas on how to overhaul the functionality, both to keep the UX consistent and to make it a lot more intuitive. Throughout this journey, we also found synergies in the Search setup process by coupling it with our Data Fusion tool, making Search a lot faster to set up.

By the way, now you can explore our roadmap and give feedback on our features and functionality in our Product Roadmap & Ideas Portal! Be a part of our product development!

How we brought Entities front and center

From user feedback, we realized that Search could be improved by using what Quantexa already knows about entities to optimize how search queries are constructed. To enable this, the Query Building aspects of the Search UI have been redesigned to reframe the search experience around the Entities available in the Platform. The user tells the system what they are looking for (which Entity Type), and the system then offers the most appropriate Data Sources and Search Fields to find the Individual, Business or other Entity the user wishes to search for. Before, the search results were sometimes not exactly what the user expected because of incomplete or sub-optimal search configuration. This brings us to our next update: new configuration through Data Fusion!

How we simplified Search configuration

If you are a Quantexa user, you might be familiar with the Search configuration process which is separate from entity resolution and tuning.
What we figured is, if our users are overwhelmingly looking for entities in their searches, why not streamline the search setup by ‘inheriting’ the settings from Data Fusion, where all the magic happens with entities? To be specific, you no longer have to define the search fields and groups manually, as we derive them from the Fusion config. So you not only save effort in setting up a tool, but you also get far better results from reusing configuration that has been tuned and tested.

We have also improved some existing features. We have updated how our Results Filters work once the user has executed their query: it is now possible to filter using more than one item in the Filter list in the results screen. Additionally, we have improved the Table View of results.

How we made Search faster

If you are using the new Search with our Entity Store, you can perform very fast direct Entity Searches even across massive amounts of data. The Entity Store keeps a pre-built version of your resolved entities at hand, and also updates as soon as there are any changes or new data is introduced, so you get the speed without compromising on the content.

How we made debugging faster

Lastly, it is worth noting that this new version of Search re-uses existing components within the Platform and is much simpler to support and debug than the old version of Search. Together this reduces the overall support burden on Quantexa’s R&D department. You might be thinking this is more of a Quantexa benefit, but it means we are able to handle support requests faster and free up engineers to work on other enhancements to the Platform.

What has been removed?

At a fundamental level the new version of Search (known as Search 2 in our technical documentation) has feature parity with the old version of Search (Search 1) plus the new features already explained.
However, there are several feature removals that are worth noting:
- As the Fusion configuration now drives the search field mappings, there is no need to generate Search Fields anymore, so we removed the feature altogether.
- We have removed support for field boosting, as the feature was deemed not useful by our users.
- Facets have been replaced with Aggregations and can be easily migrated from the old Search to the new Search.
- We have removed the type-ahead on Search Fields in the Search Bar, as the query-building experience has been redesigned and this feature is no longer necessary.

Deprecation and Removal of the old version of Search

The old version of Search (Search 1) is deprecated at 2.8.0. We expect to remove the old version of Search from the Platform at the earliest at 2.9.0. All new customers are required to adopt the new version of Search from 2.7.7 onwards.

View the New and Improved Search Demo

Head over to Product Demos and launch the New Search Demo (login required) for an interactive walk-through of the new features.

How do I provide Feedback?

We know this is not the end of the improvements our search capabilities need, so expect us to make further investments in the medium term. It would be valuable to hear feedback about the new Search results screen and the query editing experience. Please provide feedback directly to the Search initiative using the Product Roadmap & Ideas Portal.

Future roadmap

For the next 12 months we will not be making major changes to introduce new capabilities to our Search functionality. Our focus will be on ensuring that, as our existing customers upgrade and adopt this new version of Search, the experience of migrating from old Search to new Search is as simple, fast and pain-free as possible.
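For upgrade planning, the deprecation timeline stated above (new customers on Search 2 from 2.7.7, Search 1 deprecated at 2.8.0, removal no earlier than 2.9.0) can be turned into a small check. The Python sketch below is purely illustrative: `parse_version` and `search1_status` are hypothetical helper names, not part of any Quantexa tooling, and the version strings are assumed to be plain `major.minor.patch` numbers.

```python
def parse_version(version: str) -> tuple:
    """Turn a 'major.minor.patch' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))

def search1_status(platform_version: str) -> str:
    """Report the status of the old Search (Search 1) for a given Platform version."""
    v = parse_version(platform_version)
    if v >= (2, 9, 0):
        return "may be removed"  # removal is expected at 2.9.0 at the earliest
    if v >= (2, 8, 0):
        return "deprecated"
    if v >= (2, 7, 7):
        return "available; new customers must use Search 2"
    return "available"

for version in ("2.7.6", "2.7.7", "2.8.0", "2.10.1"):
    print(version, "->", search1_status(version))
```

Comparing integer tuples rather than raw strings matters here: as strings, "2.10.1" would sort before "2.9.0".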
I encourage you to get started with the new Search functionality and let us know how you find it! Read the full release notes and Search 2 Migration Guide on the Documentation site.

Welcome to Parsers 4.2 | Release Announcement
Alongside the release of QP 2.7, we are happy to share the release of version 4.2.0 of Standard Parsers. This release extends the cleansing options you can define purely in config with configurable, simple generic cleansers. The new config-based cleansers allow you to perform replacements in strings, remove or keep parts of a string based on pre-defined options, change the case of a string for different languages, and extract specific parts of input strings, all without writing any Scala.

These back-end config improvements extend to the front end. Users of QP 2.7 will have access to the extended configurability described above within the UI; more details are available in the release notes and documentation for QP 2.7.

The release also brings some key bug fixes and a small improvement to business parsing that should catch more edge cases of business names with odd punctuation distributions. For more information on the release features, please see the 4.2.0 release notes, and for general info on Parsers, see the documentation.

Welcome to Parsers 4.1 | Release Announcement
We are excited to announce the release of version 4.1 of Quantexa's Standard Parsers. This release focuses on improving the integration with Fusion UI (look out for exciting 2.6 release announcements coming soon) and on improving the file structure of configuration files.

This release includes the following highlights:
- Consistency of configuration files: general improvements to Parser and lexicon configuration and files have been introduced to make sure the way you use these files is consistent across all available Parsers. This will make your configuration easier to understand and simplify the process of making future modifications.
- To minimise redundant data storage in Elasticsearch, you can now exclude business standardisation terms that aren’t used in areas such as exclusions for Entity Resolution.
- Similarly, you can now choose to parse multiple names or just a single name for the Individual Parser to reduce your Elasticsearch footprint.
- For the Telephone Parser you can now specify conditional parsing rules to increase the output accuracy. For example, if you have more specific parsing rules for UK telephone numbers, you can now use the country code to parse these telephone numbers differently to the default telephone parsing behaviour.

Note: There are no changes in this release that affect the output of parsing.

Driving Adoption of the Quantexa Platform & April Community Highlights
April Top Picks:
📈 Driving Adoption of the Quantexa Platform
🏆 Last Chance to Enter our Competition: The Knowledge Exchange
📣 Release Announcement: Detection Packs 0.3
📣 New Education Program Launch 🚀 | Quantexa User Foundations Release 2
📣 Quantexa successfully recertifies with the British Standards Institution - member exclusive (login required)

Latest from the Community Library:
📖 Tips and Tricks for Understanding Entity Lab - member exclusive
📖 Using Data Viewer for the first time - member exclusive
📖 A Day in the life of a… Solution Engineer

Upcoming Events:
🗓️ 3rd May: Community Connect
🗓️ 7th May: NA | RSA eFraud Global Forum USA
🗓️ 12th June: Maximising Value from Community Support

Quantexa In-depth:
💭 All things Glyph!
💭 Common Elastic Loader errors
✅ FAQ: How can I handle dates when writing a Quantexa Score in Scala?
✅ Q&A: Implementing scorecard for multiple data sources - member exclusive

New & Popular Ideas:
💡 Error Messages that improve the User Experience - New Idea
💡 Ordering of advanced search fields - In Progress!

Get Rewarded:
🌟 Announcing the April Community Member of the Month
🌟 Congratulations to all the Rising Stars on Community!

Community quick links
🚀 Submit and vote for Ideas in our Ideas Portal
🗣️ Join one of our Specialist User Groups: FinCrime, Insurance, Data Management & KYC
📚️ Browse blogs, articles and guides in our Community Library

Welcome to Parsers 4 | Parsers 4.0.0 Release Announcement
We are excited to announce the version 4.0 release of Quantexa's Standard Parsers. This release marks a shift in the way the Standard Parsers are used, and it includes an expansion of the data models so that more information is made available for use in Entity Resolution. Here are a few of the features you’ll find in this release:

A new low-code interface
Standard Parsers are now quicker and easier to deploy and use, with new, easily shareable customization options, integration with Data Fusion, and less custom code, giving better coverage of auto-migrations to support upgrades. Setup and customization now use configuration files, much like Data Fusion, meaning anyone can make changes without writing code.

Updated Data Models
Entity Resolution can now form higher quality Entities, thanks to more information stored in our Standard Data Models at the parsing stage. Entities have more contextual information associated with them and are better aligned with international standards.

Higher match rates with Variants
The introduction of Variants enables even more use cases and lets users resolve the same Entity in more ways, both in Search and within Entity Resolution itself, so it’s possible to catch more edge cases. You can now use out-of-the-box Variants with the `address`, `individual` and `business` data models, and even define custom Variants.

Name flexibility and internationalisation
In the global market, localizations of both individual and business names can impact the accuracy of data, so we’ve updated name structures to represent names from a wider variety of cultures more faithfully, improving Entity quality.
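To give a feel for the idea behind Variants described above, here is a toy Python sketch in which each business name yields several alternative forms, and two records are considered candidates for the same Entity if they share any form. This is an illustration under stated assumptions only, not Quantexa’s Variants implementation or configuration: `LEGAL_SUFFIXES` and `business_name_variants` are hypothetical names, and the normalization rules are deliberately simplistic.

```python
import re

# Hypothetical legal-form suffixes, for illustration only.
LEGAL_SUFFIXES = ("ltd", "limited", "inc", "corp", "corporation")

def business_name_variants(name: str) -> set:
    """Return a small set of alternative forms of a business name."""
    # Drop punctuation, lower-case, and collapse repeated whitespace.
    normalized = re.sub(r"[^\w\s]", "", name).lower()
    normalized = " ".join(normalized.split())
    variants = {normalized}
    tokens = normalized.split()
    # Variant without a trailing legal-form suffix.
    if len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        variants.add(" ".join(tokens[:-1]))
    # Variant with all spaces removed.
    variants.add(normalized.replace(" ", ""))
    return variants

a = business_name_variants("Acme Trading, Ltd.")
b = business_name_variants("ACME Trading")
print(a)
print(a & b)  # shared forms suggest the two records may refer to the same business
```

The design choice being illustrated is that matching on any shared variant is more forgiving than matching on a single canonical form, which is how edge cases such as a missing legal suffix can still be caught.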
For more detail on the new features, including contextual parsing (also known as composite parsing) and additional experimental country-specific address parsing, as well as compatibility and support with different versions of the Quantexa Platform and other minor changes, see the full set of Release Notes on the Quantexa Documentation site. If you are unable to access the Documentation site, please get in touch with your Quantexa point of contact or the Community team at community@quantexa.com.

Detection Packs 0.3 Release
We are excited to announce the release of version 0.3 of Detection Packs. This is the third major release of Detection Packs and builds on the 0.2 version, which introduced our low-code interface. For full details of the release, including compatible Quantexa Platform versions and minor enhancements, please see the Quantexa Documentation site.

Expanded Score Coverage

This release focuses on the expansion of our score coverage and general maturing of the product, with no significant changes to the interface, enabling projects already using 0.2 to upgrade to 0.3 easily.
- 2 new transaction score pipelines have been added, each with 4 score types such as “Transaction with Different Currencies” and “Transaction in Listed Jurisdiction”.
- 5 new Entity Record score types have been added, such as “Highly Connected Entity” and “Entity With Listed Type”.
- 5 new Entity Network score types have also been added, such as “Entity With Indirect Relation To Listed Jurisdiction” and “Entity Linked To Entity With Listed Status”.

In total, the FinCrime Detection Pack now contains 26 pre-written, configurable, re-usable and extensible Score types which can be combined to produce a total of 56 Scores. For the full documentation on these please see our technical documentation. These new scores, in addition to those already in the FinCrime Detection Pack, can be extended further to meet project-specific needs by utilizing the customization options documented on the Quantexa Documentation site.

The collection of supporting Reference Scores has continued to expand even as several have been adopted into this Detection Packs release. As a reminder, Reference Scores are pre-written Scores created in conjunction with our users to provide additional Scores over and above the core Detection Pack for FinCrime. They also cover additional use cases outside of FinCrime, and the catalogue currently contains over 50 further scores.
Recent updates to the Reference Scores include a new correspondent banking use case, and updates to transaction scores such as “Transaction With Mirrored Trading” and “Transaction in High Proportion of Low Value Security”.

Simplified User Experience

In addition to the expanded scoring options available, the Detection Packs user experience has been simplified by reducing the amount and complexity of configuration required for your project. In v0.2 of Detection Packs, projects which only wished to use a subset of supported scores were still required to set up all of their data mappings. From v0.3 this is simpler, with various configuration options no longer required if not utilised.

Coming soon to Detection Packs

We are currently targeting mid 2024 for the 0.4 release of Detection Packs, with lots of exciting new features. Here are some of the planned features our users can look forward to in this release and beyond:
- Adoption of many more Reference Scores into officially supported, configuration-driven Detection Packs Scores
- Simplified graph-scripting support
- Dynamic pipeline generation
- Additional use case support, such as an Entity-level detection model
- Improved out-of-the-box testing and tooling
- Multi-typology and Multi-product Scorecard support
- Score versioning and seamless upgrade support

Welcome to Quantexa 2.5 | 2.5.0 Release Announcement
We are pleased to announce Quantexa 2.5! Read below for a few exciting highlights.

Stream Updates to the Entity Store
The Entity Store can now be updated with a streamed ingest of new Documents, and creates a change log describing how the Entity population has changed. In combination with the ability to query the full population of resolved Entities, this update enables near real-time access to the most up-to-date view of your data.
Why is this important? Users and downstream systems can now get the latest view of an Entity resolved by Quantexa, so business processes that need to access or act upon that update in near real-time (in a Master Data Management (MDM) solution, for example) can do so.

Deploy and Change Custom Scoring Pipelines through Configuration
The addition of Scoring Extension Mode to the Assess Template Generation functionality helps simplify the process of setting up Scoring on a project. With the new Extension Mode, it is now possible to change and add new nodes to a custom Scoring model using configuration files.
Why is this important? This significantly reduces the technical skills required for setting up the Scoring pipeline and makes the generation of a custom Scoring pipeline more flexible.

Flexible Scorecards: Identify and Aggregate Insights Across Multiple Typologies
Assess capabilities and helper functions have been added to support multiple Scorecards on the same level within one Scoring pipeline. Scores can contribute to one or more Scorecards, and flexible alerting logic can be set up to include the outcome of multiple Scorecards.
Why is this important? This enables the flagging of different types of risks or insights at the same level (for different typologies or different products, for example) and helps an investigator or analyst understand and act upon the full context of the data available.
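The multiple-Scorecard idea above can be sketched in a few lines of Python. This is a conceptual illustration only and does not use the Assess API: the score names, point values, scorecard names, thresholds and the "alert if any scorecard reaches its threshold" rule are all hypothetical assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical fired scores: (score name, points, scorecards it contributes to).
fired_scores = [
    ("score_a", 30, ["aml", "fraud"]),  # one score can feed more than one scorecard
    ("score_b", 50, ["aml"]),
    ("score_c", 40, ["fraud"]),
]

# Hypothetical alert thresholds per scorecard.
thresholds = {"aml": 70, "fraud": 80}

def scorecard_totals(scores):
    """Sum the points of each fired score into every scorecard it contributes to."""
    totals = defaultdict(int)
    for _name, points, scorecards in scores:
        for card in scorecards:
            totals[card] += points
    return dict(totals)

def should_alert(totals, limits):
    """Example alerting rule: alert if any scorecard reaches its threshold."""
    return any(totals.get(card, 0) >= limit for card, limit in limits.items())

totals = scorecard_totals(fired_scores)
print(totals)
print(should_alert(totals, thresholds))
```

Swapping `any` for `all`, or weighting scorecards differently, would be other examples of the kind of alerting logic that combines the outcomes of several Scorecards.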
Tune Scorecards with QPython
QPython utilities can now interact with Assess configuration files and Scoring pipelines, which simplifies the process of tuning a Scoring model and provides an easy way to conduct a what-if analysis. The interface, using pre-written Python Notebooks, calculates a set of the most common and useful metrics for tuning. The use of Jupyter Notebooks allows the export of the reports for inclusion in the wider model governance process.
Why is this important? Scorecard tuning is one of the most important parts of model development, as the model must output the most relevant (e.g., risky or interesting) information for a specific use case. These utilities make the process of Scorecard tuning much more accessible, simple, and quick. They also help make sure that the model addresses the business need and risk appetite of the deployment.

Other highlights
- Speed up and simplify deployment of Explorer with no-code configuration;
- Merge and split Entities, and override Entity Attributes through the Entity Management Panel (e.g., for Master Data Management solutions);
- Integrate with Kafka streaming typologies more easily.

Share your thoughts on this release in our poll: Which feature of Quantexa 2.5 are you most excited for?

To receive updates on every release, be sure to follow the Release Announcements topic.

Want to learn more? Check out the Release Notes on the Quantexa Documentation site. If you are unable to access the Documentation site, please get in touch with your Quantexa point of contact or the Community team at community@quantexa.com.

Welcome to Quantexa 2.7 | 2.7.0 Release Announcement
We are pleased to announce the release of Quantexa 2.7.0. This release includes:

Graph Scripting QSL (Early Access) Highlights

Simplified Interface:
- Configuration-Based Expansions: Define Batch Graph Scripts using the Quantexa Scripting Language (QSL) with a simplified interface, reducing the need for custom Scala code.
Enhanced Network Precision:
- Path-based Expansions: Use path-based Expansions instead of perimeter-based Expansions, resulting in tighter and more focused Networks.
Efficient Data Processing:
- Data Pre-Filtering: Options to pre-filter data help minimize the volume of data that needs to be processed, enhancing overall efficiency.

Improvements to the Entity Store

Cost Efficiency and Performance:
- Reduced Elastic Utilization: New indexing format reduces Elastic resource usage by up to 40%, leading to significant cost savings.
- Smaller Index Size: Optimized Entity structure decreases index size by 60%, allowing projects to run on smaller Elasticsearch clusters.
Operational Enhancements:
- Exclusion of Unchanged Entities: Logs can now exclude unchanged Entities, streamlining log management.
- Entity Version Tracking: Introduction of a version field in all Entities to track data updates, necessitating a full reload of Elasticsearch indices upon upgrade.
Loading Improvements:
- Flexible Output Paths: Enhanced configuration accepts various path types for output locations, removing the need for `file:///` prefixes.
- Parallel Load Execution: Specify Entity types for parallel loading, significantly reducing load times.
- Storage Optimization: Entities exceeding a threshold of records are stored without compounds, reducing Elastic storage requirements.
REST API Enhancements:
- Advanced Query Capabilities: Support for the negation operator `!` and wildcard queries `*` across all string attributes, enabling more complex and flexible searches.
- Pagination Support: Total hits in search responses help determine the availability of additional data, facilitating efficient data retrieval.
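To illustrate the semantics of the two REST API enhancements above, the Python sketch below emulates negation, wildcard matching and total-hits pagination against an in-memory list. It does not call the Entity Store REST API, and it makes no claim about the real request or response format: `matches`, `search` and `fetch_all`, the `total_hits` and `results` keys, and the `offset`/`limit` parameters are all hypothetical names. Wildcards are approximated with Python's `fnmatchcase`, which also treats `?` and `[...]` as special characters.

```python
from fnmatch import fnmatchcase

def matches(value: str, term: str) -> bool:
    """Evaluate one term: a leading '!' negates, '*' matches any run of characters."""
    negate = term.startswith("!")
    pattern = term[1:] if negate else term
    hit = fnmatchcase(value, pattern)
    return not hit if negate else hit

def search(records, term, offset=0, limit=2):
    """Stand-in for a search call: returns one page plus the total number of hits."""
    hits = [r for r in records if matches(r, term)]
    return {"total_hits": len(hits), "results": hits[offset:offset + limit]}

def fetch_all(records, term, limit=2):
    """Keep requesting pages until the offset reaches the reported total."""
    collected, offset = [], 0
    while True:
        page = search(records, term, offset, limit)
        collected.extend(page["results"])
        offset += limit
        if offset >= page["total_hits"]:
            return collected

names = ["Acme Ltd", "Acme Trading", "Apex Corp", "Acme Holdings"]
print(fetch_all(names, "Acme*"))
print(fetch_all(names, "!Acme*"))
```

The loop shows why a total-hits figure is useful to a client: it can tell when more data is available without issuing an extra request that returns an empty page.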
For detailed migration steps and configuration settings, refer to the 2.6.x - 2.7.0 Migration Guide and relevant configuration references.

New Global UI Configuration Service Highlights

Streamlined Configuration Management:
- Unified Configuration Service: Global UI settings have been moved from the Explorer service to a new Global UI Configuration service, providing a centralized method for managing these settings.
- Dedicated REST API Endpoints: A new REST API allows programmatic access to global UI settings, enhancing integration and automation for users and Low-Code Configuration parts of the Quantexa UI.
Future-Proofing and Intuitiveness:
- Intuitive Settings Location: This reorganization places global UI settings in a more logical location, facilitating easier management and future enhancements.
- Foundation for Future Improvements: While there is no immediate functional impact in the 2.7.0 release, this change lays the groundwork for future improvements in platform UI configuration.

More information

➡️ For more details on the release, see the 2.7.0 Release Notes on Platform Documentation. If you cannot access the Documentation site, please get in touch with your Quantexa point of contact or the Community team at community@quantexa.com.

To receive updates on every release, click Subscribe in the top-right corner of the Release Announcements page (login required).