Entity Resolution Configuration & Parsing Health Checks
Entity Resolution (ER) and good Entity quality underpin all Quantexa deployments. Entity configuration and parsing are two areas whose accuracy directly impacts Entity quality. These articles outline Entity Resolution (ER) health checks and Parsing health checks to be carried out by the development team on deployments, allowing the team to identify, prioritize, and fix any underlying issues that could be reducing Entity quality. These checks must be completed as part of the initial deployment, but also periodically over the lifetime of the deployment: new product functionality and changes in the data may mean configuration needs to be changed or enhanced over time.

Topics covered:

- Entity Resolution Health Checks
  - Pre-requisites
  - Resolver JSON configuration health check steps
    - Perform a comparison to the latest core Resolver JSON configuration
    - Review configured Element exclusion criteria
    - Review configured exclusions for Compounds in the relevant template
  - Compound model health check steps
    - Are all required Compounds being generated in ETL for the relevant Document types?
    - Are Compounds being generated to populate elements required for exclusions in other Compounds?
    - Do the traversals all look sensible?
    - Do you have good coverage of unit tests?
- Parsing Health Checks
  - Pre-requisites
  - Parsing health check steps
    - Is your deployment using the latest versions of Parsers?
    - Has your deployment applied custom Parsing functions or wrappers?
    - How does your Parsing compare to best practice Parsing?
    - How well are the Parsers performing per source and country?
    - How well-populated are the Parsed fields?
Advanced Language Parsers Release

We are excited to announce the release of our new Advanced Language Parsers, designed to support accurate parsing of non-Latin alphabets natively in the Quantexa platform. This new capability will enable our customers to build contextual insights from across their data estate and expand Quantexa's use in a wider range of geographies. This first release supports Japanese language parsing.

Parsers in the Quantexa Platform

Quantexa is well known for its best-in-class Entity Resolution, and Parsers play a significant role in making our Entity Resolution as accurate as it is. Parsing is the process of extracting relevant information from ingested data and transforming it into a structured format that can be easily analyzed. For example, a customer system typically holds a record such as 'Mrs. Jane Doe'. Parsing extracts it into manageable pieces: Title: Mrs.; GivenName: Jane; FamilyName: Doe. It does the same for a record in a different format, such as 'Jane Doe, Mrs.', because it identifies the individual components. The more complicated the data, the more processing is needed to prepare it for high-quality Entity Resolution, for example translation, transliteration, and normalization of the data.

Quantexa's existing Standard Parsers are proven to parse data with high accuracy, while providing the ability to incorporate cultural differences and increase the accuracy of parsing data from specific geographies by tailoring the Parsers. However, they work best with data in Latin character sets. For more information about Quantexa's Parsers, see our documentation.

To process data in alphabets other than Latin out of the box, we have created ML-powered Advanced Language Parsers, with the first release being the Advanced Japanese Parser (more Advanced Parsers are on the roadmap for later this year). This will significantly streamline Data Ingestion and result in far more accurate Entity Resolution for these non-Latin languages. By the way, you can now explore our roadmap and give feedback on our features and functionality in our Product Roadmap & Ideas Portal. Be a part of our product development!

What are we working with?

Japanese words can come in 3 different scripts:

- Kanji (traditional Chinese characters)
- Hiragana (a phonetic lettering system, used for words not covered by Kanji and for grammatical inflections)
- Katakana (a phonetic lettering system, used for transcription of foreign-language words into Japanese)

Apart from using different character sets, data in Japanese has a lot of interesting characteristics. For example, Japanese addresses are typically formatted from big to small (country > city > street > house number), while Western addresses are usually formatted small to big (house number > street > city > country).

Transliteration vs Translation

Japanese words can be transliterated to create a Romanized version of the Japanese words using Latin script, called Romaji. Or they can be translated, so that the English equivalent of the word is used where one exists.

| Japanese | Romaji | English |
| --- | --- | --- |
| ソニーグループ株式会社 | Sonī Gurūpu Kabushiki-gaisha | Sony Group Corporation |
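To make the distinction concrete, here is a minimal Scala sketch of the two operations, using the Sony example from the table above. The hard-coded lookup maps are illustrative stand-ins only; the actual parsers use full dictionaries (such as JMDict) and third-party transliterators rather than hand-written tables.

```scala
// A minimal sketch of transliteration vs translation, assuming hypothetical
// token-level lookup tables (not the real parser dictionaries).
object TransliterateVsTranslate {

  private val transliterations = Map(
    "ソニー" -> "Sonī",
    "グループ" -> "Gurūpu",
    "株式会社" -> "Kabushiki-gaisha"
  )

  private val translations = Map(
    "ソニー" -> "Sony",
    "グループ" -> "Group",
    "株式会社" -> "Corporation"
  )

  // Transliteration: render the *sound* of each token in Latin script.
  def transliterate(tokens: Seq[String]): String =
    tokens.flatMap(transliterations.get).mkString(" ")

  // Translation: replace each token with its English *meaning*, if one exists.
  def translate(tokens: Seq[String]): String =
    tokens.flatMap(translations.get).mkString(" ")

  def main(args: Array[String]): Unit = {
    val tokens = Seq("ソニー", "グループ", "株式会社")
    println(transliterate(tokens)) // Sonī Gurūpu Kabushiki-gaisha
    println(translate(tokens))     // Sony Group Corporation
  }
}
```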
What is included in Advanced Parsers?

The Advanced Japanese Parser includes Individual, Business and Address parsers.

Individual Parser
- Based on a library provided by the CJK Institute which tokenizes and transliterates characters representing Japanese names.
- The library consists of code and a database to be distributed.
- The code makes calls to the database to retrieve the most likely transliterations of Japanese names based on combinations of input characters.

Business Parser
- Uses the existing business parser architecture with Japanese standardizations.
- Translates using a lookup from JMDict.
- Transliterates using two third-party tools.

Address Parser
- Uses an AI model, a 'Mixed Field Parser' trained on Japanese data, for parsing addresses.
- Transliterates only (no translation) using two third-party transliterators.
- Can produce enriched variants using publicly available address postcode information.

Also, a new configuration of the Email Parser was created to handle emails with Unicode characters (including Japanese).

What is needed to configure Japanese Parsers?

To create Entities with Japanese data, take the following steps:

1. Add data sources that contain Japanese names, addresses, and businesses.
2. For the data sources with Japanese data, update the parse method to use the Advanced Japanese Parsers. The Advanced Parser is applied if the input contains Japanese characters; if the input contains Latin characters only, the data is parsed using the standard (composite) parsers (see the sketch at the end of this article).
3. Modify the entity files to use the new compound groups.
4. Add custom Japanese resolution templates and compounds to the resolver config.
5. Run ETL with the correct usage of the Advanced Parsers, including the CJK and MFP files/Spark config.

Important to know

For now, Advanced Language Parsers are an experimental release in Parsers version 4.2.1. Advanced Parsers include a few tools (including an ML model) that are targeted at increasing the accuracy of data processing and, subsequently, ER. The trade-off for accuracy is performance: users can expect an increase in runtime compared to the standard parsers, and on average a 2x increase in Elasticsearch index sizes. The good news is that these estimates apply only to the percentage of the data that is in Japanese characters, and they do not affect the figures for data processed by the standard parsers. For more information about performance and testing, check the Release Notes.

How can I get the Advanced Japanese Parser for my project?

Full information about the Advanced Japanese Parser is available on the Doc site. However, since this is an experimental release of the functionality, please reach out to @Anastasia Petrovskaia if you feel that the parser is applicable to your project. We are working on adding this capability to the Demo environment and are targeting March 2025 for this piece of work.

What is next?

Adoption and feedback from users will be a big part of maturing the Advanced Parsers, so there are no immediate plans to move the capability straight to EA/GA. You can provide feedback directly on the Advanced Language Parsers for Non-Latin Scripts using the Product Roadmap & Ideas Portal. The next Parser release will focus on improvements to the Standard Parsers. More Advanced Language Parsers for different languages/countries (e.g. Chinese, Arabic) are expected in H2 2025. For more information, reach out to Anastasia Petrovskaia.
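Step 2 above routes each record by script. As an illustration of that rule, here is a minimal Scala sketch using the JDK's Unicode script metadata to decide which parse path to take. The two parser functions are hypothetical stand-ins, not the Quantexa API.

```scala
// A minimal sketch of the routing rule from step 2: inputs containing any
// Japanese script (Hiragana, Katakana, or Han/Kanji) go to the Advanced
// Japanese Parser; Latin-only inputs go to the standard (composite) parser.
import java.lang.Character.UnicodeScript

object ParserRouting {

  private val japaneseScripts: Set[UnicodeScript] =
    Set(UnicodeScript.HIRAGANA, UnicodeScript.KATAKANA, UnicodeScript.HAN)

  def containsJapanese(input: String): Boolean =
    input.codePoints().toArray.exists(cp => japaneseScripts.contains(UnicodeScript.of(cp)))

  // Hypothetical stand-ins for the two parse paths.
  def parseWithAdvancedJapaneseParser(input: String): String = s"[advanced] $input"
  def parseWithStandardParser(input: String): String = s"[standard] $input"

  def parse(input: String): String =
    if (containsJapanese(input)) parseWithAdvancedJapaneseParser(input)
    else parseWithStandardParser(input)

  def main(args: Array[String]): Unit = {
    println(parse("ソニーグループ株式会社"))  // routed to the advanced parser
    println(parse("Sony Group Corporation")) // routed to the standard parser
  }
}
```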
Welcome to Parsers 4.2 | Release Announcement

Alongside the release of QP 2.7, we are happy to share release 4.2.0 of the Standard Parsers. This release extends the cleansing options you can define purely in config with configurable simple generic cleansers. We have introduced new config-based cleansers that allow you to perform replacements in strings, remove or keep parts of a string based on pre-defined options, change the case of a string for different languages, and extract specific parts of input strings, all without writing any Scala (a sketch of this idea appears at the end of this announcement). These back-end config improvements extend to the front-end: users of QP 2.7 will have access to the extended configurability described above within the UI. More details are available in the release notes and documentation for QP 2.7. The release also brings some key bug fixes and a small improvement to business parsing that should catch more edge cases of business names with odd punctuation distributions. For more information on the release features, please see the 4.2.0 release notes, and for general information on Parsers, see the documentation.
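To illustrate what "cleansing defined purely in config" can look like, here is a minimal Scala sketch of a config-driven cleansing chain. The operation names and config shape are assumptions made for illustration; they do not reflect the actual Standard Parsers configuration schema.

```scala
// A minimal sketch of config-driven cleansing: each cleanser is declared as
// data (as it would be in a config file), then interpreted at runtime, so no
// custom Scala is needed per deployment.
object ConfigurableCleansers {

  sealed trait CleanserConfig
  case class Replace(find: String, replaceWith: String) extends CleanserConfig
  case class RemovePattern(regex: String) extends CleanserConfig
  case class ChangeCase(locale: java.util.Locale) extends CleanserConfig
  case class ExtractPattern(regex: String) extends CleanserConfig

  def applyCleanser(input: String, config: CleanserConfig): String = config match {
    case Replace(find, replaceWith) => input.replace(find, replaceWith)
    case RemovePattern(regex)       => input.replaceAll(regex, "")
    case ChangeCase(locale)         => input.toUpperCase(locale)
    case ExtractPattern(regex)      => regex.r.findFirstIn(input).getOrElse(input)
  }

  // Apply the configured cleansers in order, threading the string through.
  def cleanse(input: String, configs: Seq[CleanserConfig]): String =
    configs.foldLeft(input)(applyCleanser)

  def main(args: Array[String]): Unit = {
    val chain = Seq(
      Replace("&", "AND"),
      RemovePattern("""[.,;]"""),
      ChangeCase(java.util.Locale.ENGLISH)
    )
    println(cleanse("Smith & Sons, Ltd.", chain)) // SMITH AND SONS LTD
  }
}
```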
Welcome to Parsers 4 | Parsers 4.0.0 Release Announcement

We are excited to announce the version 4.0 release of Quantexa's Standard Parsers. This release marks a shift in the way the Standard Parsers are used, and it includes an expansion of the data models so that more information is made available for use in Entity Resolution. Here are a few of the features you'll find in this release:

- A new low-code interface: Standard Parsers are now quicker and easier to deploy and use, with new, easily shareable customization options, integration with Data Fusion, and less custom code, giving better coverage of auto-migrations to support upgrades. Setup and customization now use configuration files, much like Data Fusion, meaning anyone can make changes without needing to write code.
- Updated Data Models: Entity Resolution can now form higher-quality Entities, thanks to more information stored in our Standard Data Models at the parsing stage, so Entities have more contextual information associated with them and alignment with international standards.
- Higher match rates with Variants: The introduction of Variants enables even more use cases and lets users resolve the same Entity in more ways, in both Search and within Entity Resolution itself, so it is possible to catch more edge cases. You can now use out-of-the-box Variants with the `address`, `individual` and `business` data models, and even define custom Variants (see the sketch at the end of this announcement).
- Name flexibility and internationalisation: In the global market, localizations to both individual and business names can impact the accuracy of data, so we've updated name structures to represent names in a wider variety of cultures more accurately, improving Entity quality.

For more detail on the new features, including contextual parsing (also known as composite parsing) and additional experimental country-specific address parsing, as well as compatibility and support with different versions of the Quantexa Platform and other minor changes, see the full set of Release Notes on the Quantexa Documentation site. If you are unable to access the Documentation site, please get in touch with your Quantexa point of contact or the Community team at community@quantexa.com.
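To illustrate the Variants concept, here is a minimal Scala sketch that derives alternative forms of an individual name so the same Entity can be matched in more ways. The specific variant rules below are illustrative assumptions, not the out-of-the-box rules shipped with the `individual` data model.

```scala
// A minimal sketch of the Variants idea: derive alternative representations
// of a parsed name that should still resolve to the same individual.
object NameVariants {

  case class IndividualName(givenName: String, familyName: String)

  // Hypothetical variant rules: full name, reversed order, and initialized
  // given name, all normalized to upper case.
  def variants(name: IndividualName): Set[String] = {
    val full     = s"${name.givenName} ${name.familyName}"
    val reversed = s"${name.familyName} ${name.givenName}"
    val initial  = s"${name.givenName.take(1)} ${name.familyName}"
    Set(full, reversed, initial).map(_.toUpperCase)
  }

  def main(args: Array[String]): Unit = {
    // JANE DOE, DOE JANE, and J DOE can all match the same Entity.
    variants(IndividualName("Jane", "Doe")).foreach(println)
  }
}
```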
Welcome to Parsers 4.1 | Release Announcement

We are excited to announce the release of version 4.1 of Quantexa's Standard Parsers. This release focuses on improving the integration with Fusion UI (look out for exciting 2.6 release announcements coming soon) and on improvements to the file structure of configuration files. The release includes the following highlights, which are detailed below:

- Consistency of configuration files: general improvements to Parser and lexicon configuration and files have been introduced to make sure the way you use these files is consistent across all available Parsers. This will make your configuration easier to understand and simplify future modifications.
- To minimise redundant data storage in Elasticsearch, you can now exclude business standardisation terms that aren't used in areas such as exclusions for Entity Resolution.
- Similarly, you can now choose to parse multiple names or just a single name with the Individual Parser to reduce your Elasticsearch footprint.
- For the Telephone Parser, you can now specify conditional parsing rules to increase output accuracy. For example, if you have more specific parsing rules for UK telephone numbers, you can use the country code to parse these numbers differently to the default telephone parsing behaviour (see the sketch at the end of this announcement).

Note: There are no changes in this release that affect the output of parsing.
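As an illustration of conditional parsing rules keyed on country code, here is a minimal Scala sketch. The rule and output types are hypothetical and do not reflect the actual Telephone Parser configuration API.

```scala
// A minimal sketch of conditional telephone parsing: specific rules are tried
// in order based on the number's country code, falling back to a default.
object ConditionalTelephoneParsing {

  case class ParsedTelephone(countryCode: String, nationalNumber: String)

  type Rule = String => Option[ParsedTelephone]

  // Default behaviour: strip non-digits and leave the number unclassified.
  val defaultRule: String => ParsedTelephone = raw =>
    ParsedTelephone("UNKNOWN", raw.filter(_.isDigit))

  // A more specific rule for UK numbers (+44): normalize to the national
  // format with a leading zero.
  val ukRule: Rule = raw => {
    val digits = raw.filter(_.isDigit)
    if (digits.startsWith("44")) Some(ParsedTelephone("44", "0" + digits.drop(2)))
    else None
  }

  // Try each conditional rule in order, falling back to the default.
  def parse(raw: String, rules: Seq[Rule]): ParsedTelephone =
    rules.view.flatMap(_(raw)).headOption.getOrElse(defaultRule(raw))

  def main(args: Array[String]): Unit = {
    println(parse("+44 20 7946 0000", Seq(ukRule))) // ParsedTelephone(44,02079460000)
    println(parse("+1 202 555 0100", Seq(ukRule)))  // falls back to the default rule
  }
}
```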