Entity Resolution Configuration & Parsing Health Checks

David_Short (Quantexa Team), edited April 15 in Getting Started

Entity Resolution (ER) and good Entity quality underpin all Quantexa deployments. Two areas that affect Entity quality are the accuracy of the Entity configuration and the accuracy of parsing.

These articles outline the ER health checks and Parsing health checks that the development team should carry out on deployments. The checks allow the team to identify, prioritize, and fix underlying issues that could be reducing Entity quality.

These checks must be completed as part of the initial deployment, and then repeated periodically over the lifetime of the deployment. New product functionality and changes in the data may mean that the configuration needs to be changed or enhanced over time.

Topics covered:

Entity Resolution Health Checks

  1. Pre-requisites
  2. Resolver JSON configuration health check steps
    1. Perform a comparison to the latest core Resolver JSON configuration
    2. Review configured Element exclusion criteria
    3. Review configured exclusions for Compounds in the relevant template
  3. Compound model health check steps
    1. Are all required Compounds being generated in ETL for the relevant Document types?
    2. Are Compounds being generated to populate elements required for exclusions in other Compounds?
    3. Do the traversals all look sensible?
    4. Do you have good coverage of unit tests?
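
As a rough illustration of the "comparison to the latest core Resolver JSON configuration" step, the sketch below walks two JSON documents and lists every path where they differ. It is a generic JSON diff, not a Quantexa tool: the function name diff_config and the keys in the sample fragments (templates, maxElementFrequency, exclusions) are made up for the example and are not taken from a real Resolver configuration.

```python
import json

MISSING = "<missing>"  # marker for a key that exists on one side only


def diff_config(core, deployed, path=""):
    """Return (path, core_value, deployed_value) for every point of difference."""
    differences = []
    if isinstance(core, dict) and isinstance(deployed, dict):
        # Walk the union of keys so additions and removals are both reported.
        for key in sorted(set(core) | set(deployed)):
            child = f"{path}.{key}" if path else key
            if key not in deployed:
                differences.append((child, core[key], MISSING))
            elif key not in core:
                differences.append((child, MISSING, deployed[key]))
            else:
                differences.extend(diff_config(core[key], deployed[key], child))
    elif core != deployed:
        # Scalars and lists are compared as whole values.
        differences.append((path, core, deployed))
    return differences


# Hypothetical fragments for illustration only, not real Resolver configuration.
core_json = '{"templates": {"default": {"maxElementFrequency": 500, "exclusions": ["phone"]}}}'
deployed_json = '{"templates": {"default": {"maxElementFrequency": 2000}}}'

config_differences = diff_config(json.loads(core_json), json.loads(deployed_json))
for difference in config_differences:
    print(difference)
```

Each reported path is a candidate for review: a difference may be a deliberate deployment-specific choice, or it may be a setting that was never updated when the core configuration changed.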

Parsing Health Checks

  1. Pre-requisites
  2. Parsing health check steps
    1. Is your deployment using the latest versions of Parsers?
    2. Has your deployment applied custom Parsing functions or wrappers?
    3. How does your Parsing compare to best practice Parsing?
    4. How well are the Parsers performing per source and country?
    5. How well-populated are the Parsed fields?
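
One way to approach the last two questions is to measure, per source and country, the share of records in which each Parsed field is populated. The sketch below assumes the Parsed output has already been loaded as a list of dictionaries; the field names (forename, surname, postcode), the source and country keys, and the function population_rates are hypothetical and only serve the example.

```python
from collections import defaultdict


def population_rates(records, fields):
    """Return {(source, country): {field: share of records with a non-empty value}}."""
    totals = defaultdict(int)
    populated = defaultdict(lambda: defaultdict(int))
    for record in records:
        group = (record["source"], record["country"])
        totals[group] += 1
        for field in fields:
            value = record.get(field)
            # Treat None, missing keys, and blank strings as unpopulated.
            if value is not None and str(value).strip() != "":
                populated[group][field] += 1
    return {
        group: {field: populated[group][field] / totals[group] for field in fields}
        for group in totals
    }


# Made-up sample records standing in for Parsed output.
sample_records = [
    {"source": "customer", "country": "GB", "forename": "Ann", "surname": "Lee", "postcode": "AB1 2CD"},
    {"source": "customer", "country": "GB", "forename": "", "surname": "Ng", "postcode": None},
    {"source": "transaction", "country": "FR", "forename": "Luc", "surname": "Roux"},
]

rates = population_rates(sample_records, ["forename", "surname", "postcode"])
for group, field_rates in sorted(rates.items()):
    print(group, field_rates)
```

Low rates for a particular source or country point to where Parser performance deserves a closer look; what counts as "low" depends on the data and is left to the team's judgment.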