This blog post discusses how to start a deployment from the base Entity Resolution configuration provided by the Quantexa On-demand Demo and Data Packs.
What are the Quantexa On-demand Demo and Data Packs?
The Quantexa On-demand Demo is a best-practice demonstration of how to apply Quantexa’s Entity Resolution using Data Packs. All deployments should use this as a base for their Entity Resolution.
Data Packs are resources that provide prewritten data models, ETL code, data generators, and UI components for commonly used third-party data sources. The four Data Packs used within the On-demand Demo are:
- Bureau van Dijk
- Dun and Bradstreet
- ICIJ
- Dow Jones
Further information on Data Packs can be found within the Data Packs documentation. Not all deployments will be using data sources with a Data Pack.
Why is this useful?
The On-demand Demo includes a core fragment containing the common Element definitions, Compound definitions, Entity definitions, and Resolution Templates used across multiple Data Packs, such as the Entity Resolution configuration for an Individual Entity. Starting from this fragment saves deployments a significant amount of time.
Much of the configuration is based on the output of the Standard Parsers, so it is recommended that you also use the Standard Parsers on any bespoke data sources.
All of the prebuilt Resolver configuration fragments used by the Quantexa On-demand Demo are hosted in the quantexa-ondemand repository. This includes the core Resolver configuration fragment and a separate Resolver configuration fragment for each Data Pack.
What did you find out during implementation?
- Copying the core fragment directly into your repository is a quick way to produce a first draft of your Entity Resolution configuration, especially as it has already been tuned against the various Data Pack models, some of which contain large amounts of data.
- Having multiple Resolver configuration fragments makes it easier to manage and revise Resolver configurations, particularly for deployments with multiple data sources. It also helps to centralize common configuration in one location and avoid duplication.
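To illustrate the point about avoiding duplication, here is a minimal Python sketch that reports any definition name appearing in more than one fragment. Everything in it is an assumption made for illustration: the `elements` and `compounds` keys, the definition names, and the `find_duplicate_definitions` helper are hypothetical and do not reflect the actual Resolver configuration schema or any Quantexa tooling.

```python
from collections import defaultdict


def find_duplicate_definitions(fragments):
    """Return {(section, name): [fragment names]} for every definition
    name that appears in more than one fragment.

    `fragments` maps a fragment name to a dict of sections, where each
    section maps definition names to their (hypothetical) definitions.
    """
    seen = defaultdict(list)
    for fragment_name, fragment in fragments.items():
        for section, definitions in fragment.items():
            for name in definitions:
                seen[(section, name)].append(fragment_name)
    return {key: owners for key, owners in seen.items() if len(owners) > 1}


# Hypothetical fragments: these keys and names are NOT the real schema.
core = {
    "elements": {"forename": {}, "surname": {}},
    "compounds": {"full_name": {}},
}
customer = {
    "elements": {"surname": {}, "customer_id": {}},
    "compounds": {},
}

duplicates = find_duplicate_definitions({"core": core, "customer": customer})
print(duplicates)
```

Here `surname` is reported because both fragments define it, which suggests the data-source fragment should rely on the shared definition rather than repeat it.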
Design considerations
- The recommended approach is to avoid modifying the core fragment and to put all additions within your respective <data-source>-resolver-config-fragment.json. This way, if changes are made to the core fragment, such as a Parsers update or a change to best practice, it is easier to take the updated version. However, if you wish to remove configuration, for example, a Compound from the default Resolution Templates, you must edit the core fragment.
- The effort to add new configuration on top of the core fragment is minimal, especially if your deployment’s data sources do not have many custom additional Elements and Compounds to add. Provided you use the Standard Parsers to cleanse individual names, business names, and addresses, they produce the Elements and Compounds that are already defined in the core fragment.
- You may find that your deployment’s data source models include additional fields to use as Elements and Compounds. These can be defined in Data Fusion, and then configured in a Resolver configuration fragment specific to that data source.
- Any significant modifications required that are out of scope should be fed back to the Quantexa On-demand Demo team.
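The layering described in these considerations can be sketched in Python as a purely additive merge. This is an illustration under stated assumptions, not how the Resolver actually loads fragments: the `compounds` key, the definition names, and the `merge_fragments` helper are all hypothetical, and the real fragment schema will differ.

```python
import copy


def merge_fragments(core, addition):
    """Return a new config with `addition` layered on top of `core`.

    The merge is additive only: it never changes or removes anything
    that the core fragment defines, and it leaves both inputs untouched.
    """
    merged = copy.deepcopy(core)
    for section, definitions in addition.items():
        target = merged.setdefault(section, {})
        for name, definition in definitions.items():
            if name in target:
                raise ValueError(
                    f"{section}.{name} is already defined in the core fragment"
                )
            target[name] = copy.deepcopy(definition)
    return merged


# Hypothetical content: these keys and names are NOT the real schema.
core_fragment = {"compounds": {"full_name": {}, "full_address": {}}}
data_source_fragment = {"compounds": {"customer_id": {}}}

merged = merge_fragments(core_fragment, data_source_fragment)
print(sorted(merged["compounds"]))
```

Because a merge like this can only add definitions, the core fragment stays byte-for-byte identical to the published version and can be swapped for an updated one. It also shows why removing something, such as a Compound, cannot be expressed in a data-source fragment and requires editing the core fragment itself.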
Additional Resources