Knowledge Base Article

Unify: A closer look at selected key features

This page provides further detail and guidance on some key features of the Unify workload.

Overview

The features and capabilities explained on this page are those you will encounter as you use the Unify workload. For a step-by-step walkthrough on setting up and using the Unify workload, see Unify: Step-by-step guide to using the workload.

Features

The following sub-sections provide further detail on some of Unify's key features.

Data Mapping

For a definition of Data Mapping, see Unify: Core concepts.

Data Mapping is an integral part of Quantexa’s Entity Resolution solution. Quantexa’s Data Mapping process in the Unify workload focuses on mapping your Data Sources to pre-defined Entity Type and Entity Group fields.

In the context of the Unify workload, Data Mapping seeks to answer some initial questions about your Data Source such as the following:

  • Which source fields match the Unify Entity attribute fields, and which fields should they be mapped to?
  • For source fields that do not directly match Unify’s pre-defined Data Mapping fields, what are the most suitable matches? If there are no suitable matches, why not?
  • Which Entity Types and Entity Groups does the source data populate, and what percentage of their fields is populated?

As noted in the step-by-step walkthrough, you can edit the output of the Data Mapping process, but the process itself runs automatically when you load a Data Source. This saves significant time and manual effort. However, for Unify to map your data accurately, the data must be in a suitable format and have a logical structure that the mapping process can read.
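
The population question above can be explored on your own data before you load it. The following is a minimal sketch, not part of the Unify workload: it assumes a Data Source held as a list of rows, and the source field names, target field names and sample values are all hypothetical. It reports the percentage of rows in which each mapped field holds a non-empty value.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def population_rates(rows: List[Row], mapping: Dict[str, str]) -> Dict[str, float]:
    """Return, per mapped target field, the percentage of rows with a non-empty value."""
    total = len(rows)
    rates: Dict[str, float] = {}
    for source_field, target_field in mapping.items():
        # Treat missing keys, None and whitespace-only strings as unpopulated.
        filled = sum(1 for row in rows if (row.get(source_field) or "").strip())
        rates[target_field] = 100.0 * filled / total if total else 0.0
    return rates


# Hypothetical sample data and mapping, for illustration only.
sample_rows = [
    {"cust_name": "Ada Lovelace", "addr": "12 Main St", "tel": ""},
    {"cust_name": "Alan Turing", "addr": None, "tel": "555-0100"},
    {"cust_name": "Grace Hopper", "addr": "1 Navy Way", "tel": None},
    {"cust_name": "", "addr": "9 Elm Rd", "tel": None},
]
sample_mapping = {"cust_name": "individual_name", "addr": "address", "tel": "telephone"}

rates = population_rates(sample_rows, sample_mapping)
for field, rate in rates.items():
    print(f"{field}: {rate:.1f}% populated")
```

A low rate for a field you expect to be well populated may indicate a formatting or structural problem in the source data that is worth fixing before mapping.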

Iterations

For a definition of Iteration, see Unify: Core concepts.

Running an Iteration serves two purposes:

  1. Conducting Entity Resolution on the Data Sources you include for that Iteration.
  2. Comparing Entity Resolution outputs across multiple Iterations that use different Data Sources, or different combinations of Data Sources. In addition to comparing data content, you can use Iterations to compare data quality, Entity Resolution metrics and field population rates between your Data Sources.

The first purpose is straightforward: the Entity Resolution features within the Unify workload let you build a trusted data foundation directly.

The second purpose would be more complex without the Unify workload, as a true comparison would require a significant investment of time and resources. With the Unify workload, you run multiple Iterations using the same step-by-step process and compare their outputs.
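
As an illustration of what comparing Iterations can involve, the sketch below takes per-Entity-Type counts of input records and resolved Entities for two Iterations and compares the average number of records per Entity. The counts, the `TypeSummary` structure and the function names are assumptions made for this example; they are not produced by or taken from the Unify workload.

```python
from typing import Dict, NamedTuple


class TypeSummary(NamedTuple):
    records: int   # total input records for an Entity Type
    entities: int  # total resolved Entities for that Entity Type


def records_per_entity(summary: Dict[str, TypeSummary]) -> Dict[str, float]:
    """Average number of input records behind each resolved Entity, per Entity Type."""
    return {t: s.records / s.entities for t, s in summary.items() if s.entities}


def compare(a: Dict[str, TypeSummary], b: Dict[str, TypeSummary]) -> Dict[str, float]:
    """Change in records-per-Entity from Iteration a to Iteration b, for shared Entity Types."""
    ra, rb = records_per_entity(a), records_per_entity(b)
    return {t: rb[t] - ra[t] for t in ra if t in rb}


# Hypothetical counts for two Iterations, for illustration only.
iteration_a = {"Individual": TypeSummary(1000, 800), "Address": TypeSummary(1000, 500)}
iteration_b = {"Individual": TypeSummary(1500, 1000), "Address": TypeSummary(1500, 600)}

change = compare(iteration_a, iteration_b)
for entity_type, delta in change.items():
    print(f"{entity_type}: {delta:+.2f} records per Entity")
```

In this sketch, a positive change means the second Iteration grouped more records into each Entity on average, which could reflect genuine overlap between the added Data Sources or a difference in linking behavior that merits review.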

Matching Levels

For a definition of Matching Level, see Unify: Core concepts.

Matching Levels help you tailor Unify’s Data Mapping and Entity Resolution processes to your Project’s needs.

As a reminder, there are three available Matching Levels within the Quantexa Unify workload: Default, Fuzzy, and Strict.

The following are example use cases for Fuzzy and Strict Matching Levels.

  • Fuzzy: You can use a Fuzzy Matching Level in a scenario like matching customers to a watchlist in the Financial Crime arena. Due to the seriousness of the matter, you want to ensure you find all possible matches. Even where there is Overlinking, you are happy to manually review the matches to find the correct ones.
  • Strict: You can use a Strict Matching Level in a scenario like generating a master set of customers in Master Data Management. As the output may be used to trigger automatic action, such as contacting customers, and you are unlikely to review the matches, you want to ensure that all generated matches are correct. Even where there is Underlinking, you are happy to have a smaller scope of matches given the reputational and practical consequences of any incorrect matches.

The following factors can help you decide which Matching Level to choose at the Data Mapping stage and for each Iteration:

  • The quality of your Data Source.
  • The completeness of your Data Source.
  • Your particular use case.
    • For example, if you are planning to use the Entity Resolution output to execute automated tasks without reviewing all matches, it may be better to use a Strict Matching Level.
    • For cases where you want to ensure you have all possible matches, even with Overlinking, you may want to use a Fuzzy Matching Level.

If you are not sure which Matching Level to use, you can opt for the Default Matching Level, as this strikes a balance between Overlinking and Underlinking.
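
The trade-off between Overlinking and Underlinking can be illustrated with a simple similarity threshold. The sketch below is an analogy only: it uses Python's standard `difflib.SequenceMatcher` on names, and the threshold values are hypothetical. It does not represent the algorithm or settings that the Unify workload uses for its Matching Levels.

```python
from difflib import SequenceMatcher

# Hypothetical similarity thresholds, chosen only to illustrate the trade-off.
# They are NOT the values or the algorithm used by the Unify workload.
THRESHOLDS = {"Strict": 0.95, "Default": 0.85, "Fuzzy": 0.70}


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0 and 1 for two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(a: str, b: str, level: str) -> bool:
    """Treat two names as a match if their similarity meets the level's threshold."""
    return similarity(a, b) >= THRESHOLDS[level]


pairs = [
    ("Jon Smith", "Jon Smith"),
    ("Jon Smith", "John Smith"),
    ("Jon Smith", "John Smyth"),
]
for left, right in pairs:
    results = {level: is_match(left, right, level) for level in THRESHOLDS}
    print(f"{left!r} vs {right!r}: {results}")
```

In this sketch, a lower threshold links more pairs and so risks Overlinking, while a higher threshold links fewer pairs and so risks Underlinking.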

Automated output

After completing an Iteration, the Unify workload automatically outputs the results of the Entity Resolution process into the following:

  • Iteration summary
    • The summary shown for an Iteration after Entity Resolution is a bar chart in the top-right corner. The chart compares the total number of input Records with the total number of resolved Entities for each Entity Type.

  • Power BI Report
    • The automatic report shows summaries of key information for Entity Types, such as Entity size, Entities by Address and Entities by Business and Individual counts.

  • Entity Resolution records tables and Entities tables
    • Records tables show the records that triggered the resolution of a particular Entity. The workload outputs multiple records tables, each covering a specific Entity Type, such as Individual or Address.
    • Entities tables show the Entities that the source data has resolved to. For example, you may input two Data Source tables and, after Entity Resolution, the workload outputs multiple additional tables showing the resolved Entities. Each table covers a specific Entity Type, such as Individual or Address.

  • Semantic Model
    • An Iteration’s Semantic Model shows the relationships between the tables described in the preceding point and your input Data Source tables. For further information on Semantic Models in Microsoft Fabric, see Power BI Semantic Models in Microsoft Fabric.

Additionally, using the automatic outputs, you can optionally create other outputs within the broader Fabric suite, including the following:

  • Other types of Power BI reports
    • Power BI is provided by Microsoft Fabric, not by the Unify workload. Power BI reports are typically based on one Semantic Model and can feature visualizations such as charts, graphs and tables to provide data insights. They can help you explore your data, and the output of Unify, further. For more information on Power BI reports, see Reports in Power BI.
  • Notebooks
  • Power Query (M script) with Dataflow Gen2
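
As an example of the kind of analysis you might run in a notebook, the sketch below derives Entity sizes from a records table. It assumes each row pairs a record identifier with the identifier of the Entity it resolved to; the row layout, identifiers and function names are hypothetical and do not reflect the actual schema of the Unify output tables.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical rows from a records table: (record_id, entity_id).
records = [
    ("r1", "e1"), ("r2", "e1"), ("r3", "e2"),
    ("r4", "e3"), ("r5", "e3"), ("r6", "e3"),
]


def entity_sizes(rows: Iterable[Tuple[str, str]]) -> Counter:
    """Count how many records resolved to each Entity."""
    return Counter(entity_id for _, entity_id in rows)


def size_distribution(sizes: Counter) -> Dict[int, int]:
    """Map each Entity size to the number of Entities of that size."""
    return dict(sorted(Counter(sizes.values()).items()))


sizes = entity_sizes(records)
distribution = size_distribution(sizes)

print(f"{len(records)} input records resolved to {len(sizes)} Entities")
for size, count in distribution.items():
    print(f"Entities with {size} record(s): {count}")
```

Unusually large Entities in such a distribution can be a useful starting point when reviewing for Overlinking.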

Next steps

For a guide to using the Unify workload, see Unify: Step-by-step guide to using the workload. For an applied example of the step-by-step guide, see Unify: Example workflow.
