Unify: Example workflow

This page walks through an example workflow from the point of view of a hypothetical customer, using the Data Sources provided in the Unify Demo version.

It shows the step-by-step guide in action, to help you understand how to use the workload and the different ways to use its output.

Overview
Prerequisites
Creating your Project
Adding Data Sources
Data Mapping
Running a first Iteration
Viewing and using the Unify output
Uploading a new Data Source and running a new Iteration
Comparing results between Iterations
Next steps

Overview

The following steps show how you would apply the step-by-step guide and interact with the workload at each stage, including analyzing and using the workload outputs.

Prerequisites

Before beginning, you complete all the required prerequisites.

These are listed in the Prerequisites section of the step-by-step guide.

Creating your Project

You now proceed to create your Project.

You navigate to your workspace and click to create a New Item.
You select the Unify workload from the pop-up list.
You call the new Project BrandABC and create the Project.

Adding Data Sources

You have both the Contoso and Northwind data. You want to analyze the Contoso data first, so you decide to upload it by itself for now.

On your Project homepage, you click to add a Data Source from the Explorer panel.
In the pop-up, you select Contoso as your first Data Source and connect it.

The Data Mapping process automatically begins.

Data Mapping

Once the Data Mapping is complete, you are presented with the mapping output.

You review the mapping output and see that the nationalId field is not mapped. It is therefore not included in the Entity Resolution process.
You decide to edit the mapping schema to map it to the Individual -> nationalId field, so that it is now included in the Data Mapping.
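
To make the review step more concrete, the following minimal Python sketch shows the idea of spotting source fields that have no target in a mapping. The dictionary layout, the function name `unmapped_fields`, and every field name other than `nationalId` are assumptions made for illustration only. In the workload itself you edit the mapping schema in the user interface, and its actual format is not described on this page.

```python
# Hypothetical illustration: the mapping is modelled as a plain dict from a
# source field name to a target "EntityType -> field" path (None = unmapped).
def unmapped_fields(mapping):
    """Return the source fields that have no target, in sorted order."""
    return sorted(field for field, target in mapping.items() if target is None)


# Assumed example mapping; only nationalId is taken from the walkthrough.
mapping = {
    "fullName": "Individual -> name",
    "address": "Address -> address",
    "nationalId": None,
}

# Before the edit, nationalId is reported as unmapped.
print(unmapped_fields(mapping))

# Mapping nationalId so that it takes part in Entity Resolution.
mapping["nationalId"] = "Individual -> nationalId"
print(unmapped_fields(mapping))
```

The point of the sketch is only that an unmapped field is invisible to Entity Resolution until it is given a target.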

Running a first Iteration

You want to judge the quality of the Contoso Data Source by itself first, so you decide to run Entity Resolution for that source only.

NOTE: Remember that single-source Entity Resolution is a viable use for the Unify workload, even though you would more typically run Entity Resolution between two or more Data Sources.

You create the first Iteration, name the Iteration Contoso, and click Run.
The Iteration completes and you can now analyze the results.

Viewing and using the Unify output

The automatic outputs of the workload are as follows:

Iteration summary
Power BI Report
Entity Resolution tables
Semantic Model of the Entity Resolution tables

(A) You first view the Iteration summary.

The summary appears automatically once the Iteration is complete.
You see that the Total Entities chart and summary of Output Data show the total number of resolved Entities for each Entity Type.

(B) You next want to view the Power BI Report, which includes more detailed bar charts and data tables.

You navigate to your Workspace and open the Power BI Report. For guidance on how to find the Power BI Report, see Unify: Step-by-step guide to using the workload.
You see the bar charts showing counts for Entities by certain measures. You navigate through the report using the different tabs on the Pages panel on the left, which allows you to view the results for each Entity Type.
You also want to explore the underlying data. Therefore, you scroll down to the data table below the bar charts.
You click on the table and see that the Filters panel on the right now shows certain options. You use these options to examine the data.

- For example, under the Individual tab on the Pages panel, you use the Name filter to search for all names matching ‘DEBORA’ (uppercase) and find that three Entities match.
- As another example, under the Address tab on the Pages panel, you use the Address filter to search for all addresses matching ‘St. Peters’ and find two Entities that match.

IMPORTANT: The search filters are case-sensitive.
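
The effect of case sensitivity can be illustrated with a short Python sketch. It is not how the Power BI filter is implemented. The function name `filter_names`, the sample names, and the assumption that the filter performs a "contains" match are all hypothetical.

```python
def filter_names(names, term, case_sensitive=True):
    """Return the names that contain term, optionally ignoring case."""
    if case_sensitive:
        return [name for name in names if term in name]
    folded = term.casefold()
    return [name for name in names if folded in name.casefold()]


# Made-up sample values, not taken from the Demo data.
names = ["DEBORA SMITH", "Debora Jones", "DEBORAH LEE", "JOHN DOE"]

# A case-sensitive search only finds names written exactly as typed.
print(filter_names(names, "DEBORA"))
print(filter_names(names, "Debora"))

# Ignoring case finds every spelling of the name.
print(filter_names(names, "debora", case_sensitive=False))
```

If a search returns fewer Entities than you expect, it is worth repeating it with the capitalization used in the source data.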

(C) You next want to view the Entity Resolution tables and the Semantic Model.

You navigate to your Workspace and open the Semantic Model. For guidance on how to find the Semantic Model, see Unify: Step-by-step guide to using the workload.
On the Semantic Model homepage, you see the Tables panel on the right, which lists the output data tables from your Iteration. You click on one of the tables to view the underlying data.
You next open the Semantic Model of the tables.
You review the model to identify the relationships between the tables.
Clicking on a specific table in the Data panel on the right takes you to that table in the Semantic Model.

IMPORTANT: You can also view the output data by opening the OneLake Lakehouse you sent your Iteration output to. Note that if your Lakehouse is shared within your organisation, other data tables unrelated to your Iteration appear here too.
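
In a shared Lakehouse, one way to pick out the tables that belong to a given Iteration is to filter on the table name. The sketch below assumes that output tables follow the `quantexa_<Project>_<Iteration>_...` pattern that appears in the comparison code later on this page. The function name `iteration_tables`, the case-insensitive comparison, and the sample table `sales_2023` are assumptions for illustration.

```python
def iteration_tables(table_names, project_name, iteration_name):
    """Return the table names that start with the assumed output prefix."""
    prefix = f"quantexa_{project_name}_{iteration_name}_".lower()
    # Compare in lower case, since it is not stated how the Lakehouse stores case.
    return sorted(name for name in table_names if name.lower().startswith(prefix))


# Illustrative table list. In a notebook the names could likely be collected with
# [t.name for t in spark.catalog.listTables("EntityOutputs")] instead.
table_names = [
    "contoso",
    "quantexa_BrandABC_Contoso_individual_records",
    "quantexa_BrandABC_ContosoNorthwind_individual_records",
    "sales_2023",  # hypothetical unrelated table in a shared Lakehouse
]

print(iteration_tables(table_names, "BrandABC", "Contoso"))
```

Because the prefix ends with an underscore, the Iteration named Contoso does not accidentally match tables from ContosoNorthwind.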

Uploading a new Data Source and running a new Iteration

Once you have reviewed the results of your first Iteration that uses only the Contoso Data Source, you decide to upload the Northwind Data Source and run an Iteration between the two.

The steps are the same as in the preceding points, starting from Adding a Data Source.
However, one difference is that you must ensure you select Northwind as a second Data Source when running your second Iteration.
You name this Iteration ContosoNorthwind.
Once the second Iteration runs, you view the results, using the steps outlined in the Viewing and using the Unify output section.

Comparing results between Iterations

You want to compare results between the two Iterations. You do this programmatically by using the Notebook functionality within Fabric.

You complete the following steps:

You open the Lakehouse where the output data is stored.
You click Open notebook from the top menu bar and select New notebook.
Next, you input the code required to run the comparison. The following code block provides an example of the code you would input if you were comparing the changes to the Individual Entity type, especially the national ID Entity Group.

IMPORTANT: For your own Project, you must configure the name of the Lakehouse you selected for the outputs, and the Project and Iteration names.

# Configure to suit your Project.

lakehouse = "EntityOutputs"  # Name of the Lakehouse containing the output of the Iterations
projectName = "BrandABC"  # Name of the Quantexa Unify Project
contosoOnlyIterationName = "Contoso"  # Name of the Iteration with Contoso data only
contosoNorthwindIterationName = "ContosoNorthwind"  # Name of the Iteration with Contoso and Northwind data

# Raw input data (spark and display are provided by the Fabric notebook session)
contosoRecords = spark.sql(f"SELECT * FROM {lakehouse}.contoso")

# Resolved Entity output: Contoso only
individualsContosoOnly = spark.sql(f"SELECT * FROM {lakehouse}.quantexa_{projectName}_{contosoOnlyIterationName}_individual_records")

# Resolved Entity output: Contoso and Northwind
individualsContosoNorthwind = spark.sql(f"SELECT * FROM {lakehouse}.quantexa_{projectName}_{contosoNorthwindIterationName}_individual_records")

# Rows in the Contoso and Northwind output that are absent from the Contoso-only output:
# new records, plus records that have changed the Entity they are associated with
changedDocuments = individualsContosoNorthwind.exceptAll(individualsContosoOnly)
display(changedDocuments)

# All records of the Entities in the Contoso and Northwind Iteration that have changed
# (distinct() stops an Entity with several changed records from duplicating rows)
changedEntities = individualsContosoNorthwind.join(changedDocuments.select("entityId").distinct(), "entityId")
display(changedEntities)

# Join on the raw Contoso data to see the source records behind each changed Entity
entitiesWithRawData = changedEntities.join(contosoRecords, changedEntities["documentId"] == contosoRecords["customerID"], "inner")
display(entitiesWithRawData.drop("entityType", "documentType", "documentId"))

This outputs a table showing the differences between the two Iterations.
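
To show what the comparison is looking for without a Spark session, here is a minimal pure-Python sketch of the same idea. It assumes each record belongs to exactly one Entity and that Entity IDs can be compared across Iterations, which is also what the notebook code above relies on. The function name `classify_documents` and all IDs are hypothetical.

```python
def classify_documents(before, after):
    """Compare two documentId -> entityId mappings from two Iterations."""
    result = {"unchanged": [], "reassigned": [], "new": [], "removed": []}
    for doc_id, entity_id in after.items():
        if doc_id not in before:
            result["new"].append(doc_id)  # only present in the later Iteration
        elif before[doc_id] == entity_id:
            result["unchanged"].append(doc_id)
        else:
            result["reassigned"].append(doc_id)  # now linked to another Entity
    result["removed"] = [doc_id for doc_id in before if doc_id not in after]
    return {key: sorted(ids) for key, ids in result.items()}


# Made-up IDs: C* stands for Contoso records, N* for Northwind records.
contoso_only = {"C1": "E1", "C2": "E2"}
contoso_northwind = {"C1": "E1", "C2": "E9", "N1": "E9"}

print(classify_documents(contoso_only, contoso_northwind))
```

In this sketch, a record listed under "reassigned" corresponds to a Contoso record whose Entity changed once the Northwind data was added, which is the kind of row the notebook comparison surfaces.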

Next steps

Remember to refer back to the step-by-step guide when using the workload, for full instructions on how to use it.

Updated 30 days ago

Anita_Subedi

Quantexa Team



Related Content

Unify: Core concepts

Unify: FAQs

Unify: How the workload can help you

Adding the Unify workload: technical prerequisites

Unify: What the workload does
