Unify: Core concepts

This page describes the core concepts underpinning the Quantexa Unify workload.

Overview

The Unify workload is built on the Entity Resolution component of Quantexa's industry-leading Decision Intelligence Platform. This provides Unify with best-in-class Entity Resolution capabilities, all available in a few clicks inside your Fabric tenant.

The concepts listed on this page are fundamental to understanding the Unify workload's capabilities and Quantexa's Entity Resolution features within the workload.

Core concepts

The following sub-sections describe the core concepts underpinning the Unify workload.

Entity

An Entity is the representation of a real-world person or object, such as a customer or bank account. Quantexa distinguishes Entities from their real-world counterparts to make clear that Entities are simply compiled from data points found in the Data Sources you provide.

Entity Type

An Entity Type is a category of Entity. Entity Resolution in the Unify workload recognizes the following Entity Types:

- Individual
- Business
- Telephone
- Address
- Account

Entity Group

An Entity Group provides further refinement of Entities within an Entity Type. For example, a Telephone Entity may contain a landline entry as well as a mobile phone entry. The landline and mobile phone numbers are each Entity Groups within the Telephone Entity Type. Quantexa provides several predefined Entity Groups within the Unify workload.

Entity Resolution

Entity Resolution is the process of identifying Entities within your Data Sources by finding the various, and likely disparate, occurrences of each Entity across the available data.

Based on your use case and data quality, you can adjust the strictness threshold for matching, known as the Matching Level, that the workload uses for Entity Resolution. The Matching Level impacts the level of Overlinking or Underlinking. These concepts are explained below.

REMEMBER: You can view metrics about the resolved Entities in, for example, Power BI, or by using the workload's output tables or Delta Lake files in OneLake.

Matching Level

A Matching Level is a strictness threshold for matching that Unify refers to when deciding whether to resolve Entity references. You must specify the Matching Level that the workload should apply to an Iteration. For each Iteration, you can choose one of the following three Matching Level options:

- Default: The standard Matching Level that applies to most use cases, striking a balance between Overlinking and Underlinking. Overlinking and Underlinking are explained below.
- Fuzzy: A looser Matching Level that casts a wider net. It enables more matches to be found, but may result in some Overlinking.
- Strict: A stricter Matching Level that only resolves Entity references where there is strong confidence that the match is correct. It ensures no incorrect matches are made, but may result in some Underlinking.

For further information on Matching Levels, see Unify: A closer look at selected key features.

Overlinking

Overlinking occurs when multiple references are incorrectly linked to the same Entity, even though they refer to different real-world Entities. An Overlinked Entity is an Entity that is incorrectly resolved with one or more other Entities. Overlinking is typically caused by similarities between the records of different Entities, such as two separate customers having the same name and even the same address.
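To make the trade-off between Overlinking and Underlinking concrete, the following is a deliberately simplified Python sketch. The similarity scoring, thresholds, and example records are invented for illustration only; this is not Quantexa's matching logic.

from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    # Crude string similarity between two field values, from 0.0 to 1.0.
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()

def should_link(rec_a: dict, rec_b: dict, threshold: float) -> bool:
    # Link two records if their average field similarity clears the threshold.
    fields = ["name", "address"]
    score = sum(similarity(rec_a[f], rec_b[f]) for f in fields) / len(fields)
    return score >= threshold

rec_1 = {"name": "John Smith", "address": "1 High Street, Leeds"}
rec_2 = {"name": "Jon Smith",  "address": "1 High St, Leeds"}       # same person, messy data
rec_3 = {"name": "Joan Smith", "address": "1 High Street, Leeds"}   # different person, same address

for threshold in (0.90, 0.97):
    print(threshold, should_link(rec_1, rec_2, threshold), should_link(rec_1, rec_3, threshold))

# With this toy data, the looser threshold recovers the genuine rec_1/rec_2 match but also
# links rec_1/rec_3 (Overlinking), while the stricter threshold avoids the false link but
# misses the genuine match (Underlinking).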
Underlinking

Underlinking occurs when two or more references to the same real-world Entity are not linked in the dataset. An Underlinked Entity is an Entity that is only partially resolved. Underlinking is typically caused by missing or incorrectly entered data, such as one customer being listed multiple times in one database under different names or addresses, with no other data to connect those references.

Project

A Project is one instance of your Unify workload. It is a collection of Data Sources you have uploaded that you can then use for various Iterations.

Version History

All changes to a Project, such as the upload of new Data Sources, are automatically recorded in the Version History. You can view the history of your changes by clicking Version History under your workload's Home tab.

Data Source

A Data Source is a Lakehouse Table in OneLake, which you can create from a file you upload or from another source in OneLake. You upload your Data Sources to a Project within your Unify workload. You can upload multiple Data Sources to your Project. However, you can only upload one Data Source at a time.

In the Demo version of Unify, you cannot use your own Data Sources. Instead, Quantexa provides example customer Data Sources for the following two fictional product brands:

- Contoso
- Northwind

Each one contains example data such as names, addresses, and telephone numbers for customers of the brand, but each file has a different schema and columns, reflecting the diverse and messy data typically encountered in an organization.

In Full User and Trial versions of Unify, you may use your own Data Sources. These may contain your organization's internal data or external data from third parties, such as corporate registries or watchlists.

Data Mapping

Data Mapping is an automatic process that runs when you upload a Data Source to your Project. It does the following:

1. Analyzes the uploaded Data Source's contents.
2. Uses an inference engine to determine the appropriate data schema. For example, a field containing names is mapped to the Individual Entity Type. From this, it then maps the component parts of the field to the appropriate Entity Groups within that Entity Type, such as Forename or Surname.
3. Applies the necessary parsing, cleansing, and standardization of your raw input data. For further information, see the definition for Parsing, Cleansing, and Standardization on this page.

After the mapping process is complete, a Data Mapping panel lets you view and refine the results of the process. You can also view data quality metrics for the raw input data.

For further information on Data Mapping, see Unify: A closer look at selected key features. For guidance on reviewing and editing the initial Data Mapping output, see Unify: Step-by-step guide to using the workload.

Parsing, Cleansing, and Standardization

The Unify workload parses, cleanses, and standardizes your Data Source data automatically as part of the Data Mapping process. It uses Quantexa's Machine Learning model to do so.

Parsing splits source data into its component parts. For example, parsing a raw full-name entry of Michael Greene creates Forename = Michael and Surname = Greene.

Cleansing manipulates the raw data to prepare it for optimal Entity Resolution. For example, it removes generic terms such as Ltd or Organization, and removes punctuation and default values. It also converts all data to uppercase.
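As a rough illustration of the kind of output parsing and cleansing produce, here is a toy Python sketch. It is not Quantexa's Machine Learning model; the rules below are simplified stand-ins for illustration.

import re

def parse_full_name(raw_name: str) -> dict:
    # Parsing: split a raw full-name string into its component parts.
    parts = raw_name.strip().split()
    return {"forename": parts[0], "surname": " ".join(parts[1:])}

def cleanse(value: str) -> str:
    # Cleansing: uppercase, strip punctuation, and drop generic terms such as LTD.
    value = value.upper()
    value = re.sub(r"[^\w\s]", "", value)                  # remove punctuation
    value = re.sub(r"\b(LTD|ORGANIZATION)\b", "", value)   # remove generic terms
    return " ".join(value.split())                         # collapse extra whitespace

print(parse_full_name("Michael Greene"))   # {'forename': 'Michael', 'surname': 'Greene'}
print(cleanse("Greene & Sons Ltd."))       # GREENE SONS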
Standardization replaces different presentations of the same data with a single version for consistent formatting. For example, a dataset may contain USA, AMERICA, UNITED STATES, or UNITED STATES OF AMERICA in the country field. Standardization converts all of these to US.

The main purpose of parsing, cleansing, and standardization is to create consistent data that facilitates linking through Entity Resolution.

Iteration

An Iteration is the execution of Entity Resolution for a Project at a specific version. You can select a different set of Data Sources for each Iteration, which may help you identify the Data Sources that provide the highest quality of Entity data.

An Iteration execution submits a series of automatic background jobs to do the following:

- Resolve and build Entities.
- Generate the resulting Entity data as Lakehouse tables, which you can view in OneLake or Power BI. These typically include tables for the different Entity Types, and a table containing links between the records and the resulting Entities to show how the Entities have been built from your Data Sources. You can also view a Semantic Model showing the relationships between the output tables.

When executing an Iteration, you can select the Matching Level you want to use when resolving Entities. For further details, see the definition for Matching Level in this document.

For further information on Iterations, see Unify: A closer look at selected key features.

Semantic Model

The Semantic Model output by an Iteration shows the relationships between the input and output tables of that Iteration. For further information on Semantic Models in Microsoft Fabric, see Power BI Semantic Models in Microsoft Fabric. For further information on Semantic Models and other automated Unify outputs, see Unify: A closer look at selected key features.

Next steps

For a guide to using the Unify workload, see Unify: Step-by-step guide to using the workload. For an applied example of the step-by-step guide, see Unify: Example workflow.

Unify: How the workload can help you

This page provides an overview of the Quantexa Unify workload for Microsoft Fabric and how it can help you in your data projects.

Overview of the Quantexa Unify workload

The Quantexa Unify workload brings a critical data transformation component into the Microsoft Fabric ecosystem: Entity Resolution. The Unify workload is built on the industry-leading, AI-driven Entity Resolution component of Quantexa's Decision Intelligence Platform. As a result, the workload empowers data teams by enhancing data quality and usability, eliminating data silos, and allowing you to connect data at scale.

How can the Unify workload help you?

The Unify workload delivers best-in-class Entity Resolution, providing deeper contextualization and refinement of your datasets compared to traditional record-matching methods. It also simplifies data management and allows you to integrate and update data from multiple sources continuously.

Entity Resolution through the Unify workload quickly and easily elevates the data on which you base your data analysis and real-world decision-making. This helps you unlock deeper insights and make smarter decisions with ease. For more information on Entity Resolution in the Unify workload, see Unify: What the workload does.

Why should I use Unify instead of other Entity Resolution tools?

By using the Quantexa Unify workload, you will benefit from Quantexa's industry-leading Entity Resolution capabilities.
Additionally, key features of the Unify workload include the following:

- No-code interface that allows users of all types to benefit from the workload.
- Automated Data Mapping.
- Advanced Entity matching, including the ability to adjust the 'strictness' of Entity matching between Iterations.
- End-to-end Entity Resolution processing that can complete in under one hour.
- Scalable for high-volume datasets and large numbers of datasets.
- Outputs data into tables that you can use to build Semantic Models or to enhance your data analytics, for example within Power BI and other applications.
- Outputs deduplicated, AI-ready data that can be used, for example, for Machine Learning and AI models in Fabric.
- Helps you identify data quality issues through Power BI reports.
- Seamless integration into your Fabric project.
- Low-friction sign-up process with minimal onboarding requirements.
- Supports team collaboration within a single platform.

In short, the Quantexa Unify workload helps you easily and quickly create a trusted, connected, and contextualized data foundation.

How the Unify workload fits into the Fabric ecosystem

When you first add the Unify workload, you are provided with a Demo version of the workload that only allows you to use the Data Sources that Quantexa provides. On requesting a Full User license, you are then provided with full access to the workload. This allows you to use your own Data Sources and run the full workload within your Fabric tenant.

An example workflow that shows how Unify fits into the Fabric ecosystem is as follows:

1. You have Data Sources that include customer and supplier information. Before using the Unify workload, you use OneLake to connect and centralize access to your Data Sources. You connect multiple Data Sources within Fabric.
2. Although your Data Sources contain customer and supplier information, there is no customer key or unique ID to indicate which references are to the same individuals or companies. You therefore use the Quantexa Unify workload to match references to the same individuals and companies across your Data Sources and create a unique ID for each individual and company. This is your 'resolved' data.
3. Following on from the Unify workload, you could use your resolved data in the following ways:
   - Data warehouse specialist: To aggregate your data in a Fabric Data Factory flow.
   - Power BI engineer: To combine data from your Data Sources into visualizations in Power BI.
   - Data scientist: To develop a machine learning model using Fabric Notebooks.

The preceding examples are just three of the many ways you can use your resolved data downstream of the Unify workload.

Next steps

If you are working with datasets of any size that would benefit from Entity Resolution, try the Quantexa Unify workload. You can test out or purchase the workload in the following ways:

Demo version: This is a free preview open to all Fabric users that allows you to test out some of the workload's key features. In this preview, you can only use the Data Sources that Quantexa provides. To access the Demo version of Unify, also known as the Public Preview version, click here.

NOTE: To access the link, ensure you are logged into your Fabric account in your browser.

Full User and Trial versions: The Full User version provides you with full access to the Unify workload, including allowing you to use your own Data Sources. You can access the Full User version directly through a license subscription.
Additionally, you can gain temporary access to the Full User version of Unify through a Trial version. This allows you to explore all the workload features on a temporary, unpaid license. To purchase the Full User version or access the Trial version of Unify, contact UnifyAndFabric@Quantexa.com.

To find out more about Entity Resolution and the Unify workload process, see Unify: What the workload does.

Unify: Step-by-step guide to using the workload

This page provides a step-by-step guide to setting up and using the Unify workload. The guide indicates where capabilities and features differ in the Demo version of the workload, which has restricted capabilities compared to the Full User and Trial versions.

Overview

The following steps provide a summary overview of your end-to-end process when using the Unify workload:

1. Prepare your Data Sources.
2. Launch the Quantexa Unify workload.
3. Create a Project.
4. Load your Data Sources. Loading your Data Sources automatically triggers Data Mapping of each Data Source.
5. Review the output of the Data Mapping process and make any amendments if needed.
6. Run an Iteration to conduct Entity Resolution.
7. View your Iteration's Entity Resolution results.

The following diagram provides a visual overview of the workflow:

Prerequisites

Before proceeding to the Using the Unify workload section to start using the workload, complete the following prerequisites:

1. Ensure your Fabric administrator has enabled Microsoft Fabric for your organization. For details, see Enable Microsoft Fabric for your organization - Microsoft Fabric.
2. Next, add the Unify workload to your Fabric tenant as follows:
   2.1. From the Fabric home page, click the Workloads button in the left navigation pane.
   2.2. From the Add more Workloads section, click the Quantexa Unify workload to add it to your tenant.
   2.3. Once added, the workload appears in the My Workloads section on the same page.
   2.4. For further details on adding workloads, see the Microsoft tutorial.
3. Next, someone in your organization with the appropriate permissions, such as a Capacity, Tenant, or Workspace Administrator, must activate the Quantexa Unify workload.
4. Before using the workload, you must also provide consent to the Quantexa application. Your organization's Workload Administrator will typically have provided consent on behalf of all users from your organization already. However, if you experience any issues, contact your Workload Administrator.
5. Finally, ensure you have prepared any Data Sources you want to use with the workload and that they are in a suitable format to upload as Lakehouse objects to Fabric.

NOTE: For the Demo version of Unify, you cannot use your own Data Sources and must use those provided by Quantexa instead.

Using the Unify workload

The following steps guide you through using the Unify workload, after you have prepared your Data Sources and added the workload. They are separated into the following sub-sections:

1. Creating your Project.
2. Adding Data Sources.
3. Data Mapping.
4. Running an Iteration.
5. Viewing and using the Unify output.

(1) Creating your Project

This section guides you through creating your Project in the Unify workload.

1. After completing the prerequisites, navigate to the workspace you want to use the Unify workload in. You can navigate to the workspace by clicking the Workspaces item on the left-hand sidebar and selecting the workspace from the list. Clicking your workspace in the list takes you to the workspace's homepage.
2. Launch the workload by clicking + New item in the top left of your workspace's homepage. This brings up a pop-up list of workloads.
3. Click the Unify workload you want to use from the Others section at the bottom of the list. A pop-up titled New Project appears.

IMPORTANT: If a permission request appears at this stage, click Agree.

4. In the New Project pop-up, type in a unique name for your Project that is easy for you to recall and identify.
5. Click Create. You are then taken to the homepage of your Quantexa Unify Project.

(2) Adding Data Sources

Once on your Project homepage, see the Explorer panel on the left side of the page. This shows your Project's Data Sources and Iterations. On setting up your Project for the first time, the panel shows that you do not have any Data Sources or Iterations.

To add a Data Source, click Add Data Source using one of the following options:

- From the Explorer panel, click the + button next to Data Sources.
- From the menu under the Home tab, click Add Data Source.
- On the main section of the page, click Add Data Source.

When you click Add Data Source, a pop-up appears listing the various Lakehouse objects you can choose as your Data Source. This is your OneLake catalogue. Select the Lakehouse you want to add as a Data Source and click Connect.

NOTE: You can only add one Data Source at a time, so you must repeat this process for each Data Source you want to add.

IMPORTANT: If you are using a Demo version of the Unify workload, you cannot upload your own Data Sources. Instead, you can only use the Data Sources Quantexa provides in Unify: Contoso and Northwind.

(3) Data Mapping

Once you have connected a Data Source, the Data Mapping process for that Data Source runs automatically. Once the Data Mapping process completes, a Data Mapping table appears in the main section of the page. The table contains the following columns:

- Field: The column name pulled from the Data Source, such as Forename, CustomerAddress, and Email. A key symbol against a field, for example against customerID, indicates that the field is a Primary Key. A Primary Key is a unique identifier for the Entity Resolution process. A field must be 100% unique and 100% populated to qualify as a Primary Key.
- Entity Mapping: The Entity Group that the Field maps to, such as Business, Individual, or Email.
- Mapping Field: The Entity attribute that the Field maps onto, such as forename or dateOfBirthString.
- Type: The Field's data type, such as Optional Int, String, or Optional String. For example, the schema does not strictly require an Individual Entity Type to include a date of birth, making CustomerDoB an Optional String.
- Uniqueness: The number of distinct values as a percentage of total rows.
- Populated: The proportion of rows that have a value in this field.
- Distinct Values: The count of distinct values within the field column.

At the bottom of the Data Mapping section is an expandable Data Viewer panel. The panel shows the following:

- The Raw Data tab shows the field input strings that the Data Mapping process pulled from your Data Source.
- The Entity Data tab shows the cleansed, parsed, and standardized output for that Data Source, mapped to Entity Resolution fields.

Once the Data Mapping table appears, you are advised to review its outputs and amend the Data Mapping schema as needed. For example, the process may accidentally recognize the three component parts of one address as three separate addresses. In such cases, you may want to manually amend the mapping table.
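If you want to cross-check profiling figures such as Uniqueness, Populated, and Distinct Values outside the Data Mapping panel, you can do so in a Fabric notebook, where the spark session is available by default. The following is a minimal PySpark sketch; the table name (customers) and column name (CustomerDoB) are illustrative placeholders, not fixed names from the workload.

from pyspark.sql import functions as F

# Load the connected Data Source table from the Lakehouse (the table name is illustrative).
df = spark.read.table("customers")
total_rows = df.count()

column = "CustomerDoB"  # illustrative column name
stats = df.agg(
    F.countDistinct(column).alias("distinct_values"),
    F.count(column).alias("populated_rows"),   # count() on a column ignores nulls
).first()

print(f"Distinct Values: {stats['distinct_values']}")
print(f"Populated:       {stats['populated_rows'] / total_rows:.1%}")
print(f"Uniqueness:      {stats['distinct_values'] / total_rows:.1%}")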
To manually amend the Data Mapping, complete the following steps:

1. Review the main Data Mapping section and amend any of the Entity Mapping and Mapping Field allocations using the drop-down options, as needed.
2. Additionally, review the Manage Entities tab, under the Data Source tab in the top left. The Manage Entities tab allows you to review the Entity Types that are pre-populated by the Data Mapping process, and the Entity Groups within these Entity Types. Unify provides six different Entity Types. However, you can add new Entity Types as needed. This feature is available in all versions of Unify except the Demo version.
3. Within the Manage Entities tab, the Entity Groups sub-tab lists the Entity Groups for each Entity Type. Ensure you click through each Entity Type and review the Entity Groups, making any amendments as needed. For example, you may need to delete, add, or edit Entity Groups as part of your Data Mapping review.

IMPORTANT: Manage Entities does not currently allow you to edit the Matching Level at a granular level for each Entity Type. You can only specify the Matching Level at the Iteration run stage.

IMPORTANT: Ensure that you review the Manage Entities tab for each of your Data Sources, as there may be differences between the Data Sources. For example, in the Demo version, you will notice that the Entity Types for Contoso and Northwind are slightly different.

You can resolve Entities within a single Data Source, such as where a database contains multiple entries per customer. However, Entity Resolution is typically used to resolve Entities across multiple Data Sources. Therefore, if you are using more than one Data Source, you must connect your second Data Source once the Data Mapping process for your first source completes. The Data Mapping process outlined in the preceding points repeats for the second Data Source and any subsequent Data Sources.

(4) Running an Iteration

Once your Data Sources are mapped, you can resolve and create Entities by running an Iteration. To run an Iteration, complete the following steps:

1. Return to the Explorer panel on the left. Click the + next to Iterations. A New Iteration pop-up appears.
2. Fill in the required details, with the following in mind:
   - Provide a unique Iteration name that is easy to identify.
   - Select the Data Sources you want to resolve in this specific Iteration. For example, if you have five different Data Sources, you may want to run multiple Iterations using different Data Source combinations each time. You may use any number of Data Sources in an Iteration, from a minimum of one.
   - Choose the Matching Level for this Iteration.
   - Select the Destination Lakehouse. This is the Lakehouse in which Unify outputs the Entity Resolution output tables.
3. Click Run.

The Iteration first completes some pre-processing jobs. It then resolves the Entities. The whole process can take approximately 10 minutes. Most of this time is spent on the overheads of setting up the jobs, so running 1 row, 1,000 rows, or millions of rows of data takes approximately the same time. The Quantexa system has been proven at volumes of more than 60 billion records.

NOTE: For the Demo version, the test files are intentionally small to allow you to easily view the results.

Once complete, the page displays a summary of the Iteration as follows:

- The Information section on the top left contains administrative details about the Iteration. You can view further details by clicking the Job Details button.
- The Total Entities section on the top right compares the number of input records against the number of resolved Entities in bar chart format, by Entity Type.
- The Output Data section at the bottom shows the Lakehouse tables created from the input records and the resolved Entities, by Entity Type. This data is saved to the Lakehouse that you selected as your Destination Lakehouse in step 2.

(5) Viewing and using the Unify output

Once your Iteration is complete, you can view the Iteration output. These outputs are as follows:

- Entity Resolution Power BI Report
- Semantic Model of the Data Sources and the underpinning Entity Resolution tables

For further detail on the content of these outputs, see Unify: A closer look at selected key features.

To view these outputs, complete the following steps:

1. Return to the Workspace in which you set up your Unify workload. You can navigate to it using the Workspaces button on the left-hand navigation bar. A list of Fabric items, including your Workspace folder structure, is shown in the bottom panel of your Workspace.
2. Scroll down through the list to find the following items, and click to open them:
   - The Iteration's Power BI Report. The Report has the same name as your Iteration, preceded by 'Quantexa: '.
   - The Iteration's Semantic Model, which includes the Entity Resolution tables. The Semantic Model has the same name as your Iteration.
3. Once in the Semantic Model space, you can view the related Entity Resolution tables by clicking each table in the right-hand Tables panel. To view the Semantic Model itself, click Open semantic model in the top menu bar.

Support

If you run into any issues while using the Unify workload, visit the Unify Support page. You can post a question outlining your issue and request help, or view previous posts to see if they answer your question.

Next steps

For an applied example of the step-by-step guide, including ways to use your Unify output downstream, see Unify: Example workflow.

Unify: A closer look at selected key features

This page provides further detail and guidance on some key features of the Unify workload.

Overview

The features and capabilities explained on this page are those you will encounter as you use the Unify workload. For a step-by-step walkthrough on setting up and using the Unify workload, see Unify: Step-by-step guide to using the workload.

Features

The following sub-sections provide further detail on some of Unify's key features.

Data Mapping

For a definition of Data Mapping, see Unify: Core concepts.

Data Mapping is an integral part of Quantexa's Entity Resolution solution. Quantexa's Data Mapping process in the Unify workload focuses on mapping your Data Sources to pre-defined Entity Type and Entity Group fields. In the context of the Unify workload, Data Mapping seeks to answer some initial questions about your Data Source, such as the following:

- Which source fields match the Unify Entity attribute fields, and which should they be mapped to?
- For source fields that do not directly match Unify's pre-defined Data Mapping fields, what are the most suitable matches? If there are no suitable matches, why?
- Which Entity Types and Entity Groups are being populated by the source data? To what percentage are these fields being populated?

As noted in the step-by-step walkthrough, while you may edit the Data Mapping process output, the process itself runs automatically on loading a Data Source. This saves significant time and manual effort.
However, to ensure accurate Data Mapping in Unify, your data must be in a suitable format and have some logical structure for the mapping process to read it effectively.

Iterations

For a definition of Iteration, see Unify: Core concepts.

Running an Iteration serves two purposes:

- Conducting Entity Resolution on the Data Sources you include for that Iteration.
- Comparing Entity Resolution outputs across multiple Iterations that use different Data Sources, or different combinations of Data Sources. In addition to comparisons of the data content, an Iteration can help you compare data quality, Entity Resolution metrics, and field population rates between your Data Sources.

The first scenario is straightforward: thanks to Quantexa's Entity Resolution features within the Unify workload, you can use the workload to build a trusted data foundation directly. The second scenario would be more complex without the Unify workload, as it would require a significant investment of time and resources to conduct a true comparison. With the Unify workload, however, you simply run multiple Iterations using the straightforward step-by-step process.

Matching Levels

For a definition of Matching Level, see Unify: Core concepts.

The availability of Matching Levels helps you tailor Unify's Data Mapping and Entity Resolution processes to your Project's needs. As a reminder, there are three available Matching Levels within the Quantexa Unify workload: Default, Fuzzy, and Strict. The following are example use cases for the Fuzzy and Strict Matching Levels:

- Fuzzy: You can use a Fuzzy Matching Level in a scenario like matching customers to a watchlist in the Financial Crime arena. Due to the seriousness of the matter, you want to ensure you find all possible matches. Even where there is Overlinking, you are happy to manually review the matches to find the correct ones.
- Strict: You can use a Strict Matching Level in a scenario like generating a master set of customers in Master Data Management. As the output may be used to trigger automatic action, such as contacting customers, and you are unlikely to review the matches, you want to ensure that all generated matches are correct. Even where there is Underlinking, you are happy to have a smaller scope of matches given the reputational and practical consequences of any incorrect matches.

The following factors can help you decide which Matching Level to choose at the Data Mapping stage and for each Iteration:

- The quality of your Data Source.
- The completeness of your Data Source.
- Your particular use case. For example, if you are planning to use the Entity Resolution output to execute automated tasks without reviewing all matches, it may be better to use a Strict Matching Level. For cases where you want to ensure you have all possible matches, even with Overlinking, you may want to use a Fuzzy Matching Level.

If you are not sure which Matching Level to use, you can opt for the Default Matching Level, as this strikes a balance between Overlinking and Underlinking.

Automated output

After completing an Iteration, the Unify workload automatically outputs the results of the Entity Resolution process into the following:

Iteration summary

The summary shown for an Iteration after Entity Resolution is a bar chart in the top-right corner. The bar chart compares the total number of input records against the total number of resolved Entities for each Entity Type.
Power BI Report

The automatic report shows summaries of key information for Entity Types, such as Entity size, Entities by Address, and Entities by Business and Individual counts.

Entity Resolution records tables and Entities tables

Records tables show the records that triggered the resolution of a particular Entity. The workload outputs multiple such tables, each covering a specific Entity Type, such as Individual or Address.

Entities tables show the Entities that the source data has resolved to. For example, you may input two Data Source tables, and after Entity Resolution, the workload outputs multiple additional tables showing the resolved Entities. Each table covers a specific Entity Type, such as Individual or Address.

Semantic Model

An Iteration's Semantic Model shows the relationships between the tables described in the preceding point and your input Data Source tables, within an Iteration. For further information on Semantic Models in Microsoft Fabric, see Power BI Semantic Models in Microsoft Fabric.

Additionally, using the automatic outputs, you can optionally create other outputs within the broader Fabric suite, including the following:

- Other types of Power BI reports. Power BI is functionality provided by Microsoft Fabric, not by the Unify workload. Power BI reports are typically based on one Semantic Model and can feature visualizations such as charts, graphs, and tables to provide data insights. They can help you explore your data, and the output of Unify, further. For more information on Power BI reports, see Reports in Power BI.
- Notebooks.
- Power Query (M script) with Dataflow Gen2.

Next steps

For a guide to using the Unify workload, see Unify: Step-by-step guide to using the workload. For an applied example of the step-by-step guide, see Unify: Example workflow.

Unify: What the workload does

This page provides a primer on Entity Resolution and the process of the Quantexa Unify workload for Microsoft Fabric.

What is the Quantexa Unify workload?

The Quantexa Unify workload is an end-to-end, AI-driven Entity Resolution feature in the Microsoft Fabric ecosystem. The workload is built on the Entity Resolution component of Quantexa's industry-leading Decision Intelligence (DI) Platform. For a more detailed overview of the Unify workload, see Unify: How the workload can help you.

What is Entity Resolution and why is it useful?

Entity Resolution is the process of working out whether multiple records are referencing the same real-world object, such as a person, organization, address, phone number, bank account, or device. The process takes multiple disparate data points from external and internal sources and resolves them into one distinct, unique Entity.

Ultimately, Entity Resolution cleanses, distills, and unifies your data, making it more accurate and useful. This improves data analysis and real-world decision-making and, with Quantexa's best-in-class Entity Resolution capabilities, helps you unlock deeper insights and make smarter decisions with ease.

To learn more about Entity Resolution, and why it is essential for businesses working with disparate datasets, see the following:

- The definition in Unify: Core concepts.
- What is Entity Resolution and How Does It Transform Data Into Value?
- Decision Intelligence: why entity resolution is foundational to success, an article by Quantexa's Chief Technology Officer, Jamie Hutton.
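As a deliberately simplified illustration of the principle, the following Python sketch groups pairwise record matches into resolved Entities, each with a single unique ID. This is not Quantexa's algorithm; the record IDs and matches below are invented for illustration.

from collections import defaultdict

# Record IDs from two sources, and the pairs judged to refer to the same person (invented).
records = ["crm:101", "crm:205", "billing:9", "billing:17"]
matches = [("crm:101", "billing:9"), ("billing:9", "crm:205")]

# Union-find: records connected by any chain of matches resolve to a single Entity.
parent = {r: r for r in records}

def find(r):
    while parent[r] != r:
        parent[r] = parent[parent[r]]   # path compression
        r = parent[r]
    return r

def union(a, b):
    parent[find(a)] = find(b)

for a, b in matches:
    union(a, b)

entities = defaultdict(list)
for r in records:
    entities[find(r)].append(r)

# Each resolved Entity gets one unique ID covering all of its source records.
for entity_id, members in enumerate(entities.values(), start=1):
    print(f"ENTITY-{entity_id:03d}: {members}")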
To learn more about why Entity Resolution is a much more effective way of handling disparate datasets than Record-to-Record matching, see the video below.

How will Entity Resolution through the Unify workload help me?

The Unify workload brings Quantexa's industry-leading Entity Resolution capabilities to your doorstep in a secure and easy-to-use workload. Providing fast data onboarding and a no-code user interface, the workload is straightforward and quick to use, enabling you to resolve Entities end-to-end in under an hour and eliminating the need for more complex tooling and software.

What does the Unify workload process look like?

The Quantexa Unify workload supports end-to-end Entity Resolution, from ingesting, mapping, and resolving your source data, to generating reliable outputs for more effective decision-making. The process works as follows:

1. Connect your Data Source

First, you must connect your Data Source to a Project. You can do so in just a few clicks.

2. Data Mapping

On connecting your Data Source, Unify automatically processes it to identify individual fields within the data. This stage is called Data Mapping.

The Data Mapping process uses an inference engine to determine the appropriate data schema. It pulls the fields it identifies from your Data Source and creates a list of these fields. Each field is then mapped to predefined Entity Types. For example, the field DateOfBirth maps to the Individual Entity Type. You can also create new Entity Types for the Data Mapping stage, as needed.

This stage applies the necessary parsing, cleansing, and standardization. After it is complete, you can view both the input and output data using the Data Mapping features.

3. Run an Iteration

With just a few clicks, you can then run an Iteration, selecting the Data Sources you want to resolve and specifying the Matching Level you want to apply. The Iteration stage is the core Entity Resolution stage. Once you have started an Iteration, the workload automatically completes multiple steps as follows:

- The Entity Resolution process takes the parsed, cleansed, and standardized data from the Data Mapping stage and identifies connections between Entity occurrences. Unlike traditional record-matching approaches, Unify's process does not rely solely on matching unique identifiers. With Quantexa's best-in-class Entity Resolution capabilities, Unify helps you uncover connections from indirect relationships too.
- On the basis of the Entity occurrence matching, the workload outputs cleansed, unified, and organized information about each Entity. Quantexa's proprietary code ensures that the output provides a more accurate, real-world representation of each Entity than your input data or traditional record-matching approaches.
- The workload then places its output tables into a Lakehouse of your choosing. In addition, it creates a Power BI report and a Semantic Model of the Iteration's output tables.

NOTE: Although the workload creates the Semantic Model automatically, you can amend it.

4. Output and next steps

Following these steps, you have resolved Entities and can use the output information to create more detailed reports, such as a Power BI report.

What output does the workload produce?

For each Iteration, the workload produces the following outputs automatically:

- Entity Resolution records and Entities tables
- Default Semantic Model
- Unify Iteration summary
- Power BI Report

For further details on these outputs, see Unify: A closer look at selected key features.
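If you prefer to consume these outputs programmatically, for example in a Fabric Notebook, the following is a minimal PySpark sketch. The table and column names here (individual_entities, individual_records, entity_id) are illustrative placeholders, not the workload's actual output schema.

from pyspark.sql import functions as F

# Load the Iteration's output tables from the Destination Lakehouse (names are illustrative).
individual_entities = spark.read.table("individual_entities")
individual_records = spark.read.table("individual_records")   # links source records to Entities

# One row per resolved Entity, with the number of source records that resolved to it.
entity_sizes = (
    individual_records
    .groupBy("entity_id")
    .agg(F.count("*").alias("source_record_count"))
    .join(individual_entities, on="entity_id", how="left")
)

entity_sizes.orderBy(F.desc("source_record_count")).show(10)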
Additionally, using the automatic outputs, you can optionally create outputs within the broader Fabric suite, including the following:

- Other types of Power BI reports
- Notebooks

Who can use Unify?

The Unify workload is particularly useful for Data Warehouse specialists, Business Analysts, Data Engineers, and Data Scientists. However, due to the simplicity and no-code nature of Unify's user interface, anyone working with uncleansed and disparate datasets can use Unify.

Depending on the size of your dataset, you could run the Unify workload end-to-end in under an hour, giving you fast, accurate, and reliable Entity Resolution in just a few clicks. Additionally, with a Fabric subscription, you can quickly onboard and start using the Unify workload at any time.

Next steps

See the following guides for further information on core concepts in Unify and for guidance on setting up and using your workload:

- Unify: Core concepts
- Unify: Step-by-step guide to using the workload