Data Science
-
Predicting Risk through Network Shape
Encoding network shape has been a major factor in the success of our recent machine learning model for detecting shell companies. Central to the model is the ability to evaluate the context of a business – given by the network of nodes surrounding it – in a form that can be fed into explainable ML models. Early on in our…
-
Automatic Data Cleaning Through Data Normalisation and Statistics
This piece is about how to clean up bad data to improve Entity Resolution (ER). When performing ER, we want the Entities we produce to be as accurate as possible. Bad Entities give an inaccurate view of the customer and make it much harder to spot risks. In our ER, data is commonly brought in from a…