Meet Adrien…
I have been a Data Engineer for one year at Quantexa. I have mostly worked on ETL pipelines with a focus on Entity Resolution. In such projects, the goal is to provide a complete view of entities by bringing together information from various data sources.
What did you want to be when you were little?
I used to be really into paper folding. I have always appreciated the purity of the rules: start with a square paper and shape it into something meaningful. I enjoyed the geometry that lies in the process, and the creativity of the artists who embraced this art.
What first sparked your interest in pursuing a career as a Data Engineer?
During university, I worked on projects and internships where the ability to bring quality data together was the backbone of the project's success. Having the right data together in the first place can make a huge difference to the outcome.
What educational background or qualifications did you need to become a Data Engineer?
My background is in general engineering with a focus on mathematics and machine learning. My academic experience in programming helped me get through the interview process and join Quantexa, but I believe I only started to become a Data Engineer after a few months of working for Quantexa.
Read more about Adrien's first few months at Quantexa and his time in the Academy in his 'A Day in the Life of… an Academy Trainee'.
How do you continue to grow and develop professionally?
I have gone through several steps of learning:
- The first two months in the Quantexa Data Engineer Academy gave me great exposure to a fully-fledged ETL.
- The second step was to apply these skills to a real project. The code must be robust and well-tested, and the results thoroughly validated. Cloud and CI/CD considerations come into play as well. For this, I have been lucky to work with and learn from an amazing team.
- Finally, I had the opportunity to make in-depth contributions to a specific but key problem in Entity Resolution: Parsing. Parsing is when you automate the extraction and labeling of specific information out of the raw data. To me, this is a particularly exciting area, which led me to revisit some of my old classes in statistics and natural language processing, on top of reading more recent research papers.
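To make the idea of parsing concrete, here is a minimal, rule-based sketch in Python. It is only an illustration and not how Quantexa's parsers work: the function name `parse_record`, the two regular expressions, and the sample record are all assumptions made up for this example. It pulls an email address and a phone number out of a raw free-text string, labels them, and treats whatever remains as the name.

```python
import re

# Hypothetical patterns, chosen for illustration only; real-world data
# usually needs far more careful rules or a statistical model.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{6,}\d")


def parse_record(raw: str) -> dict:
    """Extract and label an email, a phone number, and the leftover text."""
    email = EMAIL_RE.search(raw)
    phone = PHONE_RE.search(raw)

    # Remove the fields we managed to label from the raw string.
    remainder = raw
    for match in (email, phone):
        if match:
            remainder = remainder.replace(match.group(), "")

    # Whatever is left after removing labeled fields is treated as the name.
    name = " ".join(remainder.replace(",", " ").split())

    return {
        "name": name or None,
        "email": email.group().lower() if email else None,
        # Normalize the phone number by dropping spaces and hyphens.
        "phone": re.sub(r"[\s-]", "", phone.group()) if phone else None,
    }


print(parse_record("John Smith, JOHN.SMITH@example.com, +44 20 7946 0958"))
```

Hand-written rules like these break down quickly on messy data, which is one reason statistical and natural language processing techniques are relevant to the problem.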
Can you describe your career journey that led you to your current role?
At school, I played with data, but it was more to do with analysing it, running statistics, or training a model on it. I had a short internship that looked like Data Engineering but was essentially bringing together some Excel fields through a Visual Basic for Applications (VBA) macro to automate processes. In another internship, I had to record sensor signals to learn cat activity patterns. This wasn’t as large-scale as Quantexa, but it did give me some insights into some of the problems associated with Data Engineering.
What were some of the biggest challenges on your path to becoming a Data Engineer?
On each new project, there’s always a lot to learn, which can be daunting at first. Quantexa deploys a variety of use cases, so it’s necessary to learn the ins and outs of each one. This could mean learning a new language (I learned Scala from scratch after starting with Quantexa) or learning a new cloud platform such as GCP and its Dataproc service. It’s exciting to be continuously learning, though: as Quantexa expands its capabilities, I get to expand mine.
What key skills are essential for succeeding in your role?
I believe curiosity, and being mindful of the higher-level design choices on top of the implementation technicalities, are key to growing as an engineer.
What do you find most rewarding about your job?
Beyond the routine debugging, there is some grace in understanding how the data flows at each step of the ETL pipeline. Much of Data Engineering is about modeling the data correctly. There’s a satisfaction in having an elegant design, which expresses with simplicity something that once seemed untouchable.
What one piece of advice would you give to someone aspiring to become a Data Engineer?
There are plenty of pathways to becoming a Data Engineer. The technology and platforms are always evolving and you need to evolve with them. To tackle this, create a solid foundation for yourself; many programs and platforms share some core principles, so you don't have to re-learn everything. Some examples of tools that share core principles include:
- Python and Scala
- Azure and GCP
- Spark and Pandas
Then there is also hard work and resilience. There will be times when the workflows do not run, or when the code will not compile. Treating these moments as opportunities to learn and expand your understanding will do wonders.