Data Engineering

The Data Engineering team designs and optimizes the flow and integration of data, enabling scientists to turn big data into smart data that is contextualized and actionable.

The Data Engineering team is responsible for the overall infrastructure and architecture of large-scale systems, data governance, data security and the deployment and optimization of machine learning models. Our data engineers build reliable and efficient pipelines that transform and transport data into formats that data scientists can use for analysis. The pipelines take structured and unstructured data from disparate sources and collect it into a data lake and knowledge graph with comprehensive metadata.

The team architects data stores, tunes databases, transforms and integrates experimental data into a centralized data warehouse, develops easy-to-use data dashboards and collaborates with data science teams to deploy and optimize machine learning models. Additional responsibilities include capacity planning and proactive monitoring of systems, data life cycle management (from initiation to archiving and deletion), database administration, data cataloging and data quality audits.

An important goal of the team is the continued improvement and promotion of the FAIR data principles, a set of guiding principles to make data findable, accessible, interoperable and reusable. To make data FAIR, the team has unified data management, eliminated data silos, implemented a center-wide research data management policy, standardized data formats and introduced ongoing data stewardship. Our goal is to make the FAIR data principles an integral part of our culture and to become a leader in FAIR data management.

Contact

Evelyn Travnik
CIO
DTU Biosustain
+45 93 51 89 48