The DEEM Lab is looking for a research associate to conduct research in responsible data engineering. The research will focus on data preparation and data pipelines for complex machine learning (ML) systems. Such ML systems are increasingly used to automate impactful decisions but suffer from many unsolved data management challenges with respect to their correctness, reliability, and compliance with legal regulations.
The goal of the research will be to design and efficiently implement data-centric methods that enable ML systems to guarantee users control over their personal data (e.g., with respect to the "right to be forgotten" under the GDPR) and to comply with legal regulations such as the upcoming European AI Act.
This will be achieved via novel declarative methods to create, maintain and assess datasets for ML use cases. These methods will assist non-expert users with data-centric tasks, such as evaluating the robustness of their ML pipelines to data errors, and will potentially leverage the code generation capabilities of large language models. The resulting methods will be accompanied by efficient and scalable implementations and made publicly available as open source libraries. The position also includes teaching tasks.
Requirements
- Successfully completed university degree (Master, Diplom or equivalent) in Computer Science or Artificial Intelligence
- Strong programming skills in Python and at least one additional language (Java/Rust/C++)
- Knowledge of data processing with dataflow systems, relational databases and/or dataframe libraries (e.g., Apache Spark, DuckDB, pandas)
- Experience with increasing the efficiency, scalability and correctness of data-centric programs
- Basic knowledge of machine learning and common libraries (e.g., pandas, sklearn, pytorch, SparkML)
- The ability to teach in German and/or English is a prerequisite; candidates are expected to be willing to acquire whichever of the two languages they lack
Salary grade: TV-L 13, Berliner Hochschulen
Starting date: May 1, 2025 (limited to 5 years)
Closing date: February 14, 2025
Full job posting: IV-22/25