

Foundations and Methods

BIFOLD research groups conduct fundamental research on a wide range of topics concerning the foundations and methods of Artificial Intelligence (AI). This includes the management and processing of distributed and Big Data. As Machine Learning (ML) is one of the main fields of modern AI and of the new wave of AI applications, we also focus on a variety of Machine Learning methods such as reinforcement learning and Bayesian Machine Learning as well as unsupervised and recurrent Deep Learning.

Research Topics
  • Database Systems and Information Management
  • Intelligent Data Analysis and Information Management
  • Deep Learning
  • Unsupervised Deep Learning
  • Recurrent Deep Learning Models
  • Reinforcement Learning
  • Inference and Bayesian Machine Learning
  • Distributed Data Processing
  • Big Data Processing

Management of Data Science Processes and Systems

The management of data science processes is fundamental to the development and application of Artificial Intelligence. BIFOLD research aims to drastically improve the efficiency of data preparation and management processes. In addition, one research focus lies on the development of security and visualization tools for Big Data management as well as solutions for managing discretizations of distributed graph data streams, in particular managing the state of a graph that evolves over time.

Research Topics
  • Information Integration and Data Quality
  • Data Management for the Machine Learning Lifecycle
  • Information Visualization and Visual Analytics
  • Big Data Security
  • Graph Data Management

Architectures and Technologies

BIFOLD research groups develop Big Data infrastructures and tools for programming and data extraction. We investigate architectures and algorithms for the scalable processing of Big Data as well as advanced Machine Learning. Furthermore, we develop methods and tools for the software-engineering paradigm of Software 2.0 / Neural Network Programming and other data programming languages. Our researchers will also explore new methods for mining data from massive text and other media collections.

Research Topics
  • Big Data Architectures
  • Big Data Engineering and Benchmarking
  • Engineering Software 2.0: Neural Network Programming
  • Data Programming Languages
  • Knowledge Discovery from Massive Text Data Collections
  • Knowledge Discovery from Massive Image, Audio, and Video Collections

Responsible AI

Security and transparency of Machine Learning processes are cornerstones of Responsible AI. BIFOLD will therefore conduct research on Explainable AI as well as secure machine learning to counter threats such as data poisoning, adversarial examples or model extraction. Our research groups investigate how Big Data analysis can be conducted in a privacy-preserving way. Additionally, we aim to improve bias detection in training data as well as the ethical and legal frameworks that guide the responsible handling of data and algorithms, while ensuring the transparency, fairness, and reproducibility of algorithms.

Research Topics
  • Explainable AI
  • Secure Machine Learning
  • Technological Enablers for Informational Self-Determination
  • Technology-Aware Data Ethics and Law

Systems and Tools for Novel Applications

BIFOLD will contribute to the development of novel AI applications by conducting practice-oriented research on tools and infrastructures. Data is an integral part of a digitized society, fueling the algorithms of Machine Learning and Artificial Intelligence. Data infrastructures provide the technical foundation for offering broad access to data and processing capabilities. We will therefore investigate technical, economic and legal aspects of information marketplaces. BIFOLD research groups also conduct interdisciplinary research on the theory and application of coupling machine learning and simulation methods to identify potential for novel applications.

Research Topics
  • Information Marketplaces
  • Simulation and Machine Learning
  • Big Data and Machine Learning foundations for Medicine
  • Big Data and Machine Learning foundations for the Natural Sciences

Main Application Areas


Medicine

Medicine presents a spectrum of interdisciplinary AI challenges in medical research, ranging from basic scientific questions to complex gene regulation mechanisms and networks. BIFOLD will focus on the integration of microscopic-histological image data and proteogenomic “omics” data in translational cancer research, as well as radiological and proteomic data in cardiology. Another research focus is the integration of heterogeneous, distributed and highly noisy real-time clinical data from intensive care medicine. Data privacy in the analysis of geographically distributed medical data from different data centers is also a scientific challenge.

Digital Humanities

A central question in the Digital Humanities is how to efficiently use complex a priori knowledge to develop powerful interactive methods for dealing with highly structured, heterogeneous data. Typically, patterns and pattern groups are to be recognized and statistically analyzed from characteristics of historical sources such as layout, images in texts, text-image constellations, word combinations or word clusters. On the one hand, this would allow the exploration of models and the simulation of their consequences; on the other hand, it would advance the generation of heuristics that emulate scientific analysis methods. Both significantly increase the predictive power of the models. Furthermore, methods for interpreting the models might enable the automatic formulation of hypotheses. Fundamental AI problems such as a priori knowledge and graph structures are to be investigated in the Digital Humanities.


Communication

In the application area of communication, AI methods for coordination, compression and caching in ultra-dense meshed networks are to be explored. In this way, reliability as well as spectral and energy efficiency are to be improved and latency reduced. In many cases, only limited and decentralized or online data is available. Therefore, novel methods will need to be developed and applied to enable distributed learning from minimal data sets while the system is running. The practical boundary conditions of communication applications and the statistical processes to be considered require novel algorithms to overcome these problems. Scalable, real-time language technology for user interfaces can enable the analysis of multimodal, extremely heterogeneous data for the natural language understanding of dialogue-oriented assistants.