New cutting-edge IT Infrastructure
A future-proof IT infrastructure is increasingly becoming a decisive competitive factor, not only for companies but especially for research. In recent months, BIFOLD has invested around 1.8 million euros in new research hardware, significantly increasing the institute’s computing capacity. This cutting-edge IT infrastructure was financed by the German Federal Ministry of Education and Research (BMBF). “If we want to continue to conduct world-class research, investments in infrastructure are an important prerequisite,” says BIFOLD Co-Director Prof. Dr. Klaus-Robert Müller.
Current experiments in Machine Learning and Big Data management systems require hardware with very strong computing, storage, and data transfer capabilities. The new systems include a specialized computer unit (node) and a full computer cluster, both designed for the simultaneous processing of a very large number of parallel workloads (massively parallel processing) and equipped with large main memory capacities, as well as one cluster particularly suited to the fast processing of sequential workloads. The central processing units (CPUs) in the latter cluster also support the so-called Intel Software Guard Extensions (SGX) technology, which enables developers to execute code and process data in a secure environment. The servers run high-performance file systems and allow very large data volumes to be transferred with low latency. “We expect that this cutting-edge hardware will not only enrich our own research, but also enable us to establish new collaborations with our partners,” adds BIFOLD Co-Director Prof. Dr. Volker Markl.
In the group of Volker Markl, two projects in particular benefit from the new possibilities: AGORA and NebulaStream. AGORA is a novel form of data management system. It aims to construct an innovative unified ecosystem that brings together data, algorithms, models, and computational resources and makes them available to a broad audience. The goal is the easy creation and composition of data science pipelines as well as their scalable execution. In contrast to existing data management systems, AGORA operates in a heavily decentralized and dynamic environment.
The NebulaStream platform is a general-purpose, end-to-end data management system for the Internet of Things (IoT). It provides an out-of-the-box experience with rich data processing functionality and high ease of use. With the new IT infrastructure, both systems can be validated at a much larger scale and in a secure data processing environment.
High memory and parallel processing capabilities are also essential for large-scale Machine Learning simulations, e.g., solving high-dimensional linear problems or training deep neural networks. Klaus-Robert Müller and his group will initially use the new hardware in three projects. First, it allows BIFOLD researchers to compute closed-form solutions of large dense linear systems, which are needed to describe correlations between large numbers of interacting particles in a molecule with high numerical precision. Second, researchers can significantly extend the number of epigenetic profile regions that can be analyzed, thereby using considerably more of the information available in the data. Third, it will enable scientists to develop explainable AI techniques that incorporate internal explainability and feedback structures and are significantly more complex to train than typical deep neural networks.