Four BIFOLD research groups actively contribute to the program of the 50th International Conference on Very Large Data Bases (VLDB 2024), held in Guangzhou, China, from August 26 to 30, 2024. Their presentations cover a range of topics aligned with the conference's themes, including analytical database queries, data management for machine learning, and data cleaning systems.
The participating groups:
- Database Systems and Information Management (DIMA), headed by BIFOLD co-Director Prof. Dr. Volker Markl
- Big Data Engineering (DAMS), led by Prof. Dr. Matthias Böhm
- Data Engineering for Machine Learning (DEEM), under the leadership of Prof. Dr. Sebastian Schelter
- Data Integration and Data Preparation (D2IP), directed by Prof. Dr. Ziawasch Abedjan
Special honors
Prof. Dr. Matthias Böhm received the Superstar AE Award for his exceptional service as an Associate Editor (AE).
Prof. Dr. Ziawasch Abedjan was honored with the Distinguished Reviewers Award for his commitment to delivering high-quality, timely reviews.
To celebrate 50 years of VLDB, computer scientist Prof. Chandrasekaran Mohan of Hong Kong Baptist University hosted a session titled "Half a Century of VLDB: A Celebration and Memories." Volker Markl and Divesh Srivastava, former and current presidents of the VLDB Endowment's Board of Trustees, also participated in the session.
During the conference, the BIFOLD team presents research in multiple formats:
DIMA
- Efficient Placement of Decomposable Aggregation Functions for Stream Processing over Large Geo-Distributed Topologies
- Authors: Xenofon Chatziliadis (TU Berlin & BIFOLD), Eleni Tzirita Zacharatou (IT University of Copenhagen), Alphan Eracar (TU Berlin), Steffen Zeuch (TU Berlin & BIFOLD), Volker Markl (TU Berlin & BIFOLD)
- DOI: https://doi.org/10.14778/3648160.3648186
- Fainder: A Fast and Accurate Index for Distribution-Aware Dataset Search
- Authors: Lennart Behme (TU Berlin & BIFOLD), Sainyam Galhotra (Cornell University), Kaustubh Beedkar (IIT Delhi), Volker Markl (TU Berlin & BIFOLD)
- Format: Poster, Proceedings
- Link: https://lbeh.me/pdf/Fainder.pdf
- Looking Deeply into the Magic Mirror: An Interactive Analysis of Database Index Selection Approaches
- Authors: Stefan Halfpap (TU Berlin & BIFOLD), Jan Kossmann (Snowflake), Rainer Schlosser (Hasso Plattner Institute), Volker Markl (TU Berlin & BIFOLD)
- Format: Demo Paper
- Missing Value Imputation for Multi-attribute Sensor Data Streams via Message Propagation
- Authors: Xiao Li (Roskilde University), Huan Li (Zhejiang University), Hua Lu (Roskilde University), Christian S. Jensen (Aalborg University), Varun Pandey (TU Berlin & BIFOLD), Volker Markl (TU Berlin & BIFOLD)
- DOI: https://doi.org/10.14778/3632093.3632100
- Assisted design of data science pipelines
- Authors: Sergey Redyuk (TU Berlin & BIFOLD), Zoi Kaoudi (IT University of Copenhagen), Sebastian Schelter (TU Berlin & BIFOLD), Volker Markl (TU Berlin & BIFOLD)
- Format: Poster
- DOI: https://doi.org/10.1007/s00778-024-00835-2
DAMS
- POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least Resistance
- Authors: David Justen (TU Berlin & BIFOLD), Daniel Ritter (SAP), Campbell Fraser (Google), Andrew Lamb (InfluxData), Nga Tran (InfluxData), Allison Lee (Snowflake), Thomas Bodner (HPI), Mhd Yamen Haddad (INRIA), Steffen Zeuch (TU Berlin & BIFOLD), Volker Markl (TU Berlin & BIFOLD), Matthias Boehm (TU Berlin & BIFOLD)
- Presenter: David Justen
- Link: https://mboehm7.github.io/resources/pvldb2024a.pdf (PVLDB 2024 17(6))
DEEM
- How Data Management Research Helps to Improve Real World ML Applications
- Format: Keynote by Sebastian Schelter (BIFOLD & TU Berlin) at the "Quality in Databases" workshop
- Link: https://hpi.de/naumann/projects/conferences-and-workshops-hosted/qdb-2024.html
- A Flexible Forecasting Stack
- Authors: Tim Januschowski (Zalando), Yuyang Wang (Amazon), Jan Gasthaus (Meta), Syama Sundar Rangapuram (Amazon), Caner Turkmen (Amazon), Jasper Zschiegner (Unaffiliated), Lorenzo Stella (Amazon Research), Michael Bohlke-Schneider (Amazon Research), Danielle Maddix (Amazon Research), Konstantinos Benidis (Amazon Research), Alexander Alexandrov (Unaffiliated), Christos Faloutsos (CMU), Sebastian Schelter (BIFOLD & TU Berlin)
- Format: Industry paper
- Link: https://www.amazon.science/publications/a-flexible-forecasting-stack
- Snapcase - Regain Control over Your Predictions with Low-Latency Machine Unlearning
- Authors: Sebastian Schelter (BIFOLD & TU Berlin), Stefan Grafberger (BIFOLD & TU Berlin), Maarten de Rijke (University of Amsterdam)
- Format: Demo Paper
- Link: https://deem.berlin/pdf/p1939-schelter.pdf
- Instrumentation and Analysis of Native ML Pipelines via Logical Query Plans
- Presenter: Stefan Grafberger (BIFOLD & TU Berlin)
- Format: Paper at the PhD workshop
- DOI: https://doi.org/10.48550/arXiv.2407.07560
D2IP
- VLDB PhD Workshop: Automating Pipeline Extraction and Data Lineage
- Authors: Sebastian Eggers (BIFOLD & TU Berlin), Ziawasch Abedjan (BIFOLD & TU Berlin)
- Presenter: Sebastian Eggers
- Link: https://vldb.org/2024/files/phd-workshop-papers/vldb_phd_workshop_paper_id_11.pdf
- Quality in Databases workshop paper: Accelerating the Data Cleaning Systems Raha and Baran through Task and Data Parallelism
- Authors: Fatemeh Ahmadi (TU Berlin & BIFOLD), Yusuf Mandirali (Leibniz University Hannover), Ziawasch Abedjan (TU Berlin & BIFOLD)
- VLDB Journal Poster: AutoML in heavily constrained applications
- Authors: Felix Neutatz, Marius Lindauer, Ziawasch Abedjan
- Presenter: Felix Neutatz
- DOI: https://doi.org/10.48550/arXiv.2306.16913