"Snapcase" allows users to regain control over their recommendations in online shopping platforms
European citizens have the "right-to-be-forgotten" as part of the General Data Protection Regulation (GDPR). This right empowers them not only to request the deletion of their personal data from data stores, but also to have their data "unlearned" from machine learning (ML) models. In practice, however, the resulting data deletion and "unlearning" processes often take weeks or months, which can have devastating consequences in some cases. Imagine a person struggling with alcohol addiction who decides to stop consuming alcoholic products. This person will still be exposed to recommendations for alcoholic products on e-commerce platforms, since the underlying ML models have learned their preference for alcohol.
The DEEM Lab at BIFOLD is conducting research on "low-latency machine unlearning", where the goal is to design systems that can forget and "unlearn" personal user data within seconds. They developed "Snapcase", a recommender system for online shopping, which can unlearn user interactions with sub-second latency. Snapcase combines techniques from database management and machine learning by treating ML models as "materialised views" over training data, which are efficiently updated with carefully chosen operations and data structures.
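To illustrate the "materialised view" idea, here is a minimal Python sketch. It is not the Snapcase implementation: it assumes a deliberately simple item-based recommender whose entire model is a table of co-purchase counts. The class name CooccurrenceRecommender, its methods, and the toy purchases are all hypothetical. Because the model is a plain aggregate over the training data, a single purchase can be removed by decrementing only the counts it contributed to, and the result is identical to retraining without that purchase.

```python
from collections import Counter, defaultdict


class CooccurrenceRecommender:
    """Toy recommender (hypothetical): the model is a co-purchase count table."""

    def __init__(self):
        self.baskets = defaultdict(set)   # user -> items the user purchased
        self.cooc = defaultdict(Counter)  # item -> counts of co-purchased items

    def _apply(self, user, item, delta):
        # Adjust the counts between `item` and every other item of this user.
        for other in self.baskets[user]:
            if other != item:
                self.cooc[item][other] += delta
                self.cooc[other][item] += delta

    def add_purchase(self, user, item):
        if item not in self.baskets[user]:
            self._apply(user, item, +1)
            self.baskets[user].add(item)

    def forget_purchase(self, user, item):
        # "Unlearning": undo exactly the updates this purchase caused.
        if item in self.baskets[user]:
            self.baskets[user].discard(item)
            self._apply(user, item, -1)

    def recommend(self, user, k=3):
        # Score candidates by how often they co-occur with the user's items.
        scores = Counter()
        for item in self.baskets[user]:
            scores.update(self.cooc[item])
        candidates = [(i, s) for i, s in scores.items()
                      if s > 0 and i not in self.baskets[user]]
        candidates.sort(key=lambda pair: (-pair[1], pair[0]))
        return [i for i, _ in candidates[:k]]

    def snapshot(self):
        # Model state with zero counts dropped, for comparing two models.
        return {(a, b): c for a, row in self.cooc.items()
                for b, c in row.items() if c > 0}


def build(purchases):
    model = CooccurrenceRecommender()
    for user, item in purchases:
        model.add_purchase(user, item)
    return model


# Made-up purchases, for illustration only.
purchases = [("ann", "beer"), ("ann", "chips"),
             ("bob", "beer"), ("bob", "chips"),
             ("cat", "chips"), ("cat", "apples"),
             ("dan", "chips")]

model = build(purchases)
before = model.recommend("dan", k=2)
model.forget_purchase("ann", "beer")
after = model.recommend("dan", k=2)
print(before, after)
```

In this toy setting, forgetting a purchase touches only the counts linked to that user's other items, so the cost depends on the size of one basket rather than on the whole dataset. Ann's deletion also changes what Dan is recommended, which mirrors the point that one person's past purchases influence the recommendations of others.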
In a recent demonstration paper accepted at the "International Conference on Very Large Data Bases (VLDB)", the researchers showcase the "Snapcase" system on a large grocery shopping dataset with 33 million purchases, 200 thousand users and 50 thousand unique products. They demonstrate how users can ask the system to "unlearn" past purchases of addictive or non-sustainable products in less than a second. Users thereby regain control over the recommendations for their next shopping basket. They can adjust these recommendations (for example, to get rid of alcoholic or otherwise unhealthy products) and at the same time reduce the negative impact that their past purchase behavior may have had on the recommendations for other people.
Paper: Snapcase – Regain Control over Your Predictions with Low-Latency Machine Unlearning
Authors: Sebastian Schelter, Stefan Grafberger, Maarten de Rijke
Link: https://www.vldb.org/pvldb/vol17/p4273-schelter.pdf
More information about the BIFOLD research group Management of Data Science Processes, supervised by Prof. Sebastian Schelter.