Identifying Semantic Components for PBE-based Transformation Discovery

Dakai Men
Binger Chen
Ziawasch Abedjan

2025

Abstract: Complex data transformations involve a combination of syntactic and semantic operations. Recent LLM-based programming-by-example (PBE) approaches help find sequences of syntactic and semantic operations that satisfy given transformation examples. Because testing LLM outputs is expensive, such approaches defer the prompting step until all syntactic operations have been identified. During this process, however, sequences of tokens that need semantic look-ups are split and their order is lost, which harms the overall transformation accuracy. We address this problem by focusing on challenging transformation tasks and proposing a pre-processing step that prevents destructive splits of such sequences.
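
To make the splitting problem concrete, the following minimal sketch contrasts a purely syntactic split with one that protects multi-token sequences. It is an illustration only and not the method proposed in the paper: the phrase list `SEMANTIC_PHRASES` and the functions `naive_split` and `protected_split` are hypothetical names, and the assumption that semantic sequences are known in advance as a fixed list is a simplification made for the example.

```python
import re

# Hypothetical list of multi-token sequences that need a semantic look-up.
# Assumption for illustration: these are known before splitting.
SEMANTIC_PHRASES = ["New York", "United Kingdom"]


def naive_split(text):
    """Purely syntactic split on commas and whitespace."""
    return [token for token in re.split(r"[,\s]+", text) if token]


def protected_split(text, phrases=SEMANTIC_PHRASES):
    """Split on commas and whitespace, keeping each known phrase as one token."""
    if not phrases:
        return naive_split(text)
    # Try longer phrases first so they win over shorter overlapping ones.
    ordered = sorted(phrases, key=len, reverse=True)
    alternatives = "|".join(re.escape(p) for p in ordered)
    # Either a whole protected phrase, or a run of non-delimiter characters.
    pattern = re.compile(rf"\b(?:{alternatives})\b|[^,\s]+")
    return [match.group(0) for match in pattern.finditer(text)]


print(naive_split("JFK, New York"))      # ['JFK', 'New', 'York']
print(protected_split("JFK, New York"))  # ['JFK', 'New York']
```

In the naive split, "New" and "York" become independent tokens, so a later semantic look-up no longer sees the sequence it needs. The protected variant keeps the sequence intact, which is the kind of effect a pre-processing step against destructive splits aims for.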