Towards Data Augmentation for Supervised Code Translation

Binger Chen
Jacek Golebiowski
Ziawasch Abedjan

May 23, 2024

Supervised learning is a robust strategy for data-driven program translation. This work addresses the challenge of insufficient parallel training data in code translation by exploring two innovative data augmentation methods: a rule-based approach specifically designed for code translation datasets and a retrieval-based method leveraging unorganized code repositories.