Application of seq2seq models on code correction

Files
frai-04-590215.pdf (2.36 MB)
Published version
Date
2021
Authors
Huang, Shan
Zhou, Xiao
Chin, Sang
Version
Published version
OA Version
Citation
S. Huang, X. Zhou, S. Chin. 2021. "Application of Seq2Seq Models on Code Correction." Front Artif Intell, Volume 4, pp. 590215 - ?. https://doi.org/10.3389/frai.2021.590215
Abstract
We apply various seq2seq models to programming language correction tasks on the Juliet Test Suites for C/C++ and Java of the Software Assurance Reference Dataset and achieve repair rates of 75% (for C/C++) and 56% (for Java) on these tasks. We introduce a pyramid encoder into these seq2seq models, which significantly increases their computational and memory efficiency while achieving repair rates similar to those of their non-pyramid counterparts. We also successfully carry out an error type classification task on the ITC benchmark examples (with only 685 code instances) using transfer learning with models pretrained on the Juliet Test Suite, pointing to a novel way of processing small programming language datasets.
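The pyramid encoder mentioned in the abstract shrinks the encoded sequence at each layer, which is where the computational and memory savings come from. Below is a minimal PyTorch sketch of that idea, assuming a bidirectional GRU encoder that halves the sequence length per layer by concatenating adjacent time steps; the class names, layer sizes, and merge strategy are illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class PyramidEncoderLayer(nn.Module):
    """One bidirectional GRU layer that halves the sequence length by
    concatenating adjacent time steps before the recurrence."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # Adjacent steps are concatenated, so the GRU sees twice the feature size.
        self.gru = nn.GRU(input_size * 2, hidden_size,
                          batch_first=True, bidirectional=True)

    def forward(self, x):
        batch, seq_len, feat = x.shape
        if seq_len % 2 == 1:  # pad to an even length before pairing steps
            x = torch.cat([x, x.new_zeros(batch, 1, feat)], dim=1)
            seq_len += 1
        # Merge every pair of neighbouring steps: (B, T, F) -> (B, T/2, 2F)
        x = x.reshape(batch, seq_len // 2, feat * 2)
        out, _ = self.gru(x)
        return out  # (B, T/2, 2 * hidden_size)

class PyramidEncoder(nn.Module):
    """Stack of pyramid layers: each layer halves the token sequence,
    so deeper layers (and any attention over them) cost less."""
    def __init__(self, vocab_size, embed_size=128, hidden_size=256, num_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        layers, in_size = [], embed_size
        for _ in range(num_layers):
            layers.append(PyramidEncoderLayer(in_size, hidden_size))
            in_size = hidden_size * 2  # bidirectional output width
        self.layers = nn.ModuleList(layers)

    def forward(self, tokens):
        x = self.embed(tokens)  # tokens: (B, T) integer token ids
        for layer in self.layers:
            x = layer(x)
        return x

# Example: 40 input tokens are compressed to 5 encoder states (40 -> 20 -> 10 -> 5).
# enc = PyramidEncoder(vocab_size=1000)
# out = enc(torch.randint(0, 1000, (2, 40)))   # out.shape == (2, 5, 512)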
Description
License
Copyright © 2021 Huang, Zhou and Chin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.