Proving and Disproving Equivalence of Functional Programming Assignments
We present an automated approach to verify the correctness of programming assignments, such as the ones that arise in a functional programming course. Our approach takes as input student submissions and reference solutions, and uses equivalence checking to automatically prove or disprove correctness of each submission. To be effective in the context of a real-world programming course, an automated grading system must be both robust, to support programs written in a variety of style, and scalable, to treat hundreds of submissions at once. We achieve robustness by handling recursion using functional induction and by handling auxiliary functions using function call matching. We achieve scalability using a clustering algorithm that leverages the transitivity of equivalence to discover intermediate reference solutions among student submissions. We implement our approach on top of the Stainless verification system, to support equivalence checking of Scala programs. We evaluate our system and its components on over 4000 programs drawn from a functional programming course and from the program equivalence checking literature; this is the largest such evaluation to date. We show that our system is capable of proving program correctness by generating inductive equivalence proofs, and providing counterexamples for incorrect programs, with a high success rate.
main.pdf
n/a
openaccess
CC BY-NC-ND
232.18 KB
Adobe PDF
5bc9e7594595973d8cfec65399c930ca
3591258.pdf
publisher
openaccess
CC BY
317.49 KB
Adobe PDF
24df31b72348e75c442444cd8065e456