Grading Complex Interactive Coding Programs with Reinforcement Learning

Introduction:

In recent years, AI algorithms have made remarkable progress at playing complex games. Meanwhile, students in online coding courses build their own games as programming assignments, and grading these interactive submissions still requires manual evaluation, which limits how quickly and widely feedback can reach learners. In our NeurIPS 2021 paper, we introduce the Play to Grade Challenge: instead of reading the source code, a grading algorithm plays the student's game. We represent each program submission as a Markov Decision Process (MDP) and compare it to a reference MDP built from the teacher's solution to determine the grade. This approach sidesteps the need to understand the underlying code and makes feedback scalable for online coding education.

Making Coding Assignments Playable for Grading: A NeurIPS 2021 Paper

A recent NeurIPS 2021 paper introduces a new approach to grading interactive coding assignments: treat grading itself as a game-playing challenge. AI algorithms have already been trained to compete in complex games such as Atari, Go, DotA, and StarCraft II; by applying the same machinery to the games students build for their assignments, the researchers aim to automate the grading process and provide scalable feedback for online coding education platforms.

The Challenges of Grading Coding Assignments

Online coding education has seen significant success in recent years, with platforms like Code.org reaching over 60 million learners worldwide. Grading coding assignments, however, remains a major challenge for these platforms. Simple multiple-choice questions can be checked automatically, and manual grading is feasible in small classes, but more complex assignments such as games or interactive apps still require human teachers to provide feedback and grades.

This reliance on manual grading limits the scalability of online coding education: students without access to additional teacher resources struggle to receive timely, constructive feedback. The researchers therefore focus on game-development assignments, specifically those offered by Code.org, in which students write JavaScript programs in an interactive coding interface and implement the physical rules of a game world to produce a playable game.

The Complexity of Grading Code

Grading code assignments, particularly introductory level computer science assignments, is a challenging task. Two code solutions that appear similar in text can have drastically different behaviors, while two solutions written differently can produce the same behaviors. Moreover, coding submissions may be written in various programming languages, increasing the complexity of grading.
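As an illustration (a made-up pair, not taken from actual submissions), consider two near-identical update rules for a ball-and-paddle game. A text-based comparison would call them almost equal, yet they produce very different games:

```python
# Two near-identical bounce checks with very different behavior: the first
# reflects the ball when it reaches the paddle, while the second inverts the
# comparison and reflects at the wrong times, letting the ball fall through.

def bounce_correct(ball_y, ball_vy, paddle_y):
    if ball_y >= paddle_y:   # ball has reached the paddle: reflect it
        ball_vy = -ball_vy
    return ball_vy

def bounce_buggy(ball_y, ball_vy, paddle_y):
    if ball_y <= paddle_y:   # inverted check: reflects far from the paddle
        ball_vy = -ball_vy
    return ball_vy
```

Playing the two games exposes the difference immediately, even though the source code differs by a single character.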

Developing a program capable of understanding multiple programming languages and accurately grading different assignments is a formidable task. Additionally, the program must generalize to new assignments without relying on massive labeled datasets for training. To overcome these challenges, the researchers propose a novel method that grades assignments by playing them, without examining the source code.

The Play to Grade Challenge

The researchers’ solution is to grade an assignment by having an AI agent play the game created by the student’s program. The game is represented as a Markov Decision Process (MDP), which is specified by a state space, an action space, a reward function, and transition dynamics. Simply running the student’s program instantiates this MDP, with no need to analyze the underlying code.
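To make this concrete, here is a minimal sketch of how a submission could be wrapped as an MDP with a reset/step interface. The `load_student_program` loader and the game's runtime methods are hypothetical stand-ins for whatever harness actually executes the submitted game; the paper does not prescribe this exact API.

```python
# A hedged sketch of exposing a running student program as an MDP.
# `load_student_program` and the game's runtime methods are hypothetical
# placeholders for the harness that actually executes the submission.

import numpy as np

class StudentProgramMDP:
    """Lets a grading agent interact with a student's game like any
    reinforcement-learning environment, without reading its source code."""

    ACTIONS = ["left", "right", "up", "down", "noop"]  # shared action space

    def __init__(self, submission_path):
        self.game = load_student_program(submission_path)  # hypothetical loader

    def reset(self):
        self.game.restart()
        return self._observe()

    def step(self, action_id):
        self.game.send_input(self.ACTIONS[action_id])
        self.game.tick()                   # advance the game by one frame
        obs = self._observe()
        reward = self.game.score_delta()   # reward derived from in-game score
        done = self.game.is_over()
        return obs, reward, done

    def _observe(self):
        # State = features of the game objects (ball position, paddle
        # position, velocities, score), shared across submissions because
        # every student implements the same assignment.
        return np.array(self.game.object_features(), dtype=np.float32)
```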

Since all student programs are written for the same assignment, commonalities arise in the generated MDPs, including shared state and action spaces. After playing the game and constructing the MDP, a comparison is made between the student’s MDP and the teacher’s solution (reference MDP). The goal is to determine if the two MDPs are identical.
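A naive way to test whether two MDPs are identical (a simplistic baseline sketch under the interface above, not the paper's method) is to replay one fixed action sequence in both and check whether the observed states ever diverge:

```python
# Naive baseline: replay a fixed action sequence in the student MDP and the
# reference MDP and flag the submission if their states ever diverge. This
# assumes deterministic dynamics, and it misses any bug the fixed actions
# never trigger -- which is why the paper instead trains an agent that
# actively seeks out bug states.

import numpy as np

def trajectories_match(student_mdp, reference_mdp, actions, tol=1e-3):
    s_obs = student_mdp.reset()
    r_obs = reference_mdp.reset()
    for a in actions:
        if not np.allclose(s_obs, r_obs, atol=tol):
            return False  # states diverged: behaviors differ
        s_obs, _, s_done = student_mdp.step(a)
        r_obs, _, r_done = reference_mdp.step(a)
        if s_done != r_done:
            return False  # one game ended while the other continued
        if s_done:        # both games ended together
            break
    return np.allclose(s_obs, r_obs, atol=tol)
```

The weakness of such a fixed replay, that it only certifies behavior along one trajectory, is exactly what motivates the two learned components described next.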

Building an Algorithm for Accurate Grading

To solve the grading challenge, the researchers propose an algorithm with two components: an agent that plays the game in a way that reaches potential bug states, and a classifier that assigns each observed state a probability of being a bug. Both components are crucial for accurate grading: an agent that explores many states is useless if the classifier cannot recognize the bugs it triggers, and a perfect classifier is equally useless when paired with an agent that never triggers a bug.
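A hedged sketch of what the classifier component might look like: a small network that maps a state feature vector to a bug probability. The architecture below is illustrative, not the paper's exact model.

```python
# Illustrative bug-state classifier: maps a state feature vector to P(bug).
# The architecture and feature choice are assumptions, not the paper's model.

import torch
import torch.nn as nn

class BugClassifier(nn.Module):
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, states):
        # states: tensor of shape (batch, state_dim)
        return torch.sigmoid(self.net(states)).squeeze(-1)  # P(bug) per state
```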

An ideal agent produces differential trajectories, i.e., trajectories that allow the two MDPs to be told apart: when run on an incorrect MDP, its trajectory should contain at least one bug-triggering state. Training the agent and classifier requires correct MDPs and a few incorrect ones; the incorrect MDPs can be supplied by the teacher, who anticipates common mistakes, or identified through a small amount of manual grading.
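Putting the pieces together, here is a sketch of the grading decision under the assumptions above: play the submission with the trained agent, score every visited state with the classifier, and mark the submission incorrect if any state looks sufficiently bug-like. The `agent.act` interface and the max-over-states thresholding rule are illustrative choices, not taken verbatim from the paper.

```python
# Sketch of the end-to-end grading loop: the trained agent plays the
# student's MDP while the classifier scores each visited state; the
# submission is marked incorrect if any state exceeds a bug threshold.
# `agent.act` and the threshold rule are illustrative assumptions.

import torch

def grade(student_mdp, agent, classifier, episodes=8, max_steps=500,
          threshold=0.5):
    worst = 0.0
    for _ in range(episodes):
        obs, done, steps = student_mdp.reset(), False, 0
        while not done and steps < max_steps:
            action = agent.act(obs)              # agent seeks out bug states
            obs, _, done = student_mdp.step(action)
            state = torch.as_tensor(obs).unsqueeze(0)
            worst = max(worst, classifier(state).item())
            steps += 1
    return "incorrect" if worst > threshold else "correct"
```

Playing multiple episodes gives the agent several chances to steer into a bug-triggering state, at the cost of more compute per submission.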

Efficiency and Feasibility

Although manually labeling incorrect MDPs may sound cumbersome, the researchers demonstrate that the effort involved is far smaller than grading every assignment individually. On the task they study, only five labeled incorrect MDPs are needed to achieve satisfactory performance.

Conclusion

The NeurIPS 2021 paper introduces a new approach to grading coding assignments: turn each submission into a playable game for an AI agent. By combining reinforcement learning with an MDP view of student programs, the researchers automate the grading process and provide scalable feedback for online coding education platforms. With further development, this method could make effective, efficient grading available to far more learners.

Summary: Using Reinforcement Learning to Evaluate Complex Interactive Coding Programs

In a recent NeurIPS 2021 paper, researchers explore using AI algorithms to grade games that students develop as coding assignments. Manually grading such assignments does not scale for online coding education platforms, so the researchers propose grading submissions by playing them rather than reading the source code. Each submission's underlying game is represented as a Markov Decision Process (MDP) and compared against the teacher's reference MDP to decide whether the two are the same. The grading algorithm pairs an agent that plays the game with a classifier that recognizes bug states; together they enable accurate grading. With only a handful of labeled incorrect MDPs, the researchers show that this method achieves good performance with minimal manual effort.