This study addresses the challenges of large-scale reading comprehension assessment using the Cloze test, a technique in which students fill in gaps in a text. Traditionally, only words that exactly match the original text are counted as correct, which limits the assessment of students' actual text comprehension. Our research proposes a practical solution: using Natural Language Processing (NLP) techniques, specifically word embedding models, to evaluate the semantic similarity between the expected answer and the student's response. Word embeddings are numerical representations of words in a multi-dimensional space, where words with similar meanings are positioned close to each other. This approach allows semantically similar answers to be accepted as correct even when they differ from the original word. We administered a Cloze test to elementary school students in Brazil and compared three different word embedding models (GloVe, Wang2Vec, and spaCy) in evaluating the semantic similarity of responses. To validate our approach, we engaged twelve human judges who ranked the students' answers.
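The scoring idea described above can be sketched in a few lines: represent each word as a vector and accept a student's response when its cosine similarity to the expected answer exceeds a threshold. The vectors and the threshold below are made-up illustrations, not the actual model vectors or the cutoff used in the study; in practice the vectors would come from a pre-trained Portuguese model such as GloVe, Wang2Vec, or spaCy.

```python
import numpy as np

# Toy 3-dimensional embeddings, invented purely for illustration.
# Real embeddings (e.g., GloVe) typically have 100-300 dimensions.
embeddings = {
    "casa":  np.array([0.9, 0.1, 0.3]),  # "house" (expected answer)
    "lar":   np.array([0.8, 0.2, 0.4]),  # "home" (semantically close)
    "carro": np.array([0.1, 0.9, 0.2]),  # "car" (unrelated)
}

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors (1.0 = identical direction)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def is_acceptable(expected, response, threshold=0.8):
    """Accept a response whose similarity to the expected word passes the threshold.

    The threshold value here is hypothetical, not the one used in the study.
    """
    sim = cosine_similarity(embeddings[expected], embeddings[response])
    return sim >= threshold

# A near-synonym is accepted; an unrelated word is rejected.
print(is_acceptable("casa", "lar"))    # semantically similar
print(is_acceptable("casa", "carro"))  # unrelated
```

Under this scheme, an exact match trivially scores 1.0, so the traditional scoring rule is a special case of the similarity-based one.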