On the evaluation and application of neural language models for grammatical error detection

by Christopher Davis

Institution:	University of Cambridge
Department:
Degree:	PhD
Year:	2023
Keywords:	Computer Science; Natural Language Processing
Posted:	3/25/2025
Record ID:	2321576
Full text PDF:	https://doi.org/10.17863/CAM.108291 https://www.repository.cam.ac.uk/bitstreams/f9ca765c-f10e-4b64-8c10-3d3b7d38810d/download

Abstract

Neural language models (NLMs) have become a core component in many downstream applications within the field of natural language processing, including the task of data-driven automatic grammatical error detection (GED). This thesis explores whether information from NLMs can positively transfer to GED within the domain of learning English as a second language (ESL), and looks at whether NLMs encode and make use of linguistic signals that would facilitate robust and generalisable GED performance. First, I investigate whether information from different types of neural language model can be transferred to models for GED. I evaluate five models against three publicly available ESL benchmarks, and report results showing positive transfer effects to the extent that fine-grained error detection using a single model is becoming viable. Second, I carry out a causal investigation to understand whether NLM-GED models make use of robust linguistic signals during inference – in theory, this would enable them to generalise across different data distributions. The results show a high degree of linear encoding of noun-number within each model’s token-level contextual representations, but they also show markedly varying error detection performance across model types and across in- and out-of-domain datasets. Altogether, the results indicate that the models employ different strategies for error detection. Third, I re-frame the typically downstream GED task as an evaluation framework to test whether the pre-trained NLMs implicitly encode information about grammatical errors as an artefact of their language modelling objective. I present results illustrating stark differences between masked language models and autoregressive language models – while the former seemingly encode much more information related to the detection of grammatical errors, the results also present evidence of a brittle encoding across different syntactic constructions.
Altogether, this thesis presents a holistic analysis of NLMs – how they might be applied to GED, whether they utilise linguistic information to enable robust inference, and whether their pre-training objective implicitly imbues them with knowledge about grammaticality.