On the evaluation and application of neural language models for grammatical error detection

by Christopher Davis

Institution: University of Cambridge
Department:
Degree: PhD
Year: 2023
Keywords: Computer Science; Natural Language Processing
Posted: 3/25/2025
Record ID: 2321576
Full text PDF: https://doi.org/10.17863/CAM.108291 https://www.repository.cam.ac.uk/bitstreams/f9ca765c-f10e-4b64-8c10-3d3b7d38810d/download


Abstract

Neural language models (NLMs) have become a core component in many downstream applications within the field of natural language processing, including the task of data-driven automatic grammatical error detection (GED). This thesis explores whether information from NLMs can positively transfer to GED within the domain of learning English as a second language (ESL), and looks at whether NLMs encode and make use of linguistic signals that would facilitate robust and generalisable GED performance. First, I investigate whether information from different types of neural language model can be transferred to models for GED. I evaluate five models against three publicly available ESL benchmarks, and report results showing positive transfer effects to the extent that fine-grained error detection using a single model is becoming viable. Second, I carry out a causal investigation to understand whether NLM-GED models make use of robust linguistic signals during inference – in theory, this would enable them to generalise across different data distributions. The results show a high degree of linear encoding of noun-number within each model’s token-level contextual representations, but they also show markedly varying error detection performance across model types and across in- and out-of-domain datasets. Altogether, the results indicate that models employ different strategies for error detection. Third, I re-frame the typically downstream GED task as an evaluation framework to test whether the pre-trained NLMs implicitly encode information about grammatical errors as an artefact of their language modelling objective. I present results illustrating stark differences between masked language models and autoregressive language models – while the former seemingly encode much more information related to the detection of grammatical errors, the results also present evidence of a brittle encoding across different syntactic constructions.
Altogether, this thesis presents a holistic analysis of NLMs – how they might be applied to GED, whether they utilise linguistic information to enable robust inference, and whether their pre-training objective implicitly imbues them with knowledge about grammaticality.
