Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection

2021-03-23 | preprint


Cite this publication

Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection
Wahle, J. P.; Ruas, T.; Meuschke, N. & Gipp, B. (2021)

Documents & Media

License

GRO License

Details

Authors
Wahle, Jan Philip; Ruas, Terry; Meuschke, Norman; Gipp, Bela
Abstract
The rise of language models such as BERT allows for high-quality text paraphrasing. This poses a problem for academic integrity, as it is difficult to differentiate between original and machine-generated content. We propose a benchmark consisting of articles paraphrased with recent language models that rely on the Transformer architecture. Our contribution fosters future research on paraphrase detection systems by offering a large collection of aligned original and paraphrased documents, a study of its structure, and classification experiments with state-of-the-art systems. We make our findings publicly available.
Issue Date
23-March-2021
