LOCRA

The Louvain Corpus of Research Articles (LOCRA) is a corpus under development at the Centre for English Corpus Linguistics that aims to represent expert academic discourse in several disciplines. It currently totals 11 million words and contains research articles from top-rated peer-reviewed journals in 11 disciplines: anthropology, business, economics, education, law, literature, linguistics, medicine, political science, psychology, and sociology (c. 1 million words per discipline).

A subset of LOCRA also served as the basis for compiling an English-French comparable corpus (LOCRA-bilingual comparable), which consists of 4 million words of English and French research articles in the Humanities. The texts were compiled from top-ranked journals and are evenly spread across five disciplines: anthropology, education, political science, psychology, and sociology.

LOCRA: Original English and Original French components

Register (both components): Research articles in the Humanities from top-ranked journals across 5 disciplines (viz. anthropology, education, political science, psychology and sociology)

                       Original English    Original French
Total # of words       2,033,106           2,025,372
Total # of texts       271                 313
Average text length    7,502 words         6,470 words
Publication date       2007–2013           2006–2014
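
The average text lengths reported in the table follow from dividing each word total by the corresponding number of texts. The Python sketch below reproduces that arithmetic from the table's figures. It assumes that the published averages are truncated to whole words, which the source does not state. The name `average_text_length` is illustrative only.

```python
# Figures copied from the table; the dictionary layout is illustrative.
corpus = {
    "Original English": {"words": 2_033_106, "texts": 271},
    "Original French": {"words": 2_025_372, "texts": 313},
}


def average_text_length(words: int, texts: int) -> int:
    """Mean number of words per text, truncated to a whole number (assumption)."""
    return words // texts


for name, counts in corpus.items():
    avg = average_text_length(counts["words"], counts["texts"])
    print(f"{name}: {avg:,} words per text")

# Combined size of the two components, i.e. the "4 million words" cited in the text.
total_words = sum(counts["words"] for counts in corpus.values())
print(f"Combined size: {total_words:,} words")
```

Under the truncation assumption, the sketch gives 7,502 and 6,470 words per text, matching the table, and a combined size of 4,058,478 words.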