MULT-ED

 

The Multilingual Editorial Corpus (Mult-Ed) is a multilingual comparable corpus of newspaper editorials written in English, Dutch, French and Swedish. The English component spans several geographical varieties, with texts from Great Britain, South Africa and China. The CECL collaborates with Ghent University, which handles the collection of the Swedish subcorpus. Mult-Ed currently comprises c. 1,000,000 words of British English leading articles, c. 300,000 words each of English editorials from South Africa and China, c. 1,000,000 words of French editorials, c. 500,000 words of Dutch editorials and c. 220,000 words of Swedish editorials, as shown in the following table:

 

| LANGUAGE                | NEWSPAPER                | NUMBER OF RUNNING WORDS | TOTAL     |
|-------------------------|--------------------------|-------------------------|-----------|
| English (UK)            | The Guardian             | 421,647                 | 1,011,430 |
|                         | The Independent          | 61,479                  |           |
|                         | The Times                | 72,629                  |           |
|                         | The Observer             | 49,259                  |           |
|                         | The Sunday Telegraph     | 50,812                  |           |
|                         | The Daily Telegraph      | 296,340                 |           |
|                         | The Economist            | 59,264                  |           |
| English (South African) | City Press               | 25,216                  | 301,945   |
|                         | Financial Mail           | 54,530                  |           |
|                         | Mail & Guardian          | 126,430                 |           |
|                         | The Times                | 95,769                  |           |
| English (China)         | China Daily              | 101,612                 | 302,862   |
|                         | China Post               | 71,515                  |           |
|                         | South China Morning Post | 129,735                 |           |
| French                  | Le Monde                 | 270,972                 | 997,280   |
|                         | Le Figaro                | 343,736                 |           |
|                         | Libération               | 382,572                 |           |
| Dutch                   | De NRC Handelsblad       | 296,175                 | 490,415   |
|                         | Trouw                    | 95,209                  |           |
|                         | Het Parool               | 39,196                  |           |
|                         | Utrechts Nieuwsblad      | 46,100                  |           |
|                         | Haagsche Courant         | 13,735                  |           |
| Swedish                 | Dagens Nyheter           | 94,455                  | 220,065   |
|                         | Svenska Dagbladet        | 108,749                 |           |
|                         | Göteborgs-Posten         | 18,861                  |           |
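
For readers who want to work with these figures programmatically, the following is a minimal sketch of how the per-newspaper counts could be tallied into subcorpus totals and compared with the TOTAL column. The names `COUNTS`, `LISTED_TOTALS`, `subcorpus_totals` and `mismatches` are illustrative assumptions and not part of any Mult-Ed tooling; the numbers are copied from the table. On these figures the sums reproduce the listed totals for every subcorpus except Swedish, where the three newspaper counts add up to 222,065 against a listed total of 220,065; the table does not indicate which figure is off.

```python
# Running-word counts per newspaper, copied from the table above.
# The dictionary layout and all names here are illustrative assumptions.
COUNTS = {
    "English (UK)": {
        "The Guardian": 421_647,
        "The Independent": 61_479,
        "The Times": 72_629,
        "The Observer": 49_259,
        "The Sunday Telegraph": 50_812,
        "The Daily Telegraph": 296_340,
        "The Economist": 59_264,
    },
    "English (South African)": {
        "City Press": 25_216,
        "Financial Mail": 54_530,
        "Mail & Guardian": 126_430,
        "The Times": 95_769,
    },
    "English (China)": {
        "China Daily": 101_612,
        "China Post": 71_515,
        "South China Morning Post": 129_735,
    },
    "French": {
        "Le Monde": 270_972,
        "Le Figaro": 343_736,
        "Libération": 382_572,
    },
    "Dutch": {
        "De NRC Handelsblad": 296_175,
        "Trouw": 95_209,
        "Het Parool": 39_196,
        "Utrechts Nieuwsblad": 46_100,
        "Haagsche Courant": 13_735,
    },
    "Swedish": {
        "Dagens Nyheter": 94_455,
        "Svenska Dagbladet": 108_749,
        "Göteborgs-Posten": 18_861,
    },
}

# Totals exactly as listed in the TOTAL column of the table.
LISTED_TOTALS = {
    "English (UK)": 1_011_430,
    "English (South African)": 301_945,
    "English (China)": 302_862,
    "French": 997_280,
    "Dutch": 490_415,
    "Swedish": 220_065,
}


def subcorpus_totals(counts):
    """Sum the newspaper word counts within each subcorpus."""
    return {name: sum(papers.values()) for name, papers in counts.items()}


def mismatches(counts, listed):
    """Return {subcorpus: (computed, listed)} wherever the two totals differ."""
    computed = subcorpus_totals(counts)
    return {
        name: (computed[name], listed[name])
        for name in counts
        if computed[name] != listed[name]
    }


if __name__ == "__main__":
    for name, total in subcorpus_totals(COUNTS).items():
        print(f"{name}: {total:,}")
    print("Mismatches:", mismatches(COUNTS, LISTED_TOTALS))
```

Keeping the counts nested by subcorpus means that newspapers sharing a title, such as the British and South African editions of The Times, stay distinct.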