The Multilingual Editorial Corpus (Mult-Ed) is a multilingual comparable corpus of newspaper editorials written in English, Dutch, French and Swedish. The English component spans several geographical varieties and contains texts from Great Britain, South Africa and China. The CECL collaborates with Ghent University, which handles the collection of the Swedish subcorpus. Mult-Ed currently comprises c. 1,000,000 words of British English leading articles, c. 300,000 words each of South African and Chinese English editorials, c. 1,000,000 words of French editorials, c. 500,000 words of Dutch editorials and c. 220,000 words of Swedish editorials, as shown in the following table:
LANGUAGE | NEWSPAPERS | NUMBER OF RUNNING WORDS | TOTAL |
English (UK) | The Guardian | 421,647 | 1,011,430 |
 | The Independent | 61,479 | |
 | The Times | 72,629 | |
 | The Observer | 49,259 | |
 | The Sunday Telegraph | 50,812 | |
 | The Daily Telegraph | 296,340 | |
 | The Economist | 59,264 | |
English (South African) | City Press | 25,216 | 301,945 |
 | Financial Mail | 54,530 | |
 | Mail & Guardian | 126,430 | |
 | The Times | 95,769 | |
English (China) | China Daily | 101,612 | 302,862 |
 | China Post | 71,515 | |
 | South China Morning Post | 129,735 | |
French | Le Monde | 270,972 | 997,280 |
 | Le Figaro | 343,736 | |
 | Libération | 382,572 | |
Dutch | De NRC Handelsblad | 296,175 | 490,415 |
 | Trouw | 95,209 | |
 | Het Parool | 39,196 | |
 | Utrechts Nieuwsblad | 46,100 | |
 | Haagsche Courant | 13,735 | |
Swedish | Dagens Nyheter | 94,455 | 220,065 |
 | Svenska Dagbladet | 108,749 | |
 | Göteborgs-Posten | 18,861 | |
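
The per-language totals in the table can be recomputed from the per-newspaper figures. The following is a minimal Python sketch of that tally, not part of the corpus distribution: the names `COUNTS`, `STATED_TOTALS` and `language_totals` are illustrative assumptions, and the numbers are simply copied from the table.

```python
# Running-word counts per newspaper, copied from the table.
# The dictionary layout and all names here are illustrative assumptions.
COUNTS = {
    "English (UK)": {
        "The Guardian": 421_647,
        "The Independent": 61_479,
        "The Times": 72_629,
        "The Observer": 49_259,
        "The Sunday Telegraph": 50_812,
        "The Daily Telegraph": 296_340,
        "The Economist": 59_264,
    },
    "English (South African)": {
        "City Press": 25_216,
        "Financial Mail": 54_530,
        "Mail & Guardian": 126_430,
        "The Times": 95_769,
    },
    "English (China)": {
        "China Daily": 101_612,
        "China Post": 71_515,
        "South China Morning Post": 129_735,
    },
    "French": {
        "Le Monde": 270_972,
        "Le Figaro": 343_736,
        "Libération": 382_572,
    },
    "Dutch": {
        "De NRC Handelsblad": 296_175,
        "Trouw": 95_209,
        "Het Parool": 39_196,
        "Utrechts Nieuwsblad": 46_100,
        "Haagsche Courant": 13_735,
    },
    "Swedish": {
        "Dagens Nyheter": 94_455,
        "Svenska Dagbladet": 108_749,
        "Göteborgs-Posten": 18_861,
    },
}

# Totals as stated in the TOTAL column of the table.
STATED_TOTALS = {
    "English (UK)": 1_011_430,
    "English (South African)": 301_945,
    "English (China)": 302_862,
    "French": 997_280,
    "Dutch": 490_415,
    "Swedish": 220_065,
}


def language_totals(counts):
    """Sum the per-newspaper word counts for each language variety."""
    return {language: sum(papers.values()) for language, papers in counts.items()}


if __name__ == "__main__":
    totals = language_totals(COUNTS)
    for language, total in totals.items():
        stated = STATED_TOTALS[language]
        # Flag any row where the recomputed sum differs from the stated total.
        note = "" if total == stated else f"  (table states {stated:,})"
        print(f"{language:<25}{total:>12,}{note}")
```

Run on the figures as listed, the recomputed sums match the stated totals for every subcorpus except Swedish, where the three newspaper counts add up to 222,065 rather than 220,065. The table alone does not show which of those figures is the one that is off.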