English-Corpora: COCA [Davies] 1.1 billion word corpus of American English, 1990-2010. Compare to the BNC and ANC. Large, balanced, up-to-date, and freely available online.
Text corpus - Wikipedia In linguistics and natural language processing, a corpus (pl.: corpora) or text corpus is a dataset, consisting of natively digital and older, digitalized, language resources, either annotated or unannotated.
OPUS - Corpora New: MT560 dataset 2021-04-02 CCAligned and MultiCCAligned 2021-02-10 GoURMET and MIZAN 2020-11-27 EuroPat and tico-19 2020-10-31 OPUS-100 corpus 2020-06-30 ELRC public 2020-05-22 MultiParaCrawl 2019-10-16 Infopankki v1 2019-10-14 New corpus: memat (Xhosa English) 2018-10-06 New corpora: ParaCrawl, XhosaNavy 2018-02-15 New version
Santa Barbara Corpus of Spoken American English The Santa Barbara Corpus of Spoken American English is based on a large body of recordings of naturally occurring spoken interaction from all over the United States.
CORPUS® Naturals CORPUS HAS REFINED AND REDESIGNED WHAT A NATURAL FORMULA CAN BE. THE RESULT ARE PRODUCTS THAT GO ABOVE AND BEYOND WHAT YOU MAY HAVE COME TO EXPECT FROM “NATURAL.”