Search and Browse – ELRA Catalogue

Arabic
English
French

ID: ELRA-E0045

The MAURDOR project evaluates systems for the automatic processing of written documents. The collected documents are scanned documents (printed, typewritten or handwritten). In order to get images for the evaluation of automatic analysis systems, 10,000 original documents were c...

MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	500.00 €	10000.00 €
Licence: Evaluation Use - ELRA EVALUATION		5000.00 €
Licence: Commercial Use - ELRA VAR	10000.00 €	10000.00 €

NON MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	750.00 €	15000.00 €
Licence: Evaluation Use - ELRA EVALUATION		7500.00 €
Licence: Commercial Use - ELRA VAR	15000.00 €	15000.00 €

Mbochi speech corpus audio

Bantu languages
French

ID: ELRA-S0396

ISLRN: 747-055-093-447-8

The Mbochi speech corpus was developed in the framework of the ANR-DFG BULB project. This project aims to provide field linguists (e.g. those working on morphology) with tools for languages that are rarely or never written. The provided corpus is a subset of the corpus developed in this framework. The provided corpu...

MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	0.00 €	0.00 €
Licence: Commercial Use - ELRA VAR	0.00 €	0.00 €

NON MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	0.00 €	0.00 €
Licence: Commercial Use - ELRA VAR	0.00 €	0.00 €

MCL - Multifunctional Computational Lexicon of Contemporary Portuguese text

Portuguese

ID: ELRA-L0096

ISLRN: 489-956-642-755-8

MCL is a frequency lexicon of 26,443 lemmas and 140,315 tokens, with a minimum lemma frequency of 6, extracted from CORLEX, a contemporary Portuguese corpus (16,210,438 words). CORLEX is a subcorpus of the Reference Corpus of Contemporary Portuguese and contains written and spoken texts of several...

MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	0.00 €	0.00 €
Licence: Commercial Use - ELRA VAR	0.00 €	0.00 €

NON MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	0.00 €	0.00 €
Licence: Commercial Use - ELRA VAR	0.00 €	0.00 €

MDT Mandarin Chinese Conversational Recognition Corpus – 1 channel audio

Chinese

ID: ELRA-S0409-02

ISLRN: 234-140-315-272-4

This dataset consists of 4.98 hours of transcribed conversational speech in Mandarin Chinese, comprising 30 conversations by 32 speakers (16 male and 16 female). The audio is sampled at 16 kHz and quantized at 16 bits. For each conversation, there are two close-talking channels record...

MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	2615.00 €	2615.00 €

NON MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	3138.00 €	3138.00 €

MDT Mandarin Chinese Conversational Recognition Corpus – 2 channels audio

Chinese

ID: ELRA-S0409-03

ISLRN: 383-054-806-637-3

This dataset consists of 4.98 hours of transcribed conversational speech in Mandarin Chinese, comprising 30 conversations by 32 speakers (16 male and 16 female). The audio is sampled at 16 kHz and quantized at 16 bits. For each conversation, there are two close-talking channels record...

MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	3596.00 €	3596.00 €

NON MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	4315.20 €	4315.20 €

MDT Mandarin Chinese Conversational Recognition Corpus – 3 channels audio

Chinese

ID: ELRA-S0409-04

ISLRN: 235-882-638-211-2

This dataset consists of 4.98 hours of transcribed conversational speech in Mandarin Chinese, comprising 30 conversations by 32 speakers (16 male and 16 female). The audio is sampled at 16 kHz and quantized at 16 bits. For each conversation, there are two close-talking channels record...

MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	4577.00 €	4577.00 €

NON MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	5492.40 €	5492.40 €

MDT Mandarin Chinese Conversational Recognition Corpus – Complete set audio

Chinese

ID: ELRA-S0409-01

ISLRN: 559-956-475-937-1

This dataset consists of 4.98 hours of transcribed conversational speech in Mandarin Chinese, comprising 30 conversations by 32 speakers (16 male and 16 female). The audio is sampled at 16 kHz and quantized at 16 bits. For each conversation, there are two close-talking channels record...

MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	5557.00 €	5557.00 €

NON MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	6668.40 €	6668.40 €

1673 Language Resources (Page 42 of 84)