EGSpanish Corpus

After almost exactly one year of intensive work on the oral research data from Equatorial Guinea, the big moment has finally come: the extensive EGSpanish Corpus is now available in digitized form!

All 135 Frog Stories and a selection of 36 sociolinguistic interviews were fully transcribed with the help of an ad hoc transcription team at the Seminar for Ibero-Romance Studies at the University of Basel, using the EXMARaLDA Partitur Editor (https://exmaralda.org/en/partitur-editor-en/). Dr. Thomas Schmidt then carried out the tokenization, lemmatization and POS tagging, producing a corpus that can now be analyzed and annotated in EXAKT (EXMARaLDA Analysis and Concordance Tool, https://exmaralda.org/en/exakt-en/).

The EGSpanish Corpus has a total duration of 34:14:02 (hh:mm:ss) and contains 307,623 tokens.

Many thanks to Thomas, Sara, Johannes, Linda, Miriam, Joel and Peter for their great support!

