Please use the following text to cite this item or export to a predefined format:
Mikelenić, Bojana; Bikić-Carić, Gorana; Bezlaj, Metka; Oliver, Antoni and Tadić, Marko, 2025, RomCro v.2.0 - Parallel corpus of Romance languages and Croatian, HR-CLARIN, http://hdl.handle.net/20.500.14615/2-16
dc.contributor.author: Mikelenić, Bojana
dc.contributor.author: Bikić-Carić, Gorana
dc.contributor.author: Bezlaj, Metka
dc.contributor.author: Oliver, Antoni
dc.contributor.author: Tadić, Marko
dc.date.accessioned: 2025-02-04T13:37:35Z
dc.date.available: 2025-02-04T13:37:35Z
dc.date.issued: 2025-01-15
dc.description: The corpus contains originals and translations in all seven languages, and the order of the segments has been changed. The first version (RomCro v.1.0) was published in 2022. RomCro v.2.0 contains 33 original texts, 213 texts in total, 166,738 translation units and 19.4 million words, 3.7 million more than the previous version. Unlike v.1.0, v.2.0 also contains texts in Catalan.
dc.description.abstract: RomCro v.2.0 is a parallel, multilingual and multidirectional corpus of literary texts in six Romance languages (French, Portuguese, Romanian, Italian, Spanish and Catalan) and Croatian.
dc.identifier.uri: http://hdl.handle.net/20.500.14615/2-16
dc.language.iso: es
dc.publisher: Faculty of Humanities and Social Sciences
dc.rights: The MIT Licence
dc.rights.label: PUB
dc.rights.uri: https://zzl-ffzg.mit-license.org/
dc.subject: Parallel corpus
dc.subject: Catalan language
dc.subject: Croatian language
dc.subject: Literary texts
dc.subject: HUMANITIES and RELIGION::Languages and linguistics::Romance languages::French language
dc.subject: Italian language
dc.subject: Portuguese language
dc.subject: HUMANITIES and RELIGION::Languages and linguistics::Romance languages::Romanian language
dc.subject: HUMANITIES and RELIGION::Languages and linguistics::Romance languages::Spanish language
dc.subject: hrvatski jezik
dc.subject: usporedni korpus
dc.subject: književni tekstovi
dc.subject: francuski jezik
dc.subject: katalonski jezik
dc.subject: talijanski jezik
dc.subject: rumunjski jezik
dc.subject: španjolski jezik
dc.subject: portugalski jezik
dc.title: RomCro v.2.0 - Parallel corpus of Romance languages and Croatian
dc.type: corpus
local.contact.person: Bojana Mikelenić; bmikelen@ffzg.unizg.hr; Faculty of Humanities and Social Sciences, University of Zagreb
local.files.count: 2
local.files.size: 315628164
local.has.files: yes
local.size.info: 19.4 million tokens
local.sponsor: nationalFunds; MOBODL-2023-08-9511; Croatian Science Foundation; NextGenerationEU
metashare.ResourceInfo#ContentInfo.mediaType: text
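
The record states that the corpus holds 166,738 translation units across seven languages. As a rough way to inspect the TMX distribution, the sketch below streams a TMX file and counts its translation units and the segments per language. It assumes the file follows the usual TMX layout (un-namespaced `tu`, `tuv` and `seg` elements, with the language given in `xml:lang`); the record itself does not document the file's internal structure or language codes, and the function name `count_tmx_units` is purely illustrative.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# xml:lang lives in the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def count_tmx_units(source):
    """Stream a TMX file (path or binary file object).

    Returns (number of <tu> elements, Counter of <tuv> elements per language).
    Assumes un-namespaced TMX element names; adjust if the file differs.
    """
    unit_count = 0
    per_language = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "tu":
            unit_count += 1
            for tuv in elem.iter("tuv"):
                # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
                lang = tuv.get(XML_LANG) or tuv.get("lang") or "unknown"
                per_language[lang] += 1
            elem.clear()  # release the finished unit's children to limit memory use
    return unit_count, per_language


if __name__ == "__main__":
    units, languages = count_tmx_units("RomCro_2.0.tmx")
    print(f"translation units: {units}")
    for lang, segments in sorted(languages.items()):
        print(f"{lang}: {segments}")
```

Streaming with `iterparse` avoids loading the whole file into memory at once. Whether the resulting count matches the figure in the description depends on how the file is actually organised.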

This item is publicly available and licensed under The MIT Licence.
Files in this item

Name: RomCro_2.0.tsv
Size: 128.39 MB
Format: application/octet-stream
Description: tsv
MD5: 4da24e88e056baa81e493e62c3044b02

Name: RomCro_2.0.tmx
Size: 172.62 MB
Format: application/octet-stream
Description: tmx
MD5: c1b9be5e0703f3c09faa2cc35d1c26e1
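
The MD5 values listed for the two files can be used to check that a download is complete and uncorrupted. The following is a minimal sketch, assuming both files were saved under their listed names in one local directory; the helper names `md5_of_file` and `verify_download` are illustrative and not part of any repository tooling.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing in this record.
EXPECTED_MD5 = {
    "RomCro_2.0.tsv": "4da24e88e056baa81e493e62c3044b02",
    "RomCro_2.0.tmx": "c1b9be5e0703f3c09faa2cc35d1c26e1",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(directory="."):
    """Map each listed file name to True/False (checksum match) or None (file missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, outcome in verify_download().items():
        print(f"{name}: {labels[outcome]}")
```

A mismatch usually indicates an interrupted or altered download, in which case fetching the file again is the simplest remedy.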