RomCro v.2.0 - Parallel corpus of Romance languages and Croatian

Please use the following text to cite this item:
Mikelenić, Bojana; Bikić-Carić, Gorana; Bezlaj, Metka; Oliver, Antoni and Tadić, Marko, 2025, RomCro v.2.0 - Parallel corpus of Romance languages and Croatian, HR-CLARIN, http://hdl.handle.net/20.500.14615/2-16
Date issued
2025-01-15
Size
19.4 million tokens
Description
The corpus contains originals and translations in all seven languages; the order of the segments has been changed. The first version (RomCro v.1.0) was published in 2022. RomCro v.2.0 contains 33 original texts, 213 texts in total, 166,738 translation units and 19.4 million words, an increase of 3.7 million words over the previous version. Unlike v.1.0, v.2.0 also contains texts in Catalan.
This item is Publicly Available
and licensed under:
 Files in this item
Name
RomCro_2.0.tsv
Size
128.39 MB
Format
application/octet-stream
Description
tsv
MD5
4da24e88e056baa81e493e62c3044b02
Name
RomCro_2.0.tmx
Size
172.62 MB
Format
application/octet-stream
Description
tmx
MD5
c1b9be5e0703f3c09faa2cc35d1c26e1
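The MD5 values listed for the two files can be used to check that a download completed intact. The following is a minimal sketch, assuming the files were saved locally under the names shown in the listing; the helper names `md5_of_file` and `verify_download` are illustrative and not part of the repository.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing on this page.
EXPECTED_MD5 = {
    "RomCro_2.0.tsv": "4da24e88e056baa81e493e62c3044b02",
    "RomCro_2.0.tmx": "c1b9be5e0703f3c09faa2cc35d1c26e1",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for files over 100 MB.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's MD5 matches the checksum listed for its name.

    Hypothetical helper: it looks the file up by name, so the local copy
    must keep the original file name (an assumption of this sketch).
    """
    path = Path(path)
    expected = EXPECTED_MD5.get(path.name)
    if expected is None:
        raise KeyError(f"no listed checksum for {path.name}")
    return md5_of_file(path) == expected
```

Calling `verify_download("RomCro_2.0.tsv")` in the download directory returns `True` only when the local file matches the listed checksum; a `False` result usually indicates a truncated or corrupted download.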