CroSentiLex

CroSentiLex is a sentiment lexicon for Croatian. It consists of two files, crosentilex-positives.txt and crosentilex-negatives.txt, each containing 37K Croatian lemmas ranked by positivity and negativity, respectively, together with their PageRank scores. The rankings were created automatically from small positive and negative seed sets and co-occurrence frequencies, using the PageRank algorithm. In addition to the automatically extracted lexicon, human (gold-standard) sentiment annotations for 1200 Croatian lemmas are provided in gs-sentiment-annotations.txt.
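
As a rough illustration of how the two ranked files might be used, the following Python sketch loads a lexicon file and looks up a lemma. The file layout is an assumption, not something stated in this record: each line is taken to hold a lemma followed by its PageRank score, separated by whitespace. The function names (parse_lexicon, load_lexicon, polarity) are hypothetical, and the score difference computed by polarity is only an illustrative heuristic, not a method described by the lexicon's authors.

```python
from typing import Dict, Iterable


def parse_lexicon(lines: Iterable[str]) -> Dict[str, float]:
    """Parse lines of the ASSUMED form '<lemma> <score>' into a dict."""
    scores: Dict[str, float] = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        try:
            # Assumption: first field is the lemma, last field is the score.
            scores[parts[0]] = float(parts[-1])
        except ValueError:
            continue  # skip lines whose last field is not a number
    return scores


def load_lexicon(path: str, encoding: str = "utf-8") -> Dict[str, float]:
    """Read one lexicon file (e.g. crosentilex-positives.txt) from disk."""
    with open(path, encoding=encoding) as handle:
        return parse_lexicon(handle)


def polarity(lemma: str,
             positives: Dict[str, float],
             negatives: Dict[str, float]) -> float:
    """Illustrative heuristic: positive score minus negative score.

    Lemmas missing from a file contribute 0.0 for that file.
    """
    return positives.get(lemma, 0.0) - negatives.get(lemma, 0.0)
```

With the files downloaded locally, one would call load_lexicon("crosentilex-positives.txt") and load_lexicon("crosentilex-negatives.txt") and pass the resulting dictionaries to polarity. If the actual files use a different delimiter or column order, parse_lexicon would need to be adjusted accordingly.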

Publisher

TakeLab: Text Analysis and Knowledge Engineering Lab, University of Zagreb, Faculty of Electrical Engineering and Computing

This item is Publicly Available