# | Name | Description | URL | Preview | Citation |
---|---|---|---|---|---|
1 | News corpus | News articles collected from the finance sections of Liberty Times (自由時報) and Apple Daily (蘋果日報) and from Commercial Times (工商時報), from 2018/10/12 to 2019/2/09 (16,541 articles, approximately 10M words). Files are named `NAME-YYYYMMDD-ID.extension`, where `*.raw` is the raw article, `*.txt` is the article with digits and punctuation removed, `*.cut` is the tokenized article, and `*.tag` lists the companies mentioned in the article (empty if none). | Link | View | None |
2 | Market info | Various financial measures for each TWSE-listed firm over the period covered by the news corpus (see row 1) | Link | See link | None |
3 | Translated sentiment words | Traditional Chinese translation of each word in the Loughran and McDonald Sentiment Word Lists (2,041 words in total) | Link | See link | UTaipei 2015 |
4 | Expanded sentiment words | Similar words for each Chinese sentiment word, found using word2vec (8,561 words in total) | Link | See link | None |
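
The news corpus file names follow the `NAME-YYYYMMDD-ID.extension` pattern described in row 1, so they can be split into their parts before loading. The following is a minimal sketch, not part of the dataset tooling: the function name `parse_news_filename`, the `NewsFile` tuple, and the sample file name `ltn-20181012-0001.raw` are hypothetical, and it assumes the ID contains no hyphen or dot and that the tag files use a `.tag` extension.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional


class NewsFile(NamedTuple):
    """Parts of a corpus file name (hypothetical helper type)."""
    source: str      # NAME part, i.e. the news source
    published: date  # YYYYMMDD part, parsed into a date
    article_id: str  # ID part
    kind: str        # extension: raw, txt, cut, or tag


# Assumed layout: NAME-YYYYMMDD-ID.extension, with ID free of '-' and '.'.
_PATTERN = re.compile(
    r"^(?P<source>.+)-(?P<date>\d{8})-(?P<id>[^-.]+)\.(?P<kind>raw|txt|cut|tag)$"
)


def parse_news_filename(filename: str) -> Optional[NewsFile]:
    """Return the parsed parts, or None if the name does not fit the pattern."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    try:
        published = datetime.strptime(match.group("date"), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None
    return NewsFile(match.group("source"), published,
                    match.group("id"), match.group("kind"))


if __name__ == "__main__":
    # Hypothetical file name, used only to illustrate the pattern.
    print(parse_news_filename("ltn-20181012-0001.raw"))
```

Returning `None` for names that do not fit keeps the helper usable as a filter over a directory listing that may contain unrelated files.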