This is a tiny Turkish corpus of comments about Corona symptoms. It is compiled from two Ekşi Sözlük headlines: "covid-19 belirtileri" ("covid-19 symptoms") and "gün gün koronavirüs belirtileri" ("coronavirus symptoms day by day").

This corpus

  • contains 178 raw and 175 processed comments

  • is entirely in Turkish

  • comes in 2 versions, raw and mildly processed

In the processed version, HTML tags, expressions in brackets, and some other tags have been removed.
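The exact processing steps are not listed here, so the following is only a minimal sketch of what this kind of cleaning could look like. It assumes that "brackets" means parentheses and square brackets, and that nested brackets do not occur. The function name `clean_comment` and the regular expressions are hypothetical and are not taken from the corpus repository.

```python
import html
import re

# Assumed patterns; the corpus author's actual rules may differ.
TAG_RE = re.compile(r"<[^>]+>")                      # HTML tags such as <br/> or <a href="...">
BRACKET_RE = re.compile(r"\([^()]*\)|\[[^\[\]]*\]")  # non-nested (...) and [...] expressions
WS_RE = re.compile(r"\s+")                           # runs of whitespace


def clean_comment(raw: str) -> str:
    """Return a mildly processed version of one raw comment."""
    text = TAG_RE.sub(" ", raw)        # drop HTML tags, keep the text between them
    text = html.unescape(text)         # decode entities such as &amp;
    text = BRACKET_RE.sub(" ", text)   # drop bracketed expressions
    return WS_RE.sub(" ", text).strip()  # collapse leftover whitespace


if __name__ == "__main__":
    sample = 'ateş ve öksürük <br/>(bkz: <a href="/x">covid-19</a>) başladı'
    print(clean_comment(sample))  # ateş ve öksürük başladı
```

Tags are stripped before bracketed expressions so that a link inside a bracket, as in the sample, disappears together with the bracket around it.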

For more information about how this dataset was crafted, you can watch the playlist How to compile your own datasets. The corpus is also featured in another YouTube playlist, Quick recipes with spaCy Turkish models. In that playlist, we'll mine the Corona-mini corpus to extract information related to symptoms and authors' Corona experiences.

Find this small yet handy corpus on its GitHub repo.
