E3-JSI/dataset-SloATOMIC-2020

SloATOMIC 2020 Data Set

CC BY-SA 4.0

SloATOMIC 2020 contains Slovene translations of the examples in the ATOMIC 2020 data set. The translations were produced with the DeepL translation service.

The main purpose of the data set is to support training commonsense reasoning models for Slovene.

🗃️ Data

The data set is publicly available via the clarin.si repository.

The data set is also available in the data folder, which contains the following files:

Data Format

The data is in TSV (tab-separated values) format. Each line contains one example. The columns are:

  • head_event: The head event of the example.
  • relation: The relation between the head event and the tail event.
  • tail_event: The tail event of the example.
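As a minimal sketch of how the three-column format above could be read in Python, the snippet below parses tab-separated lines into named tuples. The `Example` type and `read_examples` function are hypothetical helpers, not part of this repository. The sample row is invented for illustration and is not taken from the data set, and the code assumes (without confirmation from the data files) that a header row may or may not be present.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple


class Example(NamedTuple):
    """One example, following the column names documented above."""
    head_event: str
    relation: str
    tail_event: str


COLUMNS = ("head_event", "relation", "tail_event")


def read_examples(lines: Iterable[str]) -> Iterator[Example]:
    """Yield examples from tab-separated lines (hypothetical helper)."""
    # QUOTE_NONE treats quote characters as ordinary text, which is a
    # common choice for TSV files; whether the real files need it is an assumption.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if tuple(row) == COLUMNS:
            continue  # skip a header row if one is present (assumption)
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(row)}: {row!r}")
        yield Example(*row)


# Illustrative input only: this row is made up, not copied from the data set.
sample = (
    "head_event\trelation\ttail_event\n"
    "OsebaX gre v trgovino\txWant\tkupiti kruh\n"
)
examples = list(read_examples(io.StringIO(sample)))
for example in examples:
    print(example.head_event, example.relation, example.tail_event, sep=" | ")
```

To read an actual file, the same function can be given a file handle opened with `open(path, encoding="utf-8", newline="")`, where `path` points to one of the files in the data folder.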

📚 Papers

The data set was used in the following papers:

🔎 Reference

If you use the data set in your research, please cite it as follows:

 @misc{11356/1724,
   title = {Slovene Translation of the Atomic 2020 data set {SloATOMIC} 2020},
   author = {Mladeni{\'c} Grobelnik, Adrian and Novak, Erik and Mladeni{\'c}, Dunja and Grobelnik, Marko},
   url = {http://hdl.handle.net/11356/1724},
   note = {Slovenian language resource repository {CLARIN}.{SI}},
   copyright = {Creative Commons - Attribution-{ShareAlike} 4.0 International ({CC} {BY}-{SA} 4.0)},
   issn = {2820-4042},
   year = {2022} 
 }

📣 Acknowledgments

This work was developed by the Department of Artificial Intelligence at the Jozef Stefan Institute.

The work is supported by the Slovenian Research Agency and the RSDO project.

⚖️ License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.