Word similarity dataset
A word similarity dataset is a collection of words, usually word pairs, annotated with similarity scores. Such datasets are often used to evaluate tools that provide methods for lexical semantics, such as semantic similarity measures. Researchers have established several word similarity datasets.
Name | Year | Pairs | Range | Reference |
---|---|---|---|---|
Rubenstein-Goodenough (RG) | 1965 | | 0.02–3.94 | Contextual correlates of synonymy |
Miller-Charles (MC) | 1991 | | | Contextual correlates of semantic similarity |
WordSim-353 | 2001 | 353 | | Placing search in context: the concept revisited |
MEN | 2012 | | | Multimodal distributional semantics |
SimLex-999 | 2014 | 999 | 0.23–9.80 | SimLex-999: Evaluating Semantic Models with (Genuine) Similarity Estimation |
SimVerb-3500 | 2016 | 3500 | | SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity |
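To illustrate how such a dataset can be used for evaluation, the following minimal sketch scores a similarity function against gold ratings with Spearman's rank correlation, a common choice for this task. Everything in it is an assumption made for illustration: the `gold` ratings are invented rather than taken from any of the datasets above, and `toy_similarity` is a hypothetical stand-in for a real semantic similarity tool.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical gold ratings, invented for illustration (not from any dataset).
gold = {
    ("car", "automobile"): 3.9,
    ("coast", "shore"): 3.6,
    ("bird", "crane"): 2.6,
    ("monk", "slave"): 0.6,
    ("noon", "string"): 0.1,
}


def toy_similarity(a, b):
    """Hypothetical stand-in model: Jaccard overlap of the words' letters."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)


def evaluate(model, gold_pairs):
    """Correlate the model's scores with the gold ratings over all pairs."""
    pairs = list(gold_pairs)
    predicted = [model(a, b) for a, b in pairs]
    human = [gold_pairs[p] for p in pairs]
    return spearman(predicted, human)


score = evaluate(toy_similarity, gold)
print(f"Spearman correlation with gold ratings: {score:.3f}")
```

A correlation near 1 would mean the tool ranks the pairs in nearly the same order as the gold ratings. A rank correlation is used here because the datasets use different rating scales, so only the ordering of pairs is directly comparable.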