This document discusses word space models and random indexing as ways of determining text similarity. Word space models represent words as points in a high-dimensional space based on their co-occurrence patterns, so that the distance between two words reflects their semantic similarity. Random indexing is an efficient method that builds a context vector for each word incrementally, without first constructing a large co-occurrence matrix. The document outlines the key parameters of random indexing and discusses its advantages over models such as latent semantic analysis (LSA): it can process data incrementally and needs fewer computational resources.
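The incremental construction of context vectors can be illustrated with a minimal sketch. It is not the document's own implementation: the class name `RandomIndexer`, the parameter names (`dim`, `nonzero`, `window`), their default values, and the toy sentences are all assumptions chosen for illustration. Each word receives a fixed sparse random index vector, and a word's context vector is the running sum of the index vectors of its neighbours.

```python
import math
import random
from collections import defaultdict


class RandomIndexer:
    """Toy random indexing model (illustrative; names and defaults are assumptions)."""

    def __init__(self, dim=300, nonzero=10, window=2, seed=0):
        self.dim = dim          # dimensionality of the reduced space
        self.nonzero = nonzero  # number of +1/-1 entries per index vector
        self.window = window    # context words counted on each side
        self.rng = random.Random(seed)
        self.index = {}         # word -> sparse index vector {position: sign}
        self.context = defaultdict(lambda: [0.0] * self.dim)

    def index_vector(self, word):
        # Assign each word a fixed sparse random vector the first time it is seen.
        if word not in self.index:
            positions = self.rng.sample(range(self.dim), self.nonzero)
            self.index[word] = {p: self.rng.choice((-1, 1)) for p in positions}
        return self.index[word]

    def update(self, tokens):
        # Add the index vectors of neighbouring words to each focus word's
        # context vector; no co-occurrence matrix is ever materialised.
        for i, focus in enumerate(tokens):
            vec = self.context[focus]
            lo = max(0, i - self.window)
            hi = min(len(tokens), i + self.window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                for pos, sign in self.index_vector(tokens[j]).items():
                    vec[pos] += sign

    def similarity(self, a, b):
        # Cosine similarity between two context vectors (0.0 for unseen words).
        if a not in self.context or b not in self.context:
            return 0.0
        u, v = self.context[a], self.context[b]
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        return dot / norm if norm else 0.0


ri = RandomIndexer()
for sentence in ["the cat drinks milk", "the dog drinks milk",
                 "investors buy stocks", "investors sell stocks"]:
    ri.update(sentence.split())  # text can be added one piece at a time

print(ri.similarity("cat", "dog"), ri.similarity("cat", "stocks"))
```

In this toy corpus "cat" and "dog" occur in identical contexts, so their cosine similarity comes out close to 1, while "cat" and "stocks" share no context words and should score much lower. New text can be folded in at any time by calling `update` again, which is the incremental property the summary contrasts with LSA.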