The document discusses SAWA, an algorithm for measuring text similarity. It describes how SAWA computes word-to-word and text-to-text similarity using Wikipedia as a concept hierarchy. Experimental results showed that optimizations improved performance tenfold while maintaining result quality. Future work includes developing a web service and web interfaces and releasing the source code as open source.