The document outlines the implementation of a Sinhala language corpus, covering resource gathering, data storage, and user interface design. It discusses the corpus architecture and composition and evaluates several database technologies, highlighting Cassandra for its performance in data insertion and retrieval. Limitations include the lack of part-of-speech tagging and the need to create new column families whenever additional information needs arise.