The document describes a project on automatic topic discovery in the large volumes of content produced by news agencies and in-house productions, addressing challenges such as data overload. It walks through the workflow of information extraction, language recognition, named entity extraction, and topic detection, and presents the results in a live demo. Two key findings emerge: recognition of non-English content needs better models, and optimal topic prediction needs a dedicated user interface.
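
To make the order of the workflow stages concrete, here is a minimal sketch in Python. It is not the project's implementation: the stopword lists, the function names (`recognize_language`, `extract_entities`, `detect_topics`, `run_pipeline`), and the sample texts are all assumptions made for illustration. Language recognition is approximated by counting stopword hits, named entity extraction by collecting runs of capitalized words, and topic detection by counting entities that recur across items. The information extraction stage is assumed to have already produced plain text.

```python
import re
from collections import Counter

# Toy stopword lists, assumed purely for illustration.
STOPWORDS = {
    "en": {"the", "and", "of", "in", "to", "is"},
    "de": {"der", "die", "und", "das", "ist", "in"},
}
ALL_STOPWORDS = set().union(*STOPWORDS.values())


def recognize_language(text):
    """Guess the language by counting stopword hits (a crude stand-in)."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract_entities(text):
    """Treat runs of capitalized words as entity candidates (not real NER)."""
    entities = []
    pattern = r"[A-ZÄÖÜ][a-zäöüß]+(?:\s[A-ZÄÖÜ][a-zäöüß]+)*"
    for match in re.findall(pattern, text):
        words = match.split()
        # Drop leading stopwords such as a sentence-initial "The" or "Die".
        while words and words[0].lower() in ALL_STOPWORDS:
            words.pop(0)
        if words:
            entities.append(" ".join(words))
    return entities


def detect_topics(items, min_items=2):
    """Report entities that occur in at least `min_items` distinct items."""
    counts = Counter()
    for item in items:
        counts.update(set(item["entities"]))
    return [(entity, n) for entity, n in counts.most_common() if n >= min_items]


def run_pipeline(texts):
    """Run language recognition and entity extraction per item, then topics."""
    items = [
        {
            "text": text,
            "language": recognize_language(text),
            "entities": extract_entities(text),
        }
        for text in texts
    ]
    return items, detect_topics(items)


# Hypothetical sample items, standing in for already-extracted text.
texts = [
    "The European Central Bank raised rates in Frankfurt.",
    "Markets in Frankfurt reacted to the European Central Bank.",
    "Die Bank ist in Frankfurt und das ist gut.",
]
items, topics = run_pipeline(texts)
```

The capitalization heuristic is especially weak for languages such as German, where all nouns are capitalized, which gives a small-scale feel for why non-English content calls for better models than English does.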