This document proposes a system that automatically assigns documents to categories drawn from a personalized knowledge graph (KnG) tailored to an organization's interests. The system works in three steps:
1. It identifies keywords in documents and matches them to candidate categories in the KnG.
2. It constructs an associative Markov network to model relationships between candidate categories and infer relevant categories collectively.
3. It supports active learning: it solicits user feedback, retrains the model on that feedback, and propagates the resulting constraints so that unwanted categories are avoided.
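The first two steps above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the system's actual design: the toy category table `CATEGORY_KEYWORDS`, the edge list `KG_EDGES`, the whitespace tokenizer, the `threshold` and `coupling` values, and the use of iterated conditional modes (ICM) as a simple stand-in for whatever inference procedure the system really uses. The only property it borrows from an associative Markov network is that linked categories are rewarded for taking the same label.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph: category -> trigger keywords (illustrative only).
CATEGORY_KEYWORDS = {
    "machine_learning": {"model", "training", "classifier"},
    "graphical_models": {"markov", "network", "inference", "potential"},
    "networking": {"network", "router", "packet", "latency"},
}
# Hypothetical edges between related categories in the graph.
KG_EDGES = [("machine_learning", "graphical_models")]


def candidate_scores(text):
    """Step 1: match keywords to candidate categories.

    The score is the fraction of a category's keywords found in the text.
    Tokenization is a naive lowercase whitespace split.
    """
    tokens = set(text.lower().split())
    scores = {}
    for category, keywords in CATEGORY_KEYWORDS.items():
        hits = len(tokens & keywords)
        if hits:
            scores[category] = hits / len(keywords)
    return scores


def infer_categories(scores, edges, threshold=0.6, coupling=0.3, max_iters=10):
    """Step 2: label candidates jointly with associative pairwise potentials.

    Each candidate is labeled relevant (1) or not (0). A node prefers label 1
    when its score exceeds `threshold`; every edge adds `coupling` whenever
    both endpoints agree. ICM greedily updates one node at a time.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a in scores and b in scores:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Start from the independent (per-node) decision.
    labels = {c: int(s >= threshold) for c, s in scores.items()}

    for _ in range(max_iters):
        changed = False
        for c in sorted(scores):
            on = (scores[c] - threshold) + coupling * sum(
                labels[n] == 1 for n in neighbors[c]
            )
            off = coupling * sum(labels[n] == 0 for n in neighbors[c])
            new_label = int(on > off)
            if new_label != labels[c]:
                labels[c] = new_label
                changed = True
        if not changed:
            break
    return {c for c, label in labels.items() if label == 1}


doc = "Training a classifier model with a Markov network"
scores = candidate_scores(doc)
relevant = infer_categories(scores, KG_EDGES)
```

In this toy run, `graphical_models` scores below the threshold on its own but is pulled in by its link to the strongly matched `machine_learning`, while the unlinked `networking` candidate is dropped. This is the kind of collective effect step 2 describes, shown here only under the assumptions listed above.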
The system is evaluated on standard text datasets and shows improved accuracy over baselines through active learning and personalization to an organization's interests. Some open challenges remain around accumulating too many categories over time and scaling inference to