1. The document proposes representing text documents as graphs (graph-of-words) instead of bag-of-words and using frequent subgraph mining to extract features for text categorization.
2. It describes using the gSpan algorithm to efficiently mine frequent subgraphs from the graph-of-words representations to generate features.
3. An elbow method is used to select a minimum support threshold that balances feature set size against accuracy.
4. Representing documents as graphs and mining subgraph features is shown to improve accuracy over the traditional bag-of-words representation on four text categorization datasets.
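
The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions: the graph-of-words is built as an undirected, unweighted co-occurrence graph over a sliding window (the default window size of 4 is an assumed value); only single-edge subgraphs are counted, as a simplified stand-in for what gSpan does for subgraphs of any size; and the elbow is taken as the point farthest from the straight line joining the two ends of the curve, which is one common heuristic. The function names `graph_of_words`, `edge_support`, `frequent_edges`, and `elbow_index` are hypothetical.

```python
from collections import Counter


def graph_of_words(tokens, window=4):
    """Return the undirected edges linking terms that co-occur within
    a sliding window of `window` tokens (window size is an assumption)."""
    edges = set()
    for i, term in enumerate(tokens):
        # Link the current term to the next (window - 1) terms.
        for other in tokens[i + 1 : i + window]:
            if other != term:
                edges.add(frozenset((term, other)))
    return edges


def edge_support(graphs):
    """Count how many document graphs contain each edge."""
    support = Counter()
    for edges in graphs:
        support.update(edges)
    return support


def frequent_edges(graphs, min_support):
    """Keep edges present in at least `min_support` documents.
    Single edges only: a simplification of full subgraph mining."""
    return {e for e, n in edge_support(graphs).items() if n >= min_support}


def elbow_index(xs, ys):
    """Index of the point farthest from the line through the first and
    last points of the curve (an assumed elbow heuristic)."""
    x0, y0, xn, yn = xs[0], ys[0], xs[-1], ys[-1]
    dx, dy = xn - x0, yn - y0

    def distance(i):
        # Unnormalized perpendicular distance to the end-to-end line.
        return abs(dy * xs[i] - dx * ys[i] + xn * y0 - yn * x0)

    return max(range(len(xs)), key=distance)


# Toy corpus, invented for illustration only.
docs = [
    "graph mining for text categorization".split(),
    "text categorization with graph mining".split(),
    "bag of words for text".split(),
]
graphs = [graph_of_words(d, window=2) for d in docs]
features = frequent_edges(graphs, min_support=2)

# Feature-set size for each candidate support threshold, then the elbow.
supports = [1, 2, 3]
sizes = [len(frequent_edges(graphs, s)) for s in supports]
chosen_support = supports[elbow_index(supports, sizes)]
```

Each retained edge can then serve as a binary feature (present or absent in a document's graph) for a standard classifier. Lowering `min_support` enlarges the feature set, which is the trade-off the elbow selection is meant to settle.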