The document proposes a research strategy for producing computational summaries of legal cases at scale through semi-supervised learning of legal semantics. It summarizes three of the author's past papers on representing legal semantics and outlines a two-step approach: (1) using natural language processing to automatically generate semantic interpretations of legal texts, and (2) generalizing patterns of information extraction through unsupervised learning of semantics from a large corpus of cases. The current proposal is to initialize the model with word embeddings trained on legal texts and then to learn higher-level concepts by applying a theory of representation based on prototypes and manifolds.