29.09.2017 (Friday) - Institute Seminar - 1:00 p.m., Włodek Zadrożny (University of North Carolina at Charlotte)
We will present a new way of addressing the challenges of patent analytics, and in particular of prior art retrieval. In this context, we discuss recent progress in incorporating word order and semantics into representations of text meaning, which has yielded promising results in computational text classification and analysis. This development, and the availability of a large number of legal rulings from the PTAB (US Patent Trial and Appeal Board), motivated us to revisit possibilities for practical, computational models of legal relevance, starting with this narrow and approachable niche of jurisprudence. We present results from our analysis and experiments towards this goal using a corpus of approximately 8000 rulings from the PTAB. This work makes three important contributions towards the development of models for legal relevance semantics: (a) using state-of-the-art Natural Language Processing (NLP) concept embedding methods, we characterize the diversity and types of semantic relationships that are implicit in practical judgements of legal relevance at the PTAB; (b) we achieve new state-of-the-art results on practical information retrieval using our customized semantic representations on this corpus; (c) we outline promising avenues for future work in the area, including preliminary evidence from human-in-the-loop interaction and new forms of text representation developed using input from over a hundred interviews with practitioners in the field. Finally, we argue that PTAB relevance is a practical and realistic baseline for performance measurement, with the desirable property of evaluating NLP improvements against “real world” legal judgement.