Presentation at LaTeCH 2014 on "Text Analytics for Legal History"

The Swedish Coast
At the EACL 2014 Workshop Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), I’m presenting a paper on A Text Analytic Approach to Rural and Urban Legal Histories. Link to the presentation below.
A Text Analytic Approach to Rural and Urban Legal Histories
The ACL publication appears on
This is the bib reference

Papers in JURIX 2013

I’m co-author of two papers at The 26th International Conference on Legal Knowledge and Information Systems (JURIX 2013), Bologna, Italy.
Bologna.  Food.
Argumentation Schemes for Reasoning about Factors with Dimensions
Katie Atkinson, Trevor Bench-Capon, Henry Prakken, and Adam Wyner
In previous work we presented argumentation schemes to capture the CATO and value based theory construction approaches to reasoning with legal cases with factors. We formalised the schemes with ASPIC+, a formal representation of instantiated argumentation. In ASPIC+ the premises of a scheme may either be a factor provided in a knowledge base or established using a further argumentation scheme. Thus far we have taken the factors associated with cases to be given in the knowledge base. While this is adequate for expressing factor based reasoning, we can further investigate the justifications for the relationship between factors and facts or evidence. In this paper we examine how dimensions as used in the HYPO system can provide grounds on which to argue about which factors should apply to a case. By making this element of the reasoning explicit and subject to argument, we advance our overall account of reasoning with legal cases and make it more robust.
author = {Katie Atkinson and Bench-Capon, Trevor and Henry Prakken and Adam Wyner},
title = {Argumentation Schemes for Reasoning about Factors with Dimensions},
booktitle = {Proceedings of 26th International Conference on Legal Knowledge and Information Systems (JURIX 2013)},
year = {2013},
pages = {??-??},
address = {Amsterdam},
publisher = {IOS Press}
A Case Study on Legal Case Annotation
Adam Wyner, Wim Peters, and Daniel Katz
The paper reports the outcomes of a study with law school students to annotate a corpus of legal cases for a variety of annotation types, e.g. citation indices, legal facts, rationale, judgement, cause of action, and others. An online tool is used by a group of annotators that results in an annotated corpus. Differences amongst the annotations are curated, producing a gold standard corpus of annotated texts. The annotations can be extracted with semantic searches of complex queries. There would be many such uses for the development and analysis of such a corpus for both legal education and legal research.
author = {Adam Wyner and Peters, Wim and Daniel Katz},
title = {A Case Study on Legal Case Annotation},
booktitle = {Proceedings of 26th International Conference on Legal Knowledge and Information Systems (JURIX 2013)},
year = {2013},
pages = {??-??},
address = {Amsterdam},
publisher = {IOS Press}
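
The curation step described in the annotation case study above, where differences amongst annotators are reconciled into a gold standard, can be pictured as a comparison of labels. The following is only a minimal sketch: the labels and sentences are invented, the function names are hypothetical, and Cohen's kappa is an agreement measure I have assumed for illustration rather than one the paper is stated to use.

```python
from collections import Counter

# Hypothetical labels from two annotators for the same six sentences.
annotator_a = ["fact", "fact", "rationale", "judgement", "fact", "rationale"]
annotator_b = ["fact", "rationale", "rationale", "judgement", "fact", "fact"]


def disagreements(a, b):
    """Return (index, label_a, label_b) for every item the annotators label differently."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def cohen_kappa(a, b):
    """Agreement between two annotators, corrected for chance agreement.

    Assumes the annotators do not agree purely by construction
    (expected agreement below 1), otherwise the ratio is undefined.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    labels = set(a) | set(b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)
    return (observed - expected) / (1 - expected)


to_curate = disagreements(annotator_a, annotator_b)
kappa = cohen_kappa(annotator_a, annotator_b)
print("items needing curation:", to_curate)
print("kappa:", round(kappa, 3))
```

The items returned by `disagreements` are the ones a curator would inspect by hand; the kappa value only summarises how far the two annotators agree beyond chance.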
Shortlink to this page.
By Adam Wyner

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Tutorial on "Textual Information Extraction from Legal Resources" at the 16th International Conference on Artificial Intelligence and Law, Rome, Italy


Legal resources such as legislation, public notices, case law, and other legally relevant documents are increasingly freely available on the internet. They are almost entirely presented in natural language and in text. Legal professionals, researchers, and students need to extract and represent information from such resources to support compliance monitoring, analyse cases for case-based reasoning, and extract information in the discovery phase of a trial (e-discovery), amongst a range of possible uses. To support such tasks, powerful text analytic tools are available. The tutorial presents an in-depth demonstration, with examples, of one toolkit, the General Architecture for Text Engineering (GATE), along with several briefer demonstrations of other tools.


Participants in the tutorial should come away with some theoretical sense of what textual information extraction is about. They will also see some practical examples of how to work with a corpus of materials, develop an information extraction system using GATE and the other tools, and share their results with the research community. Participants will be provided with information on where to find additional materials and learn more.

Intended Audience

The intended audience includes legal researchers, legal professionals, law school students, and political scientists who are new to text processing as well as experienced AI and Law researchers who have used NLP, but wish to get a quick overview of using GATE.

Covered Topics

  • Motivations to annotate, extract, and represent legal textual information.
  • Uses and domains of textual information extraction. Sample materials from legislation, case decisions, gazettes, e-discovery sources, among others.
  • Motivations to use an open source tool for open source development of textual information extraction tools and materials.
  • The relationship to the semantic web, linked documents, and data visualisation.
  • Linguistic/textual problems that must be addressed.
  • Alternative approaches (statistical, knowledge-light, machine learning) and a rationale for a particular bottom-up, knowledge-heavy approach in GATE.
  • Outline of natural language processing modules and tasks.
  • Introduction to GATE – loading and running simple applications, inspecting the results, refining the search results.
  • Development of fragments of a GATE system – lists, rules, and examination of results.
  • Discussion of more complex constructions and issues such as fact pattern identification, which is essential for case-based reasoning, named entity recognition, and structures of documents.
  • Introduction to ontologies.
  • Link textual information extraction to ontologies.
  • Introduction to related tools and approaches: C&C/Boxer (parser and semantic interpreter), Attempto Controlled English, scraperwiki, among others.
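
The "lists, rules, and examination of results" workflow in the topics above can be illustrated in miniature. What follows is a plain-Python analogy and not GATE's API or its rule language: the gazetteer entries, the `Annotation` class, the `PartyInCourt` annotation type, and the function names are all hypothetical and chosen only to show how list lookups feed a rule that builds a larger annotation.

```python
import re
from dataclasses import dataclass

# Hypothetical gazetteer lists: annotation type -> known terms.
GAZETTEER = {
    "Court": ["Supreme Court", "Court of Appeal", "High Court"],
    "Role": ["plaintiff", "defendant", "appellant", "respondent"],
}


@dataclass(frozen=True)
class Annotation:
    start: int
    end: int
    kind: str
    text: str


def lookup(text):
    """List step: annotate every gazetteer term that occurs in the text."""
    found = []
    for kind, terms in GAZETTEER.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, text, re.IGNORECASE):
                found.append(Annotation(m.start(), m.end(), kind, m.group()))
    return sorted(found, key=lambda a: a.start)


def party_in_court(text, annotations, max_gap=40):
    """Rule step: a Role followed within max_gap characters by a Court
    yields one 'PartyInCourt' annotation spanning both."""
    roles = [a for a in annotations if a.kind == "Role"]
    courts = [a for a in annotations if a.kind == "Court"]
    results = []
    for role in roles:
        for court in courts:
            if 0 <= court.start - role.end <= max_gap:
                span = text[role.start:court.end]
                results.append(Annotation(role.start, court.end, "PartyInCourt", span))
    return results


sentence = "The appellant argued before the Court of Appeal that the contract was void."
lookups = lookup(sentence)
combined = party_in_court(sentence, lookups)

# Examination step: inspect what the lists and the rule produced.
for ann in lookups + combined:
    print(ann.kind, "->", ann.text)
```

Refining the system then means editing the lists or the rule and re-inspecting the output, which is the loop the tutorial walks through with GATE itself.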

Date, Time, Location, and Logistics

Monday, June 10, afternoon session.
The tutorial was held at the Casa dell’Aviatore, viale dell’Università 20 in Rome, Italy.
Information about the conference is available at the website for the 16th International Conference on Artificial Intelligence and the Law (ICAIL).


The slides from the presentation are available here:
Textual Information Extraction from Legal Resources

Further Information

Contact the lecturer.


Dr. Adam Wyner
Lecturer, Department of Computing Science, University of Aberdeen
Aberdeen, Scotland
azwyner at abdn dot ac dot uk
The lecturer has a PhD in Linguistics, a PhD in Computer Science, and a research background in computational linguistics. The lecturer has previously given tutorials on this topic at JURIX 2009 and ICAIL 2011 along with an invited talk at RuleML 2012, has published several conference papers on text analytics of legal resources using GATE and C&C/Boxer, and continues to work on text analysis of legal resources.

Presentation at LEX Summer School 2012

I was a lecturer at the LEX Summer School 2012 in Ravenna, Italy on September 14, 2012.
San Vitale Mosaic, Ravenna, Italy

The school aims at providing knowledge of the most significant ICT standards emerging for legislation, judiciary, parliamentary and administrative documents. The course provides understanding of their impact in the different phases of the legislative and administrative process, awareness of the tools based on legal XML standards and of their constellations, and the ability to participate in the drafting and use of standard-compliant documents throughout the law-making process. In particular, we would like to create consciousness in the stakeholders in the legal domain about the benefits and the possibilities provided by the correct usage of Semantic Web technologies such as XML standards, ontologies, natural language processing techniques applied to legal texts, legal knowledge modelling and reasoning tools.

The zipped file contains the slides and some exercise material.
The first lecture (Part 1) introduces the general topic, some samples of results, and a discussion about crowdsourcing annotations in legal cases. The second lecture (Part 2) discusses the parsing and semantic representation of a fragment of the British Nationality Act. The class materials are used for an in-class exercise about annotation.
Port of Classe mosaic

Paper at SWAIE 2012

A paper with Jodi Schneider has been accepted to the 1st Workshop on Semantic Web and Information Extraction (SWAIE 2012), held at the 18th Conference on Knowledge Engineering and Knowledge Management, Galway, Ireland.
Identifying Consumers’ Arguments in Text
Jodi Schneider and Adam Wyner
Product reviews are a corpus of textual data on consumer opinions. While reviews can be sorted by rating, there is limited support to search in the corpus for statements about particular topics, e.g. properties of a product. Moreover, where opinions are justified or criticised, statements in the corpus indicate arguments and counterarguments. Explicitly structuring these statements into arguments could help better understand customers’ disposition towards a product. We present a semi-automated, rule-based information extraction tool to support the identification of statements and arguments in a corpus, using: argumentation schemes; user, domain, and sentiment terminology; and discourse indicators.
author = {Jodi Schneider and Adam Wyner},
title = {Identifying Consumers’ Arguments in Text},
booktitle = {Proceedings of the 1st Workshop on Semantic Web and Information Extraction (SWAIE 2012)},
year = {2012},
address = {Galway, Ireland},
note = {To appear}}
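
The kind of terminology-driven tagging described in the abstract above can be sketched roughly as follows. This is not the tool from the paper: the term lists, category names, and the `tag_sentence` function are hypothetical, and a real system would use far richer lists and rules than simple word matching.

```python
import re

# Hypothetical term lists; the paper's actual lists are not reproduced here.
INDICATORS = {
    "premise": ["because", "since", "suppose"],
    "conclusion": ["therefore", "so", "thus"],
    "contrast": ["but", "however", "although"],
}
DOMAIN_TERMS = ["battery", "lens", "zoom", "flash"]
SENTIMENT_TERMS = {"great": "+", "sharp": "+", "poor": "-", "blurry": "-"}


def words(sentence):
    """Lower-cased word tokens of a sentence."""
    return re.findall(r"[a-z']+", sentence.lower())


def tag_sentence(sentence):
    """Report which indicator categories, domain terms, and sentiment
    polarities occur in one sentence."""
    tokens = words(sentence)
    indicators = sorted(
        label
        for label, terms in INDICATORS.items()
        if any(term in tokens for term in terms)
    )
    return {
        "indicators": indicators,
        "domain": [t for t in tokens if t in DOMAIN_TERMS],
        "sentiment": [SENTIMENT_TERMS[t] for t in tokens if t in SENTIMENT_TERMS],
    }


review = "I returned the camera because the battery is poor. However, the lens is sharp."
sentences = re.split(r"(?<=[.!?])\s+", review)
for sentence in sentences:
    print(sentence, tag_sentence(sentence))
```

Sentences that carry both an indicator and a domain or sentiment term are the ones an analyst would look at first when assembling arguments and counterarguments.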

Papers at CMNA 2012 and AT 2012

Recent papers at two conferences. One is in the 12th Workshop on Computational Models of Natural Argument (CMNA 2012), Montpellier, France. A second paper is in the 1st International Conference on Agreement Technologies (AT 2012), Dubrovnik, Croatia.
Questions, arguments, and natural language semantics
Adam Wyner
Computational models of argumentation can be understood to bridge between human and automated reasoning. Argumentation schemes represent stereotypical, defeasible reasoning patterns. Critical questions are associated with argumentation schemes and are said to attack arguments. The paper highlights several issues with the current understanding of critical questions in argumentation. It provides a formal semantics for questions, an approach to instantiated argumentation schemes, and shows how the semantics of questions clarifies the issues. In this approach, questions do not attack schemes, though answers to questions might.
author = {Adam Wyner},
title = {Questions, Arguments, and Natural Language Semantics},
booktitle = {Proceedings of the 12th Workshop on Computational Models of Natural Argument ({CMNA} 2012)},
year = {2012},
address = {Montpellier, France},
note = {To appear}}
Arguing from a Point of View
Adam Wyner and Jodi Schneider
Evaluative statements, where some entity has a qualitative attribute, appear widespread in blogs, political discussions, and consumer websites. Such expressions can occur in argumentative settings, where they are the conclusion of an argument. Whether the argument holds depends on the premises that express a user’s point of view. Where different users disagree, arguments may arise. There are several ways to represent users, e.g. by values and other parameters. The paper proposes models and argumentation schemes for evaluative expressions, where the arguments and attacks between arguments are relative to a user’s model.
author = {Adam Wyner and Jodi Schneider},
title = {Arguing from a Point of View},
booktitle = {Proceedings of the First International Conference on Agreement Technologies},
year = {2012},
address = {Dubrovnik, Croatia},
note = {To appear}}

Papers at COMMA 2012

At the 4th International Conference on Computational Models of Argument (COMMA 2012) in Vienna, Austria, I have a short paper in the main conference and a paper in the demo session.
Semi-automated argumentative analysis of online product reviews
Adam Wyner, Jodi Schneider, Katie Atkinson, and Trevor Bench-Capon
Argumentation is key to understanding and evaluating many texts. The arguments in the texts must be identified; using current tools, this requires substantial work from human analysts. With a rule-based tool for semi-automatic text analysis support, we facilitate argument identification. The tool highlights potential argumentative sections of a text according to terms indicative of arguments (e.g. suppose or therefore) and domain terminology (e.g. camera names and properties). The information can be used by an analyst to instantiate argumentation schemes and build arguments for and against a proposal. The resulting argumentation framework can then be passed to argument evaluation tools.
author = {Adam Wyner and Schneider, Jodi and Katie Atkinson and Trevor Bench-Capon},
title = {Semi-Automated Argumentative Analysis of Online Product Reviews},
booktitle = {Proceedings of the 4th International Conference on Computational Models of Argument ({COMMA} 2012)},
year = {2012},
note = {To appear},
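
The last step mentioned in the product review abstract above, passing an argumentation framework to an evaluation tool, can be pictured with a small sketch. The paper does not say which tool or semantics is used; I assume here an abstract argumentation framework (arguments plus an attack relation) evaluated under grounded semantics, and the argument names are invented.

```python
def grounded_extension(arguments, attacks):
    """Compute the grounded extension of an abstract argumentation framework.

    arguments: a set of argument names.
    attacks: a set of (attacker, target) pairs.
    Starting from the empty set, repeatedly collect every argument whose
    attackers are all attacked by the current set, until nothing changes.
    """
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    extension = set()
    while True:
        defended = {
            a
            for a in arguments
            if all(any((d, b) in attacks for d in extension) for b in attackers[a])
        }
        if defended == extension:
            return extension
        extension = defended


# Hypothetical framework built by an analyst from a camera review.
arguments = {"buy_camera", "poor_battery", "spare_battery_included"}
attacks = {
    ("poor_battery", "buy_camera"),
    ("spare_battery_included", "poor_battery"),
}
accepted = grounded_extension(arguments, attacks)
print(sorted(accepted))
```

In this toy case the objection about the battery is itself defeated, so the proposal to buy the camera is reinstated; two arguments that only attack each other would both be left out.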
Critiquing justifications for action using a semantic model: Demonstration
Adam Wyner, Katie Atkinson, and Trevor Bench-Capon
The paper is two pages with no abstract.
author = {Adam Wyner and Atkinson, Katie and Trevor Bench-Capon},
title = {Critiquing Justifications for Action Using a Semantic Model: Demonstration},
booktitle = {Proceedings of the 4th International Conference on Computational Models of Argument ({COMMA} 2012)},
year = {2012},
pages = {1-2},
note = {To appear},

21st Century Law Practice and Law Tech Camp Presentations

As part of the 21st Century Law Practice Summer London Law Program, I had the opportunity to present a class on Topics in Natural Language Processing of Legal Texts. My thanks to Dan Katz for organising this and to the class for their interest.
Dan, co-organiser Renee Knake at Michigan State University, and their colleagues at the University of Westminster are up to good things in law and technology – well worth watching.
To cap off the Law Program, the summer program organised a Law Tech Camp of short, TED-style presentations. It is an excellent program of talks from members of the legal industry, practicing lawyers, and academics. I have a talk about Crowdsourcing Legal Text Annotation, which is also discussed in a previous post. The talks are videotaped and made available online (TBA).

EXTENDED CFP – Workshop on Semantic Processing of Legal Texts (SPLeT 2012)

In conjunction with
Language Resources and Evaluation Conference 2012 (LREC 2012)
27 May, 2012
Istanbul, Turkey
The legal domain represents a primary candidate for web-based information distribution, exchange and management, as testified by the numerous e-government, e-justice and e-democracy initiatives worldwide. The last few years have seen a growing body of research and practice in the field of Artificial Intelligence and Law which addresses a range of topics: automated legal reasoning and argumentation, semantic and cross-language legal information retrieval, document classification, legal drafting, legal knowledge discovery and extraction, as well as the construction of legal ontologies and their application to the law domain. In this context, it is of paramount importance to use Natural Language Processing techniques and tools that automate and facilitate the process of knowledge extraction from legal texts.
Since 2008, the SPLeT workshops have been a venue where researchers from the Computational Linguistics and Artificial Intelligence and Law communities meet, exchange information, compare perspectives, and share experiences and concerns on the topic of legal knowledge extraction and management, with particular emphasis on the semantic processing of legal texts. Within the Artificial Intelligence and Law community, there have also been a number of dedicated workshops and tutorials specifically focussing on different aspects of semantic processing of legal texts at conferences such as JURIX-2008, ICAIL-2009, ICAIL-2011, as well as in the International Summer School “Managing Legal Resources in the Semantic Web” (2007, 2008, 2009, 2010, 2011).
To continue this momentum and to advance research, a 4th Workshop on “Semantic Processing of Legal Texts” is being organized at the LREC-2012 conference to bring to the attention of the broader LR/HLT (Language Resources/Human Language Technology) community the specific technical challenges posed by the semantic processing of legal texts and also share with the community the motivations and objectives which make it of interest to researchers in legal informatics. The outcomes of these interactions are expected to advance research and applications and foster interdisciplinary collaboration within the legal domain.
New to this edition of the workshop are two sub-events (described below) to provide common and consistent task definitions, datasets, and evaluation for legal-IE systems along with a forum for the presentation of varying but focused efforts on their development.
The main goals of the workshop and associated events are to provide an overview of the state-of-the-art in legal knowledge extraction and management, to explore new research and development directions and emerging trends, and to exchange information regarding legal language resources and human language technologies and their applications.
Dependency Parsing
The first sub-event will be a shared task specifically focusing on dependency parsing of legal texts: although this is not a domain-specific task, it is a task which creates the prerequisites for advanced IE applications operating on legal texts, which can benefit from reliable preprocessing tools. For this year our aim is to create the prerequisites for more advanced domain-specific tasks (e.g. event extraction) to be organized in future SPLeT editions. We strongly believe that this could be a way to attract the attention of the LR/HLT community to the specific challenges posed by the analysis of this type of text and to have a clearer idea of the current state of the art. The languages dealt with will be Italian and English. A specific Call for Participation for the shared task is available on a dedicated page.
Semantic Annotation
The second sub-event will be an online, manual, collaborative, semantic annotation exercise, the results of which will be presented and discussed at the workshop. The goals of the exercise are: (1) to gain insight into and work towards the creation of a gold standard corpus of legal documents in a cohesive domain; and (2) to test the feasibility of the exercise and to get feedback on its annotation structure and workflow. The corpus to be annotated will be a selection of documents drawn from EU and US legislation, regulation, and case law in a particular domain (e.g. consumer or environmental protection). For this exercise, the language will be English. A specific Call for Participation for this annotation exercise is available on a dedicated page.
Areas of Interest
The workshop will focus on the topics of the automatic extraction of information from legal texts and the structural organisation of the extracted knowledge. Particular emphasis will be given to the crucial role of language resources and human language technologies.
Papers are invited on, but not limited to, the following topics:

  • Construction, extension, merging, customization of legal language resources, e.g. terminologies, thesauri, ontologies, corpora
  • Information retrieval and extraction from legal texts
  • Semantic annotation of legal text
  • Legal text processing
  • Multilingual aspects of legal text semantic processing
  • Legal thesauri mapping
  • Automatic classification of legal documents
  • Logical analysis of legal language
  • Automated parsing and translation of natural language legal arguments into a logical formalism
  • Dialogue protocols for legal information processing
  • Controlled language systems for law

LREC Conference Information (Accommodation, Travel, Registration)
Language Resources and Evaluation Conference 2012 (LREC 2012)
Workshop Schedule – TBA
Workshop Registration and Location – TBA
Webpage URLs

  • This page is
  • An alternative workshop webpage

Important Dates:

  • REVISED Submission: 19 February 2012
  • Acceptance Notification: 12 March 2012
  • Final Version: 30 March 2012
  • Workshop date: 27 May 2012

Author Guidelines:
    Submissions are solicited from researchers working on all aspects of semantic processing of legal texts. Authors are invited to submit papers describing original completed work, work in progress, interesting problems, case studies or research trends related to one or more of the topics of interest listed above. The final version of the accepted papers will be published in the Workshop Proceedings.
    Short or full papers can be submitted. Short papers are expected to present new ideas or new visions that may influence the direction of future research, yet they may be less mature than full papers. While an exhaustive evaluation of the proposed ideas is not necessary, insight and in-depth understanding of the issues are expected. Full papers should be better developed and evaluated. Short papers will be reviewed the same way as full papers by the Program Committee and will be published in the Workshop Proceedings.
    Full paper submissions should not exceed 10 pages, short papers 6 pages. See the style guidelines and files on the LREC site:
    Authors’ Kit and Templates
    Submit papers to:
    Submission for the workshop uses the START submission system at:
    Note that when submitting a paper through the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of your research. For further information on this new initiative, please refer to:
    After the workshop a number of selected, revised, peer-reviewed articles will be published in a Special Issue on Semantic Processing of Legal Texts of the AI and Law Journal (Springer).
    Contact Information:
    Address any queries regarding the workshop to:
    Program Committee Co-Chairs:
    Enrico Francesconi (National Research Center, Italy)
    Simonetta Montemagni (National Research Center, Italy)
    Wim Peters (University of Sheffield, UK)
    Adam Wyner (University of Liverpool, UK)
    Program Committee (Preliminary):
    Kevin Ashley (University of Pittsburgh, USA)
    Johan Bos (University of Rome, Italy)
    Daniele Bourcier (Humboldt Universitat, Germany)
    Pompeu Casanovas (Universitat Autonoma de Barcelona, Spain)
    Jack Conrad (Thomson Reuters, USA)
    Matthias Grabmair (University of Pittsburgh, USA)
    Antonio Lazari (Scuola Superiore S.Anna, Italy)
    Leonardo Lesmo (Universita di Torino, Italy)
    Marie-Francine Moens (Katholieke Universiteit Leuven, Belgium)
    Thorne McCarty (Rutgers University, USA)
    Raquel Mochales Palau (Catholic University of Leuven, Belgium)
    Paulo Quaresma (Universidade de Evora, Portugal)
    Tony Russell-Rose (UXLabs, UK)
    Erich Schweighofer (Universitat Wien, Austria)
    Rolf Schwitter (Macquarie University, Australia)
    Manfred Stede (University of Potsdam, Germany)
    Daniela Tiscornia (National Research Council, Italy)
    Tom van Engers (University of Amsterdam, Netherlands)
    Giulia Venturi (Scuola Superiore S.Anna, Italy)
    Vern R. Walker (Hofstra University, USA)
    Radboud Winkels (University of Amsterdam, Netherlands)

    Papers Accepted to the JURIX 2011 Conference

    My colleagues and I have had two papers (one long and one short) accepted for presentation at The 24th International Conference on Legal Knowledge and Information Systems (JURIX 2011). The papers are available on the links.
    On Rule Extraction from Regulations
    Adam Wyner and Wim Peters
    Rules in regulations such as found in the US Federal Code of Regulations can be expressed using conditional and deontic rules. Identifying and extracting such rules from the language of the source material would be useful for automating rulebook management and translating into an executable logic. The paper presents a linguistically-oriented, rule-based approach, which is in contrast to a machine learning approach. It outlines use cases, discusses the source materials, reviews the methodology, then provides initial results and future steps.
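
To give a feel for what extracting conditional and deontic rules from regulatory language might involve, here is a deliberately simple sketch. It is not the method of the paper: the cue lists, the two regular expressions, the `extract_rule` function, and the example sentences are all assumptions made for illustration, and real regulatory text needs far more linguistic care than this.

```python
import re

# Hypothetical deontic cue phrases mapped to a modality label.
DEONTIC_CUES = {
    "must not": "prohibition",
    "shall not": "prohibition",
    "may not": "prohibition",
    "must": "obligation",
    "shall": "obligation",
    "may": "permission",
}

# Longest cues first, so that "shall not" is preferred over "shall".
_cues = sorted(DEONTIC_CUES, key=len, reverse=True)
CUE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in _cues) + r")\b", re.IGNORECASE
)

# A leading conditional clause: everything up to the first comma.
CONDITION_PATTERN = re.compile(
    r"^(if|when|where|unless)\b(.*?),\s*(.*)$", re.IGNORECASE
)


def extract_rule(sentence):
    """Split a sentence into condition, bearer, modality, and action.

    Returns None when no deontic cue is found.
    """
    sentence = sentence.strip().rstrip(".")
    condition = None
    m = CONDITION_PATTERN.match(sentence)
    if m:
        condition = (m.group(1) + m.group(2)).strip()
        sentence = m.group(3)
    cue = CUE_PATTERN.search(sentence)
    if cue is None:
        return None
    return {
        "condition": condition,
        "bearer": sentence[:cue.start()].strip(),
        "modality": DEONTIC_CUES[cue.group(1).lower()],
        "action": sentence[cue.end():].strip(),
    }


# Invented example sentences in the style of a regulation.
rule_1 = extract_rule(
    "If a sample tests positive, the establishment must notify the agency within 24 hours."
)
rule_2 = extract_rule("The operator shall not release records.")
print(rule_1)
print(rule_2)
```

A structured record of this kind is one possible intermediate form on the way to the rulebook management and executable logic mentioned in the abstract.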
    Populating an Online Consultation Tool
    Sarah Pulfrey-Taylor, Emily Henthorn, Katie Atkinson, Adam Wyner, and Trevor Bench-Capon
    The paper addresses the extraction, formalisation, and presentation of public policy arguments. Arguments are extracted from documents that comment on public policy proposals. Formalising the information from the arguments enables the construction of models and systematic analysis of the arguments. In addition, the arguments are represented in a form suitable for presentation in an online consultation tool. Thus, the forms in the consultation correlate with the formalisation and can be evaluated accordingly. The stages of the process are outlined with reference to a working example.