I participated remotely in a workshop on Legislation and Computers in London on June 25, 2019. The meeting is part of ongoing work by an international group of Parliamentary Counsel to explore and discuss recent developments in machine-readable and executable legislation.
I gave my talk jointly with one of my collaborators, Fraser Gough, Parliamentary Counsel in the Parliamentary Counsel Office of the Scottish Government.
The program of the workshop is here and the slides of the talk are here.
I had the opportunity to give a talk at the Office of the Parliamentary Counsel, London, on 21 May 2019 about a small pilot project done with colleagues in the Parliamentary Counsel Office of the Scottish Government.
The talk is about applying some LegalRuleML elements as annotations to a corpus of Scottish legislation, making the annotated documents accessible on the Web, then visualising and querying the corpus to access particularly relevant information from across the corpus.
I spoke briefly at the online CodeX weekly meeting about a small pilot project done with colleagues in the Parliamentary Counsel Office of the Scottish Government.
The talk is about applying some LegalRuleML elements as annotations to a corpus of Scottish legislation, making the annotated documents accessible on the Web, then visualising and querying the corpus to access particularly relevant information from across the corpus.
There were other excellent presentations by:
Daniel Hoadley, Incorporated Counsel for Law Reporting in England and Wales
CodeX will make a recording of the session available.
The slides of my talk are available here. The talk is a shortened and slightly modified version of a talk at the Office of the Parliamentary Counsel, London.
Thanks especially to Jameson Dempsey at CodeX for inviting me to participate.
Legal resources such as legislation, public notices, case law, and other legally relevant documents are increasingly freely available on the internet. They are almost entirely presented as natural language text. Legal professionals, researchers, and students need to extract and represent information from such resources to support compliance monitoring, analyse cases for case-based reasoning, and extract information in the discovery phase of a trial (e-discovery), amongst a range of possible uses. To support such tasks, powerful text analytic tools are available. The tutorial presents an in-depth demonstration, with examples, of one toolkit, the General Architecture for Text Engineering (GATE), along with several briefer demonstrations of other tools.
Goals
Participants in the tutorial should come away with some theoretical sense of what textual information extraction is about. They will also see some practical examples of how to work with a corpus of materials, develop an information extraction system using GATE and the other tools, and share their results with the research community. Participants will be provided with information on where to find additional materials and learn more.
Intended Audience
The intended audience includes legal researchers, legal professionals, law school students, and political scientists who are new to text processing as well as experienced AI and Law researchers who have used NLP, but wish to get a quick overview of using GATE.
Covered Topics
Motivations to annotate, extract, and represent legal textual information.
Uses and domains of textual information extraction. Sample materials from legislation, case decisions, gazettes, e-discovery sources, among others.
Motivations to use an open source tool for open source development of textual information extraction tools and materials.
The relationship to the semantic web, linked documents, and data visualisation.
Linguistic/textual problems that must be addressed.
Alternative approaches (statistical, knowledge-light, machine learning) and a rationale for a particular bottom-up, knowledge-heavy approach in GATE.
Outline of natural language processing modules and tasks.
Introduction to GATE – loading and running simple applications, inspecting the results, refining the search results.
Development of fragments of a GATE system – lists, rules, and examination of results.
Discussion of more complex constructions and issues such as fact pattern identification, which is essential for case-based reasoning, named entity recognition, and structures of documents.
Introduction to ontologies.
Link textual information extraction to ontologies.
Introduction to related tools and approaches: C&C/Boxer (parser and semantic interpreter), Attempto Controlled English, scraperwiki, among others.
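The "lists, rules, and examination of results" workflow in the topics above can be illustrated outside GATE with a minimal Python sketch. This is not GATE or JAPE code: the list entries, the annotation type names, and the single rule are all hypothetical and chosen only to show how a lookup list feeds a pattern rule.

```python
import re

# Hypothetical lookup list: surface strings mapped to an annotation type.
GAZETTEER = {
    "plaintiff": "LegalRole",
    "defendant": "LegalRole",
    "appellant": "LegalRole",
    "court of appeals": "Court",
}

def lookup(text):
    """Return (start, end, type, string) for every list entry found in text."""
    annotations = []
    for entry, ann_type in GAZETTEER.items():
        pattern = r"\b" + re.escape(entry) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append((m.start(), m.end(), ann_type, m.group(0)))
    return sorted(annotations)

def role_name_rule(text, annotations):
    """Toy rule: a LegalRole directly followed by a capitalised word
    yields a Party annotation that covers both."""
    parties = []
    for start, end, ann_type, _ in annotations:
        if ann_type != "LegalRole":
            continue
        m = re.match(r"\s+([A-Z][a-z]+)", text[end:])
        if m:
            new_end = end + m.end()
            parties.append((start, new_end, "Party", text[start:new_end]))
    return parties

text = "The defendant Cone appealed to the Court of Appeals."
anns = lookup(text)
parties = role_name_rule(text, anns)
for ann in anns + parties:
    print(ann)
```

Inspecting the printed spans is the "examination of results" step: where the rule over- or under-generates, the list or the pattern is refined and the process repeats.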
Dr. Adam Wyner
Lecturer, Department of Computing Science, University of Aberdeen
Aberdeen, Scotland
azwyner at abdn dot ac dot uk Website
The lecturer has a PhD in Linguistics, a PhD in Computer Science, and a research background in computational linguistics. The lecturer has previously given a tutorial on this topic at JURIX 2009 and ICAIL 2011 along with an invited talk at RuleML 2012, has published several conference papers on text analytics of legal resources using GATE and C&C/Boxer, and continues to work on text analysis of legal resources.
By Adam Wyner
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
As part of the 21st Century Law Practice Summer London Law Program, I had the opportunity to present a class on Topics in Natural Language Processing of Legal Texts. My thanks to Dan Katz for organising this and to the class for their interest.
Dan, co-organiser Renee Knake at Michigan State University, and their colleagues at the University of Westminster are up to good things in law and technology – well worth watching.
To cap off the Law Program, the summer program organised a Law Tech Camp of short, TED-style presentations. It is an excellent program of talks from members of the legal industry, practicing lawyers, and academics. I have a talk about Crowdsourcing Legal Text Annotation, which is also discussed in a previous post. The talks are videotaped and made available online (TBA).
Two short papers appear in the proceedings of the LREC Workshop on SPLeT 2012 – Semantic Processing of Legal Texts. The papers are available on the links.
Problems and Prospects in the Automatic Semantic Analysis of Legal Texts – A Position Paper
Adam Wyner
Abstract
Legislation and regulations are expressed in natural language. Machine-readable forms of the texts may be represented as linked documents, semantically tagged text, or translation to a logic. The paper considers the latter form, which is key to testing consistency of laws, drawing inferences, and providing explanations relative to input. To translate laws to a machine-readable logic, sentences must be parsed and semantically translated. Manual translation is time and labour intensive, usually involving narrowly scoping the rules. While automated translation systems have made significant progress, problems remain. The paper outlines systems to automatically translate legislative clauses to a semantic representation, highlighting key problems and proposing some tasks to address them.
Semantic Annotations for Legal Text Processing using GATE Teamware
Adam Wyner and Wim Peters
Abstract
Large corpora of legal texts are increasingly available in the public domain. To make them amenable to automated text processing, various sorts of annotations must be added. We consider semantic annotations bearing on the content of the texts – legal rules, case factors, and case decision elements. Adding annotations and developing gold standard corpora (to verify rule-based or machine learning algorithms) is costly in time, expertise, and money. To make the processes efficient, we propose several instances of GATE’s Teamware to support annotation tasks for legal rules, case factors, and case decision elements. We engage annotation volunteers (law school students and legal professionals). The reports on the tasks are to be presented at the workshop.
A study in online, collaborative legal informatics
Adam Wyner, University of Aberdeen
Wim Peters, University of Sheffield
Daniel Katz, Michigan State University
— Introduction —
This is an academic research study on legal informatics (information processing of the law). The study uses an online, collaborative tool to crowdsource the annotation of legal cases. The task is similar to legal professionals’ annotation of cases. The result will be a public corpus of searchable, richly annotated legal cases that can be further processed, analysed, or queried for conceptual annotations.
Adam and Wim are computer scientists who are interested in language, law, and the Internet. Dan is an academic lawyer also interested in law and the Internet.
We are inviting people to participate in this collaborative task. This is a beta version of the exercise, and we welcome comments on how to improve it. Please read through this blog post, look at the video, and get in contact.
— Highlighting, Annotations, and Legal Case Briefs —
In reading, analysing, and preparing a summary of a legal case, law students and legal professionals annotate cases by highlighting and colour coding elements of the case to make for easy identification. Different elements are annotated: the holding, the parties, the facts, and so on. A sample image of annotations is:
— Problem —
To analyse a legal case, legal professionals annotate the case into its constituent parts. The analysis is summarised in a case brief. However, the current approach is very limited:
Analysis is time-consuming and knowledge-intensive.
Case briefs may miss relevant information.
Case analyses and briefs are privately held.
Case analyses are in paper form, so not searchable over the Internet.
Current search tools look for text strings, not conceptual information. We want to search for concepts, such as the holdings by a particular judge or the causes of action against a particular defendant. With annotated legal cases, we can enable conceptual search.
There is no capacity to systematically compare, contrast, and evaluate the work by different annotators. Consequently, the annotation task itself is not used as an opportunity to gain greater expertise in case analysis.
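To make the contrast between string search and conceptual search in the list above concrete, here is a minimal Python sketch. It assumes a hypothetical in-memory corpus in which each annotation records a case, a type, a feature, and the annotated text; the case identifiers, judge names, and holdings are invented for illustration, and a real corpus exported from an annotation tool would look different.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    case_id: str
    ann_type: str   # annotation type, e.g. "Reasoning Outcomes"
    feature: str    # feature within the type, e.g. "Holding"
    text: str       # the annotated span of the case

# Hypothetical toy corpus of annotated cases.
CORPUS = [
    Annotation("case-1", "Indexes", "Judge Name", "Judge Smith"),
    Annotation("case-1", "Reasoning Outcomes", "Holding", "The contract is void."),
    Annotation("case-2", "Indexes", "Judge Name", "Judge Jones"),
    Annotation("case-2", "Reasoning Outcomes", "Holding", "The patent is invalid."),
]

def cases_with(corpus, feature, text):
    """Ids of cases that carry an annotation with this feature and text."""
    return {a.case_id for a in corpus if a.feature == feature and a.text == text}

def holdings_by_judge(corpus, judge):
    """Conceptual query: holdings in cases heard by the given judge."""
    case_ids = cases_with(corpus, "Judge Name", judge)
    return [a.text for a in corpus
            if a.feature == "Holding" and a.case_id in case_ids]

holdings = holdings_by_judge(CORPUS, "Judge Smith")
print(holdings)
```

A plain string search for "Smith" would also return every case that merely mentions a party or witness called Smith; the query above only matches text that an annotator marked as a judge's name.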
— Solution: Crowdsource Annotation —
We use an online legal case annotation tool and share the results to support:
Online search in legal cases for case details and concepts.
Semantic web applications and information extraction.
Crowdsourcing a legal case corpus.
Training and learning for legal case analysis.
The results of the study would be useful to:
Law school students learning case analysis.
Legal professionals in identifying relevant cases.
Researchers of legal informatics.
Law faculty in training students to analyse legal cases.
Broadly speaking, a corpus of analysed cases makes case law a public resource that democratises legal knowledge.
— Annotations: types and features —
To crowdsource conceptual annotations of legal cases, we use the General Architecture for Text Engineering (GATE) Teamware tool. Teamware is a web-based application that provides an annotator with a text to annotate and a list of annotations to use. The task is a web-based version of what legal analysts of cases already do.
We use familiar annotations for legal cases, divided (for ease of reference) into types and features. For example, we have a type Legal Roles and various features to select among, e.g. defendant. We are counting on you to have learned and used these annotations in the course of your legal study and practice.
You do not need to memorise the types and features as they will appear in the GATE Teamware tool. It may be handy to keep this webpage open so you can consult it or you could also print out the page.
The annotations we use are:
Argument For Party – arguments for a particular party, using the most general notion:
for Appellee, for Appellant, for Defendant, for Plaintiff.
Facts – legal and procedural facts:
Cause of Action – the specific legal theory upon which the plaintiff brings the suit.
Defenses raised by Defendant – the defendant’s defenses against the cause of action.
Legal Facts – the legally relevant facts of the case that are used in arguing the issues.
Remedy requested by Plaintiff – what the plaintiff asks the court to grant.
Indexes – various indicative information:
Case Citation – the citation of the particular case being annotated.
Court Address – the address of the court.
Hearing Date – the date of the hearing.
Judge Name – the names of the judges, annotated one at a time.
Jurisdiction – the legal jurisdiction of the case.
Issues – the issues before the court:
Procedural Issue – what the appellant claims that the lower court did wrong.
Substantive Issue – the point of law that is in dispute (legal facts have their own annotation).
Legal Roles – the role of the parties in the case:
General – buyer/seller, employer/employee, landlord/tenant, etc.
Other – relevant information not covered by the other annotations.
Procedural History – the disposition of the case with respect to the lower court(s):
Appeal Information – who appealed and why they appealed.
Damages – the damages awarded by the lower court.
Lower Court Decision – the lower court’s decision.
Reasoning Outcomes – various parts of the legal decision:
Concurring Opinion.
Dicta – commentary about the judgement and holding, but not part of the rationale.
Dissenting Opinion.
Holding – the rule of law or legal principle that was applied in making the judgement. You can think about this as the new ground that the court is covering in this case. What legal rule(s) is the court developing or clarifying? The case can have more than one holding if there is more than one legal rule being considered. Note that a holding from a cited precedent is to be considered part of the rationale.
Judgement – Given the holding and the corresponding rationale for the holding, the judgement is the court’s final decision about the rights of the parties, the court’s response to a party’s request for relief, and bearing on prior decisions (e.g. affirmed, reversed, remanded, etc.).
Rationale – the court’s analysis of the issues and the reasons for the holding.
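For readers who want to process the annotations programmatically, the types and features listed above can be written down as a small lookup table. The Python sketch below is only an illustration: the `SCHEMA` name, the `is_valid` and `annotate` helpers, and the character offsets are hypothetical and are not part of Teamware. Legal Roles lists only the one feature given above, although more are selectable in the tool.

```python
# Annotation types and their features, transcribed from the list above.
SCHEMA = {
    "Argument For Party": ["for Appellee", "for Appellant",
                           "for Defendant", "for Plaintiff"],
    "Facts": ["Cause of Action", "Defenses raised by Defendant",
              "Legal Facts", "Remedy requested by Plaintiff"],
    "Indexes": ["Case Citation", "Court Address", "Hearing Date",
                "Judge Name", "Jurisdiction"],
    "Issues": ["Procedural Issue", "Substantive Issue"],
    "Legal Roles": ["General"],  # only the feature named in the list above
    "Other": [],                 # a type with no features
    "Procedural History": ["Appeal Information", "Damages",
                           "Lower Court Decision"],
    "Reasoning Outcomes": ["Concurring Opinion", "Dicta", "Dissenting Opinion",
                           "Holding", "Judgement", "Rationale"],
}

def is_valid(ann_type, feature=None):
    """True if the type is known and the feature, if given, belongs to it."""
    if ann_type not in SCHEMA:
        return False
    return feature is None or feature in SCHEMA[ann_type]

def annotate(store, span, ann_type, feature=None):
    """Attach a label to a (start, end) span; a span may carry several labels."""
    if not is_valid(ann_type, feature):
        raise ValueError(f"unknown annotation: {ann_type} / {feature}")
    store.setdefault(span, set()).add((ann_type, feature))

# The same sentence annotated twice, once as a fact and once as rationale.
store = {}
annotate(store, (120, 180), "Facts", "Legal Facts")
annotate(store, (120, 180), "Reasoning Outcomes", "Rationale")
print(store)
```

Storing a set of labels per span keeps double annotations, such as a sentence that is both a legal fact and part of the rationale, without any special handling.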
— Strategic Phases —
From previous experience and following discussions, we believe it is best if the annotations are grouped together and done in three phases. This allows the annotator to do simpler tasks first and to keep in mind a subset of the relevant annotations.
Phase I: Indexes and Legal Roles
Phase II: Procedural History and Reasoning Outcomes
Phase III: Facts and Issues
For the time being, we are not attending to the Argument For Party and Other annotations.
— Collaborate —
Take a look at the instructional video below. If you wish to collaborate on the task, send an email to Adam Wyner – adam@wyner.info
In the email, please include brief information for:
Your name
Your professional affiliation, e.g. institution, company, firm…
Your role where you work
Your background as a legal professional
This will help us know who we are collaborating with; from the pool of candidates, we will select participants for this early study.
You will be sent a user name and password so you can login to Teamware.
We respect your privacy. We are only interested in data in the aggregate and will not reveal any personal data to third parties.
— Next —
We have an instructional video that you can open in a new tab or window and that uses QuickTime. It lasts about 14 minutes. This will give you a good idea of what you will be doing. The presenter is Adam Wyner. You can see this here:
Or follow the link on YouTube — Crowdsourcing Legal Case Annotation Instructional Video. Please view in a large (ok definition) or full screen (grainy definition) mode, which may need to be reloaded in YouTube.
There are additional points about using the tool in the section below on questions, problems, and observations.
After reading this blog, viewing the instructional video, and receiving your username and password, you can log in to begin annotating at GATE Teamware.
— Survey —
When you are done with your task, please answer the questions on the survey to give us feedback on your experience using the annotation tool. The survey is available below. You can scroll down and answer the questions. Don’t forget to hit the “Done” button to submit your responses, which will be very useful in helping us understand your experience and thoughts about using the tool:
— What Then? —
We analyse the annotations from several annotators, comparing and contrasting them (interannotator agreement). This will show us similarities and differences in the understanding of the annotations and cases. As well, the results will help us develop a Gold Standard Corpus of legal cases, which are annotations of cases that annotators agree on. A Gold Standard is essential for information extraction and the development of advanced processing. We will publicly report the analysis of the exercise and make the annotated cases publicly available for re-use.
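As an illustration of how interannotator agreement might be measured, the Python sketch below computes Cohen's kappa for two annotators who labelled the same sequence of text segments, and keeps the segments they agree on as candidate gold-standard labels. Cohen's kappa is one common agreement measure and is an assumption here, not necessarily the measure used in the study; the labels are invented.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[lab] * count_b[lab] for lab in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)

def agreed_labels(labels_a, labels_b):
    """Items on which both annotators agree: candidate gold-standard labels."""
    return {i: a for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a == b}

annotator_1 = ["Holding", "Rationale", "Dicta", "Holding"]
annotator_2 = ["Holding", "Rationale", "Rationale", "Holding"]

kappa = cohen_kappa(annotator_1, annotator_2)
gold = agreed_labels(annotator_1, annotator_2)
print(round(kappa, 3), gold)
```

Here the annotators agree on three of four segments (0.75 observed agreement) against 0.375 expected by chance, giving a kappa of 0.6; the one disputed segment is left out of the candidate gold standard.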
Once we have a better sense of how this study goes, we plan to roll out a larger version with more cases. And this is only the start…
— Questions, Problems, and Observations —
Thanks to participants for letting us know about their problems and sending their observations.

How easy is it to learn to use the tool?
Take a look at the video to get a sense of this. With a little bit of practice, it is rather straightforward.

What if I don’t agree with some of your annotations or features?
Write a comment or send us an email, and we will consider your comment. Try to be as clear and specific as you can. We are not lawyers, and we are dealing with a global community with local variation, so it is likely there will be some disagreement and variation.

Can I get the results of my annotations?
Our approach is to make individual contributions to the whole. So, you will be able to access annotated cases after the exercise. There will be further information on how to work with the material.

How many cases must I do?
You can do one or you can do as many as we have (not many in the beta project).

How much time will it take?
About as long as it would take you to do a similar highlighting and annotation task with paper and markers.

What if I have a problem with using the tool or if the tool is buggy?
Be patient and try to work with the tool. Sometimes things go wrong. Write a comment or send us an email, and we will try to advise. Note – we are only consumers of GATE Teamware, so are not responsible for the system.

How thoroughly should I annotate the cases?
The more cases that are annotated fully and accurately, the better. Apply the same diligence as you would to thoroughly and carefully analyse cases with pen and paper. As you will be the beneficiary of the work of others, so too should you work to benefit them.

Do we track good annotators and bad annotators?
We are interested in data in the aggregate, and are only interested in interannotator agreement and disagreement. This information will help us better understand differences in how the cases are understood and annotated. But, we can see how much time each person takes with each annotation task and measure how they perform against other annotators or a gold standard. If we have bad annotators, we will see this in the results; we would contact the annotator and see how best to improve the situation. As we noted above, we are not sharing information with third parties.

I cannot login with the username and password.
Please let me know if you have this problem, and I will look into it.

I can login, but I cannot get the java webstart file to start.
This is a tough problem to address over the internet. Some people have no problem, but some do. Please let me know if you have this problem. Do check that you have followed the instructions (on the blog and in the video).

I can login and start the annotation tool, but I cannot get the task.
Please let me know, and I will look into it.

The text is too small and single spaced.
At the moment, there is nothing we can do about this. We’ll try to keep this in mind for the future.

The highlighting tool is not easy to use. When I want to move from one annotated text to some new text, the tool doesn’t move to the new text.
This is a bit of a problem with the tool, which is not entirely reliable in its functionality. Try to play around with this to see what works for you. One strategy that I have found improves performance is to annotate something. Then the annotation type appears in the upper right-hand corner window among the list of annotations. When the problem occurs, it is sometimes a good idea to click the annotations in that window off and on (toggle them). This seems to clear the system a bit so that one can go on to the next annotation. Give this a try. If you have problems, please let me know.

I found it very challenging.
It is important to us to know this information to gauge how much text and what variety of annotations to present. We might reduce the number of annotations, breaking up the whole set into parts of the overall task.

Decision date is more important than hearing date, or at least should be provided in addition to hearing date.
Probably this will be added to future iterations.

A participant, e.g. “Cone”, was originally a defendant, but was dismissed out before this appeal. I wonder if he should still be coded as “Defendant” or if he should be coded as an other role-holder.
Good observation. I’ll have to consult with some lawyers further about this point.

There are sentences where the court introduced a fact and also appeared to reason using it. Is it right to code the whole sentence both as a legal fact and as a rationale?
Yes, this is the way to handle this. Double annotations are always possible.

A similar problem occurred where the court offered a fact but also put a gloss on it as to its legal significance.
Double annotations are always possible.

Some of the names of the categories were confusing or unclear. For example, using “Holding” for the name of the legal rule or principle was confusing (“Legal Rule” might be more intuitive).
This is another point that we will need to consult further with other lawyers. There may also be some variation in terminology.

There is sometimes unclarity about role-players. A case involved a plaintiff, who was an appellee but also a cross-appellant, and a defendant who was thus an appellant and cross-appellee. These can be coded where one is plaintiff and appellee and the other defendant and appellant. But, they could have both been coded as appellee and appellant, given the existence of the cross appeal.
Double (or more) annotating is fine.

Procedural History/Damages might be better framed as Procedural History/Remedies, as courts often provide injunctive relief or, as in this case, an accounting, as a remedy.
This is another point about terminology on which we will need to consult further with lawyers.

What if a case does not state any legal rules? Can implicit legal rules be annotated? For example, where novelty and non-obviousness are a sine qua non of a valid patent, one would not have known to mark some of the sentences as rationales.
This isn’t a problem. If something is not in the case, then it is not annotated. We are not (yet) concerned with implicit information. But, if you know the implicit information, then annotate it.

How can I automatically search for and annotate the same string with the same annotation?
In the instructional video, we wanted to keep the material short and to the point, so there are aspects of the annotation tool we did not cover. However, it is tedious to manually search for the same string and annotate it with the same annotation. Teamware’s Annotation Editor has a tool to support automatic search and annotation. To see how to do this, we have the video here:

How should I annotate holdings which may appear as holdings in cited cases and as part of the procedural history, as holdings in the current case, or as part of the rationale in the current case?
This is an interesting and subtle point for us, and we will have to have a full consultation with lawyers to decide. But, for the time being, there can be no harm in multiple annotations, which we can then look at and work with later.
— Paper —
If you are interested in some of the ideas behind this project, please see our paper: Semantic Annotations for Legal Text Processing using GATE Teamware
The paper will appear in May 2012 in the Proceedings of the LREC Conference Workshop on Semantic Processing of Legal Texts, Istanbul, Turkey. The exercise here is a version of the exercise proposed in the paper.
A shortlink to this blog page is: http://wyner.info/LanguageLogicLawSoftware/?p=1315
— Thanks for collaborating! —
— If you have any questions, please submit a comment! —
— Update Note —
Updated July 29, 2013 to reflect Dan Katz’s amended definitions for Holding. Updated in various ways July 12, 2013. The previous blog post of July 28, 2012 has been updated to note the participation of Dan Katz and his students of Michigan State University.
— Honour Roll —
For the very first study, we would like to thank the following individuals who gave of their time and intelligence to carry out their tasks.
A Special Issue of the Journal of Artificial Intelligence and Law on Modelling Policy-making
Special Issue Editors
Adam Wyner, University of Liverpool, adam@wyner.info
Neil Benn, University of Leeds, n.j.l.benn@leeds.ac.uk
Paper Submission Deadline: May 28, 2012
We invite submission of papers on modelling policy-making. Below we outline the intended audience, context, the topics of interest, and submission details.
Context
We live in an age where citizens are beginning to demand greater transparency and accountability of their political leaders. Furthermore, those who govern and decide on policy are beginning to realise the need for new governance models that emphasise deliberative democracy and promote widespread public participation in all phases of the policy-making cycle: 1) agenda setting, 2) policy analysis, 3) lawmaking, 4) implementation, and 5) monitoring. As governments must become more efficient and effective with the resources available, modern information and communications technology (ICT) is being drawn on to address problems of information processing in these phases. One of the key problems is policy content analysis and modelling, particularly the gap between, on the one hand, policy proposals and formulations that are expressed in quantitative and narrative forms and, on the other hand, formal models that can be used to systematically represent and reason with the information contained in the proposals and formulations.
Special Issue Theme
The editors invite submissions of original research about the application of ICT and Computer Science to the first three phases of the policy cycle – agenda setting, policy analysis, and lawmaking. The research should seek to address the gap noted above. The journal volume focusses particularly on using and integrating a range of subcomponents – information extraction, text processing, representation, modelling, simulation, reasoning, and argument – to provide policy making tools to the public and public administrators. While submissions about tool development and practice are welcome, the editors particularly encourage submission of articles that address formal, conceptual, and/or computational issues. Some specific topics within the theme are:
information extraction from natural language text
policy ontologies
formal logical representations of policies
transformations from policy language to executable policy rules
argumentation about policy proposals
web-based tools that support participatory policy-making
tools for increasing public understanding of arguments behind policy decisions
visualising policies and arguments about policies
computational models of policies and arguments about policies
integration tools
multi-agent policy simulations
Submission Details:
Authors are invited to submit an original, previously unpublished, research paper of up to 30 pages pertaining to the special issue theme. The paper should follow the journal’s instructions for authors and be submitted online. See the dropdown tab under the section FOR AUTHORS AND EDITORS.
Instructions for Authors on: https://www.springer.com/computer/ai/journal/10506
Submit Online on: https://www.springer.com/computer/ai/journal/10506
Each submitted paper will be carefully peer-reviewed for originality, significance, technical soundness, clarity of exposition, and relevance to the journal.
The shortlink to this webpage is: http://wyner.info/LanguageLogicLawSoftware/?p=1258
A PDF version of this CFP: CFP – Modelling Policy-making
Contact the special issue editors with any questions.