PhD Projects in AI and Law

Proposals

Below is a range of PhD projects in AI and Law. Other project proposals in the general AI and Law topic area are also welcome. I am interested in NLP (information extraction, semantic representation, controlled natural languages, classification, chatbots, dialogue/discourse), argumentation, various forms of legal reasoning (case-based reasoning, client consultations, legislation/regulations, court proceedings, contracts), and ontologies/knowledge graphs.

An Argument Chatbot

Chatbots are a popular area of development. In this project, you develop a chatbot for policy-making, legal consultations, or scientific debates. The chatbot should be capable of various dialogue types such as information-seeking, deliberation, and persuasion; in addition, the dialogue should be tied to patterns of argument and critical questions. The underlying techniques will be natural language processing (rule-based and machine learning), structured argumentation, and knowledge representation and reasoning. The project may be done in collaboration with IBM UK, working with IBM scientists and engineers.

Argument Extraction and Reconstruction

The goal is to identify textual passages which indicate argument and rhetorical structure (premises, claim, continuation) or argumentation schemes (patterns of everyday reasoning such as Expert Witness, Practical Reasoning, Commitment, etc.). The student will review some background literature, analyse a selection of argumentation schemes, identify the particular elements to be extracted using an NLP tool, create the processing components, carry out a small evaluation exercise, and connect the NLP output to a computational argumentation tool. The particular corpus of text is to be determined.

Textual Entailment

Textual entailment is about taking a sentence or passage and drawing inferences from it; for example, the sentence “Bill turned off the light” implies “The light was off”. Several NLP tools for textual entailment are available. In this project, the student will apply these tools to a corpus, evaluate them against a “gold standard”, then modify a tool to improve performance. There are existing corpora of texts for training and evaluating textual entailment.
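The evaluation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `overlap_baseline` is a deliberately naive, hypothetical stand-in for a real entailment tool, and the four gold-standard items are invented for the example rather than taken from any existing corpus.

```python
from typing import Callable, List, Tuple

# One gold-standard item: (text, hypothesis, does the text entail the hypothesis?)
GoldItem = Tuple[str, str, bool]


def overlap_baseline(text: str, hypothesis: str, threshold: float = 0.6) -> bool:
    """Hypothetical baseline: predict entailment when a large enough share of
    the hypothesis words also occurs in the text."""
    text_words = set(text.lower().split())
    hyp_words = hypothesis.lower().split()
    if not hyp_words:
        return True
    shared = sum(1 for word in hyp_words if word in text_words)
    return shared / len(hyp_words) >= threshold


def accuracy(predict: Callable[[str, str], bool], gold: List[GoldItem]) -> float:
    """Fraction of gold items on which the prediction matches the label."""
    correct = sum(1 for text, hyp, label in gold if predict(text, hyp) == label)
    return correct / len(gold)


# Invented toy gold standard, for illustration only.
gold: List[GoldItem] = [
    ("Bill turned off the light", "The light was off", True),
    ("Bill turned off the light", "Bill turned off the light", True),
    ("Bill turned off the light", "Mary opened the window", False),
    ("Bill never turned off the light", "Bill turned off the light", False),
]

print(accuracy(overlap_baseline, gold))  # 0.75 on this toy data
```

The last gold item shows why word overlap is not enough: the baseline ignores the negation “never” and wrongly predicts entailment, which is the kind of error a modified tool would need to address.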

Contrast Identification

Debates express contrasting positions on a particular topic of interest. A key problem is to determine the semantic contrariness of the positions as expressed by statements within the positions. Such a task is relatively easy for people to do, but difficult for automated identification since there are many linguistic ways to express contrasts, some of which may be synonymous. Annotation of contrast would help support semi-automatic construction of arguments and counter-arguments from text. The student will review some background literature, analyse a selection of contrasting expressions, identify the particular elements to be extracted using an NLP tool, create the processing components, carry out a small evaluation exercise, and connect the NLP output to a computational argumentation tool.

Classification of Legal Texts

In legal texts such as legislation or case law, different segments of the text serve different purposes. For example, one portion may be a statement of facts, while another is a statement of a rule. The project specifies the portions and classifications of a corpus of legal texts, creates a gold standard, then applies machine learning techniques to classify the portions to a high level of accuracy. Another topic within this area is legal decision prediction, wherein legal decisions (cases) are classified in various ways.

Bar Exam

In the US, to become a lawyer, a Bar Exam must be taken and passed. The Bar Exam consists of 200 multiple-choice questions, covering an extensive range of legal topics. The task in the project is to classify, using machine learning, the questions in the Bar Exam and to design a system to pass the Bar Exam, using techniques from NLP, logic, and machine learning.

A Controlled Natural Language with Defeasibility

Controlled Natural Languages (CNLs) are standardised, formal subsets of a natural language (such as English), which are both human readable and machine processable. Several CNLs have been developed, such as IBM’s ITA Controlled English (CE) or Business Rules Language (BRL), OMG’s Semantics of Business Vocabulary and Business Rules, and several academic languages. Some CNLs support ‘strict’ reasoning for ontologies, terminologies, or Predicate Logic, which is sufficient in many contexts. However, defeasible reasoning is essential in other contexts where there is inconsistent and partial knowledge, such as in political, legal, or scientific debates. The project explores the representation of and reasoning with defeasibility in a CNL, which could lead to a CNL that has much wider applicability and impact. The project can be done in collaboration with IBM UK, working with IBM scientists and engineers. The project can be either a theoretical study or an implementation (or a mixture of both). The supervisor has extensive background in CNLs and argumentation/defeasibility.
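To make the contrast with ‘strict’ reasoning concrete, here is a minimal Python sketch of defeasibility. It rests on assumptions that are not part of the project description: rules are applied in a single step (no chaining), conflicts are resolved by a numeric priority, and the `Rule` type and the contract example are hypothetical. Established defeasible logics and argumentation frameworks are considerably richer than this.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Rule:
    premises: FrozenSet[str]  # facts that must all hold for the rule to apply
    conclusion: str           # a literal such as "valid" or "not valid"
    priority: int             # assumption: the higher priority wins a conflict


def negate(literal: str) -> str:
    """Return the complementary literal ("valid" <-> "not valid")."""
    return literal[4:] if literal.startswith("not ") else "not " + literal


def concludes(literal: str, facts: FrozenSet[str], rules: List[Rule]) -> bool:
    """True if an applicable rule supports `literal` and every applicable rule
    for its negation has strictly lower priority."""
    applicable = [r for r in rules if r.premises <= facts]
    support = [r.priority for r in applicable if r.conclusion == literal]
    attack = [r.priority for r in applicable if r.conclusion == negate(literal)]
    if not support:
        return False
    return not attack or max(support) > max(attack)


# Hypothetical rules: a signed contract is normally valid,
# but not when the signatory is a minor.
rules = [
    Rule(frozenset({"signed"}), "valid", priority=1),
    Rule(frozenset({"signed", "minor"}), "not valid", priority=2),
]

print(concludes("valid", frozenset({"signed"}), rules))           # True
print(concludes("valid", frozenset({"signed", "minor"}), rules))  # False
```

Adding the fact `minor` withdraws a conclusion that held before, which strict (monotonic) reasoning cannot do. A CNL with defeasibility would need surface syntax for such exceptions, for example a construction along the lines of “unless”.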

Rule Extraction from Legislation or Case Law

Legal texts (legislation, regulations, and case law) provide the “operational legal rules” for businesses, organisations, and individuals. It is important to be able to identify and extract such rules, particularly for rulebook compliance or to transform rules in natural language into machine-readable, executable rules. The student will analyse a selection of regulations, identify the particular elements to be extracted using NLP, create the processing components, translate rules from natural language to executable rules, draw inferences, and evaluate the results.
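As a rough illustration of the extraction step, the Python sketch below pulls the parts of a conditional obligation out of a single sentence. The sentence pattern, the field names, and the example sentence are all assumptions made for this sketch; real regulations need far more robust NLP than one regular expression.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern:
#   "If <condition>, (then) <subject> must|shall|may (not) <action>."
PATTERN = re.compile(
    r"^If (?P<condition>.+?), (?:then )?(?P<subject>.+?) "
    r"(?P<modality>must not|shall not|must|shall|may) (?P<action>.+?)\.?$",
    re.IGNORECASE,
)


def extract_rule(sentence: str) -> Optional[Dict[str, str]]:
    """Return the rule's parts as a dict, or None if the sentence does not
    fit the assumed pattern."""
    match = PATTERN.match(sentence.strip())
    return match.groupdict() if match else None


# Invented example sentence, not taken from any actual regulation.
rule = extract_rule(
    "If a customer requests a refund, the firm must respond within 14 days."
)
print(rule)
```

The resulting dictionary separates the condition, the subject, the deontic modality, and the action, which is one possible intermediate form before translating the rule into an executable rule language.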

An Expert System to Support Reasoning in Juries

Jury trials are a fundamental aspect of the Common Law legal system in the UK and USA. In jury trials, jurors are members of the public who are required to reason about the facts of the case and about the legal rules to arrive at a decision (e.g. whether the defendant is guilty or not guilty). This is a difficult and important task for a person who is not schooled in the law. Fortunately, in some jurisdictions, there are standardised “catalogues” of jury instructions to guide the jurors in how to reason. In this project, the student analyses a selection of jury instructions and implements them as an interactive juror decision support tool.

Legal Case-Based Reasoning

Case-based reasoning solves a new problem by comparing it with previously decided cases. Legal case-based reasoning, in which a court decides a new case by reference to precedent cases, is central to legal reasoning in the courts of the UK and the USA. The project will implement several existing formalisations of legal case-based reasoning.

Logical Formalisations of the Law

The law can be formalised in a variety of ways, and there are tools and techniques to support the task. Such formalisations can be queried and inferences drawn. The project will examine existing tools, see what can be improved, and provide fragments of formalised law.

Legal Ontologies/Knowledge Graph

In an ontology/KG, domain knowledge about entities, their properties, and their relations is formally represented. Such representations facilitate querying, extraction, linking, and inference. There are legal ontologies/KGs that represent the law, legal processes, and legal relationships. The project will examine existing legal ontologies, augment them, and build a richer ontological representation using existing tools.
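The idea of representing knowledge as relations and then querying and inferring over them can be sketched with plain Python tuples. The class names, the predicates `subclass_of` and `instance_of`, and the entity `ExampleAct` are hypothetical; an actual project would more likely use an existing ontology language and toolchain.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical fragment of a legal knowledge graph.
TRIPLES: Set[Triple] = {
    ("Statute", "subclass_of", "Legislation"),
    ("Legislation", "subclass_of", "LegalSource"),
    ("CaseLaw", "subclass_of", "LegalSource"),
    ("ExampleAct", "instance_of", "Statute"),
}


def query(s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the given fields; None is a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def superclasses(cls: str) -> Set[str]:
    """Infer all direct and indirect superclasses of a class."""
    found: Set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for _, _, parent in query(s=current, p="subclass_of"):
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def types_of(entity: str) -> Set[str]:
    """Asserted classes of an entity plus everything inferred from them."""
    direct = {o for _, _, o in query(s=entity, p="instance_of")}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls)
    return inferred


print(sorted(types_of("ExampleAct")))  # ['LegalSource', 'Legislation', 'Statute']
```

Only the fact that `ExampleAct` is a `Statute` is stated explicitly; that it is also `Legislation` and a `LegalSource` is inferred from the subclass links.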

Abstract Contract Calculator in Haskell

Create a program in Haskell, a functional programming language, to execute ‘theoretical’ legal contracts, that is, contracts that have the form of an actual legal contract but not the content.