Information Extraction of Conditional Rules

In this post, we extract conditional rules, such as “If it rains, then the sidewalk is wet”, both from simple examples and from a sample fragment of legislation. (See introductory notes on this and related posts.)
Sample legislation
In legislation (and elsewhere in the law), conditional statements of the form “If P, then Q” are common. A well-researched example in AI and Law is the British Nationality Act 1981. In this post, we provide some initial JAPE rules to annotate conditional statements.
We work with several variants of simple conditional statements and a (modified) conditional statement from that Act. For each statement, we want to annotate it as a rule and to identify the parts of the rule.

    If Bill is happy, then Jill is happy.
    Jill is happy, if Bill is happy.
    Jill is happy if:

        1) Bill is happy;
        2) Bill and Jill are together.

    Acquisition by birth or adoption

        (1) A person born in the United Kingdom after commencement shall be a British citizen if —
        (a) at the time of the birth his father or mother is a British citizen; or
        (b) at the time of the birth his father or mother is settled in the United Kingdom.


Output
We want not only to identify a sentence as a rule, but also to identify the parts of the rule, namely the antecedent and the consequent. This may be useful for further processing.
The results appear in a graphic:
[Figure: Rule Output]
Below, we discuss some of the problems with annotating the legislative rule.
GATE
In the zip file we have the application state, text, graphic, and JAPE rules.
Lists
There are no particular lists for this section; we used the same lists from the rulebook development.
JAPE Rules
We have a cascade of rules as follows.

  • AntecedentInd01: finds the token “if” in the text. We use this as an indicator that the sentence is or may be a rule. We may have a range of such indicator rules, each picking out a cue that a sentence is a rule. We can use them to examine results from a body of texts, refining what is identified as a rule and how. Overgenerate, then prune. After we are clear about the results from individual rules, we can gather the annotations together under another annotation, which generalises the result.
  • AntecedentInd02: finds the conditional indicator inside a sentence and annotates the resulting sentence as a rule with a conditional. A general rule like this can be used as we refine the indicators of rules. It is also an example of annotating a sentence with respect to properties contained in the sentence.
  • ConditionalParts01: finds the string between “if” and some punctuation, then labels it antecedent. This labels “Bill is happy” as antecedent in simple sentences such as “If Bill is happy, then Jill is happy” and “Jill is happy, if Bill is happy”. It does not work for the list.
  • ConditionalParts02: finds the string between a preceding sentence and a comma followed by a conditional indicator, then labels it consequent. This labels “Jill is happy” as consequent in simple sentences such as “Jill is happy, if Bill is happy”.
  • ConditionalParts03: finds the string between “then” and the end of the sentence, labelling it consequent. This labels “Jill is happy” as consequent in simple sentences such as “If Bill is happy, then Jill is happy”.
  • ConditionalParts04: finds the string between a preceding sentence and a conditional indicator followed by a colon, then labels it consequent. This labels “Jill is happy” as consequent in constructions where the antecedents are presented in a list, such as “Jill is happy if: Bill is happy and Jill and Bill are together”.
  • ConditionalParts05: finds the strings between list indicators (see the section on legislative presentation) and some punctuation (semi-colon or period), and labels them as antecedents. This labels “Bill is happy” as antecedent in “Jill is happy if: Bill is happy and Jill and Bill are together”.
  • ConditionalSentenceClass: annotates sentences as conditionals if they contain a conditional indicator.
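
To make the division of labour among these rules concrete, here is a minimal Python sketch of the two simple sentence patterns. It is not the JAPE from the zip file: the function name annotate_conditional and the regular expressions are our own assumptions, and they stand in only loosely for AntecedentInd01, ConditionalParts01 to ConditionalParts03, and ConditionalSentenceClass.

```python
import re

# Hypothetical stand-ins for the JAPE rules described above; the patterns
# are assumptions made for illustration, not the rules in the zip file.
INDICATOR = re.compile(r"\bif\b", re.IGNORECASE)

# "If P, then Q."  (roughly ConditionalParts01 + ConditionalParts03)
IF_THEN = re.compile(r"if\s+(?P<p>[^,]+),\s*then\s+(?P<q>.+?)\.?$", re.IGNORECASE)

# "Q, if P."  (roughly ConditionalParts02 + ConditionalParts01)
Q_IF_P = re.compile(r"(?P<q>[^,]+),\s*if\s+(?P<p>.+?)\.?$", re.IGNORECASE)


def annotate_conditional(sentence):
    """Return the parts of a conditional sentence.

    Returns None when the sentence has no conditional indicator, and a dict
    with None values when an indicator is present but neither simple
    pattern applies (for example, the list construction).
    """
    text = sentence.strip()
    if not INDICATOR.search(text):
        return None
    match = IF_THEN.match(text) or Q_IF_P.match(text)
    if match is None:
        return {"antecedent": None, "consequent": None}
    return {
        "antecedent": match.group("p").strip(),
        "consequent": match.group("q").strip(),
    }


if __name__ == "__main__":
    for s in ["If Bill is happy, then Jill is happy.",
              "Jill is happy, if Bill is happy.",
              "Jill is happy if:"]:
        print(s, "->", annotate_conditional(s))
```

As with ConditionalParts01, the sketch finds the indicator in “Jill is happy if:” but cannot label its parts, which is why the list construction needs its own rules.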

Application order
The order of application of the processing resources is:

  • Document Reset PR
  • ANNIE English Tokeniser
  • ANNIE Sentence Splitter
  • ListFlagLevel1
  • AntecedentInd01
  • ConditionalParts01
  • ConditionalParts02
  • ConditionalParts03
  • ConditionalParts04
  • ConditionalParts05
  • ConditionalSentenceClass
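
The order matters because later resources consume annotations produced by earlier ones. The following Python sketch illustrates the idea with three toy stages; it is an analogy only, and the function names and the dictionary-based document are assumptions of ours, not GATE's API.

```python
import re


def split_sentences(doc):
    # Toy stand-in for the ANNIE Sentence Splitter: split after ., ! or ?
    parts = re.split(r"(?<=[.!?])\s+", doc["text"])
    doc["sentences"] = [p.strip() for p in parts if p.strip()]


def find_indicators(doc):
    # Toy stand-in for AntecedentInd01: indices of sentences containing "if".
    doc["indicators"] = [
        i for i, s in enumerate(doc["sentences"])
        if re.search(r"\bif\b", s, re.IGNORECASE)
    ]


def classify_sentences(doc):
    # Toy stand-in for ConditionalSentenceClass: needs the indicator stage.
    doc["conditionals"] = [doc["sentences"][i] for i in doc["indicators"]]


PIPELINE = [split_sentences, find_indicators, classify_sentences]


def run(text, stages=PIPELINE):
    """Apply each stage in order to a shared document dictionary."""
    doc = {"text": text}
    for stage in stages:
        stage(doc)
    return doc


if __name__ == "__main__":
    result = run("If Bill is happy, then Jill is happy. Bill is tall.")
    print(result["conditionals"])
```

Running classify_sentences before find_indicators fails with a KeyError in this sketch, since the annotations it depends on do not exist yet.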

Comments
While our application works for the simple samples of conditional statements, it does not do well with respect to our sample legislation. There is a range of problems: recognition of list markers of the form “(x)”, and the use of “;”, “–”, and “or”. Most of these have to do with the notion of lists that we inherited from the rulebook example, so we need to refine the rules for this particular context of use. We leave this as an exercise.
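As a starting point for that exercise, here is a small Python sketch of one way to handle the legislative list: it accepts markers such as “1)” and “(a)”, an optional colon or dash after “if”, and a trailing “; or” or “; and”. The names HEAD, ITEM, and parse_listed_rule and the patterns themselves are assumptions for illustration; they are not the JAPE rules from the zip file and have only been tried against the two list samples in this post.

```python
import re

# Head line: optional "(1)" marker, the consequent, then "if" followed by an
# optional colon, em dash, en dash, or hyphen at the end of the line.
HEAD = re.compile(
    r"^(?:\(\w+\)\s*)?(?P<q>.+?)\s+if\s*[:\u2014\u2013-]?\s*$",
    re.IGNORECASE,
)

# List item: marker such as "1)" or "(a)", the antecedent text, then an
# optional "; or" / "; and" connective or closing ";" / ".".
ITEM = re.compile(
    r"^\(?(?P<label>\w+)\)\s*(?P<body>.+?)\s*(?:;\s*(?P<conn>or|and))?\s*[;.]?\s*$",
    re.IGNORECASE,
)


def parse_listed_rule(lines):
    """Split a rule whose antecedents are given as a list.

    Returns None if the first non-empty line is not a recognised head.
    The connective is None when the text does not state one.
    """
    lines = [ln.strip() for ln in lines if ln.strip()]
    if not lines:
        return None
    head = HEAD.match(lines[0])
    if head is None:
        return None
    antecedents, connective = [], None
    for ln in lines[1:]:
        item = ITEM.match(ln)
        if item is None:
            continue  # not a list item; ignored in this sketch
        antecedents.append(item.group("body"))
        if connective is None and item.group("conn"):
            connective = item.group("conn").lower()
    return {
        "consequent": head.group("q"),
        "antecedents": antecedents,
        "connective": connective,
    }


if __name__ == "__main__":
    act = [
        "(1) A person born in the United Kingdom after commencement "
        "shall be a British citizen if \u2014",
        "(a) at the time of the birth his father or mother is a British citizen; or",
        "(b) at the time of the birth his father or mother is settled "
        "in the United Kingdom.",
    ]
    print(parse_listed_rule(act))
```

Note that the “or” inside “his father or mother” is left within the antecedent; only a connective that follows a semi-colon at the end of an item is treated as joining the list items.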
By Adam Wyner
Distributed under the Creative Commons
Attribution-Non-Commercial-Share Alike 2.0
