General Architecture for Text Engineering Summer School

Next week I’m attending a week-long summer school on the General Architecture for Text Engineering (GATE). GATE is an open-source, extensible toolkit for text mining which has been used in a variety of areas. After having worked with people who had their “hands on” the tools, I decided it would suit me better to be able to work with the material myself. I’ve been looking forward to this summer school for some time and am excited at the prospect of applying GATE tools to a database of legal cases as well as developing an ontology.

Further Considerations on "The End of Lawyers"

In a previous post on Susskind’s The End of Lawyers, I briefly outlined some of the technologies Susskind discusses, then pointed out several fundamental technologies which he does not discuss in depth, but which I believe will have significant impact on the legal profession.
One topic which I did not discuss in that post was why the legal profession is so slow to adopt legal technology. Susskind points out several reasons:

      Billable hours: the legal profession makes money by the hour, which is a disincentive to make legal processes more efficient.
      Conservatism and status: the legal profession has a long-established and distinguished position, and it changes slowly.
      Government funding: while governments may recognise the value of legal technologies, investing in them is another matter (though see the recent e-Government awards in the previous post).
      Information availability: only recently have legal documents (legislation, cases, other government information) been made publicly and electronically available.

I think (and believe my colleagues in AI and Law would agree) that these are very significant contributing factors in the slow adoption of legal technologies by legal professionals, firms, and governments. But there are others, and I believe that by identifying them, we can make progress in addressing the problems that they raise.
To help us clarify the issues, we can compare and contrast the legal profession with another very ancient and prestigious profession: medicine. Doctors, medical organisations, and medical researchers have adopted and advanced technologies very rapidly and on a large scale, yet the technologies are, at their core, similar to those available to legal professionals. Technologically, therefore, there is little reason why the legal profession has not also adopted the technologies or more aggressively sought to adapt them.
While there are systems to support reasoning by doctors and the filing and retrieval of medical records, let me focus on two technologies which are equally available, fundamental, and influential to the legal and medical professions: information extraction and ontologies.
In the medical field, there are large corpora of textual information from which relevant information must be extracted. The corpora are and have been publicly available for some time. There are academic and industry groups that have developed, and continue to develop, software systems to extract the information (e.g. National Centre for Text Mining and Linguamatics, among others). Following links from these groups, one can find conferences, other groups, and government organisations; the interest is deep, widespread, and of high value. Moreover, medical ontologies are far advanced, such as the Systematised Nomenclature of Medicine Clinical Terms and the Foundational Model of Anatomy, among others.
In the legal field, corpora of textual information are only just beginning to become available. There has been some research on information extraction in the legal field. There has been some work on legal ontologies (e.g. LKIF, which was an EU project that I participated in).
In both areas — information extraction and ontologies — the medical field far outstrips the legal field. Why?
I think the differences are not so much those outlined above; one could argue that the medical and legal fields have had, at least historically, similar constraints, and the medical field has simply overcome them. The most apparent difference is that research medicine has been and continues to be advanced by scientific and technological means. Other research fields (biology, chemistry, statistics, anatomy) have made relevant contributions. Moreover, the medical field has large research bodies that are well-funded (e.g. The Wellcome Trust). Finally, the culture of medical research and application of findings is such that information is disseminated, criticised, and verified. Let us put these into four points:

  • Scientific and technological approach
  • Contributions from other fields
  • Research bodies
  • Culture of research

In these respects, the legal field is very different to the medical field. Science and technology have not, until very recently, been relevant in terms of how the law is practiced. While there have been some contributions from other fields (e.g. sociology or psychology), the impact is relatively low. There are research bodies, but they are not of the scale or influence of those in medicine. And the disposition of the legal community has been to hold information closely.
I believe that there is a single (though very complex) underlying reason for the difference: the object of study. In medicine, the objects of study are physical, whether in chemistry, biology, anatomy, etc.; these objects are and have been amenable to scientific study and technological manipulation. In contrast, in law, the object of study is non-physical; one might be tempted to say it is the law itself, but we can be more concrete and say it is the language in which the law is expressed, for at least language is something tangible and available for study.
Thus, the scientific study of language, Linguistics, is relevant. However, Linguistics as a scientific endeavour is relatively young (50 to 100 years, depending on one’s point of view). The technological means to study language can be dated to the advent of the digital computer, which could process language in terms of strings of characters. Widespread, advanced approaches to computational linguistics for information extraction are even more recent: 10 to 20 years. Very large corpora and the motives to analyse them arose with the internet. And not only must we understand the language of the law, but we must also understand the legal concepts as they are expressed in law. Here, the study of deontic reasoning, while advanced, is “only” some 20 years old and has found few applications (see my 2008 PhD thesis).
Language is the root of the issue; it can help explain some of the key differences in the application of technology to the legal field. In our view, as the linguistic and logical analyses of the language of the law advance, so too will applications, research bodies, and results. However, it is still somewhat early days and, in comparison to the medical field, there is much yet to be done.
Copyright © 2009 Adam Wyner

Susskind's "The End of Lawyers" is Part of the Story

Introduction
In this post, I briefly outline Richard Susskind’s background and elements from The End of Lawyers, and then turn to consider issues that Susskind is aware of but does not discuss in depth. These are issues which I believe are fundamental to how technology will impact legal practice, such as the semantic web, textual information extraction, ontologies, and open source databases of legal documents.
Background
Susskind specialises in how information and communication technology (ICT) is used by lawyers and public administrators. His website is:
www.susskind.com
Besides the important and general interest of his line of work, its prominence in the community of practicing legal professionals gives us a good indication of the sorts of technologies that community is and is not aware of.
Richard Susskind has been writing about ICT since the publication of his PhD thesis Expert Systems in Law (1987, Oxford University Press). He is among the early researchers in Artificial Intelligence and the Law. His subsequent books, The Future of Law and Transforming the Law, developed themes about the relation of ICT and the legal profession, focusing on the ways ICT would change the practice of law and the interactions among lawyers, government administrators, and the public. In addition to the books, Susskind consults widely, is an editor of the International Journal of Law and Information Technology, and is a law columnist for The Times. He is uniquely informed about the technologies that are available and how the legal community regards and uses them. This makes it all the more interesting to draw attention to what he does not discuss in depth.
His recent book The End of Lawyers has garnered a very significant amount of attention, and online excerpts along with comments can be found at:
The End of Lawyers
Legal Technology Tools
In this book, he develops and elaborates his main themes. He points out a range of technologies, briefly outlined below, which will contribute to changing the legal profession. As there is already substantial information online about his proposals, I will not repeat them in depth here, except to say that by and large I agree with many of the overt points he makes about the applicability of technology to the legal profession, as well as about why the legal profession has been and remains slow to take up ICT solutions.
Among the key technologies Susskind outlines, we find:

      Automated document assembly: structuring blocks of legal documents.
      Connectivity: email, fax, cell phones, Facebook, Twitter, blogs.
      Electronic legal marketplace: legal services advertised, rated, and traded.
      E-learning: lawyers and members of the public having the opportunity to learn about the law online.
      Online legal guidance: rather than meeting face-to-face with individual lawyers, a chance to read, learn about the law, and have questions addressed at different levels of formality.
      Legal open-sourcing: user generated content, free and unrestricted legal information (e.g. BAILII), legal wikis.
      Closed legal communities: collectives of lawyers, justices, or government officials exchanging information.
      Workflow and project management: using software and services to monitor and support the work of legal professionals. This includes case-management and electronic filing.
      Embedded legal knowledge: legal information and knowledge that is more readily transparent in daily interactions or that prevents non-compliance.
      E-disclosure: finding and processing documents and information relevant to the disclosure phase of a case.
      Online dispute resolution: systems to mediate and support the resolution of disputes.
      Courtroom annotation: transcribing and noting courtroom proceedings manually and automatically.
      Improving access to law: giving citizens more information and advice.

Engineering and Managing Legal Knowledge
In the course of the book, he says that the engineering and management of legal knowledge is central to these technologies, where:

      Legal knowledge management (p. 155): the systematic organization, standardization, preservation, and exploitation of the collective knowledge of a firm. It is intended to maximize the firm’s return on the combined experience of its lawyers over time.
      Legal knowledge engineer (p. 272): someone who carries out basic analysis, decomposition, standardization, and representation of legal knowledge in computer systems.

However, little is said about how the engineering and management are to be done other than that some of the technologies outlined above contribute to them.
What is said is largely by way of brief references to, or outlines of, additional issues such as the semantic web (p. 68), wikis (but not semantic wikis), online dispute resolution (but little on current developments), and open source legal information (e.g. BAILII, but not WorldLII).
More to the point, there is no discussion of research on key technologies such as:

      Legal ontologies, by which legal knowledge is formalised, acquired, processed, and managed.
      XML, which underlies the semantic web.
      Web-based inference systems.
      Textual information extraction, which is essential to make use of open source legal information.
      Rule-based systems, such as those provided by Oracle (previously known as Softlaw, RuleBurst, and Haley), which are prominently used by UK tax authorities.
      E-government services, such as Parmenides and DEMO-net, which go beyond providing information and submitting forms to allow some interaction.

These are all topics of central relevance to our blog and to the AI and Law community, which organises around the International Conference on AI and Law or Jurix.
We agree by and large with Susskind. However, there is much more which would be highly relevant and valuable to draw to the attention of the legal community. Moreover, it would be very valuable to the AI and Law community were his prominent and respected voice in legal and governmental circles to be heard advocating further for research such as that in AI and Law.