NSF-sponsored workshop: Automated Content Analysis and the Law

I was invited to participate in an NSF-sponsored workshop, Automated Content Analysis and Law, on August 3 and 4 at NSF HQ in Arlington, VA, organised by Georg Vanberg (UNC).
Two sessions are planned. The first will focus on identifying the theoretical and substantive puzzles in legal and judicial scholarship that might benefit from automated content analysis, as well as the data and measurements required. The second will focus on the state of automated content analysis and natural language processing, exploring how far current technology can address the issues raised in the first session and what further development might be needed.
There is an interesting mix of people, with a strong emphasis on legal scholarship bearing on the US Supreme Court and on opinion mining. I had an email exchange about this with Georg, the workshop organiser, and we agree that attention ought to turn from the Supreme Court to lower levels of the legal system. I also suggested that participants consider the following points, which bear on the motives and objectives of these lines of research: who is being served, and how the data or conclusions would be used.
Questions for Discussion

  • What sorts of artifacts and technologies (if any) will emerge from the research?
  • How does the research relate to the Semantic Web?
  • What public service does the research provide or support?
  • How does this research relate to:
    • E-discovery
    • Textual legal case based reasoning
    • Legislative XML Markup
    • Other research communities e.g. ICAIL and JURIX

Participants


  • Scott Barclay (NSF) – Barclay@uamail.albany.edu
  • Cliff Carrubba (Emory) – ccarrub@emory.edu
  • Skyler Cranmer (UNC) – skylerc@email.unc.edu
  • Barry Friedman (NYU) – friedmab@juris.law.nyu.edu
  • Susan Haire (NSF) – shaire@nsf.gov
  • Lillian Lee (Cornell) – llee@cs.cornell.edu
  • Jimmy Lin (Maryland) – jimmylin@umd.edu
  • Stefanie Lindquist (Texas) – SLindquist@law.utexas.edu
  • Will Lowe (Nottingham) – will.lowe@nottingham.ac.uk
  • Andrew Martin (Wash U) – admartin@wustl.edu
  • Wendy Martinek (NSF) – wemartin@nsf.gov
  • Kevin McGuire (UNC) – kmcguire@unc.edu
  • Wayne McIntosh (Maryland) – wmcintosh@gvpt.umd.edu
  • Burt Monroe (Penn State) – blm24@psu.edu
  • Kevin Quinn (Harvard) – kevin_quinn@harvard.edu
  • Jonathan Slapin (Trinity College) – jonslapin@gmail.com
  • Jeff Staton (Emory) – jkstato@emory.edu
  • Georg Vanberg (UNC) – gvanberg@unc.edu
  • Adam Wyner (University College London) – adam@wyner.info

General Architecture for Text Engineering Summer School

Next week I’m attending a week-long summer school on the General Architecture for Text Engineering (GATE). GATE is an open-source, extensible toolkit for text mining that has been used in a variety of areas. Having worked with people who had their “hands on” the tools, I decided it would suit me better to be able to work with the material myself. I’ve been looking forward to this summer school for some time and am excited at the prospect of applying GATE tools to a database of legal cases, as well as developing an ontology.