Recent THiNK Workshop at Aberdeen

I was a co-organiser, with Prof. Barbara Fennell, of Policy-making, Text Analysis, and Big Data: A Workshop in Digital Humanities and Knowledge Exchange, which was held on June 30 in the Sir Duncan Rice Library at the University of Aberdeen.
Sir Duncan Rice Library, University of Aberdeen.
The THiNK network is for Knowledge Exchange in the Arts and Humanities in the UK. It provides a forum in which various parties exchange knowledge about funding ideas and opportunities. I presented at a recent THiNK event in London; see also my previous post.
The follow-on workshop, Policy-making, Text Analysis, and Big Data: A Workshop in Digital Humanities and Knowledge Exchange, focussed on issues relating to text analysis, policy-making, and Big Data. The underlying idea is that textual data must be harvested on a large scale in order to assist with public policy-making. The full announcement, slides, and further notes about the workshop are at the link above; below is some extracted information.

Workshop description:

Policy-making and the law are fundamental to communal life and social progress. Given that policies and law are expressed in language and in social contexts, they are a natural “object” of study in the Humanities. One new approach is to apply current text analytic and information retrieval tools to better understand the substance of policy documents, deliberative discourse, and related documents. More broadly, textual analysis and retrieval are at the heart of a range of interdisciplinary and applied research; they are a key element of Digital Humanities. While small-scale studies are feasible and illuminating, it is essential to scale up research to handle the abundance of textual information, so-called ‘Big Data’. We have organised a workshop of speakers and discussion sessions to consider the state of the art in policy-making, textual analysis, and Big Data as well as the opportunities for cross-disciplinary research and development. The workshop brings together academic researchers, SMEs, and the Public Sector to exchange knowledge and outline project proposals in Digital Humanities.


Professor Barbara Fennell, Department of Linguistics

Dr Adam Wyner, Department of Computing Science

Workshop Schedule:

  • 12:00-12:30 Registration/Lunch
  • 12:30-13:30 Session 1 Public Policy-making Practice (C. Cottrill) and Policy-making Support Tools (A. Wyner)
  • 13:30-14:30 Session 2 Deliberative Democracy in Action (M. Oliver) and Text Analysis, News Media, and Psychiatry (N. Akhtar)
  • 14:30-15:00 Coffee break
  • 15:00-16:00 Session 3 Big Data (A. Goker)
  • 16:00-17:00 Roundup
Presenters:

  • Caitlin Cottrill, Lecturer, Department of Geography and Environment, University of Aberdeen. Caitlin will outline her knowledge about and experience in a range of policy-making contexts, particularly in domains of transportation and the environment. She will discuss some current issues and trends in policy-making.
  • Nooreen Akhtar, Research Training Fellow, Department of Applied Medicine, University of Aberdeen. Nooreen will discuss her investigations of how patients, the public, and stakeholders perceive and interpret information about anti-depressants in UK newspapers. Her work uses computational linguistic analysis and face-to-face interviews.
  • Matthew Oliver, Unlock Democracy. Unlock Democracy promotes deliberative, participatory, and transparent democratic activities by organising meetings and making available web-based tools to inform the public. Matthew is a Press and Project Manager and National Coordinator at Unlock Democracy. He will discuss aspects of Unlock Democracy and deliberative democracy.
  • Adam Wyner, Lecturer, Department of Computing Science, University of Aberdeen. Adam’s research interests are in the intersection of Law, Logic, Computer Science, and Language. Adam will present aspects of web-based tools to support deliberative, public policy-making, along with the analysis of legal materials.
  • Ayse Goker, Professor, School of Computing Science and Digital Media, Robert Gordon University. Ayse’s research interests are driven by a desire to research and improve information access and retrieval for users. Ayse has been the Principal Investigator of a range of UK and EU projects. Most recently, all the Scottish University Computing schools are partners through SICSA on the Innovation Centre bid for Data Science, with Robert Gordon University as its proposed NorthEast hub.
    By Adam Wyner

    This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

    Tutorial on "Textual Information Extraction from Legal Resources" at the 16th International Conference on Artificial Intelligence and Law, Rome, Italy


    Legal resources such as legislation, public notices, case law, and other legally relevant documents are increasingly freely available on the internet. They are almost entirely presented in natural language and in text. Legal professionals, researchers, and students need to extract and represent information from such resources to support compliance monitoring, analyse cases for case-based reasoning, and extract information in the discovery phase of a trial (e-discovery), amongst a range of possible uses. To support such tasks, powerful text analytic tools are available. The tutorial presents an in-depth demonstration, with examples, of one toolkit, the General Architecture for Text Engineering (GATE), and several briefer demonstrations of other tools.


    Participants in the tutorial should come away with some theoretical sense of what textual information extraction is about. They will also see some practical examples of how to work with a corpus of materials, develop an information extraction system using GATE and the other tools, and share their results with the research community. Participants will be provided with information on where to find additional materials and learn more.

    Intended Audience

    The intended audience includes legal researchers, legal professionals, law school students, and political scientists who are new to text processing, as well as experienced AI and Law researchers who have used NLP but wish to get a quick overview of using GATE.

    Covered Topics

    • Motivations to annotate, extract, and represent legal textual information.
    • Uses and domains of textual information extraction. Sample materials from legislation, case decisions, gazettes, e-discovery sources, among others.
    • Motivations to use an open source tool for open source development of textual information extraction tools and materials.
    • The relationship to the semantic web, linked documents, and data visualisation.
    • Linguistic/textual problems that must be addressed.
    • Alternative approaches (statistical, knowledge-light, machine learning) and a rationale for a particular bottom-up, knowledge-heavy approach in GATE.
    • Outline of natural language processing modules and tasks.
    • Introduction to GATE – loading and running simple applications, inspecting the results, refining the search results.
    • Development of fragments of a GATE system – lists, rules, and examination of results.
    • Discussion of more complex constructions and issues such as fact pattern identification, which is essential for case-based reasoning, named entity recognition, and structures of documents.
    • Introduction to ontologies.
    • Linking textual information extraction to ontologies.
    • Introduction to related tools and approaches: C&C/Boxer (parser and semantic interpreter), Attempto Controlled English, scraperwiki, among others.
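To give a flavour of the "lists and rules" style of extraction named in the topics above, here is a minimal sketch in Python. It is not GATE and does not use GATE's API or its rule language; the word lists, the function names `annotate` and `role_before_court`, the `max_gap` parameter, and the sample sentence are all invented for illustration. It assumes only that a list-based lookup produces typed annotations and that a rule then matches a pattern over those annotations.

```python
import re

# Hypothetical word lists, standing in for the plain-text lists a toolkit would load.
GAZETTEER = {
    "court": ["Supreme Court", "Court of Appeal", "High Court"],
    "role": ["plaintiff", "defendant", "appellant", "respondent"],
}


def annotate(text):
    """Return (start, end, type, matched_text) tuples for every list entry found."""
    annotations = []
    for label, entries in GAZETTEER.items():
        for entry in entries:
            pattern = r"\b" + re.escape(entry) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                annotations.append((match.start(), match.end(), label, match.group(0)))
    # Sort by position so that rules can scan annotations left to right.
    return sorted(annotations)


def role_before_court(annotations, max_gap=40):
    """Toy rule: pair a 'role' with the first 'court' starting within max_gap characters after it."""
    pairs = []
    for i, (_, role_end, role_type, role_text) in enumerate(annotations):
        if role_type != "role":
            continue
        for court_start, _, court_type, court_text in annotations[i + 1:]:
            if court_type == "court" and 0 <= court_start - role_end <= max_gap:
                pairs.append((role_text, court_text))
                break
    return pairs


sample = ("The appellant applied to the Court of Appeal "
          "after the High Court ruled for the defendant.")
found = annotate(sample)
pairs = role_before_court(found)
print(found)
print(pairs)
```

The lookup step is knowledge-heavy in the sense that every entry is supplied by hand; the rule step works only over the resulting annotations, never over the raw text, which is what makes the rules easy to inspect and refine.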

    Date, Time, Location, and Logistics

    Monday, June 10, afternoon session.
    The tutorial was held at the Casa dell’Aviatore, viale dell’Università 20 in Rome, Italy.
    Information about the conference is available at the website for the 16th International Conference on Artificial Intelligence and Law (ICAIL).


    The slides from the presentation are available here:
    Textual Information Extraction from Legal Resources

    Further Information

    Contact the lecturer.


    Dr. Adam Wyner
    Lecturer, Department of Computing Science, University of Aberdeen
    Aberdeen, Scotland
    azwyner at abdn dot ac dot uk
    The lecturer has a PhD in Linguistics, a PhD in Computer Science, and a research background in computational linguistics. The lecturer has previously given tutorials on this topic at JURIX 2009 and ICAIL 2011, along with an invited talk at RuleML 2012, has published several conference papers on text analytics of legal resources using GATE and C&C/Boxer, and continues to work on text analysis of legal resources.
    By Adam Wyner