Claire Grover

Senior Research Fellow in the Language Technology Group, which is part of the Institute for Language, Cognition and Computation (ILCC) in the School of Informatics.

Part-time Turing Fellow (Sept 2016 to Aug 2018) at the Alan Turing Institute.

Email: C.Grover@ed.ac.uk
Tel: +44 131 650 4441
Fax: +44 131 650 4587

Publications


Current and Recent Projects

The Edinburgh Geoparser

A system to automatically recognise place names in text and disambiguate them with respect to a gazetteer.

Robust large-scale text mining of UK healthcare records

Alan Turing Institute faculty fellowship. Building on previous work with Will Whiteley, I aim to extend text mining to a wide range of clinical reports by putting in place reusable, state-of-the-art tools and infrastructure, and by exploring machine learning methods to improve performance and enable adaptation to new clinical domains.

Administrative Data Research Centre - Scotland (ADRC-S)

ADRC-S involves world-leading experts in the theory, methods and policy of linking records for secondary uses. The Informatics team leads Work Package 6: Coding and classifying text - Extracting information from free text resources. We are investigating methods for enriching and contextualising administrative records by extracting information from unstructured text resources that can be associated with them.

Targeted treatment for acute stroke: development of prognostic models

Principal Investigator: Dr William Whiteley, Centre for Clinical Brain Sciences. We are developing a text mining system to analyse radiologists' reports of brain imaging (MRI and CT scans).

Reassembling the Republic of Letters

A digital framework for multi-lateral collaboration on Europe's intellectual history (1500-1800). EU COST Action. Member of working group on Space and Time.

Historical Texts: Geo-spatial Metadata

Geo-referencing of JISC's Historical Texts Collection.

S-CASE

The S-CASE project aims to create RESTful Web Services semi-automatically from multi-modal requirements, using a Model Driven Engineering methodology. The Edinburgh team works on parsing software requirements and on question answering to retrieve records of software artifacts from a repository.

Palimpsest

The Palimpsest project uses natural language processing technology, informed by literary scholars’ input, to text-mine literary works set in Edinburgh and to visualise the results in accessible ways.

Hiberlink

The focus of the Hiberlink project is to assess the extent of so-called 'reference rot'. This two-year study investigates how web links in online scientific and other academic articles fail to lead to the resources that were originally referenced.

Trading Consequences

The Trading Consequences project is a multi-institutional, international collaboration between environmental historians in Canada and computer scientists in the UK. It uses text-mining software to explore thousands of pages of historical documents on international commodity trading in the British Empire during the 19th century, particularly trade involving Canada, and the impact of that trade on the economy and environment.

BotaniTours

A SICSA Smart Tourism project. BotaniTours is a service that provides access to information about plants and gardens within a given locality.

DEEP (The Digitisation of English Placenames)

This project aimed to digitise the 86 volumes of the Survey of English Place-Names, a county-by-county survey started in 1922 by the specialists of the English Place-Name Society (EPNS). Our role was to convert the semi-structured text into a fully structured resource, from which a historical gazetteer is derived for use with the Edinburgh Geoparser. A browsable view of the gazetteer can be found at placenames.org.uk, and the derived data is distributed by JISC under a CC-BY-NC licence from mads.digitalresources.jisc.ac.uk/mads2017/.

SYNC3 (Synergetic Content Creation and Communication)

The goal of SYNC3 is to create a framework for structuring the extensive user-provided content that is located in personal blogs and refers to running news issues, making that content more accessible, and enabling its collaborative creation. Funded by the European Union's 7th Framework Programme: Information and Communication Technologies (ICT).

GeoDigRef

GeoDigRef is a short project investigating the advantages of metadata enrichment across three diverse resource collections funded under the JISC Digitisation programme.

TXM

Text Mining for Biomedical Content Curation

EASIE (Edinburgh And Stanford Information Extraction)

Combining Shallow Semantics and Domain Knowledge: the project builds on existing techniques for information extraction (IE) in order to develop and implement improved methods for extracting semantic content from text.