Downloads
Gene/protein interaction extraction baseline
While working on my PhD I stumbled across a dumb-but-effective baseline method for extracting gene/protein interactions from text. It is referred to as ‘Algorithm III’ in chapter 3 of my PhD thesis (PDF, 730KB) and is tested further in Kabiljo et al. 2009 (submitted).
The complete word list used in the paper is here. The script will be posted too, as soon as I’ve removed everything specific to the LLL Challenge data format.
Syntactic pattern matching with GraphSpider and MPL
Please check the GraphSpider/MPL website for software and data for syntactic pattern matching and information extraction.
Legacy data
If you are looking for downloads relating to my PhD thesis (PDF, 730KB) or any of my other published papers, see my old site at Birkbeck.
That content will be gradually migrated here, along with all new data and software.