<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>biotext.org.uk &#187; Events</title>
	<atom:link href="http://biotext.org.uk/category/events/feed/" rel="self" type="application/rss+xml" />
	<link>http://biotext.org.uk</link>
	<description>Not a typewriter</description>
	<lastBuildDate>Mon, 06 Sep 2010 12:44:30 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>BioGeeks tech meet, Science HackDay special</title>
		<link>http://biotext.org.uk/biogeeks-tech-meet-science-hackday-special/</link>
		<comments>http://biotext.org.uk/biogeeks-tech-meet-science-hackday-special/#comments</comments>
		<pubDate>Tue, 15 Jun 2010 09:23:31 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[open science]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=475</guid>
		<description><![CDATA[This month&#8217;s BioGeeks meeting at KCL is on Friday, June 18th, to coincide with the Science HackDay taking place over the weekend.
We have a special guest this month, Cameron Neylon, with an open-science-themed talk entitled &#8220;What have the public done for us?&#8221; Plus lightning talks on various subjects.
In other news, I&#8217;ve moved the blog over [...]]]></description>
			<content:encoded><![CDATA[<p>This month&#8217;s <a href="http://biogeeks.wordpress.com/2010/06/02/june-tech-meet-fri-18th-kcl/">BioGeeks meeting at KCL</a> is on Friday, June 18th, to coincide with the <a href="http://sciencehackday.com/">Science HackDay</a> taking place over the weekend.</p>
<p>We have a special guest this month, <a href="http://cameronneylon.net/">Cameron Neylon</a>, with an open-science-themed talk entitled &#8220;What have the public done for us?&#8221; Plus lightning talks on various subjects.</p>
<p>In other news, I&#8217;ve moved the blog over to the much cleaner and airier <a href="http://www.plaintxt.org/themes/barthelme/">Barthelme</a> theme. Drop a comment if it gives you any problems.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/biogeeks-tech-meet-science-hackday-special/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>London BioGeeks &#8212; May Tech Meet is next week</title>
		<link>http://biotext.org.uk/london-biogeeks-may-tech-meet-is-next-week/</link>
		<comments>http://biotext.org.uk/london-biogeeks-may-tech-meet-is-next-week/#comments</comments>
		<pubDate>Thu, 13 May 2010 16:11:52 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[bioinformatics]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=470</guid>
		<description><![CDATA[The May tech meet is on Thursday 20th at Imperial College.
This month&#8217;s speakers:
Catherine Canevet &#8212; Ondex: Data integration and visualisation
Christopher Barnes &#8212; ABC-SysBio: Approximate Bayesian Computation in Python with GPU support
N. Purswani, L. Tweedy, Z. Patel, C. Suriel-Melchor &#8212; DASbrick: A cloud based Rich internet application for Synthetic Biology Parts Registries
Does anyone have a link [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://biogeeks.wordpress.com/2010/05/07/may-tech-meet/">May tech meet</a> is on Thursday 20th at Imperial College.</p>
<p>This month&#8217;s speakers:</p>
<p><a href="http://www.rothamsted.bbsrc.ac.uk/Research/Centres/PersonDetails.php?PIID=5497">Catherine Canevet</a> &#8212; <a href="http://ondex.org/">Ondex</a>: Data integration and visualisation</p>
<p><a href="http://www3.imperial.ac.uk/theoreticalsystemsbiology/people/christopherbarnes">Christopher Barnes</a> &#8212; <a href="http://abc-sysbio.sourceforge.net/">ABC-SysBio</a>: Approximate Bayesian Computation in Python with GPU support</p>
<p>N. Purswani, L. Tweedy, Z. Patel, C. Suriel-Melchor &#8212; DASbrick: A cloud based Rich internet application for Synthetic Biology Parts Registries</p>
<p>Does anyone have a link for DASbrick?</p>
<p>Drinks afterwards at Imperial&#8217;s Eastside Bar. See <a href="http://biogeeks.wordpress.com/2010/05/07/may-tech-meet/">the BioGeeks blog</a> for full details.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/london-biogeeks-may-tech-meet-is-next-week/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Take Back Parliament demonstration &#8212; Sat 8 May</title>
		<link>http://biotext.org.uk/take-back-parliament-demonstration-sat-8-may/</link>
		<comments>http://biotext.org.uk/take-back-parliament-demonstration-sat-8-may/#comments</comments>
		<pubDate>Fri, 07 May 2010 11:02:17 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[politics]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=457</guid>
		<description><![CDATA[As someone who works with numbers, I&#8217;ve known for ages the electoral system in the UK is a very poor model.
The distribution of votes across the parties correlates very badly with the distribution of seats they get in return.
It&#8217;s possible, and not uncommon, for a party&#8217;s overall vote share to go down and yet its [...]]]></description>
			<content:encoded><![CDATA[<p>As someone who works with numbers, I&#8217;ve known for ages the electoral system in the UK is a very poor model.</p>
<p>The distribution of <strong>votes</strong> across the parties correlates very badly with the distribution of <strong>seats</strong> they get in return.</p>
<p>It&#8217;s possible, and not uncommon, for a party&#8217;s overall vote share to go down and yet its parliamentary influence to go up, or vice versa.</p>
<p>After years of adversarial flip-flopping, the system&#8217;s thrown up a result which <em>nobody</em> seems to be satisfied with, regardless of their party affiliation. (Except maybe the Greens <img src='http://biotext.org.uk/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  )</p>
<p>There are electoral reform demonstrations happening tomorrow, in London and all across the country:</p>
<p><a href="http://www.takebackparliament.com/">http://www.takebackparliament.com/</a></p>
<p>If you can spare a few hours to go along, hopefully we can get a good turnout and make the case for <em>real</em> change while it&#8217;s still very topical &#8212; not the waffly kind of change that politicians promise every time and never deliver.</p>
<p>I&#8217;ll be in Trafalgar Square from 2pm.</p>
<p>Andrew.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/take-back-parliament-demonstration-sat-8-may/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>London BioGeeks &#8212; April Tech Meet</title>
		<link>http://biotext.org.uk/london-biogeeks-april-tech-meet/</link>
		<comments>http://biotext.org.uk/london-biogeeks-april-tech-meet/#comments</comments>
		<pubDate>Mon, 19 Apr 2010 10:59:31 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[bioinformatics]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=452</guid>
		<description><![CDATA[This month&#8217;s tech meet is at 6pm on 21st April at University College London.
We have talks from&#8230;
Alison Cuff, UCL
The CATH database &#8212; Structural Diversity and the Question of the Fold Continuum
Andrew Martin, UCL
SAPTF &#8212; Sequence Analysis Plugin Tool Framework
John Pinney, Imperial College
GLASS &#8212; Gene LAyout by Semantic Similarity
Followed by drinks at 7:30-ish at the College [...]]]></description>
			<content:encoded><![CDATA[<p><strong>This month&#8217;s tech meet is at 6pm on 21st April at University College London.</strong></p>
<p>We have talks from&#8230;</p>
<p><em>Alison Cuff, UCL</em></p>
<p>The CATH database &#8212; Structural Diversity and the Question of the Fold Continuum</p>
<p><em>Andrew Martin, UCL</em></p>
<p>SAPTF &#8212; Sequence Analysis Plugin Tool Framework</p>
<p><em>John Pinney, Imperial College</em></p>
<p>GLASS &#8212; Gene LAyout by Semantic Similarity</p>
<p>Followed by <strong>drinks at 7:30-ish</strong> at the <a href="http://www.yelp.co.uk/biz/college-arms-london">College Arms</a>.</p>
<p>Full details, maps, directions etc. are <a href="http://biogeeks.wordpress.com/2010/03/18/april-tech-meet/">on the BioGeeks blog</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/london-biogeeks-april-tech-meet/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More live gigs!</title>
		<link>http://biotext.org.uk/more-live-gigs/</link>
		<comments>http://biotext.org.uk/more-live-gigs/#comments</comments>
		<pubDate>Fri, 26 Jun 2009 09:45:36 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[FuncNet]]></category>
		<category><![CDATA[webservices]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=377</guid>
		<description><![CDATA[I&#8217;ll also be running an interactive workshop on FuncNet at:
The EMBRACE-ENFIN workshop on Expression, Interactions, and System Level Modeling
Helsinki, 5th-6th October 2009
]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ll also be running an interactive workshop on <a href="http://funcnet.eu/">FuncNet</a> at:</p>
<p><a href="http://www.enfin.org/page.php?page=embrace_enfin">The EMBRACE-ENFIN workshop on Expression, Interactions, and System Level Modeling</a></p>
<p>Helsinki, 5th-6th October 2009</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/more-live-gigs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Live gigs!</title>
		<link>http://biotext.org.uk/live-gigs/</link>
		<comments>http://biotext.org.uk/live-gigs/#comments</comments>
		<pubDate>Thu, 04 Jun 2009 17:52:33 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[FuncNet]]></category>
		<category><![CDATA[webservices]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=373</guid>
		<description><![CDATA[Couple of upcoming events I&#8217;ll be going to&#8230;
1. Data Integration in the Life Sciences (DILS 2009) in Manchester next month, with a poster and abstract about FuncNet.
2. EMBL-EBI/ENFIN 2009 annual forum for small-medium enterprises (SMEs), in Vienna in September, with a half-hour talk on the same subject.
No ISMB for me this year, not economically justifiable [...]]]></description>
			<content:encoded><![CDATA[<p>Couple of upcoming events I&#8217;ll be going to&#8230;</p>
<p>1. <a href="http://www.cs.manchester.ac.uk/DILS09/index.php">Data Integration in the Life Sciences</a> (DILS 2009) in Manchester next month, with a poster and abstract about <a href="http://funcnet.eu">FuncNet</a>.</p>
<p>2. <a href="http://www.ebi.ac.uk/industry/SME/">EMBL-EBI</a>/<a href="http://enfin.org/">ENFIN</a> <a href="http://www.enfin.org/page.php?page=sme_meeting_2009">2009 annual forum for small-medium enterprises</a> (SMEs), in Vienna in September, with a half-hour talk on the same subject.</p>
<p>No <a href="http://www.iscb.org/ismbeccb2009/">ISMB</a> for me this year, not economically justifiable without a speaking spot.</p>
<p>Andrew.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/live-gigs/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Bioinformatics in the pub. Free as in free beer&#8230;</title>
		<link>http://biotext.org.uk/bioinformatics-in-the-pub-free-as-in-free-beer/</link>
		<comments>http://biotext.org.uk/bioinformatics-in-the-pub-free-as-in-free-beer/#comments</comments>
		<pubDate>Thu, 14 May 2009 17:33:07 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[drinking]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=357</guid>
		<description><![CDATA[Pub meet-up for bioinformaticians / technophile biologists at:
The Miller pub, near Guy’s Hospital (London Bridge) on Wednesday 27th May from 6pm onwards.
This first meeting will just be a social event and chance to chat to other bio-geeks, but if there’s enough interest then we might organise some more technical events in the future (topic suggestions welcome).
Feel [...]]]></description>
			<content:encoded><![CDATA[<p>Pub meet-up for bioinformaticians / technophile biologists at:</p>
<p><em><a href="http://www.themiller.co.uk/pub/map.asp">The Miller pub</a>, near Guy’s Hospital (London Bridge) on Wednesday 27th May from 6pm onwards.</em></p>
<p>This first meeting will just be a social event and chance to chat to other bio-geeks, but if there’s enough interest then we might organise some more technical events in the future (topic suggestions welcome).</p>
<p>Feel free to tell anyone you think might be interested. If you want to come you can just turn up, but it would be helpful if you let me or <a href="http://www.cassj.co.uk/blog/?p=237">Cass</a> know you&#8217;re coming. She&#8217;s found a recruitment agency who are interested in sponsoring the event (probably in the form of beer and food) so it would be useful to have an idea of numbers.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/bioinformatics-in-the-pub-free-as-in-free-beer/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drones Club, 24th April 2009</title>
		<link>http://biotext.org.uk/drones-club-24th-april-2009/</link>
		<comments>http://biotext.org.uk/drones-club-24th-april-2009/#comments</comments>
		<pubDate>Fri, 17 Apr 2009 10:25:22 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[music]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=346</guid>
		<description><![CDATA[Think electronica, think punk attitude&#8230;
]]></description>
			<content:encoded><![CDATA[<div id="attachment_350" class="wp-caption aligncenter" style="width: 215px"><a href="http://biotext.org.uk/wp-content/uploads/2009/04/drones.jpg"><img class="size-medium wp-image-350" title="Drones Club April 2009" src="http://biotext.org.uk/wp-content/uploads/2009/04/drones-205x300.jpg" alt="[click for bigger]" width="205" height="300" /></a><p class="wp-caption-text">(click for bigger)</p></div>
<p><em>Think electronica, think punk attitude&#8230;</em></p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/drones-club-24th-april-2009/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SESL 2009 day two</title>
		<link>http://biotext.org.uk/sesl-2009-day-two/</link>
		<comments>http://biotext.org.uk/sesl-2009-day-two/#comments</comments>
		<pubDate>Tue, 31 Mar 2009 10:45:39 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[ontologies]]></category>
		<category><![CDATA[publishing]]></category>
		<category><![CDATA[SESL]]></category>
		<category><![CDATA[text_mining]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=312</guid>
		<description><![CDATA[Semantic Enrichment of the Scientific Literature 2009
Tue 31 Mar: &#8220;Semantic Enrichment of the literature for the benefit of all users&#8221;
(Monday&#8217;s notes are here)
Missed the early morning session. I don&#8217;t work in pharma any more so it didn&#8217;t seem worth a 5:45am wake-up (unhelpful train times). Although apparently Eric Neumann&#8217;s talk on linked data was good [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.ebi.ac.uk/Rebholz-srv/SESL/sesl.html">Semantic Enrichment of the Scientific Literature 2009</a></p>
<p><strong>Tue 31 Mar: &#8220;Semantic Enrichment of the literature for the benefit of all users&#8221;</strong></p>
<p>(Monday&#8217;s notes are <a href="http://biotext.org.uk/workshop-notes-sesl-2009/">here</a>)</p>
<p>Missed the early morning session. I don&#8217;t work in pharma any more so it didn&#8217;t seem worth a 5:45am wake-up (unhelpful train times). Although apparently Eric Neumann&#8217;s talk on linked data was good (&#8220;semantic web without the &#8216;semantic&#8217;&#8221; &#8212; <a href="http://duncan.hull.name/">Duncan</a>)</p>
<p>Alfonso Valencia &#8212; <a href="http://www.elixir-europe.org/">ELIXIR</a> &#8212; an EU project to upgrade Europe&#8217;s bioinformatics infrastructure. Includes a work package on literature integration &#8212; making lit. repositories, ontologies and traditional biological databases interoperate better. Good &#8212; too much text mining happens in isolation from the rest of the bioinformatics world. Targeted at wet-lab scientists not just computational people. Looks like it might include an effort to turn raw algorithms into usable tools/platforms. Still in the early phases.</p>
<p>He also discussed the <a href="http://biocreative.sourceforge.net/biocreative_2.html">BioCreative</a> project which has released various data sets and held challenges on several aspects of text mining. A spin-off from these is the <a href="http://bcms.bioinfo.cnio.es/">BioCreative MetaServer</a> which identifies genes and proteins mentioned in text by aggregating predictions from several prediction services.</p>
<p>Dietrich Rebholz-Schuhmann &#8212; <a href="http://ukpmc.ac.uk/">UKPMC</a> &#8212; a UK mirror of PubMed Central (with added value apparently) co-ordinated by the British Library. Working on information retrieval and data integration improvements. Sounds like the funding bodies are getting involved, many referring specifically to UKPMC in their open access policies. Paying for OA journal submissions is an issue. Apparently the Wellcome Trust have an OA fund which is under-utilized.</p>
<p>Also, <a href="http://www.ebi.ac.uk/Rebholz-srv/CALBC/">CALBC</a> &#8212; a project to semantically annotate a large biomedical corpus (named entities only?) by getting a consensus annotation from iteratively integrating the output of various information extraction systems, and then manually cleaning up the disagreements.</p>
<p>Stefano Bertolo (EU) &#8212; funding calls &#8212; deadline 3rd November&#8230;</p>
<p>(Great analogy: Human history has entered a phase where we can produce information by machine quicker than we can interpret it. What we need is &#8216;cognitive levers&#8217;.)</p>
<p><a href="http://cordis.europa.eu/fp7/ict/content-knowledge/">7th framework, SO 4.3, Call 5</a>, themes:</p>
<ul>
<li>Capturing tractable information</li>
<li>Delivering pertinent information</li>
<li>Collaboration and decision support</li>
<li>Personal sphere</li>
<li>Impact and science &#038; tech leadership</li>
</ul>
<p>They all sound a bit vague and buzzwordy without the explanations&#8230;</p>
<p>Key themes: large data sets and (close to) real-time processing. Requirement for robust, strongly-tested tools that can be distributed &#8212; not just &#8216;only works on the PC of the postdoc that wrote it, on a good day&#8217; <img src='http://biotext.org.uk/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Informal queries about proposals: infso-e2 at ec.europa.eu</p>
<p>Lunch! Then&#8230;</p>
<p>Keynote from UMLS guru Olivier Bodenreider on normalizing terms/concepts across different lexical/taxonomical/ontological resources. Lexical vs. semantic approaches &#8212; e.g. string munging vs. traversing known relationships. Latter complicated by fact that some pairs of concepts are synonymous in one resource and hyponymous in another. Also, semantic similarity &#8212; lowest common subsumer/definition by extension, e.g. famous Resnik measure.</p>
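<p>As a rough illustration of the Resnik measure mentioned above (my own sketch, not something shown in the talk): the similarity of two concepts is the information content, -log p(c), of their most informative common subsumer. The taxonomy and the probabilities below are made up for the example and do not come from UMLS or any real resource.</p>

```python
import math

# Toy is-a taxonomy (child -> parent) and made-up concept probabilities.
# Both are illustrative assumptions, not data from a real ontology or corpus.
PARENT = {
    "kinase": "enzyme",
    "protease": "enzyme",
    "enzyme": "protein",
    "receptor": "protein",
    "protein": None,
}
PROB = {"kinase": 0.05, "protease": 0.05, "enzyme": 0.25,
        "receptor": 0.1, "protein": 1.0}


def ancestors(concept):
    """Return the concept itself plus all of its ancestors up to the root."""
    found = []
    while concept is not None:
        found.append(concept)
        concept = PARENT[concept]
    return found


def resnik(a, b):
    """Information content of the most informative common subsumer of a and b."""
    common = set(ancestors(a)).intersection(ancestors(b))
    return max(-math.log(PROB[c]) for c in common)
```

<p>In this toy setup two enzymes share the fairly specific subsumer "enzyme" and score higher than an enzyme and a receptor, whose only shared subsumer is the root, which carries no information.</p>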
<p>Also mentioned <a href="http://bioportal.bioontology.org/">BioPortal</a>, not sure exactly how this differs from the UMLS in scope, probably more biological than medical? Must be overlap though.</p>
<p>These are forming a key part of CALBC (see above).</p>
<p>Sophia Ananiadou from <a href="http://www.nactem.ac.uk/">NaCTeM</a> &#8211; NLP view of semantic enrichment: terms and named entities &#8212; concepts &#8212; events and relationships. Termine and Acromine &#8212; extraction of terms and acronyms. <a href="http://www.biomedcentral.com/1471-2105/9/S11/S8">Accelerated annotation</a> methods &#8212; cunning. More on the importance of building proper tools rather than just prototypes/in-house algorithms. Glad the NLP scene is catching on to this. Hopefully they allow querying by unique accession rather than just names &#8212; this is another area where the NLP people don&#8217;t always understand what the bio people need.</p>
<p>She discussed some of NaCTeM&#8217;s flagship tools like MEDIE, FACTA and KLEIO &#8212; it does look like they&#8217;re starting to take all the pain out of text mining, by doing the difficult bits for us, so we can use the results to do actual mining. Also they are offering web service interfaces (&#8216;overdue&#8217; for some of them according to Sophia) &#8212; excellent news.</p>
<p>More from Udo &#8212; what do we mean by &#8216;semantics&#8217;? Mixed-bag talk. Problems with folksonomies/tag clouds, e.g. Flickr: &#8220;newyork&#8221; &#8220;newyorkcity&#8221; &#8220;nyc&#8221; &#8220;new&#8221; &#8220;york&#8221;. Biomedical lexicon an order of magnitude bigger than general English lexicon (based on Wordnet and typical competent speaker). Wow. Domain dictionaries like GO/UMLS: these inherit some of the problems of natural language because the terms themselves are stated/defined in natural language! Also often ontological relations are vague/underspecified/changing.</p>
<p>Last session&#8230;</p>
<p>Anita de Waard (Elsevier) &#8212; <em>FEBS Letters</em> structural digital abstracts experiment (author-provided PPI annotation). 75% author compliance, avg 1 hour per abstract. They&#8217;ve moved responsibility to the MINT curators instead of the authors, to increase compliance and efficiency! What does that tell us&#8230; Also mentioned <a href="http://www.okkam.org/">OKKAM</a> &#8212; a consortium trying to provide a UID for <em>every single entity on the web</em>. Umm&#8230; Holy crap. So far, 1.5 million entities covered, so they have a bit of a way to go, to say the least. She went on to discuss some aspects of discourse analysis of scientific text. Interesting point: hedging gets eroded by citation, so &#8220;these results suggest that&#8221; becomes &#8220;author X shows that&#8221; becomes just a cited fact.</p>
<p>She also discussed the Elsevier Grand Challenge &#8212; what&#8217;s the most interesting thing you can do with half a million full text articles? Finalists have been chosen, the winners will be announced next month. Next year: Future of Research Communication conference on same themes, probably March at Harvard.</p>
<p>EU-ADR (Erik van Mulligen) &#8212; federated data mining/text mining/epidemiological analysis to discover &#038; monitor novel adverse drug reactions. Five-year pan-European project. Sounds like an enormous piece of work with lots of engineering challenges &#8212; anonymization etc. for a start.</p>
<p><a href="http://www.wikigenes.org/">WikiGenes</a> (Robert Hoffmann) &#8212; a wiki for genes, chemicals, MeSH terms obviously &#8212; but pre-seeded with sentences yanked from <a href="http://www.ihop-net.org/UniPub/iHOP/">iHOP</a>. So experts can step in and add/fix stuff but without the momentum barrier to getting started. &#8216;Narcissistic drive&#8217; for authors of missed papers to add their own &#8212; cunning. Custom engine based on Apache Cocoon and Lucene. Authorship tracking down to individual strings of text, and it&#8217;s easy to view this information. The idea is that scientists will want to add their own work and get credit for it.</p>
<p>He makes the point that this is in many ways a much better way to publish biological information than several thousand different journals, and gives much better influence metrics than impact factors and H-index etc.</p>
<p>Is it in direct competition with WikiProteins? Not according to Robert &#8212; that&#8217;s more about knowledge engineering and formal semantic relationships, machine-readable stuff, whereas this is more supposed to be a modern publishing medium for human-readable information. Which hopefully the biologists will take to more readily.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/sesl-2009-day-two/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Workshop notes &#8212; SESL 2009</title>
		<link>http://biotext.org.uk/workshop-notes-sesl-2009/</link>
		<comments>http://biotext.org.uk/workshop-notes-sesl-2009/#comments</comments>
		<pubDate>Mon, 30 Mar 2009 12:56:17 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[ontologies]]></category>
		<category><![CDATA[publishing]]></category>
		<category><![CDATA[SESL]]></category>
		<category><![CDATA[text_mining]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=288</guid>
		<description><![CDATA[Semantic Enrichment of the Scientific Literature 2009
Monday 30 Mar: &#8220;Reliable factual data from the literature based on ontological resources&#8221;
Highlight of the morning session was Junichi Tsujii&#8217;s demo of the PathText system, which integrates manually-curated pathway information in CellDesigner or SBML format with text-mined relationships, and lets you browse the pathway maps and drill through to [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.ebi.ac.uk/Rebholz-srv/SESL/sesl.html">Semantic Enrichment of the Scientific Literature 2009</a></p>
<p><strong>Monday 30 Mar: &#8220;Reliable factual data from the literature based on ontological resources&#8221;</strong></p>
<p>Highlight of the morning session was Junichi Tsujii&#8217;s demo of the PathText system, which integrates manually-curated pathway information in CellDesigner or SBML format with text-mined relationships, and lets you browse the pathway maps and drill through to related literature.</p>
<p>It&#8217;s not finished yet but there&#8217;s a preview video available from <a href="http://www.nactem.ac.uk/pathtext/">NaCTeM</a>.</p>
<p>Also a bit of a preview of the <a href="http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/SharedTask/">BioNLP 2009 Shared Task</a> on extracting biomolecular events from text into semantic networks &#8212; which I&#8217;m reviewing entries for at the moment.</p>
<p>Lots of material about gene regulation today. An intro to the <a href="http://www.ebi.ac.uk/Rebholz-srv/GRO/GRO.html">Gene Regulation Ontology</a> (Jung-Jae Kim), a couple of talks about extracting regulatory events from free text (Kim and Udo Hahn), and the <a href="http://www.oreganno.org/oregano/Index.jsp">ORegAnno</a> project which is using text mining to support its manual curation of regulatory events (Stephen Montgomery). The new(ish) GeneReg corpus will be useful to anyone building systems like this, as will the BioNLP 2009 data; I&#8217;ll find out whether it is available to non-entrants.</p>
<p>Also a talk about populating/extending ontologies automatically from clinical reports (Wendy Chapman). Patterns like &#8220;NOUN_PHRASE_1, such as NOUN_PHRASE_2&#8221;. Simple and effective.</p>
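<p>To make the "such as" pattern concrete, here is a minimal sketch of my own (not the speaker's system). It assumes a "noun phrase" is just one or two lowercase words, which is a big simplification; a real pipeline would use a proper chunker or parser. The names NOUN_PHRASE, SUCH_AS and extract_hyponyms are hypothetical.</p>

```python
import re

# Hypothetical, heavily simplified noun phrase: one or two lowercase words.
NOUN_PHRASE = r"[a-z]+(?: [a-z]+)?"

# "NOUN_PHRASE_1, such as NOUN_PHRASE_2": group 1 is the broader term,
# group 2 a candidate narrower term to add beneath it in the ontology.
SUCH_AS = re.compile(rf"\b({NOUN_PHRASE}), such as ({NOUN_PHRASE})")


def extract_hyponyms(text):
    """Return (broader_term, narrower_term) pairs found in the text."""
    return [(m.group(1), m.group(2)) for m in SUCH_AS.finditer(text.lower())]


pairs = extract_hyponyms("Respiratory infections, such as pneumonia.")
```

<p>Each pair is only a candidate is-a relation; you would still want a human (or at least a frequency threshold) to vet it before it goes into the ontology.</p>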
<p>Back from lunch&#8230; And sitting at the front so I can hear better. Hence better notes!</p>
<p>Simonetta Montemagni just gave an excellent introduction to the BioLexicon project (also NaCTeM-related) which is essentially a huge database of biomedical/biological language, including such things as domain-specific syntax and semantic metadata, dead useful for text mining developers. Also contains a thesaurus of gene and protein names (inc. synonyms and variants) with links back to UniProt IDs which makes it much more useful for general bioinformatics use.</p>
<p>It isn&#8217;t all available yet, and will be published via a linguistic data provider, bit vague about licensing! So it may or may not be free (I&#8217;m guessing free for academic use, commercial for other uses).</p>
<p>Lots of data and tools for natural language processing from Udo Hahn&#8217;s group: <a href="http://www.julielab.de/">http://www.julielab.de/</a> &#8230; Plus some war stories about the difference in information extraction accuracy between &#8216;lab&#8217; tests and real world performance, e.g. from ~60% (close to human levels) to ~20% F-score. Ouch&#8230; But we&#8217;ve all been there. (See also note about GeneReg above)</p>
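<p>For anyone unsure what the F-score figures above mean: the balanced F-score is the harmonic mean of precision and recall. A minimal sketch, with a function name and counts of my own choosing rather than anything from the talk:</p>

```python
def f_score(true_positives, false_positives, false_negatives):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # With no correct predictions the score is zero by convention here,
    # which also avoids dividing by zero.
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)
```

<p>Because it is a harmonic mean, the score is dragged down by whichever of precision or recall is worse, so a system cannot hide poor recall behind high precision.</p>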
<p>Su Jian talked about designing evaluation tasks for genomic information retrieval (i.e. search engine) algorithms, and improving said algorithms with dedicated gene/protein name recognizers. Bit specialized for me &#8212; lots of score functions I didn&#8217;t know the definitions of&#8230;</p>
<p>Quick coffee break!</p>
<p>Nice talk about the <a href="http://www.ebi.ac.uk/microarray-srv/efo/">Experimental Factor Ontology</a> from the <a href="http://www.ebi.ac.uk/microarray-as/ae/">ArrayExpress</a> project (James Malone). This is for classifying experimental conditions in microarray experiments. They&#8217;ve gone to a lot of trouble to link their ontology into others as painlessly as possible, and have developed autonomous agents to trawl the semantic web for other ontologies that may be related, and to alert them when ontologies they link to change, as this might imply a link is no longer true. Cute. The EFO has also allowed them to offer federated queries with other databases, and they use it for sanity-checking the data people submit via reasoning rules &#8212; e.g. cardiovascular disease can&#8217;t occur in hair follicle cells.</p>
<p>UCSD (Lynn Fink) have written a very nice <a href="http://www.codeplex.com/UCSDBioLit">plugin for Word 2007</a> that watches your text as you type and automatically tags biomolecular database identifiers and terms from OBO ontologies when they appear &#8212; with the option to add/edit/remove/override manually of course, and the tags being preserved in Word&#8217;s XML files. Kind of like a spellchecker/thesaurus for semantic markup. I&#8217;m not a fan of word processors (give me LaTeX any day) but this is an excellent idea. Hopefully publishers and curators will be able to parse useful metadata out of the resulting files.</p>
<p>Some similar ideas from <a href="http://wwwdev.ebi.ac.uk/tc-test/textmining/PublicationValidator/">PaperMaker</a> (Piotr Pezik) which also does semantic tagging, along with things like spotting missing references, acronyms that haven&#8217;t been defined, and genes/proteins that have been referred to by non-recommended identifiers. It can also trawl PubMed for similar publications, at the whole-doc or paragraph level. Throws in spell checking, word count etc. Neat work, but I&#8217;m not entirely sure who it&#8217;s aimed at &#8212; biologists would surely prefer this to be a Word plugin like the previous.</p>
<p><strong>More <a href="http://biotext.org.uk/sesl-2009-day-two/">tomorrow</a>.</strong></p>
<p>Going back over notes and adding links as I have time.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/workshop-notes-sesl-2009/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
