<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>biotext.org.uk &#187; Research</title>
	<atom:link href="http://biotext.org.uk/category/research/feed/" rel="self" type="application/rss+xml" />
	<link>http://biotext.org.uk</link>
	<description>Not a typewriter</description>
	<lastBuildDate>Sat, 05 Feb 2011 14:18:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>RetriableTask &#8212; a generic wrapper for retrying operations in Java</title>
		<link>http://biotext.org.uk/retriabletask-a-generic-wrapper-for-retrying-operations-in-java/</link>
		<comments>http://biotext.org.uk/retriabletask-a-generic-wrapper-for-retrying-operations-in-java/#comments</comments>
		<pubDate>Wed, 20 May 2009 16:27:38 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[Tips]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[LJC]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=360</guid>
		<description><![CDATA[A couple of times recently I&#8217;ve needed to write methods that retry up to n times if an error occurs, and surprisingly, couldn&#8217;t find any standard patterns to accomplish this on the web. So I wrote my own. All comments appreciated (there may well be massive holes in my logic but it works so far). [...]]]></description>
			<content:encoded><![CDATA[<p>A couple of times recently I&#8217;ve needed to write methods that retry up to n times if an error occurs, and surprisingly, couldn&#8217;t find any standard patterns to accomplish this on the web. So I wrote my own. All comments appreciated (there may well be massive holes in my logic but it works so far).</p>
<p><span id="more-360"></span></p>
<pre class="brush: java">
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.logging.Logger;

/**
 * This class is a wrapper for a Callable that adds retry functionality.
 * The user supplies an existing Callable, a maximum number of tries,
 * and optionally a Logger to which exceptions will be logged. Calling
 * the call() method of RetriableTask causes the wrapped object&#039;s call()
 * method to be called, and any exceptions thrown from the inner call()
 * will cause the entire inner call() to be repeated from scratch, as
 * long as the maximum number of tries hasn&#039;t been exceeded.
 * InterruptedException and CancellationException are allowed to
 * propagate instead of causing retries, in order to allow cancellation
 * by an executor service etc.
 *
 * @param &lt;T&gt; the return type of the call() method
 */
public class RetriableTask&lt;T&gt; implements Callable&lt;T&gt;
{

    private final Callable&lt;T&gt; _wrappedTask;

    private final int _tries;

    private final Logger _log;

    /**
     * Creates a new RetriableTask around an existing Callable. Supplying
     * zero or a negative number for the tries parameter will allow the
     * task to retry an infinite number of times -- use with caution!
     *
     * @param taskToWrap the Callable to wrap
     * @param tries the max number of tries
     * @param log a Logger to log exceptions to (null == no logging)
     */
    public RetriableTask ( final Callable&lt;T&gt; taskToWrap, final int tries, final Logger log )
    {
        _wrappedTask = taskToWrap;
        _tries = tries;
        _log = log;
    }

    /**
     * Invokes the wrapped Callable&#039;s call method, optionally retrying
     * if an exception occurs. See class documentation for more detail.
     *
     * @return the return value of the wrapped call() method
     */
    public T call() throws Exception
    {
        int triesLeft = _tries;
        while( true )
        {
            try
            {
                return _wrappedTask.call();
            }
            catch( final InterruptedException e )
            {
                // We don&#039;t attempt to retry these
                throw e;
            }
            catch( final CancellationException e )
            {
                // We don&#039;t attempt to retry these either
                throw e;
            }
            catch( final Exception e )
            {
                triesLeft--;

                // Are we allowed to try again?
                if( triesLeft == 0 ) // No -- rethrow
                    throw e;

                // Yes -- log and allow to loop
                if( _log != null )
                    _log.warning( &quot;Caught exception, retrying... Error was: &quot; + e.getMessage() );
            }
        }

    }

}
</pre>
<p><strong>EDIT:</strong> Just to make it clear how I would use this, it works best in the context of an ExecutorService, which manages multi-threading and timeouts for you. So for example, I could create an executor service like so:</p>
<pre class="brush: java">
ExecutorService pool = Executors.newCachedThreadPool();
</pre>
<p>Then create all the tasks I need, and wrap them in RetriableTasks:</p>
<pre class="brush: java">
// Create a task that returns a String
Callable&lt;String&gt; task1 = new Callable&lt;String&gt;()
{
    public String call()
    {
        // Do stuff here, then return the result
        return &quot;some result&quot;; // placeholder value
    }
};
// Make it try up to three times
RetriableTask&lt;String&gt; retriable1 =
    new RetriableTask&lt;String&gt;( task1, 3, null );
</pre>
<p>Then add all the RetriableTasks to a Collection, and execute them all on the thread pool:</p>
<pre class="brush: java">
Collection&lt;Callable&lt;String&gt;&gt; tasks =
    new ArrayList&lt;Callable&lt;String&gt;&gt;();
tasks.add( retriable1 );
// TIMEOUT == timeout in seconds
List&lt;Future&lt;String&gt;&gt; results =
    pool.invokeAll( tasks, TIMEOUT, TimeUnit.SECONDS );
</pre>
<p>Finally, iterate through the results list to get all the return values:</p>
<pre class="brush: java">
for( Future&lt;String&gt; result : results )
{
    // Un-retried exceptions will pop out here
    String returnValue = result.get();
}
</pre>
<p>Or use a <a href="http://eng.genius.com/blog/2009/04/29/java-completionservice/">CompletionService</a>.</p>
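<p>For illustration, here is a minimal sketch of the CompletionService route. It uses only the standard java.util.concurrent classes; the class name CompletionServiceSketch and the sleeper helper are made up for this example, and plain Callables stand in for the RetriableTasks so that the snippet compiles on its own.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example class, not part of the original post
class CompletionServiceSketch
{
    // Runs every task on a cached thread pool and collects the return
    // values in the order the tasks finish, not the order submitted
    static List<String> runAll( List<Callable<String>> tasks )
        throws InterruptedException, ExecutionException
    {
        ExecutorService pool = Executors.newCachedThreadPool();
        try
        {
            CompletionService<String> service =
                new ExecutorCompletionService<String>( pool );
            for( Callable<String> task : tasks )
                service.submit( task );

            List<String> results = new ArrayList<String>();
            for( int i = 0; i < tasks.size(); i++ )
            {
                // take() blocks until the next task finishes; get() rethrows
                // any un-retried exception wrapped in an ExecutionException
                results.add( service.take().get() );
            }
            return results;
        }
        finally
        {
            pool.shutdown();
        }
    }

    // Hypothetical helper: a task that sleeps, then returns its name
    static Callable<String> sleeper( final String name, final long millis )
    {
        return new Callable<String>()
        {
            public String call() throws Exception
            {
                Thread.sleep( millis );
                return name;
            }
        };
    }

    public static void main( String[] args ) throws Exception
    {
        List<Callable<String>> tasks = new ArrayList<Callable<String>>();
        tasks.add( sleeper( "slow", 200 ) );
        tasks.add( sleeper( "fast", 20 ) );
        // Results arrive in completion order, so "fast" will typically come first
        System.out.println( runAll( tasks ) );
    }
}
```

<p>Unlike the invokeAll call above, this sketch applies no timeout; CompletionService also offers poll( timeout, unit ) if one is needed.</p>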
<p>Obviously there are lots of other things you could do within the same framework. Introducing a delay before retrying would be useful in a lot of circumstances, and you could also supply a list of exceptions that would be allowed to propagate, or only retry on checked exceptions and automatically bubble unchecked ones. Really it depends on the use case.</p>
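<p>As a rough sketch of two of those ideas, the variant below sleeps between attempts and retries only on checked exceptions. The class name DelayedRetriableTask and the delayMillis parameter are assumptions made for this example; it is one possible extension, not a definitive one, and it leaves out the logging.</p>

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical variant of RetriableTask: the name and the delayMillis
// parameter are assumptions for illustration only
class DelayedRetriableTask<T> implements Callable<T>
{
    private final Callable<T> wrappedTask;
    private final int tries;
    private final long delayMillis;

    DelayedRetriableTask( Callable<T> taskToWrap, int tries, long delayMillis )
    {
        this.wrappedTask = taskToWrap;
        this.tries = tries;
        this.delayMillis = delayMillis;
    }

    public T call() throws Exception
    {
        int triesLeft = tries;
        while( true )
        {
            try
            {
                return wrappedTask.call();
            }
            catch( InterruptedException e )
            {
                // Never retried, so cancellation by an executor still works
                throw e;
            }
            catch( RuntimeException e )
            {
                // Unchecked exceptions (CancellationException included)
                // bubble straight out without a retry
                throw e;
            }
            catch( Exception e )
            {
                triesLeft--;
                if( triesLeft == 0 )
                    throw e;
                // Pause before the next attempt; an interrupt during the
                // sleep propagates as an InterruptedException
                Thread.sleep( delayMillis );
            }
        }
    }

    public static void main( String[] args ) throws Exception
    {
        final int[] attempts = { 0 };
        // Simulated task that fails twice with a checked exception
        Callable<String> flaky = new Callable<String>()
        {
            public String call() throws Exception
            {
                attempts[ 0 ]++;
                if( attempts[ 0 ] < 3 )
                    throw new IOException( "simulated failure" );
                return "ok after " + attempts[ 0 ] + " attempts";
            }
        };
        // Prints "ok after 3 attempts"
        System.out.println( new DelayedRetriableTask<String>( flaky, 5, 10L ).call() );
    }
}
```

<p>A fixed delay is the simplest choice; a delay that grows with each failed attempt would slot into the same place.</p>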
<p>Feel free to write an extended version and post it here or link to it :-)</p>
<p>Andrew.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/retriabletask-a-generic-wrapper-for-retrying-operations-in-java/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Simple server status monitoring with PHP and Perl</title>
		<link>http://biotext.org.uk/simple-server-status-monitoring-with-php-and-perl/</link>
		<comments>http://biotext.org.uk/simple-server-status-monitoring-with-php-and-perl/#comments</comments>
		<pubDate>Tue, 17 Feb 2009 16:05:59 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=229</guid>
		<description><![CDATA[Recently, one of our web servers fell victim to an apparent DoS attack, being hammered with hundreds of simultaneous dynamic page requests, far more than it&#8217;s specced to handle. To its credit, it stayed up, although it took about five minutes to log in via ssh, and when we spotted what was happening, the load [...]]]></description>
			<content:encoded><![CDATA[<p>Recently, one of our web servers fell victim to an apparent DoS attack, being hammered with hundreds of simultaneous dynamic page requests, far more than it&#8217;s specced to handle. To its credit, it stayed up, although it took about five minutes to log in via ssh, and when we spotted what was happening, the load average was over 100 which I think is the most loaded I&#8217;ve ever seen. The offending IP address was at UCSD who do a lot of bioinformatics, so there is a chance it was a misguided attempt to scrape a lot of data from us, rather than an actual hostile act, but in any case the upshot is the same.</p>
<p>It worried me that there was no easy way to remotely check the machine&#8217;s health, so I hacked together a quick PHP page to report various vital statistics on demand &#8212; load average, memory usage, disk usage etc. &#8212; and a Perl monitor that can raise the alarm if anything exceeds safe bounds.</p>
<p><span id="more-229"></span>The status script itself really is dead simple; it looks like this:</p>
<pre class="brush: php">

&lt;?php
    # status.php -- very simple server status monitor
    header( &#039;Content-type: text/plain&#039; );
    # Get and display load average times 3
    $load = sys_getloadavg();
    echo &quot;LoadAverage1: $load[0]\n&quot;;
    echo &quot;LoadAverage5: $load[1]\n&quot;;
    echo &quot;LoadAverage15: $load[2]\n&quot;;
    # Get and display all sorts of memory usage info
    echo join( &#039;&#039;, file( &#039;/proc/meminfo&#039; ) );
    # Get and display disk usage percentages
    $df = `/bin/df`;
    foreach( explode( &quot;\n&quot;, $df ) as $line )
    {
        if( preg_match( &quot;/(\d+%)\s+(\S+)$/&quot;, $line, $matches ) )
        {
            $fs = $matches[ 2 ];
            $usage = $matches[ 1 ];
            echo &quot;Usage_$fs: $usage\n&quot;;
        }
    }
    # Count running processes
    $procs = `/bin/ps -e|wc -l`;
    echo &quot;RunningProcesses: $procs\n&quot;;
?&gt;
</pre>
<p>It uses PHP&#8217;s built-in sys_getloadavg function to return the load average, df to get the disk usage, and ps and wc to count the number of processes running. These should work on any Unix-ish system (let me know if they don&#8217;t!). Also, it uses the /proc filesystem to read lots of metrics about memory use, and this is Linux-specific. It produces output that looks like this:</p>
<pre>LoadAverage1: 0.59
LoadAverage5: 0.4
LoadAverage15: 0.29
MemTotal:      2021816 kB
MemFree:        337592 kB
Buffers:         35408 kB
Cached:         286676 kB
SwapCached:       3152 kB
Active:        1115608 kB
Inactive:       102288 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      2021816 kB
LowFree:        337592 kB
SwapTotal:     4192956 kB
SwapFree:      4185596 kB
Dirty:             304 kB
Writeback:           0 kB
AnonPages:      893740 kB
Mapped:         119900 kB
Slab:           256036 kB
PageTables:      21240 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   5203864 kB
Committed_AS:  1433344 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    280284 kB
VmallocChunk: 34359458043 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
Usage_/: 48%
Usage_/var: 9%
Usage_/tmp: 1%
Usage_/dev/shm: 0%
Usage_/export/people: 16%
Usage_/home/bsm: 79%
Usage_/LINUX/local64: 94%
Usage_/cath/opt: 13%
Usage_/cath/svnbin: 23%
Usage_/nfs/mail: 82%
Usage_/LINUX/local: 89%
RunningProcesses: 149</pre>
<p>The Perl monitoring script is a bit more complex, so I&#8217;ve made it available for download <a title="server status check script (Perl)" href="http://biotext.org.uk/static/server_status_check.pl.v0_1">here</a>. It lets you set up a config file with rules specifying named fields from the PHP script&#8217;s output, along with maximum and/or minimum allowable values for them. From the script&#8217;s comments:</p>
<pre>#!/usr/bin/perl

# server_status_check.pl
# Andrew Clegg
#
# This script parses the output of status.php and compares it to a
# list of minimum and maximum allowable values for server resources
# specified in a config file. The config file contains one rule per
# line, like so:
#
# min MemFree 7500000
# min Usage_/LINUX/local64 95
# max LoadAverage5 0.1
#
# Any line not in this format causes an error. Do not include any
# percent signs, units (e.g. kB) etc. in the config file; these
# are automatically stripped out from the results of status.php
# before applying the rules.
#
# For each resource that is lower than a min value or larger than
# a max value, a warning is printed. Also, if the config file
# contains any rules which name resources that are not found in
# the output of status.php at all, a warning is printed for each.
#
# It returns 0 if everything is fine, 255 if an error occurred, or
# the number of warnings issued if one or more of the resource
# rules are violated.</pre>
<p>You invoke the monitor script like this:</p>
<pre>./server_status_check.pl http://my.server/status.php my.config.file</pre>
<p>And it returns output that looks like this if any rules from the config file are violated:</p>
<pre>MemFree has value 229472 which is less than minimum 250000
Usage_/ has value 86 which is greater than maximum 50
SomeIncorrectVariableName not found in server status report</pre>
<p>Since the return code is non-zero in case of a problem, you can easily use it in a cron job or shell script to take action when a server&#8217;s vital statistics move into dangerous ranges.</p>
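<p>The rule-checking idea itself is small enough to sketch. The version below is written in Java rather than Perl and is not the downloadable script; the names StatusRuleCheck, parseStatus and check are invented for this example. It assumes status lines of the form shown above and rules of the form &quot;min NAME VALUE&quot; or &quot;max NAME VALUE&quot;.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the min/max rule check, not the Perl script itself
class StatusRuleCheck
{
    // Parses "Name: value [unit]" lines into name -> numeric text,
    // dropping a trailing percent sign or a unit such as kB
    static Map<String, String> parseStatus( String statusText )
    {
        Map<String, String> values = new HashMap<String, String>();
        for( String line : statusText.split( "\n" ) )
        {
            int colon = line.indexOf( ':' );
            if( colon < 0 )
                continue;
            String name = line.substring( 0, colon ).trim();
            String rest = line.substring( colon + 1 ).trim();
            if( rest.isEmpty() )
                continue;
            String number = rest.split( "\\s+" )[ 0 ].replace( "%", "" );
            values.put( name, number );
        }
        return values;
    }

    // Applies each rule and returns one warning per violated or unknown field
    static List<String> check( Map<String, String> values, List<String> rules )
    {
        List<String> warnings = new ArrayList<String>();
        for( String rule : rules )
        {
            String[] parts = rule.trim().split( "\\s+" );
            boolean isMin = parts.length == 3 && parts[ 0 ].equals( "min" );
            boolean isMax = parts.length == 3 && parts[ 0 ].equals( "max" );
            if( !isMin && !isMax )
                throw new IllegalArgumentException( "Bad rule: " + rule );

            String name = parts[ 1 ];
            String actualText = values.get( name );
            if( actualText == null )
            {
                warnings.add( name + " not found in server status report" );
                continue;
            }
            double actual = Double.parseDouble( actualText );
            double limit = Double.parseDouble( parts[ 2 ] );
            if( isMin && actual < limit )
                warnings.add( name + " has value " + actualText
                    + " which is less than minimum " + parts[ 2 ] );
            else if( isMax && actual > limit )
                warnings.add( name + " has value " + actualText
                    + " which is greater than maximum " + parts[ 2 ] );
        }
        return warnings;
    }

    public static void main( String[] args )
    {
        String status = "MemFree:        229472 kB\nUsage_/: 86%\nLoadAverage1: 0.59\n";
        List<String> rules = Arrays.asList(
            "min MemFree 250000",
            "max Usage_/ 50",
            "max SomeIncorrectVariableName 1" );
        // Prints one warning per line, matching the three messages shown above
        for( String warning : check( parseStatus( status ), rules ) )
            System.out.println( warning );
    }
}
```

<p>The number of warnings returned by check corresponds to the non-zero exit code that the monitor script reports.</p>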
<p>Of course, being PHP, you can use it from the command line for a quick summary of the local machine&#8217;s resources by just typing</p>
<pre>php status.php</pre>
<p>There are plenty more complex server monitoring tools out there, but you probably have to be a skilled sysadmin to use them, whereas these tools took a few hours to write, and five minutes to install. As usual, suggestions are welcome, and you are free to use them wherever and however you like, but please credit me and include a link back here.</p>
<p>Andrew.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/simple-server-status-monitoring-with-php-and-perl/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>IteratorReader &#8212; streaming character data from an iterator</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/</link>
		<comments>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/#comments</comments>
		<pubDate>Thu, 29 Jan 2009 11:10:02 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[LJC]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=146</guid>
		<description><![CDATA[Recently when plugging two components of a high-throughput web service together, I ran into a snag. One component (a data repository) exposes an Iterator for pulling XML-formatted records out of it one by one. The other (for serving SOAP response documents) needed, ideally, something that could be wrapped in a StreamSource &#8212; i.e. an InputStream [...]]]></description>
			<content:encoded><![CDATA[<p>Recently when plugging two components of a <a title="FuncNet" href="http://funcnet.eu/">high-throughput web service</a> together, I ran into a snag. One component (a data repository) exposes an Iterator for pulling XML-formatted records out of it one by one. The other (for serving SOAP response documents) needed, ideally, something that could be wrapped in a StreamSource &#8212; i.e. an InputStream or Reader. But although these are both pull-based ways of providing (in this case) character data, they&#8217;re not compatible.</p>
<p>One easy option is to iterate over the whole Iterator and buffer the results in a String, and then use a StringReader. But that&#8217;s not terribly efficient, when you might well be dealing with XML documents in the 10-20MB range. So I wrote an IteratorReader class, which is a Reader that can be wrapped around any Iterator. Each time it&#8217;s read from, it pulls enough elements from the Iterator to enable the request to be fulfilled, and buffers any remainder. This keeps its memory usage down, although this of course depends on (a) the number of characters requested at once from its read method, and (b) the size of the elements coming off the Iterator. (Each element is simply converted into a String via its toString method before being stored in a character buffer.)</p>
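<p>To make the pull-and-buffer idea concrete, here is a deliberately simplified sketch. It is not the IteratorReader implementation: it swaps the character array for a StringBuilder, uses String.valueOf instead of toString so that null elements do not throw, and has no Closeable support. The class name SimpleIteratorReader is made up for this example.</p>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical, simplified stand-in for the idea behind IteratorReader
class SimpleIteratorReader extends Reader
{
    private final Iterator<?> iterator;

    // Characters pulled from the iterator but not yet handed to a caller
    private final StringBuilder buffer = new StringBuilder();

    SimpleIteratorReader( Iterator<?> iterator )
    {
        this.iterator = iterator;
    }

    @Override
    public synchronized int read( char[] out, int offset, int length ) throws IOException
    {
        if( length == 0 )
            return 0;
        // Pull elements until the request can be met or the iterator runs dry
        while( buffer.length() < length && iterator.hasNext() )
            buffer.append( String.valueOf( iterator.next() ) );
        if( buffer.length() == 0 )
            return -1; // nothing buffered and nothing left to pull
        // Hand over as much as was asked for and keep the remainder
        int count = Math.min( length, buffer.length() );
        buffer.getChars( 0, count, out, offset );
        buffer.delete( 0, count );
        return count;
    }

    @Override
    public void close()
    {
        // Nothing to release in this sketch
    }

    public static void main( String[] args ) throws IOException
    {
        Iterator<String> source = Arrays.asList( "alpha", "beta\n", "gamma" ).iterator();
        BufferedReader reader = new BufferedReader( new SimpleIteratorReader( source ) );
        String line;
        // Prints "alphabeta" and then "gamma"
        while( ( line = reader.readLine() ) != null )
            System.out.println( line );
    }
}
```

<p>Memory use here is bounded in the same way: at most one request's worth of characters plus the tail of the last element pulled.</p>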
<p>Surprisingly, given the vast amount of Java source out there, I couldn&#8217;t find an existing solution for this &#8212; not even in the usually comprehensive <a title="Apache Commons" href="http://commons.apache.org/">Apache Commons</a>. The code is below, and you are free to do what you like with it, but a credit would be nice if you use it, and if you come up with any improvements I&#8217;d be interested to hear about them. In particular, I&#8217;m sure it could be optimized more, as it spends a lot of time garbage collecting in its current form. It&#8217;s pretty thoroughly tested, with an ArrayList of two million random strings as the source of the Iterator, and seems to work fine both with single-character reads and a BufferedReader wrapped round it. Actually, testing taught me some very interesting lessons, but that&#8217;s another post.</p>
<p><span id="more-146"></span></p>
<p><strong>Implementation note:</strong> As well as providing the Iterator to read from, you can optionally provide an object that implements the Closeable interface. This is because in the scenario I developed this for, the Iterator in question represented a stream of objects that was being generated on-the-fly from a live database connection, and implemented Closeable as well as Iterator so the connection could be closed when necessary. I needed a way of doing this automatically from the Reader&#8217;s point of view, so when the Iterator runs out of data (hasNext returns false) the close method of the attached Closeable, if present, is called.</p>
<p>Download the file: <a href="http://biotext.org.uk/static/IteratorReader.java.v0_1">IteratorReader.java.v0_1</a></p>
<p>All comments very gratefully received.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #008000; font-style: italic; font-weight: bold;">/**
 * IteratorReader v. 0.1
 * Andrew B. Clegg
 */</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.io.Closeable</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.io.IOException</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.io.Reader</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.util.Iterator</span><span style="color: #339933;">;</span>
&nbsp;
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> IteratorReader <span style="color: #000000; font-weight: bold;">extends</span> <span style="color: #003399;">Reader</span>
<span style="color: #009900;">&#123;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">// The iterator from which we'll read</span>
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Iterator<span style="color: #339933;">&lt;?</span> <span style="color: #000000; font-weight: bold;">extends</span> Object<span style="color: #339933;">&gt;</span> _iterator<span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">// Optionally, an object to close when we're done</span>
    <span style="color: #000000; font-weight: bold;">private</span> Closeable _closeable<span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">// Buffer to hold characters pulled from iterator before they're read</span>
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> _leftoverCharsFromLastRead <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">// Flag to indicate when iterator is out of elements</span>
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000066; font-weight: bold;">boolean</span> _iteratorExhausted <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">false</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #008000; font-style: italic; font-weight: bold;">/**
     * Creates a new IteratorReader.
     * @param iterator the Iterator to read from
     */</span>
    <span style="color: #000000; font-weight: bold;">public</span> IteratorReader<span style="color: #009900;">&#40;</span> Iterator<span style="color: #339933;">&lt;?</span> <span style="color: #000000; font-weight: bold;">extends</span> Object<span style="color: #339933;">&gt;</span> iterator <span style="color: #009900;">&#41;</span>
    <span style="color: #009900;">&#123;</span>
        _iterator <span style="color: #339933;">=</span> iterator<span style="color: #339933;">;</span>
        _closeable <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #008000; font-style: italic; font-weight: bold;">/**
     * Creates a new IteratorReader whose Iterator is backed by a Closeable
     * object that must be cleanly closed when no longer needed.
     * @param iterator the Iterator to read from
     * @param closeable the Closeable object backing the Iterator
     */</span>
    <span style="color: #000000; font-weight: bold;">public</span> IteratorReader<span style="color: #009900;">&#40;</span> Iterator<span style="color: #339933;">&lt;?</span> <span style="color: #000000; font-weight: bold;">extends</span> Object<span style="color: #339933;">&gt;</span> iterator, Closeable closeable <span style="color: #009900;">&#41;</span>
    <span style="color: #009900;">&#123;</span>
        _iterator <span style="color: #339933;">=</span> iterator<span style="color: #339933;">;</span>
        _closeable <span style="color: #339933;">=</span> closeable<span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #008000; font-style: italic; font-weight: bold;">/**
     * Closes the Closeable object on which this reader's Iterator depends.
     * If there is no such Closeable, or it has already been closed, this
     * method does nothing. This method is automatically called when Iterator's
     * hasNext method returns false, but can be called earlier.
     * @throws IOException if the Closeable encounters a problem when closing
     */</span>
    @Override
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> close<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">IOException</span>
    <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> _closeable <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span> <span style="color: #009900;">&#41;</span>
        <span style="color: #009900;">&#123;</span>
            _closeable.<span style="color: #006633;">close</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            _closeable <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #008000; font-style: italic; font-weight: bold;">/**
     * Reads characters into a portion of an array. See Reader.
     * @param outBuf array to copy the characters into
     * @param outBufOffset offset at which to start storing characters
     * @param charsRequested maximum number of characters to read
     * @return the number of characters read, or -1 if the end of the iterator has been reached
     * @throws IOException if the Closeable encounters a problem when closing
     */</span>
    @Override
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">synchronized</span> <span style="color: #000066; font-weight: bold;">int</span> read<span style="color: #009900;">&#40;</span> <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> outBuf, <span style="color: #000066; font-weight: bold;">int</span> outBufOffset, <span style="color: #000066; font-weight: bold;">int</span> charsRequested <span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">IOException</span>
    <span style="color: #009900;">&#123;</span>
<span style="color: #666666; font-style: italic;">//        System.out.format( &quot;read called: outBufOffset=%d, charsRequested=%d\n&quot;, outBufOffset, charsRequested );</span>
<span style="color: #666666; font-style: italic;">//        System.out.format( &quot;current state: _leftoverCharsFromLastRead has %d characters, _iteratorExhausted=%b\n&quot;,</span>
<span style="color: #666666; font-style: italic;">//                _leftoverCharsFromLastRead.length, _iteratorExhausted );</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">// Have we already read enough characters from the iterator to feed this request?</span>
        <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> charsRequested <span style="color: #339933;">&lt;=</span> _leftoverCharsFromLastRead.<span style="color: #006633;">length</span> <span style="color: #009900;">&#41;</span>
        <span style="color: #009900;">&#123;</span>
            <span style="color: #666666; font-style: italic;">// Yes, we already have enough characters, copy them into output buffer</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">arraycopy</span><span style="color: #009900;">&#40;</span> _leftoverCharsFromLastRead, <span style="color: #cc66cc;">0</span>, outBuf, outBufOffset, charsRequested <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #666666; font-style: italic;">// Are there any left over?</span>
            <span style="color: #000066; font-weight: bold;">int</span> remainder <span style="color: #339933;">=</span> _leftoverCharsFromLastRead.<span style="color: #006633;">length</span> <span style="color: #339933;">-</span> charsRequested<span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">assert</span><span style="color: #009900;">&#40;</span> remainder <span style="color: #339933;">&gt;=</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> remainder <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#41;</span>
            <span style="color: #009900;">&#123;</span>
                <span style="color: #666666; font-style: italic;">// Copy remaining characters to new buffer (i.e. shrink buffer)</span>
                <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> tempBuf <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span> remainder <span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
                <span style="color: #003399;">System</span>.<span style="color: #006633;">arraycopy</span><span style="color: #009900;">&#40;</span> _leftoverCharsFromLastRead, charsRequested, tempBuf, <span style="color: #cc66cc;">0</span>, remainder <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                _leftoverCharsFromLastRead <span style="color: #339933;">=</span> tempBuf<span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
            <span style="color: #000000; font-weight: bold;">else</span>
            <span style="color: #009900;">&#123;</span>
                <span style="color: #666666; font-style: italic;">// None left over, so reset buffer to zero-length</span>
                _leftoverCharsFromLastRead <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
            <span style="color: #666666; font-style: italic;">// Return the number of characters read</span>
            <span style="color: #666666; font-style: italic;">// (in this case, all the characters requested)</span>
            <span style="color: #000000; font-weight: bold;">return</span> charsRequested<span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #000000; font-weight: bold;">else</span>
        <span style="color: #009900;">&#123;</span>
            <span style="color: #666666; font-style: italic;">// We have been asked for more characters than we currently have, so we</span>
            <span style="color: #666666; font-style: italic;">// can return what we have (if there are no more in the iterator) or</span>
            <span style="color: #666666; font-style: italic;">// try to acquire more from the iterator</span>
&nbsp;
            <span style="color: #666666; font-style: italic;">// If iterator is exhausted and read has been called again, clean up and</span>
            <span style="color: #666666; font-style: italic;">// return straight away, after copying as many characters as we have left</span>
            <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> _iteratorExhausted <span style="color: #009900;">&#41;</span>
            <span style="color: #009900;">&#123;</span>
                <span style="color: #000066; font-weight: bold;">int</span> charsAvailable <span style="color: #339933;">=</span> _leftoverCharsFromLastRead.<span style="color: #006633;">length</span><span style="color: #339933;">;</span>
                <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> charsAvailable <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#41;</span>
                <span style="color: #009900;">&#123;</span>
                    <span style="color: #666666; font-style: italic;">// Nothing in the iterator or the buffer, we're done</span>
                    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
                <span style="color: #000000; font-weight: bold;">else</span>
                <span style="color: #009900;">&#123;</span>
                    <span style="color: #666666; font-style: italic;">// Copy what we have into output buffer</span>
                    <span style="color: #003399;">System</span>.<span style="color: #006633;">arraycopy</span><span style="color: #009900;">&#40;</span> _leftoverCharsFromLastRead, <span style="color: #cc66cc;">0</span>, outBuf, outBufOffset, charsAvailable <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #666666; font-style: italic;">// Clean up our own buffer and return number of characters copied</span>
                    _leftoverCharsFromLastRead <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
                    <span style="color: #000000; font-weight: bold;">return</span> charsAvailable<span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
            <span style="color: #009900;">&#125;</span>
            <span style="color: #000000; font-weight: bold;">else</span>
            <span style="color: #009900;">&#123;</span>
                <span style="color: #666666; font-style: italic;">// There's still data in the iterator, so we can attempt to satisfy the whole request</span>
                <span style="color: #666666; font-style: italic;">// by doing another read -- open a stringbuilder of the desired length</span>
                StringBuilder sb <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> StringBuilder<span style="color: #009900;">&#40;</span> charsRequested <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #666666; font-style: italic;">// Insert however many characters we do have and reset our buffer to zero-length</span>
                <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> _leftoverCharsFromLastRead.<span style="color: #006633;">length</span> <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#41;</span>
                <span style="color: #009900;">&#123;</span>
                    sb.<span style="color: #006633;">append</span><span style="color: #009900;">&#40;</span> _leftoverCharsFromLastRead <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    _leftoverCharsFromLastRead <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
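                <span style="color: #666666; font-style: italic;">// The leftover buffer was reset above, so measure what sb already holds</span>
                <span style="color: #000066; font-weight: bold;">int</span> charsStillRequired <span style="color: #339933;">=</span> charsRequested <span style="color: #339933;">-</span> sb.<span style="color: #006633;">length</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>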
                <span style="color: #666666; font-style: italic;">// Iteratively add new strings until no more characters are required</span>
                <span style="color: #000000; font-weight: bold;">while</span><span style="color: #009900;">&#40;</span> charsStillRequired <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">0</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #339933;">!</span>_iteratorExhausted <span style="color: #009900;">&#41;</span>
                <span style="color: #009900;">&#123;</span>
                    <span style="color: #666666; font-style: italic;">// Read another string from the underlying iterator</span>
                    <span style="color: #003399;">String</span> string <span style="color: #339933;">=</span> nextString<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #666666; font-style: italic;">// Append it to the StringBuilder</span>
                    sb.<span style="color: #006633;">append</span><span style="color: #009900;">&#40;</span> string <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #666666; font-style: italic;">// Adjust number still required</span>
                    charsStillRequired <span style="color: #339933;">=</span> charsStillRequired <span style="color: #339933;">-</span> string.<span style="color: #006633;">length</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
                <span style="color: #666666; font-style: italic;">// Did we read to the end of the iterator?</span>
                <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> _iteratorExhausted <span style="color: #009900;">&#41;</span>
                <span style="color: #009900;">&#123;</span>
                    <span style="color: #666666; font-style: italic;">// We have read all the strings from the iterator, but can only return</span>
                    <span style="color: #666666; font-style: italic;">// as many characters as we managed to read, or as many as were requested,</span>
                    <span style="color: #666666; font-style: italic;">// whichever is lower</span>
                    <span style="color: #000066; font-weight: bold;">int</span> charsObtained <span style="color: #339933;">=</span> sb.<span style="color: #006633;">length</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> tempBuf <span style="color: #339933;">=</span> sb.<span style="color: #006633;">toString</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">toCharArray</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #666666; font-style: italic;">// charsToReturn is the number of chars requested, or obtained, whichever is lower</span>
                    <span style="color: #000066; font-weight: bold;">int</span> charsToReturn <span style="color: #339933;">=</span> <span style="color: #003399;">Math</span>.<span style="color: #006633;">min</span><span style="color: #009900;">&#40;</span> charsRequested, charsObtained <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #666666; font-style: italic;">// Copy this many characters into output buffer</span>
                    <span style="color: #003399;">System</span>.<span style="color: #006633;">arraycopy</span><span style="color: #009900;">&#40;</span> tempBuf, <span style="color: #cc66cc;">0</span>, outBuf, outBufOffset, charsToReturn <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #666666; font-style: italic;">// Do we have any left over in our buffer?</span>
                    <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> charsObtained <span style="color: #339933;">&gt;</span> charsRequested <span style="color: #009900;">&#41;</span>
                    <span style="color: #009900;">&#123;</span>
                        <span style="color: #666666; font-style: italic;">// Yes -- more obtained than requested -- save them for next request</span>
                        <span style="color: #000066; font-weight: bold;">int</span> charsToSave <span style="color: #339933;">=</span> charsObtained <span style="color: #339933;">-</span> charsRequested<span style="color: #339933;">;</span>
                        <span style="color: #000000; font-weight: bold;">assert</span><span style="color: #009900;">&#40;</span> charsToSave <span style="color: #339933;">+</span> charsToReturn <span style="color: #339933;">==</span> tempBuf.<span style="color: #006633;">length</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                        _leftoverCharsFromLastRead <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span> charsToSave <span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
                        <span style="color: #003399;">System</span>.<span style="color: #006633;">arraycopy</span><span style="color: #009900;">&#40;</span> tempBuf, charsToReturn, _leftoverCharsFromLastRead, <span style="color: #cc66cc;">0</span>, charsToSave <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #009900;">&#125;</span>
                    <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> charsObtained <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#41;</span>
                    <span style="color: #009900;">&#123;</span>
                        <span style="color: #666666; font-style: italic;">// No characters left in buffer or iterator; return -1 immediately</span>
                        _leftoverCharsFromLastRead <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">char</span><span style="color: #009900;">&#91;</span> <span style="color: #cc66cc;">0</span> <span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
                        <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
                    <span style="color: #009900;">&#125;</span>
                    <span style="color: #000000; font-weight: bold;">else</span>
                    <span style="color: #009900;">&#123;</span>
                        <span style="color: #666666; font-style: italic;">// We obtained at least one character, so return the number</span>
                        <span style="color: #666666; font-style: italic;">// copied into the output buffer this time</span>
                        <span style="color: #000000; font-weight: bold;">return</span> charsToReturn<span style="color: #339933;">;</span>
                    <span style="color: #009900;">&#125;</span>
                <span style="color: #009900;">&#125;</span>
                <span style="color: #000000; font-weight: bold;">else</span>
                <span style="color: #009900;">&#123;</span>
                    <span style="color: #666666; font-style: italic;">// sb now contains text to return, and there are more strings to iterate through.</span>
                    <span style="color: #666666; font-style: italic;">// We can save a bit of effort by putting the entire contents of sb into</span>
                    <span style="color: #666666; font-style: italic;">// our 'leftover' characters buffer, and calling this method again to copy it over</span>
                    _leftoverCharsFromLastRead <span style="color: #339933;">=</span> sb.<span style="color: #006633;">toString</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">toCharArray</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #000000; font-weight: bold;">return</span> read<span style="color: #009900;">&#40;</span> outBuf, outBufOffset, charsRequested <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
            <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #003399;">String</span> nextString<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">IOException</span>
    <span style="color: #009900;">&#123;</span>
        <span style="color: #666666; font-style: italic;">// This should never get called after _iteratorExhausted has been set</span>
        <span style="color: #000000; font-weight: bold;">assert</span><span style="color: #009900;">&#40;</span> <span style="color: #339933;">!</span>_iteratorExhausted <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span> _iterator.<span style="color: #006633;">hasNext</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span>
        <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">return</span> _iterator.<span style="color: #006633;">next</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">toString</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #000000; font-weight: bold;">else</span>
        <span style="color: #009900;">&#123;</span>
            _iteratorExhausted <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">true</span><span style="color: #339933;">;</span>
            close<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #0000ff;">&quot;&quot;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009900;">&#125;</span></pre></div></div>

]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Java web service frameworks &#8212; a brief survey</title>
		<link>http://biotext.org.uk/java-web-service-frameworks-a-brief-survey/</link>
		<comments>http://biotext.org.uk/java-web-service-frameworks-a-brief-survey/#comments</comments>
		<pubDate>Sat, 27 Dec 2008 16:23:32 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[axis2]]></category>
		<category><![CDATA[cxf]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[jax-ws]]></category>
		<category><![CDATA[jboss]]></category>
		<category><![CDATA[metro]]></category>
		<category><![CDATA[soaplab]]></category>
		<category><![CDATA[webservices]]></category>

		<guid isPermaLink="false">http://biotext.org.uk/?p=64</guid>
		<description><![CDATA[One of the current ongoing R&#38;D projects at CATH is the development of a library of web services for both internal and external use. The first step in this process was the selection of a web service toolkit (or &#8216;stack&#8217;) to use, as there are several competing (and often mutually incompatible) frameworks available. This page [...]]]></description>
			<content:encoded><![CDATA[<p>One of the current ongoing R&amp;D projects at <a title="CATH database" href="http://cathdb.info">CATH</a> is the development of a library of web services for both internal and external use. The first step in this process was the selection of a web service toolkit (or &#8216;stack&#8217;) to use, as there are several competing (and often mutually incompatible) frameworks available. This page compares some of them in brief based on some informal evaluations and background reading.</p>
<p>Note that I am far from being an expert in this field, so feel free to leave a comment if there&#8217;s anything here you disagree with. I am, as ever, happy to be corrected.</p>
<p>Also, please don&#8217;t get the impression that this is a methodical side-by-side comparison. I would have liked to implement the exact same non-trivial service in each stack, and report systematically on my experiences with each, but in reality I just didn&#8217;t have time to do it that rigorously. What I present here is the result of going through one or two <a title="Glen Mazza's WSDL-first tutorial for CXF and Metro" href="http://www.jroller.com/gmazza/date/20080417">tutorials</a> (mostly of the Hello World/echo-a-string/double-a-number kind), reading the docs, hanging out on forums and garnering opinion from blog posts and articles. And the amount of time I spent with each toolkit is admittedly variable.</p>
<p>I hope it will be useful to someone, despite the caveats.</p>
<p>Andrew.</p>
<p><span id="more-64"></span></p>
<h3><a name="background">Background and assumptions</a></h3>
<p>All of these toolkits are primarily for the development of <a href="http://en.wikipedia.org/wiki/SOAP">SOAP</a> web services, although some provide facilities for other paradigms too (e.g. <a href="http://en.wikipedia.org/wiki/REST">REST</a> or <a href="http://en.wikipedia.org/wiki/POX">POX</a>). In addition, they were evaluated from the point of view of top-down (contract-first) design, which is when a <a href="http://en.wikipedia.org/wiki/Web_Services_Description_Language">WSDL</a> file and <a href="http://en.wikipedia.org/wiki/XML_Schema">XML Schema</a> are drawn up first, containing a description of the services available and their input-output formats. This is then used to generate code (more or less automatically) to handle the requests and responses at the client and/or server ends.</p>
<p>The opposite model, bottom-up (code-first), is very popular in the business world, as when the person writing the clients to access your services is in the same organization as you, internal standards and practices can be enforced at the code level and the actual service definitions can be a bit of an afterthought. Indeed it is relatively common for service or client programmers in this context to never even look at the generated WSDLs. However, there are <a href="http://static.springframework.org/spring-ws/sites/1.5/reference/html/why-contract-first.html">good arguments</a> for contract-first design, and all the more so when (a) you have to design services compliant with externally-imposed standards and (b) the clients accessing your services may be written using radically different toolkits, design philosophies or programming languages to your own. Much of the literature on the web relates to bottom-up design, and some of the claims made about different toolkits are only true of bottom-up projects, so <em>caveat lector</em>.</p>
<p>Also, there are two main competing service definition paradigms for web services, rpc/encoded and document/literal. Of these, the former is now in widespread use only in <a href="http://biomoby.open-bio.org/">MOBY</a> and in the Perl world, while the latter has become a de facto standard elsewhere, particularly in its &#8216;wrapped&#8217; dialect. (The details of the differences are beyond the scope of this introduction but there&#8217;s a good technical article about them <a href="http://www.ibm.com/developerworks/webservices/library/ws-whichwsdl/">here</a>.) Document/literal is now becoming an &#8216;official&#8217; standard too with its adoption by the <a href="http://www.ws-i.org/">Web Services Interoperability</a> consortium. Therefore, all the frameworks below were assessed in the context of producing document/literal wrapped services.</p>
<p>There is a detailed (if slightly out-of-date) five-way interview with the lead developers/architects of Axis2, Metro, JBossWS, CXF and Spring-WS <a href="http://www.infoq.com/articles/os-ws-stacks-background">here</a>. It discusses the design philosophies and architectures of these five frameworks, although it doesn&#8217;t cover subjective aspects like ease-of-use.</p>
<h3><a name="axis2">Axis2</a></h3>
<p>This is an Apache project and largely unrelated to its predecessor Axis. It is a general-purpose implementation of SOAP web services which predates the now-standard JAX-WS model (see below). See: <a href="http://ws.apache.org/axis2/">http://ws.apache.org/axis2/</a></p>
<p>Pros:</p>
<ul>
<li>Broad choice of databinding and deployment models</li>
<li>Good community/developer support</li>
<li>Code generation tools produce skeleton implementation classes as well as just interfaces</li>
<li>REST/POX support requires little extra work</li>
</ul>
<p>Cons:</p>
<ul>
<li>Broad choice of databinding and deployment models (!)</li>
<li>Generated code seems to come in for criticism on the net</li>
<li>JAX-WS/JAXB-style services supported but still a bit crude; tools only produce interfaces, not skeleton classes</li>
<li>Error diagnosis can be awkward, requiring inspection of server logs</li>
</ul>
<h3><a name="metro">JAX-WS RI (Metro)</a></h3>
<p>Sun&#8217;s &#8216;Reference Implementation&#8217; of the <a href="http://en.wikipedia.org/wiki/JAX-WS">JAX-WS</a> standard, bundled alongside the <a href="https://glassfish.dev.java.net/">GlassFish</a> application server as part of the Java Enterprise Edition SDK. See: <a href="https://jax-ws.dev.java.net/">https://jax-ws.dev.java.net/</a></p>
<p>Pros:</p>
<ul>
<li>Sun are pushing it heavily = likely to be widespread adoption and good tool support</li>
<li>JAXB databinding and generated code are elegant and easy to use</li>
<li>Some of our <a href="http://www.enfin.org/">collaborators</a> use it already</li>
</ul>
<p>Cons:</p>
<ul>
<li>The code generation tools require you to do a lot more work for yourself than Axis2&#8217;s, e.g. writing the implementation classes from scratch rather than from auto-generated skeletons</li>
<li>The server/service configuration files are quite fiddly</li>
<li>Together these mean that adding new services or changing the interface of existing ones is a bit of a hassle</li>
<li>Error diagnosis can be awkward, requiring inspection of server logs</li>
</ul>
<h3><a name="jboss">JBossWS</a></h3>
<p>This is an implementation of the JAX-WS standard by <a href="http://www.jboss.com/">Red Hat</a> and the <a href="http://www.jboss.org">JBoss community</a>. It seems to work in a very similar way to Metro, with similar advantages and the same major disadvantage. See: <a href="http://www.jboss.org/jbossws/">http://www.jboss.org/jbossws/</a></p>
<p>I have to admit, I didn&#8217;t actually go much beyond installing this and reading the docs. If anyone has any practical experiences to share, please add some comments below! Thanks.</p>
<h3><a name="cxf">CXF</a></h3>
<p>CXF was formed from the merger of two older projects, Celtix and XFire, and has now been adopted by the Apache Software Foundation. Yes, this does mean that there are two entirely separate Apache web service platforms. Such is the bountiful joy of open source.</p>
<p>Pros:</p>
<ul>
<li>Uses JAX-WS/JAXB natively, so compatibility and tool support shouldn&#8217;t be an issue</li>
<li>Generates skeleton implementation classes as well as just interfaces</li>
<li>Automatic generation of JavaScript client code, for testing or AJAX interfaces</li>
<li>More meaningful error messages than Metro or Axis2</li>
<li>Good community/developer support</li>
<li>Adding new services/operations or modifying existing ones is pretty intuitive</li>
<li>Support for REST via <a href="https://jsr311.dev.java.net/">JAX-RS</a> standard, and various other communication methods like <a href="http://java.sun.com/products/jms/">JMS</a></li>
</ul>
<p>Cons:</p>
<ul>
<li>Configuration can seem <em>horribly</em> complex &#8212; there&#8217;s almost too much flexibility</li>
<li>Non-Spring-programmers may be intimidated by the Spring-based architecture (also see Spring-WS below)</li>
</ul>
<h3><a name="spring">Spring-WS</a></h3>
<p>Spring Web Services is (obviously) the web services component of the Spring Framework, a popular application development framework that also provides useful facilities like configuration, transaction management, object-relational mapping, database abstraction, logging etc. See: <a href="http://www.springframework.org/">http://www.springframework.org/</a></p>
<p>Pros:</p>
<ul>
<li>Integration with the rest of Spring could be useful (although CXF is also designed to facilitate this)</li>
<li>Very firm (actually exclusive) focus on contract-first design</li>
<li>Tool for constructing WSDL automatically from XSD (<em>really</em> contract-first)</li>
<li>User can tie in support for various databinding technologies</li>
</ul>
<p>Cons:</p>
<ul>
<li>Young compared to the other frameworks</li>
<li>Additional learning curve for non-Spring-programmers</li>
<li>No tools for generating Java classes/interfaces automatically from XSD/WSDL = more trivial hand-coding</li>
<li>Document/message-centric model at odds with operation-centric (RPC-like) model used by most other frameworks &#8212; you get a blob of XML and can do with it what you want</li>
</ul>
<p>N.B. The last of these disadvantages is actually, if you believe the Spring-WS team, an advantage after all. This is quite a persuasive argument, since you can plug in any of several unmarshalling (i.e. databinding) steps if you like, or just pull bits out of the XML tree with a parser (again there are several available). Your implementation then won&#8217;t be tied to any particular databinding object model, freeing you from the danger of ending up with incorrectly-generated classes or XSD structures that can&#8217;t be well represented in Java. Also in many cases you ought to be able to expect better performance this way, e.g. by using forward-only streaming. But it makes the endpoint classes more verbose and slower to create, which would be an issue if you&#8217;re providing a lot of services.</p>
<p>See the Epilogue below for more on this way of working.</p>
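<p>As a rough illustration of the &#8216;pull bits out of the XML with a parser&#8217; style described above, here is a minimal sketch using the JDK&#8217;s forward-only StAX reader. It is not taken from Spring-WS; the class name <code>PayloadScanner</code>, the method <code>textOf</code> and the sample element names are all made up for the example.</p>

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example: class, method and element names are illustrative only.
final class PayloadScanner {

    // Collect the text of every element with the given local name, reading
    // the payload once from front to back without building an object tree.
    static List<String> textOf(String xml, String localName) {
        List<String> values = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && localName.equals(reader.getLocalName())) {
                        // Assumes the matched element contains text only
                        values.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed payload", e);
        }
        return values;
    }

    public static void main(String[] args) {
        String payload =
            "<lookup xmlns=\"urn:example\"><id>1abc</id><id>2xyz</id></lookup>";
        System.out.println(textOf(payload, "id"));
    }
}
```

<p>Nothing here depends on generated classes, which is the point: the endpoint decides which parts of the message it cares about, at the cost of writing this kind of plumbing by hand for each service.</p>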
<h3><a name="soaplab">Soaplab2</a></h3>
<p>Largely developed at the <a href="http://www.ebi.ac.uk/">EBI</a>, and replacing the earlier Soaplab, Soaplab2 is a toolkit to enable wrapping of existing software in web service front-ends with the minimum of actual coding. See: <a href="http://soaplab.sourceforge.net/soaplab2/">http://soaplab.sourceforge.net/soaplab2/</a></p>
<p>Pros:</p>
<ul>
<li>Consistent access method across different back-end operations</li>
<li>Server-side asynchronous functionality like status polling, job cancellation etc. is built in</li>
<li>Provides web and command-line clients</li>
<li>Creation of web services wrappers around existing programs is swift and usually doesn&#8217;t require any hand-coding</li>
</ul>
<p>Cons:</p>
<ul>
<li>All Soaplab2 services are designed contract-first, but in <a href="http://soaplab.sourceforge.net/soaplab2/MetadataGuide.html">ACD</a>, not WSDL/XSD</li>
<li>Doesn&#8217;t use XML Schema to describe input and output types (not in any useful sense)</li>
<li>WSDLs contain no details of the actual underlying programs or their data requirements</li>
</ul>
<p>Although it&#8217;s a powerful tool, Soaplab2&#8217;s approach to SOAP/WSDL is extremely unusual and seems, in a sense, to miss the point. The WSDLs and XSDs it produces don&#8217;t even attempt to describe the underlying processes &#8212; it just provides generically-named operations like <code>createJob</code>, <code>run</code>, <code>destroy</code>, <code>getSomeResults</code> etc. where the actual &#8216;service&#8217; required (known as an &#8216;analysis&#8217;) is one of the parameters. Instead of listing these services in the WSDL and providing distinct SOAP endpoints to access them, it provides a list of available analyses via another web service. Elements tend to have non-descriptive names like <code>arg0</code>, <code>arg1</code> etc. and parameters are generally passed in name-value pairs rather than proper typed data structures.</p>
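<p>To make the contrast concrete, here is a small Java sketch of the two contract styles. Every name in it is hypothetical (neither interface is Soaplab2&#8217;s real API, and the analysis and parameter names are invented); it only shows where the knowledge about parameters ends up.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// All names below are hypothetical; this is not Soaplab2's actual API.

// Generic style: one operation serves every analysis, so the signature
// says nothing about which parameters a particular analysis expects.
interface GenericJobService {
    String createJob(String analysisName, Map<String, String> nameValuePairs);
}

// Typed style: the operation and its inputs are visible in the contract.
interface SequenceSearchService {
    String search(String proteinSequence, int maxHits);
}

// A client-side adapter: the parameter names and types the generic
// contract omits have to be hard-coded here instead.
final class TypedOverGeneric implements SequenceSearchService {
    private final GenericJobService generic;

    TypedOverGeneric(GenericJobService generic) {
        this.generic = generic;
    }

    @Override
    public String search(String proteinSequence, int maxHits) {
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put("sequence", proteinSequence);
        pairs.put("max_hits", Integer.toString(maxHits));
        return generic.createJob("sequence_search", pairs);
    }

    public static void main(String[] args) {
        // Stand-in back end that just echoes what it was asked to run
        GenericJobService echo = (analysis, params) -> analysis + params;
        System.out.println(new TypedOverGeneric(echo).search("MKV", 5));
    }
}
```

<p>With a typed contract, a tool can discover <code>search</code> and its argument types from the service description alone; with the generic one, that information lives in documentation or in adapters like the one above.</p>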
<p>EDIT (13 Jan 2009): Apparently the Soaplab team have recently started working on &#8216;typed&#8217; WSDLs (i.e. WSDLs as most other people use them) which may render some of these objections obsolete. However, I don&#8217;t think they&#8217;ve actually released a new version yet which includes these.</p>
<h3><a name="others">Others?</a></h3>
<p>These are all of the open-source options that I&#8217;m currently aware of. There are some commercial options around, but even some of these use open-source products under the hood (e.g. IBM WebSphere which is apparently Axis2-based).</p>
<p>Drop a comment if you know any different.</p>
<h3><a name="conclusions">Conclusions</a></h3>
<p>It is a shame that Soaplab2 web services have such a unique and quirky architecture, since its overall aim &#8212; enabling easy access to existing bioinformatics resources and tools &#8212; covers 90% of what we need web services for at CATH. However, taking the Soaplab2 option would enforce a certain amount of &#8216;vendor&#8217; lock-in, since only Soaplab2-aware tools would be able to get the best out of the resulting services. For other clients, it would lead to an extra level of abstraction between the service consumer and the service. Rather than simply looking for suitable WSDL-based service descriptions in a registry, one would have to find every Soaplab2-based service and then query each one in order to find out what services it actually offered. The consequences for registry architecture of having some services fully described by their WSDLs and others with definitions hidden in Soaplab2 responses are not clear, but they can&#8217;t be good, and the same goes for consistent test suites and automatic service level monitoring.</p>
<p>It is also unclear how Soaplab2-based services would allow for semantic markup of the sort provided by methods like <a href="http://www.w3.org/TR/sawsdl/">SAWSDL</a>, since the same SOAP operation (e.g. <code>createJob</code>) could take parameters with any one of many different semantic types &#8212; sequence, accession number, structure, <a href="http://www.geneontology.org/">GO term</a>, whatever &#8212; depending on what actual underlying process was being invoked. This would seriously reduce the potential for interoperability, automated service discovery and semantic data mining that could be provided by properly-specified services. One of the Soaplab2 developers claimed in June 2008 on the embrace-tech mailing list that &#8216;typed&#8217; services were on the to-do list and would be ready in a matter of “weeks or months”, but this is apparently still a work-in-progress six months later, and of course we cannot be certain what form these will take when they do appear. EDIT (13 Jan 2009): This is apparently now being worked on, see above.</p>
<p>Looking instead at general-purpose web service stacks, I would consider Spring-WS if we already possessed in-house Spring expertise, since its focus on contract-first development is a good match for our requirements. Furthermore, the Spring framework is very popular with Java developers and seems to provide useful solutions to a lot of common [web] application development problems. The lack of class generation tools could be problematic but is probably solvable with Ant tasks or even Perl scripts. However, the exhaustive feature list and broad compatibility of CXF, coupled with its good reputation on blogs and discussion fora, make it the best candidate available. It also offers Spring integration itself, so should we wish to explore the capabilities of Spring later on, we will still have that option.</p>
<h3><a name="epilogue">Epilogue</a></h3>
<p>Having eventually gone for CXF, I quickly discovered that for performance reasons, <a href="http://funcnet.eu/">our first Java web services project</a> would need to use raw XML messages, rather than databinding to JAXB objects. Thankfully, it&#8217;s pretty straightforward to <a href="http://cwiki.apache.org/CXF20DOC/provider-services.html">bypass the unmarshalling steps</a> and just treat a SOAP message as raw XML, exactly like Spring-WS does. This is very useful if you want to produce high-throughput web services that process a large amount of data. Several of the services in the project use Oracle to output XML directly (up to tens of thousands of rows at a time) which can be injected into the body of a SOAP message by CXF, using a &#8216;payload provider&#8217; service. Fast and simple.</p>
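<p>For illustration, here is a minimal sketch of the payload-provider idea; it is not the project&#8217;s actual code. A real CXF or JAX-WS service would implement <code>javax.xml.ws.Provider&lt;Source&gt;</code> and carry the <code>@WebServiceProvider</code> and <code>@ServiceMode(Service.Mode.PAYLOAD)</code> annotations. To keep the example runnable with the JDK alone, a hypothetical <code>PayloadHandler</code> interface stands in for <code>Provider</code>; the class names and sample XML are likewise assumptions.</p>

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class PayloadProviderSketch {

    /**
     * Hypothetical stand-in for javax.xml.ws.Provider parameterised with Source,
     * so that this sketch compiles without a JAX-WS dependency.
     */
    interface PayloadHandler {
        Source invoke(Source requestPayload);
    }

    /** Returns pre-rendered XML as the response payload; no JAXB objects are created. */
    static class RawXmlHandler implements PayloadHandler {
        // Assumed to be XML produced elsewhere, e.g. directly by the database.
        private final String preRenderedXml;

        RawXmlHandler(String preRenderedXml) {
            this.preRenderedXml = preRenderedXml;
        }

        @Override
        public Source invoke(Source requestPayload) {
            // The request is ignored here; a real service would read it first.
            return new StreamSource(new StringReader(preRenderedXml));
        }
    }

    /** Serialises a Source to a string with an identity transform, omitting the XML declaration. */
    static String toXml(Source source) throws TransformerException {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        PayloadHandler handler = new RawXmlHandler("<rows><row id=\"1\"/></rows>");
        Source request = new StreamSource(new StringReader("<query/>"));
        // In a real provider service the framework would place this payload in the SOAP body.
        System.out.println(toXml(handler.invoke(request)));
    }
}
```

<p>The point of the design is that the handler only ever sees and returns a <code>Source</code>, so large result sets can be passed through as XML without being unmarshalled into objects first.</p>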
<p>I believe Metro also supports provider services, as well as the corresponding &#8216;dispatch&#8217; clients for sending raw chunks of XML, as these are part of the JAX-WS spec, so this solution could also have been implemented with Metro. That said, having got my head around CXF&#8217;s complexity, I am still highly impressed by its flexibility, as well as by the quality of feedback from its developers and community, so I would have no qualms about recommending it again.</p>
]]></content:encoded>
			<wfw:commentRss>http://biotext.org.uk/java-web-service-frameworks-a-brief-survey/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

