<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: IteratorReader &#8212; streaming character data from an iterator</title>
	<atom:link href="http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/feed/" rel="self" type="application/rss+xml" />
	<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/</link>
	<description>Not a typewriter</description>
	<lastBuildDate>Wed, 29 Feb 2012 22:19:56 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: Andrew</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-149</link>
		<dc:creator>Andrew</dc:creator>
		<pubDate>Mon, 09 Nov 2009 11:56:01 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-149</guid>
		<description>Generally the way Providers work is that one class (implementing Provider) implements all the operations exposed by one web service, all via a single invoke() method. So if the service exposes multiple operations, the invoke() method has to look inside the XML payload to determine what operation to carry out.

In the case of document/literal services, this means looking at the name of the outermost XML element in the payload, as this corresponds to the name of the operation. So your invoke method would have to look at this element and see if it was called GetTaxonomyFromNCBITaxonId or GetTaxonomyFromPatricTaxonId and take whichever action was appropriate.

Some further documentation:

http://cwiki.apache.org/CXF20DOC/provider-services.html

http://java.sun.com/mailers/techtips/enterprise/2006/TechTips_July06.html

For the implementation class, you use @WebServiceProvider instead of @WebService. That&#039;s all covered in the CXF docs. I don&#039;t know how to deploy them to JBoss Portal though, sorry.</description>
		<content:encoded><![CDATA[<p>Generally the way Providers work is that one class (implementing Provider) implements all the operations exposed by one web service, all via a single invoke() method. So if the service exposes multiple operations, the invoke() method has to look inside the XML payload to determine what operation to carry out.</p>
<p>In the case of document/literal services, this means looking at the name of the outermost XML element in the payload, as this corresponds to the name of the operation. So your invoke method would have to look at this element and see if it was called GetTaxonomyFromNCBITaxonId or GetTaxonomyFromPatricTaxonId and take whichever action was appropriate.</p>
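<p>A minimal sketch of that dispatch step, assuming the payload is available as a String. A real Provider would receive a Source from the framework; the class name, method names and returned strings here are hypothetical placeholders, and only the JDK's StAX API is used, so nothing below is CXF's own API.</p>

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of CXF or JAX-WS.
final class OperationDispatchSketch {

    /** Returns the local name of the outermost element in the payload. */
    static String outermostElementName(Reader payload) {
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(payload);
            try {
                // Skips whitespace, comments and processing instructions
                // until the first start tag is reached.
                xml.nextTag();
                return xml.getLocalName();
            } finally {
                xml.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Payload is not well-formed XML", e);
        }
    }

    /** Chooses an action from the operation name; the results are placeholders. */
    static String dispatch(String xmlPayload) {
        String operation = outermostElementName(new StringReader(xmlPayload));
        switch (operation) {
            case "GetTaxonomyFromNCBITaxonId":
                return "lookup by NCBI taxon id";
            case "GetTaxonomyFromPatricTaxonId":
                return "lookup by PATRIC taxon id";
            default:
                throw new IllegalArgumentException("Unknown operation: " + operation);
        }
    }
}
```

<p>Inside invoke(), the placeholder strings would be replaced by calls that build and return the response Source for each operation.</p>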
<p>Some further documentation:</p>
<p><a href="http://cwiki.apache.org/CXF20DOC/provider-services.html" rel="nofollow">http://cwiki.apache.org/CXF20DOC/provider-services.html</a></p>
<p><a href="http://java.sun.com/mailers/techtips/enterprise/2006/TechTips_July06.html" rel="nofollow">http://java.sun.com/mailers/techtips/enterprise/2006/TechTips_July06.html</a></p>
<p>For the implementation class, you use @WebServiceProvider instead of @WebService. That&#8217;s all covered in the CXF docs. I don&#8217;t know how to deploy them to JBoss Portal though, sorry.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tian Xue</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-148</link>
		<dc:creator>Tian Xue</dc:creator>
		<pubDate>Thu, 05 Nov 2009 20:18:49 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-148</guid>
		<description>Dear Andrew,

Thanks again for all your help. I still have a problem. I searched the web for more information on how to use Provider, but I cannot find a suitable example. I created a regular web service with the interface below:
@WebService
@XmlSeeAlso({Object[].class, java.lang.Object.class})
public interface PatricDatabaseService {
    public String getTaxonomyFromNCBITaxonId(@WebParam(name=&quot;ncbiTaxonId&quot;) String ncbiTaxonId);
    public String getTaxonomyFromPatricTaxonId(@WebParam(name=&quot;patricTaxonId&quot;) String patricTaxonId);
}

This web service is deployed on JBoss Portal, and I have a web service client that invokes it, passing for example 8333 and getting a string back. To test StreamSource, I created a StreamSourceProvider service based on the provider code you sent me. I don&#039;t know how to invoke the Provider service to pass information in and get a response out. Could you please give me some information or sample code on this?

Thanks in advance.

Tian</description>
		<content:encoded><![CDATA[<p>Dear Andrew,</p>
<p>Thanks again for all your help. I still have a problem. I searched the web for more information on how to use Provider, but I cannot find a suitable example. I created a regular web service with the interface below:<br />
@WebService<br />
@XmlSeeAlso({Object[].class, java.lang.Object.class})<br />
public interface PatricDatabaseService {<br />
public String getTaxonomyFromNCBITaxonId(@WebParam(name=&quot;ncbiTaxonId&quot;) String ncbiTaxonId);<br />
public String getTaxonomyFromPatricTaxonId(@WebParam(name=&quot;patricTaxonId&quot;) String patricTaxonId);<br />
}</p>
<p>This web service is deployed on JBoss Portal, and I have a web service client that invokes it, passing for example 8333 and getting a string back. To test StreamSource, I created a StreamSourceProvider service based on the provider code you sent me. I don&#8217;t know how to invoke the Provider service to pass information in and get a response out. Could you please give me some information or sample code on this?</p>
<p>Thanks in advance.</p>
<p>Tian</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tian Xue</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-147</link>
		<dc:creator>Tian Xue</dc:creator>
		<pubDate>Wed, 04 Nov 2009 17:21:19 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-147</guid>
		<description>Hi, Andrew,

Thank you so much for the help. I will look into the sample code, try to adapt my DAO to it, and run a test. I will let you know the result.

Thanks again.</description>
		<content:encoded><![CDATA[<p>Hi, Andrew,</p>
<p>Thank you so much for the help. I will look into the sample code, try to adapt my DAO to it, and run a test. I will let you know the result.</p>
<p>Thanks again.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andrew</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-146</link>
		<dc:creator>Andrew</dc:creator>
		<pubDate>Wed, 04 Nov 2009 11:48:44 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-146</guid>
		<description>Hi Tian,

Have a look at &lt;a href=&quot;/static/BioMinerPredictorProvider.java&quot; rel=&quot;nofollow&quot;&gt;BioMinerPredictorProvider.java&lt;/a&gt; ...

This invokes a DAO object which queries a database for XML directly, returning each matching row from the DB as a String containing the XML pre-formatted for returning in the web service. This is in the form of a List, although it could be any sort of Iterator, but bear in mind you don&#039;t want the DAO to keep the database connection open longer than necessary.

Then it wraps this List in an IteratorReader, providing a header and footer with the opening and closing response element tags, wraps the IteratorReader in a StreamSource, and returns that to CXF.

You can pretty much follow this pattern, and just drop in your own DAO which queries your DB and returns strings containing records as XML elements.

There&#039;s lots of things &lt;em&gt;slightly&lt;/em&gt; non-perfect about the code, both of this class and the IteratorReader (see Bob&#039;s comments above), but it&#039;s well-tested and works well in production, so I haven&#039;t been motivated to go back and fiddle any more.</description>
		<content:encoded><![CDATA[<p>Hi Tian,</p>
<p>Have a look at <a href="/static/BioMinerPredictorProvider.java" rel="nofollow">BioMinerPredictorProvider.java</a> &#8230;</p>
<p>This invokes a DAO object which queries a database for XML directly, returning each matching row from the DB as a String containing the XML pre-formatted for returning in the web service. This is in the form of a List, although it could be any sort of Iterator, but bear in mind you don&#8217;t want the DAO to keep the database connection open longer than necessary.</p>
<p>Then it wraps this List in an IteratorReader, providing a header and footer with the opening and closing response element tags, wraps the IteratorReader in a StreamSource, and returns that to CXF.</p>
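<p>A simplified sketch of that wrapping pattern, for illustration only. It is a stand-in written from the description above, not the actual IteratorReader or BioMinerPredictorProvider code; the class name, the choice to skip null elements, and the asSource helper are assumptions.</p>

```java
import java.io.Reader;
import java.util.Iterator;
import java.util.Objects;
import javax.xml.transform.stream.StreamSource;

// Sketch only: a simplified stand-in, not the IteratorReader from the post.
final class IteratorReaderSketch extends Reader {
    private final Iterator<String> items;
    private final String footer;
    private String current;       // chunk being copied out; null once all is consumed
    private int pos;              // next unread index within current
    private boolean footerSent;

    IteratorReaderSketch(String header, Iterator<String> items, String footer) {
        this.current = Objects.requireNonNull(header);
        this.items = Objects.requireNonNull(items);
        this.footer = Objects.requireNonNull(footer);
    }

    @Override
    public int read(char[] cbuf, int off, int len) {
        Objects.checkFromIndexSize(off, len, cbuf.length); // fail early on bad args
        if (len == 0) {
            return 0;
        }
        // Move on to the next non-empty chunk: items first, then the footer.
        while (current != null && pos >= current.length()) {
            pos = 0;
            if (items.hasNext()) {
                String next = items.next();
                current = (next == null) ? "" : next; // assumption: nulls are skipped
            } else if (!footerSent) {
                current = footer;
                footerSent = true;
            } else {
                current = null;
            }
        }
        if (current == null) {
            return -1; // end of stream
        }
        // Copy at most one chunk per call; callers must not expect a full buffer.
        int n = Math.min(len, current.length() - pos);
        current.getChars(pos, pos + n, cbuf, off);
        pos += n;
        return n;
    }

    @Override
    public void close() {
        // Nothing to release here; whoever owns the iterator manages its resources.
    }

    /** Wraps the records in a StreamSource, the kind of object a Provider can return. */
    static StreamSource asSource(String header, Iterator<String> items, String footer) {
        return new StreamSource(new IteratorReaderSketch(header, items, footer));
    }
}
```

<p>With a DAO that returns a List of XML strings, the call would look something like asSource("&lt;response&gt;", rows.iterator(), "&lt;/response&gt;"), where the element name is whatever the service's response wrapper is called.</p>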
<p>You can pretty much follow this pattern, and just drop in your own DAO which queries your DB and returns strings containing records as XML elements.</p>
<p>There&#8217;s lots of things <em>slightly</em> non-perfect about the code, both of this class and the IteratorReader (see Bob&#8217;s comments above), but it&#8217;s well-tested and works well in production, so I haven&#8217;t been motivated to go back and fiddle any more.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tian Xue</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-145</link>
		<dc:creator>Tian Xue</dc:creator>
		<pubDate>Tue, 03 Nov 2009 21:18:40 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-145</guid>
		<description>Andrew,

I am interested in the way you dealt with large data sets. I am currently working on a project to create a CXF-based web service that retrieves all genome features for a taxonomy ID. For various reasons, the output cannot be paged, indexed, or randomly accessed, which means we need to retrieve everything at once. The result (an ArrayList) is supposed to be bound to XML and sent back to the user. However, this is not possible due to its size, and we get out-of-memory errors every time. I looked through your idea and found it very promising for my project. I am new to web service technology. Could you please send me some sample code, such as the provider you implemented, so that I can test it on my site?

Thanks in advance.

Tian</description>
		<content:encoded><![CDATA[<p>Andrew,</p>
<p>I am interested in the way you dealt with large data sets. I am currently working on a project to create a CXF-based web service that retrieves all genome features for a taxonomy ID. For various reasons, the output cannot be paged, indexed, or randomly accessed, which means we need to retrieve everything at once. The result (an ArrayList) is supposed to be bound to XML and sent back to the user. However, this is not possible due to its size, and we get out-of-memory errors every time. I looked through your idea and found it very promising for my project. I am new to web service technology. Could you please send me some sample code, such as the provider you implemented, so that I can test it on my site?</p>
<p>Thanks in advance.</p>
<p>Tian</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Carpenter</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-134</link>
		<dc:creator>Bob Carpenter</dc:creator>
		<pubDate>Fri, 10 Jul 2009 21:34:25 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-134</guid>
		<description>There&#039;s a lot of things you can do to make this more efficient.  One, you don&#039;t need to keep allocating arrays in tempbuf!  You can just work from the string created from the iterated object.

You can also remove the complex part of the read() method because the read() method specification in java.io.Reader does not require the input buffer to be filled. You can even return 0-length outputs.  If you want output buffers maximally filled for some reason (though clients should be careful not to expect that of Reader instances), you can wrap in a BufferedReader, which&#039;ll do the buffering you&#039;re doing a bit more efficiently.

Your read() method doesn&#039;t need to declare a thrown IOException.  

What happens with iterators that return null from next() [it&#039;s legal, because collections can have null members]?  Should we get null pointer exceptions, or iterate to the next object?  

To make the read() method more robust to buggy clients, you should make sure to verify that the args are legal and fail early if they&#039;re not.  Implementations like BufferedReader throw an undocumented IndexOutOfBoundsException, doing some redundant checking along the way:

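if ((off &lt; 0) &#124;&#124; (off &gt; cbuf.length)
&#124;&#124; (len &lt; 0) &#124;&#124; ((off + len) &gt; cbuf.length)
&#124;&#124; ((off + len) &lt; 0))

What were they thinking?  The following&#039;s enough:

(off &lt; 0 &#124;&#124; len &lt; 0 &#124;&#124; off + len &gt; cbuf.length)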

I&#039;ll e-mail you a simpler version with the arg tests and null handling; I don&#039;t think it&#039;ll fit in the comment :-)

Thanks for the Friday afternoon programming puzzle; this is better than TopCoder!</description>
		<content:encoded><![CDATA[<p>There&#8217;s a lot of things you can do to make this more efficient.  One, you don&#8217;t need to keep allocating arrays in tempbuf!  You can just work from the string created from the iterated object.</p>
<p>You can also remove the complex part of the read() method because the read() method specification in java.io.Reader does not require the input buffer to be filled. You can even return 0-length outputs.  If you want output buffers maximally filled for some reason (though clients should be careful not to expect that of Reader instances), you can wrap in a BufferedReader, which&#8217;ll do the buffering you&#8217;re doing a bit more efficiently.</p>
<p>Your read() method doesn&#8217;t need to declare a thrown IOException.  </p>
<p>What happens with iterators that return null from next() [it's legal, because collections can have null members]?  Should we get null pointer exceptions, or iterate to the next object?  </p>
<p>To make the read() method more robust to buggy clients, you should make sure to verify that the args are legal and fail early if they&#8217;re not.  Implementations like BufferedReader throw an undocumented IndexOutOfBoundsException, doing some redundant checking along the way:</p>
<p>if ((off &lt; 0) || (off &gt; cbuf.length)<br />
|| (len &lt; 0) || ((off + len) &gt; cbuf.length)<br />
|| ((off + len) &lt; 0))</p>
<p>What were they thinking?  The following&#8217;s enough:</p>
<p>(off &lt; 0 || len &lt; 0 || off + len &gt; cbuf.length)</p>
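<p>A minimal sketch of these argument checks, with hypothetical class and method names. One caveat worth hedging: when the sum off + len is computed in int arithmetic it can overflow for very large arguments, so the sketch also shows a variant that avoids the addition, plus the JDK helper Objects.checkFromIndexSize (Java 9 and later), which performs the same validation.</p>

```java
import java.util.Objects;

// Hypothetical names; a sketch of read() argument validation, not library code.
final class ReadArgsCheckSketch {

    /** Short form: true if the arguments are rejected. The sum off + len may overflow. */
    static boolean shortFormRejects(int off, int len, int length) {
        return off < 0 || len < 0 || off + len > length;
    }

    /** Overflow-safe variant: compares len with the space remaining after off. */
    static boolean overflowSafeRejects(int off, int len, int length) {
        // off is known to be non-negative by the time length - off is evaluated,
        // so the subtraction cannot overflow for a valid array length.
        return off < 0 || len < 0 || len > length - off;
    }

    /** Same validation through the JDK helper; throws IndexOutOfBoundsException. */
    static void check(char[] cbuf, int off, int len) {
        Objects.checkFromIndexSize(off, len, cbuf.length);
    }
}
```

<p>Even if a check lets an overflowing pair through, the later array access would still fail with an index exception; validating up front only makes the failure earlier and clearer.</p>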
<p>I&#8217;ll e-mail you a simpler version with the arg tests and null handling; I don&#8217;t think it&#8217;ll fit in the comment :-)</p>
<p>Thanks for the Friday afternoon programming puzzle; this is better than TopCoder!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andrew</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-132</link>
		<dc:creator>Andrew</dc:creator>
		<pubDate>Thu, 25 Jun 2009 12:35:32 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-132</guid>
		<description>&quot;I’m not saying your code is buggy&quot; ...

I&#039;m not saying it *isn&#039;t*, though :-)

If you do find an input XML document that looks different after going through the iterator, please let me know! A good way to test it would be to put all of the strings that make up one of the problematic XML docs into an array or list, wrap that object&#039;s Iterator in the IteratorReader, then read from that into a StringBuffer until exhausted. Then compare the contents of the StringBuffer with the original strings.

Or, if it doesn&#039;t contain confidential info, send me the document and I can test it here...

Thanks!</description>
		<content:encoded><![CDATA[<p>&#8220;I’m not saying your code is buggy&#8221; &#8230;</p>
<p>I&#8217;m not saying it *isn&#8217;t*, though :-)</p>
<p>If you do find an input XML document that looks different after going through the iterator, please let me know! A good way to test it would be to put all of the strings that make up one of the problematic XML docs into an array or list, wrap that object&#8217;s Iterator in the IteratorReader, then read from that into a StringBuffer until exhausted. Then compare the contents of the StringBuffer with the original strings.</p>
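<p>A rough sketch of that round-trip check, assuming only the standard java.io.Reader interface; the class and method names are hypothetical. It reads through a deliberately small buffer so that tags and CDATA sections end up split across several read() calls.</p>

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical test helper; works with any Reader implementation.
final class RoundTripCheckSketch {

    /** Reads everything from the reader using a buffer of the given size. */
    static String drain(Reader reader, int bufferSize) {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[bufferSize];
        try {
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n); // append only the characters actually read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** True if the reader yields exactly the concatenation of the original strings. */
    static boolean roundTrips(List<String> parts, Reader reader, int bufferSize) {
        return String.join("", parts).equals(drain(reader, bufferSize));
    }
}
```

<p>To use it, build the reader under test from the same list of strings and pass both to roundTrips; trying several buffer sizes (1, 7, 8192, say) helps show whether a failure depends on where the reads happen to split the document.</p>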
<p>Or, if it doesn&#8217;t contain confidential info, send me the document and I can test it here&#8230;</p>
<p>Thanks!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Monica</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-131</link>
		<dc:creator>Monica</dc:creator>
		<pubDate>Thu, 25 Jun 2009 12:24:14 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-131</guid>
		<description>:), I&#039;m not saying your code is buggy, most likely something I&#039;ve done :) 
Just wondered if you had seen it.
I&#039;ve removed the CDATA section and I get it now with the normal XML:

Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag ; expected .
 at [row,col {unknown-source}]: [1,536722333]

So it looks like it sometimes gets confused, or it misses some characters ... Though it does not happen all the time, the same request will always fail consistently.

Yes, my setup is very similar to what you describe. I&#039;ll try to test it better.

Thanks,

Monica</description>
		<content:encoded><![CDATA[<p>:), I&#8217;m not saying your code is buggy, most likely something I&#8217;ve done :)<br />
Just wondered if you had seen it.<br />
I&#8217;ve removed the CDATA section and I get it now with the normal XML:</p>
<p>Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag ; expected .<br />
 at [row,col {unknown-source}]: [1,536722333]</p>
<p>So it looks like it sometimes gets confused, or it misses some characters &#8230; Though it does not happen all the time, the same request will always fail consistently.</p>
<p>Yes, my setup is very similar to what you describe. I&#8217;ll try to test it better.</p>
<p>Thanks,</p>
<p>Monica</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andrew</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-130</link>
		<dc:creator>Andrew</dc:creator>
		<pubDate>Thu, 25 Jun 2009 11:47:54 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-130</guid>
		<description>Hmm, not seen that, but then none of my services use CDATA sections. I&#039;ve seen

Unexpected EOF in prolog

when you try to parse XML from an empty stream (e.g. when a buggy service doesn&#039;t return anything).

What&#039;s your setup like? I&#039;m guessing you have an Iterator that&#039;s yielding XML, then an IteratorReader wrapping that, then wrapping the IteratorReader in a StreamSource to return from a webservice. Is that right?

You can check the IteratorReader is working correctly by reading from it into a string, and then comparing the contents of the string with the original XML (ignoring whitespace differences etc.). That&#039;s basically how my unit tests work. If they turn out the same, the bug&#039;s not in my code :-)</description>
		<content:encoded><![CDATA[<p>Hmm, not seen that, but then none of my services use CDATA sections. I&#8217;ve seen</p>
<p>Unexpected EOF in prolog</p>
<p>when you try to parse XML from an empty stream (e.g. when a buggy service doesn&#8217;t return anything).</p>
<p>What&#8217;s your setup like? I&#8217;m guessing you have an Iterator that&#8217;s yielding XML, then an IteratorReader wrapping that, then wrapping the IteratorReader in a StreamSource to return from a webservice. Is that right?</p>
<p>You can check the IteratorReader is working correctly by reading from it into a string, and then comparing the contents of the string with the original XML (ignoring whitespace differences etc.). That&#8217;s basically how my unit tests work. If they turn out the same, the bug&#8217;s not in my code :-)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Monica</title>
		<link>http://biotext.org.uk/iteratorreader-streaming-character-data-from-an-iterator/comment-page-1/#comment-129</link>
		<dc:creator>Monica</dc:creator>
		<pubDate>Thu, 25 Jun 2009 10:01:37 +0000</pubDate>
		<guid isPermaLink="false">http://biotext.org.uk/?p=146#comment-129</guid>
		<description>Hi Andrew,

Have you come across an error like this with your code?

com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in CDATA section at [row,col {unknown-source}]: [1,17055]

I think it has reached the end of the stream, but it has not detected the end of the CDATA section even though it is there.
Do you think it can be a problem if the CDATA section spans several reads?

Thanks,

Monica</description>
		<content:encoded><![CDATA[<p>Hi Andrew,</p>
<p>Have you come across an error like this with your code?</p>
<p>com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in CDATA section at [row,col {unknown-source}]: [1,17055]</p>
<p>I think it has reached the end of the stream, but it has not detected the end of the CDATA section even though it is there.<br />
Do you think it can be a problem if the CDATA section spans several reads?</p>
<p>Thanks,</p>
<p>Monica</p>
]]></content:encoded>
	</item>
</channel>
</rss>

