biotext.org.uk

Research

RetriableTask — a generic wrapper for retrying operations in Java

by Andrew on May.20, 2009, under Research, Tips

A couple of times recently I’ve needed to write methods that retry up to n times if an error occurs and, surprisingly, couldn’t find any standard patterns for this on the web. So I wrote my own. All comments appreciated (there may well be massive holes in my logic, but it works so far).
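To give a flavour of the idea, here is a minimal sketch of a retry wrapper. It is not the code from the full post: the class shape, the constructor parameters (maxAttempts, delayMillis) and the choice to rethrow the last failure are all assumptions made for illustration.

```java
import java.util.concurrent.Callable;

/** Illustrative sketch only: retries a Callable up to maxAttempts times. */
final class RetriableTask<T> implements Callable<T> {
    private final Callable<T> task;   // the operation that may fail
    private final int maxAttempts;    // assumed parameter: total tries, not extra retries
    private final long delayMillis;   // assumed parameter: pause between tries

    RetriableTask(Callable<T> task, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        if (delayMillis < 0) {
            throw new IllegalArgumentException("delayMillis must be >= 0");
        }
        this.task = task;
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    @Override
    public T call() throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                // Never retry an interrupt: restore the flag and give up.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        // Every attempt failed: surface the most recent failure.
        throw last;
    }
}
```

Wrapping it as a Callable means the retrying version can be passed anywhere the original task could, for example to an ExecutorService. Whether to retry on every exception type, or only on some, is a design decision this sketch does not try to settle.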


Simple server status monitoring with PHP and Perl

by Andrew on Feb.17, 2009, under Research

Recently, one of our web servers fell victim to an apparent DoS attack: it was hammered with hundreds of simultaneous dynamic page requests, far more than it’s specced to handle. To its credit, it stayed up, although it took about five minutes to log in via ssh, and when we spotted what was happening the load average was over 100, which I think is the highest I’ve ever seen. The offending IP address was at UCSD, where they do a lot of bioinformatics, so there is a chance it was a misguided attempt to scrape a lot of data from us rather than an actual hostile act, but in any case the upshot is the same.

It worried me that there was no easy way to remotely check the machine’s health, so I hacked together a quick PHP page to report various vital statistics on demand — load average, memory usage, disk usage etc. — and a Perl monitor that can raise the alarm if anything exceeds safe bounds.
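The post itself uses PHP for the status page and Perl for the monitor. Purely to illustrate the "report readings, then compare against safe bounds" idea, here is a rough sketch in Java instead. The class name, the reading names (load_1min, disk_used_pct) and the limits are all made up for the example, and memory usage is left out because the standard JDK API does not expose system-wide memory.

```java
import java.io.File;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the PHP/Perl code from the post. */
final class HealthCheck {

    /** Collects a couple of readings using standard JDK calls. */
    static Map<String, Double> sample(File volume) {
        Map<String, Double> readings = new LinkedHashMap<>();
        // getSystemLoadAverage() returns a negative value where unsupported.
        readings.put("load_1min",
                ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage());
        long total = volume.getTotalSpace();
        if (total > 0) {
            long used = total - volume.getUsableSpace();
            readings.put("disk_used_pct", 100.0 * used / total);
        }
        return readings;
    }

    /** Returns one message per reading above its limit; readings with no limit are ignored. */
    static List<String> alerts(Map<String, Double> readings, Map<String, Double> limits) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Double> e : readings.entrySet()) {
            Double limit = limits.get(e.getKey());
            if (limit != null && e.getValue() > limit) {
                out.add(e.getKey() + "=" + e.getValue() + " exceeds " + limit);
            }
        }
        return out;
    }
}
```

Keeping the comparison separate from the sampling mirrors the split described above: one piece reports the numbers on demand, the other decides whether to raise the alarm.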


IteratorReader — streaming character data from an iterator

by Andrew on Jan.29, 2009, under Research

Recently when plugging two components of a high-throughput web service together, I ran into a snag. One component (a data repository) exposes an Iterator for pulling XML-formatted records out of it one by one. The other (for serving SOAP response documents) needed, ideally, something that could be wrapped in a StreamSource — i.e. an InputStream or Reader. But although these are both pull-based ways of providing (in this case) character data, they’re not compatible.

One easy option is to iterate over the whole Iterator and buffer the results in a String, and then use a StringReader. But that’s not terribly efficient, when you might well be dealing with XML documents in the 10-20MB range. So I wrote an IteratorReader class, which is a Reader that can be wrapped around any Iterator. Each time it’s read from, it pulls enough elements from the Iterator to enable the request to be fulfilled, and buffers any remainder. This keeps its memory usage down, although this of course depends on (a) the number of characters requested at once from its read method, and (b) the size of the elements coming off the Iterator. (Each element is simply converted into a String via its toString method before being stored in a character buffer.)
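As a rough illustration of the approach just described, here is a minimal sketch of a Reader over an Iterator. It is not the class from the full post: it keeps only the current element's string and a position in it as the "remainder", which is one assumed way of doing the buffering, and the field names are invented.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Iterator;

/** Illustrative sketch only: streams the string form of each element in turn. */
final class IteratorReader extends Reader {
    private final Iterator<?> source;
    private String current = "";   // element currently being drained
    private int pos = 0;           // next unread character in current
    private boolean closed = false;

    IteratorReader(Iterator<?> source) {
        this.source = source;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (closed) {
            throw new IOException("Reader is closed");
        }
        if (off < 0 || len < 0 || len > cbuf.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0) {
            return 0;
        }
        int copied = 0;
        while (copied < len) {
            if (pos == current.length()) {
                if (!source.hasNext()) {
                    break;                       // iterator exhausted
                }
                // Non-null elements are converted via toString().
                current = String.valueOf(source.next());
                pos = 0;
                continue;                        // also skips empty elements
            }
            int n = Math.min(len - copied, current.length() - pos);
            current.getChars(pos, pos + n, cbuf, off + copied);
            pos += n;
            copied += n;
        }
        // Signal end of stream only when nothing at all could be read.
        return copied == 0 ? -1 : copied;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

Something like new StreamSource(new IteratorReader(records.iterator())) would then hand it to the XML side, where records is a hypothetical collection of records. Memory use is bounded by the largest single element rather than the whole document.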

Surprisingly, given the vast amount of Java source out there, I couldn’t find an existing solution for this — not even in the usually comprehensive Apache Commons. The code is below, and you are free to do what you like with it, but a credit would be nice if you use it, and if you come up with any improvements I’d be interested to hear about them. In particular, I’m sure it could be optimized more, as it spends a lot of time garbage collecting in its current form. It’s pretty thoroughly tested, with an ArrayList of two million random strings as the source of the Iterator, and seems to work fine both with single-character reads and a BufferedReader wrapped round it. Actually, testing taught me some very interesting lessons, but that’s another post.


Java web service frameworks — a brief survey

by Andrew on Dec.27, 2008, under Research

One of the current ongoing R&D projects at CATH is the development of a library of web services for both internal and external use. The first step in this process was the selection of a web service toolkit (or ‘stack’) to use, as there are several competing (and often mutually incompatible) frameworks available. This page compares some of them in brief based on some informal evaluations and background reading.

Note that I am far from being an expert in this field, so feel free to leave a comment if there’s anything here you disagree with. I am, as ever, happy to be corrected.

Also, please don’t get the impression that this is a methodical side-by-side comparison. I would have liked to implement the exact same non-trivial service in each stack, and report systematically on my experiences with each, but in reality I just didn’t have time to do it that rigorously. What I present here is the result of going through one or two tutorials (mostly of the Hello World/echo-a-string/double-a-number kind), reading the docs, hanging out on forums and garnering opinion from blog posts and articles. And the amount of time I spent with each toolkit is admittedly variable.

I hope it will be useful to someone, despite the caveats.

Andrew.

