Illumina/Solexa pipeline w/ Grid Engine
From the trenches @ BioTeam …
Illumina (www.illumina.com) is one of a few companies involved in the “next generation” DNA sequencing race. Each company has technology that dramatically decreases the cost and time involved in large-scale genome analysis. Everyone in this space wants to be the first company to produce a box capable of cranking out a human genome for $10K or less.
These lab instruments produce terabyte volumes of data per experimental run, and thus often need to get hooked into complex IT infrastructures (this is what pays my bills …)
This screenshot shows the end result of integrating the instrument data analysis pipeline with Grid Engine software running on a midsized Linux cluster.
In this test I have the software using 32 cores on 6 servers for the run and I’m timing it against the same analysis done manually on a single server.
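For anyone who wants to make the same comparison, here is a minimal sketch of the arithmetic once both wall-clock times are in hand. The function names and the sample timings are hypothetical placeholders, not results from this run, and the efficiency figure assumes the single-server baseline used one core.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Return how many times faster the cluster run finished than the baseline."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("wall-clock times must be positive")
    return serial_seconds / parallel_seconds


def parallel_efficiency(serial_seconds: float, parallel_seconds: float, cores: int) -> float:
    """Return speedup divided by core count (1.0 would be perfect scaling)."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return speedup(serial_seconds, parallel_seconds) / cores


if __name__ == "__main__":
    # Placeholder timings for illustration only: 8 hours serial, 1 hour on the cluster.
    serial = 8 * 3600.0
    parallel = 1 * 3600.0
    cores = 32  # core count from the test described above

    print(f"speedup:    {speedup(serial, parallel):.2f}x")
    print(f"efficiency: {parallel_efficiency(serial, parallel, cores):.2%}")
```

Efficiency well below 1.0 is normal for a pipeline like this, since some stages are serial and others are bound by shared storage rather than CPU.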


Comment by Mike Davis on 24 October 2008:
Chris,
We have also done this with the Solexa Pipeline. In my opinion, compiling the pipeline is more difficult than integrating with Grid Engine via the use of qmake.
What version of the Pipeline are you using? We are having compile issues with Bustard's IOProcess.cpp in 0.3.0. The build consistently complains about undefined references to gzopen, gzread, etc.
If you are using version 0.3.0 did you notice any of that?
Thanks,
Mike Davis