Abstracts - Comments

2009-06 ISMB Stockholm

BLASTScanner: Fast BLAST data processing with database output

Detlef Groth(1), Joachim Selbig (1), Albert J. Poustka(2), Georgia Panopoulou (2)

  1. University of Potsdam, AG Bioinformatics
  2. MPIMG Berlin, Evolution and Development

Motivation: BLAST is one of the most widely used bioinformatics tools for comparing protein or nucleotide sequences. Although a large number of applications and frameworks exist for parsing and analyzing BLAST output files, none of them completely fulfills the requirements of easy, platform-independent installation, fast processing speed, and storage of the parsing results in a form suitable for downstream processing.

Results: We developed BLASTScanner, a small cross-platform console application that translates BLAST text files into standard database code suitable for loading into modern relational databases such as SQLite, PostgreSQL and MySQL. BLASTScanner can be easily compiled, requires no installation, and was faster and easier to use than other commonly used BLAST parsers.

Availability: The source code and binaries for various platforms are freely available at the SourceForge project page http://bioscanners.sf.net.

Contact: dgroth-a-uni-potsdam.de (replace -a- with @)

Poster: Attach:poster-ismb-2009.pdf
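
The translation described under Results, from BLAST text output to database code, can be illustrated with a minimal sketch. This is not BLASTScanner's source code or its actual output schema. The table name `blast_hits`, its columns, and the three patterns are assumptions chosen for illustration, and Python regular expressions stand in here for the rules of a generated scanner. Only a simplified subset of the BLAST text layout (a `Query=` line, a `>` hit line, and a `Score = ... Expect = ...` line) is recognized.

```python
import re

# Patterns for a simplified subset of BLAST text output (assumed layout).
QUERY_RE = re.compile(r"^Query=\s*(\S+)")
HIT_RE = re.compile(r"^>\s*(\S+)")
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits.*?Expect\s*=\s*([\d.eE+-]+)"
)


def sql_quote(value):
    """Quote a string as an SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def parse_evalue(text):
    """Convert an E-value; some BLAST versions print 'e-100' for 1e-100."""
    return float("1" + text if text.startswith("e") else text)


def blast_to_sql(lines, table="blast_hits"):
    """Yield a CREATE TABLE statement, then one INSERT per scored hit."""
    yield (f"CREATE TABLE {table} "
           "(query_id TEXT, hit_id TEXT, bit_score REAL, evalue REAL);")
    query = hit = None
    for line in lines:
        m = QUERY_RE.match(line)
        if m:  # a new query starts; forget the previous hit
            query, hit = m.group(1), None
            continue
        m = HIT_RE.match(line)
        if m:  # a new hit for the current query
            hit = m.group(1)
            continue
        m = SCORE_RE.search(line)
        if m and query and hit:  # a scored alignment for the current hit
            yield (f"INSERT INTO {table} VALUES ("
                   f"{sql_quote(query)}, {sql_quote(hit)}, "
                   f"{float(m.group(1))}, {parse_evalue(m.group(2))});")


if __name__ == "__main__":
    report = [
        "Query= q1",
        "> sp|P12345| test protein",
        " Score = 87.4 bits (215), Expect = 2e-20",
    ]
    for statement in blast_to_sql(report):
        print(statement)
```

Because the statements use only plain `CREATE TABLE` and `INSERT` syntax, the same text could in principle be piped into the command-line client of any of the databases named above.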

2007-04 (dg)

A frequent task in current biological and physiological work is the analysis, also called “parsing”, of large amounts of biological data. Although tools for such tasks have been written many times in many programming languages, most applications do not meet the needs of end users, who are researchers with limited programming skills. Their installation, speed and output are not suited to easy use, and they keep users from concentrating on the scientific task. Instead of handwriting applications, we use scanner generators such as Flex and JFlex to create fast scanners for biological data. These scanners are easy to code, easy to maintain and easy to install, and, as standalone applications without any external dependencies, easy to use. Their output is standard SQL code that is not tied to a particular RDBMS and can be used to fill a database. This allows researchers to use their database of choice and to assemble data from different resources and applications with SQL statements. Some practical examples are provided to document the advantages of using scanners that can fill databases to combine results from different scanning procedures.
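
The idea of combining results from different scanning procedures in one database can be sketched as follows. The two SQL scripts are hypothetical stand-ins for the output of two scanners. The tables `blast_hits` and `domains`, their columns, and the sample rows are invented for illustration and do not come from the tools described here. SQLite, accessed through Python's standard `sqlite3` module, is used only because it needs no server; any RDBMS that accepts plain SQL should work in a similar way.

```python
import sqlite3

# Hypothetical SQL as a BLAST scanner might emit it.
blast_sql = """
CREATE TABLE blast_hits (query_id TEXT, hit_id TEXT, evalue REAL);
INSERT INTO blast_hits VALUES ('gene1', 'P12345', 2e-20);
INSERT INTO blast_hits VALUES ('gene2', 'Q67890', 1e-05);
"""

# Hypothetical SQL as a second scanner (here: domain annotations) might emit it.
domain_sql = """
CREATE TABLE domains (protein_id TEXT, domain_name TEXT);
INSERT INTO domains VALUES ('P12345', 'Homeobox');
INSERT INTO domains VALUES ('Q67890', 'Kinase');
"""


def load_scripts(*scripts):
    """Load several SQL scripts into one in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    for script in scripts:
        conn.executescript(script)
    return conn


conn = load_scripts(blast_sql, domain_sql)

# Combine both sources: domains of significant hits for each query.
rows = conn.execute(
    """
    SELECT b.query_id, d.domain_name
    FROM blast_hits AS b
    JOIN domains AS d ON d.protein_id = b.hit_id
    WHERE b.evalue < 1e-10
    ORDER BY b.query_id
    """
).fetchall()

for query_id, domain_name in rows:
    print(query_id, domain_name)
```

Once both outputs are in the same database, questions that span the two data sources reduce to a single SQL join, with no further parsing code required.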