Wednesday, August 17, 2011

How to run Meme on a cluster


The latest version of Meme, v4.6.1, was built with mpich, while earlier versions were built with openmpi. You need to put the matching MPI installation first on your PATH, as in the batch script below.
Your input should be a file containing sequences in FASTA format. In this example, the file is 'mini-drosoph.s'.
Maxsize parameter: the maximum dataset size in characters. Determine the number of characters in your dataset by typing 'wc -c filename', e.g.
[user@biowulf mydir]$ wc -c mini-drosoph.s 
506016 mini-drosoph.s
For this dataset, the maxsize parameter has to be set to greater than 506,016, so 600000 would be enough (the batch script below uses a more generous 10000000). Set up a batch script along the lines of the one below:

Batch script for Meme 4.6.1

----  this file is called meme.batch ---------
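#!/bin/bash
# This batch script can be used with Meme 4.6.1
# -N: job name; -m be: send mail when the job begins and ends;
# -j oe: merge stderr into the stdout file
#PBS -N Meme
#PBS -m be
#PBS -j oe

# Meme 4.6.1 was built with mpich, so put the mpich binaries first on the PATH
export PATH=/usr/local/mpich-1.2.7p1-gcc4_64/bin:$PATH
cd /data/user/meme/
# $np (the number of MPI processes) is not set in this script; it has to be
# defined in the job's environment at submission time
time mpirun -machinefile $PBS_NODEFILE -np $np /usr/local/meme_4.6.1/bin/meme_p \
     /data/user/meme/test.fa -oc /data/user/meme/meme_out \
     -maxsize 10000000 -p $np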


// The above tips are from http://biowulf.nih.gov/apps/meme.html
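
The maxsize step can also be scripted instead of reading the byte count off by hand. This is only a minimal sketch: the function name suggest_maxsize and the 20% headroom are my own assumptions, not something from the Biowulf page, and any value larger than the byte count should work just as well.

```bash
#!/bin/bash
# suggest_maxsize is a hypothetical helper, not part of Meme.
# It prints the size of the given file in bytes plus 20% headroom
# (the 20% margin is an arbitrary choice for illustration).
suggest_maxsize() {
    local size
    # wc -c counts bytes; reading from stdin keeps the filename out of the output
    size=$(wc -c < "$1") || return 1
    echo $(( size + size / 5 ))
}

# Example use:
#   suggest_maxsize mini-drosoph.s
```

For the 506016-byte file above this prints 607219, which could then be passed to -maxsize in the batch script.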

By the way, the script above uses the PBS (Portable Batch System) submission system. Other clusters use other schedulers, such as LoadLeveler (Champion) or LSF (Lonestar). There is a syntax comparison here. Also, more detail on SGE and PBS here.
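
One thing to watch with the PBS script: it uses $np without setting it, so the value has to come from outside. A possible alternative, sketched here as an assumption rather than the Biowulf recommendation, is to count the entries in $PBS_NODEFILE, which typically holds one line per allocated processor slot. The helper name count_slots is hypothetical.

```bash
#!/bin/bash
# count_slots is a hypothetical helper, not part of PBS or Meme.
# It counts the non-empty lines of a node file; under PBS this file
# typically lists one line per allocated processor slot.
count_slots() {
    grep -c . "$1"
}

# Inside a PBS job, before the mpirun line, one could then write:
#   np=$(count_slots "$PBS_NODEFILE")
```

Whether this count matches the number of Meme processes you actually want depends on how the nodes were requested, so check it against your cluster's documentation.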
