One Tip Per Day: Three ways to convert bam/bed file to bigwig, separated by strand

Thursday, July 12, 2012

Three ways to convert bam/bed file to bigwig, separated by strand

Here are three ways to convert bam/bed to bigwig, separated by strand:

# ----------- method 1

bamToBed -i accepted_hits.bam -split > accepted_hits.bed

awk '{if($6=="+") print}' accepted_hits.bed | sort -k1,1 | bedItemOverlapCount mm9 -chromSize=ChromInfo.txt stdin | sort -k1,1 -k2,2n > accepted_hits.plus.bedGraph
awk '{if($6=="-") print}' accepted_hits.bed | sort -k1,1 | bedItemOverlapCount mm9 -chromSize=ChromInfo.txt stdin | sort -k1,1 -k2,2n | awk '{OFS="\t"; print $1,$2,$3,"-"$4}' > accepted_hits.minus.bedGraph

bedGraphToBigWig accepted_hits.plus.bedGraph ChromInfo.txt accepted_hits.plus.bw

bedGraphToBigWig accepted_hits.minus.bedGraph ChromInfo.txt accepted_hits.minus.bw

# ----------- method 2

bamToBed -i accepted_hits.bam -split > accepted_hits.bed

sort -k1,1 accepted_hits.bed | awk -v '{print $0 >> "accepted_hits.bed"$6}'
bedItemOverlapCount $index -chromSize=ChromInfo.txt accepted_hits.bed+ | sort -k1,1 -k2,2n > accepted_hits.plus.bedGraph
bedItemOverlapCount $index -chromSize=ChromInfo.txt accepted_hits.bed- | sort -k1,1 -k2,2n | awk '{OFS="\t"; print $1,$2,$3,"-"$4}' > accepted_hits.minus.bedGraph

bedGraphToBigWig accepted_hits.plus.bedGraph ChromInfo.txt accepted_hits.plus.bw

bedGraphToBigWig accepted_hits.minus.bedGraph ChromInfo.txt accepted_hits.minus.bw

# ----------- method 3

bedtools genomecov -ibam -bg -split -strand + -i accepted_hits.bam -g ChromInfo.txt > accepted_hits.plus.bedGraph
bedtools genomecov -ibam -bg -split -strand - -i accepted_hits.bam -g ChromInfo.txt > accepted_hits.minus.bedGraph

bedGraphToBigWig accepted_hits.plus.bedGraph ChromInfo.txt accepted_hits.plus.bw

bedGraphToBigWig accepted_hits.minus.bedGraph ChromInfo.txt accepted_hits.minus.bw

8 comments:

Unknown2:46 PM
Hi!! Very useful post!!

I first tried the 3rd method because looks more simpler that the others. I put in a bash for loop like this:

for i in $(ls *.bam)

do

bedtools genomecov -ibam -bg -split -strand + -i $i -g chromInfo.txt > $i.plus.bedGraph
bedtools genomecov -ibam -bg -split -strand - -i $i -g chromInfo.txt > $i.minus.bedGraph

bedGraphToBigWig $i.plus.bedGraph chromInfo.txt accepted_hits.plus.bw
bedGraphToBigWig $i.minus.bedGraph chromInfo.txt accepted_hits.minus.bw

done

But this error appears:

Input error: Chromosome chr10 found in non-sequential lines. This suggests that the input file is not sorted correctly.
Input error: Chromosome chr18 found in non-sequential lines. This suggests that the input file is not sorted correctly.
Input error: Chromosome chr11 found in non-sequential lines. This suggests that the input file is not sorted correctly.
Input error: Chromosome chr2 found in non-sequential lines. This suggests that the input file is not sorted correctly.
Input error: Chromosome chr12 found in non-sequential lines. This suggests that the input file is not sorted correctly.

So, it's seems that a sort step is missing

But the first method works great, here is the bash for loop if some want to apply this pipeline to a bath of bams.

for i in $(ls *.bam)

do

name=${i%.bam}

echo $name

bamToBed -i $name.bam -split > $name.bed

awk '{if($6=="+") print}' $name.bed | sort -k1,1 | bedItemOverlapCount hg19 -chromSize=chromInfo.txt stdin | sort -k1,1 -k2,2n > $name.plus.bedGraph
awk '{if($6=="-") print}' $name.bed | sort -k1,1 | bedItemOverlapCount hg19 -chromSize=chromInfo.txt stdin | sort -k1,1 -k2,2n | awk '{OFS="\t"; print $1,$2,$3,"-"$4}'$

bedGraphToBigWig $name.plus.bedGraph chromInfo.txt $name.plus.bw
bedGraphToBigWig $name.minus.bedGraph chromInfo.txt $name.minus.bw

done

cheers.
ReplyDelete
Replies
Anonymous2:26 PM
The bedtools genomecov syntax in the 3rd method doesn't generate the right bedGraph, instead, it generates the data for coverage historgram.
ReplyDelete
Replies
Unknown9:15 AM
There is a little correction as bedtools expects bed/gff/vcf files. This was tested with bedtools v2.17.0.

Method 3 should be

bedtools genomecov -bg -split -strand + -ibam accepted_hits.bam -g ChromInfo.txt > accepted_hits.plus.bedGraph
bedtools genomecov -bg -split -strand - -ibam accepted_hits.bam -g ChromInfo.txt > accepted_hits.minus.bedGraph

bedGraphToBigWig accepted_hits.plus.bedGraph ChromInfo.txt accepted_hits.plus.bw
bedGraphToBigWig accepted_hits.minus.bedGraph ChromInfo.txt accepted_hits.minus.bw
ReplyDelete
Replies
Anonymous9:11 AM
Hi!
That is very helpful, but I have the problem that when I load my file on UCSC what it seems is that my minus file is just a shifted reflection of the plus and I do not see the strand specific information.
I have mapped with STAR and than with samtools generated a bam file and than I used the third method.
I am very puzzled.
Best
Giulia
ReplyDelete
Replies
Anonymous3:30 PM
for f in *.bedgraph.gz; do
zcat $f | grep -v track > $f.temp &&
bedGraphToBigWig $f.temp ./hg19.chrom.sizes `echo $f | sed -e s/bedgraph.gz/bw/` &&
rm $f.temp &
done;
ReplyDelete
Replies
Unknown5:36 AM
You may want to use bam2bigwig.py (https://github.com/lpryszcz/bin).
More info here: http://bioinfoexpert.com/?p=235
ReplyDelete
Replies

Add comment

Pages

Thursday, July 12, 2012

Three ways to convert bam/bed file to bigwig, separated by strand

8 comments: