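When you sort genomic regions and extract the unique ones with "sort -k1,1 -k2,2n -u", you can silently lose regions: -u compares only the listed keys, so regions with the same chr and start but different end positions count as duplicates and only one of them is kept.
The right way is "sort -k1,1 -k2,2n -k3,3n -u", or "sort -u | sort -k1,1 -k2,2n" to deduplicate whole lines first and then sort. Note that "sort -k1,1 -k2,2n | sort -u" also keeps every distinct line, but the final "sort -u" reorders the lines lexicographically, so the numeric order by start is lost.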
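A minimal sketch of the difference, assuming a tab-separated, BED-like file with chr, start and end in the first three columns. The three regions and the temporary file are made up for illustration.

```shell
# Hypothetical input: the first two regions share chr and start but differ in end.
bed=$(mktemp)
printf 'chr1\t100\t200\nchr1\t100\t300\nchr1\t20\t50\n' > "$bed"

# Keys cover only chr and start, so -u collapses the two chr1:100 regions into one.
lossy=$(sort -k1,1 -k2,2n -u "$bed")

# Adding the end column as a third key keeps both regions.
kept=$(sort -k1,1 -k2,2n -k3,3n -u "$bed")

# Alternative: sort without -u, then drop adjacent identical lines with uniq.
piped=$(sort -k1,1 -k2,2n "$bed" | uniq)

echo "two keys with -u:   $(printf '%s\n' "$lossy" | wc -l | tr -d ' ') regions"
echo "three keys with -u: $(printf '%s\n' "$kept" | wc -l | tr -d ' ') regions"

rm -f "$bed"
```

On this input the two-key command reports one region fewer than the three-key command. Keep in mind that the three-key form still merges lines that agree on chr, start and end but differ in later columns such as name or strand; if those must be preserved, the uniq variant compares whole lines instead.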