Thursday, March 05, 2015

How to extract the gap regions in the human genome?

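Just noticed that I should avoid the gap regions, especially when generating a random background as the null distribution with tools such as bedtools shuffle. Gaps are stretches of the assembly with no determined sequence, so random intervals placed there cannot overlap any real feature.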

Short answer: follow the UCSC Table Browser link below and save the output as a BED file:
 http://genome.ucsc.edu/cgi-bin/hgTables?clade=mammal&org=Human&db=hg19&hgta_group=allTables&hgta_track=hg19&hgta_table=gap&hgta_regionType=genome&hgta_outputType=primaryTable
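The Table Browser can write BED directly, but if the table was saved in its default all-fields form instead, a small conversion is enough. Below is a minimal sketch, assuming the dump is tab-separated with a '#'-prefixed header line and columns named chrom, chromStart, chromEnd and type. The function name and the sample rows are made up for illustration.

```python
import csv
import io


def gap_table_to_bed(table_text):
    """Convert a tab-separated gap table dump into 4-column BED lines.

    Assumes a header line starting with '#' that names the columns,
    including 'chrom', 'chromStart', 'chromEnd' and 'type'.
    """
    header = None
    bed_lines = []
    for row in csv.reader(io.StringIO(table_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if row[0].startswith("#"):
            # Header line: strip the leading '#' from the first column name.
            header = [name.lstrip("#") for name in row]
            continue
        if header is None:
            raise ValueError("expected a '#'-prefixed header line before the data")
        record = dict(zip(header, row))
        # BED columns: chrom, start, end, name (the gap type goes in as the name).
        bed_lines.append("\t".join([
            record["chrom"],
            record["chromStart"],
            record["chromEnd"],
            record["type"],
        ]))
    return bed_lines


# Illustrative rows only, not copied from the real table.
sample = (
    "#bin\tchrom\tchromStart\tchromEnd\tix\tn\tsize\ttype\tbridge\n"
    "585\tchr1\t0\t10000\t1\tN\t10000\ttelomere\tno\n"
    "586\tchr1\t177417\t227417\t2\tN\t50000\tclone\tyes\n"
)
for line in gap_table_to_bed(sample):
    print(line)
```

The resulting BED file can then be handed to bedtools shuffle through its -excl option, so that shuffled intervals are never placed inside a gap.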

As the table below shows, 8.28% of the hg19 assembly is simply gap.

Gap (gap) Summary Statistics
item count       457
item bases       239,845,127 (8.28%)
item total       239,845,127 (8.28%)
smallest item    47
average item     524,825
biggest item     30,000,000
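The figures are internally consistent: 239,845,127 bases over 457 items gives an average of about 524,825. For anyone who wants to recompute the same kind of summary from a downloaded BED file, here is a minimal sketch. It assumes the intervals do not overlap, so the sum of sizes equals the covered bases. The function name is hypothetical, and genome_size must be supplied by the caller because the post does not say which denominator the percentage uses.

```python
def summarize_intervals(intervals, genome_size):
    """Summarize (chrom, start, end) intervals in BED-style 0-based coordinates.

    Assumes the intervals are non-overlapping. genome_size is the
    denominator for the percentage and is chosen by the caller.
    """
    sizes = [end - start for _chrom, start, end in intervals]
    if not sizes:
        raise ValueError("no intervals given")
    total = sum(sizes)
    return {
        "item count": len(sizes),
        "item bases": total,
        "smallest item": min(sizes),
        "average item": total / len(sizes),
        "biggest item": max(sizes),
        "percent": 100.0 * total / genome_size,
    }


# Toy intervals and a toy genome size, for illustration only.
toy = [("chr1", 0, 100), ("chr1", 200, 250), ("chr2", 10, 60)]
for key, value in summarize_intervals(toy, genome_size=1000).items():
    print(key, value)
```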

2 comments:

  1. I still couldn't get your idea. Could you explain more? Thanks!



    1. If you are not clear why there are gaps in the human genome, you may refer to this post: https://www.biostars.org/p/67068/, or search for "gap region in human genome" on Google.
