Friday, January 24, 2014

About UCSC Genome Browser tracks

Two points:

1. Custom track data may be compressed with any of the following programs: gzip (.gz), compress (.Z), or bzip2 (.bz2). This does not apply to bigWig and BAM files.

2. In a track hub trackDb configuration file, up to 9 subgroup types can be defined for a composite track, such as:

subGroup1 <gTag1> <gTitle1> <mTag1a=mTitle1a> [mTag1b=mTitle1b…]
subGroup2 <gTag2> <gTitle2> <mTag2a=mTitle2a> [mTag2b=mTitle2b…]
...
subGroup9 <gTag9> <gTitle9> <mTag9a=mTitle9a> [mTag9b=mTitle9b…]

But there seems to be no such limit on the number of tag/title pairs within each subGroup (this is my guess; I have not tested it yet). For example, the ENCODE trackDb puts all transcription factors (TFs) in one subGroup:
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/trackDb.txt
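
To make the subGroup syntax above more concrete, here is a minimal Python sketch that builds the subGroup lines for a composite and refuses more than 9 subgroup types. The function name `sub_group_lines` and the tissue/factor tags are made up for illustration; they do not come from any real trackDb. The sketch deliberately puts no cap on the number of tag/title pairs per subGroup, which matches my untested assumption above.

```python
MAX_SUBGROUPS = 9  # limit on subgroup types per composite, as noted above


def sub_group_lines(groups):
    """Build trackDb subGroup lines.

    groups: list of (tag, title, members) tuples, where members is a dict
    mapping each member tag to its title. Hypothetical helper, not a UCSC tool.
    """
    if len(groups) > MAX_SUBGROUPS:
        raise ValueError(
            f"at most {MAX_SUBGROUPS} subGroups per composite, got {len(groups)}"
        )
    lines = []
    for i, (tag, title, members) in enumerate(groups, start=1):
        # Each member becomes one mTag=mTitle pair; no cap is applied here.
        pairs = " ".join(f"{m_tag}={m_title}" for m_tag, m_title in members.items())
        lines.append(f"subGroup{i} {tag} {title} {pairs}")
    return lines


# Illustrative input only: two subgroup types with made-up members.
example = sub_group_lines([
    ("tissue", "Tissue", {"liver": "Liver", "lung": "Lung"}),
    ("factor", "Factor", {"CTCF": "CTCF", "POL2": "Pol2", "MYC": "c-Myc"}),
])
print("\n".join(example))
```

The resulting lines would go into the composite's stanza in trackDb.txt; whether the browser accepts a very long list of pairs still needs to be checked against a real hub.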

One question: how can you share tracks while keeping the data files in the track hub secure?

The current directory hierarchy for a hub looks like this:
myHub/ - directory containing track hub files

     hub.txt -  a short description of hub properties
     genomes.txt - list of genome assemblies included in the hub data
     hg19/ - directory of data for the hg19 (GRCh37) human assembly
          trackDb.txt - display properties for tracks in this directory
          dnase.html - description text for a DNase track 
          dnaseLiver.bigWig - wiggle plot of DNase in liver
          dnaseLiver.bigBed - regions of active DNase
          dnaseLung.bigWig - wiggle plot of DNase in lung
          dnaseLung.bigBed - regions of active DNase
          ...
          rnaSeq.html - description text for an RNAseq track
          rnaSeqLiver.bigWig - wiggle plot of RNAseq data in liver
          rnaSeqLiver.bigBed - intron/exon lists for liver
          rnaSeqLung.bigWig - wiggle plot of RNAseq data in lung
          rnaSeqLung.bigBed - intron/exon lists for lung
     hg18/ - directory of data for the hg18 (Build 36) human assembly
          trackDb.txt - display properties for tracks in this directory
          dnase.html - description text for a DNase track 
          dnaseLiver.bigWig - wiggle plot of DNase data in liver
          dnaseLiver.bigBed - regions of active DNase
          dnaseLung.bigWig - wiggle plot of DNase data in lung
          dnaseLung.bigBed - regions of active DNase
          ...
          rnaSeq.html - description text for an RNAseq track
          rnaSeqLiver.bigWig - wiggle plot of RNAseq data in liver
          rnaSeqLiver.bigBed - intron/exon lists for liver
          rnaSeqLung.bigWig - wiggle plot of RNAseq data in lung
          rnaSeqLung.bigBed - intron/exon lists for lung
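
The layout above can be scaffolded with a few lines of Python. This is only a sketch under some assumptions: the function name `write_hub_skeleton`, the label text, and the email address are placeholders I made up, and the hub.txt and genomes.txt fields are the basic ones as I understand them from the UCSC track hub documentation. It creates empty trackDb.txt files and does not touch any data files.

```python
from pathlib import Path


def write_hub_skeleton(root, assemblies, hub_name="myHub", email="you@example.org"):
    """Create hub.txt, genomes.txt and one directory per assembly.

    Hypothetical helper; hub_name and email are placeholder values.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)

    # hub.txt: a short description of hub properties
    (root / "hub.txt").write_text(
        f"hub {hub_name}\n"
        f"shortLabel {hub_name}\n"
        f"longLabel {hub_name} example hub\n"
        "genomesFile genomes.txt\n"
        f"email {email}\n"
    )

    stanzas = []
    for asm in assemblies:
        (root / asm).mkdir(exist_ok=True)
        track_db = root / asm / "trackDb.txt"
        if not track_db.exists():
            track_db.write_text("")  # to be filled in with track stanzas
        stanzas.append(f"genome {asm}\ntrackDb {asm}/trackDb.txt\n")

    # genomes.txt: one stanza per assembly, separated by blank lines
    (root / "genomes.txt").write_text("\n".join(stanzas))
    return root
```

For example, `write_hub_skeleton("myHub", ["hg19", "hg18"])` would produce the two assembly directories shown above, ready for the bigWig and bigBed files to be copied in.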

The UCSC webpage also notes that "unlisted hubs are in no way secure," so this is definitely an unsolved problem. Maybe the only solution is to set up your own local mirror?

1 comment:

  1. The genome browser at the University of California Santa Cruz (UCSC) is a popular web-based tool for rapidly displaying the requested portion of a genome at any scale.
