http://seananderson.ca/2013/10/19/reshape.html
Basically, reshape2 is based around two key operations, melt and cast:
melt takes wide-format data and melts it into long-format data.
cast takes long-format data and casts it into wide-format data (in reshape2 the casting functions are dcast, for data frames, and acast, for arrays).
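To make the two directions concrete, here is a minimal round-trip sketch on a made-up two-gene table. The data frame `wide` and its values are invented for illustration only; they are not the fpkm data used in the rest of this post.

```r
library(reshape2)

# Toy wide table (made-up values): one row per gene, one column per sample
wide <- data.frame(
  ID = c("geneA", "geneB"),
  s1 = c(1.0, 2.5),
  s2 = c(3.0, 4.5),
  stringsAsFactors = FALSE
)

# Wide -> long: one row per (gene, sample) pair
long <- melt(wide, id.vars = "ID",
             variable.name = "Sample", value.name = "FPKM")

# Long -> wide: ID stays in the rows, each Sample becomes a column again
wide_again <- dcast(long, ID ~ Sample, value.var = "FPKM")
```

The formula `ID ~ Sample` reads as "rows by ID, columns by Sample", and `value.var` names the column whose values fill the cells.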
For example, this is wide format:
> head(fpkm)
ID FPKM.SRR1069188 FPKM.SRR1070986 FPKM.SRR1071289
ENSG00000240361.1 1.00000000 1.000000 1.000000
ENSG00000186092.4 1.00000000 1.000000 1.000000
ENSG00000237613.2 1.00000000 1.000000 1.000000
ENSG00000239906.1 0.05888838 5.139312 5.055983
ENSG00000241860.1 1.20237363 1.160175 1.085992
ENSG00000222623.1 1.00000000 1.000000 1.000000
> require('reshape2')
> head(melt(fpkm))
No id variables; using all as measure variables
variable value
1 FPKM.SRR1069188 1.00000000
2 FPKM.SRR1069188 1.00000000
3 FPKM.SRR1069188 1.00000000
4 FPKM.SRR1069188 0.05888838
5 FPKM.SRR1069188 1.20237363
6 FPKM.SRR1069188 1.00000000
Or you can set the names of the two output columns with variable.name and value.name:
> head(melt(fpkm, variable.name = "Sample", value.name = "FPKM"))
No id variables; using all as measure variables
Sample FPKM
1 FPKM.SRR1069188 1.00000000
2 FPKM.SRR1069188 1.00000000
3 FPKM.SRR1069188 1.00000000
4 FPKM.SRR1069188 0.05888838
5 FPKM.SRR1069188 1.20237363
6 FPKM.SRR1069188 1.00000000
If you want, you can also keep some of the columns as ID variables in the long format. For example, to keep the gene ID:
> head(melt(fpkm, id.vars = "ID", variable.name = "Sample", value.name = "FPKM"))
ID Sample FPKM
1 ENSG00000240361.1 FPKM.SRR1069188 1.00000000
2 ENSG00000186092.4 FPKM.SRR1069188 1.00000000
3 ENSG00000237613.2 FPKM.SRR1069188 1.00000000
4 ENSG00000239906.1 FPKM.SRR1069188 0.05888838
5 ENSG00000241860.1 FPKM.SRR1069188 1.20237363
6 ENSG00000222623.1 FPKM.SRR1069188 1.00000000
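One caveat: an ID column can only be kept if it is a real column of the data frame. The "No id variables" message in the earlier calls suggests the gene IDs may have been stored as row names at that point (this is a guess from the message, not something the output confirms). If so, they need to be copied into a column first. A small sketch, using a hypothetical stand-in `fpkm_rn` with made-up values:

```r
library(reshape2)

# Hypothetical stand-in for fpkm: gene IDs live in the row names, not in a column
fpkm_rn <- data.frame(
  FPKM.s1 = c(1.0, 0.5),
  FPKM.s2 = c(1.0, 5.1),
  row.names = c("geneA", "geneB")
)

# Copy the row names into an explicit ID column so melt() can carry them along
fpkm_id <- fpkm_rn
fpkm_id$ID <- rownames(fpkm_rn)
rownames(fpkm_id) <- NULL

# ID is now an id variable; the remaining columns are melted
long <- melt(fpkm_id, id.vars = "ID",
             variable.name = "Sample", value.name = "FPKM")
```

melt puts the id variables first in its output, so `long` has the columns ID, Sample and FPKM even though ID was appended last to `fpkm_id`.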
I will do the long-->wide example when I have a good case to show... :)
Update: see this post on the long-->wide conversion.