As a bioinformatician who is senior in age but not senior in knowledge, I would seriously recommend that anyone who would like to work in this field have basic knowledge of the following subjects (the ones I can think of):
1. probability and statistics (not everyone knows the difference between them)
2. machine learning (the four-element circle: data + algorithm + model + criteria)
3. program design (knowing how to write a script does not mean you know how to program; a good programmer should learn how to write code that others can inherit and build on).
4. algorithms and data structures (many people know some algorithms, but truly understanding them is not an easy task. Binindex is a good example of using the concept of a binary tree to store and query genomic coordinates very quickly.)
5. knowing how to appreciate a scientific work (a paper can be good in terms of (1) data sources, (2) method, and/or (3) idea. It is also important to tell good papers from junk papers; I feel it is very important to sharpen one's sense for 'smelling' a paper.)
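To make the Binindex point in item 4 concrete, here is a minimal Python sketch. It assumes that "Binindex" refers to the hierarchical binning scheme used by the UCSC Genome Browser and by BAM indexes. In that scheme each bin splits into eight children, so the tree is 8-ary rather than strictly binary. The constants and the function names `reg2bin` and `reg2bins` follow the SAM format specification; the `index` dictionary, the two example genes and the `query` helper are hypothetical and only for illustration.

```python
# Sketch of a hierarchical bin index (assumed to be what "Binindex" refers to).
# Coordinates are 0-based, half-open [beg, end), and must be below 2**29.

# (shift, offset) per level, finest first: 16 kb, 128 kb, 1 Mb, 8 Mb, 64 Mb bins.
LEVELS = ((14, 4681), (17, 585), (20, 73), (23, 9), (26, 1))


def reg2bin(beg, end):
    """Return the smallest bin that fully contains [beg, end)."""
    end -= 1
    for shift, offset in LEVELS:
        if beg >> shift == end >> shift:
            return offset + (beg >> shift)
    return 0  # the root bin spans the whole 512 Mb range


def reg2bins(beg, end):
    """Return every bin that may hold a feature overlapping [beg, end)."""
    end -= 1
    bins = [0]
    for shift, offset in reversed(LEVELS):
        bins.extend(range(offset + (beg >> shift), offset + (end >> shift) + 1))
    return bins


# Hypothetical toy index: bin number -> list of (name, beg, end) features.
index = {}
for name, beg, end in [("geneA", 1000, 2000), ("geneB", 5_000_000, 5_200_000)]:
    index.setdefault(reg2bin(beg, end), []).append((name, beg, end))


def query(beg, end):
    """Look only in candidate bins, then confirm a real overlap."""
    hits = []
    for b in reg2bins(beg, end):
        for name, fbeg, fend in index.get(b, []):
            if fbeg < end and beg < fend:
                hits.append(name)
    return hits
```

A query only has to inspect a handful of bins instead of scanning every feature, which is where the speed comes from. For example, `query(1500, 1600)` checks six bins and finds `geneA`.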
I found this nice reading list on Hendrik's page (http://www.liacs.nl/~hoogeboo/mcb/nature_primer.html):
How to apply de Bruijn graphs to genome assembly (Phillip E C Compeau, Pavel A Pevzner & Glenn Tesler) November 2011, Vol 29, No 11; pp 987-991; doi: 10.1038/nbt.2023
Analyzing 'omics data using hierarchical models (Hongkai Ji & X Shirley Liu) April 2010, Vol 28, No 4; pp 337-340; doi: 10.1038/nbt.1619
What is flux balance analysis? (Jeffrey D Orth, Ines Thiele & Bernhard Ø Palsson) March 2010, Vol 28, No 3; pp 245-248; doi: 10.1038/nbt.1614
How does multiple testing correction work? (William S Noble) December 2009, Vol 27, No 12; pp 1135-1137; doi: 10.1038/nbt1209-1135
How to visually interpret biological data using networks (Daniele Merico, David Gfeller & Gary D Bader) October 2009, Vol 27, No 10; pp 921-924; doi: 10.1038/nbt.1567
How to map billions of short reads onto genomes (Cole Trapnell & Steven L Salzberg) May 2009, Vol 27, No 5; pp 455-457; doi: 10.1038/nbt0509-455
SNP imputation in association studies (Eran Halperin & Dietrich A Stephan) April 2009, Vol 27, No 4; pp 349-351; doi: 10.1038/nbt0409-349
Maximizing power in association studies (Eran Halperin & Dietrich A Stephan) March 2009, Vol 27, No 3; pp 255-256; doi: 10.1038/nbt0309-255
Understanding genome browsing (Melissa S Cline & W James Kent) February 2009, Vol 27, No 2; pp 153-155; doi: 10.1038/nbt0209-153
What are decision trees? (Carl Kingsford & Steven L Salzberg) September 2008, Vol 26, No 9; pp 1011-1013; doi: 10.1038/nbt0908-1011
What is the expectation maximization algorithm? (Chuong B Do & Serafim Batzoglou) August 2008, Vol 26, No 8; pp 897-899; doi: 10.1038/nbt1406
What is principal component analysis? (Markus Ringnér) March 2008, Vol 26, No 3; pp 303-304; doi: 10.1038/nbt0308-303
What are artificial neural networks? (Anders Krogh) February 2008, Vol 26, No 2; pp 195-197; doi: 10.1038/nbt1386
How does eukaryotic gene prediction work? (Michael R Brent) August 2007, Vol 25, No 8; pp 883-885; doi: 10.1038/nbt0807-883
How do shotgun proteomics algorithms identify proteins? (Edward M Marcotte) July 2007, Vol 25, No 7; pp 755-757; doi: 10.1038/nbt0707-755
What is a support vector machine? (William S Noble) December 2006, Vol 24, No 12; pp 1565-1567; doi: 10.1038/nbt1206-1565
How does DNA sequence motif discovery work? (Patrik D'haeseleer) August 2006, Vol 24, No 8; pp 959-961; doi: 10.1038/nbt0806-959
What are DNA sequence motifs? (Patrik D'haeseleer) April 2006, Vol 24, No 4; pp 423-425; doi: 10.1038/nbt0406-423
Inference in Bayesian networks (Chris J Needham, James R Bradford, Andrew J Bulpitt & David R Westhead) January 2006, Vol 24, No 1; pp 51-53; doi: 10.1038/nbt0106-51
How does gene expression clustering work? (Patrik D'haeseleer) December 2005, Vol 23, No 12; pp 1499-1501; doi: 10.1038/nbt1205-1499
How do RNA folding algorithms work? (Sean R Eddy) November 2004, Vol 22, No 11; pp 1457-1458; doi: 10.1038/nbt1104-1457
What is a hidden Markov model? (Sean R Eddy) October 2004, Vol 22, No 10; pp 1315-1316; doi: 10.1038/nbt1004-1315
What is Bayesian statistics? (Sean R Eddy) September 2004, Vol 22, No 9; pp 1177-1178; doi: 10.1038/nbt0904-1177
Where did the BLOSUM62 alignment score matrix come from? (Sean R Eddy) August 2004, Vol 22, No 8; pp 1035-1036; doi: 10.1038/nbt0804-1035
What is dynamic programming? (Sean R Eddy) July 2004, Vol 22, No 7; pp 909-910; doi: 10.1038/nbt0704-909

Getting Started in ...
Getting Started in Gene Orthology and Functional Analysis (Fang G, Bhardwaj N, Robilotto R, Gerstein MB) PLoS Comput Biol (2010) 6(3): e1000703; doi: 10.1371/journal.pcbi.1000703
Getting Started in Structural Phylogenomics (Sjölander K) PLoS Comput Biol (2010) 6(1): e1000621; doi: 10.1371/journal.pcbi.1000621
Getting Started in Gene Expression Microarray Analysis (Slonim DK, Yanai I) PLoS Comput Biol (2009) 5(10): e1000543; doi: 10.1371/journal.pcbi.1000543
Getting Started in Text Mining: Part Two (Rzhetsky A, Seringhaus M, Gerstein MB) PLoS Comput Biol (2009) 5(7): e1000411; doi: 10.1371/journal.pcbi.1000411
Getting Started in Computational Mass Spectrometry-Based Proteomics (Vitek O) PLoS Comput Biol (2009) 5(5): e1000366; doi: 10.1371/journal.pcbi.1000366
Getting Started in Computational Immunology (Kleinstein SH) PLoS Comput Biol (2008) 4(8): e1000128; doi: 10.1371/journal.pcbi.1000128
Getting Started in Biological Pathway Construction and Analysis (Viswanathan GA, Seto J, Patil S, Nudelman G, Sealfon SC) PLoS Comput Biol (2008) 4(2): e16; doi: 10.1371/journal.pcbi.0040016
Getting Started in Text Mining (Cohen KB, Hunter L) PLoS Comput Biol (2008) 4(1): e20; doi: 10.1371/journal.pcbi.0040020
Getting Started in Probabilistic Graphical Models (Airoldi EM) PLoS Comput Biol (2007) 3(12): e252; doi: 10.1371/journal.pcbi.0030252
Getting Started in Tiling Microarray Analysis (Liu XS) PLoS Comput Biol (2007) 3(10): e183; doi: 10.1371/journal.pcbi.0030183

Ten Simple Rules
Also, the Ten Simple Rules series of editorials has a separate page at the PLoS journal. A link is now all you need to read about 'Ten Simple Rules for Getting Published' or '...for a Good Poster Presentation', etc.
On the Process of Becoming a Great Scientist (Giddings MC) PLoS Comput Biol (2008) 4(2): e33; doi: 10.1371/journal.pcbi.0040033

Good list. Thanks for putting this up!
Thanks for visiting! :) Actually I need to learn most of them.
Thanks, pretty good list!
Good list
I'm currently revising for my undergrad finals, and this is most useful. Thanks.
knowledge sharing nice information ... thanx share the knowledge and great post...
What an excellent list! Thanks for sharing