Many new bioinformatics applications are created in response to the biological community aggregating new types of knowledge and developing new experimental techniques. The development of these applications can be broken down into two main tasks: data management and biological analysis. The first task generally involves aggregating sequences (strings representing DNA or proteins, the former being non-random but complex patterns of As, Cs, Ts and Gs) and/or annotation information (the known properties of a given sequence, for instance the locations of genes and other biologically relevant features).
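To make the distinction between a sequence and its annotation concrete, here is a minimal sketch in Python. It assumes a toy DNA string and a hypothetical `Feature` record with 0-based, end-exclusive coordinates. The names `Feature`, `is_dna` and `extract` are illustrative only and do not come from any particular database or library.

```python
from dataclasses import dataclass

# The four-letter DNA alphabet described above.
DNA_ALPHABET = set("ACGT")


@dataclass(frozen=True)
class Feature:
    """Hypothetical annotation record: a named region on a sequence."""
    name: str
    kind: str   # e.g. "gene"
    start: int  # 0-based, inclusive (an assumption of this sketch)
    end: int    # 0-based, exclusive (an assumption of this sketch)


def is_dna(sequence: str) -> bool:
    """Return True if the string is non-empty and uses only A, C, G and T."""
    return bool(sequence) and set(sequence.upper()) <= DNA_ALPHABET


def extract(sequence: str, feature: Feature) -> str:
    """Return the part of the sequence that the annotation refers to."""
    if not 0 <= feature.start < feature.end <= len(sequence):
        raise ValueError(f"feature {feature.name!r} lies outside the sequence")
    return sequence[feature.start:feature.end]


# Toy data, invented for illustration only.
sequence = "ATGGCCTTAGACGT"
gene = Feature(name="toy_gene", kind="gene", start=0, end=9)

if __name__ == "__main__":
    print(is_dna(sequence))
    print(extract(sequence, gene))
```

Keeping the annotation separate from the sequence string mirrors the split described above: the sequence is raw data, while the annotation records what is known about a region of it. Real annotation sources use their own coordinate conventions, so the 0-based choice here should be checked against whichever source is used.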

How to Get Genomic Annotation
As mentioned, the EnsEMBL and UCSC web sites collate and warehouse an extensive amount of genomic annotation. Individuals seeking to data-mine specific organisms can, at a minimum, visit the following sites:
 Mus musculus, mouse
 Rattus norvegicus, rat
 Fugu rubripes, pufferfish
 Drosophila melanogaster, fruit fly
 Caenorhabditis elegans, nematode
 Arabidopsis thaliana, mustard weed
 Saccharomyces cerevisiae, baker's yeast
This list is by no means comprehensive or representative of all the large genome sequencing and analysis projects that are currently in progress.

