Key Bioinformatics Terms for Data Scientists

Learn key bioinformatics terms including the various "-omics" and the differences between the types of DNA and RNA to help you understand one of the key applications of data science.

By John Unikowski, Sept 2014.

Connectome: the complete description of the structural connectivity of an organism’s nervous system; the term is often used to mean a comprehensive circuit diagram of the human brain.

Enhanceosome: a complex of proteins, including transcription factors, that assembles cooperatively at the enhancer region of a gene, a region of DNA that can increase transcription.

Exposomics: the study of the environmental contaminants that a person is exposed to from conception onwards throughout their lifetime, including their eventual impact on the genome and ultimately on health.

i2b2: Informatics for Integrating Biology and the Bedside, a framework that facilitates the design of targeted therapies for individual patients by enabling clinicians to use existing data for discovery research.

Indel: a type of genetic variation involving the insertion or deletion of a specific nucleotide sequence.
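
For readers who work with variant data, the distinction can be sketched in a few lines of Python. This is a minimal illustration, assuming each variant is given as a reference allele and an alternate allele (as in many variant file formats); the function names `classify_variant` and `indel_length` are hypothetical and not taken from any library.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant by comparing reference and alternate allele lengths."""
    if len(alt) > len(ref):
        return "insertion"      # bases were gained
    if len(alt) < len(ref):
        return "deletion"       # bases were lost
    # Equal lengths: not an indel at all
    return "substitution" if ref != alt else "no change"


def indel_length(ref: str, alt: str) -> int:
    """Net number of bases gained (positive) or lost (negative)."""
    return len(alt) - len(ref)


if __name__ == "__main__":
    print(classify_variant("A", "ATG"), indel_length("A", "ATG"))  # insertion 2
    print(classify_variant("ATG", "A"), indel_length("ATG", "A"))  # deletion -2
    print(classify_variant("A", "G"), indel_length("A", "G"))      # substitution 0
```

Real variant callers also normalise and left-align alleles before comparing them; that step is deliberately left out here.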

Inflammasomes: protein complexes implicated in physiological and pathological inflammation that are activated upon cellular infection or stress to facilitate innate immune defences.

Isogenic: two organisms are isogenic when they possess the same genetic composition.

Karyotype: an individual's collection of chromosomes, described by the number and appearance of the chromosomes in the nucleus of a eukaryotic (somatic) cell. The term also refers to the laboratory technique that produces a photomicrograph of an individual's chromosomes.

Kinome: the subset of the genome comprising the protein kinase genes expressed in a cell.

Metagenomics: the functional and sequence-based analysis of the collective microbial genomes contained in an environmental sample, obtained by direct extraction and cloning of DNA.

Microbiome: the total genetic content of the microbes, collectively referred to as the microbiota, colonising a given environment, especially the human body.

-Omics: the study of related sets of biomolecules, interpreted by a specified computational model.

Operon: a group of distinct genes that are expressed and regulated together, functioning as a single transcription unit.

Retrotransposon: a small, mobile DNA sequence that can move from one genomic location to another by producing an RNA intermediate, which reverse transcriptase copies back into DNA; the new DNA copy is then inserted at a new site. It is also called a retroposon.
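
The "copy and paste" character of this mechanism can be illustrated with a toy Python sketch. It treats the genome as a plain string and ignores strand orientation, base pairing and the enzymes involved; the function names and coordinates are hypothetical and chosen only for illustration.

```python
def transcribe(dna: str) -> str:
    """Toy transcription: write the DNA sequence as RNA (T becomes U)."""
    return dna.replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """Toy reverse transcription: write the RNA sequence as DNA (U becomes T)."""
    return rna.replace("U", "T")


def retrotranspose(genome: str, start: int, end: int, target: int) -> str:
    """Copy the element genome[start:end] to position `target` via an RNA step.

    The original element stays in place, so the genome grows by its length.
    """
    element = genome[start:end]
    rna = transcribe(element)        # DNA -> RNA intermediate
    cdna = reverse_transcribe(rna)   # RNA -> DNA copy
    return genome[:target] + cdna + genome[target:]


if __name__ == "__main__":
    genome = "AAAATTGCAAAACCCC"
    print(retrotranspose(genome, 4, 8, 14))  # AAAATTGCAAAACCTTGCCC
```

After the call the element TTGC appears twice, once at its original site and once at the new one.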

siRNA: small interfering RNAs (siRNAs) are short pieces of double-stranded RNA (dsRNA), 20-25 nucleotides long, that can interfere with the translation of proteins, i.e. with the expression of a specific gene.
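
As a rough sketch of how this definition might be applied in code, the Python below checks that a strand falls in the 20-25 nucleotide range and looks for the mRNA sites to which it is perfectly complementary. The function names and the example sequence are made up for illustration, and real siRNA design weighs many further criteria that are not modelled here.

```python
# Watson-Crick pairing for RNA bases
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def is_sirna_length(strand: str, min_len: int = 20, max_len: int = 25) -> bool:
    """Check the 20-25 nucleotide length range given in the definition."""
    return min_len <= len(strand) <= max_len


def find_target_sites(guide: str, mrna: str) -> list[int]:
    """Return start positions in `mrna` that pair perfectly with `guide`."""
    site = reverse_complement(guide)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    mrna = "GGGAUGCCUAGCUAGGAUCCGAUUACGGCAAUUU"  # invented example sequence
    guide = reverse_complement(mrna[3:24])      # 21-nt strand built against one site
    print(is_sirna_length(guide))               # True
    print(find_target_sites(guide, mrna))       # [3]
```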

Transcriptome: the full complement of RNA molecules, including mRNA, rRNA, tRNA and other non-coding RNA, in a cell or population of cells.

Transposon: a segment of DNA that can independently replicate itself and insert a copy at another position in the genome.

Variome: the variants in an individual's genome; more broadly, the total set of genetic variations found in populations of a species that have experienced a relatively short period of evolutionary change.

John Unikowski studied Chemistry with Biochemistry for his BSc at Brunel University, London, then obtained an MSc in Bio-Organic Chemistry from the University of St Andrews. His bioinformatics experience includes 5 years with Thomson Corporation as a scientific indexer involved in updating pharmaceutical databases. He currently works as a freelance information scientist with interests in both chem- and bioinformatics.