Posts tagged ‘nucleotide’

My DNA on my iPod?

If I wanted to store all the information contained in my DNA, how much space would it take up? Could I copy it onto my iPod and take it everywhere with me?

DNA is a double-stranded molecule. That means it is composed of two strands, i.e. two sequences of nucleotides (which, once again, are represented with the four letters A, C, G and T). Those two strands are complementary, facing each other. The natural unit for measuring the length of DNA is the “base pair”, or “bp”. A base pair is just one nucleotide and its complement on the other strand.

The human genome contains about 3 billion base pairs. Since there are four nucleotides, anyone who wants to store it in a computer format has to translate it into binary code, with 1’s and 0’s.

For example, let’s say:

  • A=00
  • C=01
  • G=10
  • T=11
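As a minimal sketch of how this mapping could be used in practice, the Python snippet below packs a sequence into bytes, four letters per byte. The function names `pack` and `unpack` are made up for this illustration, and padding the last byte with zeros is just one possible convention.

```python
# Bit codes taken from the list above; everything else is illustrative.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
LETTERS = "ACGT"  # index in this string == 2-bit code


def pack(sequence: str) -> bytes:
    """Pack a nucleotide string into bytes, four letters per byte."""
    out = bytearray()
    for start in range(0, len(sequence), 4):
        chunk = sequence[start:start + 4]
        byte = 0
        for letter in chunk:
            byte = (byte << 2) | CODES[letter]
        # Left-align a short final chunk so every byte has the same layout.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` letters from packed bytes."""
    letters = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            letters.append(LETTERS[(byte >> shift) & 0b11])
    return "".join(letters[:length])


packed = pack("GATTACA")
print(len(packed))        # 7 letters fit in 2 bytes
print(unpack(packed, 7))  # GATTACA
```

The original length has to be kept alongside the packed bytes, because the zero padding in the last byte would otherwise be read back as extra A’s.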

Two bits are needed to code four letters. Each letter thus requires two bits of information. Eight bits make a byte, which is the usual unit of computer memory size. Don’t be afraid of the two lines of utterly simple math that lie below… In bytes, my genome requires:

3,000,000,000 bp x 2 bits / 8 = 750,000,000 bytes

A megabyte, or MB, is 2^20 bytes. In MB, my genome would need:

750,000,000 / 2^20 = 715.2557 MB
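The two lines of arithmetic above can be checked with a few lines of Python. This is only a sketch: the constant and function names are invented for the example, and the 3 billion base pairs figure is the rounded value used in this post.

```python
BASE_PAIRS = 3_000_000_000  # rounded size of the human genome, as above
BITS_PER_BASE = 2           # four letters -> two bits each
BITS_PER_BYTE = 8
BYTES_PER_MEGABYTE = 2**20  # same definition of a megabyte as in the text


def genome_size_bytes(base_pairs: int, strands: int = 1) -> int:
    """Storage needed for `strands` strands at two bits per nucleotide."""
    return base_pairs * BITS_PER_BASE * strands // BITS_PER_BYTE


size_bytes = genome_size_bytes(BASE_PAIRS)
size_megabytes = size_bytes / BYTES_PER_MEGABYTE

print(size_bytes)                # 750000000
print(round(size_megabytes, 4))  # 715.2557
```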

We can now answer the initial question: YES! I could store my whole genome on any iPod, even the smallest one. However, it couldn’t be stored on a standard CD-R, which holds only 700 MB.

For the most tenacious among you, I should point out that here we made the (sound) assumption that we wanted to record only one of the two strands of the DNA molecule. In fact, the two strands are redundant, so it is useless to store both of them. But if you wanted to do so, note that the smallest iPod model (the 1 GB Shuffle) would not be able to store your genome!
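To make the redundancy concrete, here is a small Python sketch. It assumes the usual pairing rules (A faces T, C faces G), counts a gigabyte as 2^30 bytes, and ignores the convention that the two strands are read in opposite directions. The function name `facing_strand` is made up for the example.

```python
# Pairing rules: A faces T, C faces G.
PAIRING = str.maketrans("ACGT", "TGCA")


def facing_strand(strand: str) -> str:
    """Return the nucleotides that face `strand`, position by position."""
    return strand.translate(PAIRING)


one_strand = "GATTACA"
other_strand = facing_strand(one_strand)
print(other_strand)                 # CTAATGT
print(facing_strand(other_strand))  # GATTACA: nothing was lost

# Storing both strands anyway doubles the size computed earlier.
two_strand_bytes = 2 * 750_000_000
gigabyte = 2**30
print(two_strand_bytes > gigabyte)  # True: too big for a 1-gigabyte player
```

Since the second strand can be rebuilt from the first at any time, keeping only one of them loses no information.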

Be cautious when choosing your next iPod!


June 19, 2008 at 6:02 pm 3 comments

THE genome vs. MY genome

As said in a previous post, the human genome was sequenced in 2001. Remember, the genome is the sequence of nucleotides defining most physical characteristics of an individual. Since then, everyone has been talking about “the” human genome. Now a little riddle:

If there is one human genome, how come personal genomics companies now offer to analyze your personal genome?

Is it a swindle? Do they sell the same information every time? Of course they don’t. And if you subscribed to their services, you probably paid for valuable information.

In fact, the human genome is better seen as a map indicating the probability of seeing each nucleotide (A, C, G and T) at each location in the genome of an individual. For example, at a given place in your genome, the nucleotide “A” may be observed. However, the expected nucleotide at this place may be “G” in 95% of human beings. This may reveal a specific trait of your genome.
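The idea of a map of probabilities can be sketched in Python on a toy example. Everything here is invented for illustration: the four-letter “genomes”, the population of twenty individuals and the function names. Real genomes are far longer and are not compared this naively.

```python
from collections import Counter


def position_frequencies(genomes):
    """For each position, the share of genomes showing each nucleotide."""
    freqs = []
    for i in range(len(genomes[0])):
        counts = Counter(g[i] for g in genomes)
        total = sum(counts.values())
        freqs.append({n: c / total for n, c in counts.items()})
    return freqs


def unusual_positions(individual, freqs):
    """Positions where an individual differs from the most common letter."""
    result = []
    for i, letter in enumerate(individual):
        expected = max(freqs[i], key=freqs[i].get)
        if letter != expected:
            result.append((i, letter, expected, freqs[i][expected]))
    return result


# Toy population: 19 individuals carry G at position 2, one carries A.
population = ["ACGT"] * 19 + ["ACAT"]
reference_map = position_frequencies(population)

variants = unusual_positions("ACAT", reference_map)
print(variants)  # [(2, 'A', 'G', 0.95)]
```

In this made-up population, the individual "ACAT" shows an “A” where 95% of the others show a “G”, which mirrors the example in the paragraph above.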

This is related to SNPs (Single Nucleotide Polymorphisms), which will surely be covered in another post. Keep reading!


June 17, 2008 at 5:24 pm

A bit of history and background

Why are personalized medicine and all the -omics technologies so hot nowadays? Why wasn’t it the case 10 years ago? And by the way… what is it all about?

Mankind has always been fighting for its own health. Classical medicine was developed centuries ago and is still enhanced every day by new discoveries. For a few decades, interest has also turned to psychology, which heals people from another point of view. But it is only within the last 10 or 15 years that revolutionary techniques related to the (human) genome have made it possible to envision a radical change in the way healthcare is approached.

Everyone should now have access to information about his own genome!

But what is in fact a genome?

Your genome is the blueprint of your body. It is unique and defines characteristics such as your eye color, your hair type, etc. It is coded by a giant molecule known as DNA. DNA is generally double-stranded and composed of small parts called “nucleotides”. Those can be of four types, named in short A, C, G and T. DNA is folded into chromosomes.

Some sequences of nucleotides have a special importance and form genes. Those specific parts of the genome are read by the cell’s machinery. Following the instructions described by the genes, special molecules are produced. Those are the proteins. Proteins serve as messengers inside and between cells, as basic cell building blocks and as operators of specific metabolic reactions.


Now let’s get back to history and understand why personal genomics is only available now.

In the 1950s, Fred Sanger (working at Cambridge) developed the first protein sequencing techniques. Sequencing means discovering the ordered list of components of a biological object. In 1977, he published a DNA sequencing technique. He was awarded two Nobel Prizes (in 1958 and 1980) for these two achievements. The first genome sequenced was that of a virus. But viruses generally have small genomes.

In 1995, the complete DNA sequence of a free-living organism (as opposed to a virus) was reported. This organism was a bacterium (H. influenzae) and was sequenced by Craig Venter and his group (The Institute for Genomic Research, TIGR, still active in genomic and bioinformatic research). This year of ’95 was the very beginning of what is now called the post-genomic era. Other achievements followed. But it was only in 2001 that the full DNA sequence of H. sapiens (this is us) was completed.

The knowledge of the human genome is thus not even ten years old! However, this event generated such excitement in the scientific community that strong efforts have been put into this research area. Less than a decade later, we now enjoy high-throughput technologies such as microarrays, and the whole armada of genome-wide chips. The time has now come for everyone to learn about his own genome!

Do you know what you’re made of? 


June 17, 2008 at 3:51 pm 2 comments

