What are SNPs?

To begin with, and probably to confuse more than illuminate, SNP (pronounced "snip") stands for Single Nucleotide Polymorphism. So, what does that mean? Well, we need to think about DNA in order for that to make any sense. DNA can be thought of as a very long, linear molecule made up of four different basic units called nucleotides. These nucleotides are usually referred to by the initial letter of their names: adenine, thymine, cytosine and guanine. Thus a molecule of DNA can be represented as a long string of the four letters A, T, C and G. A short section of DNA could have the following sequence:


...ATTCGCCATCAGTCACCAAGTCGTTCTGTCGTTAC...

A second fact we need to know about DNA is that in most cases it does not exist as a single molecule but as a double strand (the famous double helix). The two strands are complementary to each other with T always paired with A and C always paired with G. So rather than a single strand as represented above, our molecule of DNA would look like this:


...ATTCGCCATCAGTCACCAAGTCGTTCTGTCGTTAC...
...TAAGCGGTAGTCAGTGGTTCAGCAAGACAGCAATG...
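The pairing rule can be checked with a few lines of Python. This is only a minimal sketch: the names PAIRING and complement are made up for illustration, and the partner strand is written in the same left-to-right order as in the picture above.

```python
# Pairing rule from the text: A pairs with T, C pairs with G.
PAIRING = str.maketrans("ATCG", "TAGC")


def complement(strand: str) -> str:
    """Return the partner strand, base by base, in the same left-to-right order."""
    return strand.translate(PAIRING)


top = "ATTCGCCATCAGTCACCAAGTCGTTCTGTCGTTAC"
bottom = complement(top)
print(top)
print(bottom)
```

Applying complement twice gives back the original strand, which is another way of saying that each strand fully determines the other.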

These double-stranded molecules of DNA are organised into large (at a molecular level!) structures called chromosomes. Organisms such as humans, cows and barley receive their DNA in the gametes (sex cells) from their parents. Thus, they have two copies of each chromosome, one from their male parent and one from their female parent. These will not be completely identical to each other in terms of the sequence of the DNA that they contain, although they will be very, very similar. If we consider the hypothetical fragment of DNA typed out above, we might find that there is a single difference between the paternal and maternal copies, shown below as two double-stranded fragments that differ only at the 17th base pair.


...ATTCGCCATCAGTCACCAAGTCGTTCTGTCGTTAC...
...TAAGCGGTAGTCAGTGGTTCAGCAAGACAGCAATG...

...ATTCGCCATCAGTCACTAAGTCGTTCTGTCGTTAC...
...TAAGCGGTAGTCAGTGATTCAGCAAGACAGCAATG...
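The difference between the two copies is easy to miss by eye, so here is a minimal Python sketch that compares the top strands position by position. The function name find_differences and the variable names are hypothetical, and the sketch assumes the two sequences are already lined up and of equal length.

```python
def find_differences(copy_a: str, copy_b: str) -> list:
    """Return (position, base_in_a, base_in_b) for each mismatch, counting from 1."""
    if len(copy_a) != len(copy_b):
        raise ValueError("sequences must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(copy_a, copy_b), start=1)
        if a != b
    ]


# Top strands of the two fragments shown above.
first_copy = "ATTCGCCATCAGTCACCAAGTCGTTCTGTCGTTAC"
second_copy = "ATTCGCCATCAGTCACTAAGTCGTTCTGTCGTTAC"

differences = find_differences(first_copy, second_copy)
print(differences)  # [(17, 'C', 'T')]
```

Only one position differs: a C in the first copy is a T in the second, and the partner strand correspondingly carries G in one copy and A in the other.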

This single difference (the 'S' in SNP stands for 'single') is a SNP. To be precise, for such a difference to be classified as a genuine SNP rather than a spontaneous mutation, it must occur in at least 1% of the population. SNPs are classified into two groups according to the kind of base change involved. Transitions swap a purine for a purine or a pyrimidine for a pyrimidine (A<>G or C<>T). Transversions swap a purine for a pyrimidine or vice versa (A<>C, A<>T, G<>C or G<>T).
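As a minimal sketch of this classification in Python, the function below uses the standard chemical grouping of the bases: A and G are purines, C and T are pyrimidines. A change within one group is a transition and a change between groups is a transversion. The name classify_snp is made up for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(base_a: str, base_b: str) -> str:
    """Classify a pair of differing bases as a transition or a transversion."""
    bases = {base_a.upper(), base_b.upper()}
    if len(bases) != 2 or not bases <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # Both bases in the same chemical group: transition.
    if bases <= PURINES or bases <= PYRIMIDINES:
        return "transition"
    # One purine and one pyrimidine: transversion.
    return "transversion"


print(classify_snp("C", "T"))  # transition
print(classify_snp("A", "T"))  # transversion
```

The C/T difference in the example fragment is therefore a transition.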

If one were to look along all the DNA in a single chromosome and make comparisons between different individuals in a population, many such SNPs would be found. In humans, for example, SNPs occur on average every 100 to 300 bases along the 3-billion-base genome.
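To get a feel for what these figures imply, the rough arithmetic can be written out in Python. This is only a back-of-the-envelope sketch that assumes SNPs are spread evenly along the genome, which real genomes do not guarantee. The name estimate_snp_count is hypothetical.

```python
GENOME_LENGTH = 3_000_000_000  # bases in the human genome, as quoted in the text


def estimate_snp_count(genome_length: int, bases_per_snp: int) -> int:
    """Rough SNP count, assuming one SNP every bases_per_snp bases."""
    return genome_length // bases_per_snp


low = estimate_snp_count(GENOME_LENGTH, 300)
high = estimate_snp_count(GENOME_LENGTH, 100)
print(f"{low:,} to {high:,} SNPs")  # 10,000,000 to 30,000,000 SNPs
```

On these figures, the human genome would carry somewhere between 10 and 30 million SNPs.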

Molecular biologists use these differences to gauge the level of genetic variation in a population and, through the association of SNPs with genes, to map the positions of those genes. Breeders, on the other hand, can use them to follow genes of interest in the progeny population when they perform crosses.

SNPs can occur in coding (gene) and noncoding regions of the genome. Many SNPs have no effect on cell function, but scientists believe others could predispose people to disease or influence their response to a drug.

Although more than 99% of human DNA sequences are the same, variations in DNA sequence can have a major impact on how humans respond to disease; to environmental factors such as bacteria, viruses, toxins, and chemicals; and to drugs and other therapies. This makes SNPs valuable for biomedical research and for developing pharmaceutical products or medical diagnostics. SNPs are also evolutionarily stable (that is, they don't change much from generation to generation), making them easier to follow in population studies.

Scientists believe SNP maps will help them identify the multiple genes associated with complex ailments such as cancer, diabetes, vascular disease, and some forms of mental illness. These associations are difficult to establish with conventional gene-hunting methods because a single altered gene may make only a small contribution to the disease.

Homoeologous vs varietal SNPs