Genetic Code

The genetic code is the dictionary that links the nucleotide sequence of mRNA to the amino acid sequence of a protein. It sits between transcription and translation in the chapter, and NEET tests it almost every year — through codon counting, the code's salient properties, start and stop codons, and reading-frame effects of mutations. A short, high-yield topic worth full marks.

NCERT grounding

NCERT Class XII Biology places the genetic code in Section 5.6 of Molecular Basis of Inheritance, immediately after transcription and just before translation. The text frames the central problem clearly: replication and transcription simply copy one nucleic acid into another, and so are easy to picture through complementarity. Translation is different. It demands a transfer of information from a polymer of nucleotides to a polymer of amino acids, and — in NCERT's own words — "neither does any complementarity exist between nucleotides and amino acids, nor could any be drawn theoretically."

That gap is exactly what the genetic code bridges. NCERT credits the physicist George Gamow with the bold reasoning that, because there are only 4 bases coding for 20 amino acids, the code must be a combination of bases read three at a time. The chemical synthesis of defined RNAs by Har Gobind Khorana and Marshall Nirenberg's cell-free protein-synthesis system then allowed the code to be deciphered and arranged into the codon checker-board of Table 5.1.

"The codon is triplet. 61 codons code for amino acids and 3 codons do not code for any amino acids, hence they function as stop codons." — NCERT Class XII Biology, Section 5.6

The NIOS Biology supplement (Lesson 23) reinforces the same set of properties — triplet, unambiguous, comma-less and non-overlapping, degenerate, universal — and adds the wobble idea that the first two bases of synonymous codons are usually conserved. This subtopic page goes deeper into how each property was established and why each one matters in the exam.

The genetic code, decoded

The genetic code is the rule book that assigns each amino acid a nucleotide "word" in mRNA. That word is a codon — a sequence of three adjacent bases. The ribosome reads an mRNA codon by codon, and a charged tRNA delivers the matching amino acid. Because the code is written in only four letters (A, U, G, C) but must specify twenty amino acids, the central question the early molecular biologists faced was: how many letters make one word?

Why the code had to be a triplet

George Gamow reasoned through simple arithmetic. A singlet code — one base per amino acid — gives only 4 possibilities. A doublet code gives 4 × 4 = 16 possibilities, still short of 20. A triplet code gives 4 × 4 × 4 = 4³ = 64 possibilities, comfortably more than the 20 amino acids needed. The triplet was the smallest word size that worked, so Gamow proposed it. Proving the codon was actually a triplet, NCERT notes, was a far more daunting task than proposing it.
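Gamow's arithmetic can be checked in a few lines. The sketch below is only an illustration: the function name `smallest_word_length` is hypothetical, and the numbers 4 and 20 are the base and amino acid counts quoted above.

```python
# Smallest codon length that can give every amino acid its own "word".
BASES = 4          # A, U, G, C
AMINO_ACIDS = 20   # amino acids that must be specified


def smallest_word_length(bases, targets):
    """Return the smallest n such that bases**n >= targets."""
    n = 1
    while bases ** n < targets:
        n += 1
    return n


for n in (1, 2, 3):
    # Prints 4, 16 and 64 possible words for singlet, doublet and triplet codes.
    print(n, BASES ** n)

print(smallest_word_length(BASES, AMINO_ACIDS))  # 3
```

The loop stops at the first word length that reaches 20 or more combinations, which is why the doublet (16) fails and the triplet (64) succeeds.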

64

Total codons in the genetic code

Of these, 61 codons specify amino acids (the sense codons) and 3 codons — UAA, UAG, UGA — are stop codons that specify no amino acid. 64 = 4³, the count of three-letter words from a four-letter alphabet.

Figure 2 (diagram): 64 codons = 4³, the three-letter words from A, U, G, C. 61 sense codons → 20 amino acids; 3 stop codons (UAA, UAG, UGA); AUG = start. 61 > 20 ⇒ degeneracy: most amino acids have several codons.

Figure 2. The 64 codons divide into 61 sense codons that specify the 20 amino acids and 3 stop codons. Because 61 exceeds 20, the code is degenerate. AUG doubles as the start codon.
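The 64/61/3 split can be reproduced by listing every three-letter word over the four bases. This is a minimal sketch using Python's standard `itertools.product`; the variable names are illustrative only.

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Every ordered triplet of bases: 4 * 4 * 4 words.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Sense codons are whatever remains after removing the three stop codons.
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons))    # 64
print(len(sense_codons))  # 61
```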

The decisive experimental work came from two complementary techniques. Har Gobind Khorana's chemical method built synthetic RNA molecules with defined combinations of bases — homopolymers such as poly-U and copolymers with known repeating patterns. Marshall Nirenberg's cell-free protein-synthesis system could then take such an RNA and report which amino acid it directed into a polypeptide. Severo Ochoa's enzyme, polynucleotide phosphorylase, was also helpful because it polymerised RNA of defined sequence in a template-independent manner. Together these efforts filled in the 64-cell checker-board.

Figure 1 (diagram): mRNA 5′-AUG UUU AAA GGC UAA-3′ read 5′ → 3′ in non-overlapping triplets as Met, Phe, Lys, Gly, STOP (AUG = start, UAA = termination). tRNA anticodons UAC, AAA, UUU, CCG pair antiparallel with the four sense codons; the stop codon has none.

Figure 1. An mRNA is read 5′→3′ in non-overlapping triplets. AUG sets the start, internal codons are translated one by one, and the stop codon UAA ends the message. There is no tRNA for a stop codon.
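Reading in fixed triplets is easy to mimic in code. The sketch below translates the mRNA from Figure 1 using a deliberately partial codon table that holds only the five codons in that figure; the function names `split_codons` and `translate` are hypothetical, and the full 64-entry table is not reproduced here.

```python
# Partial codon table: only the codons that appear in the Figure 1 mRNA.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "AAA": "Lys",
    "GGC": "Gly",
    "UAA": "STOP",
}


def split_codons(mrna):
    """Cut an mRNA (written 5'->3') into consecutive, non-overlapping triplets."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing triplet
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def translate(mrna):
    """Translate codon by codon and stop at the first stop codon."""
    peptide = []
    for codon in split_codons(mrna):
        residue = CODON_TABLE[codon]
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide


print(split_codons("AUGUUUAAAGGCUAA"))  # ['AUG', 'UUU', 'AAA', 'GGC', 'UAA']
print(translate("AUGUUUAAAGGCUAA"))     # ['Met', 'Phe', 'Lys', 'Gly']
```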

The salient features of the genetic code

NCERT lists six salient features of the genetic code. Reading them together gives a precise picture of how the code behaves — and almost every NEET question on this subtopic tests one of these properties directly.

Memory anchor: the code is triplet, degenerate, unambiguous, comma-less, non-overlapping and nearly universal, with AUG doing double duty and three codons reserved for stop.

Triplet

Three bases code one amino acid. 61 of the 64 codons specify amino acids; 3 are stop codons.

Degenerate

Most amino acids are coded by more than one codon, e.g. AAA and AAG both code lysine. 61 codons, 20 amino acids.

Unambiguous & specific

One codon codes for one and only one amino acid. A given triplet is never read two ways.

Comma-less & non-overlapping

Codons are read contiguously with no punctuation; one base belongs to one codon only.

Nearly universal

UUU codes phenylalanine from bacteria to humans. A few exceptions occur in mitochondria and some protozoans.

AUG & stop codons

AUG codes methionine and is the initiator codon. UAA, UAG, UGA are stop codons coding no amino acid.

Two features are most often tested and most often confused, so they deserve careful separation. Degeneracy means redundancy in the codons: because there are 61 sense codons for only 20 amino acids, several codons can stand for the same amino acid. Unambiguity works in the opposite direction: any single codon, once read, corresponds to exactly one amino acid. Degeneracy is a many-to-one relationship; the code is still never ambiguous, because the "many" all point to one and the same amino acid.

Degenerate vs Unambiguous — read the direction

Degenerate (redundant)

Many → 1

several codons → one amino acid

  • 61 sense codons, only 20 amino acids
  • Leucine, serine, arginine each have 6 codons
  • AAA and AAG both code lysine
  • First two bases often conserved; the third (wobble) position varies

vs

Unambiguous & specific

1 → 1

one codon → one amino acid

  • A codon is never read two ways
  • UUU is always phenylalanine
  • No codon specifies two different amino acids
  • Specificity makes translation reliable
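The two directions in the comparison above can be seen by storing a few codon assignments in a dictionary and then inverting it. This is a sketch on a partial table (five real assignments, not the full code); the names `CODON_TO_AA` and `aa_to_codons` are illustrative.

```python
from collections import defaultdict

# Partial codon table: a handful of standard assignments only.
CODON_TO_AA = {
    "AAA": "Lys", "AAG": "Lys",
    "UUU": "Phe", "UUC": "Phe",
    "AUG": "Met",
}

# Unambiguous: looking up a codon returns exactly one amino acid.
# Degenerate: inverting the table shows several codons per amino acid.
aa_to_codons = defaultdict(list)
for codon, amino_acid in CODON_TO_AA.items():
    aa_to_codons[amino_acid].append(codon)

print(CODON_TO_AA["UUU"])   # Phe
print(aa_to_codons["Lys"])  # ['AAA', 'AAG']
print(aa_to_codons["Met"])  # ['AUG']
```

Codon to amino acid is a plain lookup with one answer; amino acid to codon returns a list, which is the many-to-one relationship read backwards.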

AUG, stop codons and the universality of the code

The codon AUG has a dual function. It codes for the amino acid methionine, and it also acts as the initiator (start) codon that marks where the ribosome begins translating. At the start of a message AUG is read by a special initiator tRNA; when AUG appears internally it is read by an ordinary methionine tRNA. The three termination codons — UAA, UAG and UGA — code for no amino acid at all. There are no tRNAs for them; instead, when the ribosome reaches a stop codon a release factor binds and the completed polypeptide is freed.
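The start and stop roles can be sketched as a scan: find the first AUG, read triplets from there, and halt at the first stop codon. The mRNA below is a made-up sequence, the codon table is partial, and treating the first AUG as the start site is a simplifying assumption for illustration; the function name `translate_from_start` is hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Partial table covering only the sense codons in the example sequence.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "AAG": "Lys"}


def translate_from_start(mrna):
    """Translate from the first AUG up to (not including) the first in-frame stop."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # stop codons add no amino acid
        peptide.append(CODON_TABLE[codon])
    return peptide


# Hypothetical mRNA: GC | AUG UUU AUG AAG UGA | CC
print(translate_from_start("GCAUGUUUAUGAAGUGACC"))  # ['Met', 'Phe', 'Met', 'Lys']
```

The first AUG sets the frame and contributes methionine; the second AUG is internal and is simply read as methionine again.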

Universality means the same codon dictionary operates from bacteria to humans: UUU codes phenylalanine in a bacterium and in a human cell alike. NCERT notes that the code is "nearly" universal — minor exceptions have been found in mitochondrial codons and in some protozoans. This near-universality has a practical pay-off: it is precisely because the code is shared that a human gene introduced into a bacterium is translated correctly, which is the basis of producing human insulin in bacteria by recombinant DNA technology.

"The code is nearly universal: from bacteria to human UUU would code for Phenylalanine."

NCERT Class XII Biology · Section 5.6

The reading frame and mutations

Because the code is comma-less and non-overlapping, the ribosome must start at the right base and then read in fixed, contiguous groups of three. The set of triplets defined by a chosen starting point is the reading frame. NCERT illustrates this with a sentence of three-letter words: RAM HAS RED CAP. Inserting one letter (say B) shifts every word that follows — RAM HAS BRE DCA P — and the message becomes nonsense from the point of insertion onward.

NCERT's RAM-HAS-RED-CAP demonstration of reading frame

three-letter words = codons

  1. Start: RAM HAS RED CAP
     Original message, read correctly in triplets.
  2. +1 base: RAM HAS BRE DCA P
     Insert one letter: frame shifts, message garbled downstream. Result: frameshift.
  3. +2 bases: RAM HAS BIR EDC AP
     Insert two letters: frame still shifted. Result: frameshift.
  4. +3 bases: RAM HAS BIG RED CAP
     Insert three letters: one whole codon added, frame restored. Result: no frameshift.
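The RAM HAS RED CAP demonstration can be replayed by inserting letters into the message and regrouping it in threes. A minimal sketch, with hypothetical helper names `read_in_triplets` and `insert_letters`; position 6 is the gap between HAS and RED.

```python
def read_in_triplets(message):
    """Strip spaces and regroup the letters into three-letter words."""
    letters = message.replace(" ", "")
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))


def insert_letters(message, position, extra):
    """Insert `extra` after `position` letters, then re-read in triplets."""
    letters = message.replace(" ", "")
    return read_in_triplets(letters[:position] + extra + letters[position:])


original = "RAM HAS RED CAP"
for extra in ("", "B", "BI", "BIG"):
    print(insert_letters(original, 6, extra))
# RAM HAS RED CAP
# RAM HAS BRE DCA P
# RAM HAS BIR EDC AP
# RAM HAS BIG RED CAP
```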

The conclusion is exact. Insertion or deletion of one or two bases — or any number not a multiple of three — shifts the reading frame from the point of the change onward; every codon downstream is misread. These are frameshift insertion or deletion mutations. Insertion or deletion of three bases, or any multiple of three, adds or removes whole codons, so one or more amino acids are gained or lost but the reading frame past that point is unaltered.

Contrast this with a point substitution, where a single base is swapped for another. A substitution changes at most one codon and so usually changes at most one amino acid. The classic NCERT example is the change of a single base pair in the gene for the beta globin chain, which replaces glutamate with valine and causes sickle cell anaemia. A point substitution leaves the reading frame intact; only an insertion or deletion that is not a multiple of three shifts the frame.
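The difference in damage between a substitution and a single-base deletion can be counted codon by codon. The mRNA below is a short made-up sequence, and `count_changed_codons` is a hypothetical helper; the substitution GAG → GUG is chosen because it is the same kind of glutamate-to-valine change as in the sickle cell example.

```python
def codons(mrna):
    """Split into complete triplets, dropping any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def count_changed_codons(original, mutant):
    """Codons that differ position by position, plus codons lost or gained."""
    a, b = codons(original), codons(mutant)
    differing = sum(x != y for x, y in zip(a, b))
    return differing + abs(len(a) - len(b))


original = "AUGGAGCUUAAGGCU"          # hypothetical: AUG GAG CUU AAG GCU
substituted = "AUGGUGCUUAAGGCU"       # base 5 changed A -> U (GAG -> GUG)
deleted = original[:4] + original[5:]  # base 5 removed

print(count_changed_codons(original, substituted))  # 1
print(count_changed_codons(original, deleted))      # 4
```

The substitution touches one codon out of five, while the deletion disturbs every codon from the second one onward, including the last codon that can no longer be completed.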

tRNA — the adapter that reads the code

The genetic code could not function without a molecule to physically connect a codon to its amino acid. Francis Crick predicted this need from the start: amino acids have no structural feature that lets them recognise an mRNA triplet directly, so an adapter molecule must do the reading. That adapter is tRNA. It was then called soluble RNA (sRNA) and was already known before the genetic code was postulated, although its role as the adapter was assigned much later. Each tRNA has an anticodon loop whose three bases are complementary to a codon, and an amino acid acceptor end that carries the matching amino acid. tRNAs are specific for each amino acid; there is a distinct initiator tRNA, and there are no tRNAs for the stop codons.
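Codon and anticodon complementarity can be sketched with a base-pairing lookup. The helper names are hypothetical, and the sketch assumes strict A-U and G-C pairing at all three positions, so it ignores wobble pairing at the third codon base.

```python
# Standard RNA base pairs.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_3_to_5(codon):
    """Anticodon written 3'->5', so it lines up under a codon written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in codon)


def anticodon_5_to_3(codon):
    """The same anticodon written in the conventional 5'->3' direction."""
    return anticodon_3_to_5(codon)[::-1]


for codon in ("AUG", "UUU", "AAA", "GGC"):
    print(codon, anticodon_3_to_5(codon), anticodon_5_to_3(codon))
# AUG UAC CAU
# UUU AAA AAA
# AAA UUU UUU
# GGC CCG GCC
```

The two functions return the same three bases in opposite orders, which is why an anticodon must always be quoted together with its direction.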

Worked examples

Worked example 1

An mRNA reads 5′-AUG UUU UUC UUC UUU UUU UUC-3′. Using the codon table, predict the amino acid sequence it codes.

Read the message in triplets from the start codon. AUG codes methionine. Both UUU and UUC code phenylalanine (an example of degeneracy). So the polypeptide is Met-Phe-Phe-Phe-Phe-Phe-Phe. Note that the reverse problem (given the amino acid sequence, predict the mRNA) has no unique answer, because each phenylalanine could be UUU or UUC. That non-uniqueness in the reverse direction is degeneracy in action; it is not ambiguity, since each codon still specifies exactly one amino acid.
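How non-unique the reverse problem is can be counted: multiply the number of codons available for each residue. The sketch uses only the two codon counts needed here (one codon for Met, two for Phe); `possible_mrnas` is a hypothetical helper name.

```python
from math import prod

# Codons per amino acid in the standard code, for this example only.
CODON_COUNT = {"Met": 1, "Phe": 2}


def possible_mrnas(peptide):
    """Number of distinct coding sequences that give this peptide."""
    return prod(CODON_COUNT[residue] for residue in peptide)


peptide = ["Met"] + ["Phe"] * 6
print(possible_mrnas(peptide))  # 64
```

One mRNA gives one peptide, but this seven-residue peptide could have come from 1 × 2⁶ = 64 different coding sequences.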

Worked example 2

An RNA has 999 bases and codes a protein of 333 amino acids. If the base at position 901 is deleted so the RNA becomes 998 bases, how many codons are altered?

999 bases / 3 = 333 codons, matching 333 amino acids. The first 900 bases form 300 intact codons. Deleting base 901 shifts the reading frame from that point onward, so every codon after the first 300 is misread. Altered codons = 333 − 300 = 33. This is a frameshift deletion (one base, not a multiple of three).
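The same count can be written as a small function. It is a sketch under the convention used in the solution above: every codon from the one containing the deleted base to the end is counted as altered. The name `altered_codons` is hypothetical.

```python
def altered_codons(total_bases, deleted_position):
    """Codons misread after deleting the base at a 1-based position."""
    total_codons = total_bases // 3
    # Codons lying wholly before the deleted base are read normally.
    intact_codons = (deleted_position - 1) // 3
    return total_codons - intact_codons


print(altered_codons(999, 901))  # 33
```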

Worked example 3

For the mRNA 5′-AACAGCGGUGCUAUU-3′, which change leaves the reading frame unchanged: (a) inserting G at position 5, (b) deleting G at position 5, or (c) deleting GGU from positions 7, 8 and 9?

Inserting or deleting a single base shifts the frame, so (a) and (b) both cause a frameshift. Deleting three consecutive bases — GGU — removes exactly one whole codon, so the reading frame past that point is unchanged. The answer is (c). Rule: only insertions or deletions in multiples of three preserve the frame.
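Each option can be applied to the mRNA and re-read in triplets to see what happens downstream. This is a sketch: it assumes "insertion at position 5" means the new base becomes the fifth base, and `codons` is a hypothetical helper.

```python
def codons(mrna):
    """Split into complete triplets, dropping any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


mrna = "AACAGCGGUGCUAUU"

# Positions in the question are 1-based; Python slices are 0-based.
option_a = mrna[:4] + "G" + mrna[4:]   # insert G as the new 5th base
option_b = mrna[:4] + mrna[5:]         # delete the G at position 5
option_c = mrna[:6] + mrna[9:]         # delete GGU at positions 7-9

print(codons(mrna))      # ['AAC', 'AGC', 'GGU', 'GCU', 'AUU']
print(codons(option_a))  # ['AAC', 'AGG', 'CGG', 'UGC', 'UAU']
print(codons(option_b))  # ['AAC', 'ACG', 'GUG', 'CUA']
print(codons(option_c))  # ['AAC', 'AGC', 'GCU', 'AUU']
```

Only option (c) leaves the downstream codons GCU and AUU readable exactly as before.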

Worked example 4

Which property of the genetic code allows bacteria to manufacture human insulin by recombinant DNA technology?

The code being nearly universal means the same codon dictionary operates in humans and in bacteria. A human insulin gene transferred into a bacterium is therefore read correctly by the bacterial translation machinery. The relevant property is universality — not degeneracy, specificity or unambiguity.

Common confusion & NEET traps

Most errors on this subtopic come from mixing up properties that sound similar, or from careless codon counting. The traps below are the recurring ones in NEET papers.

NEET PYQ Snapshot — Genetic Code

Real NEET questions on codons, code properties, start/stop codons and reading frame.

NEET 2025

Who proposed that the genetic code for amino acids should be made up of three nucleotides?

  1. Franklin Stahl
  2. George Gamow
  3. Francis Crick
  4. Jacques Monod
Answer: (2)

Why: George Gamow, a physicist, reasoned that with only 4 bases coding 20 amino acids the code must be a triplet, since 4³ = 64 gives enough combinations.

NEET 2021

Statement I: The codon 'AUG' codes for methionine and phenylalanine. Statement II: 'AAA' and 'AAG' both codons code for the amino acid lysine. In the light of the above statements, choose the correct answer.

  1. Statement I is incorrect but Statement II is true
  2. Both Statement I and Statement II are true
  3. Both Statement I and Statement II are false
  4. Statement I is correct but Statement II is false
Answer: (1)

Why: AUG codes methionine and acts as the initiator codon — it does not code phenylalanine, so Statement I is wrong. AAA and AAG both code lysine, an example of degeneracy, so Statement II is true.

NEET 2019

Which of the following features of genetic code allows bacteria to produce human insulin by recombinant DNA technology?

  1. Genetic code is not ambiguous
  2. Genetic code is redundant
  3. Genetic code is nearly universal
  4. Genetic code is specific
Answer: (3)

Why: Because the code is nearly universal, a human gene is read the same way by bacterial translation machinery, allowing bacteria to make human insulin.

NEET 2019

Under which condition will there be no change in the reading frame of the mRNA 5′-AACAGCGGUGCUAUU-3′?

  1. Insertion of G at 5th position
  2. Deletion of G from 5th position
  3. Insertion of A and G at 4th and 5th positions respectively
  4. Deletion of GGU from 7th, 8th and 9th positions
Answer: (4)

Why: Deleting three consecutive bases (GGU) removes one whole codon, leaving the reading frame intact. Inserting or deleting one or two bases shifts the frame.

FAQs — Genetic Code

Quick answers to the questions students ask most about the genetic code.

Why is the genetic code a triplet code and not a doublet?

There are only 4 bases but 20 amino acids must be specified. A singlet code gives 4 combinations and a doublet code gives 4 × 4 = 16 combinations — both fewer than 20. A triplet code gives 4 × 4 × 4 = 64 combinations, enough to code all 20 amino acids with codons to spare. George Gamow first reasoned that the code must be a triplet.

What does it mean that the genetic code is degenerate?

Degenerate means most amino acids are coded by more than one codon. Since 61 codons code for only 20 amino acids, several codons map to the same amino acid — for example, both AAA and AAG code for lysine. Degeneracy is not the same as ambiguity: one codon still specifies exactly one amino acid.

What is the dual function of the AUG codon?

AUG has two roles. It codes for the amino acid methionine, and it also acts as the initiator (start) codon that marks where translation begins on an mRNA. The same triplet is read by a special initiator tRNA at the start of a message and by an ordinary methionine tRNA when AUG occurs internally.

Which codons are stop codons and do they code for any amino acid?

UAA, UAG and UGA are the three stop or termination codons. They do not code for any amino acid and there are no tRNAs for them. When the ribosome reaches a stop codon a release factor binds and the finished polypeptide is released. Of the 64 codons, 61 code for amino acids and these 3 are stop codons.

How does a frameshift mutation differ from a point substitution?

A point substitution changes a single base, altering at most one codon and usually one amino acid. Insertion or deletion of one or two bases shifts the reading frame from the point of change onward, so every codon downstream is misread — this is a frameshift mutation. Insertion or deletion of three bases (or a multiple of three) adds or removes whole codons and keeps the reading frame intact.

Who deciphered the genetic code?

Har Gobind Khorana developed chemical methods to synthesise RNA with defined base combinations. Marshall Nirenberg built a cell-free protein synthesis system that allowed codons to be read out, and Severo Ochoa's enzyme polynucleotide phosphorylase made RNAs of defined sequence. Together these efforts produced the codon table. Francis Crick had earlier predicted the adapter molecule that became tRNA.