DNA structure — the Watson-Crick double helix
DNA is a polymer of nucleotides. Each nucleotide has three parts — a nitrogenous base, a pentose sugar (deoxyribose in DNA, ribose in RNA), and a phosphate group. The base links to the 1' carbon of the sugar via an N-glycosidic bond, and phosphates connect the 5' carbon of one sugar to the 3' carbon of the next. The sugar-phosphate backbone runs with a defined 5' → 3' polarity.
In 1953, James Watson and Francis Crick — using the X-ray diffraction data of Maurice Wilkins and Rosalind Franklin and the base-ratio observations of Erwin Chargaff — proposed the double helix. Two polynucleotide strands run antiparallel; bases pair specifically across the helix — adenine with thymine through two hydrogen bonds, guanine with cytosine through three. Pairing of a purine (A or G) with a pyrimidine (T or C) keeps the helix at uniform width.
Watson-Crick features & Chargaff's rule: the five geometric facts plus the two ratio rules that NEET asks again and again — distance per base pair, base pairs per turn, pitch, sugar-phosphate backbone, antiparallel polarity, and A = T, G = C.
1 · Antiparallel strands
5' ↔ 3'
opposite polarity
Two polynucleotide chains run in opposite directions — one 5' → 3', the other 3' → 5' — held together by hydrogen bonds between bases.
2 · Base pairing
A=T, G≡C
2H / 3H bonds
Purine always pairs with pyrimidine — keeps the helix width uniform at 2 nm.
PYQ 2021: if A = 30%, G = C = 20%
3 · Right-handed helix
3.4 nm
pitch of one turn
The B-form DNA in cells is right-handed. One full turn spans 3.4 nm.
4 · 10 bp per turn
0.34 nm
per base pair
There are approximately 10 base pairs in each turn — distance between two consecutive base pairs is 0.34 nm.
PYQ 2020: 6.6×10⁹ bp ≈ 2.2 m
5 · Backbone outside
Sugar-PO₄
on the outside
The hydrophilic sugar-phosphate backbone faces solvent; bases stack inside the helix — keeps DNA stable.
6 · Chargaff: A = T
A : T = 1
ratio rule
In any double-stranded DNA, the amount of adenine equals the amount of thymine.
7 · Chargaff: G = C
G : C = 1
ratio rule
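Guanine equals cytosine, so (A+G)/(T+C) = 1 in any double-stranded DNA. It is the (A+T)/(G+C) ratio that differs between species while staying constant within one. These equalities pointed Watson and Crick towards specific base pairing.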
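The two ratio rules can be turned into a small calculation. The sketch below assumes double-stranded DNA and takes the adenine percentage as its only input; the function name `chargaff_percentages` is illustrative, not from the source.

```python
def chargaff_percentages(adenine_pct: float) -> dict[str, float]:
    """Return the percentage of each base in double-stranded DNA, given %A."""
    if not 0 <= adenine_pct <= 50:
        raise ValueError("In double-stranded DNA, %A must lie between 0 and 50")
    thymine_pct = adenine_pct                    # Chargaff: A = T
    gc_total = 100 - 2 * adenine_pct             # whatever is left is G + C
    guanine_pct = cytosine_pct = gc_total / 2    # Chargaff: G = C
    return {"A": adenine_pct, "T": thymine_pct, "G": guanine_pct, "C": cytosine_pct}


# The PYQ 2021 case: A = 30%
print(chargaff_percentages(30))
```

For A = 30%, the function returns T = 30% and G = C = 20%, matching the PYQ answer. The rule does not hold for single-stranded nucleic acids, so the function should not be used for them.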
Packaging — how 2 metres fit in a 10-micron nucleus
If the DNA in one human cell were stretched out, it would reach 2.2 metres — yet the nucleus is barely 10 micrometres across. The fit is hierarchical and built on one repeating unit: the nucleosome.
The packaging is electrostatic. DNA's phosphate backbone is negatively charged. Histones (small, basic proteins rich in the positively charged amino acids lysine and arginine) neutralise the charge. Five histone types exist (H1, H2A, H2B, H3, H4); two molecules each of four of them (H2A, H2B, H3, H4) assemble into a histone octamer. DNA wraps around this octamer to form a nucleosome; a typical nucleosome contains about 200 base pairs of DNA, counting the linker stretch between beads. H1 sits outside the bead and locks the DNA in place. Euchromatin is loosely packed and transcriptionally active; heterochromatin is densely packed and transcriptionally inactive.
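As a rough check on the scale of packaging, the figures quoted in this chapter (6.6 × 10⁹ bp per mammalian cell, 200 bp per nucleosome) give an estimate of the number of nucleosomes per nucleus. This is a back-of-the-envelope sketch; the constant and function names are illustrative.

```python
TOTAL_BP = 6.6e9           # base pairs in a typical mammalian cell (figure used in this chapter)
BP_PER_NUCLEOSOME = 200    # DNA associated with one typical nucleosome
HISTONES_PER_OCTAMER = 8   # two each of H2A, H2B, H3 and H4


def nucleosome_count(total_bp: float, bp_per_nucleosome: int = BP_PER_NUCLEOSOME) -> float:
    """Estimate how many nucleosomes are needed to package total_bp of DNA."""
    return total_bp / bp_per_nucleosome


count = nucleosome_count(TOTAL_BP)
print(f"nucleosomes: {count:.2e}")
print(f"core histone molecules: {count * HISTONES_PER_OCTAMER:.2e}")
```

Under these assumptions the estimate is about 3.3 × 10⁷ nucleosomes.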
The search for the genetic material — Griffith → Avery → Hershey-Chase
For half of the twentieth century the answer was uncertain. Three experiments — each more decisive than the last — moved the field from suspicion of protein to certainty of DNA.
Hershey and Chase grew one batch of bacteriophages in medium containing radioactive phosphorus (³²P, found in DNA but not protein) and a second batch in medium containing radioactive sulphur (³⁵S, found in protein but not DNA). After the phages had infected E. coli, the cultures were agitated in a blender to dislodge the phage coats and then centrifuged. ³²P appeared inside the bacteria, while ³⁵S stayed in the supernatant with the coats. Only DNA had entered the cells, which is the answer that both NEET 2023 (Q.112) and NEET 2017 (Q.51) asked for.
Properties of genetic material — why DNA, not RNA
A genetic material must (i) replicate, (ii) be chemically stable, (iii) allow slow mutations, and (iv) express Mendelian characters. RNA can replicate and mutate but its 2'-OH group is chemically labile; DNA's thymine (vs uracil) is more stable. Hence DNA stores information, RNA does catalysis and transient transfer.
The RNA world hypothesis
If DNA needs protein enzymes to replicate, and proteins need DNA-encoded genes to be made, which came first? The RNA world hypothesis proposes RNA as the first genetic material. Evidence: many essential processes — splicing, peptide-bond formation by 23S rRNA — depend on ribozymes (catalytic RNA); cofactors like ATP, NAD, FAD carry RNA-derived nucleotides; RNA can both store information and catalyse reactions. DNA later arose as a more stable information carrier; proteins took over catalysis. NEET 2018 (Q.115) flagged the match "ribozyme — nucleic acid."
DNA replication — the semi-conservative model
The base-pairing rule made the replication mechanism almost self-evident. Watson and Crick concluded their 1953 paper with one of the most famous understatements in biology: it had not escaped their notice that the specific pairing they had postulated immediately suggested a possible copying mechanism. Each parental strand serves as a template; the new strand is built by base-pairing — semi-conservative replication. Each daughter DNA molecule contains one parental strand and one newly synthesised strand.
"It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material."
Watson & Crick, Nature (1953)
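The copying mechanism implied by specific base pairing can be sketched in a few lines. This is a toy model, assuming strands are plain strings written 5' → 3'; the names `complementary_strand` and `replicate` are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(template_5to3: str) -> str:
    """Return the newly built strand (written 5'->3') for a template written 5'->3'."""
    # Pair every base, then reverse, because the two strands are antiparallel.
    return "".join(COMPLEMENT[base] for base in reversed(template_5to3.upper()))


def replicate(strand_a: str, strand_b: str) -> list[tuple[str, str]]:
    """Semi-conservative copy: each daughter duplex is (parental strand, new strand)."""
    return [
        (strand_a, complementary_strand(strand_a)),
        (strand_b, complementary_strand(strand_b)),
    ]


parent = ("ATGC", complementary_strand("ATGC"))
for old, new in replicate(*parent):
    print("parental:", old, "| new:", new)
```

Each daughter duplex keeps one parental strand and gains one new strand, which is what "semi-conservative" means.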
Meselson and Stahl's experiment (1958)
Three models were possible — conservative (both parental strands stay together), semi-conservative (one old + one new), dispersive (parental DNA fragmented and mixed with new DNA). Matthew Meselson and Franklin Stahl distinguished them in E. coli:
- Step 1. Grow E. coli in ¹⁵NH₄Cl medium until DNA becomes uniformly heavy.
- Step 2. Transfer cells to ¹⁴N medium; sample at 20-minute intervals and centrifuge in a CsCl density gradient.
- Step 3. After one generation, a single intermediate-density band (¹⁵N/¹⁴N hybrid). After two generations, two bands — one intermediate, one light.
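The single intermediate band after one generation ruled out conservative replication, which predicts separate heavy and light bands; the distinct light band after two generations ruled out dispersive replication, which predicts one band of steadily decreasing density. Only the semi-conservative model fit. The experiment was first performed in a bacterium (NEET 2018, Q.91). Taylor and co-workers obtained similar results in a eukaryote, Vicia faba, using tritiated thymidine.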
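The band patterns can be reproduced with a small simulation. The sketch below models each duplex as a pair of strand labels and compares the semi-conservative and conservative predictions; it is a simplified illustration, and all names in it are assumptions rather than anything from the original experiment.

```python
from collections import Counter

BAND_BY_HEAVY_STRANDS = {2: "heavy", 1: "intermediate", 0: "light"}


def band(duplex: tuple[str, str]) -> str:
    """Classify a duplex by how many of its two strands carry 15N."""
    return BAND_BY_HEAVY_STRANDS[duplex.count("15N")]


def semi_conservative(generations: int) -> Counter:
    """Each parental strand templates a new 14N strand."""
    population = [("15N", "15N")]
    for _ in range(generations):
        population = [(strand, "14N") for duplex in population for strand in duplex]
    return Counter(band(d) for d in population)


def conservative(generations: int) -> Counter:
    """Parental duplexes stay intact; every copy is an all-new 14N duplex."""
    population = [("15N", "15N")]
    for _ in range(generations):
        population = population + [("14N", "14N")] * len(population)
    return Counter(band(d) for d in population)


for gen in (1, 2):
    print(gen, "semi-conservative:", dict(semi_conservative(gen)),
          "| conservative:", dict(conservative(gen)))
```

After one generation the semi-conservative model gives only intermediate duplexes, while the conservative model gives one heavy and one light duplex. After two generations the semi-conservative model gives equal numbers of intermediate and light duplexes, as observed.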
Replication machinery — enzymes at the fork
Replication takes place at a Y-shaped region of DNA called the replication fork, which opens at an origin of replication. Several enzymes collaborate:
- Helicase — unwinds the parental double helix at the fork.
- Primase — synthesises a short RNA primer providing a free 3'-OH end.
- DNA-dependent DNA polymerase III — the main replication enzyme. Adds deoxyribonucleotides to the 3'-OH, polymerising only in the 5' → 3' direction. It uses dNTPs as substrates (the energy for polymerisation comes from the two terminal phosphates).
- DNA polymerase I — replaces RNA primer with DNA.
- DNA ligase — seals the nicks between Okazaki fragments.
Because parental strands are antiparallel and DNA polymerase synthesises only 5' → 3', the two new strands are made differently. The leading strand grows continuously with the fork. The lagging strand grows discontinuously away from the fork in short bursts — Okazaki fragments later joined by DNA ligase. Eukaryotic replication is restricted to the S phase; in bacteria, it happens prior to fission (NEET 2017, Q.133).
Transcription — DNA to RNA
Transcription copies one strand of DNA into RNA. Of the two DNA strands, only the template strand (3' → 5') is read by RNA polymerase. The other strand has the same sequence as the RNA, except that it carries T where the RNA carries U; it is the coding strand and serves as the reference. A transcription unit has three parts, all defined with respect to the coding strand: a promoter towards the 5' end, upstream of the structural gene (the RNA polymerase binding site), the structural gene itself, and a terminator towards the 3' end, downstream. The polymerase reads the template 3' → 5' and writes RNA 5' → 3'.
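The relationship between template strand, coding strand and RNA can be sketched as follows. The example assumes the template is supplied as a string written 3' → 5', so the RNA comes out already in 5' → 3' order; the function names are illustrative.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Template strand written 3'->5' in, RNA written 5'->3' out."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def coding_strand(template_3to5: str) -> str:
    """Coding strand (5'->3'): same as the RNA, with T in place of U."""
    return transcribe(template_3to5).replace("U", "T")


template = "TACGGT"                 # written 3'->5'
print(transcribe(template))         # RNA, 5'->3'
print(coding_strand(template))      # coding strand, 5'->3'
```

For the template 3'-TACGGT-5', the RNA is 5'-AUGCCA-3' and the coding strand is 5'-ATGCCA-3'.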
Prokaryotic transcription
Bacteria use a single DNA-dependent RNA polymerase for all RNA classes (mRNA, tRNA, rRNA). The core enzyme is α₂ββ'ω; adding the σ subunit gives the holoenzyme. The σ subunit recognises the promoter for initiation and leaves once polymerisation begins; the ρ (rho) factor enables termination. By associating transiently with these factors, the same RNA polymerase carries out initiation, elongation, and termination (NEET 2021, Q.171). Because there is no nuclear envelope, translation of an mRNA can begin before its transcription is complete.
Eukaryotic transcription — three polymerases, splicing, capping, tailing
Three RNA polymerases divide the labour: RNA polymerase I — 28S, 18S, 5.8S rRNAs (nucleolus); RNA polymerase II — heterogeneous nuclear RNA (hnRNA, precursor of mRNA); RNA polymerase III — tRNA, 5S rRNA, snRNAs. The same RNA polymerase III question appeared in NEET 2021 (Q.136) and NEET 2023 (Q.117).
Eukaryotic structural genes are interrupted: coding exons are interspersed with non-coding introns. The primary transcript (hnRNA) is spliced in spliceosomes (absent in bacteria; NEET 2017, Q.121) to remove the introns and join the exons. It is also capped (a 7-methylguanosine cap added at the 5' end) and tailed (200–300 adenylate residues added at the 3' end). The mature mRNA is exported through nuclear pores for translation.
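The three processing steps can be pictured with a toy string model. This sketch assumes the hnRNA is given as a list of labelled segments and represents the cap with the text label "m7G-"; both the data layout and the function name `process_hnrna` are illustrative assumptions, not a description of how the cell marks introns.

```python
def process_hnrna(segments: list[tuple[str, str]], tail_length: int = 200) -> str:
    """Turn a segmented hnRNA into a mature mRNA string.

    segments: ("exon" | "intron", sequence) pairs in 5'->3' order.
    """
    # Splicing: drop the introns and join the exons in their original order.
    spliced = "".join(seq for kind, seq in segments if kind == "exon")
    # Capping and tailing: add a 5' cap label and a 3' poly(A) tail.
    return "m7G-" + spliced + "A" * tail_length


hnrna = [("exon", "AUGGCU"), ("intron", "GUAAGU"), ("exon", "UAA")]
print(process_hnrna(hnrna, tail_length=5))
```

The default tail length of 200 follows the 200–300 residue range quoted above; the example call uses 5 only to keep the output short.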
The genetic code
The code translates nucleotide sequences into amino acid sequences. Gamow argued a 3-letter code over 4 letters gives 4³ = 64 combinations — enough for 20 amino acids. Har Gobind Khorana (synthetic templates) and Marshall Nirenberg (cell-free system) cracked the code; Severo Ochoa contributed polynucleotide phosphorylase.
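Gamow's counting argument is easy to verify by enumeration. The sketch below lists every triplet over the four RNA bases and separates out the three stop codons (UAA, UAG, UGA); the variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Every possible ordered triplet of the four bases.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print("doublets:", len(BASES) ** 2)        # too few for 20 amino acids
print("triplets:", len(all_codons))
print("sense codons:", len(sense_codons))
```

A two-letter code would give only 16 combinations, fewer than the 20 amino acids, whereas triplets give 64, of which 61 specify amino acids.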
Triplet
3 letters
read in groups
61 codons specify amino acids; 3 are stop codons (UAA, UAG, UGA). AUG is start.
Degenerate
Redundant
multiple codons per AA
Most amino acids are coded by more than one codon (e.g., lysine: AAA, AAG).
Unambiguous
One AA
per codon
Each codon codes for one and only one amino acid. The reverse is not true.
Non-overlapping
Sequential
comma-less
Codons are read one after another with no overlap and no gaps.
Nearly universal
All life
few exceptions
A bacterium reads the same codons as a human — this is why E. coli can make human insulin.
PYQ 2019: universality enables rDNA
AUG: start + Met
Dual role
initiator + Met codon
AUG codes for methionine AND signals translation start. NEET 2021 (Q.199) caught students who thought it also coded for phenylalanine.
Inserting or deleting a number of bases that is not a multiple of three shifts the reading frame from that point onward: a frameshift mutation. NEET 2017 (Q.76) and NEET 2019 (Q.82) both tested this. Delete one base near the start of a 999-base mRNA and every codon downstream is misread; inserting or deleting a multiple of three bases adds or removes whole codons and leaves the downstream frame intact.
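The effect of a deletion on the reading frame can be seen by splitting a short mRNA into codons before and after the change. This is a minimal sketch; the example sequence and the helper names `codons` and `delete_bases` are made up for illustration.

```python
def codons(mrna: str) -> list[str]:
    """Split an mRNA into complete triplets from its first base; leftover bases are dropped."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def delete_bases(mrna: str, position: int, count: int) -> str:
    """Remove `count` bases starting at the 0-based index `position`."""
    return mrna[:position] + mrna[position + count:]


original = "AUGGCUAAAGGGUUUUAA"
print(codons(original))                        # intact frame
print(codons(delete_bases(original, 3, 1)))    # one base lost: frame shifts
print(codons(delete_bases(original, 3, 3)))    # three bases lost: one codon removed
```

Deleting one base changes every codon after the deletion point, while deleting three bases removes a single codon and leaves the rest of the frame as it was.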
Translation — RNA to protein
Translation synthesises a polypeptide from an mRNA sequence. Three machines collaborate: mRNA (message), tRNA (adapter), and ribosome (workbench).
Activation of amino acids. Each tRNA must first be loaded with its amino acid. Aminoacyl-tRNA synthetase catalyses this two-step charging reaction using ATP, linking the amino acid to the 3' end of tRNA. There are 20 such enzymes — one per amino acid. This is the first phase of translation (NEET 2020, Q.17 — though the option set there listed mRNA binding as the first phase of the ribosomal cycle).
tRNA — the adapter. Hypothesised by Crick before it was found. A clover-leaf shape (inverted L in 3D), with an anticodon loop that base-pairs with the codon, and a CCA amino-acid attachment end. A special initiator tRNA reads AUG; no tRNAs exist for stop codons.
The ribosome. Two subunits, structural rRNAs, and about 80 different proteins (NEET 2023, Q.150). The 23S rRNA is the ribozyme that catalyses peptide-bond formation. Three sites: A (aminoacyl), P (peptidyl), E (exit).
Initiation begins when the small subunit encounters mRNA — NEET 2022 (Q.118). The small subunit binds mRNA, the initiator tRNA reads AUG, then the large subunit docks. Elongation: aminoacyl-tRNA enters A site → peptide bond formed → ribosome translocates one codon. Termination: a stop codon enters A site → a release factor (not a tRNA) binds → polypeptide released.
Many ribosomes can translate one mRNA simultaneously — a polysome (also polyribosome) — NEET 2016 (Q.132) and NEET 2018 (Q.147). The mRNA carries untranslated regions at both ends — a 5' UTR before AUG and a 3' UTR after the stop codon — that regulate translation efficiency.
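The reading of an mRNA from start codon to stop codon can be sketched with a deliberately partial codon table. The table below contains only a handful of codons chosen for the example, and unknown codons are reported as "Xaa"; the function name `translate` and the example sequence are illustrative assumptions.

```python
# Partial codon table: only the codons needed for this illustration.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe",
    "AAA": "Lys", "AAG": "Lys", "GCU": "Ala", "GGG": "Gly",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna: str) -> list[str]:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")            # everything before this is 5' UTR
    if start == -1:
        return []                       # no start codon, no polypeptide
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:        # no tRNA reads a stop codon
            break
        peptide.append(CODON_TABLE.get(codon, "Xaa"))
    return peptide


mrna = "GG" + "AUG" + "GCU" + "AAA" + "UUU" + "UAA" + "GG"   # UTRs at both ends
print(translate(mrna))
```

The two bases before AUG and the two after UAA stand in for the 5' and 3' UTRs, which are present in the mRNA but not translated.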
Regulation of gene expression — the lac operon
Genes are not transcribed at all times. The decision to transcribe is made at the level of promoter activity, regulated by repressors and activators. The prototype is the lac operon, worked out by François Jacob and Jacques Monod in the early 1960s — the first transcriptionally regulated system elucidated. It is a single polycistronic transcription unit in E. coli with one regulatory gene and three structural genes:
i gene
Repressor
regulatory protein
Constitutively expressed. Codes for the lac repressor, a tetramer that binds the operator and blocks transcription.
z gene
β-galactosidase
cleaves lactose
Hydrolyses lactose to glucose + galactose.
y gene
Permease
membrane transporter
Increases permeability of the cell to lactose.
a gene
Transacetylase
side-product enzyme
Transfers acetyl groups to β-galactosides; an accessory role.
When lactose is absent: the i-gene product (repressor) binds the operator. RNA polymerase is blocked; z, y, a are silent.
When lactose enters the cell: residual β-galactosidase converts some lactose to allolactose, which binds and inactivates the repressor. RNA polymerase transcribes z, y, a as a single polycistronic mRNA. As lactose is consumed, the repressor rebinds and the operon shuts down. NEET 2016 (Q.116) asked for the inducer (answer: lactose). NEET 2019 (Q.76) tested the four gene products. NEET 2018 (Q.96) matched Jacob–Monod to lac. NEET 2018 (Q.176) flagged that an enhancer is not part of an operon: operons contain regulator + promoter + operator + structural genes.
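The on/off logic of the operon can be written as a small truth function. This sketch models only the repressor switch described above and treats lactose presence as a simple boolean; the function name `lac_operon_products` is illustrative.

```python
STRUCTURAL_GENES = {
    "z": "β-galactosidase",
    "y": "permease",
    "a": "transacetylase",
}


def lac_operon_products(lactose_present: bool) -> list[str]:
    """Return the enzymes made from the polycistronic mRNA, if any."""
    # The i gene is constitutive, so repressor is always made;
    # the inducer (lactose, via allolactose) inactivates it.
    repressor_bound_to_operator = not lactose_present
    if repressor_bound_to_operator:
        return []                       # RNA polymerase is blocked
    return list(STRUCTURAL_GENES.values())


print(lac_operon_products(False))   # operon off
print(lac_operon_products(True))    # operon on: z, y, a products
```

With lactose absent the list is empty; with lactose present all three structural gene products are made together, reflecting the single polycistronic transcript.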
The Human Genome Project
HGP was launched in 1990 as a mega-project coordinated by the US Department of Energy and the NIH, with the Wellcome Trust (UK) as a major partner. Its estimated cost was around 9 billion US dollars, and it was completed in 2003. Goals: identify all human genes, sequence the 3 billion base pairs, store the data, develop analysis tools, transfer technologies, and address ethical, legal, and social issues.
Two methodologies were used. Expressed Sequence Tags (ESTs) focused on the genes actually expressed as RNA — a shortcut to the functional portion of the genome (NEET 2019, Q.68 and NEET 2023, Q.123). The second approach, sequence annotation, sequenced the entire genome and assigned function. About 1.4 million single-nucleotide polymorphisms (SNPs) were identified, the basis of genetic mapping and disease-gene discovery. Less than 2% of the genome codes for proteins; over 50% is repeated sequences; chromosome 1 has the most genes (2,968), the Y the fewest (231).
DNA fingerprinting
If 99.9% of human DNA is shared, where does forensic identification get its power? From the 0.1% that varies — especially from variable number tandem repeats (VNTRs), short DNA sequences repeated in tandem, the repeat number differing between individuals. VNTRs belong to a class of mini-satellite DNA; they range from 0.1 to 20 kb. After hybridisation with a VNTR probe, the autoradiogram gives a banding pattern unique to each individual (identical only between identical twins).
Alec Jeffreys developed the technique in 1985 (NEET 2018, Q.96). Classic protocol: (i) isolate DNA, (ii) digest with restriction endonucleases, (iii) electrophorese, (iv) Southern blot, (v) hybridise with a radiolabelled VNTR probe, (vi) autoradiograph. Modern fingerprinting uses PCR to amplify VNTR regions before analysis. DNA polymorphism — sequence variation at a given locus — is the foundation of both fingerprinting and genetic mapping (NEET 2022, Q.123). Ethidium-bromide-stained DNA bands fluoresce bright orange under UV (NEET 2021, Q.105 and NEET 2023, Q.120).
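The idea that individuals differ in repeat number at a VNTR locus can be illustrated by counting tandem copies of a repeat unit in a sequence. The sequences, the repeat unit "GATA" and the function name `tandem_repeat_count` below are invented for the example; real fingerprinting compares band positions across several loci rather than reading sequences directly.

```python
def tandem_repeat_count(sequence: str, unit: str) -> int:
    """Length (in copies) of the longest run of back-to-back `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        # Keep stepping one unit at a time while the next copy is present.
        while sequence.startswith(unit, start + run * len(unit)):
            run += 1
        best = max(best, run)
    return best


person_1 = "TT" + "GATA" * 3 + "CC"
person_2 = "TT" + "GATA" * 7 + "CC"
print(tandem_repeat_count(person_1, "GATA"), tandem_repeat_count(person_2, "GATA"))
```

A larger repeat number means a longer restriction fragment at that locus, which is why the two individuals would show bands at different positions after electrophoresis.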
The central dogma — a closing schematic
Francis Crick proposed the central dogma in 1958: information flows from DNA to RNA to protein. DNA replicates (DNA → DNA), is transcribed to RNA, then translated to protein. In retroviruses, information flows backward via reverse transcriptase (Temin and Baltimore) — but protein never feeds information back. NEET 2021 (Q.126) tested this exact flow.
"DNA makes RNA, RNA makes protein — and protein makes us."
Francis Crick — the central dogma, paraphrased
NEET PYQ Snapshot
Five highest-frequency questions from this chapter's 42-strong PYQ pool — solve before moving on.
Unequivocal proof that DNA is the genetic material was first proposed by —
Answer: (3) Hershey and Chase
Why: Griffith proposed a "transforming principle" but did not identify it. Avery, MacLeod and McCarty showed biochemically that the principle was DNA but did not satisfy every critic. Hershey and Chase's blender experiment with ³²P-labelled DNA and ³⁵S-labelled protein in bacteriophages provided the final, unequivocal proof. Wilkins and Franklin produced the X-ray diffraction data of DNA.
What is the role of RNA polymerase III in the process of transcription in eukaryotes?
Answer: (3) tRNA, 5S rRNA, snRNA
Why: Eukaryotes divide the labour among three polymerases: RNA Pol I transcribes the larger rRNAs (28S, 18S, 5.8S); RNA Pol II transcribes hnRNA, the precursor of mRNA; RNA Pol III transcribes the smaller RNAs (tRNA, 5S rRNA, snRNA). This exact question appeared in both 2021 and 2023, so lock it in.
If adenine makes 30% of the DNA molecule, what will be the percentage of thymine, guanine and cytosine in it?
Answer: (4) T = 30%, G = C = 20%
Why: Chargaff's rule: A = T, G = C. If A = 30%, then T = 30%. So A + T = 60% and G + C = 40%, meaning G = C = 20% each.
Which is required as inducer for the expression of the lac operon? Also: match i, z, y, a genes with their products.
Answer: (2) Lactose
Why: Lactose (more specifically, its isomer allolactose) is the inducer. The i gene codes for the repressor; z codes for β-galactosidase; y for permease; a for transacetylase. The repressor is constitutively made; lactose binds and inactivates it, freeing RNA polymerase to transcribe z, y, a as a single polycistronic mRNA.
If the distance between two consecutive base pairs is 0.34 nm and the total number of base pairs of a DNA double helix in a typical mammalian cell is 6.6 × 10⁹ bp, then the length of the DNA is approximately —
Answer: (2) 2.2 metres
Why: Length = number of base pairs × distance per base pair = 6.6 × 10⁹ × 0.34 × 10⁻⁹ m = 2.244 m ≈ 2.2 m. NEET 2022 inverted the same problem: given 1.1 m, it asked for the number of base pairs (answer 3.3 × 10⁹ bp).
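The same arithmetic can be checked in code. The sketch below uses the 0.34 nm rise per base pair and 10 bp per turn quoted in this chapter; the constant and function names are illustrative.

```python
RISE_PER_BP_NM = 0.34    # distance between consecutive base pairs, in nm
BP_PER_TURN = 10         # base pairs in one helical turn


def dna_length_m(base_pairs: float) -> float:
    """Length of a double helix in metres (1 nm = 1e-9 m)."""
    return base_pairs * RISE_PER_BP_NM * 1e-9


def pitch_nm() -> float:
    """Length of one full turn of the helix, in nm."""
    return BP_PER_TURN * RISE_PER_BP_NM


print(f"{dna_length_m(6.6e9):.3f} m")   # the PYQ 2020 case
print(f"{dna_length_m(3.3e9):.3f} m")   # the inverted NEET 2022 case
print(f"{pitch_nm():.1f} nm per turn")
```

For 6.6 × 10⁹ bp this gives about 2.244 m, and for 3.3 × 10⁹ bp about 1.122 m, consistent with the 2.2 m and 1.1 m figures in the two questions.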
Expert FAQs
Questions NEET has asked from this chapter, answered straight.
Who gave the unequivocal proof that DNA is the genetic material?
How many base pairs are in one turn of the DNA double helix?
What is Chargaff's rule?
How many base pairs are in a typical nucleosome?
What does semi-conservative DNA replication mean?
What are Okazaki fragments?
What is the role of RNA polymerase III in eukaryotic transcription?
What is the start codon and what are the stop codons?
What is the inducer of the lac operon?
What is VNTR and how is it used in DNA fingerprinting?