Deep within the cell’s nucleus, there’s our DNA. DNA contains genes, and each gene is basically a specific stretch of the DNA that, in most cases, codes for a protein.
And genes become proteins in two steps: transcription and translation.
Transcription is the first step in creating a protein, during which a specific gene is “read” and copied onto an individual messenger RNA, or mRNA, molecule - which is like a blueprint with instructions on what protein to build.
Now, DNA has two strands, which wrap one around the other to form the characteristic “double helix”.
Each single strand of DNA is composed of four types of nucleotides - which are the individual “letters” or “building blocks” of DNA.
Each DNA nucleotide is made of a sugar called deoxyribose, a phosphate group, and one of four nucleobases - adenine, cytosine, guanine, or thymine - or, commonly, A, C, G, and T for short.
The nucleotides on one strand pair up through hydrogen bonds with nucleotides on the opposing strand to create the double-stranded DNA: specifically, A bonds with T, and C bonds with G, so they’re called complementary bases.
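The pairing rule above can be written down as a tiny lookup table. This is just an illustrative sketch: the names `PAIRS` and `can_pair` are made up for this example and don't come from any library.

```python
# Pairing rules described above: A pairs with T, and C pairs with G.
# PAIRS and can_pair are hypothetical names used only for illustration.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def can_pair(base_a: str, base_b: str) -> bool:
    """Return True if the two DNA bases are complementary."""
    # Unknown letters are simply treated as non-pairing.
    return PAIRS.get(base_a.upper()) == base_b.upper()


print(can_pair("A", "T"))  # True
print(can_pair("C", "G"))  # True
print(can_pair("C", "A"))  # False
```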
Now, of these two strands, one is called the coding, or sense, strand, and the other is called the template, or anti-sense, strand.
The coding strand has a coding sequence of nucleotides that serves as a master blueprint for our protein.
It’s a what-you-see-is-what-you-get kind of thing.
The template strand, on the other hand, has a sequence of nucleotides that is complementary to the sequence on the coding strand.
In addition, the two DNA strands also have a “direction” - the coding strand runs from the 5’ end towards the 3’ end, while the template strand runs from the 3’ to the 5’ end.
A bit like two snakes coiled up together but facing different directions.
So, if the coding strand looks like this:
5’ end - A A T C C A G T A - 3’ end
The template strand will look like this:
3’ end - T T A G G T C A T - 5’ end
*Disclaimer: no cats were harmed in the making of this strand.
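As a quick sanity check, the template strand in this example can be derived from the coding strand by swapping each base for its partner. A minimal sketch, assuming the coding strand is given 5’ to 3’ and we want the template written underneath it, 3’ to 5’ (the function name `template_strand` is made up for this example):

```python
# Translation table for complementary DNA bases: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def template_strand(coding_5_to_3: str) -> str:
    """Return the template strand written 3' to 5', aligned base by base
    under the coding strand (which is written 5' to 3')."""
    return coding_5_to_3.upper().translate(COMPLEMENT)


coding = "AATCCAGTA"
print("5' end - " + " ".join(coding) + " - 3' end")
print("3' end - " + " ".join(template_strand(coding)) + " - 5' end")
# Second line printed: 3' end - T T A G G T C A T - 5' end
```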
Now, transcription starts with the unpacking of DNA from chromatin and the partial unwinding of the double helix, so that individual genes are exposed.
The starting point of a gene is determined by a promoter region, which is a non-coding sequence of nucleotides that marks where to begin transcribing - for example, one very famous promoter element is the TATA box, a short stretch rich in T and A with the consensus sequence T A T A A A.
A few dozen proteins and enzymes come together around the promoter to form what’s called a pre-initiation complex, which includes an enzyme called RNA polymerase.
Then, a process called elongation occurs, during which RNA polymerase unzips the two strands by breaking the hydrogen bonds between the complementary nucleotides over a length of around 14 base pairs at a time.
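To get a feel for what “a sequence that marks where to begin” means, here is a deliberately naive sketch that just scans a strand for the letters T A T A. In a real cell, promoters are recognized by proteins, not by a string search, and the example sequence and the name `find_motif` are made up for illustration.

```python
def find_motif(sequence: str, motif: str = "TATA") -> list:
    """Return every 0-based position where the motif starts (overlaps allowed)."""
    sequence, motif = sequence.upper(), motif.upper()
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        # Step forward by one base so overlapping matches are also found.
        start = sequence.find(motif, start + 1)
    return hits


example = "GGCCTATATATAGGCATG"  # invented sequence, not a real gene
print(find_motif(example))  # [4, 6, 8]
```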
This open area is within the RNA polymerase, and is called the transcription bubble.
The RNA polymerase follows the template strand and uses it to assemble an mRNA molecule that is complementary to the template strand, so its sequence ends up matching the coding strand.
Now, mRNA is slightly different from DNA.
First off, it uses a slightly different set of nucleotides, where the T is replaced by uracil, or U.
The U will normally pair with A, as T would.
Also, mRNA runs in the opposite direction compared to the template strand - so from 5’ end to 3’ end.
So, when reading the template strand, RNA polymerase will move along from the 3’ end of the template strand towards the 5’ end, while building the mRNA molecule in the opposite direction - from its 5’ end to its 3’ end.
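Putting those pieces together, transcription of the example strand can be sketched in a few lines. This is a simplified illustration, assuming the template is written 3’ to 5’ as in the example earlier; the names `DNA_TO_RNA` and `transcribe` are made up for this sketch, and it ignores everything the real enzyme does besides base pairing.

```python
# Each template DNA base and the RNA base that pairs with it.
# Note that A on the template pairs with U (not T) in the mRNA.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5' to 3') by reading the template from its 3' end."""
    mrna = []
    for base in template_3_to_5.upper():  # moving 3' -> 5' along the template
        mrna.append(DNA_TO_RNA[base])     # new RNA base joins the growing 3' end
    return "".join(mrna)


template = "TTAGGTCAT"  # template strand from the example, written 3' to 5'
print(transcribe(template))  # AAUCCAGUA
```

Notice that the result is the coding strand, A A T C C A G T A, with every T swapped for a U.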