Human Genome Project (HGP)
The international Human Genome Project was launched in 1990. It was a mega project that took 13 years to complete. The human genome is about 25 times larger than the genome of any organism sequenced up to that time and was the first vertebrate genome to be completed. The human genome is estimated to contain approximately 3 × 10⁹ bp (base pairs). HGP was closely associated with the rapid development of a new area of biology called bioinformatics.
Goals and methodologies of Human Genome Project
The main goals of the Human Genome Project are as follows:
- Identify all the genes (approximately 30000) in human DNA.
- Determine the sequence of the three billion chemical base pairs that make up human DNA.
- Store this information in databases.
- Improve tools for data analysis.
- Transfer related technologies to other sectors, such as industries.
- Address the ethical, legal and social issues (ELSI) that may arise from the project.
The methodologies of the Human Genome Project involved two major approaches. One approach focused on identifying all the genes that are expressed as RNA, referred to as Expressed Sequence Tags (ESTs). The other approach, sequence annotation, involved sequencing the whole genome, including all coding and non-coding sequences, and later assigning functions to the different regions of the sequence.
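As a rough illustration of what assigning a function to a region can mean, the sketch below scans a DNA string for open reading frames (a start codon ATG followed in the same frame by a stop codon). This is only a toy: real annotation pipelines are far more elaborate, and the function name `find_orfs`, the `min_codons` parameter and the example sequence are assumptions made for this illustration, not part of the HGP methodology.

```python
# Toy annotation step: locate open reading frames (ORFs) on the forward strand.
# All names and the example sequence here are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=2):
    """Return sorted (start, end) index pairs of ORFs.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    the codons from ATG up to, but not including, the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):  # three forward reading frames
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this stretch
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


example = "CCATGGCTTGAAA"
for start, end in find_orfs(example):
    print(start, end, example[start:end])
```

In the example sequence the only ORF found is ATG GCT TGA, occupying positions 2 to 10 (0-based).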
For sequencing, the total DNA from a cell is isolated and converted into random fragments of relatively small size, which are cloned in suitable hosts using specialized vectors. Cloning amplifies each DNA fragment so that it can subsequently be sequenced with ease.
Bacteria and yeast are the two commonly used hosts, and the corresponding vectors are called BACs (Bacterial Artificial Chromosomes) and YACs (Yeast Artificial Chromosomes). The fragments are sequenced using automated DNA sequencers that work on the principle of a method developed by Frederick Sanger.
The sequences are then arranged based on some overlapping regions present in them, using specialized computer-based programs. These sequences are subsequently annotated and assigned to each chromosome. The genetic and physical maps of the genome are constructed using information on polymorphism of restriction endonuclease recognition sites and some repetitive DNA sequences called microsatellites.
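The two kinds of map markers mentioned above can be illustrated with a short sketch. It finds the positions of a restriction endonuclease recognition site (EcoRI's GAATTC is used as the default) and locates simple tandem repeats of a short unit, which is what a microsatellite is. The function names, parameters and test sequences are hypothetical choices for this illustration, not tools used by the project.

```python
import re


def restriction_sites(sequence, site="GAATTC"):
    """Return 0-based start positions of a recognition site.

    The default site is the EcoRI recognition sequence GAATTC.
    """
    sequence = sequence.upper()
    positions = []
    index = sequence.find(site)
    while index != -1:
        positions.append(index)
        index = sequence.find(site, index + 1)
    return positions


def microsatellites(sequence, unit_length=2, min_repeats=4):
    """Return (start, unit, repeat_count) for tandem repeats of a short unit."""
    # A unit of `unit_length` bases followed by at least min_repeats - 1 copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_length, min_repeats - 1))
    found = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        found.append((match.start(), unit, len(match.group(0)) // unit_length))
    return found


# Two hypothetical alleles: the second has lost one recognition site
# through a single base change, which is the basis of a site polymorphism.
allele_a = "TTGAATTCAAGAATTC"
allele_b = "TTGAGTTCAAGAATTC"
print(restriction_sites(allele_a), restriction_sites(allele_b))
print(microsatellites("GGCACACACACATT"))
```

Because the two alleles would be cut at different positions, they yield fragments of different lengths, and such differences are the kind of information that can serve as a landmark on a map.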
A more recent method for sequencing even longer fragments is shotgun sequencing, which relies on supercomputers and has replaced the traditional sequencing methods.
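The idea of arranging fragments by their overlapping regions can be sketched with a small greedy assembler. This is a minimal illustration under simplifying assumptions (error-free fragments, a single strand, no repeated regions), not the software used in the project. The names `overlap_length` and `greedy_assemble` and the example fragments are hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_overlap=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical overlapping fragments of the sequence ATGCGTACGTTAGC.
fragments = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(fragments))
```

The greedy strategy is chosen here only because it is short and easy to follow. Repetitive sequences, which make up a large part of the human genome, are exactly the case in which such a simple approach can join fragments incorrectly.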
Salient features of Human Genome Project:
- The human genome contains 3 billion nucleotide bases.
- An average gene consists of 3000 bases, the largest known human gene being dystrophin with 2.4 million bases.
- Genes are distributed over 24 chromosomes. Chromosome 19 has the highest gene density. Chromosome 13 and the Y chromosome have the lowest gene densities.
- The chromosomal organization of human genes shows diversity.
- There may be 35000-40000 genes in the genome, and almost 99.9 percent of the nucleotide bases are exactly the same in all people.
- Functions for over 50 percent of the discovered genes are unknown.
- Less than 2 percent of the genome codes for proteins.
- Repeated sequences make up a very large portion of the human genome. Repetitive sequences have no direct coding functions, but they shed light on chromosome structure, dynamics and evolution (genetic diversity).
- Chromosome 1 has 2968 genes, whereas the Y chromosome has 231 genes.
- Scientists have identified about 1.4 million locations where single-base DNA differences (SNPs, single nucleotide polymorphisms, pronounced ‘snips’) occur in humans.
- Identification of SNPs is helpful in finding chromosomal locations for disease-associated sequences and in tracing human history.
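To make the figures on shared bases and SNPs more concrete, the sketch below does two things. It compares two already aligned sequences of equal length and reports each single-base difference, and it works out how many differing positions a 0.1 percent difference implies in a genome of about 3 billion bases. The function names and the short example sequences are hypothetical and serve only as an illustration.

```python
def find_snps(reference, sample):
    """Return (position, reference_base, sample_base) for each single-base difference.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, alt_base)
        for position, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


GENOME_SIZE_BP = 3_000_000_000  # about 3 billion bases, as stated in the list above


def expected_differences(genome_size=GENOME_SIZE_BP, differing_per_thousand=1):
    """Bases expected to differ if 99.9 percent (999 per 1000) are shared."""
    return genome_size * differing_per_thousand // 1000


print(find_snps("ATGCCGTA", "ATGTCGAA"))
print(expected_differences())
```

With these inputs, the arithmetic gives about 3 million differing positions between two people, which is the same order of magnitude as the 1.4 million SNP locations reported in the list.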
Applications and future challenges
The mapping of human chromosomes makes it possible to examine a person’s DNA and to identify genetic abnormalities. This is extremely useful in diagnosing diseases and in providing genetic counselling to those planning to have children.
This kind of information would also create possibilities for new gene therapies. Besides providing clues to understanding human biology, learning about the DNA sequences of non-human organisms can lead to an understanding of their natural capabilities, which can be applied towards solving challenges in healthcare, agriculture, energy production and environmental remediation.
An important advantage will be a new era of molecular medicine, characterized by looking into the most fundamental causes of disease rather than treating the symptoms.
The project also raises several concerns:
- Once genetic sequences become easier to determine, some people may attempt to use this information for profit or for political power.
- Insurance companies may refuse to insure people at ‘genetic risk’ and this would save the companies the expense of future medical bills incurred by ‘less than perfect’ people.
- Another fear is that attempts may be made to “breed out” certain genes from the human population in order to create a ‘perfect race’.