Scientists unveil more diverse, accurate draft of human genome

Scientists have unveiled the first draft of a new accounting of the human genome, a more diverse and accurate DNA blueprint for our species than its predecessor. They hope it will help shed light on a range of issues, most importantly the genetic underpinnings of diseases and new ways to treat them.

This "pangenome" achievement was announced two decades after the first sequencing of the human genome, a feat that transformed biomedical research by giving scientists a reference map to analyze DNA for clues about disease-related mutations.

The new genome rundown may help clarify the contribution of genetic variation to health and disease, improve genetic testing, and guide drug discovery.

It could be of particular value in understanding neurodevelopmental disorders such as schizophrenia, autism, macrocephaly, and microcephaly, as well as drug metabolism.

The work, led by the international Human Pangenome Reference Consortium of scientists funded by the United States government's National Human Genome Research Institute (NHGRI), essentially was a reboot of the prior effort and solved a key deficiency – a failure to represent the genetic variations present among the world's 8 billion people.

The previous work had significant gaps and was based largely on a single person's DNA. The new work is a collection of nearly perfect genome assemblies for 47 people of diverse ancestries and an alignment of those individual genomes to show which parts match and which differ. Calling this a first draft, the researchers intend to increase the number of people reflected in the data to 350 by mid-2024.

"A pangenome is not just one reference genome, but a whole collection of diverse genomes. By comparing those genomes we can then build a map of not just one individual, but a whole population of variation," said University of California, Santa Cruz genomicist Benedict Paten, co-leader of the consortium and senior author of the main research paper published in the journal Nature.

The collection includes genomes of people of African, East Asian, South Asian, European, North American, South American, and Caribbean ancestry, though none yet from Oceania.

"Bottom line – what we're doing is retooling genomics to create a diverse, inclusive representation of human variation as the fundamental reference structure, and so mitigating bias. This is important if we want our research to benefit everyone equally," Paten said.

A genome is an organism's genetic blueprint, in this case a human's, and contains the information needed for development and growth. However, each person's genome differs slightly, by about 0.4% on average, from other people's.

These genetic differences can shed light on a person's health and help doctors diagnose disease, craft treatments and forecast medical outcomes.

"By building very high quality, almost complete references we're getting a better picture of how some of the most complex regions of the genome vary. Until now, the composition of these fast-evolving regions has been largely invisible to us," Paten said.

Researchers in 2003 unveiled what was billed as the complete sequence of the human genome, though about 8% of it had not been fully deciphered. That reference genome was a mosaic drawn from about 20 people, including 70% from one individual of mixed European and African ancestry. The first complete human genome, based on a single European individual, was published last year after scientists filled in the gaps.

"Human ancestry is incredibly complex, and we're all related to each other through our common history," said Ira Hall, director of the Yale Center for Genomic Health and one of the research leaders.

"And so by sampling broadly across the genetic tree of humanity, it benefits everybody. Even if some specific group isn't explicitly included, it still is representing our common origins and provides common benefits."

The cost of supporting the consortium will be about $40 million over five years, NHGRI said, less than the multibillion-dollar expenditure for the 2003 genome project thanks to technological advances.