First Arab human pangenome reveals millions of novel variants using long-read sequencing
Study powered by PacBio HiFi sequencing offers new insight into underrepresented populations
Scientists have assembled the first Arab human pangenome, revealing more than 100 million base pairs of DNA absent from existing reference genomes and millions of genetic variants unique to Arab individuals. The study, published in Nature Communications, was led by Mohammed Bin Rashid University of Medicine and Health Sciences (MBRU) and Dubai Health, and relied heavily on long-read sequencing technologies, including PacBio’s HiFi platform.
The team sequenced 53 individuals from eight Arab countries, producing diploid, haplotype-resolved genomes with exceptional resolution. The findings highlight the importance of population-specific reference genomes for advancing precision medicine.
Although Arab individuals make up nearly 6% of the global population, they are significantly underrepresented in current genomic reference datasets. Existing references such as GRCh38 and the T2T-CHM13 assembly miss large portions of the sequence variation found in these populations.
Key findings from the study include:
More than 111 million base pairs of DNA absent from current human reference genomes
8.94 million small variants and 235,000 structural variants, many undetectable with short-read methods
883 duplicated genes, including one (TAF11L5) duplicated in every participant, with potential links to recessive disease
More than 1,400 base pairs of novel mitochondrial DNA, offering improved resolution for maternal lineage tracking
“MBRU’s work demonstrates why population-specific pangenomes matter,” said Christian Henry, chief executive officer at PacBio. “This study will have lasting impact for research and precision medicine in a historically underrepresented population.”
“By creating a high-resolution Arab pangenome, we’re giving scientists and clinicians a tool to make precision medicine more equitable,” said Dr Mohammed Uddin, senior author of the study and associate professor of human genetics at MBRU. “This achievement depended on long-read technologies, notably PacBio’s high-quality HiFi sequencing, which allowed us to capture the full complexity of the genome, including structural variation and previously hidden sequences.”
The new Arab pangenome, named the UAE Pangenome Reference (UPR), improves genome mapping accuracy and variant detection in Arab individuals. Early analyses suggest it can enhance diagnosis of rare conditions and provide more reliable data for population-specific disease research.
The UPR is publicly available and will support ongoing efforts to better understand disease predisposition, inheritance patterns, and the prevalence of recessive conditions in Arab communities.
This initiative follows other pangenome efforts involving long-read sequencing, including the Human Pangenome Reference Consortium and the Chinese Pangenome Consortium.




