A research group from Xi'an Jiaotong University (XJTU) has published its latest findings on a high-quality genome assembly of the flowering plant Arabidopsis thaliana in Genomics, Proteomics & Bioinformatics (GPB), a prestigious bi-monthly English-language scientific journal.
The findings were announced by the Beijing Institute of Genomics (China National Center for Bioinformation) of the Chinese Academy of Sciences (CAS) and the Genetics Society of China on Sept 3.
The study of genomics is an important component of the Human Genome Project (HGP), and its major research targets include microorganisms, plants and animals.
Arabidopsis thaliana was selected as a reference plant for genome sequence research because of its comparative strengths, such as its small size, prolific reproduction and short life cycle.
Today, over half of botanists and nearly 10,000 laboratories around the world have carried out research on it in the areas of genetic analysis, gene cloning and functional genomics, making huge contributions to increasing crop yields, improving crops' ability to cope with natural disasters, and protecting plants.
Genetic analysis relies mainly on the reference genome of a species. In practice, however, only a small number of plant and animal species have a reference genome that is 100 percent complete.
Existing studies on the reference genome of Arabidopsis thaliana have not yet resolved all of its sequences, especially the centromeric sequence, a key element in the process of cell division, and the telomeric sequence, which is associated with ageing. The lack of these two sequences, which contain highly repetitive fragments, has created tremendous difficulties for genome assembly and hindered research on the sequences and functions of these regions.
Together with his teammates, Ye Kai, an expert in information and biomedicine, proposed a new strategy of bacterial artificial chromosome-oriented sequence replacement and produced a high-quality genome assembly of Arabidopsis thaliana, named "Col-XJTU", by applying artificial intelligence computing and biomedicine-related data and adopting a pooled sequencing strategy.
"European countries and the US have had the dominant right in the formulation of international standards concerning the research of the arabidopsis thaliana genome over the past two decades. However, this time we have successfully drawn a map of the Col-XJTU genome, the precision of which is as high as 99.99 percent. It has become the highest international standard," Ye said.
Wang Bo, an assistant professor from the Department of Automation at XJTU and first author of the publication, said the research group has completed the seamless assembly of the centromeres on chromosomes 3, 4 and 5 and the assembly of the majority of the centromeres on chromosomes 1 and 2.
The precision of the base and structure of a genome is a major criterion to assess its quality.
Statistics show that the precision of the base and structure of the Col-XJTU genome is higher than that of TAIR10.1, the conventional international reference genome first developed by US scientists.
In addition, the quality of the Col-XJTU genome assembly rivals that achieved in similar research by a team of scientists from the University of Cambridge and Johns Hopkins University.
Ye noted that distinguishing repetitive genome sequences from one another remains a great challenge in the complete assembly of genomes, and that this is why the team has so far been unable to resolve the two remaining open gaps. He added that they will address these challenges in future studies.
Together with Ye, Yang Xiaofei, an associate professor from the Department of Computer Science and Technology at XJTU, was a corresponding author of the publication.