Chinese scientists update soybean genome to a golden reference

September 12, 2019

Soybean is one of the most important crops worldwide. A high-quality reference genome facilitates functional analysis and molecular breeding. Previously, biologists from China (Chinese Academy of Sciences, University of Science and Technology of China, Jiangsu Academy of Agricultural Sciences, Berry Genomics Corporation) assembled de novo a high-quality Chinese soybean genome, Gmax_ZH13 (Shen et al., 2018). However, due to technical limitations, a large number of small contigs could not be anchored onto chromosomes.

Recently, the lead research group of the Gmax_ZH13 genome project, from the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, updated the Gmax_ZH13 genome to a golden reference genome, Gmax_ZH13_v2.0.

Building on Gmax_ZH13, the researchers added more sequence data and used a refreshed assembly pipeline (Figure 1A) to assemble Gmax_ZH13_v2.0, which has a length of 1,011,174,350 bp. Assembly quality improved dramatically. Compared with Gmax_ZH13, the contig N50 of Gmax_ZH13_v2.0 increased 6.5-fold (from 3.46 Mb to 22.6 Mb), the number of gaps decreased 1.8-fold (from 815 to 448), and the total gap length decreased 8.8-fold (from 20.49 Mb to 2.33 Mb). Meanwhile, the number of un-anchored contigs decreased 17-fold (from 549 to 36), so that 98% of the sequence is now anchored to the 20 chromosomes. Together, these assembly parameters indicate the high completeness of Gmax_ZH13_v2.0. In addition to the nuclear chromosomes, the researchers assembled the circular chloroplast and mitochondrial genomes, with lengths of 152,220 bp and 513,779 bp, respectively.
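For readers unfamiliar with the contig N50 statistic, the following is a minimal Python sketch, assuming the commonly used definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function names and the toy contig lengths are illustrative assumptions and do not come from the Gmax_ZH13_v2.0 pipeline; only the before-and-after values passed to the fold-change helper are taken from the figures quoted above.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Assumes the common definition: walk the contigs from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


def fold_change(before, after):
    """Return the ratio of the larger value to the smaller one."""
    return max(before, after) / min(before, after)


# Toy contig lengths (hypothetical, for illustration only).
toy_contigs = [2, 3, 4, 5, 6]
print("toy N50:", n50(toy_contigs))

# Before/after values quoted in the text above.
print("contig N50 fold increase:", round(fold_change(3.46, 22.6), 1))
print("gap number fold decrease:", round(fold_change(815, 448), 1))
print("gap length fold decrease:", round(fold_change(20.49, 2.33), 1))
```

A larger contig N50 means that half of the assembled sequence sits in longer uninterrupted contigs, which is why it is reported alongside gap counts as a measure of continuity.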

To improve the accuracy of gene annotation, in addition to the Iso-seq reads used for the Gmax_ZH13 annotation, the researchers performed RNA-seq and smRNA-seq on another 27 ZH13 samples collected from different tissues at different developmental stages. In total, they annotated 55,443 protein-coding genes comprising 96,366 mRNAs in the nuclear genome, 81 protein-coding genes in the chloroplast genome, and 49 protein-coding genes in the mitochondrial genome. Of the 1,440 single-copy Embryophyta genes in BUSCO_v3, 97% were completely assembled, supporting the high quality of the protein-coding gene annotation. Non-coding genes were also annotated, including 297 rRNA, 1,112 tRNA, 166 snRNA, 1,816 snoRNA, and 35,926 TE. In particular, 331 MIRNA genes and the mature miRNAs they produce were annotated using the smRNA-seq data (Figure 1B).

The researchers also provided detailed expression profiles for all the protein-coding genes and miRNAs they annotated (Figure 1C). These expression data will be helpful for fundamental soybean research, for instance, for looking up the expression pattern of an individual gene or selecting genes with tissue-specific expression. Moreover, because the protein-coding gene and miRNA data come from the same sample sets, they can be used to investigate the relationship between miRNAs and their target genes.

"We updated the Gmax_ZH13 genome to a more complete and continuous platinum reference genome Gmax_ZH13_v2.0, did comprehensive annotation and provided detailed expression information for it," said Professor Zhixi Tian, the leader of the Gmax_ZH13 Chinese soybean genome project. "We believe that the new genome will greatly facilitate soybean fundamental research and molecular breeding."

This work was supported by the National Key Research & Development Program of China (2017YFD0101305), the National Natural Science Foundation of China (31525018, 31788103), and the State Key Laboratory of Plant Cell and Chromosome Engineering (PCCE-KF-2019-05).

See the articles:

Shen, Y., Du, H., Liu, Y., Ni, L., Wang, Z., Liang, C., and Tian, Z. (2019). Update soybean Zhonghuang 13 genome to a golden reference. Sci China Life Sci 62, 1257-1260.

Shen, Y., Liu, J., Geng, H., Zhang, J., Liu, Y., Zhang, H., Xing, S., Du, J., Ma, S., and Tian, Z. (2018). De novo assembly of a Chinese soybean genome. Sci China Life Sci 61,

Science China Press
