Journal of Sichuan Agricultural University ›› 2021, Vol. 39 ›› Issue (3): 275-278. doi: 10.16036/j.issn.1000-2650.2021.03.001

Special topic: Rice Research

• Special Feature: Rice Research •

Pan-Genome Analysis of 33 Genetically Diverse Rice Accessions Reveals Hidden Genomic Variations

QIN Peng, CHEN Weilan, WANG Hao, LI Shigui*

  1. Rice Research Institute, Sichuan Agricultural University, Chengdu 611130, China
  • Received: 2021-06-22 Online: 2021-06-28 Published: 2021-07-05
  • Corresponding author: *LI Shigui, PhD, Professor, whose research focuses on dissecting the molecular mechanisms that control major agronomic traits in rice and on mining and applying elite alleles. E-mail: lishigui@sicau.edu.cn
  • About the author: QIN Peng, PhD, Professor, whose research focuses on the precise characterization of rice germplasm resources and on mining and using genes for high yield and good quality under high temperature. E-mail: qinpeng@sicau.edu.cn
  • Funding:
    National Key Research and Development Program of China (2016YFD0100400); Integrated Project of the Major Research Plan of the National Natural Science Foundation of China (92035301)

Abstract: 【Objective】 Structural variations (SVs) and gene copy number variations (gCNVs) are major sources of genetic variation in plants and animals. Identifying and analyzing SVs and gCNVs comprehensively and accurately is important for mining elite natural alleles and ensuring rice food security.【Method】 Using long-read sequencing data and the HERA genome assembly method, we generated high-quality genome assemblies for 31 genetically diverse cultivated rice accessions. Together with the high-quality Nipponbare and Shuhui 498 (R498) genomes, these assemblies were used for a systematic comparative genomic analysis.【Result】 We identified 171 072 non-redundant SVs and 25 549 gCNVs; 82.8% of the presence/absence variations (PAVs) had not been found among the PAVs previously obtained from short-read sequencing data. Using the African cultivated rice (O. glaberrima) accession CG14 as an outgroup, we inferred the SVs that arose within Asian cultivated rice (O. sativa), termed dSVs. Most dSVs overlapped non-coding regions, and ~50% (32 668) of the genes in the pan-genome had at least one dSV overlapping the gene region, including 2 kbp upstream and downstream of the coding region, across the 32 O. sativa accessions. Combined analysis with transcriptome data showed that SVs and gCNVs play important roles in regulating gene expression. Analysis of SV formation mechanisms showed that most of the assigned SVs were formed through transposable element insertion (TEI) and nonhomologous end joining (NHEJ) (43.9%), although the dominant mechanism differed among SV types. This study also constructed the first graph-based genome in rice. Combined with short-read resequencing data and leaf senescence phenotypes for 674 accessions, it showed that 17.5% of SVs were in very low linkage with nearby SNPs, and a genome-wide association study (GWAS) found a locus significantly associated with leaf senescence that could be detected only with SV data. These results were published in Cell in May 2021 under the title ‘Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations’.【Conclusion】 This study provides a high-quality, pan-genome-scale resource of genomic variation that will facilitate rice functional genomics and evolutionary biology research, the mining of elite natural alleles, and rice breeding.

Key words: rice, pan-genome, high-quality genome, structural variation, gene copy number variation, graph-based genome
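
The abstract reports that some SVs are in very low linkage with nearby SNPs, so SNP-only GWAS can miss SV-associated loci. The sketch below illustrates that idea only and is not the pipeline used in the study. It assumes genotypes are coded as allele dosages (0, 1 or 2), uses the squared Pearson correlation between genotype vectors as a simple proxy for linkage disequilibrium, and uses an arbitrary cutoff of 0.5. The function names, the cutoff, and the toy data are all hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A monomorphic marker carries no LD information; treat it as 0.
        return 0.0
    return cov * cov / (var_x * var_y)


def max_ld_with_snps(sv, snps):
    """Highest r^2 between one SV and any of its nearby SNPs."""
    return max((r_squared(sv, snp) for snp in snps), default=0.0)


def is_poorly_tagged(sv, snps, threshold=0.5):
    """True if no nearby SNP reaches the (assumed) r^2 threshold."""
    return max_ld_with_snps(sv, snps) < threshold


# Toy genotypes for six accessions (hypothetical data).
sv = [0, 0, 1, 1, 0, 1]
snp_a = [0, 0, 1, 1, 0, 1]  # identical pattern to the SV
snp_b = [0, 1, 0, 1, 0, 1]  # only weakly correlated with the SV

print(is_poorly_tagged(sv, [snp_b]))         # SV not tagged by snp_b alone
print(is_poorly_tagged(sv, [snp_a, snp_b]))  # SV tagged once snp_a is included
```

An SV flagged by `is_poorly_tagged` would need to be genotyped and tested directly, which is the situation the abstract describes for the leaf senescence locus.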

CLC number:

  • S511