Received: Jan 08, 2018
Accepted: Feb 02, 2018
Published Online: Feb 06, 2018
Journal: Journal of Plant Biology and Crop Research
Publisher: MedDocs Publishers LLC
Online edition: http://meddocsonline.org
Cite this article: Liu Z, Guo S, Xu J, Zhang Y, Dong L, et al. Genome size estimation of Chinese cultured artemisia annua L. J Plant Biol Crop Res. 2018; 1: 1002.
Almost all of the antimalarial drug artemisinin is extracted from the traditional Chinese medicinal plant Artemisia annua L. However, because genomic information is insufficient and the genetic background is unresolved, the regulatory mechanism of the artemisinin biosynthetic pathway remains unclear. The genome size of genuine A. annua plants is an especially important and fundamental parameter, which is helpful for further genomic studies of artemisinin biosynthesis and improvement. In the current study, A. annua samples were identified by DNA barcoding, and their genome sizes were estimated to be 1.38-1.49 Gb by flow cytometry (FCM), with Nipponbare as the benchmark calibration standard and soybean and maize as two internal standards, used individually and simultaneously. The genome size estimates of seven A. annua strains from five Chinese provinces (Shandong, Hunan, Chongqing, Sichuan, and Hainan), obtained with a low coefficient of variation (CV ≤ 2.96%), were relatively accurate and were 12.87% (220 Mb) smaller than a previous report on a foreign A. annua sample measured with a single control. These results facilitate the planning of the A. annua whole genome sequencing project, the optimization of assembly methods, and insight into its subsequent genetics and evolution.
Keywords: Artemisia annua L.; Flow cytometry; Different internal standards; Genome size
Abbreviations: FCM: flow cytometry; CV: coefficient of variation; HPCV: half peak coefficient of variation; NJ tree: neighbor joining tree; PI: propidium iodide; FSC: forward scatter; SSC: side scatter
Artemisia annua L., the annual herbaceous plant whose dried aerial part is known as Artemisiae annuae herba [1], characteristically synthesizes and accumulates the unique sesquiterpene endoperoxide lactone artemisinin, an antimalarial drug. Artemisinin-based combination therapies (ACTs) are recommended by the WHO as the best choice for the quick and reliable treatment of acute malaria [2-4]. Moreover, the isolation of artemisinin earned its discoverer, Professor Tu, the 2011 Lasker-DeBakey Clinical Medical Research Award and the 2015 Nobel Prize in Physiology or Medicine [5]. Artemisinin has also been reported to have other activities, such as anticancer [6,7], antiviral [8,9], and antischistosomal activities [10].
A. annua, still the main source of artemisinin, is a cosmopolitan species (found, for example, in Viet Nam and India), but it is most widespread in China, where it occurs in every province with an artemisinin content ranging from 0.1% to 1.5% of dried leaf weight, depending on the ecological environment and varietal differences [11]. However, A. annua strains containing less than 0.5% artemisinin cannot be used as raw material for artemisinin production, especially the strains from Northern China, Viet Nam and India (less than 0.1% artemisinin content) [11]. Increasing the artemisinin yield is therefore urgent, and numerous attempts over the last two decades have focused on genetic modification and bioengineering of artemisinin biosynthesis in plants [5,12,13]. Unfortunately, because genomic information and knowledge of the genetic background are insufficient, the regulatory mechanism of the artemisinin biosynthetic pathway has not yet been clarified. Artemisinin can also be semi-synthesized from artemisinic acid or dihydroartemisinic acid obtained from genetically modified yeast [14-17], but this route has not yet reached high commercial value. An accurate evaluation of the genome size of Chinese A. annua is therefore a fundamental prerequisite for subsequent genetic studies.
Moreover, it is well known that different lines of many plant species differ significantly in genetics and genome size [18-22], as shown by the rapid genome size changes in Arabidopsis (A. thaliana) [23-27], rice (Oryza sativa L.) [28-33], maize (Zea mays) [34] and soybean (Glycine max) [35]. The size of the Zea genome varies from 4.92 to 6.87 pg/2C in wild and cultivated maize, and this variation is related to geography and altitudinal gradients [34]. The Zea luxurians genome is ~1.5-fold larger than that of B73 [36], and the karyotypes of domesticated maize (Z. mays ssp. mays) differ markedly from those of two prominent wild subspecies (Z. mays ssp. parviglumis and Z. mays ssp. mexicana) [34,36,37]. Estimated soybean genome sizes also range from 889.33 Mbp to 1118.34 Mbp [38-42]. Flow cytometric data for different maize varieties were slightly inconsistent even with the same internal standard, while results for the same soybean and maize varieties differed when different internal references were used [33,43-48]. Therefore, it is necessary to analyze genome size variation in Chinese A. annua populations with different internal references.
Accordingly, seven wild A. annua strains identified by DNA barcoding were selected from five different areas of China (Shandong, Hunan, Chongqing, Sichuan, and Hainan). Their genome sizes were then estimated by flow cytometry (FCM) with Nipponbare, which has a relatively well-defined genome, as the benchmark calibration standard. Because the genome sizes of both soybean and maize are reasonably close to that of A. annua (soybean, 889.33-1118.34 Mbp [38-42]; maize, 2300-3360 Mbp [34,49]; A. annua, possibly 1710 Mbp [50]), they were adopted as two typical internal standards in FCM. An accurate genome size could facilitate the planning of the A. annua whole genome sequencing project and may give further insight into genomic studies aimed at improving artemisinin production.
Seven ITS2 and seven psbA-trnH sequences were obtained from the seven collected wild samples. Two neighbor joining (NJ) trees were constructed separately from the ITS2 and psbA-trnH sequences of the seven wild samples, A. annua, its closely related species, and counterfeits. All the ITS2 and psbA-trnH control sequences of A. annua and its adulterants were generated in our previous studies [51,52] (Supplementary Table S1).
Supplementary Table S1: Sample information for Artemisia annua and counterfeits.
The sequence length, GC content, and K2P genetic distance of the ITS2 and psbA-trnH regions of the samples, A. annua, and its adulterants were analyzed and summarized (Supplementary Table S2). The ITS2 sequences of the sample strains gathered from the five provinces were 225 bp long, while the psbA-trnH sequences were 353 bp long. No variable site was found in either the ITS2 (Figure 1b) or the psbA-trnH sequences of A. annua from the five provinces. The average GC content was 56.40% for the ITS2 sequences of A. annua and 25.20% for the psbA-trnH sequences. Based on the ITS2 and psbA-trnH sequences, the intraspecific divergence of A. annua calculated with the K2P model was zero, far lower than the minimum interspecific distance between A. annua and 23 other closely related species and counterfeits (Supplementary Table S2).
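As a rough illustration of the sequence statistics reported above, the following Python sketch computes GC content and the Kimura two-parameter (K2P) distance for a pair of aligned sequences. The function names and the toy sequences are hypothetical and are not taken from the study. The K2P expression is the standard one, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}


def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    if transitions == 0 and transversions == 0:
        return 0.0  # identical sequences
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy sequences (hypothetical, for illustration only).
identical = k2p_distance("ACGTACGTAC", "ACGTACGTAC")
one_transition = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
toy_gc = gc_content("GGCCAT")
```

Two identical sequences give a distance of zero, which corresponds to the zero intraspecific divergence described for the A. annua samples.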
Figure 1: Analysis of A. annua ITS2 sequences. a) NJ phylogenetic tree based on ITS2 sequences of A. annua and counterfeits. b) Multiple alignment of the ITS2 sequences of the samples and A. annua.
Supplementary Table S2: Sequence characters and K2P genetic distances of samples, A. annua and counterfeits.
All our ITS2 sequences matched those of A. annua (Figure 1b) and were separated from other closely related species in the neighbor joining (NJ) tree (Figure 1a). In agreement with our previous studies [51,52], A. annua, its closely related species, and counterfeits could be distinguished from each other on the basis of the ITS2 sequences (but not psbA-trnH). Thus, all strains were identified as genuine A. annua, with no variation in their ITS2 and psbA-trnH sequences.
In our study, the peaks of Nipponbare, soybean, maize, and the seven A. annua strains were first determined separately. No peaks overlapped, which ensured that the peaks of the A. annua strains were well separated from those of the internal standards. Using the high-quality sequenced Nipponbare [53] as the benchmark calibration standard, mixtures of Nipponbare with soybean and of Nipponbare with maize were each measured three times. According to the formula: sample 2C DNA content = (sample peak mean / standard peak mean) × standard 2C DNA content, the genome sizes of soybean and maize were measured as 0.92 ± 0.00 Gb and 2.17 ± 0.02 Gb, respectively. Then, each A. annua strain was mixed with the internal standards (A. annua with soybean, A. annua with maize, and A. annua with both soybean and maize) and measured in two parallel runs with three technical replicates each. The flow cytometric results of the seven A. annua strains collected from five provinces are summarized in Figure 2 and Table 1. The estimated A. annua genome sizes ranged from 1.31 Gb to 1.54 Gb, with Nipponbare as the primary control and soybean and maize as controls individually and simultaneously.
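The internal-standard formula above can be sketched in a few lines of Python. The function name and the peak means in the example are hypothetical (arbitrary fluorescence units, not measured values); only the soybean size of 0.92 Gb comes from the text.

```python
def estimate_dna_content(sample_peak_mean: float,
                         standard_peak_mean: float,
                         standard_dna_content: float) -> float:
    """Sample DNA content = (sample peak mean / standard peak mean) * standard DNA content.

    The result has the same unit as standard_dna_content (pg, Mb or Gb).
    """
    if standard_peak_mean <= 0:
        raise ValueError("standard peak mean must be positive")
    return sample_peak_mean / standard_peak_mean * standard_dna_content


# Hypothetical peak means; the soybean size is the value reported in the text.
soybean_size_gb = 0.92
estimate_gb = estimate_dna_content(150_000, 96_000, soybean_size_gb)
print(f"Estimated sample genome size: {estimate_gb:.2f} Gb")
```

Because the calculation is a simple ratio, any error in the assumed size of the standard propagates proportionally into the estimate, which is why the standards were first calibrated against Nipponbare.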
Analysis of the flow cytometric data (two parallel runs per strain, each measured three times, with three different controls used individually and simultaneously) showed that the maximum coefficient of variation (CV) was 2.96%, in the data group of the HK strain with Z. mays as the control. With the same control, the genome sizes of the seven A. annua strains differed by 45 Mb (Nipponbare), 88 Mb (G. max) and 41 Mb (Z. mays), respectively. The CV of the seven strains ranged from 0.29% to 2.94% with Nipponbare as the control, from 0.67% to 2.66% with G. max, and from 0.86% to 2.96% with Z. mays. This suggested that the flow cytometer (BD Accuri™ C6) was quite stable.
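For readers who want to reproduce this kind of summary, a minimal Python sketch of the coefficient of variation across replicate estimates is given below. It assumes the CV is the sample standard deviation divided by the mean, expressed as a percentage; the text does not state whether the sample or population standard deviation was used, and the replicate values are hypothetical.

```python
import statistics


def coefficient_of_variation(values: list[float]) -> float:
    """Return the CV in percent: sample standard deviation / mean * 100."""
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean of the replicates is zero")
    return statistics.stdev(values) / mean * 100.0


# Hypothetical replicate genome size estimates in Gb (not data from the study).
replicates_gb = [1.42, 1.44, 1.46]
cv_percent = coefficient_of_variation(replicates_gb)
print(f"CV = {cv_percent:.2f}%")
```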
However, the G. max and Z. mays genome sizes assessed by FCM with Nipponbare as the primary control were slightly larger than those in previous reports [49,54]. Differences in variety and growth conditions may be the main reasons [55]. In these high-quality data with low CV (≤ 2.96%), the estimates for the seven strains differed little whether Nipponbare, G. max or Z. mays was used as the control. This also illustrated that different lines within A. annua had little influence on genome size.
For the same A. annua strain, the variation in genome size across the three different controls ranged from 38 Mb (the HK strain) to 96 Mb (the SC strain). The differences for the other five strains were similar (57 Mb for the LQ strain, 55 Mb for the YJ strain, 64 Mb for the YY-1 strain, 67 Mb for the YY-2 strain, and 67 Mb for the JX strain). These flow cytometric results indicated that estimates for the same A. annua strain differed little between standards.
The largest genome size, that of the YY-2 strain, was 1.49 ± 0.07 Gb, whereas the smallest, that of the YY-1 strain, was 1.38 ± 0.06 Gb, a margin of 110 Mb (Table 1). The flow cytometric estimates of the A. annua strains differed from their mean value (1.44 Gb) by 73 Mb, showing that genome size variation within A. annua was slight.
Originally, the genome size of A. annua was estimated as 4.1 pg by Nagl and Ehrendorfer (1974) [56] and as 3.8 pg by Geber and Hasibeder (1980) [57], using microdensitometry based on Feulgen staining. Using Pisum sativum cv. Express long (2C = 8.37 pg) [58] as the only internal standard, Torrell and Valles assessed the DNA content per haploid genome of A. annua as 1.75 pg (1.71 Gb) by flow cytometry (FCM) [50]. In that study, the A. annua samples had been deposited in the Herbarium of the Laboratory of Botany, Faculty of Pharmacy, University of Barcelona (BCF) since 1997 and had not been identified by molecular methods. The half peak coefficient of variation (HPCV) of A. annua reached 3.01, and its DNA amount was very different from that of other taxa.
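The correspondence between 1.75 pg and 1.71 Gb quoted above can be checked with a short Python sketch. It assumes the widely used conversion factor of 1 pg DNA ≈ 978 Mbp; this factor is not stated in the text, and the function names are illustrative.

```python
# Assumed conversion factor (not stated in the text): 1 pg of DNA ~ 978 Mbp.
MBP_PER_PG = 978.0


def pg_to_mbp(picograms: float) -> float:
    """Convert a DNA amount in picograms to megabase pairs."""
    return picograms * MBP_PER_PG


def mbp_to_pg(megabase_pairs: float) -> float:
    """Convert a genome size in megabase pairs to picograms."""
    return megabase_pairs / MBP_PER_PG


# Haploid DNA content of A. annua reported by Torrell and Valles.
haploid_mbp = pg_to_mbp(1.75)
print(f"1.75 pg is about {haploid_mbp / 1000:.2f} Gb")
```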
In this paper, seven A. annua strains collected from five provinces were identified on the basis of their ITS2 sequences. Their flow cytometric analysis, with a low CV (≤ 2.96%), indicated that their genome sizes range from 1.38 Gb to 1.49 Gb, using Nipponbare as the benchmark calibration standard and G. max and Z. mays as two internal standards individually and simultaneously. However, the largest size (1.49 Gb, the YY-2 strain) was 12.87% (220 Mb) smaller than the estimate (1.71 Gb) obtained for A. annua in Spain by Torrell and Valles with Pisum sativum as the internal standard [50]. This discrepancy may stem from the internal standard: although no significant genome size variation has been found in Pisum sativum [59,60], it was used without being calibrated against a benchmark standard, which makes it less suitable as an internal standard. A benchmark calibration standard is therefore required in FCM.
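The 220 Mb and 12.87% figures follow directly from the two estimates, as the following Python sketch shows. The variable and function names are illustrative; the two genome sizes are the values quoted in the text.

```python
def relative_difference_percent(reference: float, value: float) -> float:
    """Return how much smaller value is than reference, in percent of reference."""
    return (reference - value) / reference * 100.0


previous_estimate_gb = 1.71  # Torrell and Valles [50]
largest_estimate_gb = 1.49   # YY-2 strain, this study

difference_mb = (previous_estimate_gb - largest_estimate_gb) * 1000.0
difference_percent = relative_difference_percent(previous_estimate_gb,
                                                 largest_estimate_gb)
print(f"{difference_mb:.0f} Mb, {difference_percent:.2f}%")
```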
Within the same area, the genome size of the YY-1 strain was about 110 Mb smaller than that of YY-2, while that of JX was 54 Mb smaller than that of HK. Both YY-1 and YY-2 differed little from the strains from other areas. Moreover, the genome sizes of all seven strains differed from their mean value (1.44 Gb) by merely 73 Mb, showing no significant discrepancy. This indicated minor genome size variation within A. annua, which may result predominantly from transposable element accumulation, expansion or contraction of tandem repeats, variation in intron length, and so on [61].
In addition, we attempted a genome-wide survey with low-depth (<30×) high-throughput sequencing data. The estimate was slightly larger than the flow cytometric data (unpublished data), and the A. annua genome was rich in highly repetitive sequences. This is comparable to the case of Dendrobium officinale (a traditional Chinese orchid herb), whose genome size is about 1.27 Gb based on flow cytometric data and 1.35 Gb as assembled by combining second-generation and third-generation PacBio sequencing technologies [62]. It also accords with the Eriobotrya Lindl. ‘Jiefangzhong’ genome (654.40 Mb estimated by FCM and 773.00 Mb by 17-mer spectrum) [63]. These inflated genome sizes are attributed to high repeat content and heterozygosity [36]. However, flow cytometric data with Caenorhabditis elegans (~100 Mb) and Drosophila melanogaster (~175 Mb) as internal standards showed that the genome of A. thaliana, the first sequenced plant (157 Mb), was 25% larger than the initial estimate of 125 Mb, which was partly attributed to sequences missing from the centromeric and ribosomal DNA regions [64]. Although such discrepancies between the two kinds of data exist for many sequenced plant genomes, both methods determine plant genome size to the same order of magnitude, and the flow cytometric results were quite credible.
Hence, the unpublished k-mer analysis results supported the conclusion that the A. annua genome is close to 1.38-1.49 Gb and is a complex genome with high repeat content.
At present, Feulgen spectrophotometry and FCM are the commonly used methods for estimating genome size. FCM is a powerful method for the qualitative and quantitative analysis of animal, plant and microbial single cells and other microscopic particles in liquid suspension [65,66]. As a traditional standard technology for estimating genome size, FCM can determine nuclear DNA content precisely [67,68]. Bennett et al. called for a precise angiosperm C-value to serve as a benchmark calibration standard for plant genome size estimation [64]. So far, Arabidopsis (A. thaliana) [69] and rice (O. sativa L.) [28-32] are two plants with high-quality genome sequences. The rice genome sequence is considered a “gold standard” in plants. Moreover, the Nipponbare RefSeq has the best quality information among known crop genome sequences [28-32]. It is important for sequence comparison in herbaceous plants, and Nipponbare can therefore be used as a benchmark calibration standard. In addition, the genomes of soybean (G. max) and maize (Z. mays), whose sizes are both close to that of A. annua L., have been estimated by FCM [43-46,48,70,71] and also sequenced [38,49]. Therefore, soybean and maize recalibrated against Nipponbare are suitable for the accurate determination of the A. annua genome size.
FCM is simple and convenient, which has broadened its range of applications. It offers high throughput per batch, but it depends on the internal reference and is easily affected by endogenous DNA and secondary metabolites. In general, the high content of cytosolic compounds in medicinal plant leaves, including proteins and secondary metabolites, is liable to bias nuclear DNA content estimates by FCM, and this cannot be completely overcome [72,73]. Recent research has mainly focused on the most appropriate buffers and procedures for sample preparation. Otto I buffer allows nuclei to be pelleted and, to a certain degree, removes the cytosolic compounds of medicinal plant leaves and nuclear debris [74]. Our flow cytometric data show that the Otto buffer is also applicable to A. annua samples.
Estimates for the same species obtained with different standards are sometimes surprisingly divergent, but this was not the case for our A. annua flow cytometric results. Comparing genome size estimates of various species obtained by FCM and by genome survey, we found that some sequenced model plants, despite their stable DNA content and ease of preparation, were inadequate as internal standards. Where flow cytometric estimates agreed with genome assembly results, the genome size ratios between Moso bamboo (P. pubescens) and its internal standard soybean (G. max) [75-77], and between barley (Hordeum vulgare) and its standard P. sativum [43,78], were 1.70-fold and 1.49-fold, respectively. This indicated that the optimum ratio between the genome of the unknown sample and that of the internal standard should be approximately 0.5- or 2-fold [79]. However, in our study (Table 2), with Nipponbare as the benchmark calibration standard for G. max and Z. mays, the A. annua genome size estimates obtained with Nipponbare (3.76-fold discrepancy), G. max (1.56-fold discrepancy) and Z. mays (0.66-fold discrepancy) as controls were all suitable.
All flow cytometric data for A. annua (CV ≤ 2.96%) were essentially stable; however, the estimates for different strains with the three different controls were not proportional to one another. This suggested that differing contents of cellular inclusions among strains may influence the intensity of fluorescent staining in A. annua plants. It may serve as a valuable reference for evaluating the different compounds in A. annua plants.
With the high-quality sequenced Nipponbare as the benchmark calibration standard, the genome size of genuine A. annua identified by ITS2 was estimated to be approximately 1.38-1.49 Gb by FCM, with Nipponbare, G. max and Z. mays as different internal standards. Furthermore, genome size did not vary significantly among the seven wild A. annua strains from five provinces, indicating no rapid expansion or contraction of the A. annua genome. It is therefore necessary to study further the relationship among environmental factors, genetic information and artemisinin content variation. The assessment of A. annua genome size provides a deeper understanding of its genome, facilitates the planning of its whole genome sequencing project, and provides a reference for insight into its subsequent genetics and evolution.
In this study (Table 1), seven wild sample plants or seeds were gathered from five provinces (Shandong, Hunan, Chongqing, Sichuan, and Hainan). Some wild seedlings were transplanted into our greenhouse, and others were germinated from wild seeds in soil or on Murashige and Skoog (MS) medium.
Rice (Nipponbare) seeds, whose genome size is well defined, were obtained from the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences. The soybean and maize seeds were purchased from the market, and their varieties are unknown. Nipponbare, soybean and maize were grown in water culture.
ITS2 and psbA-trnH sequences were adopted as barcodes to identify the seven samples, A. annua, and the other species. Genomic DNA was isolated from the samples, and their ITS2 and psbA-trnH sequences were obtained, assembled and analyzed according to the protocol of “Standard DNA Barcodes of Chinese Materia Medica in Chinese Pharmacopoeia” [51,80].
Fully developed fresh leaves (80 mg) of the A. annua strains, Nipponbare, soybean and maize, or callus and adventitious buds (200 mg) of cultured strains, were collected in clean Petri dishes. The samples were rapidly chopped with a sharp razor blade in 2 mL of cold nuclei extraction buffer (Otto I buffer [55,81]), filtered through a 40 µm nylon cell strainer, and kept on ice. The filtrate was transferred to a new 2 mL tube and centrifuged under cold conditions for 3 min at low speed (1844 × g, 5000 rpm), and the supernatant was removed. The pellet was then resuspended in 600 µL of fresh ice-cold Otto I solution and centrifuged under cold conditions for 30 s at 500 × g; this step was repeated twice. Before flow cytometric analysis, the nuclei were stained with propidium iodide (PI, with RNase; BD Biosciences Pharmingen™, San Diego, US) [43,45,82] in a mixture of Otto I and Otto II buffers (1:2) for 15 min.
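Because the protocol quotes both a relative centrifugal force and a rotor speed, the two can be related with the standard formula RCF = 1.118 × 10^-5 × r × rpm², with the rotor radius r in centimetres. The Python sketch below is only an illustration: the rotor radius is not given in the text, so the value derived here is an implied figure, and the function names are hypothetical.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_from_rcf(rcf: float, rpm: float) -> float:
    """Rotor radius in cm implied by a pair of RCF and rpm values."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


# Rotor radius implied by the "1844 x g at 5000 rpm" step (derived, not reported).
implied_radius_cm = radius_from_rcf(1844, 5000)
print(f"Implied rotor radius: {implied_radius_cm:.2f} cm")
```

On a rotor with a different radius, the speed would have to be adjusted to reach the same relative centrifugal force.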
Nuclear DNA content was measured and analyzed with a flow cytometer (BD Accuri™ C6, USA) at a low flow rate (14 µL/min), recording more than 100,000 cells. Forward scatter (FSC), side scatter (SSC), and blue (488 nm) and red (640 nm) fluorescence for PI were acquired. Two parallel runs, each with three technical replicates, were measured per sample to check the stability of the instrument. Mean values and standard deviations of all flow cytometric data were calculated. The composition and storage of the Otto I and II solutions followed Dolezel’s protocol [55].
We are obliged to Ms. Li Xiang from the Institute of Chinese Materia Medica for providing the ITS2 sequences of A. annua and the closely related species and counterfeits, and to Ms. Yanyan Ma from the Experimental Research Center, China Academy of Chinese Medical Sciences, for her kind FCM technical support. We also thank Shance Li from the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, for providing rice (Nipponbare) seeds.
This work was supported by “the Fundamental Research Funds for the Central public welfare research institutes” (zz0808021).
The authors declare that they have no competing interests.