Received | : | Aug 31, 2018 |
Accepted | : | Jan 17, 2019 |
Published Online | : | Jan 21, 2019 |
Journal | : | Journal of Plant Biology and Crop Research |
Publisher | : | MedDocs Publishers LLC |
Online edition | : | http://meddocsonline.org |
Cite this article: Tekalign A, Derera J, Sibiya J, Mumm RH. Molecular analysis for genetic diversity and population structure of Ethiopian faba bean (Vicia faba L.) accessions. Plant Biol Crop Res. 2019; 1: 1010.
Genetic diversity in germplasm is crucial for strategic breeding; however, very little is known about the genetic potential of faba bean (Vicia faba L.) for crop improvement and genomics research in Ethiopia. Therefore, forty landraces and ten improved faba bean varieties were characterized using thirty SSR markers to quantify the extent of genetic diversity and determine population structure. There were 220 alleles, averaging 7.86 per locus; the polymorphic information content (PIC) ranged between 0.0 and 0.87, as computed with PowerMarker and PAST software. The PIC and gene diversity values averaged 0.62 and 0.63, respectively, indicating high genetic diversity among the faba bean collections. Analysis of molecular variance showed that 68.5% of the total variation was found within populations, and population structure analysis showed three major clusters: clusters I and II comprised landraces and cluster III comprised genotypes from ICARDA. Therefore, these genotypes could potentially be used to improve the faba bean gene pool. Further, the knowledge generated about the level of diversity and population structure of faba bean germplasm is an important contribution to breeding and conservation of this crop. In addition, the SSR markers used in the present study were effective and highly polymorphic, and are recommended for future diversity studies of faba bean.
Keywords: Cluster; Faba bean landraces; Genetic diversity; Principal component analysis; Population structure; Molecular markers.
Faba bean (Vicia faba L.) is thought to have originated in the Near East and is one of the earliest domesticated legumes after chickpea and pea, with the Mediterranean basin as the most important centre of diversity; China, Afghanistan and Ethiopia have also been reported as secondary centres of diversity for the crop [1,2]. It is the fourth most important grain legume in tropical and subtropical regions of the world [3]. Faba bean belongs to the family Fabaceae (Leguminosae) and has an estimated genome size of ~13000 Mb [4,5]. It is a diploid (2n=2x=12) and a largely cross-pollinating species (cross-pollination rates of 35-55%) [6,7].
In Ethiopia, faba bean is grown in the mid-altitude to highland areas with high rainfall and in various types of soils. Faba bean has been cultivated as an excellent source of protein (27-34%) for resource-poor farmers [8], as a cash crop and as a major break crop in cereal mono-cropping systems, where it increases soil fertility by fixing atmospheric nitrogen [9]. Although less productive, the faba bean landraces have shown good adaptation to local, often stressful, conditions and are preferred by farmers and consumers for their good taste.
The landraces harbour great genetic potential, with alleles to improve agronomic performance, biotic and abiotic stress tolerance and quality characteristics. Therefore, there is potential to develop improved varieties with traits sourced from landrace populations [10].
The potential for improving a crop is determined by the level of genetic diversity available, and the use of diverse germplasm in breeding results in improved food production [11]. For effective breeding and management of genetic diversity, germplasm collections need to be well characterized. Diversity studies in faba bean can be performed using phenotypic, molecular and biochemical markers [12]. However, phenotypic characters are influenced by environmental factors [13]. Moreover, phenotypic differences may be determined by a small number of genes and may not represent genetic divergence across the entire genome [14].
Diversity studies using molecular techniques have been conducted on local collections of faba bean around the world [1,15-19]. Terzopoulos and Bebeli [20] confirmed the existence of different germplasm pools in Mediterranean faba bean using ISSR markers. In contrast, narrow genetic diversity was observed in faba bean from China using EST-SSR markers [21]. In Ethiopia, the existence of potential genetic diversity in faba bean was reported based on phenotypic characterisation [22]. However, the limited information on genetic diversity remains a major challenge for the systematic use of faba bean in breeding programmes in Ethiopia. Consequently, the faba bean improvement programme depends on genotypes from other sources, mainly the International Centre for Agricultural Research in the Dry Areas (ICARDA) [22]. Therefore, the objectives of this study were to estimate the genetic diversity, population structure and gene flow of faba bean landraces from the major faba bean growing areas of the Ethiopian highlands.
Forty Ethiopian faba bean landraces, chosen randomly from a collection representing the major faba bean growing areas, and ten inbred lines were used for this study. The landraces were collected from different major faba bean growing areas of the Ethiopian highlands (Figure 1); the inbred lines were developed by the pulse crops breeding programme of Holetta Agricultural Research Centre (HARC) and by the International Centre for Agricultural Research in the Dry Areas (ICARDA). Populations were designated by the area where the materials were collected; more detailed information for each genotype is given in Table 1.
Figure 1: Map of Ethiopia showing faba bean collection sites of major faba bean growing areas of Ethiopian highlands.
Faba bean genotypes were planted in seedling trays at the INCOTEC laboratory in South Africa. Leaf samples were harvested from two-week-old healthy leaves of one plant per genotype, folded and stored in 14-15 mL test tubes. The genomic DNA of the 50 faba bean genotypes was extracted following the International Maize and Wheat Improvement Centre (CIMMYT) protocol [23]. The concentration of the purified total DNA was checked using a spectrophotometer, while its quality was examined using 0.7% Tris Borate EDTA (TBE) agarose gel electrophoresis. The final concentration of all extracted DNA stocks was adjusted to 10 ng/μl and the DNA samples were stored at 4ºC.
The thirty SSR markers used in the present study were obtained from three sources: 6 from Gong et al. [21] (P designation), 5 from Ma et al. [24] (M designation) and 19 from Zeid et al. [25] (VfG designation). These markers have been recommended by researchers [21,24,25] for their level of polymorphism and close association with known gene functions. They were recommended for basic studies, diversity analysis in populations or germplasm collections, genetic mapping and marker-assisted breeding of faba bean. Details of the 30 SSR loci used in this study are presented in Table 2.
PCR amplification was performed using a GeneAmp PCR System 9700 (Applied Biosystems) thermal cycler. Reactions were run in 12 μl of a reaction mixture containing 1×PCR reaction buffer, 2.5 mM Mg2+, 0.2 μl each of dNTPs (Bioline), 1 unit Taq polymerase (Bioline) and 10 ng of genomic DNA. Primers were labelled with a fluorescent dye; two primers were provided for the amplification of each SSR locus: one tailed forward primer (0.05 μmol) and one normal reverse primer (0.25 μmol). The initial denaturation step was performed at 94ºC for 2 min, followed by 1 cycle at 94ºC for 30 s, 63ºC for 30 s and 72ºC for 45 s. The annealing temperature was decreased by 1ºC per cycle in subsequent cycles until it reached 57ºC. Products were subsequently amplified for 33 cycles at 94ºC for 30 s, annealing at 57ºC or the primer-specific temperature for 30 s, and 72ºC for 45 s, with a final extension for 20 min. PCR products were separated by electrophoresis in 0.7% TBE-agarose gels. The gels were then stained with ethidium bromide and observed under a UV transilluminator, and the bands were scored.
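The touchdown cycling described above can be written out as an explicit per-cycle schedule. The following is a minimal sketch, not the authors' thermal cycler program: the function name is hypothetical, it assumes the touchdown phase covers annealing temperatures from 63ºC down to 58ºC (one cycle each) before the 33 main cycles at 57ºC, and it ignores primer-specific annealing temperatures.

```python
def touchdown_schedule(start_anneal=63, final_anneal=57, step=1, main_cycles=33):
    """Return one (denature, anneal, extend) temperature tuple per cycle, in ºC.

    Assumption: the touchdown phase stops just above `final_anneal`, and the
    main cycles are then all run at `final_anneal`.
    """
    cycles = []
    anneal = start_anneal
    # Touchdown phase: lower the annealing temperature by `step` each cycle.
    while anneal > final_anneal:
        cycles.append((94, anneal, 72))
        anneal -= step
    # Main amplification phase at the final annealing temperature.
    cycles.extend([(94, final_anneal, 72)] * main_cycles)
    return cycles


schedule = touchdown_schedule()
print(len(schedule), "cycles; annealing temperatures of the first seven:",
      [c[1] for c in schedule[:7]])
```

Under these assumptions the programme has six touchdown cycles followed by 33 main cycles; a different reading of the protocol (for example, counting the first 57ºC cycle as part of the touchdown phase) would shift the total by one.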
Basic statistics were calculated using the genetic analysis packages indicated below. Summary statistics, including the number of alleles (NA), major allele frequency (MAF), heterozygosity, polymorphic information content (PIC; a measure of the informativeness of a genetic marker for linkage analysis) and gene diversity [18], were calculated using PowerMarker (ver. 3.23) [26]. Variance components within and among populations were calculated using analysis of molecular variance (AMOVA) with the software ARLEQUIN V3.1 [27]. As a measure of genetic differentiation between populations, pairwise F-statistics (Fst), which describe the expected reduction in heterozygosity of a subpopulation relative to the total population, were estimated and tested for significance between population pairs within the AMOVA framework. Genetic relationships among individuals were assessed by multivariate principal component analysis (PCA), which is used to detect patterns of variation in complex data sets. Estimates of genetic similarity based on Jaccard's similarity coefficient were computed from the proportion of shared SSR alleles between every pair of faba bean genotypes [28] using PAST software V 3.0 [29]. The pattern of diversity among the genotypes was identified based on eigenvectors. Genetic distance-based clustering was performed with the unweighted pair group method with arithmetic mean (UPGMA), and the dendrogram showing the relatedness among the 50 faba bean genotypes was obtained by the Ward clustering method using SAS software V 9.3 [30].
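For readers unfamiliar with these summary statistics, the sketch below computes major allele frequency, gene diversity and PIC for a single locus from its allele frequencies, using the standard definitions (gene diversity as one minus the sum of squared allele frequencies, and PIC following Botstein et al.). It is an illustration only, not the PowerMarker implementation; the function names and the example allele calls are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Map each allele observed at one locus to its relative frequency."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC: 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    freqs = list(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical allele sizes (bp) scored at one SSR locus.
calls = [150, 150, 154, 154, 158, 158, 162, 162]
freqs = allele_frequencies(calls)
print("MAF:", max(freqs.values()))
print("Gene diversity:", gene_diversity(freqs.values()))
print("PIC:", pic(freqs.values()))
```

A monomorphic locus gives a gene diversity and PIC of zero, which is why such loci carry no information for diversity analysis.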
The population structure was analysed using STRUCTURE 2.3.4, which implements a model-based clustering algorithm using the genotype data [31]. It assumes a model in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. Individuals are assigned to populations according to their membership coefficients for each cluster. A series of K values, from 1 to 20, was used to estimate the number of clusters under the admixture model with correlated allele frequencies. For each K, 20 independent runs of 10000 iterations were processed following a burn-in period of 50000 iterations. The optimum K value, which indicates the number of genetically distinct clusters in the data, was determined from the 20 replicate runs for each K [32]. The ΔK was calculated based on the rate of change of the log-likelihood between successive K values. The software program Structure Harvester V 0.6.93 [33] was used to calculate the parameters of Evanno et al. [32]. Following the method of Evanno et al. [32], the ΔK values were plotted against the number of groups K. The software package CLUMPP was used to combine the STRUCTURE group-membership output for each population from the 20 replicate runs of the molecular data for K=3 [34]. The optimal genetic structure (at the maximum value of ΔK) was graphically displayed using DISTRUCT [35].
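The ΔK criterion can be illustrated with a short calculation. The sketch below is one common way to compute it from replicate runs: the absolute second-order difference of the mean log-likelihood, divided by the standard deviation of the log-likelihood at that K. It is not the Structure Harvester code, the function name is hypothetical, and the log-likelihood values are invented purely for illustration.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Evanno-style delta K from replicate log-likelihoods.

    lnp_by_k maps each K to a list of ln P(D) values from replicate runs.
    Delta K is only defined where both K-1 and K+1 were also run.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if (k - 1) in means and (k + 1) in means and sds[k] > 0:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / sds[k]
    return result


# Hypothetical ln P(D) values, three replicate runs per K.
runs = {
    1: [-5000, -5002, -4998],
    2: [-4500, -4503, -4497],
    3: [-4000, -4001, -3999],
    4: [-3990, -3995, -3985],
    5: [-3985, -3990, -3980],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, "best K:", best_k)
```

In this invented example the log-likelihood rises steeply up to K=3 and then plateaus, so ΔK peaks at K=3; ΔK cannot be evaluated at the smallest or largest K tested.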
Twenty-eight of the 30 SSRs were polymorphic among the 50 entries (Table 3). The fragment size of the alleles ranged from 120 bp (VfG31 and VfG55) to 326 bp (VfG11). A total of 222 alleles were detected with the 30 SSR markers. The number of alleles detected by a single SSR locus varied from 1 (for VfG87 and M10) to 24 (for VfG67), with an average of 7.4 per marker (Supplementary Table 1). Being monomorphic, markers VfG87 and M10 were uninformative and were dropped from further analysis. Heterozygosity (He) across all loci was very high, ranging from 0.39 (VfG19) to 1.00 (M17, VfG3, VfG10, VfG27 and VfG81), with a mean of 0.92. The major allele frequency ranged from 0.14 (VfG28) to 0.83 (VfG19) with a mean of 0.44 (Table 3). Gene diversity, or expected heterozygosity per locus in a population, is used to quantify genetic variation, evaluate genetic divergence and population relationships, and detect inbreeding. The gene diversity scores of the 28 polymorphic SSR loci ranged from 0.28 (VfG19) to 0.91 (VfG28) with a mean of 0.68 (Table 3). The PIC value for the 28 SSR loci varied from 0.32 (VfG19) to 0.91 (VfG28). All except two loci (P41 and VfG19) showed high PIC values (>0.5), with an average of 0.67 (Table 3).
Genetic variation was partitioned significantly (P<0.001), with the larger share found within the populations grouped by geographical location (Table 4). Of the molecular variation, 68.54% was found within populations and 31.46% among populations. The fixation index (Fst), a measure of population differentiation due to genetic structure, was 0.31460, showing the extent of differentiation among populations (Supplementary Table 2). Thus, populations differed from one another and lines within populations also exhibited diversity.
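The link between the AMOVA partition and the fixation index can be made explicit. In a single-level AMOVA, Fst is the among-population share of the total variance. The sketch below applies this to the within-population share reported above (68.54%), taking the among-population share as the remainder; that remainder, and the function name, are assumptions made for illustration rather than values read from ARLEQUIN output.

```python
def fst_from_components(var_among, var_within):
    """Fst as the among-population fraction of total variance."""
    total = var_among + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_among / total


within_pct = 68.54                 # within-population share reported in the text
among_pct = 100.0 - within_pct     # assumption: the two shares sum to 100%
fst = fst_from_components(among_pct, within_pct)
print(round(fst, 4))
```

The result agrees with the reported Fst of 0.31460 to four decimal places.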
Table 4: Analysis of molecular variance of 50 faba bean genotypes grouped into populations based on their geographical location.
The pairwise FST values revealed the lowest genetic differentiation (0.087) between the Tigrai and Harar populations (Table 5, below diagonal). In contrast, differentiation was highest between the faba bean genotypes from ICARDA and the other populations (Wollega 0.77, Wollo 0.72, North Shewa 0.69, Gojjam 0.69, Tigrai 0.68, Central highland 0.58 and HARC 0.55). Significant (P<0.001) population differentiation was observed between the populations of Arsi and Central highland, Arsi and Wollo, Arsi and North Shewa, Arsi and ICARDA materials, Central highland and North Shewa, improved materials from HARC and Central highland, North Shewa and Harar, Wollega and Central highland, and Wollo and Central highland (Table 5, above diagonal).
Table 5: Population pairwise FST values (below diagonal) and their P values (above diagonal).
The dendrogram classified the germplasm into three major clusters (Figure 2). Cluster I and cluster II were further subdivided into sub-clusters. Cluster II comprised landraces from different locations, four old small-seeded improved varieties (CS-20-DK, NC58, Bulga-70 and Kasa) and two recently released varieties (Moti and Dosha). All faba bean landrace collections from different parts of Wollega (Wollega1, Wollega2, Wollega3, Wollega4 and Wollega5) were also grouped in cluster II. Cluster III comprised the exotic genotypes from ICARDA (ILB-938, ILB-4726 and BPL-710). The genotype Central highland4 (a collection from Chalya, West Shewa, in the Central highland), Arsi4 (a collection from Dawa Bursa in Arsi zone) and Gebelcho (an improved variety from HARC) were also grouped in cluster III.
Figure 2: (A) UPGMA dendrogram for fifty faba bean genotypes based on the Jaccard coefficient as revealed using SSR markers (names indicate the genotype name and collection place listed in Table 1) and (B) bar plot for K=3 of the estimated membership coefficients; each vertical bar represents the membership coefficient for an individual genotype, grouped into three groups GI, GII and GIII.
There were three different genetic groups (GI, GII and GIII) (Figure 2A and B). The maximum ΔK occurred at K=3, at which the faba bean genotypes divided into three clusters (Figure 2B and Figure 3A). All ICARDA faba bean genotypes were fully grouped together in the DISTRUCT plot (Figure 3B), whereas the rest of the populations showed admixture (Figure 4). As shown in Figure 3C, some individual genotypes from the populations of Gojjam, Central highland, Tigray, Wollega and Wollo were not admixed. All the improved faba bean materials from HARC were admixtures, sharing genetic components across groups.
Figure 3: Inferred population structure of faba bean genotypes: (A) plot of the relationship between ΔK and K, showing the highest peak at K=3; (B) DISTRUCT plot for 50 faba bean genotypes based on the STRUCTURE analysis, in which each color represents a different cluster, black segments separate the populations and population names are given below the figure; and (C) individual genotypes from each population (collection site and source), each represented by a single vertical line partitioned into K=3 segments. Each color represents one cluster, and the length of the colored segment shows the genotype's estimated proportion of membership in that cluster as calculated by STRUCTURE. The color code for the three inferred clusters is 1=Red, 2=Green, 3=Blue.
Figure 4: Scatter plot of PC1 and PC2 based on the similarity of 30 SSR markers for 50 faba bean germplasm. Different genotypes grouped into three main groups and one admixture group.
The individuals were assigned to clusters (Supplementary Table 2). Genotypes with a membership coefficient >0.800 were assigned entirely to the respective cluster, whereas individuals with membership coefficients <0.800 were considered admixed and were assigned to two or more population clusters. All genetic groups comprised individuals with a high estimated membership coefficient for the respective cluster.
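The assignment rule above amounts to a simple threshold on each genotype's largest membership coefficient. A minimal sketch follows; the function name, the group labels as a tuple and the example coefficients are hypothetical, and a coefficient of exactly 0.800 is treated as admixed because the text does not specify that boundary case.

```python
def assign_cluster(q, threshold=0.8, labels=("GI", "GII", "GIII")):
    """Assign a genotype to a cluster from its membership coefficients.

    q holds one coefficient per cluster (summing to about 1). The genotype is
    assigned to a cluster only if its largest coefficient exceeds `threshold`;
    otherwise it is reported as admixed.
    """
    best = max(range(len(q)), key=lambda i: q[i])
    if q[best] > threshold:
        return labels[best]
    return "admixed"


# Hypothetical membership coefficients for three genotypes.
print(assign_cluster([0.95, 0.03, 0.02]))
print(assign_cluster([0.10, 0.05, 0.85]))
print(assign_cluster([0.50, 0.30, 0.20]))
```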
The first four principal components explained about 29% (12.05% for PC1, 7.08% for PC2, 5.16% for PC3 and 4.67% for PC4) of the total diversity. The test genotypes were divided into three distinct groups, with the ICARDA materials fully separated from the others (Figure 4). In the PCA plot, the faba bean germplasm was divided into three main groups and one admixture group, with 12, 16, 5 and 17 genotypes in Cluster I, II, III and the admixture group, respectively. Group III consisted of exotic faba bean varieties from ICARDA (FBV26-ILB-938, FBV28-ILB-4726 and FBV31-BPL 710). This group also included the improved faba bean variety Gebelcho-FBV29 and a landrace collection (FBColl-36). Thus, the grouping obtained from the unweighted pair-group method with arithmetic mean (UPGMA) dendrogram and from the STRUCTURE analysis was confirmed by principal component analysis (PCA).
Genetic dissimilarity >50% was observed in about 86.94% of the pairwise comparisons among the faba bean genotypes (Figure 5). The highest genetic dissimilarity coefficient (93%) was observed between a faba bean collection from West Shewa (FBColl-036) and a faba bean collection from Arsi Zone (FBColl-003). The lowest genetic dissimilarity coefficient (29%) was observed between the faba bean varieties ILB-4726 (FBV-028) and ILB-938 (FBV-026).
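The dissimilarity coefficients reported here are the complement of Jaccard's similarity, that is, one minus the proportion of alleles shared by two genotypes out of all alleles present in either. The sketch below shows the calculation for a single pair; it is not the PAST implementation, and the function name, locus labels and allele sizes are hypothetical.

```python
def jaccard_dissimilarity(alleles_a, alleles_b):
    """Return 1 - |A & B| / |A | B| for two sets of scored alleles."""
    a, b = set(alleles_a), set(alleles_b)
    union = a | b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(a & b) / len(union)


# Each allele is a (locus, fragment size in bp) pair; values are invented.
genotype_1 = {("locusA", 150), ("locusA", 154), ("locusB", 200)}
genotype_2 = {("locusA", 150), ("locusB", 200), ("locusB", 204)}
print(jaccard_dissimilarity(genotype_1, genotype_2))
```

In this example the two genotypes share two of the four distinct alleles, giving a dissimilarity of 0.5; repeating the calculation for every pair of genotypes yields the dissimilarity matrix used for clustering.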
In this study, an average of 7.4 alleles was detected per locus, whereas Gong et al. [36] detected 2.3 alleles per locus among the lines they tested. Such considerable differences in the number of alleles detected may arise from differences in the diversity of the test genotypes, the number of genotypes examined and the genotyping method used. SSR markers are relatively abundant, show co-dominant inheritance and are useful for estimating genetic relationships and diversity [37]. In this study, the SSR markers were able to detect a considerable level of genetic diversity among the tested faba bean genotypes.
A high average heterozygosity of 0.85 was detected in this study, which could reflect the partially cross-pollinating nature of faba bean [10]. This heterozygosity is important in creating genetic variability in faba bean [38,39]. The results have implications for breeding new varieties: there is potential for selection within the populations, which is consistent with previous studies. Link et al. [40] reported considerable potential for selection within populations for specific traits from highly heterogeneous and heterozygous plants in faba bean.
The polymorphic information content (PIC) value provides an estimate of the discriminating power of a marker based on the number of alleles at a locus and the relative frequencies of these alleles. In the present study, nearly 86% of the markers had high PIC values (>0.5), with an average of 0.62. This indicates that the markers were highly polymorphic and informative, and were useful in discriminating the faba bean genotypes. These values are considered high based on the classification of Botstein et al. [41], who indicated that PIC >0.5 represents a highly informative marker, 0.5> PIC >0.25 an informative marker, and PIC <0.25 a slightly informative marker. In the present study, the high polymorphic rate for most (86%) of the markers and the high PIC values, together with more than 50% genetic dissimilarity for about 87% of the pairwise comparisons among the test faba bean genotypes, suggest a high level of heterogeneity. Furthermore, the SSR markers used in the present study were effective and highly polymorphic. Therefore, this set of markers is recommended for future evaluations of faba bean germplasm.
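The classification of Botstein et al. quoted above can be expressed as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and because the thresholds as quoted do not say how values of exactly 0.5 or 0.25 are treated, those boundary cases are placed in the lower class here by assumption.

```python
def classify_pic(pic_value):
    """Classify a marker by its PIC value using the thresholds quoted above."""
    if not 0.0 <= pic_value <= 1.0:
        raise ValueError("PIC must lie between 0 and 1")
    if pic_value > 0.5:
        return "highly informative"
    if pic_value > 0.25:
        return "informative"
    return "slightly informative"


# The average PIC discussed above, and two illustrative lower values.
for value in (0.62, 0.32, 0.10):
    print(value, classify_pic(value))
```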
The AMOVA results indicated high genetic variation, which is consistent with previous studies on faba bean. The findings from this study indicate that faba bean populations from the Ethiopian highlands are highly variable. This is in line with the study by Kwon et al. [16], who reported a large amount of variation within faba bean landraces. Recently, Ouji et al. [19] reported that 74.3% of the genetic diversity in nine populations of Tunisian faba bean was found within rather than among populations, which is also congruent with the findings of the current study. Similarly, Terzopoulos and Bebeli [20] reported a high level of genetic diversity within populations of Mediterranean-type faba bean. The study therefore adds evidence that variation within populations is large in faba bean.
Higher genetic diversity within populations than among populations is expected [42], since faba bean is largely an out-crossing crop [43]. The low percentage of variation (31.46%) partitioned among populations of faba bean from different locations in this study could be attributed to the exchange of faba bean seed among farmers. Ethiopian faba bean growers obtain their seed mostly through the informal seed exchange system, which has led to considerable genetic exchange among farmers' landraces [44]. However, the large FST values (0.17-0.23) between the populations of Arsi, Central highland and North Shewa suggest that, although these areas are geographically close to each other, the collections from them are highly differentiated genetically.
The clustering of genotypes into three major groups reflects the origin of the genotypes and known pedigree relationships. For example, the improved variety Gebelcho grouped with the ICARDA materials because it is derived from a cross of ILB4726×Tesfa and was released by HARC [45]. The results showed well-defined distribution patterns of the materials according to the genetic distance and membership coefficient relationships among them [35]. In the PCA plot, the faba bean germplasm was divided into three main groups and an admixture group. These groups reflect the breeding history and the relationships between improved, exotic and landrace collections. The admixture observed in the genotypes shows good agreement with pedigree information. Both the PCA and the STRUCTURE analysis suggested the existence of three major groups. The admixture group, characterized by small membership coefficient values, is explained by gene flow, whereby genotypes have infiltrated other population clusters [46]. The observed admixture is due to the largely out-crossing nature of faba bean [47] and to artificial crossing and hybridization.
The high genetic dissimilarity coefficients for the majority of pairwise comparisons among the faba bean genotypes suggest a low degree of genetic relatedness. This result differs from that of Kwon et al. [16], who reported high genetic similarity for the majority of pairwise comparisons among faba bean entries collected worldwide. Substantial genetic diversity and clear population structure were observed using a previously unexploited set of landraces. Since the landraces included in this study possess desirable attributes, such as resistance to abiotic stress factors, they qualify as suitable parents for varietal development in faba bean.
The present study suggests that the faba bean collections used were genetically variable and shows evidence of considerable gene flow among the geographical locations where the materials were collected. Choosing parents within faba bean populations is recommended to exploit inter-cluster variability, followed by crosses among different populations to exploit inter-population variability for traits of interest to the breeding programme. This information is helpful for developing appropriate science-based strategies for faba bean breeding. However, further molecular work, such as sequencing of faba bean genomes, is needed to explore the available diversity, to better understand the genetic variability present in faba bean and, consequently, to use the existing variability to improve faba bean in response to the challenges of faba bean production in Ethiopia.
The authors are grateful to the African Centre for Crop Improvement (ACCI) and the Alliance for a Green Revolution in Africa (AGRA) for financial support to the first author. The Ethiopian Institute of Agricultural Research (EIAR) and Holetta Agricultural Research Centre (HARC) are gratefully acknowledged for hosting the study and providing research facilities.