Are the traits selected for in beekeeping heritable?
Matthieu Guichard (Agroscope, Swiss Bee Research Centre, Bern) conducted a long-term study between 2010 and 2018 on approximately 1000 Carnica honey bee colonies and approximately 1000 Mellifera honey bee colonies, aiming to determine the heritability of various traits: honey yield, gentleness, comb adherence, swarming tendency, hygienic behaviour, and Varroa infestation.
Although low to very low, heritability values are positive in both bee races for four traits: gentleness, comb adherence, hygienic behaviour, and honey yield. Swarming does not appear to be a trait that can be modulated by selection. Varroa infestation also does not seem to be improvable, probably due to reinfestation in an environment with a high density of apiaries in Switzerland.
The many environmental parameters further complicate selection through controlled mating. Heritability measurements are context-dependent, and values reported in the literature must be interpreted with caution. An apiary is not a homogeneous unit: each colony exhibits specific behaviour depending on its topographical situation and its ability to detect food resources.
Selection is therefore a particularly complex process. The average beekeeper should bear in mind that placing colonies in a suitable environment, combined with high-performing young queens, is sufficient to increase honey and pollen yields. Most queens used by amateur breeders for grafting have already undergone intensive selection. The task therefore consists more in maintaining fixed traits than in attempting to improve them at the cost of considerable and often disappointing effort.
0 Abstract
Successful honey bee breeding programmes require traits that can be genetically improved by selection. Heritabilities for production, behaviour, and health traits, as well as their phenotypic correlations, were estimated in two distinct Swiss Apis mellifera mellifera and Apis mellifera carnica populations based on 9 years of performance records and more than two decades of pedigree information. Breeding values were estimated by a best linear unbiased prediction (BLUP) approach, taking either queen or worker effects into account. In A. m. mellifera, the highest heritabilities were obtained for defensive behaviour, calmness during inspection, and hygienic behaviour, while in A. m. carnica, honey yield and hygienic behaviour were the most heritable traits. In contrast, estimates for infestation rates by Varroa destructor suggest that phenotypic variation cannot be attributed to an additive genetic origin in either population. The highest phenotypic correlations were determined between defensive behaviour and calmness during inspection. The implications of these findings for testing methods and the management of the breeding programme are discussed.
1 Introduction
In Switzerland, beekeepers from the associations mellifera.ch (MEL), breeding Apis mellifera mellifera, and the Société Romande d’Apiculture (SAR), breeding Apis mellifera carnica, maintain two independent breeding programmes. However, they share a common interest in improving production, behaviour, and health traits of honey bees. Their goal is to provide beekeepers with genetic material that meets their respective population standards and exhibits good beekeeping characteristics under local environmental conditions. Selection is supported by government funding to sustain local breeding programmes.
The two breeding programmes maintain mating stations in distinct Alpine valleys, allowing controlled mating of queens with selected drones from the respective honey bee populations (Plate et al. 2019). Since 2010, selection in each population has taken place after the evaluation of approximately 100 to 180 queens per year by qualified beekeepers in networks of test apiaries throughout Switzerland following standardised testing procedures corresponding to or very similar to other protocols described in the literature (Büchler et al. 2013; Ehrhardt et al. 2010). In test apiaries, the following traits are regularly recorded: honey yield, defensive behaviour, calmness during inspection, swarming tendency, hygienic behaviour towards pin-killed brood, and infestation by the parasitic mite Varroa destructor, while MEL beekeepers additionally assess colony size. After an initial treatment to equalise infestation among colonies at a very low level, no treatment against V. destructor is applied during testing in the following season.
To determine the quality of the two breeding programmes, genetic parameters of the aforementioned traits were estimated using a best linear unbiased prediction (BLUP) approach. Recently, a similar analysis was conducted including phenotypic records of approximately 15,000 Austrian A. m. carnica colonies (Brascamp et al. 2016). Estimation of genetic parameters strongly depends on the size and structure of the data; therefore, it was uncertain whether the same approach (Brascamp and Bijma 2014) could be applied to our smaller datasets (approximately 1,000 colonies per population). In the study by Brascamp et al. (2016), it was shown that genetic effects of queens and workers, both contributing to colony performance, can be estimated jointly. Under this condition, it became possible to add the estimated breeding values (EBV) for queen and worker effects and use this sum as a selection criterion. In smaller datasets, it is more likely that the restricted maximum likelihood algorithm does not converge when a joint estimation is performed, necessitating estimation of either worker or queen effects. Such a situation was described for a recent honey bee breeding programme comprising 151 colonies (Facchini et al. 2019).
The aim of this study was to calculate EBVs, heritability estimates, and phenotypic correlations for the different traits recorded by MEL and SAR beekeepers. In addition, we validated the results of the applied statistical models. The results presented in this study will enable Swiss beekeepers to optimise their breeding programmes and selection strategies.
2 Materials and methods
Structure of the applied breeding programmes
In the MEL breeding programme, groups of 12 sister queens are mated at mating stations during summer with drones from drone-producing colonies headed by sister queens. All queens are evaluated blindly in the following year in a network of test apiaries. Based on this evaluation, queens are selected in their third year for grafting to produce daughter queens (female side). Selection of queens for the production of queens heading drone-producing colonies (male side) also occurs throughout the programme based on test results.
In the SAR breeding programme, experienced beekeepers are responsible for maintaining their own maternal lines. For this purpose, on the female side, colonies are empirically selected each year for producing the next generation by grafting. Following the same strategy as in MEL, groups of 12 sister queens are produced and distributed to different test apiaries, where they are tested blindly to select queens for the male side. In contrast to the MEL programme, only some of the best queens may occasionally be used for the female side of queen breeding; however, this is rare. Consequently, phenotypic information is intensively used on the male side and scarcely on the female side.
Datasets
In 2019, the honey bee breeding associations MEL and SAR provided recorded phenotypes (2009–2018) and pedigree information to estimate heritabilities and EBVs for the various traits (honey yield, defensive behaviour, calmness during inspection, swarming tendency, hygienic behaviour towards pin-killed brood, and infestation by V. destructor). The database was updated annually, corresponding to colonies evaluated in the previous beekeeping season (the number of queens tested by beekeepers determines the amount of additional information available each year).
In each dataset, the unique identification number (ID) of the queen was used to identify the respective colonies. Here, a colony is defined as a group of worker sisters originating from the same queen. The identity of the queen heading the colony (dam of the workers), her mother, and the mother of the drone-producing colonies (sire) used to mate the colony queen was generally known. Based on this pedigree information, two pedigree files (MEL and SAR) were generated. For the majority of colonies included in the pedigree records, information on most evaluated traits and the identification of the test apiary were available. At each test apiary, beekeepers indicate the date and the person who evaluated the colonies. These presumed effects (year, evaluator, and location) on honey bee evaluation were confounded, as the quality of all colonies is assessed simultaneously during the season. Phenotypic records and apiary information (year, evaluator, and location) were included in the respective performance records.
Data preparation
In the pedigree files (Table I), the identities of dams and sires were recorded for each colony. The number of drone-producing colonies was also indicated. A base queen or base sire was added to the pedigree file if one of the parents was unknown. If queens were mated at the same location with an unknown father (often a group of unrelated drone-producing colonies), the latter was coded as a common sire with no known parents. This situation occurred repeatedly in the MEL population and resulted in the addition of many virtual base animals. In the pedigree file, the number of entries corresponds to the total number of colonies, dams, and sires. The lines of these files contain all colonies, dams, and sires, as well as the identities of their own dams and sires.
Table I Structure and content of pedigree and performance files obtained for mellifera.ch (MEL, Apis mellifera mellifera) and the Société Romande d’Apiculture (SAR, Apis mellifera carnica)
Before the start of the records analysed here, queens were already mated at mating stations in a manner similar to the years included in the dataset. It was therefore assumed that the additive genetic relationship among drone-producing colonies had reached equilibrium. For each mating, the number of involved drones was assumed to follow a Poisson distribution and was set to 12 (Brascamp et al. 2016). The inverse of the pedigree relationship matrix among all pedigree entries was calculated following Brascamp and Bijma (2014).
In the performance records (Table I), for ease of interpretation of results, phenotypic values of each trait were entered without transformation, even for traits that were not normally distributed. Honey yield was analysed as raw data, both with and without colonies producing no honey, to assess whether lack of production was mainly due to unfavourable environmental conditions or to low genetic value of the colony. Two ratios were calculated from phenotypic data: the growth rate of V. destructor infestation between spring and summer (Ehrhardt and Bienefeld 2007) and the growth rate of colony size, expressed as the ratio between summer and spring colony size. An apiary identifier was also added to the file by combining geographic location and year of analysis.
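The derived fields described above can be sketched in a few lines of Python. This is only an illustration: the field names (`varroa_spring`, `size_summer`, and so on) are hypothetical, and the infestation growth rate is computed here as a plain summer-to-spring ratio, which is an assumption and may differ from the exact definition in Ehrhardt and Bienefeld (2007).

```python
def growth_ratio(spring, summer):
    """Return summer / spring, or None when the ratio cannot be computed."""
    if spring is None or summer is None or spring <= 0:
        return None
    return summer / spring


def apiary_year_id(location, year):
    """Combine test location and year into one identifier (hypothetical format)."""
    return f"{location}_{year}"


def derive_fields(record):
    """Return a copy of a performance record with the derived fields added.

    All keys are illustrative; they are not the column names of the MEL or SAR files.
    """
    out = dict(record)
    out["varroa_growth"] = growth_ratio(
        record.get("varroa_spring"), record.get("varroa_summer")
    )
    out["size_growth"] = growth_ratio(
        record.get("size_spring"), record.get("size_summer")
    )
    out["apiary_year"] = apiary_year_id(record["location"], record["year"])
    return out
```

Returning `None` for a missing or zero spring value keeps undefined ratios out of the analysis rather than silently replacing them with a number.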
Model
Estimated breeding values for all traits were calculated using ASReml software version 4.1.2132 (www.vsni.co.uk), using the aforementioned performance files and inverses of the pedigree relationship matrices.
In an initial attempt, a model was used to jointly estimate worker and queen effects, as well as the fixed apiary-year effect and a global mean. In both datasets, due to the size or structure of the data, the restricted maximum likelihood algorithm did not converge. Thus, worker and queen effects were evaluated separately. Accordingly, models were defined to include either colony or queen for the purpose of separate estimation of worker and queen effects. In the model including colony, we accounted for the fact that workers are a group of individuals rather than a single individual by calculating the relationship matrix following the approach of Brascamp and Bijma (2014).
For each dataset, two single-trait linear models were finally used, the first on worker effects (WM) and the second on queen effects (QM) as random effects.
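As a rough illustration of how a heritability follows from such a model, the sketch below uses the simplest textbook definition: the genetic variance divided by the sum of genetic and residual variance. This is an assumption made for illustration. In honey bee models built on the Brascamp and Bijma (2014) relationship matrix, the phenotypic variance may be composed differently, and the function and parameter names here are hypothetical.

```python
def heritability(var_genetic, var_residual):
    """Textbook heritability for a model with one random genetic effect.

    h2 = var_genetic / (var_genetic + var_residual)

    This simplified denominator is an assumption; it is not necessarily the
    phenotypic variance used in the study.
    """
    if var_genetic < 0 or var_residual < 0:
        raise ValueError("variance components must be non-negative")
    total = var_genetic + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_genetic / total
```

For example, a genetic variance of 1.0 with a residual variance of 3.0 gives a heritability of 0.25, and a genetic variance of zero gives a heritability of zero regardless of the residual.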
3 Discussion
Genetic analysis of two independent honey bee datasets, each comprising approximately 1,000 colonies with observations, indicated that it is possible to calculate genetic parameters even in small populations. However, in our case, it was not possible to jointly estimate queen and worker effects, as in previous studies (Bienefeld and Pirchner 1990, 1991; Brascamp et al. 2016; Ehrhardt et al. 2010). As two linear models (WM and QM) had to be used, part of the variation related to the queen effect may have been included in the worker effect in WM, and vice versa, and it was not possible to estimate the genetic correlation between worker and queen effects for the same trait.
Since the MEL and SAR populations were not genetically linked, were managed differently, and had distinct evaluation protocols for some traits, comparisons between estimates should only be made with caution. This also applies to comparisons with some previously published studies, in which heritabilities may have been estimated using other methods.
In both datasets, heritability for honey yield was low to moderate and below values previously reported in other countries (Andonov et al. 2019; Bienefeld and Pirchner 1990; Brascamp et al. 2016; Najafgholian et al. 2011; Tahmasbi et al. 2015; Zakour et al. 2012). This may be explained by specific characteristics of honey production in Switzerland, which relies mainly on oilseed rape nectar and silver fir honeydew (Persano Oddo et al. 2004), both of which are strongly influenced by highly variable environmental conditions: production can, for example, be strongly affected by genotype–environment interactions. Moreover, colony size recorded by MEL beekeepers was almost not heritable but positively correlated with honey yield; the latter may be influenced by colony management (e.g. space available for the queen to lay eggs or supplementary feeding if necessary). In MEL data, heritability estimates for honey yield were slightly higher when non-producing colonies were included in the dataset. This result suggests a putative genetic effect on non-producing colonies; therefore, we suggest including non-producing colonies in data analyses.
The heritability estimates for defensive behaviour and calmness during inspection differed between the two populations, with high values in MEL corresponding to previously published values for other populations (Andonov et al. 2019; Bienefeld and Pirchner 1990; Brascamp et al. 2016; Tahmasbi et al. 2015) or even exceeding them (Zakour et al. 2012). Lower estimates were obtained for the SAR population; this may be related to evaluation protocols. As quality levels of colonies were recorded and many expressed apparently satisfactory behaviour levels for these traits, half of the colonies were scored between 3.5 and 4 (maximum score). We therefore observed lower variation in records for these traits compared with the MEL dataset, where the worst colony per apiary was scored 1, the best 4, and the others distributed in between (Table I). Thus, low heritability could be the result of low reported variability rather than absence of genetic effects. To improve evaluation of calmness during inspection and defensive behaviour in SAR, we suggest evaluating colonies using the full scale from 1 to 4 for relative ranking to better discriminate the best colonies. This had already been suggested to more efficiently select for low defensive behaviour in extremely aggressive A. m. syriaca populations (Zakour and Bienefeld 2013). If almost no variability can be detected in the field for certain traits, for example when all colonies are very close to the optimum, other approaches might be preferred, such as removing a few low-performing colonies from the programme. High correlations between defensive behaviour and calmness during inspection were obtained for both populations (0.71 and 0.65 for MEL and SAR, respectively). High genetic correlations between these two traits have already been reported in an Austrian A. m. carnica population (Brascamp et al. 2016). 
This may indicate either that beekeepers are not able to distinguish the two traits or that they are generally closely linked. Breeding programmes could consider including only one trait or defining better tools to assess the two traits more distinctly.
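The suggestion to use the full 1 to 4 scale for relative ranking within an apiary can be sketched as follows. The linear rescaling is one possible way to spread colonies between the worst (1) and the best (4); it is an assumption, not the MEL protocol itself, and the function name is hypothetical.

```python
def relative_scores(raw_scores, low=1.0, high=4.0):
    """Rescale the scores of one apiary so the worst colony gets `low` and the best `high`.

    Intermediate colonies are placed linearly in between. If all colonies have
    the same raw score, no ranking is possible and the midpoint is returned
    for each of them (an arbitrary choice made for this sketch).
    """
    if not raw_scores:
        return []
    worst, best = min(raw_scores), max(raw_scores)
    if best == worst:
        return [(low + high) / 2.0 for _ in raw_scores]
    span = best - worst
    return [low + (high - low) * (s - worst) / span for s in raw_scores]
```

Applied to an apiary where raw scores are compressed between 3.5 and 4, for instance `[3.5, 4.0, 3.75]`, the rescaled scores become `[1.0, 4.0, 2.5]`, which restores the variation needed to discriminate among colonies.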
In agreement with a previous study (Brascamp et al. 2016), heritability estimates for swarming tendency were low (< 0.1) for both queen and colony effects, indicating either strong genotype–environment interactions (involving weather or honey flow conditions), non-genetic queen quality factors, or lack of accuracy in trait evaluation by beekeepers. Higher values were obtained in two other populations (Andonov et al. 2019; Tahmasbi et al. 2015), indicating that in those cases selection for this trait could be possible.
Surprisingly, heritability estimates for infestation by V. destructor resulted only in zero values. This finding does not correspond to earlier findings reported by others (Büchler et al. 2008; Ehrhardt et al. 2010) but matches some observations (Harbo and Harris 1999; Maucourt 2019). Low, non-significant heritabilities were found in spring in WM. However, spring values are mainly recorded to check the effectiveness of pre-test treatment (infestation should be close to zero for all colonies); thus, these values are likely due to dataset-specific characteristics rather than genetic differences among colonies. In contrast, higher estimated heritabilities would have been expected for summer infestation rates and infestation growth rates between spring and summer. Several reasons may explain this result: either infestation may not be influenced by the host’s genetic background, or such influences are masked early by much stronger horizontal transmission among colonies and/or apiaries. Such transmission, for example related to drifting, is likely to occur from late spring onwards, as relatively long gaps between spring and summer honey flows are common in Switzerland. Many apiaries in Switzerland still use traditional hives grouped in small pavilions with entrances of different colonies placed side by side. Apiaries with grouped hives and entrances oriented in the same direction are known to increase mite transfers between colonies (Dynes et al. 2019; Seeley and Smith 2015). In addition, many regions in Switzerland have a high density of colonies and apiaries per square kilometre (Fluri et al. 2004; von Büren et al. 2019), and colony density has been linked to reinvasion flows of mites in neighbouring Germany (Frey and Rosenkranz 2014). These mite flows among colonies probably mask colony effects.
Finally, since infestation measurements of V. destructor require precise protocols, it should be verified whether, despite training provided by associations, all beekeepers performed measurements with sufficient accuracy. However, no heritability of infestation level was observed in a research population of similar size either (Maucourt 2019). This may indicate that under certain conditions, even when evaluators are trained, heritability of this trait is extremely low, possibly due to environmental effects. Even if heritabilities of V. destructor infestation have been low so far, beekeepers should continue to measure infestation levels to support monitoring of test networks and ensure good sanitary conditions in test apiaries (effective treatments between test periods). Mortality due to poor infestation management by beekeepers can be avoided, allowing a maximum number of queens to be fully tested, which are very valuable as potential drivers of genetic progress.
In the case of MEL, estimated heritabilities for hygienic behaviour were comparable to values found in the literature (Büchler et al. 2008; Ehrhardt et al. 2010; Facchini et al. 2019; Maucourt 2019). Lower values in SAR may be due to the evaluation protocol for this trait. More precise data might be obtained if values were expressed as the number of cleaned cells per hour (test duration is not known so far), following the MEL approach. We found no association between hygienic behaviour and summer infestation level of V. destructor. Hygienic behaviour is used as a criterion to improve brood health, but many beekeepers may associate this trait with the objective of selecting V. destructor-resistant colonies, for example through Varroa-sensitive hygiene. The link between hygienic behaviour and V. destructor resistance is controversially discussed (Leclercq et al. 2018). In the present case, it is unlikely that selection based on hygienic behaviour would lead to improved colony survival in the context of V. destructor infestations. This may be because other resistance traits (e.g. grooming, Varroa-sensitive hygiene, suppressed mite reproduction, swarming) or, more likely, a combination of these traits may be more effective defence mechanisms under Swiss conditions, as observed in some naturally resistant populations (Locke 2016). It is unclear how hygienic behaviour could contribute to reducing the prevalence of European foulbrood, as the prevalence of this endemic disease is also strongly influenced by colony density (von Büren et al. 2019). For these reasons, the ability of hygienic behaviour to limit the prevalence of chalkbrood or European or American foulbrood must be evaluated in the Swiss context. In the meantime, selection for hygienic behaviour should not have adverse effects on honey yield, as the two traits show a low positive correlation (0.13 and 0.11 for MEL and SAR, respectively).
Model testing showed that EBVs were unbiased for almost all traits, particularly for traits with heritability above 0.1. A notable exception was quality management in the SAR dataset, where all estimates were biased. This could be related to the lack of performance data on the female side, as most colonies used for queen breeding are empirically selected by beekeepers responsible for the lines. We suggest that in the future, performance information on the female side should be added, for example by using tested queens as mothers to produce the next generation. In the MEL dataset, EBVs for hygienic behaviour obtained by WM were estimated more precisely than those obtained by QM. This can be explained by the fact that hygienic behaviour depends on the ability of workers to detect dead brood, a task that does not involve the queen. In this study, we were unable to jointly estimate worker and queen effects. By adding data to performance files in the coming years, one objective of the associations could be to jointly evaluate both effects in the future, once convergence of the maximum likelihood algorithm can be achieved. This is expected to lead to a more precise estimation of the breeding objective when defined as the sum of worker and queen effects in a joint analysis.
Based on our results, beekeepers could select traits with the highest heritabilities and periodically calculate genetic progress to verify whether selection leads to the desired results. Other parameters, such as generation interval, queen mortality due to management issues, and selection differentials, could also be considered to optimise genetic progress. Standardisation and quality of data collection should be frequently checked, as they are crucial for dataset quality and for obtaining better genetic estimates. Furthermore, standardisation will help compare results with other studies in the future. This also applies to methods for estimating breeding values and heritabilities. Traits related to infestation by V. destructor should be locally re-examined to explore the genetic background of honey bees for resistance selection under Swiss conditions.
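A first-order way to relate heritability, selection differential, and generation interval to expected genetic progress is the textbook breeder's equation, sketched below. It is a simplification that assumes selection on the colony's own phenotype; it is not the BLUP-based evaluation used in the study, and the function names are hypothetical.

```python
def selection_response(h2, selection_differential):
    """Expected response per generation: R = h2 * S (breeder's equation)."""
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    return h2 * selection_differential


def response_per_year(h2, selection_differential, generation_interval):
    """Expected response per year: R divided by the generation interval in years."""
    if generation_interval <= 0:
        raise ValueError("generation interval must be positive")
    return selection_response(h2, selection_differential) / generation_interval
```

With a heritability of 0.25 and selected parents that exceed the population mean by 2.0 units, the expected response is 0.5 units per generation, or 0.25 units per year if a generation takes two years. With a heritability of zero, as estimated here for V. destructor infestation, the expected response is zero whatever the selection differential.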