Hi Diogo,


Hope these are useful. I had to be a little briefer than I might otherwise have liked due to time constraints.

Best,
Tim


1. Diogo on behalf of several groups. The students were intrigued by the use of Nc as opposed to Ne throughout the paper. Your rationale seems to be that it is precisely the set of factors shaping Ne which you want to get at, and Nc can be assumed to represent an upper limit on Ne. On the other hand, the disparities between Nc and the population size shaping levels of diversity in long timescales are a challenge to the analysis, since it is the Ne which will modulate the impact of selection. Thus their question is whether the paper could have been developed using an Ne proxy (neutral diversity in high recombination regions, for example?) or whether this would have introduced an element of circularity in the analysis.


One of the main motivations for our paper was the perspective piece by Leffler et al (http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001388), which proposed that a possible reason why census population sizes vary dramatically more among species than effective population sizes is that selection reduces diversity more in large (census) populations. This is an old idea going back to the original Maynard Smith and Haigh (1974) hitchhiking paper, and it has more recently been developed by John Gillespie in his series of papers on genetic draft (where he shows that, in the limit of frequent enough selective sweeps, Ne becomes completely decoupled from Nc and depends only on the rate of sweeps). Testing this idea therefore requires comparing Nc to the effects of linked selection (since Ne is itself potentially influenced by linked selection).
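To illustrate the circularity concern raised in the question: a diversity-based Ne proxy is typically obtained from the standard neutral expectation for diploids, pi = 4 * Ne * mu. The sketch below is only an illustration with made-up values (the function name and the values of pi and mu are assumptions, not numbers from the paper). It shows that anything that lowers neutral diversity, including linked selection, lowers the inferred Ne by the same factor.

```python
def ne_from_diversity(pi, mu):
    """Solve the diploid neutral expectation pi = 4 * Ne * mu for Ne."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


# Hypothetical values, chosen only for illustration.
mu = 1e-9              # assumed per-site, per-generation mutation rate
pi_observed = 0.01     # assumed neutral diversity

ne_estimate = ne_from_diversity(pi_observed, mu)

# If linked selection had removed half of the neutral diversity,
# the same formula would report half the Ne.
ne_if_diversity_halved = ne_from_diversity(pi_observed / 2, mu)

print(ne_estimate, ne_if_diversity_halved)
```

So an Ne proxy built from diversity already contains the effect we are trying to measure, which is why we compare against Nc instead.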


2. Group 7I. In your work you used proxies such as body size and species range to infer Nc which, in turn, is informative about Ne (and therefore the effects of linked selection). Recent research has shown that life-history traits related to the classical r/K strategies (including body size and propagule size) explain a lot of the neutral variation (https://doi.org/10.1038/nature13685). In light of this result, do you think the r/K strategy could provide a better estimate for the impact of selection instead of, or in addition to, body and range size?


In their paper, Romiguier et al are primarily concerned with trying to understand demographic correlates of genetic diversity, which is a bit orthogonal to our paper. Specifically, they are trying to test whether intrinsic biological factors (e.g., life history traits, body size, etc) or contingent factors (species range, etc) are more important predictors of genetic diversity. They find that genetic diversity is much more strongly predicted by intrinsic compared to extrinsic factors. One hypothesis is that r/K strategies alter the sensitivity of species to population crashes associated with environmental disturbances, which could impact Ne quite strongly.


However, this is a somewhat different question than the one we are trying to address in our paper. Romiguier et al are largely not concerned with the impacts of selection. It would be an interesting question to ask whether life history strategies are associated with stronger or weaker linked selection (a particularly interesting factor here would be selfing in plants and some animals; one can also imagine a model where selection is more efficient in r-species if it acts on the huge numbers of offspring produced each generation), but this is somewhat different from either our paper or the Romiguier paper.


3. Groups 9I and 2N. Figure 2 from your paper presents the correlation between neutral diversity and recombination rate for different classes of organisms (which have variable Nc), showing that classes with higher Nc have a stronger correlation between neutral diversity and recombination due to a stronger effect of selection. We suppose, based on that figure and comparing animals and plants, that the impact of selection would be higher in animals due to the higher correlations. However, in Figure 3, the line representing the impact of selection in plants is in general higher than that of animals. How can we explain this difference? Would this be because of the particular groups of plants and animals used to compose Figure 3, or is there some other unnoticed aspect at play?


The species used are the same in both Figure 2 and Figure 3.


However, the difference here is likely due primarily to the fact that, in many plant species, gene density and recombination rate are strongly correlated, which makes estimating a correlation coefficient tricky (since we expect more linked selection in regions of low recombination or high gene density, but if high recombination = high gene density these effects tend to counteract each other). So the correlation coefficients are likely underestimated. Another important factor is that many plants are at least partially selfing, which needs to be accounted for but is difficult to do with high precision. These two factors make it difficult to directly compare whether plants or animals have “more linked selection”, and also explain the differences in the figures.
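To make the confounding concrete, one standard way to separate two correlated predictors is a partial correlation. The sketch below uses a toy dataset constructed by hand (all values and function names are assumptions for illustration, and this is not necessarily the procedure used in the paper). Diversity increases with recombination rate and decreases with gene density, while recombination rate and gene density are themselves positively correlated.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after controlling for z (first-order formula)."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy genomic windows (hypothetical): gene density tracks recombination rate.
recombination = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_density = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
# Diversity rises with recombination and falls with gene density.
diversity = [10.0 + r - g for r, g in zip(recombination, gene_density)]

raw_r = pearson(recombination, diversity)
partial_r = partial_correlation(recombination, diversity, gene_density)
print(raw_r, partial_r)
```

In this toy case the raw correlation between recombination and diversity is weak even though diversity depends strongly on recombination once gene density is held fixed, which is the sense in which the plant correlation coefficients are likely underestimated.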


4. Group 3I. The data you collected make clear that big populations of small organisms have their neutral diversity reduced by natural selection due to linkage more intensely than small populations of large organisms. Would that pattern be the same for species in which body size varies significantly with the stage of the life cycle, e.g. young tadpoles and frogs? For example, in this case we would have very large population sizes associated with the non-adult stage, and using the adult size as a proxy for population size would underestimate the population size impacting linked selection.


This is an interesting question that gets into more complicated models of life cycle selection (e.g., at which stage of the life cycle selection occurs). What matters of course is the population size at the life stage where selection occurs, so if there is an opportunity for selection among many thousands or tens of thousands of offspring of which only a few survive, the relevant population size of r species may be much bigger than expected. This will also be true for, e.g., mass spawners where the many offspring are not necessarily siblings. If instead losses are largely random during the early life stages, the relevant population size may be that of the adults. This relates to the question above about r/K species and would be an interesting question for future work.


5. Group 6I. Throughout the text, it’s discussed that genetic hitchhiking would substantially limit neutral diversity in species that have bigger population sizes. It’s also suggested that more species should be studied and that more tests should be carried out with the objective of better understanding the genetic hitchhiking effect in a broader taxonomic background. This would be helpful in answering one of the questions brought up in the paper, namely the need to elucidate the relative contributions of BGS and HH, an issue that is still only beginning to be explored. This question seems rather interesting since BGS and HH are very distinct processes that end up causing similar effects. How do you expect to be able to distinguish the contributions of each process? Or, do you even think it’s possible to differentiate them? We hypothesized that a possible approach would be to study the distribution of fitness effects for various species; would you agree?


There are a number of groups working on this -- one clear prediction is that HH should result in dips in diversity around substitutions (on average), but BGS won’t, since under HH diversity is reduced due to a fixation (sweep) whereas in BGS diversity is reduced because of lack of fixation (removal of deleterious alleles). This is discussed extensively in Elyashiv et al (http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1006130) and references therein.
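As a rough sketch of what this prediction looks like in practice, one can average diversity as a function of distance from each substitution and check for a dip at distance zero. The code below is a minimal illustration on a hand-made toy sequence (the function name, the per-site diversity values, and the substitution position are all assumptions). The actual inference in the literature is considerably more involved.

```python
def mean_diversity_by_distance(diversity, substitution_sites, max_distance):
    """Average diversity at each distance 0..max_distance from substitutions.

    diversity: per-site (or per-window) diversity values along a chromosome.
    substitution_sites: indices into `diversity` where substitutions occurred.
    """
    totals = [0.0] * (max_distance + 1)
    counts = [0] * (max_distance + 1)
    for site in substitution_sites:
        for d in range(max_distance + 1):
            # Look on both sides of the substitution; the set avoids
            # counting the focal site twice when d == 0.
            for pos in {site - d, site + d}:
                if 0 <= pos < len(diversity):
                    totals[d] += diversity[pos]
                    counts[d] += 1
    return [t / c if c else float("nan") for t, c in zip(totals, counts)]


# Toy data (hypothetical): flat diversity with a dip centred on site 10.
toy_diversity = [1.0] * 21
toy_diversity[10] = 0.2
toy_diversity[9] = toy_diversity[11] = 0.5
toy_diversity[8] = toy_diversity[12] = 0.8

profile = mean_diversity_by_distance(toy_diversity, [10], max_distance=3)
print(profile)
```

Under HH the averaged profile is expected to rise with distance from substitutions, as in the toy data; under pure BGS no such dip centred on substitutions is expected.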


The distribution of fitness effects of substitutions (as opposed to new mutations) is an interesting topic, and relates to another prediction that species with large population sizes (Nc) should have a DFE that skews more towards beneficial mutations (e.g., alpha, the fraction of substitutions that are fixed due to positive selection, should be higher). However, DFE estimates by themselves are not a direct measure of HH vs BGS, since BGS is associated with mutations that don’t fix (in the cartoon example of the case where all mutations are either weakly deleterious or strongly advantageous, both HH and BGS will likely be major contributors to shaping patterns of diversity, but DFE of substitutions for large population size species will be strongly skewed towards beneficial mutations since the weakly deleterious stuff never fixes).
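For reference, alpha is commonly estimated from a McDonald-Kreitman-type table of divergence and polymorphism counts as alpha = 1 - (Ds * Pn) / (Dn * Ps). The sketch below uses hypothetical counts (the function name and all numbers are assumptions for illustration, not data from any study discussed here).

```python
def mk_alpha(dn, ds, pn, ps):
    """McDonald-Kreitman-type estimate of alpha.

    dn, ds: nonsynonymous and synonymous substitutions (divergence).
    pn, ps: nonsynonymous and synonymous polymorphisms.
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, chosen only for illustration.
alpha = mk_alpha(dn=60, ds=100, pn=30, ps=100)
print(alpha)
```

Note that this summarizes substitutions only; as discussed above, it says nothing directly about the deleterious mutations that never fix and that drive BGS.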


6. Group 5N. Karasov and coworkers (https://doi.org/10.1371/journal.pgen.1000924) found that the recent effective size of Drosophila is greatly underestimated by neutral diversity (predominantly shaped by long term processes), and proposed this may be a common pattern. HH and BGS could also contribute to further reduce the neutral genetic diversity (used to infer effective population sizes), as presented by the analysis in your paper. Taking these selective and demographic processes together, we would like to ask if it is likely that current proxies utilized for population size (neutral diversity) could in fact be even further away from recent effective population sizes than we previously imagined. Will this have an important impact in understanding molecular evolution?


In fact, I suspect there is “positive epistasis” between these two patterns: in species with large recent effective sizes, selection is very efficient potentially leading to lots of HH, which will reduce neutral diversity more than in species with little HH. D. melanogaster is a classic example of this, where recent “adaptation Ne” is huge and HH is also a major factor in reducing diversity.


This definitely has substantial implications for understanding molecular evolution, in particular for estimating how frequent selective sweeps are, and the probability of getting recurrent or “soft” sweeps. This has been discussed extensively in recent papers from Dmitri Petrov’s lab, Andrew Kern’s lab, and Pleuni Pennings’ lab.


7. Group 7N. In your paper relations are established between spatial distribution and size with the impact of selection on neutral diversity. However, due to the difficulty of obtaining data (especially recombination maps), you only worked with two eukaryotic multicellular groups. After your paper was published, has there been any work supporting or suggesting that this evolutionary pattern (higher impact of linked selection in high Nc species) is observed in other groups of eukaryotes or even prokaryotes?

Species without sexual reproduction and recombination (e.g., all prokaryotes) present a whole different set of challenges (e.g. clonal interference, Muller’s ratchet), which has been studied by many groups (e.g. see recent work from Michael Desai’s lab); linked selection as we discuss in the paper is not really applicable since it crucially depends on recombination.


As far as other eukaryotic groups go, there are two major issues. The first is that outside animals and plants sampling is often poor. The second is that many species have much more complicated life cycles (e.g., many fungi can reproduce clonally or by meiosis and are not obligately sexual), which greatly complicates things like “recombination rate”. If sexual reproduction occurs once every 5,000 generations, what is the crossing over rate? It is not clear how to answer that question.


8. Diogo. For the analyses involving the regression of impact of selection on measures of Nc, I was surprised reviewers didn't request that methods accounting for phylogenetic correlation be used. Is there any reason for this?


From our point of view, the main reason we didn’t attempt this is practical. Phylogenetic regression requires a tree with branch lengths; for the enormous range of species we included in our analysis, it is not clear this is even possible to produce in a sensible way that is not prone to lots of assumptions; even if we split the dataset into plants & animals the phylogenetic distance spanned in each group is enormous.

An alternate approach used in e.g. the Romiguier et al paper is to just take family means. However, we rarely had more than a few species at the family level, meaning that this would probably not change much.
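For concreteness, taking family means just amounts to collapsing species-level values to one data point per family before fitting the regression. A minimal sketch, with hypothetical species, families, field names and values (none of these are from our dataset):

```python
from collections import defaultdict
from statistics import mean


def family_means(records, fields):
    """Average the given numeric fields over all species in each family."""
    by_family = defaultdict(list)
    for rec in records:
        by_family[rec["family"]].append(rec)
    return {
        family: {f: mean(r[f] for r in recs) for f in fields}
        for family, recs in by_family.items()
    }


# Hypothetical species-level records, for illustration only.
species = [
    {"species": "sp_a", "family": "Fam1", "log_size": 1.0, "impact": 0.10},
    {"species": "sp_b", "family": "Fam1", "log_size": 3.0, "impact": 0.30},
    {"species": "sp_c", "family": "Fam2", "log_size": 5.0, "impact": 0.05},
]

means = family_means(species, fields=["log_size", "impact"])
print(means)
```

With only one or two species in most families, as in our dataset, the family means are nearly identical to the species values, which is why this would probably not change much.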


Finally, at the time we were doing this analysis and submitting this work, many large-scale phylogenetic studies had not yet been published, meaning that even a much more limited sampling (e.g. just doing PGLS within mammals) would not have been easy.


However, it would make sense in a larger analysis of new data to include PGLS or related methods for at least subsets of the data.



Last updated: Wednesday, 25 Apr. 2018, 17:50