The evolutionary rate of any protein-coding gene varies over time and hence between species. It has been observed that some genes have rates that covary with those of other genes and that they tend to be functionally related. Evolutionary rate covariation (ERC) is a measure of that correlation in relative evolutionary rates1 (RER) across time. If one considers the set of branches relating the orthologous copies of a given gene, an ERC value measures how correlated its branch-specific rates of amino acid divergence are with those of another gene.

Protein pairs that have high ERC values (i.e., high rate covariation) are often found to participate in shared cellular functions, such as in a metabolic pathway2 or meiosis3 or being in a protein complex together. Many co-functional proteins physically interact, such as those in protein complexes and enzyme/substrate interactions 46. For example, SMC5 and SMC6 form a complex and have a shared function in the spatial organization of chromatin. They also have similar rates of evolution across a phylogeny of yeast species (Figure 1A). Their strong rate correlation is quantified as a Fisher transformed correlation coefficient (ftERC = 24.944); this value for SMC5-SMC6 is highly elevated because the expectation for non-correlated pairs is zero. Yet, which forces led to this high correlation? Given the strong association between physical interactions and maintenance of functionality, the relative contribution of either is difficult to dissect for SMC5 and SMC6, and also for most physically interacting proteins.

Overview of experimental schema and hypotheses. Proteins that share functional/physical relationships have similar relative rates of evolution across the phylogeny, as shown in (A) with SMC5 and SMC6. The color scale along the bottom indicates the relative evolutionary rate (RER) of the specific protein for that species compared to the genome-wide average. A higher (red) RER indicates that the protein is evolving at a faster rate than the genome average for that branch. Conversely, a lower (blue) RER indicates that protein is evolving at a slower rate than the genome average. (B) Suppose the correlation in relative evolutionary rates between two proteins is due to compensatory coevolution and physical interactions. In that case, the ERC value would be higher for just the amino acids in the physically interacting domain. (C) Outline of experimental design. Created with

There are two hypotheses addressing the relative contributions of physical interaction versus co-function to correlated evolutionary rates. The idea that physical interactions contribute more to correlated evolutionary rates hinges on the maintenance of proper binding79. Under this hypothesis, a mutation in one binding partner will result in a compensatory mutation in the other, i.e. coevolution, consistent with the “lock and key” model for maintenance of physical interactions 7,9. If the physical interaction hypothesis holds, then there is great interest in using rate covariation tools, such as ERC, to predict quaternary structure and connectivity in protein complexes. However, there is a competing hypothesis that diminishes that potential utility. The second hypothesis is that rate correlations are primarily the result of parallel changes in selective pressures acting upon all genes in that function. Shared selective pressures can result in correlated rates of evolution due to several underlying causes. These causes include relaxation of constraint on a function, which results in accelerated RERs for the gene set providing that function. Similarly, increased importance of a function could lead to increased constraint and result in slower evolutionary rates for all genes involved. Shared changes in selective pressure can also result from adaptive evolution driving change in a function, and thereby to its genes.

There have been previous studies that weighed whether physical coevolution is the strongest contributor to rate covariation6,10,11. The conclusions drawn from these studies are ambiguous and, in some cases, contradictory. Furthermore, the studies had differing sample sizes and used different protein units to examine the physical interaction, such as surface residues6, the binding neighborhood11, and protein domains10. It is difficult to compare the conclusions between these different experimental designs. The different units of the protein will, by nature, give different results. For instance, surface versus buried residues are under different constraints, making it difficult to assess whether the physical interaction is driving the change or simply the protein structure. Whereas examining protein domains gives a view of how the entire three-dimensional structure of the protein is potentially affected by changes in the binding partner. Given the contradictory conclusions and lack of statistical power in previous studies, the overall question regarding the contribution of coevolution to the overall rate covariation remains unanswered.

In this study, we test the contribution of compensatory physical coevolution to rate covariation by measuring ERC on a large dataset of 343 yeast species. Specifically, we ask whether physically interacting domains have higher ERC than domains of the same proteins that do not physically interact (Figure 1B). Since domains can be analyzed separately, their ERC can be easily quantified in a practical workflow (Figure 1C). By looking only within complexes rather than across all proteins with annotated physical interactions, we normalize signals from functional associations, assuming that proteins in the same complex will be under the same functional constraints and, therefore, the same selective pressures. Ultimately, we show a weak contribution from physical coevolution and, therefore, poor predictability of physical interactions based on ERC scores, regardless of complex size or average complex ERC. We further show that the ranking of physically interacting domains across the complex has little generalizability in predicting which domains or proteins physically interact.


Both protein pathways and complexes have elevated ERC

Protein pathways and complexes are both functional units of the cell; however, complexes are defined by their physical interactions, while pathways may contain many proteins that do not physically interact at all. To investigate the discrepancy between contributions to ERC signal from co-function and physical interaction, we used a dataset of 343 evolutionarily distant yeast species. This dataset started with 12,552 orthologous genes, which we parsed into annotated pathways from KEGG12 and YeastPathway35 and protein complexes from the EMBL-EBI yeast complex portal13.

In general, ERC values for pathways and complexes were both high and had similar distributions after accounting for their sizes (Figure 2A). Almost all complexes and pathways had mean ERC values significantly greater than a null distribution consisting of random protein pairs. While protein complexes appear to have higher mean ERC scores than the pathways, the members of a given complex are also co-functional, making interpretation of the relative contribution of physical interactions to the average ERC score difficult.

Protein complexes and cellular pathways have significantly high average ERC. (A) The mean ERC values for 617 protein complexes (purple) and 125 cellular pathways (orange) versus the number of members contributing to the score. (B) Heat maps of the ERC scores for each protein pair in the motor proteins pathway (left) and SLIK complex (right). ERC for members of the motor proteins pathways that physically interact was set to NA (gray). (C) Scatter plots of the relative evolutionary rates for the top scoring pair from the motor proteins pathway (orange) and SLIK complex (purple).

To illustrate this difficulty at a more granular level, we present two examples, the SLIK protein complex and the motor protein pathway, both of which have significantly high average ERC values (p <0.001) (Figure 2B). It is also notable that when the ERC values for any physical interactions within the motor protein pathway are removed, it continues to have a significantly high average ERC. While the highest scoring protein pair in the SLIK complex, TAF5 and TAF6, physically interact and have a highly elevated ERC score (in top 1% genome-wide; Figure S2), the highest scoring pair in the motor protein pathway, TUB3 and TPM1, also have a highly elevated ERC score but do not physically interact (Figure 2C). This observation, which reflects the global pattern, makes it difficult to determine if the higher score between TAF5 and TAF6 is indicative of a stronger contribution from physical interaction to the ERC value or if co-functionality is the main driver. These observations at both the global level and in individual complexes and pathways prompted further investigation into individual complexes to determine if the physical interactions within a complex have higher average ERC scores.

ERC does not distinguish physical interactions from non-physical interactions within a given protein complex

To more directly test whether the signal contributing to the high rate covariation comes from the coevolution of physical interactions, we divided the proteins in a complex into their domains. A domain-level ERC analysis allowed us to directly contrast physically interacting domains with non-physically interacting domains. To maintain the highest level of confidence in the analyses, we only selected complexes whose members had well-defined domain boundaries and annotated physical interactions between complex members (Methods). This resulted in a dataset of 14 complexes spanning functions from transcription and translation to autophagosome formation. We also added three complexes from Jothi et al.10: Mitochondrial F1-ATPase, SEC23/24 heterodimer, and exportin CSE1 with substrate (Table 1).

We split the proteins from the 17 complexes into their annotated domains (Methods) and calculated ERC between all domains in a complex. The average ERC for the physically interacting and non-physically interacting domains for each complex were compared. Of the 17 complexes, 12 had higher average ERC for the physically interacting than the non-physically interacting. However, when we look at an individual complex such as the COMA complex (Figure 3), the physical interactions are some of the highest ERC scores but not the actual greatest. Likewise, some of the physically interacting domains also have some of the lowest ERC scores. To classify how often physically interacting domains have greater ERC and the statistical significance of such results, we proceeded with a rank-based analysis.

Recreation of figure 1B with the COMA complex. The table shows the ERC values for each domain pair as labeled in the cartoon on the right. The domain pairs with physical interactions are highlighted in green.

The rank-based method, receiver operating characteristic (ROC) curve analysis, ranks the domain pairs based on their ERC score and calculates the true positive rate (TPR) and false positive rate (FPR) based on whether each pair is physically interacting (positive) or not (negative). The relationship between TPR and FPR is then plotted on a curve that scores the ability to rank a true positive above a false positive; the curve is summarized by the area under the curve (ROC-AUC). In our analysis, the ROC-AUC gives the proportion of times a random pair of physically interacting domains has a higher ERC score than a random pair of non-physically interacting domains where a value of 1 would indicate that all physical interactions ranked above all non-physical interactions.

Twelve of the seventeen complexes had a ROC-AUC greater than 0.5. Since the random expectation would be that half would exceed 0.5, this is a significant excess (Binomial test, P = 0.0245). Moreover, five complexes had AUCs greater than 0.7 (Figure 4). These results indicate that physically interacting domains tend to have a higher ERC than non-physically interacting domains in those complexes. Likewise, after performing a one-tailed Mann-Whitney U test, four complexes ranked the physically interacting domains significantly higher than expected from a Gaussian distribution at an alpha of 5%, which is also a significant excess (Binomial test, P = 0.0012). Once again indicating that there is a significant amount of ERC signal coming from physically interacting domain pairs within these complexes.

ROC curve analysis of all 17 protein complexes. 13 of the 17 complexes have a ROC-AUC greater than 0.5. The SEC23/24 complex (bright green) has the highest ROC-AUC at 1, and the NUP84 complex (marigold) has the lowest AUC of 0.247. One-tailed Mann-Whitney U test, p < 0.05*, p<0.01**

We then applied a third test for a general pattern among all 17 complexes. We generated a null ROC-AUC distribution by permuting the location of the positive class along the ranked list for each complex (Methods). We compared the true average ROC-AUC from the 17 complexes (0.596) to the permuted distribution of average ROC-AUCs, which resulted in a permutation p-value of 0.08. The lack of significance in the global permutation test shows that the global signal for the contribution of physical coevolution to ERC is weak enough that it can be masked by the majority of complexes that show no evidence in favor.

The power of ERC to detect physically interacting domains is not generalizable in single protein pairs

The wide range in ROC-AUCs from the previous analysis suggested the possibility of confounding factors masking the signal. Potential confounders could be: members of the complex moonlighting in other pathways14 or core membership versus peripheral membership within the complex itself15. To test for the effect of physical interaction free from these confounders, we broke each complex down and only compared the domains within two proteins at a time. Thus, the variation introduced by these potential confounders will be consistent across all domains between a single protein pair, and comparing only their domains pair-by-pair could reveal the contribution of physical coevolution to the ERC signal. This analysis also allowed us to determine the significance of a complex having a physical interaction ranked first. If the ranking is significant, it would indicate a future use case of ERC to predict physical interactions.

To determine the significance of the physically interacting domain ranking within each pair of complex proteins, we calculated the proportional rank of the physically interacting domain pair versus all other domain pairs (Methods). This metric gave us the proportion of times the physical interaction ranked higher than non-physical interactions. The complexes had individual protein pair proportional rank values spanning the entire range of 0 to 1 without clustering at either extreme (Figure 5). These results indicate that even within the same complex, there is a wide variation in how strongly the physical interaction correlates with high ERC.

Protein-vs-protein physically interacting domains do not consistently rank higher than non-physically interacting domains. The domains from individual protein pairs within each complex were ranked and the proportional ranking of the physically interacting domain was calculated. Each dot represents the proportional rank of the interacting domain pair for a protein pair, with colors representing the complexes. A score of 1 indicates that the physically interacting domains were ranked first. A score of 0 indicates that the physically interacting domains were ranked last. Permutation test, p<0.05 (bold), p<0.01 (bold and underlined)

We then looked at the complex average proportional rank to determine if there was some overarching signal. We took the average for all protein pairs within a complex to get the mean complex proportional rank (Supplementary data). The significance of the complex proportional rank was determined by generating a null distribution of proportional rank values for each protein pair and randomly sampling from that to generate a complex-wide null distribution. The observed average for each complex was compared to the null. At a permutation p-value of 0.01, only two complexes had significantly high proportional rank, indicating a strong contribution from physical interactions: CUL8-MMS1-MMS22-CTF4 E3 ubiquitin ligase and exportin complexes (Figure 5). Three additional complexes were significant at a permutation p-value of 0.05: MCM, ORC, and ESCRT-I. This leaves over two-thirds of the complexes with no significant contribution from physical interactions to the ERC signal.

Given that some of the complexes ranked the physically interacting domains significantly higher in the proportional rank test suggests that compensatory co-evolution does contribute to the ERC signal. However, the inconsistency of the ranking indicates that there is not a strong enough signal to confidently call an interaction physical or not and would be of little value to an experimentalist wanting to infer interacting domains. Ultimately, the contribution from physical interactions on the ERC signal is not strong enough to determine if a high-ranking protein pair is associated through physical interaction versus co-functionality.


Given the differing conclusions reached in previous studies about the strength of contribution from physical interaction to correlated evolutionary rates 6,10,11 we aimed to provide a robust conclusion with improved experimental design and sample size. Upon the addition of hundreds of species to the analysis, we were able to look at evidence of compensatory coevolution with increased power at a number of scales: whole protein, domains within complexes, and domains between individual protein pairs. We propose that using entire domains as a unit of study, rather than amino acids at physical interfaces, captures structural changes that could still impact the binding and wouldn’t be captured by just looking at the binding residues themselves. These changes would still be selected for through compensatory coevolution, and relegating them to the non-interacting category could potentially mask correlated signals.

We found that compensatory coevolution due to physical interaction contributes weakly to evolutionary rate covariation and only in some complexes. Looking across all complexes in our study, a proportion higher than random chance showed elevated ERC between interacting domains, this result was reflected in 12 of the 17 complexes with ROC-AUCs over 0.5 and the 4 complexes with individually significant rankings. Moreover, when we examined only single protein pairs we found a similar excess of high scoring physically interacting domains. This indicates that there is a non-negligible signal from physical interaction contributing to evolutionary rate covariation. Evidence for physical coevolution however was tempered by a global permutation test, which did not reach significance, indicating that this inference is sensitive to approach and further underlines the relatively weak contribution of physical coevolution. In light of this weak contribution and inconsistency between individual protein complexes, the practical application of a tool such as ERC to provide actionable hypotheses for physically interacting domains is not advisable, given the uncertainty as to which complexes would have high ERC between physically interacting domains.

Previous studies attributed varying degrees of evolutionary rate covariation signal to physical interactions between proteins. On one extreme, Hakes et al.6 found no evidence that the physical interaction interface between two proteins had a greater correlation of evolutionary rates than the whole protein or just surface residues. They concluded this after examining surface and potentially interacting residues across 32 complexes that had orthologs in at least 12 species for each complex. On the other extreme, Jothi et al.10 concluded that there was a strong enough signal to predict which domains of a protein complex interact with significant accuracy, however their analysis was limited to just 3 protein complexes. Because of their conclusions, we included their complexes in our own study and achieved similar rankings of the physically interacting domains. However, only one of their complexes was statistically significant in our analysis which we attribute to a lack of power given that one of the complexes only had one physically interacting domain pair out of twenty-four pairs. Kann et. al11 reached a similar conclusion to our study. They found that binding neighborhoods have a higher evolutionary rate covariation on average than random non-binding sequences of the same length, ultimately concluding that while physical interactions are not the sole contributor to evolutionary covariation, they still contribute strongly enough to be detected.

These previous studies agreed that compensatory coevolution is not the sole force behind evolutionary rate covariation but they came to differing conclusions as to how much it contributes. This study concludes that there is a detectable but weak contribution from physical interactions that is not strong enough to confidently predict which domains physically interact. This study also found that the contribution of physical interaction was limited to a clear minority of complexes despite the high power lent from using hundreds of species to calculate ERC; this inconsistent effect across complexes might explain why previous studies and this one found varying degrees of contribution, since results could have depended on which complexes were chosen and in which species. Among the complexes we found to have a physical coevolution component, there is no obvious reason why these specific complexes would exhibit stronger coevolution between physically interacting domains when compared to other complexes. We hypothesize that while compensatory coevolution is a known phenomenon 4,16, amino acid changes resulting from it are relatively rare and are unlikely to contribute greatly to the general ERC signal between co-functional proteins. Rather, the greater and consistent contribution comes from shared evolutionary pressures and hence shared levels of constraint, the result of which was visible in our study of genetic pathways that do not physically interact (Figure 2).


Calculating evolutionary rate covariation (ERC)

Evolutionary rate covariation is calculated by correlating relative evolutionary rates (RERs) between two gene trees. The relative evolutionary rate is the rate at which a branch on a gene tree changes compared to the genome-wide average and is calculated as described by Kowalczyk et al.1. The method limits comparisons to gene trees that share at least 15 species and requires that all trees have the same topology. After calculating the correlation between RERs for each gene pair, the correlation values are Fisher transformed. Fisher transformation normalizes the ERC based on the number of branches that contributed to that correlation. The full ERC pipeline can be found at (Github).

Complex and pathway ERC permutation test

We took the entire yeast complexome from the EMBL complex portal13 and curated yeast pathways from KEGG12 and YeastPathway35. These lists were then compared to our dataset and pared down to complexes/pathways that had greater than 4 members. This resulted in 617 protein complexes and 125 pathways.

We then ran 1000 permutations to find the significance of the ERC scores between complex/pathway members. The null distribution was generated by sampling the same number of random genes in the complex/pathway from the entire dataset. The average ERC from the complex/pathway was then compared to the null distribution to get a p-value.

Preparing the protein complexes

14 complexes were chosen by searching for protein complexes within the EMBL yeast complex portal with crosslink, cross-link/mass spectrometry, or crystallographic data. Three additional complexes were added based on Jothi et al.10 Mitochondrial F1 ATPase, SEC23/24 heterodimer, and the exportin CSE1P complexed with cargo. The physical interactions were collected from the literature sources for each complex (Supplementary data).

Each protein was subdivided into domains by running the amino acid sequence through interpro scan17. New gene trees were generated for each domain using phangorn18, and ERC was run to calculate an all-domain-by-all-domain matrix. We generated domain-vs-domain ERC matrices for each of the seventeen complexes (Supplementary data) that were used for all analyses.

Generating ROC curves

ROC curve analysis was performed using the PRROC package19. First, the ERC matrices for each complex were turned into pairwise edge lists with columns [GENEA, GENEB, ERC, class] and ranked by ERC value. The positive class was defined as the physical interactions found in the primary reference for each complex and denoted with a “1”. The negative class was any protein domain pair without annotated physical interactions, denoted with a “0”.

The statistical significance of the ROC-AUC was determined by a one-tailed Mann-Whitney U test, using the relationship between AUC and U defined by

Where n0 is the number of non-physically interacting domains and n1 is the number of physically interacting domains.

ROC-AUC permutations were calculated by randomly shuffling the order of physical interactions within each complex ranked list 1000 times. The permuted AUC was calculated using the same pipeline as described above. The full study analysis was performed by taking the average from each of the 1000 permutations across all complexes.

Calculating proportional rank of physically interacting domain pairs versus all other domain pairs

To test how often ERC ranks the physical interactions higher than non-physical we compared each pair of proteins individually, where each protein’s domains were only compared to one other protein’s domains using the scores in Supplementary data if they shared a physical interaction somewhere along the full protein. For example, in the COMA complex, the domains for OKP1 were only compared to the domains of AME1 because OKP1 does not share a physical interaction with either CTF19 or MCM21. The matrix was then ranked by ERC score. The proportional ranking of the physically interacting domain pair was calculated by:

A score of 1 indicates the physically interacting domain pair is ranked first. A score of 0 indicates the domain pair with a physical interaction was ranked last. If the protein pair had more than one domain pair that had a physical interaction, the average proportional rank score was taken. The observed complex proportional rank score is the average of each protein’s score (Supplementary data).

The permutation p-value for the complex proportional rank score was calculated by randomly shuffling the ranking of the physical interaction(s) between two proteins 1000 times and calculating the proportional rank each time to generate a null distribution. The null distribution for the entire complex was calculated by randomly selecting a proportional rank value from the null distribution of each protein pair from the complex and averaging it. The observed complex average proportional rank was then compared to this null distribution to calculate the p-value.

Data Availability

All code and supplementary data can be accessed in the ‘erc’ GitHub repository:


This project was funded by the National Human Genome Research Institute at the National Institutes of Health [HG009299 to N.C and M.C].

The support and resources from the Center for High Performance Computing at the University of Utah are gratefully acknowledged.

Permutation p-value distribution of 617 protein complexes (top) and 125 pathways (bottom) when compared to a null distribution of 1000 samples.

Histogram of the Fisher transformed ERC values for all 12,552 orthologous genes in the 343 yeast dataset. Median = 0.953.