This document is supplementary material for the study ‘Is Age Really Cruel to Experts: Compensatory Effects of Activity’. If you have any additional questions about the paper or the statistical analysis, feel free to write to Nemanja.Vaci@aau.at.
To investigate whether the negative effect of practice in the FIDE dataset is due to the restricted range of tournament activity records, we identified the players who are registered in both datasets.
We added the first and last name of every player to both datasets.
All special characters and name titles were replaced.
We used the amatch function from the stringdist package in R to find string matches between the datasets.
Year of birth was added to both datasets.
We calculated pairwise string distances between the previously matched names with the stringdist function.
      The Jaro-Winkler method was used to compute the distances, which range from 0 (exact match) to 1 (completely dissimilar). This method takes into account the length of the strings, the number of matching characters between the two strings, and the number of transpositions required to make the strings identical.
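The matching steps above can be sketched roughly as follows. This is a minimal illustration, not the script used for the study: the toy data frames, the `clean_name` helper, the list of titles, and the use of year of birth as an exact-match filter are all assumptions. Only the `amatch` and `stringdist` functions and the Jaro-Winkler method are taken from the description.

```r
library(stringdist)  # provides amatch() and stringdist()

# Hypothetical toy data; the real player lists are not reproduced here.
fide <- data.frame(name = c("Dr. Anna Mueller", "GM Peter Schmidt", "Ivan Petrov"),
                   born = c(1970, 1985, 1990), stringsAsFactors = FALSE)
german <- data.frame(name = c("Anna Mueller", "Peter Schmid", "Maria Rossi"),
                     born = c(1970, 1985, 1962), stringsAsFactors = FALSE)

# Assumed cleaning rule: lower-case, drop a few titles, drop non-letters.
clean_name <- function(x) {
  x <- tolower(x)
  x <- gsub("\\b(dr|prof|gm|im|fm)\\b\\.?", "", x, perl = TRUE)
  x <- gsub("[^a-z ]", "", x)
  trimws(gsub(" +", " ", x))
}
fide$clean <- clean_name(fide$name)
german$clean <- clean_name(german$name)

# Approximate matching: index of the closest German name within maxDist, else NA.
idx <- amatch(fide$clean, german$clean, method = "jw", p = 0.1, maxDist = 0.1)

# Keep only candidate pairs whose year of birth agrees (assumed use of that field).
matched <- data.frame(fide = fide$clean, german = german$clean[idx],
                      same_year = fide$born == german$born[idx],
                      stringsAsFactors = FALSE)
matched <- matched[!is.na(idx) & matched$same_year, ]

# Pairwise Jaro-Winkler distance for each retained pair (0 = identical strings).
matched$dist <- stringdist(matched$fide, matched$german, method = "jw", p = 0.1)
print(matched)
```

In `stringdist`, `method = "jw"` with `p = 0` gives the plain Jaro distance; a positive `p` adds the Winkler prefix adjustment. The value `p = 0.1` and the threshold `maxDist = 0.1` are illustrative choices.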
This resulted in 13487 individuals.
The main model (see the Modeling of age and activity effects subsection in Results) was re-fitted on the new datasets. The results show the same pattern as in the main analysis: a preserving effect of tournament activity in the German database and a declining effect in the FIDE database.
FIDE database
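library(lme4)  # provides lmer()
# Cubic polynomial of centred age (AgeC) interacting with Games; random intercept and age slope per player
FIDEIntersection<-lmer(Rating~poly(AgeC, degree=3, raw=T)*Games+(1+AgeC|IDFIDE), data=FIDE)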
print(summary(FIDEIntersection), cor=FALSE)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Rating ~ poly(AgeC, degree = 3, raw = T) * Games + (1 + AgeC | IDFIDE)
## Data: FIDE
##
## REML criterion at convergence: 4202242
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -16.6570 -0.4346 -0.0128 0.4276 13.3794
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## IDFIDE (Intercept) 89681.55 299.469
## AgeC 90.21 9.498 -0.62
## Residual 442.81 21.043
## Number of obs: 453617, groups: IDFIDE, 12982
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.887e+03 3.103e+00 608.1
## poly(AgeC, degree = 3, raw = T)1 2.228e+01 1.479e-01 150.7
## poly(AgeC, degree = 3, raw = T)2 -6.011e-01 3.944e-03 -152.4
## poly(AgeC, degree = 3, raw = T)3 4.243e-03 3.921e-05 108.2
## Games 2.599e-01 3.145e-02 8.3
## poly(AgeC, degree = 3, raw = T)1:Games 7.088e-02 3.764e-03 18.8
## poly(AgeC, degree = 3, raw = T)2:Games -3.481e-03 1.314e-04 -26.5
## poly(AgeC, degree = 3, raw = T)3:Games 3.703e-05 1.338e-06 27.7
German database
GermanIntersection<-lmer(Rating~poly(AgeC, degree=3, raw=T)*Games+(1+AgeC|IDGerman), data=German)
print(summary(GermanIntersection), cor=FALSE)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Rating ~ poly(AgeC, degree = 3, raw = T) * Games + (1 + AgeC | IDGerman)
## Data: German
##
## REML criterion at convergence: 5521820
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -12.5444 -0.4498 0.0116 0.4765 8.0926
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## IDGerman (Intercept) 454528.4 674.19
## AgeC 729.3 27.01 -0.69
## Residual 3971.9 63.02
## Number of obs: 482960, groups: IDGerman, 13488
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.065e+02 7.310e+00 69.3
## poly(AgeC, degree = 3, raw = T)1 1.458e+02 4.546e-01 320.6
## poly(AgeC, degree = 3, raw = T)2 -4.064e+00 1.360e-02 -298.8
## poly(AgeC, degree = 3, raw = T)3 3.189e-02 1.340e-04 238.0
## Games 6.796e+00 1.249e-01 54.4
## poly(AgeC, degree = 3, raw = T)1:Games -6.795e-01 1.781e-02 -38.2
## poly(AgeC, degree = 3, raw = T)2:Games 2.092e-02 6.958e-04 30.1
## poly(AgeC, degree = 3, raw = T)3:Games -1.907e-04 7.587e-06 -25.1
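Because both models contain the interaction between the raw cubic polynomial of AgeC and Games, the effect of one additional game is itself a cubic function of AgeC. The sketch below evaluates that function from the fixed effects reported in the two summaries. The helper name `games_slope` and the grid of AgeC values are assumptions made for illustration; AgeC is the centred age variable, and its centring constant is not stated in this document, so the grid should not be read as ages in years.

```r
# Games-related fixed effects copied from the two intersection-model summaries,
# in the order: Games, AgeC:Games, AgeC^2:Games, AgeC^3:Games.
fide_games   <- c(2.599e-01, 7.088e-02, -3.481e-03, 3.703e-05)
german_games <- c(6.796e+00, -6.795e-01, 2.092e-02, -1.907e-04)

# Expected change in Rating per additional game at a given value of AgeC
# (fixed effects only; random effects are ignored).
games_slope <- function(coefs, age_c) {
  coefs[1] + coefs[2] * age_c + coefs[3] * age_c^2 + coefs[4] * age_c^3
}

age_grid <- seq(0, 40, by = 10)  # illustrative AgeC values (assumption)
slopes <- data.frame(AgeC = age_grid,
                     FIDE = games_slope(fide_games, age_grid),
                     German = games_slope(german_games, age_grid))
print(slopes)
```

At AgeC = 0 the slope equals the Games coefficient itself; the interaction terms determine how it changes as AgeC moves away from zero.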
To investigate cohort effects in the datasets, we divided the players into three groups:
Players born between 1900 and 1940
Players born between 1940 and 1980
Players born after 1980
Generalized additive modeling (mgcv package in R) was used to fit a non-linear regression of rating on the age of the participants. This gave a separate age curve for every cohort.
The results show that the proposed cohort effect is present in the FIDE database. The main reason is that Elo ratings differ between the cohorts, because the number of Elo points required for entry into the database changed over the history of the dataset. The results for the German database, however, show no indication of cohort effects. In other words, the Elo scores from the different cohorts align perfectly.
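The cohort split and the per-cohort smooths can be sketched as follows on simulated data. The column names `BirthYear`, `Age` and `Rating`, the simulated values, and the handling of the boundary years 1940 and 1980 are assumptions; only the three cohorts and the model form `Rating ~ s(Age)` come from this document.

```r
library(mgcv)  # provides bam() and gam()

set.seed(1)
n <- 600
# Simulated stand-in data, not the FIDE or German records.
toy <- data.frame(BirthYear = sample(1900:2005, n, replace = TRUE),
                  Age = runif(n, 10, 80))
toy$Rating <- 2000 + 10 * toy$Age - 0.12 * toy$Age^2 + rnorm(n, sd = 50)

# Assumed boundary rule: [1900, 1940), [1940, 1980), [1980, Inf).
toy$Cohort <- cut(toy$BirthYear, breaks = c(1900, 1940, 1980, Inf),
                  labels = c("1900-1940", "1940-1980", "1980 onwards"),
                  right = FALSE)

# One smooth of Age per cohort, mirroring Rating ~ s(Age).
fits <- lapply(split(toy, toy$Cohort),
               function(d) bam(Rating ~ s(Age), data = d))

# Predicted rating at a few ages for each cohort (one column per cohort).
new_ages <- data.frame(Age = c(20, 40, 60))
print(sapply(fits, function(m) predict(m, newdata = new_ages)))
```

In the simulated data every cohort covers the full age range, which is not the case in the real datasets, where each cohort is observed over a different age window.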
FIDE database
library(mgcv)     # provides bam()
library(itsadug)  # provides plot_smooth()
bam1<-bam(Rating~s(Age), data=FIDE1)
bam2<-bam(Rating~s(Age), data=FIDE2)
bam3<-bam(Rating~s(Age), data=FIDE3)
par(mfrow=c(1,3))
plot_smooth(bam3, view='Age', ylim=c(1800,2300), main='2010 to 1980')
## Summary:
## * Age : numeric predictor; with 30 values ranging from 10.000000 to 30.000000.
plot_smooth(bam2, view='Age', ylim=c(1800,2300), main='1980 to 1940')
## Summary:
## * Age : numeric predictor; with 30 values ranging from 31.000000 to 55.000000.
plot_smooth(bam1, view='Age', ylim=c(1800,2300), main='1940 to 1900')
## Summary:
## * Age : numeric predictor; with 30 values ranging from 56.000000 to 80.000000.
German database
bamG1<-bam(Rating~s(Age), data=German1)
bamG2<-bam(Rating~s(Age), data=German2)
bamG3<-bam(Rating~s(Age), data=German3)
par(mfrow=c(1,3))
plot_smooth(bamG3, view='Age', ylim=c(900,1800), main='2007 to 1980')
## Summary:
## * Age : numeric predictor; with 30 values ranging from 10.000000 to 27.000000.
plot_smooth(bamG2, view='Age', ylim=c(900,1800), main='1980 to 1940')
## Summary:
## * Age : numeric predictor; with 30 values ranging from 28.000000 to 55.000000.
plot_smooth(bamG1, view='Age', ylim=c(900,1800), main='1940 to 1900')
## Summary:
## * Age : numeric predictor; with 30 values ranging from 56.000000 to 80.000000.
To investigate the effects of expertise on the shape of the function, we divided the players into four ability groups. We used statistically defined cut-offs based on each player’s peak rating, which yielded groups of players at four different levels of expertise.
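One possible way to build such a grouping is sketched below. The use of quartiles of the peak rating is an assumption: the text only says that the cut-offs were statistically defined, so the actual rule may differ. The toy data are simulated, and the variable name `Ability2` is borrowed from the model formulas in this document.

```r
set.seed(2)
# Simulated ratings: 40 hypothetical players with 5 records each.
toy <- data.frame(ID = rep(1:40, each = 5),
                  Rating = round(rnorm(200, mean = 1800, sd = 250)))

# Peak rating per player.
peak <- aggregate(Rating ~ ID, data = toy, FUN = max)
names(peak)[2] <- "Peak"

# Assumed cut-offs: quartiles of the peak rating, coded 1 (lowest) to 4 (highest).
cuts <- quantile(peak$Peak, probs = seq(0, 1, by = 0.25))
peak$Ability2 <- cut(peak$Peak, breaks = cuts, labels = FALSE,
                     include.lowest = TRUE)

# Attach the player-level group to every record of that player.
toy <- merge(toy, peak[, c("ID", "Ability2")], by = "ID")
print(table(peak$Ability2))
```

Because the group is defined by the peak rating, the individual ratings within a group can fall well below the group's cut-off, which is consistent with the wide rating ranges in the descriptive statistics.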
The results show that, in both datasets, the ability level influences the point at which the decline stabilizes. The higher the level of ability, the sooner the inflection point is observed, except at the lowest ability level, where the function is flat.
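For a cubic age function, the peak and the inflection point follow directly from the first and second derivatives. The sketch below shows the calculation for a single set of coefficients. The helper `cubic_landmarks` is hypothetical, the coefficients are the age terms of the FIDE intersection model reported earlier (with Games set to zero), and the results are on the centred AgeC scale. Whether the tabulated Maximum and Second_Derivative values were obtained in exactly this way is not stated in this document.

```r
# Age-related fixed effects (AgeC, AgeC^2, AgeC^3), used here as example numbers.
b <- c(2.228e+01, -6.011e-01, 4.243e-03)

cubic_landmarks <- function(b) {
  # f'(x)  = b1 + 2*b2*x + 3*b3*x^2
  # f''(x) = 2*b2 + 6*b3*x
  disc <- (2 * b[2])^2 - 4 * (3 * b[3]) * b[1]
  crit <- if (disc >= 0) {
    (-2 * b[2] + c(-1, 1) * sqrt(disc)) / (6 * b[3])  # roots of f'(x) = 0
  } else {
    numeric(0)                                        # no turning point
  }
  is_max <- (2 * b[2] + 6 * b[3] * crit) < 0          # f''(x) < 0 at a maximum
  list(maximum = crit[is_max],
       inflection = -b[2] / (3 * b[3]))               # root of f''(x) = 0
}

res <- cubic_landmarks(b)
print(res)
```

An empty `maximum` indicates a curve without a turning point; whether a landmark falls inside the observed age range has to be checked separately.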
Descriptive statistics for FIDE database
with(FIDE, aggregate(Rating~Ability2, FUN= function(Rating) c(MEAN=mean(Rating), SD=sd(Rating), RANGE=range(Rating))))
## Ability2 Rating.MEAN Rating.SD Rating.RANGE1 Rating.RANGE2
## 1 1 1710.19992 83.60242 1500.00000 1837.00000
## 2 2 2022.45803 91.78300 1500.00000 2175.00000
## 3 3 2238.44584 93.00313 1502.00000 2513.00000
## 4 4 2492.20624 101.42623 1645.00000 2851.00000
Descriptive statistics for German database
with(German, aggregate(Rating~Ability2, FUN= function(Rating) c(MEAN=mean(Rating), SD=sd(Rating), RANGE=range(Rating))))
## Ability2 Rating.MEAN Rating.SD Rating.RANGE1 Rating.RANGE2
## 1 1 754.08544 90.28766 200.00000 859.00000
## 2 2 1162.77550 235.82714 200.00000 1510.00000
## 3 3 1689.28402 229.60076 200.00000 2161.00000
## 4 4 2179.11974 193.70309 298.00000 2813.00000
FIDEAbility4<-lmer(Rating~poly(AgeC, degree=3, raw=T)*Games*Ability2+(1+AgeC|ID), data=FIDE)
GermanAbility4<-lmer(Rating~poly(AgeC, degree=3, raw=T)*Games*Ability2+(1+AgeC|ID), data=German)
FIDE database
print(summary(FIDEAbility4), cor=FALSE)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Rating ~ poly(AgeC, degree = 3, raw = T) * Games * Ability2 + (1 + AgeC | ID)
## Data: FIDE
##
## REML criterion at convergence: 27456376
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -24.0034 -0.3968 -0.0119 0.3919 16.6139
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## ID (Intercept) 69586.2 263.79
## AgeC 152.3 12.34 -0.72
## Residual 496.2 22.28
## Number of obs: 2916227, groups: ID, 100529
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.614e+03 3.588e+00 449.7
## poly(AgeC, degree = 3, raw = T)1 -1.330e+01 2.568e-01 -51.8
## poly(AgeC, degree = 3, raw = T)2 5.145e-01 7.976e-03 64.5
## poly(AgeC, degree = 3, raw = T)3 -5.805e-03 7.638e-05 -76.0
## Games 8.109e-01 3.828e-02 21.2
## Ability2 1.548e+02 1.456e+00 106.3
## poly(AgeC, degree = 3, raw = T)1:Games -7.230e-02 5.515e-03 -13.1
## poly(AgeC, degree = 3, raw = T)2:Games 1.878e-03 2.079e-04 9.0
## poly(AgeC, degree = 3, raw = T)3:Games -1.896e-05 2.193e-06 -8.6
## poly(AgeC, degree = 3, raw = T)1:Ability2 1.077e+01 9.650e-02 111.7
## poly(AgeC, degree = 3, raw = T)2:Ability2 -3.323e-01 2.824e-03 -117.7
## poly(AgeC, degree = 3, raw = T)3:Ability2 2.963e-03 2.748e-05 107.8
## Games:Ability2 -3.553e-02 1.413e-02 -2.5
## poly(AgeC, degree = 3, raw = T)1:Games:Ability2 2.663e-02 1.968e-03 13.5
## poly(AgeC, degree = 3, raw = T)2:Games:Ability2 -1.050e-03 7.429e-05 -14.1
## poly(AgeC, degree = 3, raw = T)3:Games:Ability2 1.133e-05 7.934e-07 14.3
FIDEdes
## Maximum Second_Derivative
## 1 35 out_of_range
## 2 34 out_of_range
## 3 33 65
## 4 36 53
German database:
print(summary(GermanAbility4), cor=FALSE)
Germandes
## Maximum Second_Derivative
## 1 flat_function flat_function
## 2 41 54
## 3 38 54
## 4 33 50
We also excluded the players in the German dataset with fewer than 1500 points to assess whether the shape of the curves is due to range restriction. This gave two groups of players (comparable in Elo rating across both datasets): 1) between 1500 and 2000 Elo points and 2) above 2000 Elo points. The results show the same effects as in the main analysis.
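The restriction and regrouping can be illustrated with a few made-up records. The numeric coding of `AbilityNew` (1 and 2), the treatment of a rating of exactly 2000, and the choice to filter on the rating column rather than on a player-level summary are assumptions.

```r
# Made-up records for illustration only.
toy <- data.frame(ID = 1:6,
                  Rating = c(1200, 1500, 1750, 2000, 2300, 1999))

# Drop everything below 1500 Elo points.
restricted <- subset(toy, Rating >= 1500)

# Assumed coding: 1 = between 1500 and 2000 points, 2 = above 2000 points.
restricted$AbilityNew <- ifelse(restricted$Rating > 2000, 2, 1)
print(restricted)
```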
GermanRestricted<-lmer(Rating~poly(AgeC, degree=3, raw=T)*Games*AbilityNew+(1+AgeC|ID), data=German2)
print(summary(GermanRestricted), cor=FALSE)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Rating ~ poly(AgeC, degree = 3, raw = T) * Games * AbilityNew + (1 + AgeC | ID)
## Data: German2
##
## REML criterion at convergence: 18349579
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -12.2627 -0.5202 0.0091 0.5385 12.4061
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## ID (Intercept) 139978.7 374.14
## AgeC 192.2 13.86 -0.71
## Residual 1992.7 44.64
## Number of obs: 1685698, groups: ID, 90154
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 9.510e+02 6.669e+00 142.59
## poly(AgeC, degree = 3, raw = T)1 5.245e+00 4.937e-01 10.62
## poly(AgeC, degree = 3, raw = T)2 3.156e-01 1.500e-02 21.04
## poly(AgeC, degree = 3, raw = T)3 -5.899e-03 1.426e-04 -41.36
## Games -2.029e-02 2.054e-01 -0.10
## AbilityNew 3.875e+01 5.318e+00 7.29
## poly(AgeC, degree = 3, raw = T)1:Games 2.342e-01 2.543e-02 9.21
## poly(AgeC, degree = 3, raw = T)2:Games -8.897e-03 9.002e-04 -9.88
## poly(AgeC, degree = 3, raw = T)3:Games 8.731e-05 9.275e-06 9.41
## poly(AgeC, degree = 3, raw = T)1:AbilityNew 5.050e+01 3.853e-01 131.08
## poly(AgeC, degree = 3, raw = T)2:AbilityNew -1.677e+00 1.191e-02 -140.74
## poly(AgeC, degree = 3, raw = T)3:AbilityNew 1.503e-02 1.180e-04 127.42
## Games:AbilityNew 2.775e+00 1.539e-01 18.03
## poly(AgeC, degree = 3, raw = T)1:Games:AbilityNew -3.693e-01 1.956e-02 -18.88
## poly(AgeC, degree = 3, raw = T)2:Games:AbilityNew 1.175e-02 7.169e-04 16.39
## poly(AgeC, degree = 3, raw = T)3:Games:AbilityNew -1.085e-04 7.607e-06 -14.26