TY - GEN
T1 - Data quantity is more important than its spatial bias for predictive species distribution modelling
AU - Gaul, Willson
AU - Sadykova, Dinara
AU - White, Hannah J.
AU - León-Sánchez, Lupe
AU - Caplat, Paul
AU - Emmerson, Mark C.
AU - Yearsley, Jon M.
PY - 2020/5/27
Y1 - 2020/5/27
N2 - Biological records are often the data of choice for training predictive species distribution models (SDMs), but spatial sampling bias is pervasive in biological records data at multiple spatial scales and is thought to impair the performance of SDMs. We simulated presences and absences of virtual species as well as the process of recording these species to evaluate the effect on species distribution model prediction performance of 1) spatial bias in training data, 2) sample size (the average number of observations per species), and 3) the choice of species distribution modelling method. Our approach is novel in quantifying and applying real-world spatial sampling biases to simulated data. Spatial bias in training data decreased species distribution model prediction performance, but only when the bias was relatively strong. Sample size and the choice of modelling method were more important than spatial bias in determining the prediction performance of species distribution models.
Competing Interest Statement: The authors have declared no competing interest.
UR - https://www.biorxiv.org/content/10.1101/2020.05.24.113415v1
UR - https://www.biorxiv.org/content/10.1101/2020.05.24.113415v1.abstract
UR - https://www.mendeley.com/catalogue/692a7840-adcd-3d25-b547-d3ffac30ff2e/
U2 - 10.1101/2020.05.24.113415
DO - 10.1101/2020.05.24.113415
M3 - Other contribution
T3 - bioRxiv
ER -