A compositional multivariate approach was used to analyse regional-scale soil geochemical data generated by the Geological Survey of Northern Ireland (GSNI) as part of the Tellus Project. The multi-element total concentration data comprise XRF analyses of 6862 rural soil samples collected at 20 cm depth on a non-aligned grid at a density of one site per 2 km². Censored data were imputed using published detection limits. Each soil sample site was assigned to a unit of the regional geology map, resulting in spatial data for one categorical variable and 35 continuous variables comprising individual and amalgamated elements. This paper examines the extent to which soil geochemistry reflects the underlying geology or superficial deposits. Because soil geochemical data are compositional, log-ratios were computed so that the data could be evaluated appropriately with multivariate statistical methods. Principal component analysis (PCA) and minimum/maximum autocorrelation factors (MAF) were used to carry out linear discriminant analysis (LDA) as a means to discover and validate processes related to the geologic assemblages coded as Age Bracket. Peat cover was introduced as an additional category to measure the ability to predict and monitor fragile ecosystems. Overall prediction accuracies for the Age Bracket categories were 68.4% using PCA and 74.7% using MAF. With peat included, the LDA classification accuracy decreased to 65.0% and 69.9%, respectively. The increase in misclassification due to the presence of peat may reflect degradation of peat-covered areas since the superficial deposit classification was created.
- compositional data analysis; minimum/maximum autocorrelation factors (MAF); linear discriminant analysis (LDA); log-ratios; centred log-ratio (clr); geochemistry
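As an illustration of the log-ratio step named in the abstract and keywords, the following is a minimal sketch of the centred log-ratio (clr) transform, which subtracts the mean log of a composition from the log of each part. It is not the authors' implementation: the function name `clr` and the three-part sample values are hypothetical, and the sketch assumes censored or zero values have already been imputed, as the abstract describes.

```python
import math


def clr(composition):
    """Centred log-ratio transform of one composition (hypothetical helper).

    Each part is log-transformed and the mean of the logs is subtracted,
    which is equivalent to dividing each part by the geometric mean.
    """
    if any(part <= 0 for part in composition):
        # Logs are undefined for zero or negative parts, so censored
        # values must be imputed before this step.
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical three-part composition (illustrative values only).
sample = [60.0, 30.0, 10.0]
clr_sample = clr(sample)

# Rescaling the composition (e.g. percent to mg/kg) leaves clr unchanged.
clr_rescaled = clr([part * 10_000 for part in sample])
```

The clr scores of a composition sum to zero and do not depend on the units of the closed data, which is why methods such as PCA or LDA are typically applied to the transformed values rather than to the raw concentrations.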