Ensuring the quality of data measurements is critical to the success of any data collection effort. In this work, particular attention is given to the possibility of measurement errors due to incorrect data transformations and radius misalignment, as these can distort tree-ring width values and consequently impact downstream analyses.


Measurement accuracy

Ring-width measurements from certain sites may occasionally appear unusually high or low, often due to transformation errors, as tree-ring data is usually stored as integers with an associated scale factor. The CFS_scale() function addresses this by applying a k-nearest neighbors (k-NN) approach, using geodesic distances on the WGS84 ellipsoid (via the distGeo function from the geosphere package), to identify geographically close sites. It then compares the median tree-ring measurements of the target site to those of its nearest neighbors. This procedure is conducted within the same species.
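
The neighbour search and median comparison described above can be sketched in base R as follows. This is only an illustration, not the internals of CFS_scale(): the `sites` table, `haversine_km()`, and `nearest_sites()` are hypothetical, and a spherical haversine distance stands in for the ellipsoidal distGeo() so that the example needs no extra packages.

```r
# Hypothetical site table for one species: one row per site,
# with a precomputed median ring width (mm). Values are made up.
sites <- data.frame(
  site_id   = c("A", "B", "C", "D"),
  latitude  = c(46.0, 46.5, 47.0, 52.0),
  longitude = c(-75.0, -75.5, -74.0, -80.0),
  rw_median = c(12.0, 1.1, 1.3, 0.9)
)

# Great-circle (haversine) distance in km on a sphere of radius r.
# Assumption: used here as a simple stand-in for geosphere::distGeo().
haversine_km <- function(lat1, lon1, lat2, lon2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Return up to n_nbs nearest sites within max_dist_km of the target.
nearest_sites <- function(sites, target_id, n_nbs = 2, max_dist_km = 200) {
  tgt    <- sites[sites$site_id == target_id, ]
  others <- sites[sites$site_id != target_id, ]
  others$dist_km <- haversine_km(tgt$latitude, tgt$longitude,
                                 others$latitude, others$longitude)
  others <- others[others$dist_km <= max_dist_km, ]
  others <- others[order(others$dist_km), ]
  head(others, n_nbs)
}

nbs   <- nearest_sites(sites, "A")
# Ratio of the target median to the median of its neighbours' medians.
ratio <- sites$rw_median[sites$site_id == "A"] / median(nbs$rw_median)
```

With these made-up values, the target median is about ten times that of its two neighbours, which is the kind of ratio a misapplied scale factor could produce.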


# ring measurement
# loading the packages used in this example
library(data.table)
library(growthTrendR)

# loading data
dt.samples <- fread(system.file("extdata", "dt.samples.csv", package = "growthTrendR"))

# formatting the users' data conformed to CFS-TRenD data structure
dt.samples_trt <- CFS_format(data = list(dt.samples, 39:68), usage = 1, out.csv = NULL)
# alternatively, load the processed data;
# otherwise, run CFS_format() first (as above and in the data report)
# dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR"))
# one row per species-site combination
all.sites <- dt.samples_trt$tr_all_wide[, .N, by = c("species", "uid_site", "site_id")][, N := NULL]
# species and site_id together must identify a site uniquely
if (nrow(all.sites[, .N, by = .(species, site_id)][N > 1]) > 0) stop("species-site_id is not unique...")
# e.g. take the first two sites as the target sites
target_site <- all.sites[c(1, 2), -"uid_site"]

# reference data: site attributes joined to the ring-width measurements by radius
ref.sites <- merge(
  dt.samples_trt$tr_all_wide[, c("species", "uid_site", "site_id", "latitude", "longitude", "uid_radius")],
  dt.samples_trt$tr_all_long$tr_7_ring_widths,
  by = "uid_radius"
)

dt.scale <- CFS_scale(target_site = target_site, ref_sites = ref.sites, scale.label_data_ref = "CFS-TRenD V1.2-proj69", scale.max_dist_km = 200, scale.N_nbs = 2)

arguments of the CFS_scale() function:

target_site

The target site refers to a single site for a specific species to be evaluated and includes at least five columns: species, uid_site, site_id, latitude, and longitude.

ref_sites

The reference sites refer to a dataset of ring-width measurements that includes the target site. In addition to the columns present in the target site, the dataset also contains uid_radius, year, and rw_mm.

scale.label_data_ref

A short description of the reference dataset. This text will appear in the report as the data source for the generated figures.

scale.N_nbs

This specifies the maximum number of neighbors to be considered in the procedure.

scale.max_dist_km

This specifies the maximum distance (in kilometers) for searching neighbors of the target site.

generate report:

# default is NULL
generate_report(robj = dt.scale, output_file = NULL)

# user may also specify the directory and filename, for example
outfile_scale <- "scale_report.html"
generate_report(robj = dt.scale, output_file = outfile_scale)


arguments of the generate_report() function:

robj

The input for the scale report is the output of the CFS_scale() function, which assigns the class “cfs_scale” to the resulting object.

output_file

The output_file argument allows users to generate an HTML report at a specified path and file name. If set to NULL (default), the report opens automatically in the browser.

Quality Report: ring width measurement



This report compares the median ring-width of the target site(s) with the 2 nearest neighbors in the dataset CFS-TRenD V1.2-proj69, using a maximum search distance of 200 km.



Radius alignment

Tree-ring measurements often exhibit long-term growth trends and interannual variability, which can obscure short-term anomalies and complicate the assessment of data quality (e.g., Bunn 2008; Holmes 1983). Measurement instruments may also introduce inaccuracies due to their limitations. The aim of this exercise is to check whether each ring-width series is correctly aligned by applying the cross-correlation function (CCF) to a treated series (here, consecutive differences) and classifying the result.
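
As a rough illustration of the idea, the sketch below applies stats::ccf() to the consecutive differences of two synthetic series: one aligned with a synthetic chronology and one offset by a year. All names and data here are hypothetical, and CFS_qa() may build its chronologies and compute its correlations differently.

```r
set.seed(1)
n <- 60

# Synthetic positive "ring widths": a slowly drifting series of length n + 1.
base  <- exp(cumsum(rnorm(n + 1, sd = 0.1)))
chron <- base[1:n]                         # stand-in for a chronology

series_ok    <- chron + rnorm(n, sd = 0.01)  # aligned, with small measurement noise
series_shift <- base[2:(n + 1)]              # same signal, offset by one year

# Lag at which the cross-correlation of the differenced series peaks.
best_lag <- function(series, chron, lag.max = 5) {
  cc <- ccf(diff(series), diff(chron), lag.max = lag.max, plot = FALSE)
  as.vector(cc$lag)[which.max(as.vector(cc$acf))]
}

lag_ok    <- best_lag(series_ok, chron)     # expected to peak at lag 0
lag_shift <- best_lag(series_shift, chron)  # expected to peak one year off
```

Differencing removes most of the slow drift, so the peak of the cross-correlation reflects year-to-year agreement rather than a shared trend.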


# ring measurement
dt.samples <- fread(system.file("extdata", "dt.samples.csv", package = "growthTrendR"))

# formatting the users' data conformed to CFS-TRenD data structure
dt.samples_trt <- CFS_format(data = list(dt.samples, 39:68), usage = 1, out.csv = NULL)
# loading processed data
# dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR"))


# data processing
dt.samples_long <- merge(dt.samples_trt$tr_all_wide[, c("uid_site", "site_id", "species", "uid_tree", "uid_sample", "sample_id", "radius_id", "uid_radius")],
                        dt.samples_trt$tr_all_long$tr_7_ring_widths, by = "uid_radius")

# rename to the reserved column names
setnames(dt.samples_long, c("sample_id", "year", "rw_mm"), c("SampleID", "Year", "RawRing"))

# assign treated series
# users can decide their own treated series
# dt.samples_long[, RW_trt:= RawRing - shift(RawRing), by = SampleID]
setorder(dt.samples_long, SampleID, Year)
dt.samples_long$RW_trt <-
  ave(
    as.numeric(dt.samples_long$RawRing),
    dt.samples_long$SampleID,
    FUN = function(x)
      if (length(x) > 1L) c(NA_real_, diff(x)) else NA_real_
  )

# quality check on radius alignment based on the treated series
dt.qa <- CFS_qa(dt.input = dt.samples_long, qa.label_data = "CFS-TRenD V1.2-proj69", qa.label_trt = "difference", qa.min_nseries = 5)

arguments of the CFS_qa() function:

dt.input

The input dataset must include at least five columns: species, SampleID, Year, RawRing, and RW_trt. SampleID identifies each series, RawRing contains the raw ring-width measurements in millimeters, and RW_trt contains transformed measurements suitable for constructing a master chronology, according to the user’s choice. In this example, consecutive differences of the series were used for RW_trt.

qa.label_data

A short description of the input dataset. This text will appear in the report as the data source for the generated figures.

qa.label_trt

A short description of the treated series. This text will appear in the report.

qa.min_nseries

The minimum number of series required for developing chronologies (5 in this example).

generate report:


# e.g. series to check
chk.lst <- dt.qa$dt.stats[1:2,]$SampleID

generate_report(robj = dt.qa, qa.out_series = chk.lst)

arguments of the generate_report() function:

robj

The input for the cross-dating report is the output of the CFS_qa() function, which assigns the class “cfs_qa” to the resulting object.

qa.out_series

This argument allows users to select which series to examine visually. By default, all series are included, which may result in unnecessarily long processing times.

Quality Report: cross-dating



This report provides an overview of the quality of each series with four graphs:

  1. raw ring measurement vs. year;

  2. cross-correlation plots of raw ring measurement with chronologies (raw);

  3. treated ring measurement vs. year;

  4. cross-correlation plots of treated ring measurement with chronologies (treated). The “qa_code” is attached to this plot.

The qa_code values are described below.


Description of qa_code

qa_code      Description
pass         The maximum correlation occurs at lag 0.
borderline   The correlation at lag 0 is the second highest, and its difference from the maximum is within a predefined threshold; the series is treated as a quasi-pass.
pm1          The maximum correlation occurs at lag 1 or -1, suggesting a slight misalignment.
highpeak     The maximum correlation occurs at a non-zero lag and is more than twice the second-highest value, potentially signaling an issue.
fail         Any series that does not fit the categories above.
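
One way to read the table is as a sequence of checks. The sketch below is a hypothetical classify_qa() helper, not the CFS_qa() implementation; it assumes the rules are applied in the order listed and that the borderline threshold is the 0.1 reported in the parameters below.

```r
# Hypothetical helper: assign a qa_code from cross-correlation values.
# ccf_vals: correlations, one per lag; lags: the matching integer lags.
# Assumption: the checks are applied in the order of the table.
classify_qa <- function(ccf_vals, lags, threshold = 0.1) {
  ord        <- order(ccf_vals, decreasing = TRUE)
  top        <- ccf_vals[ord[1]]
  top_lag    <- lags[ord[1]]
  second     <- ccf_vals[ord[2]]
  second_lag <- lags[ord[2]]

  if (top_lag == 0) return("pass")
  if (second_lag == 0 && (top - second) <= threshold) return("borderline")
  if (abs(top_lag) == 1) return("pm1")
  if (top > 2 * second) return("highpeak")
  "fail"
}

lags <- -2:2
codes <- c(
  classify_qa(c(0.1, 0.2, 0.80, 0.3, 0.1), lags),  # peak at lag 0
  classify_qa(c(0.1, 0.2, 0.55, 0.6, 0.1), lags),  # lag 0 close second
  classify_qa(c(0.1, 0.2, 0.30, 0.7, 0.1), lags),  # clear peak at lag 1
  classify_qa(c(0.9, 0.2, 0.30, 0.1, 0.1), lags),  # dominant peak at lag -2
  classify_qa(c(0.5, 0.2, 0.10, 0.1, 0.4), lags)   # none of the above
)
```

If the package resolves overlapping cases differently (for example a peak at lag 1 that is also more than twice the second-highest value), only the order of the checks would need to change.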



parameters used in this report:

data source: CFS-TRenD V1.2-proj69

treated series: difference

threshold for borderline: 0.1

minimum number of series for developing chronologies: 5