Think about it: Handling RLum objects
One of the hard lessons you have to learn as a package developer is that users usually do not care much about what you had in mind when writing a function. When we introduced the RLum-class object structure in 'Luminescence' in December 2014 (more than ten years ago!), we had little doubt that users would welcome it because it streamlines their analysis pipelines. As a matter of fact, little happens in the package today without this object structure, at least from a developer's perspective.
The support requests that trickle in tell a different story, though. Time and again, RLum-class objects seem to be perceived as an obstacle to overcome or work around rather than as a help. This perception likely stems from our failure to highlight their advantages, and from our decision to prioritise other features, such as allowing subset() to operate Analyst-like directly on Risoe.BINfileData objects.
Although this is not the place to delve into the intricate details of RLum-class objects, I would like to demonstrate two new functions we have developed as part of the REPLAY project. They will be included in an upcoming version of 'Luminescence', and here they help to spotlight RLum-class objects and their inherent advantages in analysis pipelines.
Background: the RLum-class object structure concept
Just skip this part if you are already familiar with RLum-class objects.
While the details are tricky, the concept of RLum-class objects is simple. They aim to provide a unified data structure that works regardless of the function and data used, following the idea: Import data -> Analyse data -> Further process results.
For instance, a typical OSL curve can be represented as x and y values:
## x y
## 1 1 564
## 2 2 318
## 3 3 180
## 4 4 101
## 5 5 57
## 6 6 32
## 7 7 18
## 8 8 10
## 9 9 5
## 10 10 3
So why do we use an RLum.Data.Curve-class object instead of a simple data.frame?
Two reasons:
- An OSL curve is more than just a set of x and y values. Its interpretation is only meaningful against the background of information on the stimulation wavelength, detection window, measured material, etc.
- OSL curves come in sets and are part of a measurement protocol on aliquots (subsamples, usually representing positions in the reader or single grains). An RLum.Data.Curve-class object can store all of that information for subsequent analysis, and if records are combined in an RLum.Analysis structure, together they represent the measured protocol.
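To make the difference to a plain data.frame tangible, here is a minimal sketch that wraps the curve values from above into an RLum.Data.Curve object with set_RLum(). The info element names and their values (position, stimulation, detection) are invented for this illustration and do not come from a real measurement.

```r
library(Luminescence)

# x/y values of the OSL curve shown above
osl <- matrix(
  c(1:10, 564, 318, 180, 101, 57, 32, 18, 10, 5, 3),
  ncol = 2,
  dimnames = list(NULL, c("x", "y")))

# wrap the values together with measurement metadata;
# the info elements below are made up for this example
curve <- set_RLum(
  class = "RLum.Data.Curve",
  recordType = "OSL",
  data = osl,
  info = list(
    position = 1,
    stimulation = "blue LEDs",
    detection = "U340 filter"))

# the numbers are still accessible ...
head(get_RLum(curve))

# ... but the measurement context now travels with them
get_RLum(curve, info.object = "position")
```

The point of the sketch is only that values and context live in one object; which info elements are actually present depends on the import format.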
In data processing, consistency is paramount (or at least it should be), regardless of the input data format. In 'Luminescence', we ensure this consistency through the RLum-class object structure. The idea is that it makes data processing pipelines easy to understand and implement, and the entire process tractable.
For instance, data can always be imported with import_Data() regardless of the data format; 'Luminescence' will pick the correct import function (from the family of read_ functions) and translate the data into a consistent RLum.Analysis structure. The object SAR below, for example, may represent an aliquot measured with a single-aliquot regenerative-dose (SAR) protocol, and the order and intention of the measurements can be read from the data structure (provided, of course, that you are familiar with luminescence dating).
##
## [RLum.Analysis-class]
## originator: Risoe.BINfileData2RLum.Analysis()
## protocol: unknown
## additional info elements: 0
## number of records: 30
## .. : RLum.Data.Curve : 30
## .. .. : #1 TL | #2 OSL | #3 TL | #4 OSL | #5 TL | #6 OSL | #7 TL
## .. .. : #8 OSL | #9 TL | #10 OSL | #11 TL | #12 OSL | #13 TL | #14 OSL
## .. .. : #15 TL | #16 OSL | #17 TL | #18 OSL | #19 TL | #20 OSL | #21 TL
## .. .. : #22 OSL | #23 TL | #24 OSL | #25 TL | #26 OSL | #27 TL | #28 OSL
## .. .. : #29 TL | #30 IRSL
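If you want to explore an object of this kind yourself, the following sketch builds one from the example data set shipped with 'Luminescence'. Using this particular data set (CWOSL.SAR.Data, first position) is my assumption for illustration; it is not necessarily the object printed above, so the record list will differ.

```r
library(Luminescence)

# example measurement shipped with the package (BIN-file format)
data(ExampleData.BINfileData, envir = environment())

# convert the records of the first position into an RLum.Analysis object
SAR <- Risoe.BINfileData2RLum.Analysis(CWOSL.SAR.Data, pos = 1)

# print the overview, list the record types, and count the records
SAR
names_RLum(SAR)
length_RLum(SAR)
```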
Unfortunately, the data set as imported is often not suitable for immediate analysis. It is often necessary to subset or rearrange the data for various reasons, such as the implementation of a new protocol, a malfunction of the machine, or any other circumstance.
Yes, as with everything in R (being a data science language), you can do this 'manually', but you can also use methods that know how to work with these objects without giving you much of a headache.
Typical functions you may have already encountered are get_RLum() for subsetting records, merge_RLum() for combining them, or plot_RLum() for a quick visualisation.
However, 'Luminescence' has more to offer, and the real magic happens when you start reshuffling and combining records. To date, there are at least 26 functions in 'Luminescence' designed to do something with RLum-class objects.
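As a short reminder of how the three familiar functions play together, here is a sketch based on the example data set shipped with the package. The choice of data set and of the two positions is an assumption made for this illustration only.

```r
library(Luminescence)

# two aliquots (positions 1 and 2) from the shipped example data
data(ExampleData.BINfileData, envir = environment())
aliquot_1 <- Risoe.BINfileData2RLum.Analysis(CWOSL.SAR.Data, pos = 1)
aliquot_2 <- Risoe.BINfileData2RLum.Analysis(CWOSL.SAR.Data, pos = 2)

# subset: keep only the OSL curves; drop = FALSE keeps the
# RLum.Analysis structure instead of returning a plain list
osl_only <- get_RLum(aliquot_1, recordType = "OSL", drop = FALSE)

# combine: merge both aliquots into one RLum.Analysis object
both <- merge_RLum(list(aliquot_1, aliquot_2))

# quick visualisation: all OSL curves of the first aliquot in one plot
plot_RLum(osl_only, combine = TRUE)
```

Each step takes an RLum-class object and returns one, which is what makes these calls easy to chain.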
Let’s focus on two newcomers.
remove_RLum()
A rather typical scenario is having a set of records imported from a measurement file (e.g., .BINX, .XSYG) and needing to get rid of some of them, because they are not required for the analysis or perhaps because they are broken. The standard approach would be to use get_RLum() to select only the records you need and save the output in a new object. For instance:
SAR_new <- get_RLum(SAR, recordType = c("TL", "OSL"), drop = FALSE)
This call selects the TL and OSL curves and hence drops the single IRSL curve at the end of the sequence.
However, it feels somewhat counterintuitive because all you want is to remove one record. Plus, there is the drop = FALSE parameter, which you must not forget if you want to maintain the object structure.
If we now do the same with remove_RLum(), the code reads:
SAR_new <- remove_RLum(SAR, recordType = "IRSL")
The result is the same, but now our intention to remove IRSL records is clear, the object structure is maintained automatically, and the code is more intelligible, even more so in a chained analysis,
SAR <- import_Data("SAR.xsyg") |>
  remove_RLum(recordType = "IRSL")
which reads cleaner than
SAR <- import_Data("SAR.xsyg") |>
  get_RLum(recordType = c("TL", "OSL"), drop = FALSE)
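The two routes can be compared side by side without a measurement file at hand. The sketch below uses the example data set shipped with the package and removes the TL records instead of IRSL records; both choices are assumptions made for illustration. It requires a version of 'Luminescence' that already includes remove_RLum().

```r
library(Luminescence)

# example object from the shipped data set (first position)
data(ExampleData.BINfileData, envir = environment())
SAR <- Risoe.BINfileData2RLum.Analysis(CWOSL.SAR.Data, pos = 1)

# selection: name everything that should stay
kept <- get_RLum(SAR, recordType = "OSL", drop = FALSE)

# removal: name only what should go
removed <- remove_RLum(SAR, recordType = "TL")

# compare the record types left behind by both routes
names_RLum(kept)
names_RLum(removed)
```

With get_RLum() you have to know every record type you want to keep; with remove_RLum() you only state the one you want gone.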
sort_RLum()
A more intricate scenario is the reshuffling of records within a data set. If you have never needed this, you may be among the exceptionally rare luminescence dating practitioners who have never encountered a failed measurement, and your meticulous record-keeping and absence of errors are noteworthy.
Imagine you have multiple files to import from a fading measurement. To analyse them in a meaningful way, you have to import the files exactly in the order in which they were measured, because date and time matter. Moreover, you want the records ordered by position first and then by measurement date.
Now suppose you have messed this up in the file list or in your documentation. From experience, I can assure you that you will have a hard time spotting the mistake, or worse, it may even go unnoticed. A cleaner way to handle this is to use sort_RLum(). For instance:
IRSL <- import_Data("XYSG_FADING/", pattern = ".xsyg") |>
  merge_RLum() |>
  sort_RLum(info_element = c("position", "startDate"))
The code pulls all available measurement files with the file ending .xsyg from a folder. These are then merged into one big RLum.Analysis-class object with n records. sort_RLum() now ensures that all records are reordered first by position and then by measurement date (here the info element startDate). The result is a truly consistent object with ordered curves.
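The ordering rule itself (position first, measurement date second) can be illustrated in plain base R. The sketch below is not how sort_RLum() is implemented; it only shows the two-key ordering on a hypothetical metadata table whose ids, positions, and timestamps are invented.

```r
# hypothetical metadata of four records as they might arrive after
# merging files in the wrong order (all values are invented)
records <- data.frame(
  id        = c("a", "b", "c", "d"),
  position  = c(2, 1, 2, 1),
  startDate = c("20240102093000", "20240103101500",
                "20240101080000", "20240101074500"),
  stringsAsFactors = FALSE)

# order by position first; ties are broken by the measurement date
# (timestamps in this fixed-width format sort chronologically as text)
idx <- order(records$position, records$startDate)
records[idx, ]
```

Doing this by hand means extracting the metadata from every record, computing the index, and reassembling the object, which is exactly the bookkeeping a dedicated sort function takes off your hands.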
Alternatively, you can use the function to sort by recordType, for instance using the SAR object from above.
sort_RLum(SAR, slot = "recordType")
##
## [RLum.Analysis-class]
## originator: Risoe.BINfileData2RLum.Analysis()
## protocol: unknown
## additional info elements: 0
## number of records: 30
## .. : RLum.Data.Curve : 30
## .. .. : #1 IRSL | #2 OSL | #3 OSL | #4 OSL | #5 OSL | #6 OSL | #7 OSL
## .. .. : #8 OSL | #9 OSL | #10 OSL | #11 OSL | #12 OSL | #13 OSL | #14 OSL
## .. .. : #15 OSL | #16 TL | #17 TL | #18 TL | #19 TL | #20 TL | #21 TL
## .. .. : #22 TL | #23 TL | #24 TL | #25 TL | #26 TL | #27 TL | #28 TL
## .. .. : #29 TL | #30 TL
Moreover, the sort function calculates new info elements that are automatically available for each record, such as XY_LENGTH, NCOL, X_MIN, X_MAX, Y_MIN, Y_MAX, all while maintaining a consistent object structure independent of the import data format.
Take home message
With the examples provided, I have tried to shed some light on the rationale behind our efforts to enhance the handling of RLum-class objects within the REPLAY project.
Indeed, it makes sense to think about these objects and to include them in our luminescence data analysis.