Breaking Luminescence to improve it
One of the challenges I faced as I started working on the Luminescence R package was how to navigate the very large set of functions that the package provides (by the last count, there are 155 functions that are exposed to the user, plus several internal helpers).
As I fumbled my way through it, I started noticing that when a function failed on a certain input, similar failures would also occur elsewhere. However, finding manually which other functions were affected was frustrating and time-consuming. Wouldn’t it be great if we could (quickly and with limited manual intervention) go through all the functions in the package and being told which are the ones that need to be fixed?
This got me to think about fuzz testing, and how that approach could be adapted for my purposes.
Fuzz testing
Fuzz testing is a software testing technique that generates random, unexpected or malformed data and feeds them into a program. By observing how the software behaves under such conditions, this approach may uncover potential weaknesses, vulnerabilities and security flaws in applications that could be exploited by malicious actors.
As such, it’s adopted in systems where maintaining the integrity of the software and of the machine running it is of crucial importance, such as in the development of operating systems, hardware drivers, language compilers, browsers, cryptographic, network-facing and security-related software.
Clearly, Luminescence
doesn’t fall into that category of software. However,
the underlying idea is still valid: in our setting, fuzz testing can be used
to identify functions that lack sufficient argument validation and to uncover
sets of inputs that may cause issues within the function body.
A major problem with using a fuzz-testing approach is dealing with false positives. Reporting all errors produced is not particularly helpful, as some errors are expected or they have already been handled, and therefore don’t need to be looked at again. How do we know if an error reported is something we should be worrying about?
How Luminescence
reports errors
The functions in Luminescence
are instrumented with a simple but effective
error report interface. Whenever a user-facing function throws an error that
the developer had expected (and handled), it also reports the function’s name
in the error message. For example:
read_BIN2R("wrong.bin")
## Error: [read_BIN2R()] File 'wrong.bin' does not exist
Consider instead this other failure (already fixed in version 1.0.0):
read_BIN2R(character(0))
## Error in if (grepl(pattern = "https?\\:\\/\\/", x = url, perl = TRUE)) { :
## argument is of length zero
Clearly, this error wasn’t handled by the function, but it comes straight from the R core engine. Therefore, it should be reported as a true positive, as it signals insufficient argument validation. Effectively, the code assumed that the user would never provide a zero-length file name. While this is probably true in the majority of cases, it may still occur accidentally or as part of a more complex script.
It makes for a better user experience if even these corner cases were gracefully handled by the package. In contrast to the one above, the current error message to shows immediately the cause of the problem, making it easier to find a solution:
## Error: [read_BIN2R()] 'file' cannot be an empty character
The way errors and warnings are reported in Luminescence
provides an
effective way of removing the vast majority of false positives from the
output. This mechanism allows us to assume that, if a function reports an
error message containing its own name, the developers have already considered
that possibility and wrote the code defensively against that type of failure.
Such errors can be considered handled and can be suppressed from the
output. This has the big advantage of leaving a minimal number of cases to be
investigated after fuzzing.
Caught by the Fuzz!
The initial implementation of our fuzz-testing approach was really straightforward and fit in about 50 lines of R code. Indeed, it is simply a matter of calling each function in the package with a given input (one that had already shown to be problematic somewhere) and record any error or warning produced. Those that appear to be unhandled errors are reported so that they can be inspected and fixed.
From these humble beginnings, the code slowly developed into a fully-fledged R package, CBTF (Caught by the Fuzz!), which as of two weeks ago has finally reached CRAN.
The CBTF
package was developed with the expressed purpose of testing and
improving robustness of Luminescence
(although nothing prevents it from being
used for other packages too).
The core functionality of the package is in the fuzz()
function, which calls
each provided function with a certain input and records the output produced.
The get_exported_functions()
helper is used to find all user-facing functions
from a package, which are those that will be fuzzed.
This is what happened last November when I fuzzed all functions using a data frame containing a zero value in the first column as input:
library(CBTF)
funs <- get_exported_functions("Luminescence")
what <- list(df_with_zero = data.frame(ED = 0:5,
ED_Err = 1))
fuzz(funs, what, ignore_warnings = TRUE)
ℹ Fuzzing 154 functions on 1 input
✔ Test input: df_with_zero [5.1s]
✖ 🚨 CAUGHT BY THE FUZZ! 🚨
── Test input: df_with_zero
analyse_IRSAR.RF FAIL no applicable method for `@` applied to an object of class "integer"
calc_CobbleDoseRate FAIL missing value where TRUE/FALSE needed
calc_OSLLxTxDecomposed FAIL 'names' attribute [4] must be the same length as the vector [2]
fit_OSLLifeTimes FAIL NaN value of objective function! Perhaps adjust the bounds.
get_Quote FAIL the condition has length > 1
github_commits FAIL [github_branches()] 'user' should be of class 'character'
plot_RadialPlot FAIL 'from' must be a finite number
[ FAIL 7 | WARN 0 | SKIP 2 | OK 145 ]
These pointed to a number of unhandled error conditions that were eventually solved, so that the current output is the following:
ℹ Fuzzing 155 functions on 1 input
✔ Test input: df_with_zero [5.1s]
✔ 🏃 You didn't get caught by the fuzz!
[ FAIL 0 | WARN 0 | SKIP 2 | OK 153 ]
We began using CBTF
shortly before the release of
Luminescence
version 1.0.0. At that point, we had 272 calls to our
validation functions; for version 1.0.0 we increased them to 459, and
further to 582 for version 1.1.0. These numbers do not count any
additional checks and validations that were done using standard R
constructs, nor fixes to bugs revealed by CBTF
that were unrelated
to input value validation. All applications of CBTF
to Luminescence
are
documented in a tracking bug which, as of today, lists 35 issues,
for a total of 212 problems identified and fixed.
We will continue using this approach to further improve the stability and
user-friendliness of Luminescence
, and hope that similar techniques will
be used more and more in the R ecosystem at large.