UncertainData.jl: a Julia package for working with measurements and datasets with uncertainties

Abstract

UncertainData.jl provides an interface for representing data with associated uncertainties in the Julia programming language (Bezanson, Edelman, Karpinski, & Shah, 2017). Unlike Measurements.jl (Giordano, 2016), which deals with exact error propagation of normally distributed values, UncertainData.jl uses a resampling approach to handle uncertainties in calculations. This makes it possible to work with and combine any type of uncertain value for which a resampling method can be defined. Examples of currently supported uncertain values are: theoretical distributions, e.g., those supported by Distributions.jl (Besançon et al., 2019; Lin et al., 2019); values whose states are represented by a finite set of values with weighted probabilities; values represented by empirical distributions; and more. The package simplifies resampling from uncertain datasets whose data points may have different kinds of uncertainties, both in the data values and in the index values (e.g., time or space). The user may resample subject to a set of pre-defined constraints that truncate the supports of the distributions underlying the uncertain datasets, optionally combined with interpolation on pre-defined grids. Methods for sequential resampling of ordered datasets whose indices have uncertainties are also provided. Using Julia’s multiple dispatch, UncertainData.jl extends most elementary mathematical operations, hypothesis tests from HypothesisTests.jl, and various methods from the StatsBase.jl package to uncertain values and uncertain datasets. Statistical algorithms in other packages can be adapted to handle uncertain values and datasets from UncertainData.jl with little effort by using multiple dispatch and the provided resampling framework. UncertainData.jl was originally designed to form the backbone of the uncertainty handling in the CausalityTools.jl package, with the aim of quantifying the sensitivity of statistical time series causality detection algorithms. Recently, the package has also been used in paleoclimate research (Vasskog et al., 2019).
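
The resampling idea described above can be illustrated with a minimal, language-neutral sketch. It is written in Python using only the standard library and does not use or reproduce the UncertainData.jl API: the names NormalValue, WeightedPopulation, and combine are hypothetical and introduced here purely for illustration. The sketch assumes that an uncertain value is anything exposing a resample method, and that two such values are combined by applying an operation to paired draws.

```python
import random
import statistics


class NormalValue:
    """Hypothetical uncertain value backed by a normal distribution."""

    def __init__(self, mu, sigma):
        self.mu = mu
        self.sigma = sigma

    def resample(self, n, rng):
        # Draw n independent realisations from N(mu, sigma).
        return [rng.gauss(self.mu, self.sigma) for _ in range(n)]


class WeightedPopulation:
    """Hypothetical uncertain value: a finite set of states with weights."""

    def __init__(self, values, weights):
        if len(values) != len(weights):
            raise ValueError("values and weights must have the same length")
        self.values = list(values)
        self.weights = list(weights)

    def resample(self, n, rng):
        # Draw n states with probability proportional to their weights.
        return rng.choices(self.values, weights=self.weights, k=n)


def combine(op, a, b, n, rng):
    """Apply op to n paired draws from a and b.

    The result is itself a sample of size n, so no closed-form error
    propagation is needed and a and b may be of different kinds.
    """
    draws_a = a.resample(n, rng)
    draws_b = b.resample(n, rng)
    return [op(x, y) for x, y in zip(draws_a, draws_b)]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
a = NormalValue(2.0, 0.3)
b = WeightedPopulation([1.0, 2.0, 3.0], [0.2, 0.5, 0.3])

# Sum of two uncertain values of different kinds, estimated by resampling.
draws = combine(lambda x, y: x + y, a, b, 10_000, rng)
mean_sum = statistics.fmean(draws)
spread_sum = statistics.stdev(draws)
```

Because combine only requires a resample method, a further kind of uncertain value (for example one backed by an empirical sample) would fit the same scheme without changes to the combining code. This mirrors, in spirit only, the role the abstract assigns to multiple dispatch in the Julia package.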

Publication
Journal of Open Source Software 8: 11520