P. Terán, M. López-Díaz (2014). Computational Statistics and Data Analysis 77, 130-145.
This paper continues work from my doctoral dissertation, which is why it is joint work with my Ph.D. advisor. It was in preparation when I moved to Zaragoza and, with the distance, it was never completed. Some years later we went back to it, adding some interesting simulations to the theoretical results and, eventually, an example with real data.
It is about approximating an unknown fuzzy set from random samples taken from alpha-cuts of the fuzzy set, where the alpha-cuts are themselves randomly sampled. Through the connection between fuzzy sets and nested random sets, the problem can also be recast as estimating a set conditionally on the value of another variable, when the set depends monotonically on the value of that variable.
We give rates of convergence for the approximants as a function of both sample sizes, in several metrics between fuzzy sets. Simulations suggest that sample sizes of 20-30 may be enough for the rate to be reliable. We present an example with breast cancer data, studying the range of the variable 'cell size' as a function of 'shape compactness', a measure of cell irregularity.
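To make the two-level sampling concrete, here is a minimal Python sketch. It is only a toy illustration and not the approximation scheme of the paper. It assumes a triangular fuzzy set on the real line, uniform sampling of levels and points, and a simple pooled-hull estimator that uses only the nestedness of the alpha-cuts. The function names and the two sample sizes `n_levels` and `n_points` are hypothetical.

```python
import random


def true_cut(alpha):
    """Alpha-cut of the (assumed) triangular fuzzy set mu(x) = max(0, 1 - |x|)."""
    half_width = 1.0 - alpha
    return (-half_width, half_width)


def draw_samples(n_levels, n_points, rng):
    """Sample n_levels random levels, then n_points uniform points from each cut."""
    samples = []
    for _ in range(n_levels):
        alpha = rng.random()
        lo, hi = true_cut(alpha)
        points = [rng.uniform(lo, hi) for _ in range(n_points)]
        samples.append((alpha, points))
    return samples


def estimate_cut(samples, alpha):
    """Illustrative estimator (an assumption, not the paper's approximant).

    Cuts are nested, so every point drawn from a cut at level >= alpha
    also lies in the alpha-cut. Pool those points and take their interval hull.
    Returns None when no sampled level reaches alpha.
    """
    pooled = [x for level, pts in samples if level >= alpha for x in pts]
    if not pooled:
        return None
    return (min(pooled), max(pooled))


rng = random.Random(0)
samples = draw_samples(n_levels=30, n_points=30, rng=rng)
for alpha in (0.25, 0.5, 0.75):
    print(alpha, true_cut(alpha), estimate_cut(samples, alpha))
```

By construction the estimated intervals sit inside the true cuts and are nested in alpha. How fast they fill the true cuts as both sample sizes grow is the kind of question the rates in the paper address.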
Up the line:
·P. Terán, M. López-Díaz (2004). A random approximation of set valued càdlàg functions. J. Math. Anal. Appl. 298, 352-362. (You can download it for free at the journal's site.)
Down the line:
Nothing so far. One referee wanted us to study more metrics, another wanted different approximation schemes.