Diffusion-based kernel density estimator (diffKDE).

Pelz, Maria-Theresia and Slawig, Thomas (2024) Diffusion-based kernel density estimator (diffKDE). Open Access DOI 10.5281/zenodo.7594914.

13736609.zip - Published Version
Available under License Creative Commons: Attribution 4.0.

Download (51kB)

Abstract

The diffKDE package provides a new algorithm for a diffusion-based kernel density estimator (diffKDE) (Chaudhuri and Marron, 2000; Botev et al., 2010) as a Python tool. It offers a function that calculates the diffKDE from 1-dimensional data as the solution of the diffusion equation with Neumann boundary conditions and an initial value constructed from the delta distribution of the input data. The implementation is based on an equidistant finite differences discretization in space and time, two pilot estimation steps, and a new approximation of the optimal bandwidth. For the diffKDE, the bandwidth parameter equals the positive square root of the final iteration time. The delta distribution in the initial value is approximated by a Dirac sequence (Hirsch and Lacombe, 1999). The pilot estimation steps are themselves simplified diffKDEs and are incorporated in the approximation of the optimal bandwidth. One pilot additionally serves as a parameter function in the diffusion equation, as suggested by Botev et al. (2010). The bandwidths for the pilot estimates are chosen with the data-driven approaches of Silverman (1986). The optimal bandwidth for the diffKDE is a direct approximation of its analytical optimal solution, using the second pilot as an approximation of the true probability density. Furthermore, the package provides functions for visual output of the first pilot estimate, of the time evolution of the diffKDE solution, and for an interactive exploration of the different degrees of smoothing of the diffKDE at different bandwidths. The latter can be used to identify individual bandwidths for specific purposes. The diffKDE function requires a 1-dimensional data set. Optional parameters are the lower and upper spatial boundaries, the numbers of spatial and temporal discretization intervals, and a fixed final iteration time.
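The core idea of the abstract can be illustrated with a minimal, self-contained sketch. It is not the diffKDE package's code or API. The function names `silverman_bandwidth` and `diffusion_kde_sketch` are hypothetical, and several simplifications are assumed: the diffusion equation is taken in the form u_t = 1/2 u_xx (so that the bandwidth equals the square root of the final time), the delta-distribution initial value is approximated by a normalized histogram on the grid, time stepping uses implicit Euler on a cell-centered grid, and a single Silverman-type rule of thumb stands in for the two pilot steps and the optimal-bandwidth approximation. No pilot is used as a parameter function in the equation.

```python
import numpy as np


def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth after Silverman (1986) for a Gaussian kernel."""
    data = np.asarray(data, dtype=float)
    n = data.size
    std = data.std(ddof=1)
    q75, q25 = np.percentile(data, [75, 25])
    iqr = q75 - q25
    # Use the smaller of the two spread measures; fall back to std if IQR is 0.
    spread = min(std, iqr / 1.34) if iqr > 0 else std
    return 0.9 * spread * n ** (-0.2)


def diffusion_kde_sketch(data, xmin, xmax, n_space=200, n_time=50, T=None):
    """Solve u_t = 0.5 * u_xx on [xmin, xmax] with zero-flux (Neumann) boundaries.

    Assumption: this is an illustrative simplification, not the diffKDE algorithm.
    Returns the cell centers, the density estimate, and the bandwidth sqrt(T).
    """
    data = np.asarray(data, dtype=float)
    if T is None:
        # Bandwidth h = sqrt(T), so the final time is the squared bandwidth.
        T = silverman_bandwidth(data) ** 2

    # Initial value: normalized histogram as a grid approximation of the
    # delta distribution of the data (integrates to 1 over [xmin, xmax]).
    u, edges = np.histogram(data, bins=n_space, range=(xmin, xmax), density=True)
    x = 0.5 * (edges[:-1] + edges[1:])
    dx = x[1] - x[0]
    dt = T / n_time

    # Second-difference matrix with zero-flux boundaries: its columns sum to
    # zero, so the discrete total mass sum(u) * dx is conserved.
    lap = np.zeros((n_space, n_space))
    idx = np.arange(n_space)
    lap[idx, idx] = -2.0
    lap[idx[:-1], idx[:-1] + 1] = 1.0
    lap[idx[1:], idx[1:] - 1] = 1.0
    lap[0, 0] = -1.0
    lap[-1, -1] = -1.0

    # Implicit Euler step: (I - 0.5 * dt / dx^2 * L) u_new = u_old.
    step = np.eye(n_space) - 0.5 * dt / dx**2 * lap
    for _ in range(n_time):
        u = np.linalg.solve(step, u)

    return x, u, np.sqrt(T)
```

Called as `x, u, h = diffusion_kde_sketch(data, xmin, xmax)`, the sketch returns a density estimate `u` on the grid `x` together with the bandwidth `h`. Passing a fixed `T` mimics the optional final iteration time mentioned in the abstract: larger values of `T` give smoother estimates.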

Document Type: Software
Keywords: Kernel Density Estimation; Diffusion Equation; Bandwidth Approximation
Research affiliation: OceanRep > GEOMAR > FB2 Marine Biogeochemistry > FB2-BM Biogeochemical Modeling
Main POF Topic: PT6: Marine Life
Publisher: Zenodo
Date Deposited: 07 Nov 2024 11:04
Last Modified: 07 Nov 2024 11:04
URI: https://oceanrep.geomar.de/id/eprint/60879
