--- title: "Graphic Comparison Of Discrete Kernels" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Graphic Comparison Of Discrete Kernels} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} bibliography: references.bib link-citations: TRUE --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(kernopt) ``` ## Introduction This vignette illustrates some functions provided by the *kernopt* package that implements discrete symmetric kernels, which were recently developed for modeling count data distributions. The *kernopt* package computes that kernel at a target `x` for various values of observations `z` and a fixed bandwidth `h` by using the function `discrete_kernel()` included in this package. ## Discrete Epanechnikov Kernel For a positive integer $h\in\mathbb{N}\setminus\{0\}$, the first example is a discrete symmetric *Epanechnikov* kernel with $\mathcal{S}_{x}=\{x,...,x\pm h\}$ given by: \begin{equation} K^{Epan}_{x,h}(y)= \left(a\bigg(\frac{x-y}{h}\bigg)^{2} + b\right){\bf 1}_{\{x,x\pm 1,...,x\pm h\}}(y), \end{equation} with $a=-b=3h/(1-4h^{2})$ [@chu2017] That kernel only involves the parameter $h$, which is a positive integer that scales the distance between $x$ and $y$. The discrete *Epanechnikov* kernel is less often used in the literature, compared to its continuous counterpart. Below is an illustration of the discrete *Epanechnikov* kernel for different values of `h` using the `discrete_kernel()`function. ```{r, fig.cap='Distribution of discrete symmetric Epanechnikov kernel at the target point $x=5$ for various bandwidth parameters $h$.'} x <- 5 # Target z <- 0:10 # Observations (?) h <- c(1, 2, 3, 4) # Set of Bandwidths K_epan <- matrix( data = 0, nrow = length(z), ncol = length(h) ) for (i in 1:length(h)) { K_epan[, i] <- discrete_kernel(kernel = "epanech", x, z, h[i]) } plot( z, K_epan[, 1], xlab = "x", ylab = "Probability", ylim = c(0, 1), pch = 1 ) lines(z, K_epan[, 1], lty = 1) for (i in 2:length(h)) { points(z, K_epan[, i], xlab = "z", pch = i) lines(z, K_epan[, i], lty = i) } legend( "topleft", c("h=1", "h=2", "h=3", "h=4"), lty = 1:4, pch = 1:4, cex = 1.6 ) ``` ## Discrete Triangular Kernel For $h>0$ and non-negative integer $a\in\mathbb{N}$, the second example is the discrete symmetric triangular kernel with $\mathcal{S}_{x}=\{x,x\pm 1,...,x\pm a\}$, which has a pmf defined by \begin{equation*}\label{eq_TriangKern} K^{Triang}_{a;x,h}(y)= \frac{(a+1)^{h}}{C(a,h)}\left(1 - \frac{|y-x|^{h}}{(a+1)^{h}}\right){\bf 1}_{\{x,x\pm 1,...,x\pm a\}}(y), \end{equation*} with $C(a,h)=(2a+1)(a+1)^{h} - 2\sum_{j=0}^{a}j^{h}$ [@KokonZocSenga2007]. That kernel depends on two parameters $h$ and $a$. That kernel was implemented in the \pkg{kernopt} package through the function \fct{discrete\_triang} with its specific arguments as follows: ```{r, fig.cap="Distribution of discrete symmetric triangular kernel ($a=1$) at the target point $x=5$ for various bandwidth parameters $h$."} x <- 5 z <- 0:10 h <- c(0.1, 0.4, 1, 2) a <- 1 K_trg <- matrix( data = 0, nrow = length(z), ncol = length(h) ) for (i in 1:length(h)) { K_trg[, i] <- discrete_kernel(kernel = "triang", x, z, h[i], a) } plot( z, K_trg[, 1], xlab = "x", ylab = "Probability", ylim = c(0, 1), pch = 1 ) lines(z, K_trg[, 1], lty = 1) for (i in 2:length(h)) { points(z, K_trg[, i], xlab = "z", pch = i) lines(z, K_trg[, i], lty = i) } legend( "topleft", c("h=0.1", "h=0.4", "h=1", "h=2"), lty = 1:4, pch = 1:4, cex = 1.6 ) ``` The discrete symmetric triangular kernel is also available in the *Ake* package in R [@Wansouwe2016]. ## Discrete Optimal Kernel The Discrete Optimal Kernel was proposed in [@SengaDurrieu2024]. For $x\in \mathcal{S}$ and a fixed integer $k\geq1$, the discrete symmetric "optimal" kernel on $\mathcal{S}_{x}=\{x,x\pm 1,\ldots,x\pm k\}$ that minimises the mean integrate squared error of the estimator $\widehat{f}_{K,h}$ in (\ref{eq:estim_fn}) is given by \begin{equation*}\label{eq:OptKern_k} K_{k;x,h}^{opt}(y) =\lambda\bigg(\frac{3k^{2}+3k-1}{5} -(x-y)^{2} \bigg)+\frac{h}{2k+1}, y\in\mathcal{S}_{x}, \end{equation*} where $\lambda=15(1-h)/\big((2k+1)(4k^{2}+4k-3)\big)>0$ and $3/5(1-1/k)