Open Access Article
A New Estimator of Kullback–Leibler Divergence via Shannon Entropy
1 Department of Statistics, Faculty of Science, Cumhuriyet University, 58140 Sivas, Türkiye
2 Department of Mathematics, Linköping University, 581 83 Linköping, Sweden
* Author to whom correspondence should be addressed.
Entropy 2026, 28(7), 720; https://doi.org/10.3390/e28070720 (registering DOI)
Submission received: 30 April 2026 / Revised: 15 June 2026 / Accepted: 22 June 2026 / Published: 24 June 2026
(This article belongs to the Section Information Theory, Probability and Statistics)
Abstract
We examine the estimation of the Kullback–Leibler (KL) divergence and its use in a goodness-of-fit test for multivariate normality. Our starting point is the maximum entropy principle for Shannon entropy: among all distributions with a fixed mean vector and covariance matrix, the multivariate Gaussian distribution uniquely maximizes entropy. As a result, the KL divergence between an unknown density and its moment-matched Gaussian distribution can be written as an entropy difference, which is a natural information-theoretic measure of departure from Gaussianity. For estimation, we use k-nearest neighbor (kNN) estimators of Shannon entropy and KL divergence derived from the Kozachenko–Leonenko approach and its subsequent refinements, along with the consistency and -convergence results established for these estimators. Motivated by earlier entropy-based goodness-of-fit ideas developed for Rényi-type functionals under generalized Gaussian and Student-type models, we define a KL-based test statistic as the difference between the entropy of a Gaussian model fitted to the sample mean and covariance and the kNN estimate of the unknown entropy. The statistic converges to zero under multivariate normality and to a strictly positive bound under non-Gaussian alternatives. Monte Carlo simulations across various dimensions and sample sizes indicate that the proposed method provides accurate Type I error control and demonstrates promising empirical power against the alternatives considered.
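
The entropy-difference idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the common digamma form of the Kozachenko–Leonenko estimator, and the function names (`knn_entropy`, `gaussian_entropy`, `kl_statistic`) and the default `k=3` are hypothetical choices made for illustration only. The estimator variant, the choice of k, and the calibration of critical values used in the article may differ.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln


def knn_entropy(x, k=3):
    """Kozachenko-Leonenko-type kNN estimate of Shannon entropy, in nats.

    Assumes continuous data without duplicate points; a zero neighbor
    distance would make the logarithm diverge.
    """
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    tree = cKDTree(x)
    # Ask for k + 1 neighbors: each point is its own nearest neighbor.
    dist, _ = tree.query(x, k=k + 1)
    rho = dist[:, k]  # Euclidean distance to the k-th genuine neighbor
    # Log-volume of the unit ball in R^d: pi^(d/2) / Gamma(d/2 + 1).
    log_unit_ball = 0.5 * d * np.log(np.pi) - gammaln(0.5 * d + 1.0)
    return digamma(n) - digamma(k) + log_unit_ball + d * np.mean(np.log(rho))


def gaussian_entropy(x):
    """Entropy of the Gaussian matched to the sample covariance, in nats."""
    x = np.asarray(x, dtype=float)
    _, d = x.shape
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)


def kl_statistic(x, k=3):
    """Fitted-Gaussian entropy minus the kNN entropy estimate.

    This estimates the KL divergence of the unknown density from its
    moment-matched Gaussian: near zero under normality, positive otherwise.
    """
    return gaussian_entropy(x) - knn_entropy(x, k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gauss = rng.standard_normal((2000, 2))
    unif = rng.uniform(size=(2000, 2))
    print("Gaussian sample:", kl_statistic(gauss))
    print("Uniform sample: ", kl_statistic(unif))
```

On a Gaussian sample the statistic should fluctuate around zero, while on a non-Gaussian sample such as the uniform one it should settle at a positive value. Turning it into a test additionally requires critical values, for example from Monte Carlo simulation under the null, which this sketch does not attempt.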
