IEEE ISIT 2024 || Athens, Greece || 7-12 July 2024

MO3.R9.1

Missing Mass under Random Duplications

Prafulla Chandra, Andrew Thangaraj, IIT Madras, India

Session:

Statistical Estimation and Detection

Track:

11: Information Theory and Statistics

Location:

Lamda

Presentation Time:

Mon, 8 Jul, 14:35 - 14:55

Session Chair:

Shirin Jalali, Rutgers

Abstract

Learning a distribution from error-prone or non-ideal sampling modeled as "duplication" or "sticky" channels has been of interest recently inspired by applications such as DNA computing. Missing mass, the total probability of missing letters, is an important quantity that plays a crucial role in distribution estimation, specifically in the large alphabet setting. In this work, we consider the problem of estimation of missing mass, which has been well-understood under independent and identically distributed (i.i.d) sampling, in the case when the samples involve duplications. Precisely, we consider the setting where each sample from an unknown distribution gets repeated a Bernoulli-distributed number of times creating a hidden Markov memory in the samples. We characterize the minimax rate of Mean Squared Error (MSE) of estimating missing mass from such elementary duplication sampling channels. An upper bound on the minimax rate is obtained by bounding the risk of a proposed estimator. We derive a matching lower bound by reduction to the i.i.d setting.

Session MO3.R9

MO3.R9.1: Missing Mass under Random Duplications

Prafulla Chandra, Andrew Thangaraj, IIT Madras, India

MO3.R9.2: Low-rank Matrix Sensing With Dithered One-Bit Quantization

Farhang Yeganegi, Arian Eamaz, Mojtaba Soltanalian, University of Illinois Chicago, United States

MO3.R9.3: Sharper rates of convergence for the tensor graphical Lasso estimator

Shuheng Zhou, University of California, United States; Kristjan Greenewald, IBM Research, United States

MO3.R9.4: Detection of Signals in Colored Noise: Leading Eigenvalue Test for Non-central F-matrices

Prathapasinghe Dharmawansa, University of Melbourne, Australia; Saman Atapattu, RMIT University, Australia; Jamie Evans, University of Melbourne, Australia; Kandeepan Sithamparanathan, RMIT University, Australia

Resources

View Manuscript