Near-optimal algorithms for private estimation and sequential testing of collision probability

Near-optimal algorithms for private estimation and sequential testing of collision probability










arXiv:2504.13804v1 Announce Type: new
Abstract: We present new algorithms for estimating and testing emph{collision probability}, a fundamental measure of the spread of a discrete distribution that is widely used in many scientific fields. We describe an algorithm that satisfies $(alpha, beta)$-local differential privacy and estimates collision probability with error at most $epsilon$ using $tilde{O}left(frac{log(1/beta)}{alpha^2 epsilon^2}right)$ samples for $alpha le 1$, which improves over previous work by a factor of $frac{1}{alpha^2}$. We also present a sequential testing algorithm for collision probability, which can distinguish between collision probability values that are separated by $epsilon$ using $tilde{O}(frac{1}{epsilon^2})$ samples, even when $epsilon$ is unknown. Our algorithms have nearly the optimal sample complexity, and in experiments we show that they require significantly fewer samples than previous methods.






Robert Busa-Fekete, Umar Syed





Go to original source





Posted

in

, , ,

by