conference paper

On Optimal Two Sample Homogeneity Tests for Finite Alphabets

Unnikrishnan, Jayakrishnan  
2012
Information Theory Proceedings (ISIT), 2012 IEEE International Symposium on
2012 IEEE International Symposium on Information Theory (ISIT)

Suppose we are given two independent strings of data from a known finite alphabet. We are interested in testing the null hypothesis that both strings were drawn from the same distribution, assuming that the samples within each string are mutually independent. Among statisticians, the most popular solution for such a homogeneity test is the two-sample chi-square test, primarily because it is easy to implement and because the limiting null distribution of its test statistic is known and easy to compute. Although tests that are asymptotically optimal in error probability have been proposed in the information theory literature, these optimality results are not well known and the tests are rarely used in practice. In this paper we seek to bridge the gap between theory and practice. We study two different optimal tests proposed by Shayevitz [1] and Gutman [2]. We first obtain a simplified structure of Shayevitz’s test and then obtain the limiting distributions of the test statistics used in both tests. These results provide guidelines for choosing thresholds that guarantee an approximate false alarm constraint for finite-length observation sequences, thus making these tests easy to use in practice. The approximation accuracies are demonstrated using simulations. We argue that such homogeneity tests with provable optimality properties could potentially be better choices than the chi-square test in practice.
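For orientation, the baseline that the abstract compares against, the two-sample chi-square homogeneity test, can be sketched in a few lines of Python. This is an illustrative sketch of the textbook Pearson statistic only, not the tests of Shayevitz or Gutman studied in the paper. The function name `two_sample_chi_square` and the example strings are assumptions made for illustration.

```python
from collections import Counter


def two_sample_chi_square(x, y, alphabet):
    """Pearson chi-square statistic for homogeneity of two strings.

    Hypothetical helper for illustration; returns (statistic, degrees_of_freedom).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both strings must be non-empty")
    counts_x, counts_y = Counter(x), Counter(y)
    n = n1 + n2
    statistic = 0.0
    observed_symbols = 0
    for symbol in alphabet:
        total = counts_x[symbol] + counts_y[symbol]
        if total == 0:
            continue  # symbol never observed in either string; skip it
        observed_symbols += 1
        # Expected counts under the null hypothesis of a common distribution.
        expected_x = n1 * total / n
        expected_y = n2 * total / n
        statistic += (counts_x[symbol] - expected_x) ** 2 / expected_x
        statistic += (counts_y[symbol] - expected_y) ** 2 / expected_y
    return statistic, observed_symbols - 1


if __name__ == "__main__":
    stat, dof = two_sample_chi_square("aaaabbbbcc", "aabbbbcccc", "abc")
    print(f"statistic = {stat:.4f}, degrees of freedom = {dof}")
```

Under the null hypothesis the statistic is approximately chi-square distributed with the returned degrees of freedom, so a threshold for a target false alarm rate is read off that distribution. In the example the statistic is 4/3 with 2 degrees of freedom, well below the 5% critical value of about 5.991, so the null hypothesis would not be rejected.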

Type: conference paper
DOI: 10.1109/ISIT.2012.6283716
Web of Science ID: WOS:000312544302025
Author(s): Unnikrishnan, Jayakrishnan
Date Issued: 2012
Publisher: IEEE
Publisher place: New York
Published in: Information Theory Proceedings (ISIT), 2012 IEEE International Symposium on
ISBN of the book: 978-1-4673-2579-0
Total of pages: 5
Series title/Series vol.: IEEE International Symposium on Information Theory
Start page: 2027
End page: 2031
Subjects: Hypothesis testing • homogeneity tests • p-value
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: LCAV
Event name: 2012 IEEE International Symposium on Information Theory (ISIT)
Event place: Boston, Massachusetts, USA
Event date: July 1-6, 2012
Available on Infoscience: October 17, 2012
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/86182