Spatial Inference in Large-Scale Sensor Networks using Multiple Hypothesis Testing and Bayesian Clustering

School of Electrical Engineering
Signal, Speech and Language Processing
Degree programme
CCIS - Master’s Programme in Computer, Communication and Information Sciences (TS2013)
In this thesis, we address the problem of statistical inference in large-scale sensor networks observing spatially varying fields. First, we revisit traditional single-sensor hypothesis testing. We then present a multiple hypothesis testing framework to model spatial fields occurring in a multitude of practical signal processing applications. Observing and monitoring phenomena that occur within a spatial field is essential to a variety of applications. These include detecting occupied radio spectrum in shared-spectrum environments, identifying regions of poor air quality in environmental monitoring, smart buildings, and various Internet of Things (IoT) applications. Many of these practical problems can be modeled using a multiple hypothesis testing framework, with the goal of identifying homogeneous spatial regions within which a defined null hypothesis (e.g., pollution remaining at a tolerable level, radio spectrum being unoccupied) holds, and regions where alternative hypotheses are true. These regions can be formed by assessing observations made by multiple sensors placed at distinct locations. To scale to large sensor networks, we propose computing local test statistics, such as p-values, at each individual sensor, which avoids the communication overhead of a large number of sensors exchanging their raw measurement data. The individual test statistics are sent to a Fusion Center (FC), which performs the inference. At the FC, statistical inference is performed with a proposed method referred to as “Spatial Inference based on Clustering of p-values (SPACE-COP)”, which uses multiple hypothesis testing and Bayesian clustering to detect phenomena of interest occurring within the spatial field. The method identifies homogeneous regions in a field based on similarity in the decision statistics and locations of the sensors.
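The idea of computing a local p-value at each sensor and forwarding only that summary to the FC can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the null hypothesis is "zero mean signal in Gaussian noise with known standard deviation `sigma`", the test is a one-sided z-test, and the names `local_p_value` and `sensor_report` are hypothetical.

```python
import math


def local_p_value(samples, sigma):
    """One-sided z-test p-value for H0: mean == 0 against H1: mean > 0.

    Assumes i.i.d. Gaussian noise with known standard deviation sigma
    (an illustrative assumption, not the thesis's sensor model).
    """
    n = len(samples)
    sample_mean = sum(samples) / n
    # Standardized test statistic: N(0, 1) distributed under H0.
    z = sample_mean * math.sqrt(n) / sigma
    # Upper-tail probability of the standard normal: 1 - Phi(z).
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def sensor_report(location, samples, sigma):
    """Message a sensor would send to the fusion center.

    Only the location and one scalar p-value are transmitted,
    never the raw measurement vector.
    """
    return {"location": location, "p_value": local_p_value(samples, sigma)}


if __name__ == "__main__":
    # Hypothetical sensor at (3.0, 4.5) with four raw measurements.
    print(sensor_report((3.0, 4.5), [1.2, 0.7, 1.1, 0.9], sigma=1.0))
```

The payload per sensor is a location and one number regardless of how many raw samples were collected, which is the communication saving the abstract refers to.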
The number of clusters, each of which is associated with a hypothesis, is determined by a newly derived Bayesian cluster enumeration criterion based on the statistical model derived in this work. An EM algorithm is developed to compute the probabilities that associate sensors with clusters. We present two decision criteria: one for maximum performance (SPACE-COP) and one for control of false discoveries (FDR SPACE-COP). The performance of the proposed methods is studied in a series of simulation examples and compared to competitors from the literature. Simulation results demonstrate the validity of the proposed SPACE-COP methods even in cases where the assumption on the underlying spatial shape of the alternative areas is clearly violated and the true alternative areas follow arbitrary, even non-convex, shapes. In summary, the derived algorithms are applicable to large-scale sensor networks for performing statistical inference and identifying homogeneous regions in an observed phenomenon or field where the null hypothesis does not hold.
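For context on what controlling false discoveries means at a fusion center, the classical Benjamini-Hochberg step-up procedure is sketched below. This is a generic textbook baseline and explicitly not the FDR SPACE-COP criterion of the thesis; it ignores sensor locations entirely. The function name `benjamini_hochberg` and the default level `alpha=0.05` are illustrative choices.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the null hypothesis of the
    corresponding sensor is rejected at nominal FDR level alpha.
    Generic baseline only; it does not use spatial information.
    """
    m = len(p_values)
    # Sensor indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values reported by four sensors.
    print(benjamini_hochberg([0.001, 0.8, 0.02, 0.5], alpha=0.05))
```

A procedure of this kind decides per sensor, whereas the abstract describes decisions made on clusters of sensors formed from both p-values and locations.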
Koivunen, Visa
Thesis advisor
Koivunen, Visa
Muma, Michael
inference, hypothesis testing, clustering, p-values, sensor networks, Internet of Things