Sensory Data Clustering using Probabilistic Topic Models
Human activity recognition (HAR) is a field of research that increasingly benefits from the data recorded by wearable inertial sensors. HAR aims to automatically identify the activity a person is carrying out on the basis of information recorded about the person and their environment. Although there are many approaches to HAR, existing methods focus on categorizing sensor sequences into predefined activity classes and provide no information that helps to understand the content of the sequences themselves. This thesis proposes an approach to HAR that discovers hidden structures describing sensor sequences. These latent structures make it possible to represent sequences in a way that a human can understand. Latent Dirichlet Allocation (LDA) and its Gaussian variant, Gaussian LDA, both topic-extraction methods generally used in text analysis, are applied. Sensor sequences are converted into word-like patterns to which LDA can be applied. In addition, numerical features are extracted from the sequences for use with Gaussian LDA.

Experiments carried out on a HAR data set show that topic models are capable of uncovering underlying structures in activity sequences. These latent structures provide a human-understandable representation of the sequences. The evaluation results reveal that topic models, although unsupervised, are capable of recognizing activities with an accuracy comparable to that achieved by supervised algorithms. Furthermore, the sequence representations produced by topic models are meaningful features for sequence classification.
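The abstract does not say how sensor sequences are turned into word-like patterns, so the following is only a minimal sketch of one plausible scheme, not the method used in the thesis. It assumes a one-dimensional signal, a fixed value range, uniform quantization into letters, and overlapping fixed-length words; the function names, parameters, and toy values are all hypothetical.

```python
from collections import Counter


def signal_to_words(signal, n_bins=4, word_len=3, lo=-1.0, hi=1.0):
    """Quantize each sample into one of n_bins letters, then slide a window
    of word_len letters over the result to form overlapping 'words'.
    (Assumed scheme for illustration only.)"""
    alphabet = "abcdefghijklmnopqrstuvwxyz"[:n_bins]
    width = (hi - lo) / n_bins
    letters = []
    for x in signal:
        # Clamp the sample to [lo, hi], then map it to a bin index.
        idx = int((min(max(x, lo), hi) - lo) / width)
        letters.append(alphabet[min(idx, n_bins - 1)])
    return ["".join(letters[i:i + word_len])
            for i in range(len(letters) - word_len + 1)]


def bag_of_words(documents):
    """Build a sorted vocabulary and a document-term count matrix."""
    vocab = sorted({w for doc in documents for w in doc})
    index = {w: j for j, w in enumerate(vocab)}
    counts = []
    for doc in documents:
        row = [0] * len(vocab)
        for word, c in Counter(doc).items():
            row[index[word]] = c
        counts.append(row)
    return vocab, counts


# Two toy "sensor sequences" with made-up values.
still = [0.0, 0.05, -0.05, 0.0, 0.05, 0.0]
moving = [-0.9, 0.9, -0.9, 0.9, -0.9, 0.9]

docs = [signal_to_words(s) for s in (still, moving)]
vocab, counts = bag_of_words(docs)
```

Each sequence plays the role of a document, and `counts` has the document-term shape that LDA implementations typically expect as input (for example scikit-learn's `LatentDirichletAllocation`). The numerical features used for Gaussian LDA would take a different path and are not covered by this sketch.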
30.01.20 - 10:15