Formal concept analysis is a means to explore and understand data. Unlike typical clustering methods, it orders objects hierarchically (more precisely: heterarchically) into concept lattices according to the properties these objects share.
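To make the notion of a concept concrete, the following minimal sketch enumerates all formal concepts of a tiny context. The toy data and the function names (`CONTEXT`, `common_attributes`, `objects_having`, `formal_concepts`) are illustrative assumptions, and the brute-force enumeration is only meant for very small inputs; it is not one of the efficient algorithms from the literature.

```python
from itertools import combinations

# Hypothetical toy context: each object is mapped to the properties it has.
CONTEXT = {
    "duck": {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "trout": {"swims", "lays_eggs"},
    "bat": {"flies"},
}


def common_attributes(objects, context):
    """Properties shared by all given objects (all properties if none are given)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def objects_having(attributes, context):
    """Objects that have every one of the given properties."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Brute force: derive a concept from every subset of objects.

    A concept is a pair (extent, intent) in which the extent is exactly the
    set of objects sharing the intent, and vice versa. The loop visits all
    2^n object subsets, so it is exponential in the number of objects.
    """
    concepts = set()
    objects = list(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((extent, intent))
    return concepts


def is_subconcept(lower, upper):
    """Lattice order: a concept is below another if its extent is a subset."""
    return lower[0] <= upper[0]


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
```

In this sketch, a concept such as ({duck, eagle}, {flies, lays_eggs}) sits below ({duck, eagle, bat}, {flies}) in the lattice, because its extent is a subset of the other's.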
One drawback of data analysis with formal concept analysis is that the number of concepts can grow exponentially in the worst case, although this rarely occurs in practice (Cimiano et al. 2005). Even so, the resulting lattice usually has to be made more compact before the analysis can be understood. Stumme et al. (2002) proposed rendering only the popular concepts, i.e. those whose properties are shared by many objects. This has the disadvantage that too much emphasis is put on the more abstract concepts. In this thesis the idea is to use information-theoretic means (cf. Resnik 1999) to select the “most important” concepts required to understand the overall analysis.
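The contrast between the two selection strategies can be sketched as follows. The sketch assumes that a concept's probability is estimated as the fraction of objects in its extent and that its information content is the negative logarithm of that probability, in analogy to Resnik's measure; the actual criterion to be developed in the thesis is left open. The concept list and the function names are hypothetical.

```python
import math

# Hypothetical concepts of a four-object context, given as (extent, intent).
N_OBJECTS = 4
CONCEPTS = [
    ({"duck", "eagle", "trout", "bat"}, set()),
    ({"duck", "eagle", "bat"}, {"flies"}),
    ({"duck", "eagle", "trout"}, {"lays_eggs"}),
    ({"duck", "eagle"}, {"flies", "lays_eggs"}),
    ({"duck", "trout"}, {"swims", "lays_eggs"}),
    ({"duck"}, {"flies", "swims", "lays_eggs"}),
]


def support(extent, n_objects):
    """Fraction of all objects that belong to the extent."""
    return len(extent) / n_objects


def iceberg(concepts, n_objects, min_support):
    """Keep only concepts whose support reaches the threshold."""
    return [c for c in concepts if support(c[0], n_objects) >= min_support]


def information_content(extent, n_objects):
    """Assumed measure: -log2 of the support; infinite for an empty extent."""
    p = support(extent, n_objects)
    return math.inf if p == 0 else -math.log2(p)


if __name__ == "__main__":
    frequent = iceberg(CONCEPTS, N_OBJECTS, min_support=0.5)
    print(len(frequent), "of", len(CONCEPTS), "concepts are frequent")
    for extent, intent in CONCEPTS:
        ic = information_content(extent, N_OBJECTS)
        print(sorted(intent), "IC =", ic)
```

Under these assumptions the support threshold discards the most specific concept first, whereas the information content is lowest for the most general concept and grows as concepts become more specific, so the two criteria pull in opposite directions.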
- Philipp Cimiano, Andreas Hotho, Steffen Staab: Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis. J. Artif. Intell. Res. 24: 305-339 (2005).
- Gerd Stumme, Rafik Taouil, Yves Bastide, Nicolas Pasquier, Lotfi Lakhal: Computing iceberg concept lattices with TITANIC. Data Knowl. Eng. 42(2): 189-222 (2002).
- Philip Resnik: Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language. J. Artif. Intell. Res. 11: 95-130 (1999).