Building Concept Lattices from
Numerical Data with Pattern Structures
A tutorial
Mehdi Kaytoue and Amedeo Napoli
mehdi.kaytoue@insa-lyon.fr
November 9th, 2010
Context
Formal Concept Analysis
Works on binary relations
Classification of objects w.r.t. the attributes they have in
common within formal concepts (extent, intent)
Ordering concepts gives a mathematical structure
Ganter & Wille, Formal Concept Analysis: Mathematical Foundations, Springer, 1999
Concept lattice : useful for many tasks
Simultaneous classification of objects and their attributes
Information organization
Knowledge discovery in databases (closed itemsets,
association rules)
Information retrieval
...
Valtchev et al., ICFCA 04 – Wille, JETAI 02
2 / 32
Problem and proposition
What to do when facing numerical data?
Transforming the data into binary form is a general problem
Conceptual scaling (binarization)
Important choices to be made
Loss of information, of links between objects
Can binarization be avoided? Can a similarity relation
between values be considered instead?
3 / 32
Outline
1 Formal Concept Analysis
2 Pattern structures and intervals
3 Introducing a similarity relation
4 Conclusion
Formal context
Given by (G, M, I) with
G a set of objects
M a set of attributes
I a binary relation between objects and attributes :
(g, m) ∈ I means that “object g owns attribute m”
Represented by a binary table
     m1   m2   m3
g1   ×         ×
g2   ×    ×
g3        ×    ×
g4        ×    ×
g5   ×    ×    ×
G = {g1, . . . , g5}
M = {m1, m2, m3}
(g1, m3) ∈ I
5 / 32
Galois connection
Two derivation operators forming a Galois connection
The first gives the set of attributes common to a set of
objects A ⊆ G
A′ = {m ∈ M | ∀g ∈ A : (g, m) ∈ I}
The second gives the set of objects owning all attributes in B ⊆ M
B′ = {g ∈ G | ∀m ∈ B : (g, m) ∈ I}
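As a minimal sketch, the two derivation operators can be written over a dictionary encoding of the binary table. The crosses below are placed as implied by the examples in this deck ({g1}′ = {m1, m3}, {m1, m3}′ = {g1, g5}, and the concept ({g1, g2, g5}, {m1})); the names CONTEXT, common_attributes and common_objects are illustrative assumptions, not part of the tutorial.

```python
# Formal context: each object is mapped to the set of attributes it owns.
# Assumption: rows reconstructed from the examples given in the slides.
CONTEXT = {
    "g1": {"m1", "m3"},
    "g2": {"m1", "m2"},
    "g3": {"m2", "m3"},
    "g4": {"m2", "m3"},
    "g5": {"m1", "m2", "m3"},
}
ATTRIBUTES = {"m1", "m2", "m3"}


def common_attributes(objects):
    """A' : attributes shared by every object in `objects`."""
    result = set(ATTRIBUTES)
    for g in objects:
        result &= CONTEXT[g]
    return result


def common_objects(attributes):
    """B' : objects owning every attribute in `attributes`."""
    wanted = set(attributes)
    return {g for g, owned in CONTEXT.items() if wanted <= owned}


print(sorted(common_attributes({"g1"})))     # ['m1', 'm3']
print(sorted(common_objects({"m1", "m3"})))  # ['g1', 'g5']
```

With an empty set of objects, common_attributes returns all of M, which matches the usual convention for the derivation of ∅.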
6 / 32
Formal concepts
Given by a pair (A, B) with A ⊆ G and B ⊆ M
such that A′ = B and B′ = A
A is the concept extent
B is the concept intent
Illustration
{g1}′ = {m1, m3}
{m1, m3}′ = {g1, g5}
{g1, g5}′ = {m1, m3}
     m1   m2   m3
g1   ×         ×
g2   ×    ×
g3        ×    ×
g4        ×    ×
g5   ×    ×    ×
({g1, g5}, {m1, m3}) is a formal concept
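To make the definition concrete, here is a naive sketch that lists every formal concept of the small context by closing each subset of attributes. It is exponential in |M| and only meant for illustration; the context encoding and the function names are assumptions introduced here, with the crosses placed as implied by the examples in the deck.

```python
from itertools import combinations

# Assumption: rows reconstructed from the examples given in the slides.
CONTEXT = {
    "g1": {"m1", "m3"},
    "g2": {"m1", "m2"},
    "g3": {"m2", "m3"},
    "g4": {"m2", "m3"},
    "g5": {"m1", "m2", "m3"},
}
ATTRIBUTES = ("m1", "m2", "m3")


def extent(attributes):
    """Objects owning every attribute in `attributes`."""
    wanted = set(attributes)
    return frozenset(g for g, owned in CONTEXT.items() if wanted <= owned)


def intent(objects):
    """Attributes shared by every object in `objects`."""
    result = set(ATTRIBUTES)
    for g in objects:
        result &= CONTEXT[g]
    return frozenset(result)


def all_concepts():
    """Close every attribute subset; duplicates collapse in the set."""
    concepts = set()
    for size in range(len(ATTRIBUTES) + 1):
        for attrs in combinations(ATTRIBUTES, size):
            a = extent(attrs)
            concepts.add((a, intent(a)))
    return concepts


CONCEPTS = all_concepts()
for a, b in sorted(CONCEPTS, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(a), sorted(b))
```

Each pair printed satisfies both conditions of the definition: the intent is the derivation of the extent, and the extent is the derivation of the intent.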
7 / 32
Concept lattice
Ordering relation on concepts
(A1, B1) ≤ (A2, B2) ⇔ A1 ⊆ A2 (⇔ B2 ⊆ B1)
({g1, g5}, {m1, m3}) ≤ ({g1, g2, g5}, {m1})
Concept lattices have interesting properties
Maximality
Specialization/generalisation hierarchy
Synthetic representation of the data without loss of
information
8 / 32
Problem
How to build a lattice from numerical data ?
How to consider “similar” objects in concepts ?
     m1   m2   m3
g1   5    7    6
g2   6    8    4
g3   4    8    5
g4   4    9    8
g5   5    8    5
while avoiding discretization and associated problems
(thresholds, size of binary table, information loss,
calculability, interpretation)
9 / 32
First elements
11 / 32
How to order object descriptions
Classical case
Lattice of attribute sets (2^M, ⊆). With N, O ∈ 2^M, one has
N ⊆ O ⇐⇒ N ∩ O = N
For example, with M = {a, b}
{a} ⊆ {a, b} ⇐⇒ {a} ∩ {a, b} = {a}
Pattern case
∩ has the properties of a meet in a semilattice
It acts as a “similarity operator”: it returns a description representing
the similarity of its arguments
{a, b} ∩ {a, d} = {a}
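The link between the order and the meet can be checked directly with Python sets. This is only a small sketch; the function names similarity and is_subsumed are illustrative assumptions.

```python
def similarity(n, o):
    """Meet of two attribute sets: what the two descriptions have in common."""
    return set(n) & set(o)


def is_subsumed(n, o):
    """N ⊆ O, tested through the meet: N ∩ O = N."""
    return similarity(n, o) == set(n)


print(is_subsumed({"a"}, {"a", "b"}))     # True
print(similarity({"a", "b"}, {"a", "d"}))  # {'a'}
```

The same pattern (derive the order from the meet) is what carries over when sets are replaced by other kinds of descriptions.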
12 / 32
Pattern structures
Given by (G, (D, ⊓), δ)
G a set of objects
(D, ⊓) a meet-semilattice of object descriptions called patterns
δ a mapping associating to each object g ∈ G its
description δ(g) ∈ D
Patterns from (D, ⊓) are ordered by
c ⊑ d ⇐⇒ c ⊓ d = c, ∀c, d ∈ D
A Galois connection between (2^G, ⊆) and (D, ⊑) gives rise to
a (pattern) concept lattice
Existing FCA algorithms (based on closure computation) can
easily be adapted
Ganter & Kuznetsov, ICCS01
13 / 32
Intervals are patterns
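Let [a1, b1] and [a2, b2] be two intervals
Their meet is the smallest interval containing both
[a1, b1] ⊓ [a2, b2] = [min(a1, a2), max(b1, b2)]
[4, 4] ⊓ [5, 5] = [4, 5]
Their order is given by
[a1, b1] ⊑ [a2, b2] ⇐⇒ [a1, b1] ⊓ [a2, b2] = [a1, b1]
[4, 5] ⊑ [5, 5] ⇐⇒ [4, 5] ⊓ [5, 5] = [4, 5]
Larger intervals are therefore lower in the order: [a1, b1] ⊑ [a2, b2] ⇐⇒ [a2, b2] ⊆ [a1, b1]
Semilattice (D, ⊓), or ordered set (D, ⊑)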
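A minimal sketch of the interval meet and of the order derived from it, with intervals encoded as (low, high) tuples. The encoding and the names meet and leq are assumptions made for illustration.

```python
def meet(x, y):
    """Meet of two intervals: the smallest interval containing both."""
    (a1, b1), (a2, b2) = x, y
    return (min(a1, a2), max(b1, b2))


def leq(x, y):
    """x is below y in the pattern order when the meet of x and y is x."""
    return meet(x, y) == x


print(meet((4, 4), (5, 5)))  # (4, 5)
print(leq((4, 5), (5, 5)))   # True
print(leq((5, 5), (4, 5)))   # False
```

Note that the order runs against interval inclusion: the wider interval (4, 5) is below the narrower (5, 5).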
14 / 32
Interval vectors are patterns
Given the following two interval vectors
e = ⟨[ai, bi]⟩_{i∈[1,p]} and f = ⟨[ci, di]⟩_{i∈[1,p]}
Their meet is
e ⊓ f = ⟨[ai, bi] ⊓ [ci, di]⟩_{i∈[1,p]}
⟨[4, 4], [3, 4]⟩ ⊓ ⟨[2, 3], [2, 6]⟩ = ⟨[2, 4], [2, 6]⟩
Their order is given by
e ⊑ f ⇔ [ai, bi] ⊑ [ci, di], ∀i ∈ [1, p]
⟨[2, 4], [2, 6]⟩ ⊑ ⟨[4, 4], [3, 4]⟩ since [2, 4] ⊑ [4, 4] and [2, 6] ⊑ [3, 4]
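The component-wise definitions translate into a few lines of Python. This sketch assumes interval vectors are tuples of (low, high) pairs of equal length; the function names are illustrative.

```python
def meet_interval(x, y):
    """Smallest interval containing both x and y."""
    (a, b), (c, d) = x, y
    return (min(a, c), max(b, d))


def meet_vector(e, f):
    """Component-wise meet of two interval vectors of the same length p."""
    if len(e) != len(f):
        raise ValueError("interval vectors must have the same length")
    return tuple(meet_interval(x, y) for x, y in zip(e, f))


def leq_vector(e, f):
    """e is below f when the meet of e and f is e, i.e. on every component."""
    return meet_vector(e, f) == e


e = ((4, 4), (3, 4))
f = ((2, 3), (2, 6))
print(meet_vector(e, f))                               # ((2, 4), (2, 6))
print(leq_vector(((2, 4), (2, 6)), ((4, 4), (3, 4))))  # True
```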
15 / 32
Galois connection
Two operators
The first gives the description representing the similarity of a set of
objects
A^□ = ⨅_{g∈A} δ(g) for A ⊆ G
The second gives the maximal set of objects sharing a given
description
d^□ = {g ∈ G | d ⊑ δ(g)} for d ∈ (D, ⊓)
Ganter & Kuznetsov, ICCS01
16 / 32
Numerical data are pattern structures
     m1   m2   m3
g1   5    7    6
g2   6    8    4
g3   4    8    5
g4   4    9    8
g5   5    8    5
{g1, g2}^□ = ⨅_{g∈{g1,g2}} δ(g) = δ(g1) ⊓ δ(g2)
= ⟨[5, 5], [7, 7], [6, 6]⟩ ⊓ ⟨[6, 6], [8, 8], [4, 4]⟩
= ⟨[5, 5] ⊓ [6, 6], [7, 7] ⊓ [8, 8], [6, 6] ⊓ [4, 4]⟩
= ⟨[5, 6], [7, 8], [4, 6]⟩
17 / 32
Numerical data are pattern structures
     m1   m2   m3
g1   5    7    6
g2   6    8    4
g3   4    8    5
g4   4    9    8
g5   5    8    5
⟨[5, 6], [7, 8], [4, 6]⟩^□ = {g ∈ G | ⟨[5, 6], [7, 8], [4, 6]⟩ ⊑ δ(g)}
= {g1, g2, g5}
({g1, g2, g5}, ⟨[5, 6], [7, 8], [4, 6]⟩) is a concept
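The two computations above can be replayed with a short sketch. It assumes each numerical value v is described by the degenerate interval [v, v], as in the slides; the names DATA, delta, description and extent are illustrative, and description is only defined here for a non-empty set of objects.

```python
from functools import reduce

# Numerical table from the slides: object -> values of (m1, m2, m3).
DATA = {
    "g1": (5, 7, 6),
    "g2": (6, 8, 4),
    "g3": (4, 8, 5),
    "g4": (4, 9, 8),
    "g5": (5, 8, 5),
}


def delta(g):
    """Description of an object: one degenerate interval per attribute."""
    return tuple((v, v) for v in DATA[g])


def meet(e, f):
    """Component-wise interval meet of two interval vectors."""
    return tuple((min(a, c), max(b, d)) for (a, b), (c, d) in zip(e, f))


def description(objects):
    """Meet of the descriptions of a non-empty set of objects."""
    return reduce(meet, (delta(g) for g in objects))


def extent(pattern):
    """All objects whose description lies inside the pattern."""
    return {g for g in DATA if meet(pattern, delta(g)) == pattern}


d = description({"g1", "g2"})
print(d)                  # ((5, 6), (7, 8), (4, 6))
print(sorted(extent(d)))  # ['g1', 'g2', 'g5']
```

Applying description to the extent returns the same pattern again, which is what makes the pair a concept.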
18 / 32
Semantics in ℝ^|M|
Patterns are |M|-dimensional hyperrectangles
Ordering of patterns corresponds to rectangle inclusion: c ⊑ d when the rectangle of d lies inside that of c
19 / 32
General problem
Lowest concepts : few objects, small intervals
Highest concepts : many objects, large intervals
Overwhelming : potentially one concept for each interval of values
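To get a feel for how quickly the number of concepts grows, the following brute-force sketch closes every non-empty subset of objects of the five-object numerical table. It is exponential in |G| and is not one of the adapted FCA algorithms mentioned earlier; all names are illustrative assumptions.

```python
from functools import reduce
from itertools import combinations

# Numerical table from the slides: object -> values of (m1, m2, m3).
DATA = {
    "g1": (5, 7, 6),
    "g2": (6, 8, 4),
    "g3": (4, 8, 5),
    "g4": (4, 9, 8),
    "g5": (5, 8, 5),
}


def delta(g):
    return tuple((v, v) for v in DATA[g])


def meet(e, f):
    return tuple((min(a, c), max(b, d)) for (a, b), (c, d) in zip(e, f))


def description(objects):
    """Meet of the descriptions of a non-empty set of objects."""
    return reduce(meet, (delta(g) for g in objects))


def extent(pattern):
    return frozenset(g for g in DATA if meet(pattern, delta(g)) == pattern)


def all_concepts():
    """Map each closed extent to its interval pattern (empty extent skipped)."""
    concepts = {}
    names = sorted(DATA)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            pattern = description(subset)
            concepts[extent(pattern)] = pattern
    return concepts


CONCEPTS = all_concepts()
print(len(CONCEPTS), "concepts with a non-empty extent")
for ext, pattern in sorted(CONCEPTS.items(), key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(ext), pattern)
```

On this table the sketch finds 18 concepts with a non-empty extent for only 5 objects and 3 attributes, with the singletons at the bottom (point intervals) and the full object set at the top (widest intervals).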
20 / 32
Introducing a similarity relation between objects
How to group objects having similar values within
the same concept?
A simple similarity relation
a ≃_θ b ⇔ |a − b| ≤ θ
Examples
2 ≃_2 4, 2 ≄_3 7
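The relation is a one-line test in Python. The sketch below also checks that it is reflexive and symmetric but not transitive; the function name similar is an illustrative assumption.

```python
def similar(a, b, theta):
    """Two values are similar when they differ by at most theta."""
    return abs(a - b) <= theta


print(similar(2, 4, theta=2))  # True
print(similar(2, 7, theta=3))  # False

# Not transitive: 4 ~ 5 and 5 ~ 6 for theta = 1, yet 4 and 6 are not similar.
print(similar(4, 5, 1), similar(5, 6, 1), similar(4, 6, 1))  # True True False
```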
22 / 32
The meet : a similarity operator
Given two objects g and h
their descriptions are respectively δ(g) and δ(h)
the similarity between g and h is represented by
δ({g, h}) = δ(g) ⊓ δ(h)
For any arbitrary set of objects
all objects are similar (since we can always compute ⊓)
their level of similarity depends on the level of the meet of
their descriptions in the semilattice
How to take into account a similarity relation w.r.t. a distance ?
23 / 32
Towards a similarity between objects
Introduce an element ∗ ∈ (D, ⊓) denoting dissimilarity
c ⊓ d ≠ ∗ ⇐⇒ c and d are similar
c ⊓ d = ∗ ⇐⇒ c and d are not similar
For intervals, the meet is constrained by a threshold θ
[a, b] ⊓_θ [c, d] = [min(a, c), max(b, d)] if max(b, d) − min(a, c) ≤ θ
[a, b] ⊓_θ [c, d] = ∗ otherwise
with θ = 0.2
Actually, we just “cut” the semilattice
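A minimal sketch of the thresholded meet, with intervals as (low, high) tuples. Representing ∗ by Python's None, and letting ∗ absorb any further meet (so that it behaves as the bottom element), are assumptions made here for illustration.

```python
STAR = None  # assumption: None stands for the dissimilarity element ∗


def meet_theta(x, y, theta):
    """Interval meet that collapses to ∗ when the result is wider than theta."""
    if x is STAR or y is STAR:
        return STAR  # assumption: ∗ is absorbing
    (a, b), (c, d) = x, y
    low, high = min(a, c), max(b, d)
    return (low, high) if high - low <= theta else STAR


print(meet_theta((4, 4), (5, 5), theta=1))  # (4, 5)
print(meet_theta((4, 4), (6, 6), theta=1))  # None
```

Every interval wider than θ is replaced by ∗, which is the sense in which the upper part of the semilattice is cut off.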
24 / 32
Going further ?
≃_θ is not transitive, i.e. it is a tolerance relation (reflexive and symmetric only)
Projecting each pattern d ∈ D, i.e. applying ψ with ψ(d) ⊑ d
For each dimension, replacing each value with a larger
interval
From a value d and its attribute domain
“Dilation” : ball of patterns of radius θ (similarity)
“Erosion” : delete pairs of values violating the similarity
(maximality)
Computing the meet of remaining values
The projection can be computed as a preprocessing step
Each projected pattern determines an equivalence class of
similar values : it reduces the number of concepts.
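The slide only outlines the projection, so the following is one possible reading of the dilation and erosion steps for a single attribute, not the authors' implementation. It assumes that erosion drops every value of the ball that is dissimilar to some other value of the ball; the function name project_value and the domain {4, 5, 6} (the m1 column of the numerical table) are illustrative.

```python
def project_value(v, domain, theta):
    """Replace value v by an interval of values that are similar to it.

    Assumed reading of the slide:
      1. dilation: keep the values of the domain within theta of v;
      2. erosion: drop values that are dissimilar to another kept value;
      3. return the meet (bounding interval) of what remains.
    """
    values = set(domain) | {v}
    ball = [w for w in values if abs(w - v) <= theta]
    kept = [w for w in ball if all(abs(w - u) <= theta for u in ball)]
    # v is similar to everything in its ball, so `kept` always contains v.
    return (min(kept), max(kept))


domain = [4, 5, 6]
for theta in (0, 1, 2):
    print(theta, {v: project_value(v, domain, theta) for v in domain})
```

Under this reading the projected interval always contains v, so the projected pattern is below the original one in the pattern order, and values that receive the same interval fall into the same class.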
25 / 32
Projecting for changing lattice granularity
FIGURE : Classical case (no projection)
26 / 32
Projecting for changing lattice granularity
FIGURE : With θ = 0
27 / 32
Projecting for changing lattice granularity
FIGURE : With θ = 1
28 / 32
Projecting for changing lattice granularity
FIGURE : With θ = 2
29 / 32
Conclusion
In a few words
Pattern structures for numerical data
Introducing a similarity relation
Other works
Links between binarization and projection of patterns
Algorithms for interval pattern structures
Interval data
Mining closed interval patterns and their generators
Enhancing information fusion with pattern structures
Mining bi-sets in numerical data
Applications
Gene expression data analysis
Farmer practices evaluation
Recommendation systems (MovieLens)
31 / 32
Some references
M. Kaytoue, S. Duplessis, S. O. Kuznetsov, and A. Napoli. Mining Gene Expression Data with Pattern
Structures in Formal Concept Analysis. In Information Sciences, Special Issue: Lattices, 2010.
M. Kaytoue, Z. Assaghir, A. Napoli, and S. O. Kuznetsov. Embedding tolerance relations in Formal Concept
Analysis for classifying numerical data. In 19th Conference on Information and Knowledge Management
(CIKM), 2010.
Z. Assaghir, M. Kaytoue, A. Napoli, and H. Prade. Managing Information Fusion with Formal Concept
Analysis. In Modeling Decisions for Artificial Intelligence, 6th International Conference (MDAI), 2010.
B. Ganter and S. O. Kuznetsov. Pattern Structures and Their Projections. In International Conference on
Conceptual Structures, LNCS (2120), Springer, 2001.
M. Kaytoue, S. Duplessis, S. O. Kuznetsov, and A. Napoli. Two FCA-Based Methods for Mining Gene
Expression Data. In Formal Concept Analysis, LNCS (5548), Springer, pages 251–266, 2009.
M. Kaytoue, Z. Assaghir, N. Messai, and A. Napoli. Two Complementary Classification Methods for Designing
a Concept Lattice from Interval Data. In Foundations of Information and Knowledge Systems, LNCS (5956),
Springer, pages 345–362, 2010.
M. Kaytoue, S. Duplessis, and A. Napoli. Toward the Discovery of Itemsets with Significant Variations in
Gene Expression Matrices. In Studies in Classification, Data Analysis, and Knowledge Organization,
Springer, 2010.
32 / 32

Interval Pattern Structures: An introduction
