LEARN TO MAKE A MACHINE LEARN
Dr. Angana Chakraborty
Assistant Professor, W.B.E.S.
Sister Nibedita Govt. Gen. Degree College for Girls
Kolkata, West Bengal
Faculty Induction Program-02
Guru Jambheswar University of Science & Technology, Hisar
MACHINE LEARNING IN E-COMMERCE
INFORMATION RETRIEVAL
➤ Input: A collection of $n$ points and a query in $\mathbb{R}^d$.
➤ Processing: Build the data structure and search for the points nearest to the query.
➤ Return: The closest data point(s).
Some distance metric is used to measure the similarity between the data objects.
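The input/processing/return steps above can be illustrated with an exact linear scan, which is the baseline that approximate methods try to beat. This is a minimal sketch, not taken from the slides: the function names `euclidean` and `nearest_neighbours`, the sample points, and the choice of Euclidean distance as the metric are all assumptions made for illustration.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest_neighbours(points, query, k=1, dist=euclidean):
    """Return the k points closest to the query using a linear scan."""
    return sorted(points, key=lambda p: dist(p, query))[:k]

# Hypothetical data: four points in R^2 and one query.
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 0.5)]
query = (1.2, 0.9)
closest = nearest_neighbours(points, query, k=2)
print(closest)
```

The scan compares the query with every point, so its cost grows linearly with $n$; that cost is the motivation for the hashing-based search on the following slides.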
LOCALITY SENSITIVE HASHING
APPROXIMATE NEAREST NEIGHBOUR SEARCH
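Locality Sensitive Hashing:[1]
A family of hash functions $\mathcal{H} : \mathbb{R}^d \to U$ is called $(R, cR, P_1, P_2)$-sensitive if, $\forall p, q \in \mathbb{R}^d$,
if $\lVert p - q \rVert \le R$ then $\Pr_{\mathcal{H}}[h(p) = h(q)] \ge P_1$,
if $\lVert p - q \rVert \ge cR$ then $\Pr_{\mathcal{H}}[h(p) = h(q)] \le P_2$,
where $P_1 > P_2$.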
[1] Andoni, A. and Indyk, P. (2008). Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In Communications of the ACM - 50th anniversary issue, pages 117–122. ACM.
[Figure: LSH table. Items 0 to 9 are hashed into buckets 0 to 3, with $h_{LSH}(2) = h_{LSH}(5) = h_{LSH}(8) = 3$.]
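The sensitivity condition can be checked empirically on a concrete family. The sketch below uses a commonly used random-projection scheme for Euclidean distance, $h(p) = \lfloor (a \cdot p + b) / w \rfloor$ with Gaussian $a$ and uniform $b$; the slides do not specify a particular family, so this choice, the bucket width `width`, and the helper names `make_hash` and `collision_rate` are assumptions for illustration only.

```python
import math
import random

def make_hash(dim, width, rng):
    """Draw one hash function h(p) = floor((a . p + b) / width)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, width)

    def h(p):
        return math.floor((sum(ai * pi for ai, pi in zip(a, p)) + b) / width)

    return h

def collision_rate(p, q, dim, width, trials, rng):
    """Estimate Pr[h(p) = h(q)] over freshly drawn hash functions."""
    hits = 0
    for _ in range(trials):
        h = make_hash(dim, width, rng)
        hits += h(p) == h(q)
    return hits / trials

rng = random.Random(0)             # fixed seed so the run is repeatable
dim, width = 8, 4.0
p = [0.0] * dim
near = [0.5] + [0.0] * (dim - 1)   # distance 0.5 from p
far = [10.0] + [0.0] * (dim - 1)   # distance 10 from p
near_rate = collision_rate(p, near, dim, width, 2000, rng)
far_rate = collision_rate(p, far, dim, width, 2000, rng)
print(near_rate, far_rate)
```

The estimated collision rate for the nearby pair plays the role of $P_1$ and the rate for the distant pair the role of $P_2$; the first should come out clearly larger than the second.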
CONTEXT BASED LOCALITY SENSITIVE HASHING (conLSH)
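[Figure: LSH versus conLSH.
(a) LSH table: items 0 to 9, buckets 0 to 3, with $h_{LSH}(2) = h_{LSH}(5) = h_{LSH}(8) = 3$.
(b) conLSH table: items 0 to 9, buckets 0 to 5, with $h_{conLSH}(1) = h_{conLSH}(7) = 0$ and $h_{conLSH}(2) = h_{conLSH}(8) = 1$.
Note in figure: Suitable for ordered data.]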
CONTEXT BASED LOCALITY SENSITIVE HASHING (conLSH)
Context:[2]
Let $x : (x_1 x_2 \ldots x_n)$ be a sequence of length $n$.
A context at the $i$th position of $x$, for $i \in \{\lambda + 1, \ldots, n - \lambda\}$, is a subsequence $(x_{i-\lambda}, \ldots, x_i, \ldots, x_{i+\lambda})$ of length $2\lambda + 1$.
$\lambda$ is the context factor.
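The context definition translates directly into a slice. This is a small sketch under the assumption that the sequence is a Python string and that positions are 1-based, as in the definition; the function name `context` and the sample sequence are hypothetical.

```python
def context(x, i, lam):
    """Return the context of x at 1-based position i with context factor lam."""
    n = len(x)
    if not (lam + 1 <= i <= n - lam):
        raise ValueError("position i must satisfy lam + 1 <= i <= n - lam")
    # Positions i - lam .. i + lam (1-based) map to this 0-based slice.
    return x[i - lam - 1 : i + lam]

x = "ACGTTGCA"          # hypothetical sequence of length n = 8
ctx = context(x, 4, 2)  # positions 2..6
print(ctx)              # CGTTG
```

The returned window always has length $2\lambda + 1$, and positions closer than $\lambda$ to either end of the sequence have no context.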
[2]Angana Chakraborty, Sanghamitra Bandyopadhyay, “conLSH: Context based Locality Sensitive Hashing for Mapping of noisy SMRT
Reads”, Computational Biology and Chemistry, Elsevier, 107206 (2020), doi:10.1016/j.compbiolchem.2020.107206.
conLSH:[2]
Let $X$ be the set of all length-$d$ words over $\Sigma$ and $U$ be the set of all length-$(2\lambda + 1)$ words over $\Sigma$, for $2\lambda + 1 < d$.
An $(R, cR, P_1, P_2)$-sensitive LSH family $\mathcal{H}$ of functions mapping from $X$ to $U$ is called $(R, cR, \lambda, P_1, P_2)$-sensitive if, for each $h \in \mathcal{H}$, there are positions $i_h$ and $j_h$ with $\lambda + 1 \le i_h, j_h \le d - \lambda$ such that for all $p, q \in X$ one has $h(p) = h(q)$ whenever $p[i_h - \lambda \ldots i_h \ldots i_h + \lambda] = q[j_h - \lambda \ldots j_h \ldots j_h + \lambda]$ holds.
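One simple way to satisfy the context condition is a hash function that picks a position $i_h$ and returns the context at that position, which corresponds to the special case $i_h = j_h$. The sketch below illustrates only that case and does not verify the probability bounds $P_1$ and $P_2$; it is not the authors' implementation, and the names `make_context_hash` and `build_table` as well as the sample words are assumptions.

```python
import random
from collections import defaultdict

def make_context_hash(d, lam, rng):
    """Pick a 1-based position i_h and hash each word to its context there."""
    i_h = rng.randint(lam + 1, d - lam)  # inclusive on both ends

    def h(word):
        # Window of length 2 * lam + 1 centred on position i_h.
        return word[i_h - lam - 1 : i_h + lam]

    return h, i_h

def build_table(words, h):
    """Group words into buckets keyed by their hash value."""
    table = defaultdict(list)
    for w in words:
        table[h(w)].append(w)
    return table

rng = random.Random(1)
words = ["ACGTAC", "TCGTAA", "ACCTAC", "GGGTTT"]  # hypothetical length-6 words
h, i_h = make_context_hash(d=6, lam=1, rng=rng)
table = build_table(words, h)
for key, bucket in table.items():
    print(key, bucket)
```

Two words land in the same bucket exactly when their windows around $i_h$ agree, so words that share a local context collide regardless of differences elsewhere.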
REAL LIFE APPLICATIONS
➤ Customised suggestions on e-commerce platforms
➤ Faster algorithms for real-time results
➤ Training based on demographic and age-based preferences
➤ Biological sequence similarity search
➤ Mutation and SNP detection
➤ Phylogenetic tree reconstruction
➤ Action recognition in video sequences
BIBLIOGRAPHY
➤ Andoni, A. and Indyk, P. (2008). Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In Communications of the ACM - 50th anniversary issue, pages 117–122. ACM.
➤ Chakraborty, A. and Bandyopadhyay, S. (2020). conLSH: Context based Locality Sensitive Hashing for Mapping of noisy SMRT Reads. Computational Biology and Chemistry, Elsevier, 107206. doi:10.1016/j.compbiolchem.2020.107206.
➤ Chakraborty, A., Morgenstern, B. and Bandyopadhyay, S. (2019). S-conLSH: Alignment-free gapped mapping of noisy long reads. bioRxiv 801118.
➤ Chakraborty, A. and Bandyopadhyay, S. (2014). A Layered Locality Sensitive Hashing based Sequence Similarity Search Algorithm for Web Sessions. 2nd ASE International Conference on Big Data Science and Computing, Stanford University, CA, USA.
➤ Zielezinski, A., Girgis, H. Z., Bernard, G., Leimeister, C.-A., Tang, K., Dencker, T., Lau, A. K., Rohling, S., Choi, J., Waterman, M. S., Comin, M., Kim, S.-H., Vinga, S., Almeida, J. S., Chan, C. X., James, B., Sun, F., Morgenstern, B., and Karlowski, W. M. (2019). Benchmarking of alignment-free sequence comparison methods. Genome Biology, accepted for publication.
END
