© 2017 NAVER LABS. All rights reserved.
Matthias Gallé
Naver Labs Europe
@mgalle
Human-Centric Machine Learning
Rakuten Technology Conference 2017
Advanced
Chess
Supervised Learning
where f is typically chosen such that
$$f = \underset{f \in F}{\operatorname{argmin}} \; \frac{1}{N} \sum_{i=1}^{N} L\big(f(x_i), y_i\big) + \lambda R(f)$$
I know what I want
(and can formalize it)
I have time & money to label lots of data
X, Y → f(x)
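The objective above can be made concrete with a small sketch. Everything specific here is an assumption chosen for illustration, not something the slides prescribe: F is the set of lines f(x) = w*x + b, L is the squared loss, R(f) = w**2, and the argmin is approximated by plain gradient descent on toy data.

```python
def objective(w, b, xs, ys, lam):
    """(1/N) * sum of squared losses plus lam * R(f), with R(f) = w**2."""
    n = len(xs)
    data_term = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return data_term + lam * w ** 2


def fit(xs, ys, lam=0.1, lr=0.05, steps=2000):
    """Approximate the argmin over (w, b) by plain gradient descent."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(r * x for r, x in zip(residuals, xs)) / n + 2 * lam * w
        grad_b = 2 * sum(residuals) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy labelled data (X, Y), generated by y = 2x + 1 for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = fit(xs, ys, lam=0.0)          # no regularization: recovers the generating line
w_reg, b_reg = fit(xs, ys, lam=1.0)  # lam > 0 shrinks the slope towards zero
```

Raising lam trades fit on the labelled data for a simpler f, which is the role of the λR(f) term.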
Example: Machine Translation
Given a text s and a proposed translation p, how do we measure the distance between p and
a reference translation t?
BLEU: n-gram overlap between t and p
typically: 1 ≤ 𝑛 ≤ 4, precision only, brevity penalty
METEOR
bonus points for matching stems and synonyms
use paraphrases
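The BLEU ingredients listed above (n-gram overlap for 1 ≤ n ≤ 4, precision only, brevity penalty) can be sketched as follows. This is a simplified, unsmoothed, single-reference, sentence-level variant with whitespace tokenization, written for illustration; production toolkits differ in tokenization, smoothing and corpus-level aggregation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(proposal, reference, max_n=4):
    """Unsmoothed sentence-level BLEU of proposal p against one reference t."""
    p, t = proposal.split(), reference.split()
    if not p:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        p_counts, t_counts = ngrams(p, n), ngrams(t, n)
        total = sum(p_counts.values())
        # Clipped counts: an n-gram is credited at most as often as it occurs in t.
        matched = sum(min(c, t_counts[g]) for g, c in p_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: one empty precision zeroes the score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: precision alone would reward very short proposals.
    bp = 1.0 if len(p) > len(t) else math.exp(1 - len(t) / len(p))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
exact = bleu("the cat sat on the mat", reference)
truncated = bleu("the cat sat on the", reference)
```

The truncated proposal has perfect n-gram precision, so only the brevity penalty lowers its score. Nothing in the computation looks at meaning, which is why stems, synonyms and paraphrases need a metric such as METEOR.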
Statistical Machine Translation
P Koehn
(www.statmt.org/book/slides/08-evaluation.pdf)
Consequences of not formalizing correctly
Users do not use your model
Computer-Assisted Translation used rule-based systems for years
Ad-hoc solutions
Quality Prediction
Automatic Post-Editing
Unsupervised Learning
where Z(X) captures some prior:
• Compression
• Clustering
• Coverage
• ….
I am not sure what I want
I have a (big) corpus with assumed patterns
X → Z(X)
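As one illustration of the clustering prior, the sketch below computes a Z(X) that assigns each point of an unlabelled corpus X to a cluster. The choice of Lloyd's k-means on scalars, the toy data and the initial centers are all assumptions made for brevity.

```python
def kmeans_1d(xs, centers, iterations=20):
    """Lloyd's algorithm on scalars; `centers` holds the initial guesses."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for x in xs:
            nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            clusters[nearest].append(x)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[k] for k, c in enumerate(clusters)]
    assignment = [min(range(len(centers)), key=lambda k: abs(x - centers[k])) for x in xs]
    return centers, assignment


# Unlabelled toy corpus X with two assumed groups.
xs = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, z = kmeans_1d(xs, centers=[0.0, 10.0])  # z plays the role of Z(X)
```

No labels Y are involved: the only supervision is the assumption that the data form a given number of compact groups.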
Example: Exploratory Search
Whenever your task is:
• Ill-defined:
– Broad / under-specified
– Multi-faceted
• Dynamic:
– Searcher’s understanding inadequate at the beginning
– Searcher’s understanding evolves as results are gradually retrieved.
The answer to what you search is “I know it when I see it”
https://en.wikipedia.org/wiki/I_know_it_when_I_see_it
Interactive Learning
Exploratory Search: examples
E-Discovery
Sensitivity Review
• Vo, Ngoc Phuoc An, et al. "DISCO: A System Leveraging Semantic Search in Document Review." COLING (Demos). 2016.
• Privault, Caroline, et al. "A new tangible user interface for machine learning document review." Artificial Intelligence and Law 18.4 (2010): 459-479.
• Ferrero, Germán, Audi Primadhanty, and Ariadna Quattoni. "InToEventS: An Interactive Toolkit for Discovering and Building Event Schemas." EACL 2017 (2017): 104.
Example: Active Learning
Give initiative to the algorithm
allow action of type: “please, label instance x”
Cognitive effort of labelling a document is 3-5x higher than labelling a word [1]
Feature labelling:
• type(feedback) ≠ type(label)
• information load of a word label is small
• word sense disambiguation
[1] Raghavan, Hema, Omid Madani, and Rosie Jones. "Active learning with feedback on features and instances." Journal of
Machine Learning Research 7 (Aug 2006): 1655-1686.
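The "please, label instance x" action can be sketched as pool-based uncertainty sampling. This is a deliberately tiny illustration under stated assumptions: the learner is a 1-D threshold classifier, the `oracle` function stands in for the human annotator, and the seed contains one negative and one positive example. It covers instance labelling only, not the feature labelling studied in [1].

```python
def fit_threshold(labeled):
    """Midpoint between the largest known negative and smallest known positive."""
    neg = max(x for x, y in labeled if y == 0)
    pos = min(x for x, y in labeled if y == 1)
    return (neg + pos) / 2


def active_learn(pool, oracle, seed, budget):
    """Pool-based uncertainty sampling for a 1-D threshold classifier."""
    labeled = [(x, oracle(x)) for x in seed]
    unlabeled = [x for x in pool if x not in seed]
    for _ in range(budget):
        if not unlabeled:
            break
        threshold = fit_threshold(labeled)
        # "Please, label instance x": pick the point the model is least sure about,
        # here the unlabeled point closest to the current decision boundary.
        query = min(unlabeled, key=lambda x: abs(x - threshold))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return fit_threshold(labeled), len(labeled)


pool = [i / 100 for i in range(101)]   # 101 unlabeled points in [0, 1]
oracle = lambda x: int(x >= 0.62)      # hypothetical annotator with a hidden rule
threshold, n_labels = active_learn(pool, oracle, seed=[0.0, 1.0], budget=10)
```

With the algorithm choosing the queries, a dozen labels are enough to classify the whole pool correctly in this toy setting, compared with labelling all 101 points up front.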
Conclusion
If you really want to solve a problem, don't be a prisoner of your
performance indicator
Ask yourself:
1. Does it really capture success?
does it align with human judgment?
2. What does the [machine | human] do best?
3. Can you remove the burden from humans with smarter algorithms?
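One way to approach question 1 is to measure how well a metric ranks outputs compared with human ratings. The sketch below uses Spearman rank correlation, which is one choice among several. The scores are made-up placeholder numbers for illustration only, not results from the talk.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the two rank vectors.

    Assumes neither input is constant (otherwise the variance is zero).
    """
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Placeholder numbers: automatic metric scores and 1-5 human ratings
# for the same five system outputs.
metric_scores = [0.31, 0.42, 0.28, 0.55, 0.47]
human_ratings = [3, 4, 2, 4, 5]
rho = spearman(metric_scores, human_ratings)
```

A value of rho near 1 suggests the indicator orders outputs the way people do. A low or negative value is a sign that optimizing the indicator may not improve what users actually experience.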
Further reading & Acknowledgments
Jean-Michel Renders, Marc Dymetman, Ariadna Quattoni
http://www.europe.naverlabs.com/Blog
Q&A
Appendix
Statistical Machine Translation
P Koehn
(www.statmt.org/book/slides/08-evaluation.pdf)
