“Extra” by Jeremy Brooks https://flic.kr/p/4aKH3c
EXTRA
Stuart Myles * Associated Press * 14th June 2016
© 2016 IPTC (www.iptc.org) All rights reserved
https://flic.kr/p/tgYcsA
EXTRA
EXTraction Rules Apparatus
• Rules-based classification of text
• Open source software
• EXTRA is being developed by the IPTC
• Grant from the Digital News Initiative
https://iptc.github.io/extra/
Google DNI
• Google’s €150 million Digital News Initiative fund
– Stimulate innovation among European news organizations
– https://www.digitalnewsinitiative.com/fund/
• Multiple funding rounds
– First funding of €27 million to projects in 23 countries
– http://googlepolicyeurope.blogspot.gr/2016/02/digital-news-initiative-first-funding_24.html
• IPTC’s EXTRA project funded in first round - October 2015
– Developer €35,000
– Linguist €10,000
– Project Manager €5,000
– Total grant to IPTC from DNI = €50,000
EXTRA
EXTraction Rules Apparatus
• Open source
– IPTC always uses open licenses
• Rules-based
– Better for breaking news than statistical methods
– More consistent and scalable than hand tagging
– Easier to explain why rules classify content
• Multilingual
– Developing rules for two IPTC Media Topics Languages
• News classification
– Rules will be developed using news content corpora
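As an illustration of why rules are easy to explain, here is a minimal sketch of rules-based text classification. It is not EXTRA's rules language or engine: the Rule fields, the classify function, the topic labels and the sample terms are all hypothetical, chosen only to show that each assigned topic can be traced back to the terms that triggered it.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical rule: not the EXTRA rules language."""
    topic: str           # illustrative label, not a real Media Topics code
    all_of: tuple = ()   # every one of these terms must appear
    any_of: tuple = ()   # if given, at least one of these terms must appear


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def classify(text, rules):
    """Return (topic, matched_terms) for every rule that fires on the text."""
    tokens = tokenize(text)
    results = []
    for rule in rules:
        if not all(term in tokens for term in rule.all_of):
            continue
        hits = [term for term in rule.any_of if term in tokens]
        if rule.any_of and not hits:
            continue
        # The matched terms double as the explanation for the decision.
        results.append((rule.topic, sorted(set(rule.all_of) | set(hits))))
    return results


# Hypothetical rules for illustration only.
RULES = [
    Rule("sport/football", all_of=("match",), any_of=("goal", "striker", "penalty")),
    Rule("economy/markets", any_of=("shares", "stocks", "bond")),
]

matches = classify("Late penalty decides the match as shares in the club rise", RULES)
for topic, terms in matches:
    print(f"{topic}: matched {terms}")
```

Because a rule either fires or does not, the same text always receives the same topics, and a new rule for a breaking story can be added without retraining anything.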
EXTRA Progress
• Technical use cases
– https://docs.google.com/document/d/1O8pmFlohcGXThzyrWil_OFbDyqJk1Hcjpml_RRXuw6U/edit?usp=sharing
• Rules language requirements
– https://docs.google.com/document/d/1MMv5qlrLF71bBN1w1ErXaSyTKB2Kd1ksgixBnUOw0fQ/edit?usp=sharing
• Delivered roadmap to DNI
• Securing news corpora in two Media Topics languages
– English from Thomson Reuters
– German from APA
– French from AFP
Communications plan
– People working on EXTRA who might not make every meeting
– IPTC members who are interested in EXTRA
– People beyond IPTC who are interested in, or might want to work on, EXTRA
– Teleconferences https://iptc.org/events/
– Email https://groups.yahoo.com/neo/groups/iptc-extra/info
– Documentation
• http://dev.iptc.org/Topic-EXTRA
• https://iptc.github.io/extra/
• Do we need
– Team communications - Slack?
– Outreach - Twitter? Blog? Medium? LinkedIn?
EXTRA Wednesday Workshop
• Review technical use cases
• Review rules language requirements
• Select licenses
– Source code
– Corpora
• Decide on communications plan
• A plan for a plan
– Technical foundations
– Select consultants
Date and Place of Next Meeting
Berlin, Germany 24 – 26 October 2016
https://flic.kr/p/dzWJB
Tack och adjö! (Thank you and goodbye!)

IPTC EXTRA - Open Source Rules Classification
