Hisashi KOMINE / MNU K. K.
Bluemix User Group in Watson Summit 2017
ver. 1.0.0
DOCUMENT CONVERSION & RETRIEVE AND RANK: ONE QUESTION, ONE ANSWER
Q-0
A-0
Facebook:
Twitter: hssh
Github: hssh / mnu-komine
Qiita: hssh
Apache Spark Bluemix Hadoop TUT Vue.js DMM.com reveal.js org-mode PHP MySQL Document Conversion
Aichi Toyohasi ES2016 iOS Zend Framework A cappella AngularJS GCP Chef Vagrant TNCT
neptune.io Azure MongoDB CSS3 Laravel Solr KDDI Cloud Webimpact AWS Scala Retrieve and Rank
PBOX MapR API Connect Cat IntelliJ IDEA Ruby Tensorflow HTML5 Hokkaido Shiraoi Python
Docker MNU Electron Ruby on Rails macOS Yokohama Aobaku Outdoor Hustler Emacs Elasticsearch Golang Cloudn
Watson MariaDB Machine Learning Django Camp node.js Apache Cordova Scrum
Q-1
What is Document Conversion (DC)?
A-1
A Watson API that converts Word, PDF, and HTML documents into a form that Retrieve and Rank can index.
Q-2
What is Retrieve and Rank (R&R)?
A-2
A Watson search service built on Apache Solr.
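Q-3
Can the Watson APIs be called directly from browser JS?
A-3
No. The Watson APIs do not support CORS.
Put a CORS-enabled proxy API on Bluemix and call that from JS.
API Connect is one way to build the proxy.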
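One way to picture the proxy in A-3: it forwards the request to Watson with server-side credentials and adds CORS headers to the response before returning it to the browser. The sketch below shows only the header step, in Python. The function name, the allow list, and the example origin are hypothetical, and this is not API Connect configuration.

```python
def with_cors_headers(upstream_headers, request_origin, allowed_origins):
    """Copy the upstream response headers and add CORS headers
    when the browser's Origin is on the allow list (hypothetical helper)."""
    headers = dict(upstream_headers)
    if request_origin in allowed_origins:
        # Echo the specific origin instead of the wildcard "*".
        headers["Access-Control-Allow-Origin"] = request_origin
        # Tell caches that the response depends on the Origin header.
        headers["Vary"] = "Origin"
    return headers


allowed = {"https://app.example.com"}  # assumed front-end origin
print(with_cors_headers({"Content-Type": "application/json"},
                        "https://app.example.com", allowed))
```

Requests from origins that are not on the list get the upstream headers back unchanged, so the browser blocks them as before.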
Q-4
How do I index a document converted by DC into R&R?
A-4
Use the DC API "Index a document".
curl -X POST -u "{username}":"{password}" \
  -F "file=@example.html" \
  "https://gateway.watsonplatform.net/document-conversion/api/v1/index_document"
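Q-5
How do I check or edit what DC produces before it is indexed into R&R?
A-5
Set dry_run to true in the DC API "Index a document": the converted content is returned and nothing is indexed into R&R.
Then index it with the R&R API "Index documents".
: id title fileName sourceUrl flags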
curl -X POST -u "{username}":"{password}" \
  -F 'config={"retrieve_and_rank": {"dry_run": true}}' \
  -F "file=@example.html" \
  "https://gateway.watsonplatform.net/document-conversion/api/v1/index_document"

curl -X POST -H "Content-Type: application/json" -u "{username}":"{password}" \
  --data-binary @your_docs.json \
  "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/sc1ca23733_faa8_49ce_
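The your_docs.json file in the second command is a Solr update payload. Here is a minimal sketch of building one in Python, assuming the collection accepts Solr's JSON array-of-documents form. The field names id, title, fileName, and sourceUrl come from the slide; the values and the helper name are made up for illustration.

```python
import json


def to_solr_docs_json(docs):
    """Serialize documents as a JSON array for a Solr update request.
    Requiring "id" assumes it is the collection's unique key."""
    for doc in docs:
        if "id" not in doc:
            raise ValueError("every document needs an id")
    return json.dumps(docs, ensure_ascii=False, indent=2)


docs = [
    {
        "id": "example-1",                               # hypothetical value
        "title": "Example document",                     # hypothetical value
        "fileName": "example.html",
        "sourceUrl": "https://example.com/example.html"  # hypothetical value
    }
]
print(to_solr_docs_json(docs))
```

The printed text can be saved as your_docs.json and posted with --data-binary.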
Q-6
How do I search R&R with a plain Solr query?
A-6
Use the R&R API "Search Solr standard query parser".
curl -X POST -u "{username}":"{password}" \
"https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/sc1ca23733_faa8_49ce_b3b6_
?q=what%20is%20the%20basic%20mechanism%20of%20the%20transonic%20aileron%20buzz
&wt=json"
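The q value in the A-6 command is percent-encoded by hand. Here is a small Python sketch that produces the same query string with the standard library. The helper name is hypothetical, and the cluster and collection part of the URL is truncated in the slide, so it is left out.

```python
from urllib.parse import quote, urlencode


def solr_query_string(params):
    """Encode query parameters, writing spaces as %20 (not "+")."""
    return urlencode(params, quote_via=quote)


question = "what is the basic mechanism of the transonic aileron buzz"
print(solr_query_string({"q": question, "wt": "json"}))
```

Passing quote_via=quote is a choice made here so that the output matches the %20 style used on the slide; the default urlencode behaviour writes spaces as "+".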
Q-7
How do I limit which fields an R&R Solr search returns?
A-7
Add the fl parameter to the R&R API "Search Solr standard query parser".
https://${API_URL}?fl=id,title
Q-8
How do I search R&R on a specific field?
A-8
Use Solr field query syntax (field:value) with the R&R API "Search Solr standard query parser".
https://${API_URL}?q=fileName:example.html
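A field query such as fileName:example.html breaks when the value contains spaces or Solr query syntax characters. One hedged way to handle that, sketched in Python: wrap the value in a quoted phrase and let urlencode do the URL escaping. The helper names are hypothetical, and fl=id,title is borrowed from A-7.

```python
from urllib.parse import quote, urlencode


def field_query(field, value):
    """Build field:"value", escaping backslashes and double quotes
    so the value is read as one quoted phrase."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def search_params(field, value, fields):
    """Return an encoded query string for a field search."""
    return urlencode(
        {"q": field_query(field, value), "fl": ",".join(fields), "wt": "json"},
        quote_via=quote,
    )


print(field_query("fileName", "example.html"))
print(search_params("fileName", "example.html", ["id", "title"]))
```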
Q-9
How do I download the Solr configuration used by R&R?
A-9
Use the R&R API "Get configuration".
curl -u "{username}":"{password}" \
  -o example_config.zip \
  "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/sc1ca23733_faa8_49ce_
Q-10
How do I change the Solr configuration used by R&R?
A-10
Download it with the API "Get configuration" and edit it.
Then upload it with the API "Upload Solr configuration".
curl -X POST -H "Content-Type: application/zip" -u "{username}":"{password}" \
  --data-binary @/configs/example_config.zip \
  "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/sc1ca23733_faa8_49ce_
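Q-11
How do I add a field to the R&R Solr schema?
A-11
Add a field definition to the schema in the Solr configuration.
Example for a contentType field: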
<field name="contentType"
type="string"
indexed="true"
stored="true"
multiValued="false"/>
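If the configuration is edited by script rather than by hand, the field can be added with an XML parser. Here is a minimal Python sketch, assuming fields are direct children of the schema root element. The helper name and the tiny schema string are made up for illustration, and ElementTree drops XML comments when it rewrites the document.

```python
import xml.etree.ElementTree as ET


def add_string_field(schema_xml, name):
    """Return schema XML with a stored, indexed string field added,
    or the input unchanged if the field already exists."""
    root = ET.fromstring(schema_xml)
    if any(f.get("name") == name for f in root.iter("field")):
        return schema_xml  # already defined, leave untouched
    ET.SubElement(root, "field", {
        "name": name,
        "type": "string",
        "indexed": "true",
        "stored": "true",
        "multiValued": "false",
    })
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily shortened schema for demonstration only.
schema = ('<schema name="example" version="1.5">'
          '<field name="id" type="string" indexed="true" stored="true"/>'
          '</schema>')
print(add_string_field(schema, "contentType"))
```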
Q-12
How do I create a collection that uses the R&R Solr schema?
A-12
Use the R&R API "Create Solr collection" and name the uploaded configuration in collection.configName.
curl -X POST -u "{username}":"{password}" \
  -d "action=CREATE&name=example_collection&collection.configName=example_config&wt=json" \
  "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/sc1ca23733_faa8_49ce_
Q-13
How do I create an R&R ranker?
A-13
Use the R&R API "Create ranker".
curl -X POST -u "{username}":"{password}" \
  -F training_data=@train.csv \
  -F training_metadata="{\"name\":\"My ranker\"}" \
  "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/rankers"
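The training_metadata value is JSON inside a shell argument, which is easy to misquote. Here is a small Python sketch that generates the JSON and a shell-safe form argument. Only json.dumps and shlex.quote from the standard library are used, and the helper name is hypothetical.

```python
import json
import shlex


def metadata_form_arg(ranker_name):
    """Return a shell-quoted "training_metadata=<json>" argument for curl -F."""
    metadata = json.dumps({"name": ranker_name})
    return shlex.quote(f"training_metadata={metadata}")


print(metadata_form_arg("My ranker"))
```

The printed argument is wrapped in single quotes, so the inner double quotes of the JSON need no backslashes when it is pasted after -F.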
Q-14
How do I search using a ranker?
A-14
Use the R&R API "Search and rank" and pass the ranker ID as ranker_id.
curl -X POST -u "{username}":"{password}" \
"https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/sc1ca23733_faa8_49ce_
?ranker_id=B2E325-rank-67
&q=what%20is%20the%20basic%20mechanism%20of%20the%20transonic%20aileron%20buzz
&wt=json"
Q-15
Solr Ranker
A-15
Q-16
A-16
Bad knowhow
1. API
2. API
3. Search and rank
4. ID API
Search and rank
Search Solr standard query parser
Search Solr standard query parser
