CouchDB Developer Day
Full-Text Search Lab
Create a Cloudant account
• Go to https://cloudant.com/sign-up/
• Sign up!
Setup
curl https://$account.cloudant.com/foo -X PUT
curl https://$account.cloudant.com/foo/_design/bar -X PUT -d '{"indexes":{"baz":{"index":"function(doc){index(\"color\", doc.color); index(\"size\", doc.size);}"}}}'
curl https://$account.cloudant.com/foo/doc1 -X PUT -d '{"size": "small", "color": "green"}'
curl https://$account.cloudant.com/foo/doc2 -X PUT -d '{"size": "large", "color": "green"}'
curl https://$account.cloudant.com/foo/doc3 -X PUT -d '{"size": "small", "color": "red"}'
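The index function in the design document is JavaScript stored inside a JSON string, so every double quote in it has to be escaped. A minimal sketch of one way to avoid hand-escaping: keep the function readable and let the shell build the JSON. The helper names json_escape and design_doc are hypothetical, and the escaping only covers backslashes, double quotes and newlines.

```shell
# The index function from the Setup slide, written readably.
index_fn='function (doc) {
  index("color", doc.color);
  index("size", doc.size);
}'

# Hypothetical helper: fold newlines into spaces, then escape backslashes
# and double quotes so the text can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | tr '\n' ' ' | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Hypothetical helper: print a design document with one search index.
# $1 is the index name, $2 is the JavaScript index function.
design_doc() {
  printf '{"indexes":{"%s":{"index":"%s"}}}' "$1" "$(json_escape "$2")"
}

design_doc baz "$index_fn"
echo
```

The output can be piped straight into the PUT request, since curl reads the request body from standard input when given -d @-:

design_doc baz "$index_fn" | curl https://$account.cloudant.com/foo/_design/bar -X PUT -d @-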
Searching
curl "https://$account.cloudant.com/foo/_design/bar/_search/baz?q=size:small"
curl "https://$account.cloudant.com/foo/_design/bar/_search/baz?q=size:large"
curl "https://$account.cloudant.com/foo/_design/bar/_search/baz?q=color:red"
curl "https://$account.cloudant.com/foo/_design/bar/_search/baz?q=size:small%20AND%20color:red"
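Queries with spaces or other special characters have to be percent-encoded, as the %20 in the last query shows. A small sketch that leaves the encoding to curl: the wrapper name search is hypothetical, and it assumes $account holds your Cloudant account name.

```shell
# Hypothetical wrapper around the search endpoint used in this lab.
# Assumes $account is set to your Cloudant account name.
search() {
  # -G turns the --data-urlencode pairs into the query string of a GET
  # request, so curl percent-encodes the query text in "$1" for us.
  curl -sG "https://$account.cloudant.com/foo/_design/bar/_search/baz" \
    --data-urlencode "q=$1"
}
```

With this in place the last query can be written without manual encoding:

search 'size:small AND color:red'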
Pagination
Every search response includes a "bookmark" attribute. Pass it back to Cloudant to get the next "page" of results.
curl "https://$account.cloudant.com/foo/_design/bar/_search/baz?q=*:*&limit=1"
curl "https://$account.cloudant.com/foo/_design/bar/_search/baz?q=*:*&limit=1&bookmark=g2wAAAABaANkAB9kYmNvcmVAZGI1LmplbmV2ZXIuY2xvdWRhbnQubmV0bAAAAAJhAGI_____amgCRj_wAAAAAAAAYQBq"
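Copying bookmarks by hand gets tedious, so here is a rough sketch of passing them along in a script. Both helper names are hypothetical. The bookmark extraction assumes the response is compact JSON with no whitespace around the colon; a JSON-aware tool would be more robust.

```shell
# Hypothetical helper: print the "bookmark" value from a search response
# read on standard input. Assumes compact JSON ("bookmark":"...").
bookmark_of() {
  sed -n 's/.*"bookmark":"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: fetch one page holding a single result.
# Call it without arguments for the first page, or pass a bookmark
# as $1 to fetch the page that follows it.
next_page() {
  url="https://$account.cloudant.com/foo/_design/bar/_search/baz"
  if [ -n "$1" ]; then
    curl -sG "$url" --data-urlencode 'q=*:*' --data-urlencode 'limit=1' \
      --data-urlencode "bookmark=$1"
  else
    curl -sG "$url" --data-urlencode 'q=*:*' --data-urlencode 'limit=1'
  fi
}
```

A possible way to walk two pages, assuming $account is set:

page=$(next_page)
next_page "$(printf '%s' "$page" | bookmark_of)"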
Sorting
The "sort" parameter lets you sort results on any indexed field or combination of indexed fields.
curl "https://$account.cloudant.com/foo/_design/bar/_search/baz?q=*:*&sort=\"size<string>\""
curl "https://$account.cloudant.com/foo/_design/bar/_search/baz?q=*:*&sort=\"color<string>\""
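The examples above sort on a single field. For a combination of fields, the "sort" value is, as far as I understand the Cloudant search documentation, a JSON array of the same strings, and a leading "-" on a field name reverses the order. The helper below is a hypothetical sketch that builds either form; treat the array and "-" syntax as an assumption to check against the documentation.

```shell
# Hypothetical helper: build the JSON value for the "sort" parameter
# from one or more field specs such as 'size<string>' or '-color<string>'.
# One spec yields a JSON string, several specs yield a JSON array.
sort_value() {
  if [ "$#" -eq 1 ]; then
    printf '"%s"' "$1"
  else
    out=
    for spec in "$@"; do
      # Append a comma only when something is already in the list.
      out="$out${out:+,}\"$spec\""
    done
    printf '[%s]' "$out"
  fi
}

sort_value 'size<string>' '-color<string>'
echo
```

The value contains quotes and angle brackets, so it is easiest to let curl encode it:

curl -sG "https://$account.cloudant.com/foo/_design/bar/_search/baz" --data-urlencode 'q=*:*' --data-urlencode "sort=$(sort_value 'size<string>' '-color<string>')"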
Tokenization (https://docs.cloudant.com/search.html)
• Tokenizers break textual input into tokens for efficient and flexible searching
• Choosing an appropriate tokenizer is often critical to getting relevant matches
• Generic analyzers: standard, email, keyword, whitespace
• Language-specific analyzers: english, french, german, spanish, chinese, dutch...
• You can configure different analyzers for different fields
• Some tokenizers omit common words (stop words)
• Some tokenizers strip common prefixes or suffixes (stemming)
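To illustrate the point about different analyzers for different fields: Cloudant's search documentation describes a "perfield" analyzer that names a default analyzer and per-field overrides. The sketch below prints such a design document for the lab's index. The choice of "keyword" for color and "english" as the default is only an example, and the function name is hypothetical.

```shell
# Hypothetical helper: print a design document whose index uses the
# "perfield" analyzer. The color field is indexed with "keyword";
# every other field falls back to "english".
perfield_design_doc() {
  cat <<'EOF'
{
  "indexes": {
    "baz": {
      "analyzer": {
        "name": "perfield",
        "default": "english",
        "fields": { "color": "keyword" }
      },
      "index": "function(doc){index(\"color\", doc.color); index(\"size\", doc.size);}"
    }
  }
}
EOF
}

perfield_design_doc
```

Overwriting an existing design document requires its current _rev, so for a quick experiment it is simpler to PUT this under a new design document name of your choosing.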
Tokenization Examples
> curl https://$account.cloudant.com/_search_analyze -H 'Content-Type: application/json' -d '{"analyzer":"standard", "text": "rnewson@apache.org"}'
{"tokens":["rnewson","apache.org"]}
> curl https://$account.cloudant.com/_search_analyze -H 'Content-Type: application/json' -d '{"analyzer":"email", "text": "rnewson@apache.org"}'
{"tokens":["rnewson@apache.org"]}
> curl https://$account.cloudant.com/_search_analyze -H 'Content-Type: application/json' -d '{"analyzer":"english", "text": "running"}'
{"tokens":["run"]}

CouchDB Day NYC 2017: Full Text Search
