INTERACTIVELY QUERY AND SEARCH YOUR BIG DATA
Romain Rigaux
GOALS

Build a Web app
Quickly explore data
… with Solr
Make Solr / Hadoop easier to use
+
ARCHITECTURE

“Just a view” on top of the standard Solr API
REST
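A minimal sketch of what "just a view on top of the standard Solr API" can mean in practice: the app only builds a regular Solr /select URL. The host, the collection name `logs` and the helper name are assumptions for illustration; no request is sent.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, collection, query, rows=10):
    """Build the URL of a standard Solr /select request (nothing is sent)."""
    params = {"q": query, "wt": "json", "rows": rows}
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection, used only for illustration.
url = solr_select_url("http://localhost:8983/solr", "logs", "Chrome")
print(url)
```

Any HTTP client can then fetch this URL; the UI layer adds nothing that Solr does not already expose.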
HISTORY

V1 USER
HISTORY

V1 ADMIN
ARCHITECTURE

NEXT!
Lots of learning, UX boost needed
Simple: users don’t know it is Solr
HISTORY

V2 USER
HISTORY

V2 ADMIN
HISTORY

V2 BETTER UX
ARCHITECTURE
/select
/admin/collections
/get
/luke...
/add_widget
/zoom_in
/select_facet
/select_range...
REST AJAX
Templates + JS Model
www….
ARCHITECTURE

UI FOR FACETS
Query: Search terms, selected facets (q, fqs)
Collection: Dashboard, fields, template, widgets (ids)
Layout: All the 2D positioning (cell ids), visual, drag&drop
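The three-part model above (Query, Collection, Layout) could be sketched as plain data classes serialized to JSON. All class and attribute names here are hypothetical and are not taken from the Hue code base.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Query:
    q: str = ""                                   # search terms
    fqs: list = field(default_factory=list)       # selected facets (filter queries)


@dataclass
class Collection:
    name: str
    fields: list = field(default_factory=list)
    widget_ids: list = field(default_factory=list)


@dataclass
class Layout:
    cells: dict = field(default_factory=dict)     # cell id -> widget id


@dataclass
class Dashboard:
    query: Query
    collection: Collection
    layout: Layout


# Hypothetical dashboard state, for illustration only.
dashboard = Dashboard(
    query=Query(q="Chrome", fqs=["bytes:[900000 TO 1800000]"]),
    collection=Collection(name="logs", fields=["bytes", "os"], widget_ids=["w1"]),
    layout=Layout(cells={"row0-col0": "w1"}),
)
state = json.dumps(asdict(dashboard), sort_keys=True)
print(state)
```

Keeping the positioning separate from the query means a drag and drop only changes `layout`, while selecting a facet only changes `query`.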
ADDING A WIDGET

LIFECYCLE
Load the initial page
Edit mode and Drag&Drop
/solr/zookeeper/clusterstate.json
/solr/admin/luke…
/get_collection
ADDING A WIDGET

LIFECYCLE
/solr/select?stats=true /new_facet
Select the field
Guess ranges (number or dates)
Rounding (number or dates)
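One possible way to "guess ranges" and apply "rounding" for a numeric field, given the min and max returned by a stats query. This is only a sketch of the idea, not the logic Hue actually uses; the function name, the default of 10 slots and the sample min/max are assumptions.

```python
import math


def guess_range(min_value, max_value, slots=10):
    """Return (start, end, gap) rounded to 'nice' numbers covering [min, max]."""
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    raw_gap = (max_value - min_value) / slots
    # Round the gap up to one significant digit, e.g. 876419.8 -> 900000.
    magnitude = 10 ** math.floor(math.log10(raw_gap))
    gap = math.ceil(raw_gap / magnitude) * magnitude
    # Align start and end on multiples of the gap.
    start = math.floor(min_value / gap) * gap
    end = math.ceil(max_value / gap) * gap
    return start, end, gap


# Hypothetical stats for a "bytes" field.
start, end, gap = guess_range(1234, 8765432)
print(start, end, gap)
```

Dates would need the same treatment with calendar units (day, month, year) instead of powers of ten.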
ADDING A WIDGET

LIFECYCLE
Query part 1
Query part 2
Augment Solr response
facet.range={!ex=bytes}bytes&f.bytes.facet.range.start=0&f.bytes.facet.range.end=9000000&f.bytes.facet.range.gap=900000&f.bytes.facet.mincount=0&f.bytes.facet.limit=10
q=Chrome&fq={!tag=bytes}bytes:[900000+TO+1800000]
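The two query parts above can be assembled programmatically. The sketch below builds the same kind of parameters: a range facet that excludes its own filter (`{!ex=...}`) and a tagged filter query (`{!tag=...}`) for the selected range. The helper name is hypothetical, and `facet=true` is added because Solr needs it to enable faceting.

```python
from urllib.parse import urlencode


def range_facet_params(q, field, start, end, gap, selected=None):
    """Return Solr parameters for a range facet with an optional selected range."""
    params = [
        ("q", q),
        ("facet", "true"),
        # {!ex=...} keeps counts for all ranges even when one is selected.
        ("facet.range", f"{{!ex={field}}}{field}"),
        (f"f.{field}.facet.range.start", start),
        (f"f.{field}.facet.range.end", end),
        (f"f.{field}.facet.range.gap", gap),
        (f"f.{field}.facet.mincount", 0),
    ]
    if selected is not None:
        lower, upper = selected
        # {!tag=...} names the filter so the facet above can exclude it.
        params.append(("fq", f"{{!tag={field}}}{field}:[{lower} TO {upper}]"))
    return params


params = range_facet_params("Chrome", "bytes", 0, 9000000, 900000,
                            selected=(900000, 1800000))
query_string = urlencode(params)
print(query_string)
```

Tagging and excluding is what lets the widget keep showing every bar of the histogram while one bar is selected.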
{
  'facet_counts': {
    'facet_ranges': {
      'bytes': {
        'start': 10000,
        'counts': [
          '900000',
          3423,
          '1800000',
          339,
          ...
        ]
      }
    }
  }
}
{
  ...,
  'normalized_facets': [
    {
      'extraSeries': [],
      'label': 'bytes',
      'field': 'bytes',
      'counts': [
        {
          'from': '900000',
          'to': '1800000',
          'selected': True,
          'value': 3423,
          'field': 'bytes',
          'exclude': False
        }
      ], ...
    }
  ]
}
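The "augment Solr response" step can be sketched as a small transformation: Solr returns range counts as a flat list alternating lower bound and count, and the UI wants one dictionary per bucket. The function name and the way the selected bucket is passed in are assumptions; the output keys follow the `normalized_facets` example above.

```python
def normalize_range_facet(field, solr_facet, gap, selected_from=None):
    """Turn Solr's flat [lower, count, lower, count, ...] list into bucket dicts."""
    flat = solr_facet["counts"]
    counts = []
    # Pair every lower bound (even index) with its count (odd index).
    for lower, value in zip(flat[0::2], flat[1::2]):
        counts.append({
            "from": lower,
            "to": str(int(lower) + gap),
            "selected": lower == selected_from,
            "value": value,
            "field": field,
            "exclude": False,
        })
    return {"label": field, "field": field, "extraSeries": [], "counts": counts}


solr_facet = {"counts": ["900000", 3423, "1800000", 339]}
normalized = normalize_range_facet("bytes", solr_facet, gap=900000,
                                   selected_from="900000")
print(normalized["counts"][0])
```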
JSON TO WIDGET

{
"field":"rate_code",
"counts":[
{
"count":97797,
"exclude":true,
"selected":false,
"value":"1",
"cat":"rate_code"
} ...
{
"field":"medallion",
"counts":[
{
"count":159,
"exclude":true,
"selected":false,
"value":"6CA28FC49A4C49A9A96",
"cat":"medallion"
} ...
{
"extraSeries":[
],
"label":"trip_time_in_secs",
"field":"trip_time_in_secs",
"counts":[
{
"from":"0",
"to":"10",
"selected":false,
"value":527,
"field":"trip_time_in_secs",
"exclude":true
} ...
{
"field":"passenger_count",
"counts":[
{
"count":74766,
"exclude":true,
"selected":false,
"value":"1",
"cat":"passenger_count"
} ...
REPEAT

UNTIL…
GAME CHANGER!
Possibilities
5.1 / 5.2
Analytic Facets
FACET

FUNCTIONS
Count
Sum
Avg
Percentile
Max
...
Count(id)
Sum(bytes)
Avg(mul(price, quantity))
Percentile(salary, 50, 90)
Max(temperature)
...
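As a rough illustration, facet functions like the ones listed above can be sent to Solr through the `json.facet` parameter of the JSON Facet API, where each key is a label chosen by the caller and each value is a function expression. The labels and the helper name below are hypothetical; the lowercase function names (`sum`, `avg`, `percentile`, `max`) are the spellings that API uses.

```python
import json


def stat_facets(**functions):
    """Return request parameters carrying one facet function per label."""
    return {"json.facet": json.dumps(functions, sort_keys=True)}


# Labels on the left are arbitrary; field names come from the list above.
params = stat_facets(
    total_bytes="sum(bytes)",
    avg_order="avg(mul(price,quantity))",
    salary_pct="percentile(salary,50,90)",
    max_temp="max(temperature)",
)
print(params["json.facet"])
```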
SUB “NESTED”

FACETS
top_os {
  type: term,
  field: os,
  limit: 5
}
top_os {
  type: term,
  field: os,
  limit: 5,
  facet: {
    by_country: {
      type: term,
      field: country
    }
  }
}
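The nested facet above can be built as an ordinary dictionary and serialized as a JSON request body. This is a sketch with hypothetical helper names. Note that the slides write the type as `term`, while the JSON Facet API documentation spells it `terms`, which is what the sketch uses.

```python
import json


def terms_facet(field, limit=None, sub_facets=None):
    """Build a terms facet, optionally with nested sub-facets."""
    facet = {"type": "terms", "field": field}
    if limit is not None:
        facet["limit"] = limit
    if sub_facets:
        facet["facet"] = sub_facets
    return facet


# Top 5 operating systems, each broken down by country.
request = {
    "top_os": terms_facet(
        "os", limit=5,
        sub_facets={"by_country": terms_facet("country")},
    )
}
body = json.dumps({"query": "*:*", "facet": request}, indent=2)
print(body)
```

Because a sub-facet is just another facet placed under the `facet` key, the same helper nests to any depth.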
FUNCTION + NESTED =

ANALYTICS
states {
  type: term,
  field: state,
  facet: {
    by_month: {
      type: range,
      field: time,
      start: "TODAY-6MONTHS",
      end: "TODAY",
      gap: "MONTH",
      facet: {
        avg_sal: "avg(salary)"
      }
    }
  }
}
states {
  type: term,
  field: state,
  facet: {
    avg_sal: "avg(salary)"
  }
}
OPERATIONS ON

BUCKETS OF DATA
Counts → Functions
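To make "Counts → Functions" concrete, here is a plain-Python illustration of what a function over buckets computes: documents are grouped by a field, and each bucket gets an aggregate in addition to its count. This mimics the semantics on a handful of made-up documents; it is not how Solr computes it internally.

```python
from collections import defaultdict


def bucket_stats(docs, bucket_field, value_field):
    """Group docs by bucket_field and return the count and average per bucket."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[bucket_field]].append(doc[value_field])
    return {
        bucket: {"count": len(values), "avg": sum(values) / len(values)}
        for bucket, values in groups.items()
    }


# Made-up documents, for illustration only.
docs = [
    {"state": "CA", "salary": 100},
    {"state": "CA", "salary": 200},
    {"state": "NY", "salary": 150},
]
stats = bucket_stats(docs, "state", "salary")
print(stats)
```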
OPERATIONS ON

BUCKETS OF DATA
Nested → nD functions
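With nesting, each function value is addressed by several dimensions at once. The sketch below flattens a two-level JSON Facet API style response (buckets with `val`, `count` and nested sub-facets) into rows of (outer value, inner value, metric). The sample response is invented for illustration and the helper name is hypothetical.

```python
def flatten_buckets(facets, outer, inner, metric):
    """Flatten two nested facets into (outer_val, inner_val, metric) rows."""
    rows = []
    for outer_bucket in facets[outer]["buckets"]:
        inner_buckets = outer_bucket.get(inner, {}).get("buckets", [])
        for inner_bucket in inner_buckets:
            rows.append(
                (outer_bucket["val"], inner_bucket["val"], inner_bucket.get(metric))
            )
    return rows


# Invented response fragment shaped like the "facets" section of a reply.
facets = {
    "count": 3,
    "states": {"buckets": [
        {"val": "CA", "count": 2, "by_month": {"buckets": [
            {"val": "2015-05-01T00:00:00Z", "count": 2, "avg_sal": 150.0},
        ]}},
        {"val": "NY", "count": 1, "by_month": {"buckets": [
            {"val": "2015-05-01T00:00:00Z", "count": 1, "avg_sal": 150.0},
        ]}},
    ]},
}
rows = flatten_buckets(facets, "states", "by_month", "avg_sal")
print(rows)
```

Rows in this shape map directly onto a pivot table or a multi-series chart widget.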
ENTERPRISE

FEATURES
- Access to Search App configurable, LDAP/SAML auths
- Share by link
- Solr Cloud (or non Cloud)
- Proxy user
   /solr/jobs_demo/select?user.name=hue&doAs=romain&q=
- Security
   Kerberos
- Sentry
   Collection level, Solr calls like /admin, /query, Solr UI, ZooKeeper
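The proxy-user call listed above can be reproduced with a small helper: the service account goes in `user.name` and the logged-in end user in `doAs`. The function name is hypothetical; the collection and user names are the ones from the slide.

```python
from urllib.parse import urlencode


def proxied_select_path(collection, service_user, end_user, query=""):
    """Build a /select path where service_user impersonates end_user."""
    params = {"user.name": service_user, "doAs": end_user, "q": query}
    return f"/solr/{collection}/select?{urlencode(params)}"


path = proxied_select_path("jobs_demo", "hue", "romain")
print(path)
```

The search server can then apply permissions for `romain` even though the request comes from the `hue` service.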
SEARCH AS ONLY

APP IN HUE
gethue.com/solr-search-ui-only/
• Spark in your browser
• Notebooks
• New REST Server
SPARK

INDEXING
WHAT
• Open source REST for Spark Shell
• Runs locally or inside YARN
• Spark Scala, PySpark and jar/py submission
SPARK

INDEXING
WHAT
https://github.com/cloudera/hue/tree/master/apps/spark/java
SPARK STREAMING
Real time!
Spark Solr
• Python
• Scala
• Charts
NOTEBOOKS / SHELL
WHAT
DEMO
TIME

• Analyze Bay Area bike share
• Visualize one year of data
• Know your users, predict behavior
MISSED

SOMETHING?
demo.gethue.com
• Full Analytics
• Easier indexing
• Geo
• Export/Share results
• “More like this”
• Solr Joins, Solr SQL
• Spark, SQL... integration, Hue 4
WHAT’S NEXT
NEW FEATURES
TWITTER
@gethue
USER GROUP
hue-user@
WEBSITE
http://gethue.com
LEARN
http://learn.gethue.com
THANKS!


Hadoop Summit - Interactive Big Data Analysis with Solr, Spark and Hue