Lily: Smart data, at scale, made easy

                      from content storage
                      to scaling smart data

                      IIC » TECHNOLOGIEPARK 3 » B-9052 ZWIJNAARDE (GENT) » www.outerthought.org

Monday 6 June 2011

the pain

              [figure labels: data; moore; need for distributed processing]

the pain

              » growth of data sets
              » smart businesses need to apply analytics to activities
              » doing business online means real-time
              » talent shortage

              Smart data, at scale, made easy

LILY


                  The Real-time Platform built for the Age of Data.

                  We manage, track and measure your data and users,
                  and do the mat(c)hmaking in-between:
                  » provide you with business intelligence and analytics
                  » harvest user profiles and learn their interests
                  » dynamically engage your users using quality recommendations



where would you use lily?

        » large collections of data
             » content repositories
             » library catalogs
             » (media) asset management
             » product catalogs
             » ‘live’ archives
        » large groups of users
             » e-commerce / retail
             » news / media
        » ... if you want to use big data, but you need easy.

+ this is where the magic happens

beyond content management

              [figure labels: broadcast; marketing; revenue; product / service]

beyond content management: data + analytics

              [figure labels: call to action; recommendations; personalised; revenue; product / service; audience data]

LILY 2.0: smart data

              SMARTER DATA
              » data processing: recommendations, semantic augmentation, analytics
              » relations
              » usage metrics
              » domain knowledge: patterns, rules, keywords, lists, ...

roadmap
              » now: highly-scalable data repository: store, index and search
              » next: with real-time usage stats gathering and analytics
              » later: and built-in context- and user-sensitive
                  recommendations

              » built on top of HBase (an open-source implementation of the Google BigTable design) and Solr
                  » the same robust technology in use at Facebook, Twitter,
                      StumbleUpon, Yahoo!
                  » scales widely over distributed (cloud) infrastructure

Lily Repository Model




Sample Lily Schema (excerpt)

    {
      namespaces: {
        /* Declaration of namespace prefixes. */
        "org.lilyproject.bookssample": "b",
        "org.lilyproject.vtag": "vtag"
      },
      fieldTypes: [
        {
          name: "b$title",
          valueType: { primitive: "STRING" },
          scope: "versioned"
        },
        {
          name: "b$pages",
          valueType: { primitive: "INTEGER" },
          scope: "versioned"
        },
        {
          name: "b$language",
          valueType: { primitive: "STRING" },
          scope: "versioned"
        },
        {
          name: "b$authors",
          valueType: { primitive: "LINK", multiValue: true },
          scope: "versioned"
        },
        {
          name: "b$name",
          valueType: { primitive: "STRING" },
          scope: "versioned"
        },
        {
          name: "b$bio",
          valueType: { primitive: "STRING" },
          scope: "versioned"
        },
        {
          name: "vtag$last",
          valueType: { primitive: "LONG" },
          scope: "non_versioned"
        }
      ],
      recordTypes: [
        {
          name: "b$Book",
          fields: [
            {name: "b$title", mandatory: true },
            {name: "b$pages", mandatory: false },
            {name: "b$language", mandatory: false },
            {name: "b$authors", mandatory: false },
            {name: "vtag$last", mandatory: false }
          ]
        },
        ...

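
To make the `mandatory` flags in the schema excerpt concrete, here is a minimal Python sketch of checking a record against the `b$Book` record type. It is an illustration only and not the Lily API: `RECORD_TYPE`, `missing_mandatory_fields` and the sample record are hypothetical names and data; only the field names and flags come from the slide.

```python
# Hypothetical illustration only: this is not the Lily client API.
# Field names and mandatory flags are copied from the b$Book record type.
RECORD_TYPE = {
    "name": "b$Book",
    "fields": [
        {"name": "b$title", "mandatory": True},
        {"name": "b$pages", "mandatory": False},
        {"name": "b$language", "mandatory": False},
        {"name": "b$authors", "mandatory": False},
        {"name": "vtag$last", "mandatory": False},
    ],
}


def missing_mandatory_fields(record_type, record):
    """Return the names of mandatory fields that the record does not set."""
    return [
        field["name"]
        for field in record_type["fields"]
        if field["mandatory"] and field["name"] not in record
    ]


# Made-up sample record: it sets optional fields but omits the title.
book = {"b$pages": 320, "b$language": "en"}
problems = missing_mandatory_fields(RECORD_TYPE, book)
```

With the sample record, `problems` names `b$title`, the only field the record type marks as mandatory.
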
Lily Architecture (deployment)

Lily Architecture (components)

HBase indexing & RowLog Library

        » building and querying indexes, GAE-style

            content table
                rowkey    col     col
                A         val3    foo6
                B         val2    foo7

            index table A (order)
                rowkey    col
                val2-B
                val3-A

        » need for sync/async operations
            » updating of secondary indexes (e.g. link tables)
            » feeding of Indexer (= indexes Lily-content into Solr)
        » not: transactions
        » need for distribution and durability

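
The index table on this slide can be read as follows: each index row key is the indexed value joined to the content row key, and because row keys are kept in sorted order, a lookup becomes a range scan over a key prefix. The Python sketch below illustrates that idea with the slide's sample rows. It is an assumption-laden toy, not the Lily or HBase API: the column names `col1` and `col2`, the `-` separator handling and both function names are hypothetical.

```python
import bisect

# Content table from the slide: row key -> columns.
# The column names "col1" and "col2" are made up; the slide only says "col".
content = {
    "A": {"col1": "val3", "col2": "foo6"},
    "B": {"col1": "val2", "col2": "foo7"},
}


def build_index(table, column):
    """Build sorted index row keys of the form '<value>-<content rowkey>'."""
    return sorted(f"{row[column]}-{key}" for key, row in table.items())


def lookup(index, value):
    """Range-scan the sorted index for content row keys with this value.

    Simplification: values that themselves contain '-' would need a
    safer encoding (for example fixed-width or escaped fields).
    """
    prefix = f"{value}-"
    start = bisect.bisect_left(index, prefix)
    hits = []
    for entry in index[start:]:
        if not entry.startswith(prefix):
            break  # past the end of the matching key range
        hits.append(entry[len(prefix):])
    return hits


index = build_index(content, "col1")  # ['val2-B', 'val3-A']
rows_with_val3 = lookup(index, "val3")  # ['A']
```

The sorted key list stands in for the ordered row keys of the index table; the scan stops at the first key that no longer matches the prefix.
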
The Lily Indexer

              » denormalization
              » indexing of multiple versions of a record
              » incremental index updating
              » batch index building
              » blob content extraction
              » sharding towards multiple SOLR instances

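
The slide does not say how records are assigned to Solr instances. One common approach, sketched below in Python purely as an assumption, is to hash the record id so that the same record always lands on the same shard. The shard names, `SHARDS` and `shard_for` are hypothetical and are not part of Lily.

```python
import hashlib

# Hypothetical shard names; a real setup would list Solr instance addresses.
SHARDS = ["shard1", "shard2", "shard3"]


def shard_for(record_id, shards=SHARDS):
    """Pick a shard from a stable hash of the record id.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and would not give a stable assignment.
    """
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


target = shard_for("book-1")
```

A plain modulo scheme like this reshuffles most records when the number of shards changes, so it is only a starting point for discussion.
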
status June 2011

              » Lily 1.0.1 released - in development since Q4/09
              » some customers - DIY retail / media / news
                  » e-commerce platform project
                  » Lily as the data (integration) tier
              » first contrib: FrogPond (annotated Java <> Lily mapper)
                  https://bitbucket.org/calmera/frogpond

Next up: usage stats

              » sits in CRUD-path
              » tracks user ops against records
                  » from both perspectives
                  » arbitrary K/V properties: time, location, ...
              » automatically builds user profiles (as records)
                  » tied to record ops
                  » indexed access
                  » time dimension: trending

              [figure labels: record; interactions; user; recommendations; indexes; time]

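
As a rough illustration of the usage-stats idea, the Python sketch below records each user operation once and makes it reachable from both the user side and the record side, with arbitrary key/value properties and a simple time-based trending count. This is a hypothetical in-memory toy, not the planned Lily feature: `UsageTracker`, its methods and the sample ids are all assumptions.

```python
from collections import defaultdict


class UsageTracker:
    """Records user operations against records, viewable from both sides."""

    def __init__(self):
        self.by_user = defaultdict(list)    # user id -> interactions
        self.by_record = defaultdict(list)  # record id -> interactions

    def track(self, user_id, record_id, op, **properties):
        """Store one interaction with arbitrary key/value properties."""
        interaction = {
            "user": user_id,
            "record": record_id,
            "op": op,
            "properties": properties,
        }
        self.by_user[user_id].append(interaction)
        self.by_record[record_id].append(interaction)

    def trending(self, since):
        """Count interactions per record whose 'time' property is >= since."""
        counts = defaultdict(int)
        for record_id, interactions in self.by_record.items():
            for interaction in interactions:
                if interaction["properties"].get("time", 0) >= since:
                    counts[record_id] += 1
        # Most-used records first; ties broken by record id for stable output.
        return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


tracker = UsageTracker()
tracker.track("u1", "r1", "read", time=10)
tracker.track("u2", "r1", "read", time=20)
tracker.track("u1", "r2", "update", time=5, location="Gent")
```

Here `tracker.by_user["u1"]` acts as a crude user profile, and `tracker.trending(since=10)` ranks records by recent activity.
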
from usage stats to recommendations ‘light’

              » grouping of users based on
                  » shared properties
                  » shared record access
              » grouping of records based on
                  » shared properties
                  » shared user operations

              [figure labels: record; user; connections; recommendations]

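
Grouping users by shared record access can be sketched in a few lines of Python. The example below scores unseen records by how much the users who touched them overlap with the target user. It is a simple co-access heuristic chosen for illustration, not the algorithm Lily uses; `recommend` and the sample data are hypothetical.

```python
from collections import Counter


def recommend(access, user, limit=3):
    """Suggest records used by users who share record access with `user`.

    `access` maps each user id to the set of record ids that user touched.
    """
    seen = access.get(user, set())
    scores = Counter()
    for other, records in access.items():
        if other == user:
            continue
        shared = len(seen & records)
        if shared == 0:
            continue  # no connection between the two users
        for record in records - seen:
            scores[record] += shared  # stronger overlap, stronger vote
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [record for record, _ in ranked[:limit]]


# Made-up access data for illustration.
access = {
    "ann": {"r1", "r2"},
    "bob": {"r1", "r2", "r3"},
    "cy": {"r2", "r4"},
    "dee": {"r9"},
}
suggestions = recommend(access, "ann")
```

For `ann`, the record `r3` ranks above `r4` because `bob` shares two records with her while `cy` shares only one.
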
full-on recommendations

              » look at real-time-capable Mahout algorithms
              » pre-index or -calculate as much as possible
                  » save as secondary indexes
                  » present recommendations as part of record API
              » allow user to contribute ‘domain knowledge’ to
                  record processing pipeline
                  » pattern detection, keywords, ontologies, ...



timeline


              » Lily + usage stats                                                                 10/2011
              » Lily + usage stats + light-weight analytics                                        12/2011
              » Lily + recommendations ‘light’                                                      3/2012
              » Lily 2.0 : full-on recommendations                                                  6/2012



lily enterprise


              » adds tools:
                  » yum/deb package repo
                  » cluster deploy scripts
                      (also EC2)
                  » Admin UI
              » + enterprise support



demo (if time permits)

                       message
                      ‣to
                      ‣from
                      ‣parts
                      ‣listId
                      ‣subject
                      ‣sender

                       part
                      ‣content
                      ‣mediaType
                      ‣message

WHERE?



                                        www.lilyproject.org




Thank you!
                                                   for your attention
                                                   for your questions

                                                   » stevenn@outerthought.org
                                                   » @stevenn
