Multidimensional Data Analysis with JRuby
Raimonds Simanovskis
github.com/rsim
@rsim
Relational data model
SQL is good for detailed data queries
Get all sales transactions in USA, California
SELECT customers.fullname, products.product_name,
  sales.sales_date, sales.unit_sales, sales.store_sales
FROM sales
  LEFT JOIN products ON sales.product_id = products.id
  LEFT JOIN customers ON sales.customer_id = customers.id
WHERE customers.country = 'USA' AND customers.state_province = 'CA'
SQL becomes complex for analytical queries
Get total sales in USA, California in Q1, 2011 by main product groups

SELECT product_class.product_family,
       SUM(sales.unit_sales) unit_sales_sum,
       SUM(sales.store_sales) store_sales_sum
    FROM sales
      LEFT JOIN product ON sales.product_id = product.product_id
      LEFT JOIN product_class
           ON product.product_class_id = product_class.product_class_id
      LEFT JOIN time_by_day ON sales.time_id = time_by_day.time_id
      LEFT JOIN customer ON sales.customer_id = customer.customer_id
    WHERE time_by_day.the_year = 2011 AND time_by_day.quarter = 'Q1'
      AND customer.country = 'USA' AND customer.state_province = 'CA'
    GROUP BY product_class.product_family
Maybe write a distributed map-reduce function?
Multidimensional Data Model
Multidimensional cubes
Dimensions
Hierarchies and levels
Measures
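
To make these terms concrete, here is a minimal pure-Ruby sketch of the idea, not of any OLAP engine. Each fact row carries dimension members (year, quarter, country, state, product family) and measures (unit sales, store sales). The `Fact` struct, the `roll_up` helper, and all data values are assumptions invented for illustration.

```ruby
# Illustrative fact rows; the values are made up for this sketch.
Fact = Struct.new(:year, :quarter, :country, :state,
                  :product_family, :unit_sales, :store_sales)

FACTS = [
  Fact.new(2011, 'Q1', 'USA', 'CA', 'Food',  10, 25.0),
  Fact.new(2011, 'Q1', 'USA', 'CA', 'Drink',  4, 12.0),
  Fact.new(2011, 'Q1', 'USA', 'WA', 'Food',   7, 18.5),
  Fact.new(2011, 'Q2', 'USA', 'CA', 'Food',   3,  9.0),
  Fact.new(2011, 'Q1', 'USA', 'CA', 'Food',   6, 15.0)
]

# Hypothetical helper: slice the cube by fixed dimension members,
# group the remaining rows by one dimension, and sum one measure.
def roll_up(facts, slice:, by:, measure:)
  facts
    .select { |f| slice.all? { |dim, member| f[dim] == member } }
    .group_by { |f| f[by] }
    .transform_values { |rows| rows.sum { |r| r[measure] } }
end

totals = roll_up(FACTS,
                 slice: { year: 2011, quarter: 'Q1', country: 'USA', state: 'CA' },
                 by: :product_family,
                 measure: :unit_sales)
puts totals.inspect
```

The slice plays the role of a WHERE clause on dimension members, `by` picks the dimension shown on an axis, and `measure` is the value being aggregated.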
OLAP technologies
  On-Line Analytical Processing
http://github.com/rsim/mondrian-olap
MDX query language
Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups


SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS,
       [Product].children ON ROWS
FROM   [Sales]
WHERE ( [Time].[2011].[Q1], [Customers].[USA].[CA] )
Or in Ruby like this
Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups

olap.from('Sales').
columns('[Measures].[Unit Sales]',
        '[Measures].[Store Sales]').
rows('[Product].children').
where('[Time].[2011].[Q1]', '[Customers].[USA].[CA]').
execute
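
One way to picture how a chained query like this can correspond to MDX text is a small builder that collects the parts and renders a string. The `MdxSketch` class below is a hypothetical stand-in written for this illustration; it is not the mondrian-olap implementation and covers only `columns`, `rows` and `where`.

```ruby
# Hypothetical mini-builder for illustration only (not the gem's code).
class MdxSketch
  def initialize(cube)
    @cube = cube
    @columns = []
    @rows = []
    @where = []
  end

  # Each setter returns self so that calls can be chained.
  def columns(*members)
    @columns = members
    self
  end

  def rows(*members)
    @rows = members
    self
  end

  def where(*members)
    @where = members
    self
  end

  # Render the collected parts as a single MDX statement.
  def to_mdx
    mdx = "SELECT #{set(@columns)} ON COLUMNS, #{set(@rows)} ON ROWS FROM [#{@cube}]"
    mdx += " WHERE (#{@where.join(', ')})" unless @where.empty?
    mdx
  end

  private

  # A single expression is emitted as-is; several members are wrapped in braces.
  def set(members)
    members.size == 1 ? members.first : "{#{members.join(', ')}}"
  end
end

mdx = MdxSketch.new('Sales')
  .columns('[Measures].[Unit Sales]', '[Measures].[Store Sales]')
  .rows('[Product].children')
  .where('[Time].[2011].[Q1]', '[Customers].[USA].[CA]')
  .to_mdx
puts mdx
```

The rendered string has the same shape as the hand-written MDX query shown earlier in the deck.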
Also more complex queries
Get sales amount and profit % of top 50 products sold in USA and Canada during Q1, 2011

# ProfitPct is a calculated member: (sales - cost) / sales, formatted as a percentage
olap.from('Sales').
with_member('[Measures].[ProfitPct]').
  as('(Measures.[Store Sales] - Measures.[Store Cost]) / Measures.[Store Sales]',
  :format_string => 'Percent').
columns('[Measures].[Store Sales]', '[Measures].[ProfitPct]').
rows('[Product].children').crossjoin('[Customers].[Canada]', '[Customers].[USA]').
  top_count(50, '[Measures].[Store Sales]').
where('[Time].[2011].[Q1]').
execute
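
The arithmetic behind the calculated member and the top-N selection can be sketched in plain Ruby. The rows, the `profit_pct` and `top_count` helpers, and all numbers below are assumptions made up for illustration. An OLAP engine evaluates this per cell rather than over an in-memory array.

```ruby
# Made-up result rows for this sketch.
rows = [
  { product: 'Food',           store_sales: 200.0, store_cost: 80.0 },
  { product: 'Drink',          store_sales:  50.0, store_cost: 20.0 },
  { product: 'Non-Consumable', store_sales: 100.0, store_cost: 45.0 }
]

# Same formula as the ProfitPct member: (sales - cost) / sales.
def profit_pct(row)
  (row[:store_sales] - row[:store_cost]) / row[:store_sales]
end

# Keep the n rows with the largest value of the given measure, largest first.
def top_count(rows, n, measure)
  rows.max_by(n) { |r| r[measure] }
end

top = top_count(rows, 2, :store_sales).map do |r|
  [r[:product], (profit_pct(r) * 100).round(1)]
end
puts top.inspect
```

Here the ranking uses store sales while the reported value is the profit percentage, which mirrors the query above: `top_count` orders by `[Measures].[Store Sales]` and the columns show both measures.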
OLAP schema (mapping cube to tables)
schema = Mondrian::OLAP::Schema.define do
  cube 'Sales' do
    table 'sales'
    dimension 'Gender', :foreign_key => 'customer_id' do
      hierarchy :has_all => true, :primary_key => 'customer_id' do
        table 'customer'
        level 'Gender', :column => 'gender', :unique_members => true
      end
    end
    dimension 'Time', :foreign_key => 'time_id' do
      hierarchy :has_all => false, :primary_key => 'time_id' do
        table 'time_by_day'
        level 'Year', :column => 'the_year', :type => 'Numeric', :unique_members => true
        level 'Quarter', :column => 'quarter', :unique_members => false
        level 'Month', :column => 'month_of_year', :type => 'Numeric', :unique_members => false
      end
    end
    measure 'Unit Sales', :column => 'unit_sales', :aggregator => 'sum'
    measure 'Store Sales', :column => 'store_sales', :aggregator => 'sum'
  end
end
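
To show what this mapping information makes possible, the sketch below restates part of the schema as a plain hash and derives a SQL aggregate query from it. The `SALES_CUBE` hash layout and the `sql_for` helper are hypothetical and are not how Mondrian generates SQL. They only illustrate that the fact table, foreign key, dimension table, level column and aggregator are enough to build the join and the GROUP BY.

```ruby
# Plain-hash stand-in for the schema above (hypothetical structure).
SALES_CUBE = {
  table: 'sales',
  dimensions: {
    'Gender' => { foreign_key: 'customer_id', table: 'customer',
                  primary_key: 'customer_id',
                  levels: { 'Gender' => 'gender' } },
    'Time'   => { foreign_key: 'time_id', table: 'time_by_day',
                  primary_key: 'time_id',
                  levels: { 'Year' => 'the_year', 'Quarter' => 'quarter',
                            'Month' => 'month_of_year' } }
  },
  measures: {
    'Unit Sales'  => { column: 'unit_sales',  aggregator: 'sum' },
    'Store Sales' => { column: 'store_sales', aggregator: 'sum' }
  }
}

# Hypothetical helper: build "measure by level" SQL from the mapping.
def sql_for(cube, measure:, dimension:, level:)
  dim = cube[:dimensions].fetch(dimension)
  m = cube[:measures].fetch(measure)
  fact = cube[:table]
  level_column = "#{dim[:table]}.#{dim[:levels].fetch(level)}"
  "SELECT #{level_column}, #{m[:aggregator].upcase}(#{fact}.#{m[:column]}) " \
    "FROM #{fact} " \
    "JOIN #{dim[:table]} ON #{fact}.#{dim[:foreign_key]} = #{dim[:table]}.#{dim[:primary_key]} " \
    "GROUP BY #{level_column}"
end

sql = sql_for(SALES_CUBE, measure: 'Unit Sales', dimension: 'Gender', level: 'Gender')
puts sql
```

The helper groups by a single level column, so it suits levels declared with `:unique_members => true`. A level such as Quarter, whose members repeat across years, would also need its parent level columns in the GROUP BY.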
mondrian-olap gem
   eazybi.com
