Cubes
                   light-weight OLAP




Stefan Urbanek ■ @Stiivi ■ stefan.urbanek@gmail.com ■ July 2012
source

  github.com/Stiivi/cubes

         documentation

packages.python.org/cubes/
Overview

■   purpose
■   analytical modelling and OLAP
■   slicing and dicing
■   OLAP server
■   SQL backend
analytical data modelling
        lightweight
http://tendre.sme.sk
aggregation browsing
     slicing and dicing
modelling   reporting
            aggregation browsing
Architecture
✂
 model     browser




             http

backends   server
Logical Model
 multidimensional, analytical
business/analyst’s
  point of view
transactions                 analysis
         OLTP                        OLAP




application (operational) data   analytical data
Model
           {
               "name": "My Model",
               "description": ....,

               "cubes": [...],
               "dimensions": [...]
           }




cubes                         dimensions
measures                        levels, attributes, hierarchy
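A minimal sketch of the model structure above, written as a Python dictionary and round-tripped through JSON. The layout of "levels" as a plain list of names and the helper `undeclared_dimensions` are assumptions made for illustration; they are not the Cubes API or its validation logic.

```python
import json

# Illustrative model; keys follow the slides, the level layout is an assumption.
model = {
    "name": "My Model",
    "description": "Illustrative model, not taken from the talk",
    "cubes": [
        {
            "name": "contracts",
            "dimensions": ["date", "category"],
            "measures": [
                {"name": "amount", "label": "Contract Amount",
                 "aggregations": ["sum"]}
            ],
        }
    ],
    "dimensions": [
        {"name": "date", "levels": ["year", "month", "day"]},
        {"name": "category", "levels": ["category"]},
    ],
}


def undeclared_dimensions(model):
    """Return (cube, dimension) pairs a cube uses but the model does not declare."""
    declared = {dim["name"] for dim in model["dimensions"]}
    return [
        (cube["name"], name)
        for cube in model["cubes"]
        for name in cube["dimensions"]
        if name not in declared
    ]


# Serialise to JSON text (what a model.json file would hold) and read it back.
text = json.dumps(model, indent=4)
print(undeclared_dimensions(json.loads(text)))
```

Cubes refer to dimensions by name, so the same dimension can be shared by several cubes in one model.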
Facts

                  measurable


      fact

                    fact data cell




most detailed information
location




type




              time



           dimensions
Dimension

■ provide context for facts
■ used to filter queries or reports
■ control scope of aggregation of
  facts
Hierarchy


     2010 May 1st



        levels
Dimension

■   levels and attributes
■   hierarchy*
■   key attributes
■   label attributes

"dimensions": [
    {
        "name": "date",
        "levels": ...,
        "hierarchy": ...
    },
    ...
]

                       *partial support for multiple hierarchies
label attribute   key attribute
                  for links to slices
Cube
■ dimensions
■ measures

               "cubes": [
                 {
                    "name": "contracts",
                    "dimensions": ["date", "category"],
                    "measures": [
                      {
                        "name": "amount",
                        "label": "Contract Amount",
                        "aggregations": ["sum"]
                      }
                    ]
                 },
                 ...
               ]


                *partial support for multiple hierarchies
localizable model and attributes

"attributes": [
    {
        "name": "group",
        "label": "Group code"
    },
    {
        "name": "group_label",
        "label": "Group",
        "locales": ["en", "sk"]
    }
]
Aggregation
  Browser

    ∑
∑ measures
get more details
Aggregation
                               Browser




■ SQL Snowflake Browser
■ SQL Denormalized Browser
■ MongoDB Browser
■ Some HTTP Data Service Browser
■ ?

                        “batteries” that are included
Browser Workspace




logical model
                +   data
Cell
context of interest




cell
cell
Path

              [45,2]




[2012, 6]
                       list of level keys
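A small sketch of how a path lines up with hierarchy levels, assuming a date dimension with the levels year, month and day (as in the "2010 May 1st" example). The helper `describe_path` is hypothetical and not part of Cubes.

```python
# Assumed hierarchy of the date dimension, from least to most detailed level.
LEVELS = ["year", "month", "day"]


def describe_path(path, levels=LEVELS):
    """Pair each key in the path with the hierarchy level it belongs to."""
    if len(path) > len(levels):
        raise ValueError("path is deeper than the hierarchy")
    return dict(zip(levels, path))


print(describe_path([2012, 6]))  # {'year': 2012, 'month': 6}
```

A shorter path simply stops at a higher level: [2012] is the whole year, [2012, 6] is June 2012.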
1   load_model("model.json")
2   create_workspace("sql",
                     model,
                     url="sqlite:///data.sqlite")
3   model.cube("sales")
4   workspace.browser(cube)

Diagram: Application, cubes (∑ Aggregation Browser), backend
summary




drill-down
browser.aggregate(cell)




                            summary
browser.aggregate(cell,
                  drilldown=["sector"])




                         drill-down
for row in result.drilldown:




              row["amount_sum"]
row[label_attribute]            row[key]
received_amount_sum


measure      aggregation




           record_count
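The naming scheme (measure plus aggregation, such as amount_sum, alongside record_count) can be sketched in plain Python. This is only an illustration over an in-memory list of made-up facts; the function `aggregate` below is hypothetical and is not the Cubes browser.

```python
from collections import defaultdict

# Made-up facts for illustration only.
facts = [
    {"sector": "construction", "amount": 100},
    {"sector": "construction", "amount": 50},
    {"sector": "it", "amount": 70},
]


def aggregate(facts, measure, drilldown=None):
    """Return a summary and, optionally, one row per drilldown member."""
    field = measure + "_sum"  # measure name + aggregation name
    summary = {"record_count": len(facts),
               field: sum(f[measure] for f in facts)}
    rows = []
    if drilldown is not None:
        groups = defaultdict(list)
        for fact in facts:
            groups[fact[drilldown]].append(fact)
        for key in sorted(groups):
            group = groups[key]
            rows.append({drilldown: key,
                         "record_count": len(group),
                         field: sum(f[measure] for f in group)})
    return {"summary": summary, "drilldown": rows}


result = aggregate(facts, "amount", drilldown="sector")
for row in result["drilldown"]:
    print(row["sector"], row["amount_sum"], row["record_count"])
```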
browser.facts(cell)


browser.values(cell, dimension)


browser.cell_details(cell)
✂
    Slicing and Dicing
✂
✂
✂
               April 2012
construction work              construction work in
                                    April 2012
             type




    supplier



                            date
cut types
✂

point         set                        range
[2010]        [[2010,10], [2010,12]]     from=[2010,10] to=[2010,12]
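A rough sketch of what the three cut types select, using paths as plain Python lists. The helpers `point_cut`, `set_cut`, `range_cut` and `in_cell` are hypothetical names for this illustration; in Cubes the cuts are objects such as PointCut. A cell is modelled here as one cut per dimension, all of which must match.

```python
def point_cut(path):
    """Match members at or below a single path."""
    return lambda member: member[:len(path)] == path


def set_cut(paths):
    """Match members at or below any of several paths."""
    return lambda member: any(member[:len(p)] == p for p in paths)


def range_cut(from_path=None, to_path=None):
    """Match members between two paths (inclusive); either end may be open."""
    def matches(member):
        if from_path is not None and member[:len(from_path)] < from_path:
            return False
        if to_path is not None and member[:len(to_path)] > to_path:
            return False
        return True
    return matches


def in_cell(fact, cuts):
    """A cell is the conjunction of its cuts: every dimension's cut must match."""
    return all(cut(fact[dimension]) for dimension, cut in cuts.items())


fact = {"date": [2010, 11, 5], "type": ["construction"]}
cell = {"date": range_cut([2010, 10], [2010, 12]),
        "type": point_cut(["construction"])}
print(in_cell(fact, cell))  # True
```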
Implicit Hierarchy
       drilldown
whole cube


                                          cell = Cell(cube)
                                          browser.aggregate(cell)
                Total


                                          browser.aggregate(cell,
                                                       drilldown=["date"])


2006 2007 2008 2009 2010


                                          cut = PointCut("date", [2010])
                                          cell = cell.slice(cut)

                                          browser.aggregate(cell,
                                                       drilldown=["date"])
Jan   Feb Mar Apr May   ...
Drill-down Level
drilldown = ["date"]


                implicit: next from cell




drilldown = {"date": "month"}


                              explicit
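One way to picture the two forms: the list form picks the level following the cell's current depth in each dimension, while the dictionary form names the level directly. The sketch below assumes a date hierarchy of year, month, day; `next_level` and `resolve_drilldown` are hypothetical helpers, not Cubes functions.

```python
def next_level(levels, path):
    """Return the level one step below the depth already fixed by the path."""
    depth = len(path)
    if depth >= len(levels):
        raise ValueError("cell is already at the last level")
    return levels[depth]


def resolve_drilldown(drilldown, cell_paths, hierarchies):
    """Normalise both drilldown forms into {dimension: level}."""
    if isinstance(drilldown, dict):
        return dict(drilldown)  # explicit: levels are given
    return {
        dim: next_level(hierarchies[dim], cell_paths.get(dim, []))
        for dim in drilldown    # implicit: next level from the cell
    }


hierarchies = {"date": ["year", "month", "day"]}  # assumed hierarchy

print(resolve_drilldown(["date"], {}, hierarchies))                # {'date': 'year'}
print(resolve_drilldown(["date"], {"date": [2010]}, hierarchies))  # {'date': 'month'}
print(resolve_drilldown({"date": "month"}, {}, hierarchies))       # {'date': 'month'}
```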
Cross Table
 experimental interface
   Category              Subcategory     2009     2010

     Assets           Due from Banks     3044     1803
     Assets              Investments    41012    36012
     Assets        Loans Outstanding   103657   118104
     Assets            Nonnegotiable     1202     1123
     Assets             Other Assets     2247     3071
     Assets        Other Receivables      984      811
     Assets              Receivables      176      171
     Assets               Securities       33      289
     Equity            Capital Stock    11491    11492
     Equity         Deferred Amounts      359      313
     Equity                    Other    -1683    -3043
     Equity        Retained Earnings    29870    28793
Liabilities               Borrowings   110040   128577
Liabilities   Derivative Liabilities   115642   110418
Liabilities                    Other       57        8
Liabilities        Other Liabilities     7321     5454
Liabilities             Sold or Lent     2323      998
rows = ["item.category",
        "item.subcategory"]

columns = ["year"]

measures = ["amount_sum"]

table = result.cross_table(
              rows,
              columns,
              measures
        )
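What such a cross table does can be sketched without the library: drill-down rows are pivoted so that row attributes form the key and the column attribute spreads the measure across columns. The function below is a hypothetical stand-in, not the experimental `cross_table` interface itself; the four input rows reuse figures from the table above.

```python
def cross_table(records, row_keys, column_key, measure):
    """Pivot flat drill-down records into {row key: {column value: measure}}."""
    columns = sorted({record[column_key] for record in records})
    table = {}
    for record in records:
        key = tuple(record[k] for k in row_keys)
        # Start each row with every column present, then fill in known cells.
        row = table.setdefault(key, dict.fromkeys(columns))
        row[record[column_key]] = record[measure]
    return columns, table


records = [
    {"item.category": "Assets", "item.subcategory": "Securities",
     "year": 2009, "amount_sum": 33},
    {"item.category": "Assets", "item.subcategory": "Securities",
     "year": 2010, "amount_sum": 289},
    {"item.category": "Equity", "item.subcategory": "Capital Stock",
     "year": 2009, "amount_sum": 11491},
    {"item.category": "Equity", "item.subcategory": "Capital Stock",
     "year": 2010, "amount_sum": 11492},
]

columns, table = cross_table(
    records, ["item.category", "item.subcategory"], "year", "amount_sum")
for key, row in table.items():
    print(key, [row[c] for c in columns])
```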
Slicer
The HTTP OLAP Server



      ✂
Application




HTTP                         JSON
             Slicer



                   ∑




       Aggregation Browser
GET /model

GET /aggregate

GET /values

GET /report
 logical model       configuration   data




$ slicer serve slicer.ini
[server]
backend: sql
log_level: info

[model]
path: model.json
locales: en,sk

[workspace]
url: postgres://localhost/database
schema: datamart
fact_prefix: ft_
dimension_prefix: dm_
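The configuration uses plain INI syntax with ":" as the delimiter, so it can be inspected with Python's standard configparser. This is only a sketch of reading the file; how Slicer itself interprets each option is not shown here, and combining fact_prefix with a cube name into a table name is an assumption.

```python
import configparser

SLICER_INI = """
[server]
backend: sql
log_level: info

[model]
path: model.json
locales: en,sk

[workspace]
url: postgres://localhost/database
schema: datamart
fact_prefix: ft_
dimension_prefix: dm_
"""

config = configparser.ConfigParser()
config.read_string(SLICER_INI)

locales = config["model"]["locales"].split(",")
workspace = config["workspace"]

# Assumption: the physical fact table is the prefix followed by the cube name.
fact_table = workspace["fact_prefix"] + "contracts"

print(locales)
print(workspace["url"], workspace["schema"], fact_table)
```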



∑      amount




GET /aggregate
GET aggregate




{
    "cell": [],
    "drilldown": [],
    "summary": {
        "record_count": 62,
        "amount_sum": 1116860
    }
}
∑         amount
✂




GET /aggregate?cut=date:2010
GET aggregate?cut=year:2010




{
    "cell": [
        {
            "path": ["2010"],
            "type": "point",
            "dimension": "year",
            "level_depth": 1
        }
    ],
    "drilldown": [],
    "summary": {
        "record_count": 31,
        "amount_sum": 566020
    }
}
GET aggregate?drilldown=year



{
     "cell": [],
     "total_cell_count": 2,
     "drilldown": [
         {
             "record_count": 31,
             "amount_sum": 550840,
             "year": 2009
         },
         {
             "record_count": 31,
             "amount_sum": 566020,
             "year": 2010
         }
     ],
     "summary": {
         "record_count": 62,
         "amount_sum": 1116860
     }
}
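A client for these requests needs little more than the standard library. The sketch below builds /aggregate URLs and checks that the drill-down rows in the sample reply add up to its summary. The base URL is an assumption (use wherever `slicer serve` listens), and `fetch_aggregate` is defined but not called because it needs a running server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # assumption: address of the Slicer server


def aggregate_url(cut=None, drilldown=None, base=BASE_URL):
    """Build a GET /aggregate URL with optional cut and drilldown parameters."""
    params = []
    if cut is not None:
        params.append(("cut", cut))
    if drilldown is not None:
        params.append(("drilldown", drilldown))
    query = urlencode(params, safe=":")  # keep "date:2010" readable
    return base + "/aggregate" + ("?" + query if query else "")


def fetch_aggregate(**kwargs):
    """Call the server and decode its JSON reply (needs a running Slicer)."""
    with urlopen(aggregate_url(**kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))


def drilldown_matches_summary(result, key="amount_sum"):
    """Check that the drill-down rows sum to the summary value."""
    return sum(row[key] for row in result["drilldown"]) == result["summary"][key]


# Sample reply copied from the drilldown=year slide.
sample = json.loads("""
{
    "cell": [],
    "total_cell_count": 2,
    "drilldown": [
        {"record_count": 31, "amount_sum": 550840, "year": 2009},
        {"record_count": 31, "amount_sum": 566020, "year": 2010}
    ],
    "summary": {"record_count": 62, "amount_sum": 1116860}
}
""")

print(aggregate_url(cut="date:2010"))
print(drilldown_matches_summary(sample))
```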
GET report

Content-Type: application/json

"cell": list of cuts
"queries": list of named queries

{
    "cell": [
        {
            "dimension": "date",
            "type": "range",
            "from": [2009],
            "to": [2011,6]
        }
    ],
    "queries": {
        "by_segment": {
            "query": "aggregate",
            "drilldown": ["segment"]
        },
        "by_year": {
            "query": "aggregate",
            "drilldown": {"date":"year"}
        }
    }
}
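The report body can be assembled programmatically. This sketch only builds the request object and does not send it; the URL is an assumption, the GET method follows the slide, and `report_request` is a hypothetical helper.

```python
import json
from urllib.request import Request


def report_request(cuts, queries, url="http://localhost:5000/report"):
    """Build (but do not send) a report request with a JSON body."""
    body = json.dumps({"cell": cuts, "queries": queries}).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="GET")  # method as shown on the slide


cuts = [{"dimension": "date", "type": "range",
         "from": [2009], "to": [2011, 6]}]
queries = {
    "by_segment": {"query": "aggregate", "drilldown": ["segment"]},
    "by_year": {"query": "aggregate", "drilldown": {"date": "year"}},
}

request = report_request(cuts, queries)
print(request.get_method(), request.full_url)
print(json.loads(request.data)["queries"].keys())
```

All named queries share the same cell, so one request returns several aggregations of the same context.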
SQL Backend
 What data does it work with?
★   or   ❄

★   dimensions, fact table
❄   dimensions, fact table
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
Aggregation Browser


                     Browsing Context


               Snowflake            Denormalized
                             or
                Mapper               Mapper



denormalized view




snowflake
           ❄
logical




              physical
          ❄
SQL Features
■ does not require DB write access
■ denormalisation
 ■   denormalised browsing, indexing


■ simple date datatype dimension
 ■   extraction of date parts during mapping


■ multiple schema support
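Denormalisation and date-part extraction can be illustrated with an in-memory SQLite database. This is a sketch of the idea, not the SQL the Cubes backend generates: table and column names are made up (using the ft_ and dm_ prefixes from the configuration), and creating the view is the one step that does need write access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dm_category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ft_contracts (
    id INTEGER PRIMARY KEY,
    category_id INTEGER REFERENCES dm_category(id),
    contract_date TEXT,
    amount INTEGER
);
INSERT INTO dm_category VALUES (1, 'construction'), (2, 'services');
INSERT INTO ft_contracts VALUES
    (1, 1, '2010-04-12', 100),
    (2, 1, '2010-05-01', 50),
    (3, 2, '2009-11-30', 70);

-- One wide row per fact: dimension attributes joined in,
-- date parts extracted from the plain date column.
CREATE VIEW contracts_denormalized AS
SELECT f.id,
       f.amount,
       c.name AS category,
       CAST(strftime('%Y', f.contract_date) AS INTEGER) AS date_year,
       CAST(strftime('%m', f.contract_date) AS INTEGER) AS date_month
FROM ft_contracts AS f
JOIN dm_category AS c ON c.id = f.category_id;
""")

# Aggregation with a drill-down by year now needs no joins.
rows = conn.execute("""
    SELECT date_year, COUNT(*) AS record_count, SUM(amount) AS amount_sum
    FROM contracts_denormalized
    GROUP BY date_year
    ORDER BY date_year
""").fetchall()
print(rows)  # [(2009, 1, 70), (2010, 2, 150)]
```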
Slicer
command-line tool
■ model validation
  slicer model validate model.json



■ model translation
  slicer model translate model.json translation.json



■ workspace testing
  slicer test config.ini



■ denormalization
  slicer denormalize --materialize --index config.ini
Future
■ formatters for visualisation libraries
■ JavaScript library*             help needed

■ backends
■ derived measures


                        *http://github.com/Stiivi/cubes-js
Open Data

■ shared repository of models
■ shared repository of dimensions
■ public cubes
   open Slicer HTTP APIs




                           http://github.com/Stiivi/cubes/wiki
stay light
 Nutrition Facts
 Serving Size 1 cube

 Amount Per Serving
                       % Daily Value
 Total Fat 0g                    0%

   Saturated Fat 0g
   Trans Fat 0g
Thank You
              source:
    github.com/Stiivi/cubes
           documentation:
  packages.python.org/cubes/
             examples:
github.com/Stiivi/cubes-examples
Backup
Transactions                     Reporting

object–relational modelling      multidimensional modelling
ORM mapping                      logical model (and mapping)
database connection              browser
database engine                  workspace
Limitations

■ one cut per dimension in a cell
 ■   logical conjunction of cuts (cut1 AND cut2 AND cut3 ...)


■ dimension-only selection
■ one (default) hierarchy
 ■   some internals are ready for multiple

Editor's Notes

  • #2: OLAP and Logical Model, Architecture, Slicing and Dicing, HTTP Server, SQL Backend
  • #5: Q: Who is familiar with OLAP?
  • #6: quick setup and reporting; does not cover everything (intentionally)
  • #7: example application: public procurements of Slovakia
  • #8: quick setup and reporting; does not cover everything (intentionally)
  • #9: will talk about modelling first, then reporting, then a mix of both
  • #10: what does it look like and what does it do?
  • #11: FIXME: add slicer tool here
  • #12: not going into details, just aligning terminology and defining context
  • #13: it is not rare to see reports created directly from what is available, instead of starting with business needs and trying to find a way to derive them from what is available
  • #14: different approach to data use, different needs; in apps you focus on transactions (transactional data, OLTP), in reporting you focus on analysis (analytical data); logically separate (does not have to be physically separate)
  • #18: CONTEXT: where did the sale happen? who signed the contract?
         FILTER: how much was spent on construction work?
         AGGREGATION SCOPE: what was the revenue by country?
         also used for ordering or sorting; define master-detail relationships
  • #21: provides metadata to easily create apps
  • #28: what does the browser do?
  • #29: aggregating measures
  • #31: the aggregation browser has to have a concrete backend implementation
  • #32: + bunch of other stuff
  • #33: context
  • #34: before I talk about the aggregation browser, I have to introduce a cell
  • #37: our filter/selection defines the cell; this is a kind of multidimensional “breadcrumbs”
  • #38: path: taken from file system terminology for easier understanding; those are keys; note that what is displayed is the level label, not a key
  • #39: ... let’s put it into a picture
  • #41: the “aggregation result” was designed after the usual look of a report
  • #42: FIXME: add picture
  • #43: you can specify multiple dimensions and an explicit level to be drilled down (for example the “month” level of a date dimension)
  • #44: it provides a list of records, which are represented as dictionaries; you have to find out which one is the level attribute or the key
  • #45: no need to find the context of the dimension of interest; if not sufficient, one can still fall back to the manual method
  • #47: facts: get details
         values: can be used to create selection boxes; a level can also be specified
         cell_details: used for creating the multidimensional breadcrumbs mentioned before; it contains data to describe the current context of interest in human-readable form
         ordering and pagination are supported
  • #48: what was that “cell” thing?
  • #50: also show hierarchy
  • #53: same drilldown, different cell
  • #54: implicit: raises an error if the current level is the last one; example: you are exploring year 2010 (cell) and would like to see split by year (higher level)
  • #63: just to name a few...
  • #73: powered by sqlalchemy
  • #74: powered by great abstraction framework; construction of SQL statements
  • #78: denormalized
  • #79: thanks to the new browser and browsing context it is possible to transparently switch between the original snowflake and the generated denormalized view (which can be materialized and indexed based on dimension level keys)
  • #80: in which table and which column is the attribute?
  • #85: anyone who would like to contribute their skills is more than welcome, and I will help
  • #86: so if you have an OS app, such as a Django one, that many users use, you can publish a reporting model for others; put your cube in the Wiki
  • #88: MIT license