INTRODUCTION TO MATRIX FACTORIZATION METHODS
COLLABORATIVE FILTERING
USER RATINGS PREDICTION

Alex Lin
Senior Architect
Intelligent Mining
Outline
- Factor analysis
- Matrix decomposition
- Matrix Factorization Model
- Minimizing Cost Function
- Common Implementation
Factor Analysis
- A procedure that helps identify the factors that might be used to explain the interrelationships among the variables
- Model-based approach
Refresher: Matrix Decomposition
- r (5 x 6 matrix) = q (5 x 3 matrix) x p (3 x 6 matrix)
- [Figure: a 5 x 6 matrix with entries X11 ... X56 decomposed into two factor matrices; row 3 of q is the factor vector (a, b, c) and the matching column of p is the factor vector (x, y, z)]
- X32 = (a, b, c) . (x, y, z) = a * x + b * y + c * z
- Rating Prediction: r̂_ui = q_i^T p_u
  - q_i : Movie Preference Factor Vector
  - p_u : User Preference Factor Vector
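As a quick illustration of the dot-product prediction above, here is a minimal NumPy sketch; the factor values are made up for illustration, not taken from the slide's matrix.

```python
import numpy as np

# Hypothetical 3-dimensional factor vectors for one movie and one user
q_i = np.array([0.8, -0.3, 1.1])   # movie preference factor vector (a, b, c)
p_u = np.array([1.2,  0.5, 0.7])   # user preference factor vector (x, y, z)

# Predicted rating r̂_ui = q_i^T p_u
r_hat = q_i @ p_u                  # = 0.8*1.2 + (-0.3)*0.5 + 1.1*0.7
print(r_hat)
```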
Making Prediction as Filling Missing Value
- [Figure: an m x n item-user rating matrix (users 1..n across the columns, items 1..m down the rows); only a few cells hold known ratings such as 5, 4, 3, most cells are empty, and "?" marks the missing ratings to be predicted]
- Rating Prediction: r̂_ui = q_i^T p_u
  - q_i : Movie Preference Factor Vector
  - p_u : User Preference Factor Vector
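Because most cells are missing, the rating matrix is normally held in a sparse structure rather than a dense array. A small sketch using SciPy; the triples below are illustrative, not the slide's exact matrix.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Known ratings only, stored as (item, user, rating) triples; missing cells stay missing
item_idx = np.array([0, 0, 1, 2, 2])
user_idx = np.array([0, 2, 6, 0, 3])
ratings  = np.array([5, 4, 3, 3, 5], dtype=float)

R = coo_matrix((ratings, (item_idx, user_idx)), shape=(4, 10))  # 4 items x 10 users
print(R.nnz, "known ratings out of", R.shape[0] * R.shape[1], "cells")
```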
Learn Factor Vectors
- [Figure: the same sparse item-user rating matrix; each known rating contributes one equation in the unknown user and item factor values]
- 4 = U3-1 * I1-1 + U3-2 * I1-2 + U3-3 * I1-3 + U3-4 * I1-4
- 3 = U7-1 * I2-1 + U7-2 * I2-2 + U7-3 * I2-3 + U7-4 * I2-4
- ...
- 3 = U86-1 * I12-1 + U86-2 * I12-2 + U86-3 * I12-3 + U86-4 * I12-4
- Note: only train on known entries
- Analogy with a small linear system (exactly determined vs. over-determined):

      2X + 3Y = 5          2X + 3Y = 5
      4X - 2Y = 2          4X - 2Y = 2
                           3X - 2Y = 2
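The system on the right of this slide is over-determined (three equations, two unknowns), just as the known ratings over-constrain the factor values. As a sketch, NumPy can solve it in the least-squares sense:

```python
import numpy as np

# Over-determined system: 2X + 3Y = 5, 4X - 2Y = 2, 3X - 2Y = 2
A = np.array([[2.0,  3.0],
              [4.0, -2.0],
              [3.0, -2.0]])
b = np.array([5.0, 2.0, 2.0])

# Least-squares solution minimizes ||A·[X, Y] - b||², analogous to
# minimizing squared error on the known ratings
solution, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(solution)
```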
Why not use standard SVD?
- Standard SVD assumes all missing entries are zero. This leads to poor prediction accuracy, especially when the dataset is extremely sparse (98% - 99.9% missing).
- See Appendix for SVD
- Some published literature refers to this kind of Matrix Factorization as SVD, but note that it is NOT the same as the classical low-rank SVD produced by svdlibc.
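A toy illustration (made-up numbers) of why imputing zeros is harmful: at 96% sparsity the zeros dominate whatever structure the real ratings have.

```python
import numpy as np

known = np.array([5.0, 4.0, 3.0, 4.0])               # the ratings we actually observed
with_zeros = np.concatenate([known, np.zeros(96)])   # 96 missing cells treated as 0

print(known.mean())       # ~4.0 - the signal we care about
print(with_zeros.mean())  # ~0.16 - dominated by the artificial zeros
```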
How to Learn Factor Vectors
- How do we learn the preference factor vectors (a, b, c) and (x, y, z)?
- Minimize errors on the known ratings:

      min_{q*,p*} Σ_{(u,i)∈K} (r_ui − x_ui)²

  Minimizing Cost Function (Least Squares Problem), to learn the factor vectors p_u and q_i
  - r_ui : actual rating for user u on item i
  - x_ui : predicted rating for user u on item i
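A minimal sketch of this cost function in Python; the (u, i) -> rating dictionary and the factor arrays P, Q are assumed data structures for illustration.

```python
import numpy as np

def squared_error_cost(known_ratings, P, Q):
    """Sum of squared errors over the known ratings only.

    known_ratings : dict mapping (u, i) -> actual rating r_ui
    P             : user factor vectors, shape (num_users, f)
    Q             : item factor vectors, shape (num_items, f)
    """
    cost = 0.0
    for (u, i), r_ui in known_ratings.items():
        x_ui = Q[i] @ P[u]            # predicted rating
        cost += (r_ui - x_ui) ** 2    # error on a known entry only
    return cost
```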
Data Normalization
- Remove the global mean
- [Figure: the same item-user matrix after subtracting the global mean; the known ratings become residuals such as 1.5, -.9, -.2, .49, .79, 0.6, .46, -.4, .39, .82, .76, .69, .52, .8]
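Taking out the global mean is just subtracting the training-set average from every known rating; a small sketch with illustrative values:

```python
import numpy as np

# (u, i) -> known rating
ratings = {(0, 0): 5.0, (2, 0): 4.0, (0, 2): 3.0, (3, 2): 5.0}

mu = np.mean(list(ratings.values()))              # global mean over known ratings only
residuals = {ui: r - mu for ui, r in ratings.items()}
print(mu, residuals)
```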
Factorization Model
- Only preference factors:

      min_{q*,p*} Σ_{(u,i)∈K} (r_ui − µ − q_i^T p_u)²

  To learn the factor vectors (p_u and q_i)
- Example: a rating of 4 is modeled as Global Mean + Preference Factor
- Notation:
  - r_ui : actual rating of user u on item i
  - µ : training rating average (global mean)
  - b_u : user u bias
  - b_i : item i bias
  - q_i : latent factor vector of item i
  - p_u : latent factor vector of user u
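With only preference factors, the predicted rating is the global mean plus the factor dot product. A one-line sketch (q_i and p_u are assumed NumPy vectors):

```python
def predict(mu, p_u, q_i):
    # r̂_ui = µ + q_i^T p_u  (global mean plus preference-factor term)
    return mu + q_i @ p_u
```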
Adding Item Bias and User Bias
- Add item bias and user bias as parameters:

      min_{q*,p*} Σ_{(u,i)∈K} (r_ui − µ − b_i − b_u − q_i^T p_u)²

  To learn the item bias and user bias
- Example: a rating of 4 is modeled as Global Mean + Item Bias + User Bias + Preference Factor
- Notation:
  - r_ui : actual rating of user u on item i
  - µ : training rating average (global mean)
  - b_u : user u bias
  - b_i : item i bias
  - q_i : latent factor vector of item i
  - p_u : latent factor vector of user u
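Adding the two bias terms, the prediction becomes global mean + item bias + user bias + preference-factor term; a sketch extending the previous one:

```python
def predict_with_bias(mu, b_u, b_i, p_u, q_i):
    # r̂_ui = µ + b_i + b_u + q_i^T p_u
    return mu + b_i + b_u + q_i @ p_u
```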
Regularization
- To prevent model overfitting:

      min_{q*,p*} Σ_{(u,i)∈K} (r_ui − µ − b_i − b_u − q_i^T p_u)² + λ (‖q_i‖² + ‖p_u‖² + b_i² + b_u²)

  The λ term is the regularization that prevents overfitting
- Example: a rating of 4 is modeled as Global Mean + Item Bias + User Bias + Preference Factor
- Notation:
  - r_ui : actual rating of user u on item i
  - µ : training rating average (global mean)
  - b_u : user u bias
  - b_i : item i bias
  - q_i : latent factor vector of item i
  - p_u : latent factor vector of user u
  - λ : regularization parameter
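The regularized objective as a Python sketch, extending the earlier cost function; `lam` stands for the λ hyperparameter and the bias arrays are assumed structures for illustration.

```python
import numpy as np

def regularized_cost(known_ratings, mu, b_user, b_item, P, Q, lam):
    """Squared error on known ratings plus an L2 penalty on every learned parameter."""
    cost = 0.0
    for (u, i), r_ui in known_ratings.items():
        err = r_ui - (mu + b_item[i] + b_user[u] + Q[i] @ P[u])
        cost += err ** 2
        cost += lam * (Q[i] @ Q[i] + P[u] @ P[u] + b_item[i] ** 2 + b_user[u] ** 2)
    return cost
```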
Optimize Factor Vectors
- Find the optimal factor vectors by minimizing the cost function
- Algorithms:
  - Stochastic gradient descent
  - Others: Alternating least squares, etc.
- Most frequently used:
  - Stochastic gradient descent
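A minimal sketch of one stochastic-gradient-descent pass for this model, using the standard update rules for the regularized objective; γ (learning rate) and λ are assumed hyperparameters.

```python
def sgd_epoch(known_ratings, mu, b_user, b_item, P, Q, gamma=0.005, lam=0.02):
    """One pass over the known ratings, updating all parameters in place."""
    for (u, i), r_ui in known_ratings.items():
        err = r_ui - (mu + b_item[i] + b_user[u] + Q[i] @ P[u])  # prediction error
        b_user[u] += gamma * (err - lam * b_user[u])
        b_item[i] += gamma * (err - lam * b_item[i])
        p_old = P[u].copy()
        P[u] += gamma * (err * Q[i] - lam * P[u])
        Q[i] += gamma * (err * p_old - lam * Q[i])
```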
Matrix Factorization Tuning
- Number of factors in the preference vectors
- Learning rate of gradient descent
  - The best results usually come from using a different learning rate for each parameter, especially for the user/item bias terms (a sketch follows this list)
- Parameters in the factorization model
  - Time-dependent parameters
  - Seasonality-dependent parameters
- Many other considerations!
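One way to keep these tuning knobs together is a small configuration block; the values below are purely illustrative, and good settings depend entirely on the dataset.

```python
config = {
    "num_factors": 50,      # size of the preference factor vectors
    "lr_factors": 0.005,    # learning rate for p_u and q_i
    "lr_bias": 0.010,       # separate (often larger) rate for b_u and b_i
    "reg_lambda": 0.02,     # regularization strength λ
    "num_epochs": 30,
}
```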
High-Level Implementation Steps
- Construct the User-Item Matrix (use a sparse data structure!)
- Define the factorization model - the cost function
- Take out the global mean
- Decide which parameters go into the model (bias, preference factors, anything else? SVD++)
- Minimize the cost function - model fitting
  - Stochastic gradient descent
  - Alternating least squares
- Assemble the predictions
- Evaluate the predictions (RMSE, MAE, etc.)
- Continue to tune the model (an end-to-end sketch follows this list)
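A minimal end-to-end sketch that strings these steps together; it reuses the hypothetical sgd_epoch from the earlier sketch and is illustration only, not a production pipeline.

```python
import numpy as np

def fit_and_evaluate(train, test, num_users, num_items, f=50, epochs=30):
    """train/test: dicts mapping (u, i) -> rating. Returns RMSE on the test set."""
    mu = np.mean(list(train.values()))              # take out the global mean
    b_user, b_item = np.zeros(num_users), np.zeros(num_items)
    P = np.random.normal(0, 0.1, (num_users, f))    # user factor vectors
    Q = np.random.normal(0, 0.1, (num_items, f))    # item factor vectors

    for _ in range(epochs):                          # model fitting
        sgd_epoch(train, mu, b_user, b_item, P, Q)

    # Assemble and evaluate predictions on held-out ratings
    sq_err = [(r - (mu + b_item[i] + b_user[u] + Q[i] @ P[u])) ** 2
              for (u, i), r in test.items()]
    return np.sqrt(np.mean(sq_err))
```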
Thank you
- Any questions or comments?
Appendix
- Stochastic Gradient Descent
- Batch Gradient Descent
- Singular Value Decomposition (SVD)
Stochastic Gradient Descent

    Repeat until convergence {
        for i = 1 to m in random order {
            θ_j := θ_j + α (y^(i) − h_θ(x^(i))) x_j^(i)    (for every j)
        }
    }

  (y^(i) − h_θ(x^(i))) x_j^(i) is the partial derivative term

  Your code here:
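One possible way to fill in the "Your code here" box, assuming a linear hypothesis h_θ(x) = θ^T x; this is a sketch, not the author's original code.

```python
import numpy as np

def stochastic_gradient_descent(X, y, alpha=0.01, epochs=100):
    """X: (m, n) feature matrix, y: (m,) targets. Returns the learned θ."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):                  # "repeat until convergence" (fixed epochs here)
        for i in np.random.permutation(m):   # for i = 1..m in random order
            error = y[i] - X[i] @ theta      # y^(i) - h_θ(x^(i))
            theta += alpha * error * X[i]    # update every θ_j at once
    return theta
```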
Batch Gradient Descent

    Repeat until convergence {
        θ_j := θ_j + α Σ_{i=1..m} (y^(i) − h_θ(x^(i))) x_j^(i)    (for every j)
    }

  The sum over i = 1..m of (y^(i) − h_θ(x^(i))) x_j^(i) is the partial derivative term

  Your code here:
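And a matching sketch for the batch version, again assuming a linear hypothesis; the whole gradient is summed over all m examples before each step.

```python
import numpy as np

def batch_gradient_descent(X, y, alpha=0.01, epochs=100):
    """X: (m, n) feature matrix, y: (m,) targets. Returns the learned θ."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):
        errors = y - X @ theta            # y^(i) - h_θ(x^(i)) for all i
        theta += alpha * (X.T @ errors)   # Σ_i error_i * x_j^(i), for every j
    return theta
```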
Singular Value Decomposition (SVD)

    A = U × S × V^T

- A : m x n matrix (items x users)
- U : m x r matrix
- S : r x r matrix
- V^T : r x n matrix
- Low-rank approximation: keep only the top k singular values (rank = k, k < r)

    A_k = U_k × S_k × V_k^T
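A short NumPy sketch of the decomposition and the rank-k truncation; the matrix is random, for illustration only.

```python
import numpy as np

A = np.random.rand(6, 5)                        # m x n matrix (items x users)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                           # keep the top k singular values (k < r)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]     # A_k = U_k × S_k × V_k^T

print(np.linalg.norm(A - A_k))                  # reconstruction error of the rank-k approx
```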
