Showing posts with label sketch. Show all posts

Monday, January 16, 2017

Thesis: Privacy-aware and Scalable Recommender Systems using Sketching Techniques by Raghavendran Balu


Congratulations Dr. Balu !

Privacy-aware and Scalable Recommender Systems using Sketching Techniques by Raghavendran Balu


In this thesis, we aim to study and evaluate the privacy and scalability properties of recommender systems using sketching techniques and propose scalable privacy preserving personalization mechanisms. Hence, the thesis is at the intersection of three different topics: recommender systems, differential privacy and sketching techniques. On the privacy aspects, we are interested in both new privacy preserving mechanisms and the evaluation of such mechanisms. We observe that the primary parameter ε in differential privacy is a control parameter, and we are motivated to find techniques that can assess the privacy guarantees. We are also interested in proposing new mechanisms that are privacy preserving and get along well with the evaluation metrics. On the scalability aspects, we aim to solve the challenges arising in user modeling and item retrieval. User modeling with evolving data poses difficulties in storage and in adapting to new data. Also, addressing the retrieval aspects finds applications in various domains other than recommender systems. We evaluate the impact of our contributions through extensive experiments conducted on benchmark real datasets, and from the results we surmise that our contributions address the privacy and scalability challenges well.






Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Wednesday, December 28, 2016

DataSketches : Sketches Library from Yahoo! - implementation -

 
 
While talking to Ravi, I realized I had not mentioned the following library before. Edo was behind the release of the Sketches Library from Yahoo! as open source. This is a Java software library of stochastic streaming algorithms.

From the DataSketches page:

The Business Challenge: Analyzing Big Data Quickly.
In the analysis of big data there are often problem queries that don’t scale because they require huge compute resources and time to generate exact results. Examples include count distinct, quantiles, most frequent items, joins, matrix computations, and graph analysis.
If approximate results are acceptable, there is a class of specialized algorithms, called streaming algorithms, or sketches, that can produce results orders of magnitude faster and with mathematically proven error bounds. For interactive queries there may not be other viable alternatives, and in the case of real-time analysis, sketches are the only known solution.
For any system that needs to extract useful information from big data these sketches are a required toolkit that should be tightly integrated into their analysis capabilities. This technology has helped Yahoo successfully reduce data processing times from days to hours or minutes on a number of its internal platforms.
This site is dedicated to providing key sketch algorithms of production quality. Contributions are welcome from those in the big data community interested in further development of this science and art.
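
To make the "mathematically proven error bounds" claim concrete for the "most frequent items" case, here is a minimal Python sketch of the classic Misra-Gries summary. It is an illustration only and is not the DataSketches implementation or its API; the function name `misra_gries`, the parameter `k` and the toy stream are assumptions made up for this example. With at most k - 1 counters, every estimated count f' satisfies f - n/k <= f' <= f, where f is the true count and n is the stream length.

```python
import random
from collections import Counter


def misra_gries(stream, k):
    """Classic Misra-Gries summary using at most k - 1 counters (k >= 2).

    Hypothetical helper for illustration, not a DataSketches call.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            # The incoming item is discarded as part of the same step.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Toy stream: two heavy hitters hidden among 200 one-off items.
stream = ["a"] * 500 + ["b"] * 300 + [f"rare-{i}" for i in range(200)]
random.Random(0).shuffle(stream)

k = 10
summary = misra_gries(stream, k)
exact = Counter(stream)
bound = len(stream) / k  # maximum undercount guaranteed by the analysis

for item in ("a", "b"):
    print(item, "estimate:", summary.get(item, 0), "true:", exact[item])
print("largest possible undercount:", bound)
```

The summary never overcounts, and any item occurring more than n/k times is guaranteed to survive in it, which is why a few counters are enough to find the heavy hitters.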
 

In particular in the analytics section:

Built-in Theta Sketch set operators (Union, Intersection, Difference) produce sketches as a result (and not just a number) enabling full set expressions of cardinality, such as ((A ∪ B) ∩ (C ∪ D)) \ (E ∪ F). This capability along with predictable and superior accuracy (compared with Include/Exclude approaches) enables unprecedented analysis capabilities for fast queries.
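
To see why set operators that return sketches compose into full expressions, here is a small Python toy in the spirit of a theta sketch. It is a sketch under stated assumptions, not the DataSketches Java API: the class `ThetaSketch`, the helpers `_hash01` and `sketch_of`, and the nominal size `k` are all names invented for this illustration. Each sketch keeps every hash value below a threshold theta, and the estimate is the number of retained hashes divided by theta.

```python
import hashlib


def _hash01(item):
    """Map an item to a pseudo-uniform float in [0, 1) using 53 hash bits."""
    digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
    return (int.from_bytes(digest, "big") >> 11) / 2**53


class ThetaSketch:
    """Toy theta-style sketch (hypothetical, for illustration only)."""

    def __init__(self, k=1024, theta=1.0, entries=()):
        self.k = k                  # nominal number of retained hashes
        self.theta = theta          # every retained hash is < theta
        self.entries = set(entries)

    def update(self, item):
        h = _hash01(item)
        if h < self.theta:
            self.entries.add(h)
            self._trim()

    def _trim(self):
        # Lower theta until at most k hashes remain below it.
        while len(self.entries) > self.k:
            self.theta = max(self.entries)
            self.entries.discard(self.theta)

    def estimate(self):
        return len(self.entries) / self.theta

    def _combine(self, other, op):
        # Both inputs are complete below the smaller theta, so the set
        # operation on retained hashes is exact below that threshold.
        theta = min(self.theta, other.theta)
        kept = {h for h in op(self.entries, other.entries) if h < theta}
        result = ThetaSketch(min(self.k, other.k), theta, kept)
        result._trim()
        return result

    def union(self, other):
        return self._combine(other, set.union)

    def intersection(self, other):
        return self._combine(other, set.intersection)

    def difference(self, other):
        return self._combine(other, set.difference)


def sketch_of(items, k=1024):
    s = ThetaSketch(k)
    for item in items:
        s.update(item)
    return s


# ((A ∪ B) ∩ (C ∪ D)) \ (E ∪ F) on small integer ranges.
a, b = sketch_of(range(0, 100)), sketch_of(range(50, 150))
c, d = sketch_of(range(100, 200)), sketch_of(range(120, 220))
e, f = sketch_of(range(0, 110)), sketch_of(range(140, 145))
result = a.union(b).intersection(c.union(d)).difference(e.union(f))
print(result.estimate())
```

With these small inputs every sketch holds fewer than k hashes, so theta stays at 1.0 and the result is the exact cardinality, 35. Once a stream has more than k distinct items, theta drops below 1 and the same expression returns an estimate instead.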
 
This last paragraph echoes the presentation by Sam Bessalah on Stream Mining via Abstract Algebra (ppt version) at the last meetup of season 1 of the Paris Machine Learning meetup (Europe Wide Machine Learning Meetup) with Andrew Ng. Sam's abstract was (I distinctly recall about 200 people listening studiously to monoids after two and a half hours of the other presentation):
 
A quick introduction into some common algorithms and data structures to handle data in streaming fashion, like bloom filters, hyperloglog or min hashes. Then in a second part how abstract algebra with monoids, groups or semi groups help us reason and build scalable analytical systems beyond stream processing.
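
As a rough illustration of the monoid idea in that abstract (my own toy, not code from the talk), a Bloom filter forms a monoid under bitwise OR: the empty filter is the identity and merging is associative, so per-shard filters can be combined in any grouping. The class `BloomFilter`, its parameters `m` and `k`, and the shard data below are assumptions made up for this example.

```python
import hashlib
from functools import reduce


class BloomFilter:
    """Toy Bloom filter over an m-bit integer with k hash functions."""

    def __init__(self, m=1024, k=3, bits=0):
        self.m, self.k, self.bits = m, k, bits

    def _positions(self, item):
        # Derive k bit positions by prefixing the item with a seed.
        for seed in range(self.k):
            data = f"{seed}:{item}".encode("utf-8")
            digest = hashlib.blake2b(data, digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))

    def merge(self, other):
        """Monoid operation: bitwise OR of two compatible filters."""
        if (self.m, self.k) != (other.m, other.k):
            raise ValueError("filters must share m and k")
        return BloomFilter(self.m, self.k, self.bits | other.bits)


def build(items):
    bf = BloomFilter()
    for item in items:
        bf.add(item)
    return bf


# One filter per shard, then fold them together starting from the identity.
shards = [["alice", "bob"], ["carol"], ["dave", "erin"]]
filters = [build(shard) for shard in shards]
merged = reduce(BloomFilter.merge, filters, BloomFilter())

print(all(name in merged for shard in shards for name in shard))
```

Because the merge is associative and has an identity, the fold can be split across machines or time windows and recombined later, which is the property that makes such structures convenient in streaming systems.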
 
 
 
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Saturday, August 27, 2016

Saturday Morning Videos: ICML 2016 Plenary, Tutorials and more....

  
 
The videos for ICML 2016 are out ! They are all here: http://techtalks.tv/icml/2016/orals/
Thank you to the organizing committee for making these videos available. Here are the plenary and tutorial talks:

Plenary

 
 Tutorials


 
 
 
 
 
 
 
