dummy log generation
using poisson sampling
kyle (kwanghee choi)
problem definition
- goal: simulate fake logs that follow a given log count per hour
- data to fit: thumbor.buzzni.com
log modeling
- count of logs per hour
== frequency of logs appearing in a fixed interval of time
log modeling: poisson distribution
- wikipedia: poisson distribution expresses the probability of a given number
of events occurring in a fixed interval of time or space if these events
occur with a known constant rate λ and independently of the time since the
last event.
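- sketch (not from the original slides): the probability mass function behind the definition above, P(N = k) = e^(-λ) λ^k / k!, written out in python. the function name `poisson_pmf` and the rate of 3 logs per hour are assumptions made for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for a poisson-distributed count N with rate lam per interval."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Hypothetical example: an average of 3 logs per hour.
lam = 3.0
# Probability of seeing exactly 0, 1, ..., 5 logs in one hour.
probabilities = [poisson_pmf(k, lam) for k in range(6)]
```

- the probabilities over all k sum to 1, and the expected count per interval equals λ.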
log modeling: poisson process
- wikipedia: poisson point process is a type of random mathematical object
that consists of points randomly located on a mathematical space.
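- sketch (a different construction from the one used in these slides): on a time axis, a poisson process with constant rate can also be simulated by adding up independent exponential gaps between events. the name `sample_arrival_times` and the example rate are assumptions made for illustration.

```python
import random


def sample_arrival_times(rate, min_t, max_t, rng=random):
    """Arrival times of a homogeneous poisson process on [min_t, max_t).

    rate is the expected number of events per unit of time.
    """
    times = []
    t = min_t + rng.expovariate(rate)  # gap before the first event
    while t < max_t:
        times.append(t)
        t += rng.expovariate(rate)  # independent exponential gap to the next event
    return times


rng = random.Random(0)  # seeded only to make the sketch reproducible
arrivals = sample_arrival_times(rate=2.0, min_t=0.0, max_t=10.0, rng=rng)
```

- the number of arrivals in the window is then poisson distributed with mean rate × (max_t - min_t), which ties the point process back to the distribution above.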
implementation: homogeneous case
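import random

def get_points_homogeneous(min_t, max_t, occurrence):
    # draw `occurrence` integer times uniformly from [min_t, max_t] (both inclusive)
    points = []
    for _ in range(occurrence):
        points.append(random.randint(min_t, max_t))
    # emit the times in chronological order
    points.sort()
    for point in points:
        yield point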
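- usage sketch: turning the generated times into dummy log lines for one hour. the log line format, the helper `to_log_line`, and the chosen start time are hypothetical; the slides do not show the real thumbor log layout.

```python
import random
from datetime import datetime, timezone


def get_points_homogeneous(min_t, max_t, occurrence):
    # Same idea as the slide: `occurrence` uniform integer times, in order.
    yield from sorted(random.randint(min_t, max_t) for _ in range(occurrence))


def to_log_line(epoch_seconds):
    # Hypothetical log format, used only for illustration.
    stamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()
    return f"{stamp} GET /dummy 200"


start = 1551052800      # 2019-02-25 00:00:00 UTC, chosen for illustration
end = start + 3600 - 1  # last second of that hour (randint is inclusive)
lines = [to_log_line(t) for t in get_points_homogeneous(start, end, occurrence=5)]
```

- here `occurrence` is the log count for the hour, so exactly that many lines come out, ordered by time.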
log modeling
- not a constant rate λ, but a function of time λ(t)
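- sketch: one way to represent λ(t) is a step function built from log counts per hour. the class `HourlyOccurrence` is hypothetical; its method names `get` and `get_max_bound` mirror the calls made on `occurrence` in the nonhomogeneous implementation, whose real class is not shown in the slides.

```python
class HourlyOccurrence:
    """Hypothetical λ(t): a step function built from log counts per hour."""

    def __init__(self, counts_per_hour, start_t=0, seconds_per_hour=3600):
        self.counts = list(counts_per_hour)
        self.start_t = start_t
        self.seconds_per_hour = seconds_per_hour

    def _hour_index(self, t):
        index = (t - self.start_t) // self.seconds_per_hour
        # Clamp so times at the edges still map to a defined hour.
        return max(0, min(len(self.counts) - 1, int(index)))

    def get(self, t):
        """λ(t): the log count of the hour that contains time t."""
        return self.counts[self._hour_index(t)]

    def get_max_bound(self, min_t, max_t):
        """Integer upper bound of λ(t) over [min_t, max_t]."""
        first, last = self._hour_index(min_t), self._hour_index(max_t)
        return max(self.counts[first:last + 1])


# Hypothetical counts for three consecutive hours.
occurrence = HourlyOccurrence([10, 40, 5])
```

- a smoother λ(t), for example one that interpolates between hourly counts, would fit the same two-method interface.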
log modeling: inhomogeneous poisson process
- figure: λ(t) plotted against t under the maximum integer bound λmax; candidate points are marked keep or discard, with example discard probabilities of 65% and 7%
implementation: nonhomogeneous case
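import random

def get_points_nonhomogeneous(min_t, max_t, occurrence):
    # occurrence.get(t) is λ(t); get_max_bound is an integer bound λmax >= λ(t)
    points = []
    max_bound = occurrence.get_max_bound(min_t, max_t)
    # draw λmax candidate times uniformly, as in the homogeneous case
    for _ in range(max_bound):
        points.append(random.randint(min_t, max_t))
    points.sort()
    for point in points:
        # thinning: keep a candidate with probability λ(t) / λmax
        keep_probability = occurrence.get(point) / max_bound
        if keep_probability > random.random():
            yield point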
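- usage sketch: thinning over a single one-hour window with a hypothetical λ(t) that ramps up linearly. the class `RampOccurrence` and the peak of 1000 are assumptions made for illustration. because the bound doubles as the number of candidates, λ(t) is read here as a count per whole [min_t, max_t] window.

```python
import random


class RampOccurrence:
    """Hypothetical λ(t): rises linearly from 0 at start_t to peak at end_t."""

    def __init__(self, start_t, end_t, peak):
        self.start_t, self.end_t, self.peak = start_t, end_t, peak

    def get(self, t):
        return self.peak * (t - self.start_t) / (self.end_t - self.start_t)

    def get_max_bound(self, min_t, max_t):
        # Integer bound: λ(t) never exceeds peak on the window it was built for.
        return self.peak


def get_points_nonhomogeneous(min_t, max_t, occurrence):
    max_bound = occurrence.get_max_bound(min_t, max_t)
    candidates = sorted(random.randint(min_t, max_t) for _ in range(max_bound))
    for point in candidates:
        # Keep each candidate with probability λ(t) / λmax.
        if occurrence.get(point) / max_bound > random.random():
            yield point


random.seed(0)  # seeded only to make the sketch reproducible
start, end = 0, 3599  # a single one-hour window, in seconds
ramp = RampOccurrence(start, end, peak=1000)
points = list(get_points_nonhomogeneous(start, end, ramp))
first_half = sum(1 for p in points if p < 1800)
second_half = len(points) - first_half
```

- candidates late in the hour, where λ(t) is close to λmax, are rarely discarded, so `second_half` should come out clearly larger than `first_half`.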
results
- figures: target function and dummy log histogram
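- sketch: the comparison behind the results can be reproduced by bucketing generated timestamps per hour and subtracting the target counts. the timestamps and target counts below are made-up illustration values, not the data from the slides, and `hourly_histogram` is a hypothetical helper.

```python
from collections import Counter


def hourly_histogram(points, start_t=0, seconds_per_hour=3600):
    """Count generated timestamps per hour, indexed from start_t."""
    return Counter((p - start_t) // seconds_per_hour for p in points)


# Hypothetical generated timestamps (seconds) and target counts per hour.
points = [12, 950, 3601, 3700, 5000, 7100, 7300]
target = {0: 2, 1: 4, 2: 1}

histogram = hourly_histogram(points)
# Positive values mean more dummy logs than the target for that hour.
difference = {hour: histogram[hour] - target[hour] for hour in target}
```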
reference
- Chiu, S. N., Stoyan, D., Kendall, W. S., & Mecke, J. (2013). Stochastic
geometry and its applications (3rd ed.). The Atrium, Southern Gate,
Chichester, West Sussex, United Kingdom: John Wiley & Sons.
- Poisson distribution. (2019, February 16). Retrieved February 25, 2019,
from https://en.wikipedia.org/wiki/Poisson_distribution
- Poisson point process. (2019, February 20). Retrieved February 25, 2019,
from https://en.wikipedia.org/wiki/Poisson_point_process
