Stat405             Problem solving


                             Hadley Wickham
Monday, 13 September 2010
1. Projects
               2. Saving data
               3. Slot machine question
               4. Basic strategy
               5. Turning ideas into code



Projects

                   By Thursday, each group should send
                   Hadley an email with
                   1. the group name
                   2. the group members




Saving data

Recall

                   To load data,
                   slots <- read.csv("slots.csv")




Your turn

                   Guess the name of the function you might
                   use to write an R object back to a csv file
                   on disk. Use it to save slots to
                   slots-2.csv.
                   What happens if you now read in
                   slots-2.csv with read.csv?



write.csv(slots, "slots-2.csv")
     slots2 <- read.csv("slots-2.csv")

     head(slots)
     head(slots2)

     str(slots)
     str(slots2)

     # Better, but still loses factor levels
     write.csv(slots, file = "slots-3.csv", row.names = FALSE)
     slots3 <- read.csv("slots-3.csv")



Saving data

               # For long-term storage
               write.csv(slots, file = "slots.csv",
                 row.names = FALSE)

               # For short-term caching
               # Preserves factors etc.
               save(slots, file = "slots.rdata")
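
The difference between the two approaches can be seen in a small round trip. This is only a sketch: `toy` is a hypothetical stand-in for `slots`, and the files are written to temporary paths rather than to the course data files.

```r
# Hypothetical stand-in for slots: a factor whose levels are ordered by hand
toy <- data.frame(
  size = factor(c("small", "large", "small"), levels = c("small", "large")),
  prize = c(0, 5, 2)
)

csv_path <- tempfile(fileext = ".csv")
rdata_path <- tempfile(fileext = ".rdata")

# Long-term storage: plain text, so column types are re-guessed on reading
write.csv(toy, file = csv_path, row.names = FALSE)
from_csv <- read.csv(csv_path)

# Short-term caching: load() restores the object under its original name
save(toy, file = rdata_path)
cached <- new.env()
load(rdata_path, envir = cached)

# The hand-chosen level order does not survive the csv round trip
levels_kept <- identical(levels(from_csv$size), levels(toy$size))

# The .rdata copy is an exact copy
exact_copy <- identical(cached$toy, toy)
```

Loading into a separate environment (`cached`) is a choice made here so that the restored object does not overwrite `toy`; a plain `load(rdata_path)` would put it straight into the workspace.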




                      .csv                            .rdata

   Read with          read.csv()                      load()
   Write with         write.csv(row.names = FALSE)    save()
   Stores             Only data frames                Any R object
   Readable by        Any program                     Only R
   Best for           Long term storage               Short term caching of
                                                      expensive computations

Compression
                   Easy to store compressed files to save
                   space:
                   write.csv(slots,
                     file = bzfile("slots.csv.bz2"),
                     row.names = FALSE)
                   slots4 <- read.csv("slots.csv.bz2")
                   Files stored with save() are automatically
                   compressed.
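
A minimal sketch of the compressed round trip, assuming a small hypothetical data frame `toy` in place of `slots` and a temporary file path:

```r
# Hypothetical data standing in for slots
toy <- data.frame(id = 1:3, prize = c(0.5, 2, 10))

bz_path <- tempfile(fileext = ".csv.bz2")

# write.csv() accepts a connection; bzfile() compresses as it writes
write.csv(toy, file = bzfile(bz_path), row.names = FALSE)

# read.csv() detects the compression when given the file path
toy_back <- read.csv(bz_path)

# bzip2 files start with the bytes "BZh"
magic <- rawToChar(readBin(bz_path, what = "raw", n = 3))

round_trip_ok <- isTRUE(all.equal(toy, toy_back))
```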


Slot machine question

Slots
                   Casino claims that slot machines have
                   prize payout of 92%. Is this claim true?
                   mean(slots$prize)
                   t.test(slots$prize, mu = 0.92)
                   qplot(prize, data = slots, binwidth = 1)
                   Can we do better?


                   Doing that lots of times and calculating
                   the payoff each time should give us a better
                   idea of the payoff. But first we need some way
                   to calculate the prize from the windows.


                   Solution: Write a function



Strategy
                   1. Break complex tasks into smaller parts
                   2. Use words to describe how each part
                   should work
                   3. Translate words to R
                   4. When all parts work, combine into a
                   function (next class)



             Window 1    Window 2    Window 3    Prize

             DD          DD          DD            800
             7           7           7              80
             BBB         BBB         BBB            40
             BB          BB          BB             25
             B           B           B              10
             C           C           C              10
             Any bar     Any bar     Any bar         5
             C           C           *               5
             C           *           C               5
             C           *           *               2
             *           C           *               2
             *           *           C               2

             DD doubles any winning combination. Two DD
             quadruples. DD is wild.

             windows <- c("7", "C", "C")
             # How can we calculate the payoff?

Your turn


                   We can simplify this table into 3 basic
                   cases of prizes. What are they? Take 3
                   minutes to brainstorm with a partner.




Cases

                   1. All windows have same value
                   2. A bar (B, BB, or BBB) in every window
                   3. Cherries and diamonds
                   4. (No prize)




Same values

                   1. Check whether all windows are the
                      same. How?
                   2. If so, look up prize value. How?

                   With a partner, brainstorm for 2 minutes on how to
                   solve one of these problems.

# Same value
         same <- length(unique(windows)) == 1

         # OR
         same <- windows[1] == windows[2] &&
                 windows[2] == windows[3]

         if (same) {
           # Lookup value
         }
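
Both checks can be tried on a winning and a non-winning set of windows. The example values below are illustrative only.

```r
# All three windows match
windows <- c("7", "7", "7")
same_unique <- length(unique(windows)) == 1
same_pairwise <- windows[1] == windows[2] && windows[2] == windows[3]

# Only two of the three windows match
mixed <- c("7", "C", "C")
mixed_unique <- length(unique(mixed)) == 1
mixed_pairwise <- mixed[1] == mixed[2] && mixed[2] == mixed[3]

c(same_unique, same_pairwise, mixed_unique, mixed_pairwise)
```

The `unique()` version keeps working if the number of windows changes, while the pairwise version is tied to exactly three.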



&&, || vs. &, |
                   Use && and || to combine sub-conditions and
                   return a single TRUE or FALSE
                            Not & and | - these return vectors when
                            given vectors
                            && and || are “short-circuiting”: they do the
                            minimum amount of work
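
A short sketch of the difference, using a made-up vector `x`. The calls to `stop()` are there only to show that the right-hand side is skipped.

```r
x <- 1:5

# & works element by element and returns a vector
elementwise <- x > 1 & x < 4

# && combines two single values into one TRUE or FALSE
single <- x[1] < x[2] && x[2] < x[3]

# Short-circuiting: the right-hand side is never evaluated here,
# so stop() is not called
guarded_and <- FALSE && stop("never evaluated")
guarded_or <- TRUE || stop("never evaluated")
```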




If
                   if (condition) {
                            expression
                   }
                   Condition should be a logical vector of
                   length 1




     if (TRUE) {
       # This will be run
     }

     if (FALSE) {
       # This won't be run
     } else {
       # This will be
     }

     # Note indenting. Very important!

     # Single line form: (not recommended)
     if (TRUE) print("True!")
     if (FALSE) print("True!")


x <- 5
     if (x < 5) print("x < 5")
     if (x == 5) print("x == 5")

     x <- 1:5
     if (x < 3) print("What should happen here?")

     if (x[1] < x[2]) print("x1 < x2")
     if (x[1] < x[2] && x[2] < x[3]) print("Asc")
     if (x[1] < x[2] || x[2] < x[3]) print("Asc")
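
For the `x <- 1:5` case, the comparison `x < 3` has length 5, so it is not a valid condition on its own. Recent versions of R signal an error for this, while older versions warned and used only the first element. One way to get a single TRUE or FALSE, sketched here, is to reduce the vector with `any()` or `all()`:

```r
x <- 1:5

# The comparison is vectorised: one result per element
below_three <- x < 3

# Reduce to a single value before using it in if ()
some_below <- any(below_three)
every_below <- all(below_three)

if (some_below) print("At least one value is below 3")
if (every_below) print("Every value is below 3")
```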




     if (windows[1] == "DD") {
       prize <- 800
     } else if (windows[1] == "7") {
       prize <- 80
     } else if (windows[1] == "BBB") ...

     # Or use subsetting
     c("DD" = 800, "7" = 80, "BBB" = 40)
     c("DD" = 800, "7" = 80, "BBB" = 40)["BBB"]
     c("DD" = 800, "7" = 80, "BBB" = 40)["0"]
     c("DD" = 800, "7" = 80, "BBB" = 40)[windows[1]]
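
The subsetting approach can be sketched with the lookup vector stored under a name. The variable names below are illustrative, and `unname()` is an optional extra step to drop the label from the result.

```r
payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40)
windows <- c("7", "7", "7")

# Subsetting by name returns a named vector of length 1
prize <- payoffs[windows[1]]

# unname() keeps just the number
prize_plain <- unname(prize)

# A name that is not in the vector gives NA rather than an error
missing_prize <- payoffs["0"]
```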



Your turn


                   Complete the previous code so that if all
                   the values in windows are the same, the prize
                   variable is set to the correct amount.




All bars

                   How can we determine if all of the
                   windows are B, BB, or BBB?
                   (windows[1] == "B" ||
                    windows[1] == "BB" ||
                    windows[1] == "BBB") && ... ?

                   Take 1 minute to brainstorm
                   possible solutions

windows[1] %in% c("B", "BB", "BBB")
     windows %in% c("B", "BB", "BBB")

     allbars <- windows %in% c("B", "BB", "BBB")
     allbars[1] & allbars[2] & allbars[3]
     all(allbars)


     # See also ?any for the complement
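
A small example of `%in%` with `all()` and `any()`, using made-up window values:

```r
bars <- c("B", "BB", "BBB")

# Two bars and a seven
windows <- c("B", "BBB", "7")
is_bar <- windows %in% bars      # one TRUE/FALSE per window
every_bar <- all(is_bar)         # are all windows bars?
some_bar <- any(is_bar)          # is at least one window a bar?

# A mix of different bars still counts as all bars
mixed_bars <- all(c("B", "BB", "BBB") %in% bars)
```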




Your turn

                   Complete the previous code so that the
                   correct value of prize is set if all the
                   windows are the same, or they are all
                   bars




payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
       "BB" = 25, "B" = 10, "C" = 10, "0" = 0)

     same <- length(unique(windows)) == 1
     allbars <- all(windows %in% c("B", "BB", "BBB"))

     if (same) {
       prize <- payoffs[windows[1]]
     } else if (allbars) {
       prize <- 5
     }



Cherries

                   Need numbers of cherries, and numbers
                   of diamonds (hint: use sum)
                   Then need to look up values (like for the
                   first case) and multiply together




cherries <- sum(windows == "C")
         diamonds <- sum(windows == "DD")

         c(0, 2, 5)[cherries + 1] *
           c(1, 2, 4)[diamonds + 1]
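
A worked example with one cherry and one diamond (the window values are chosen for illustration). The `+ 1` is needed because R vectors are indexed from 1, so a count of zero has to map to the first element.

```r
windows <- c("C", "DD", "0")

cherries <- sum(windows == "C")    # TRUE counts as 1, so this is 1
diamonds <- sum(windows == "DD")   # also 1

cherry_prize <- c(0, 2, 5)[cherries + 1]   # one cherry pays 2
multiplier <- c(1, 2, 4)[diamonds + 1]     # one diamond doubles

prize <- cherry_prize * multiplier         # 2 * 2 = 4
```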




payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
       "BB" = 25, "B" = 10, "C" = 10, "0" = 0)

     same <- length(unique(windows)) == 1
     allbars <- all(windows %in% c("B", "BB", "BBB"))

     if (same) {
       prize <- payoffs[windows[1]]
     } else if (allbars) {
       prize <- 5
     } else {
       cherries <- sum(windows == "C")
       diamonds <- sum(windows == "DD")

       prize <- c(0, 2, 5)[cherries + 1] *
         c(1, 2, 4)[diamonds + 1]
     }
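
One way to check the combined code before turning it into a function is to run it over a few hand-picked cases in a loop. This is only a sketch: the `cases` list and the `prizes` vector are introduced here for illustration, and `unname()` is added so the stored results carry no labels.

```r
payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
  "BB" = 25, "B" = 10, "C" = 10, "0" = 0)

# Hand-picked test cases (illustrative)
cases <- list(
  c("7", "7", "7"),
  c("B", "BB", "BBB"),
  c("C", "C", "0"),
  c("C", "DD", "0"),
  c("0", "7", "B")
)

prizes <- numeric(length(cases))
for (i in seq_along(cases)) {
  windows <- cases[[i]]

  same <- length(unique(windows)) == 1
  allbars <- all(windows %in% c("B", "BB", "BBB"))

  if (same) {
    prize <- payoffs[windows[1]]
  } else if (allbars) {
    prize <- 5
  } else {
    cherries <- sum(windows == "C")
    diamonds <- sum(windows == "DD")

    prize <- c(0, 2, 5)[cherries + 1] *
      c(1, 2, 4)[diamonds + 1]
  }

  prizes[i] <- unname(prize)
}

prizes
```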

Writing a function

                   Now we need to wrap this code up in a
                   reusable form: we need a function.
                   We have used functions a lot; next time we’ll
                   learn how to write one.



