class: center, middle, inverse, title-slide

# Review
### Daniel Anderson
### Week 9, Class 2

---
layout: true

<script>
  feather.replace()
</script>

<div class="slides-footer">
<span>

<a class = "footer-icon-link" href = "https://github.com/uo-datasci-specialization/c3-fp-2021/raw/main/static/slides/w9p2.pdf">
  <i class = "footer-icon" data-feather="download"></i>
</a>

<a class = "footer-icon-link" href = "https://fp-2021.netlify.app/slides/w9p2.html">
  <i class = "footer-icon" data-feather="link"></i>
</a>

<a class = "footer-icon-link" href = "https://fp-2021.netlify.app/">
  <i class = "footer-icon" data-feather="globe"></i>
</a>

<a class = "footer-icon-link" href = "https://github.com/uo-datasci-specialization/c3-fp-2021">
  <i class = "footer-icon" data-feather="github"></i>
</a>

</span>
</div>

---
# Agenda

* Quick looping review

* Review functions

* A shiny challenge
    + But first, a brief discussion on publishing shiny apps

---
class: inverse-red middle

# Quick looping review

---
# Scenario

* Let's say you want to `scale()` (standardize) every numeric column in a data frame.

--

Use the `palmerpenguins::penguins` dataset for our example.

```r
library(palmerpenguins)
penguins
```

```
## # A tibble: 344 x 8
##    species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex     year
##    <fct>   <fct>              <dbl>         <dbl>             <int>       <int> <fct>  <int>
##  1 Adelie  Torgersen           39.1          18.7               181        3750 male    2007
##  2 Adelie  Torgersen           39.5          17.4               186        3800 female  2007
##  3 Adelie  Torgersen           40.3          18                 195        3250 female  2007
##  4 Adelie  Torgersen           NA            NA                  NA          NA <NA>    2007
##  5 Adelie  Torgersen           36.7          19.3               193        3450 female  2007
##  6 Adelie  Torgersen           39.3          20.6               190        3650 male    2007
##  7 Adelie  Torgersen           38.9          17.8               181        3625 female  2007
##  8 Adelie  Torgersen           39.2          19.6               195        4675 male    2007
##  9 Adelie  Torgersen           34.1          18.1               193        3475 <NA>    2007
## 10 Adelie  Torgersen           42            20.2               190        4250 <NA>    2007
## # … with 334 more rows
```

---
# For loop method

You try first.
Can you write a `for` loop that loops through each column and applies `scale()` if it's numeric?
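
Before writing the loop, it can help to look at what `scale()` returns for a single vector. The following is a minimal sketch with a made-up vector `x` (not the penguins data): the result is a one-column matrix that carries the centering and scaling values as attributes, rather than a plain vector. This is probably why the tibble printed by `map_df()` later in these slides shows headers such as `bill_length_mm[,1]`.

```r
# A made-up vector, used only for illustration
x <- c(1, 2, 3)

z <- scale(x)

is.matrix(z)              # scale() returns a matrix, even for a vector
dim(z)                    # 3 rows, 1 column
attr(z, "scaled:center")  # the mean that was subtracted
attr(z, "scaled:scale")   # the standard deviation that was divided by

# Drop the matrix structure if a plain numeric vector is needed
as.numeric(z)
```
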
--- One approach ```r penguins2 <- penguins for(i in seq_along(penguins2)) { if (is.numeric(penguins2[ ,i, drop = TRUE])) { penguins2[ ,i] <- scale(penguins2[ ,i]) } } ``` --- ```r penguins2 ``` ``` ## # A tibble: 344 x 8 ## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year ## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <fct> <dbl> ## 1 Adelie Torgersen -0.8832047 0.7843001 -1.416272 -0.5633167 male -1.257484 ## 2 Adelie Torgersen -0.8099390 0.1260033 -1.060696 -0.5009690 female -1.257484 ## 3 Adelie Torgersen -0.6634077 0.4298326 -0.4206603 -1.186793 female -1.257484 ## 4 Adelie Torgersen NA NA NA NA <NA> -1.257484 ## 5 Adelie Torgersen -1.322799 1.088129 -0.5628905 -0.9374027 female -1.257484 ## 6 Adelie Torgersen -0.8465718 1.746426 -0.7762357 -0.6880121 male -1.257484 ## 7 Adelie Torgersen -0.9198375 0.3285561 -1.416272 -0.7191859 female -1.257484 ## 8 Adelie Torgersen -0.8648883 1.240044 -0.4206603 0.5901153 male -1.257484 ## 9 Adelie Torgersen -1.799025 0.4804708 -0.5628905 -0.9062289 <NA> -1.257484 ## 10 Adelie Torgersen -0.3520286 1.543873 -0.7762357 0.06016004 <NA> -1.257484 ## # … with 334 more rows ``` --- # Replicate with `lapply()`?
--- ```r penguins3 <- penguins data.frame( lapply(penguins3, function(x) { if(is.numeric(x)) { x <- scale(x) } x }) ) ``` ``` ## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year ## 1 Adelie Torgersen -0.88320467 0.78430007 -1.41627153 -0.563316704 male -1.25748435 ## 2 Adelie Torgersen -0.80993901 0.12600328 -1.06069609 -0.500969030 female -1.25748435 ## 3 Adelie Torgersen -0.66340769 0.42983257 -0.42066030 -1.186793445 female -1.25748435 ## 4 Adelie Torgersen NA NA NA NA <NA> -1.25748435 ## 5 Adelie Torgersen -1.32279862 1.08812936 -0.56289047 -0.937402749 female -1.25748435 ## 6 Adelie Torgersen -0.84657184 1.74642615 -0.77623574 -0.688012052 male -1.25748435 ## 7 Adelie Torgersen -0.91983750 0.32855614 -1.41627153 -0.719185889 female -1.25748435 ## 8 Adelie Torgersen -0.86488825 1.24004400 -0.42066030 0.590115266 male -1.25748435 ## 9 Adelie Torgersen -1.79902541 0.48047078 -0.56289047 -0.906228912 <NA> -1.25748435 ## 10 Adelie Torgersen -0.35202864 1.54387329 -0.77623574 0.060160036 <NA> -1.25748435 ## 11 Adelie Torgersen -1.12131806 -0.02591137 -1.06069609 -1.124445771 <NA> -1.25748435 ## 12 Adelie Torgersen -1.12131806 0.07536506 -1.48738661 -0.625664378 <NA> -1.25748435 ## 13 Adelie Torgersen -0.51687637 0.22727971 -1.34515644 -1.249141119 female -1.25748435 ## 14 Adelie Torgersen -0.97478674 2.05025544 -0.70512065 -0.500969030 male -1.25748435 ## 15 Adelie Torgersen -1.70744334 1.99961722 -0.20731504 0.247203059 male -1.25748435 ## 16 Adelie Torgersen -1.34111504 0.32855614 -1.13181117 -0.625664378 female -1.25748435 ## 17 Adelie Torgersen -0.95647033 0.93621471 -0.42066030 -0.937402749 female -1.25748435 ## 18 Adelie Torgersen -0.26044656 1.79706436 -0.27843012 0.371898407 male -1.25748435 ## 19 Adelie Torgersen -1.74407616 0.63238542 -1.20292626 -1.093271934 female -1.25748435 ## 20 Adelie Torgersen 0.38062795 2.20217008 -0.49177539 -0.002187638 male -1.25748435 ## 21 Adelie Biscoe -1.12131806 0.58174721 -1.91407714 
-0.999750423 female -1.25748435 ## 22 Adelie Biscoe -1.13963448 0.78430007 -1.48738661 -0.750359726 male -1.25748435 ## 23 Adelie Biscoe -1.46932994 1.03749114 -0.84735082 -0.500969030 female -1.25748435 ## 24 Adelie Biscoe -1.04805240 0.48047078 -1.13181117 -0.313926008 male -1.25748435 ## 25 Adelie Biscoe -0.93815391 0.02472685 -1.48738661 -0.500969030 male -1.25748435 ## 26 Adelie Biscoe -1.57922843 0.88557650 -0.98958100 -0.500969030 female -1.25748435 ## 27 Adelie Biscoe -0.60845845 0.73366185 -1.27404135 -0.812707400 male -1.25748435 ## 28 Adelie Biscoe -0.62677486 0.37919435 -0.98958100 -1.249141119 female -1.25748435 ## 29 Adelie Biscoe -1.10300165 0.73366185 -2.05630731 -1.311488793 female -1.25748435 ## 30 Adelie Biscoe -0.62677486 0.88557650 -1.48738661 -0.313926008 male -1.25748435 ## 31 Adelie Dream -0.80993901 -0.22846423 -1.62961679 -1.186793445 female -1.25748435 ## 32 Adelie Dream -1.23121655 0.48047078 -1.62961679 -0.376273682 male -1.25748435 ## 33 Adelie Dream -0.80993901 0.32855614 -0.91846591 -1.124445771 female -1.25748435 ## 34 Adelie Dream -0.55350920 0.88557650 -1.20292626 -0.376273682 male -1.25748435 ## 35 Adelie Dream -1.37774787 -0.07654958 -0.42066030 -1.093271934 female -1.25748435 ## 36 Adelie Dream -0.86488825 1.99961722 -0.34954521 -0.064535312 male -1.25748435 ## 37 Adelie Dream -0.93815391 1.44259686 -0.77623574 -0.313926008 male -1.25748435 ## 38 Adelie Dream -0.31539581 0.68302364 -1.48738661 -0.812707400 female -1.25748435 ## 39 Adelie Dream -1.15795089 1.08812936 -1.41627153 -1.124445771 female -1.25748435 ## 40 Adelie Dream -0.75498976 0.98685293 -1.20292626 0.558941429 male -1.25748435 ## 41 Adelie Dream -1.35943145 0.42983257 -1.34515644 -1.311488793 female -1.25748435 ## 42 Adelie Dream -0.57182562 0.63238542 -0.42066030 -0.376273682 male -1.25748435 ## 43 Adelie Dream -1.45101353 0.68302364 -1.06069609 -1.373836467 female -1.25748435 ## 44 Adelie Dream 0.03261607 1.29068222 -0.34954521 0.247203059 male -1.25748435 ## 45 
Adelie Dream -1.26784938 -0.12718780 -1.13181117 -1.498531815 female -1.25748435 ## 46 Adelie Dream -0.79162259 0.83493828 -0.77623574 0.496593755 male -1.25748435 ## 47 Adelie Dream -0.51687637 0.93621471 -1.34515644 -0.968576586 male -1.25748435 ## 48 Adelie Dream -1.17626731 0.88557650 -1.55850170 -1.529705652 <NA> -1.25748435 ## 49 Adelie Dream -1.45101353 0.37919435 -0.77623574 -0.937402749 female -1.25748435 ## 50 Adelie Dream -0.29707939 2.05025544 -0.70512065 -0.064535312 male -1.25748435 ## 51 Adelie Biscoe -0.79162259 0.27791792 -1.06069609 -0.875055074 female -0.03552216 ## 52 Adelie Biscoe -0.70004052 0.88557650 -0.91846591 0.122507710 male -0.03552216 ## 53 Adelie Biscoe -1.63417768 0.37919435 -0.77623574 -0.937402749 female -0.03552216 ## 54 Adelie Biscoe -0.35202864 1.18940579 -0.06508486 -0.189230660 male -0.03552216 ## 55 Adelie Biscoe -1.72575975 0.48047078 -0.98958100 -1.623227163 female -0.03552216 ## 56 Adelie Biscoe -0.46192713 0.73366185 -0.70512065 -0.625664378 male -0.03552216 ## 57 Adelie Biscoe -0.90152108 0.17664149 -1.06069609 -0.812707400 female -0.03552216 ## 58 Adelie Biscoe -0.60845845 0.83493828 -0.56289047 -0.500969030 male -0.03552216 ## 59 Adelie Biscoe -1.35943145 -0.27910244 -1.41627153 -1.685574837 female -0.03552216 ## 60 Adelie Biscoe -1.15795089 0.98685293 -0.49177539 -0.563316704 male -0.03552216 ## 61 Adelie Biscoe -1.50596277 -0.12718780 -1.13181117 -1.311488793 female -0.03552216 ## 62 Adelie Biscoe -0.48024354 1.99961722 -0.42066030 0.247203059 male -0.03552216 ## 63 Adelie Biscoe -1.15795089 -0.07654958 -1.13181117 -0.750359726 female -0.03552216 ## 64 Adelie Biscoe -0.51687637 0.53110900 -0.63400556 -0.189230660 male -0.03552216 ## 65 Adelie Biscoe -1.37774787 -0.02591137 -1.20292626 -1.685574837 female -0.03552216 ## 66 Adelie Biscoe -0.42529430 0.42983257 -0.63400556 -0.313926008 male -0.03552216 ## 67 Adelie Biscoe -1.54259560 -0.48165530 -0.42066030 -1.062098097 female -0.03552216 ## 68 Adelie Biscoe -0.51687637 
0.98685293 -0.91846591 -0.126882986 male -0.03552216 ## 69 Adelie Torgersen -1.46932994 -0.27910244 -0.77623574 -1.436184141 female -0.03552216 ## 70 Adelie Torgersen -0.38866147 1.13876757 -0.20731504 0.309550733 male -0.03552216 ## 71 Adelie Torgersen -1.90892390 0.93621471 -0.77623574 -0.750359726 female -0.03552216 ## 72 Adelie Torgersen -0.77330618 0.63238542 -0.77623574 -0.376273682 male -0.03552216 ## 73 Adelie Torgersen -0.79162259 0.02472685 -0.34954521 -0.812707400 female -0.03552216 ## 74 Adelie Torgersen 0.34399512 0.88557650 -0.27843012 -0.064535312 male -0.03552216 ## 75 Adelie Torgersen -1.54259560 0.17664149 -0.77623574 -0.625664378 female -0.03552216 ## 76 Adelie Torgersen -0.20549732 0.68302364 -0.42066030 0.060160036 male -0.03552216 ## 77 Adelie Torgersen -0.55350920 -0.17782601 -0.70512065 -0.625664378 female -0.03552216 ## 78 Adelie Torgersen -1.23121655 1.13876757 -1.20292626 -0.376273682 male -0.03552216 ## 79 Adelie Torgersen -1.41438070 -0.53229351 -0.98958100 -0.812707400 female -0.03552216 ## 80 Adelie Torgersen -0.33371222 0.98685293 -0.42066030 -0.251578334 male -0.03552216 ## 81 Adelie Torgersen -1.70744334 0.02472685 -0.84735082 -1.249141119 female -0.03552216 ## 82 Adelie Torgersen -0.18718091 0.22727971 -0.34954521 0.621289103 male -0.03552216 ## 83 Adelie Torgersen -1.32279862 0.83493828 -0.98958100 -0.500969030 female -0.03552216 ## 84 Adelie Torgersen -1.61586126 1.13876757 -0.56289047 -0.002187638 male -0.03552216 ## 85 Adelie Dream -1.21290014 0.32855614 -0.70512065 -1.062098097 female -0.03552216 ## 86 Adelie Dream -0.48024354 1.59451151 -0.49177539 -0.812707400 male -0.03552216 ## 87 Adelie Dream -1.39606428 1.18940579 -0.77623574 -0.500969030 male -0.03552216 ## 88 Adelie Dream -1.28616579 0.73366185 -0.84735082 -0.875055074 female -0.03552216 ## 89 Adelie Dream -1.02973599 1.03749114 -0.84735082 -0.313926008 male -0.03552216 ## 90 Adelie Dream -0.91983750 0.83493828 -0.77623574 -0.750359726 female -0.03552216 ## 91 Adelie 
Dream -1.50596277 0.42983257 0.07714531 -0.812707400 female -0.03552216 ## 92 Adelie Dream -0.51687637 0.48047078 0.29049058 0.122507710 male -0.03552216 ## 93 Adelie Dream -1.81734182 -0.02591137 -1.13181117 -0.999750423 female -0.03552216 ## 94 Adelie Dream -0.79162259 0.48047078 -1.06069609 0.309550733 male -0.03552216 ## 95 Adelie Dream -1.41438070 0.07536506 -0.98958100 -1.124445771 female -0.03552216 ## 96 Adelie Dream -0.57182562 0.88557650 0.50383584 0.122507710 male -0.03552216 ## 97 Adelie Dream -1.06636882 0.73366185 -0.77623574 -0.625664378 female -0.03552216 ## 98 Adelie Dream -0.66340769 0.68302364 -0.34954521 0.184855384 male -0.03552216 ## 99 Adelie Dream -1.98218956 -0.53229351 -1.62961679 -1.623227163 female -0.03552216 ## 100 Adelie Dream -0.13223166 0.68302364 -0.63400556 -0.126882986 male -0.03552216 ## 101 Adelie Biscoe -1.63417768 0.37919435 -0.63400556 -0.594490541 female 1.18644003 ## 102 Adelie Biscoe -0.53519279 1.44259686 0.14826040 0.652462940 male 1.18644003 ## 103 Adelie Biscoe -1.13963448 -0.58293173 -1.27404135 -1.405010304 female 1.18644003 ## 104 Adelie Biscoe -1.12131806 1.44259686 -0.77623574 0.060160036 male 1.18644003 ## 105 Adelie Biscoe -1.10300165 0.73366185 -0.56289047 -1.592053326 female 1.18644003 ## 106 Adelie Biscoe -0.77330618 0.88557650 -1.20292626 -0.812707400 male 1.18644003 ## 107 Adelie Biscoe -0.97478674 0.02472685 -0.13619995 -0.563316704 female 1.18644003 ## 108 Adelie Biscoe -1.04805240 1.44259686 -0.77623574 -0.376273682 male 1.18644003 ## 109 Adelie Biscoe -1.06636882 -0.07654958 -1.41627153 -1.280314956 female 1.18644003 ## 110 Adelie Biscoe -0.13223166 0.93621471 -0.27843012 0.714810614 male 1.18644003 ## 111 Adelie Biscoe -1.06636882 -0.32974066 -0.20731504 -0.469795193 female 1.18644003 ## 112 Adelie Biscoe 0.30736229 1.59451151 -0.70512065 0.496593755 male 1.18644003 ## 113 Adelie Biscoe -0.77330618 0.27791792 -0.56289047 -1.249141119 female 1.18644003 ## 114 Adelie Biscoe -0.31539581 1.18940579 
-0.27843012 0.091333873 male 1.18644003 ## 115 Adelie Biscoe -0.79162259 1.79706436 -0.70512065 -0.376273682 female 1.18644003 ## 116 Adelie Biscoe -0.22381374 0.58174721 -0.34954521 -0.158056823 male 1.18644003 ## 117 Adelie Torgersen -0.97478674 -0.07654958 -0.91846591 -1.623227163 female 1.18644003 ## 118 Adelie Torgersen -1.21290014 1.69578793 -0.13619995 -0.532142867 male 1.18644003 ## 119 Adelie Torgersen -1.50596277 -0.07654958 -0.84735082 -1.062098097 female 1.18644003 ## 120 Adelie Torgersen -0.51687637 0.73366185 -0.84735082 -1.093271934 male 1.18644003 ## 121 Adelie Torgersen -1.41438070 0.02472685 -0.98958100 -1.311488793 female 1.18644003 ## 122 Adelie Torgersen -1.13963448 1.34132043 -0.20731504 -0.875055074 male 1.18644003 ## 123 Adelie Torgersen -0.68172411 -0.07654958 -1.77184696 -0.937402749 female 1.18644003 ## 124 Adelie Torgersen -0.46192713 0.68302364 0.07714531 -0.407447519 male 1.18644003 ## 125 Adelie Torgersen -1.59754485 -0.63356994 -1.06069609 -1.436184141 female 1.18644003 ## [ reached 'max' / getOption("max.print") -- omitted 219 rows ] ``` --- # {purrr} When do we use `~`? 
Remember, `purrr::map()` is *exactly* like `base::lapply()`, except for the shortcut syntax -- `~` is a shortcut for `function(.x)` --- # Equivalent The following are the same .pull-left[ ```r map_df(penguins, ~{ if(is.numeric(.x)) { .x <- scale(.x) } .x }) ``` ``` ## # A tibble: 344 x 8 ## species island bill_length_mm[,1] bill_depth_mm[,1] flipper_length_mm[,1] body_mass_g[,1] sex year[,1] ## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <fct> <dbl> ## 1 Adelie Torgersen -0.8832047 0.7843001 -1.416272 -0.5633167 male -1.257484 ## 2 Adelie Torgersen -0.8099390 0.1260033 -1.060696 -0.5009690 female -1.257484 ## 3 Adelie Torgersen -0.6634077 0.4298326 -0.4206603 -1.186793 female -1.257484 ## 4 Adelie Torgersen NA NA NA NA <NA> -1.257484 ## 5 Adelie Torgersen -1.322799 1.088129 -0.5628905 -0.9374027 female -1.257484 ## 6 Adelie Torgersen -0.8465718 1.746426 -0.7762357 -0.6880121 male -1.257484 ## 7 Adelie Torgersen -0.9198375 0.3285561 -1.416272 -0.7191859 female -1.257484 ## 8 Adelie Torgersen -0.8648883 1.240044 -0.4206603 0.5901153 male -1.257484 ## 9 Adelie Torgersen -1.799025 0.4804708 -0.5628905 -0.9062289 <NA> -1.257484 ## 10 Adelie Torgersen -0.3520286 1.543873 -0.7762357 0.06016004 <NA> -1.257484 ## # … with 334 more rows ``` ] .pull-right[ ```r map_df(penguins, function(.x) { if(is.numeric(.x)) { .x <- scale(.x) } .x }) ``` ``` ## # A tibble: 344 x 8 ## species island bill_length_mm[,1] bill_depth_mm[,1] flipper_length_mm[,1] body_mass_g[,1] sex year[,1] ## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <fct> <dbl> ## 1 Adelie Torgersen -0.8832047 0.7843001 -1.416272 -0.5633167 male -1.257484 ## 2 Adelie Torgersen -0.8099390 0.1260033 -1.060696 -0.5009690 female -1.257484 ## 3 Adelie Torgersen -0.6634077 0.4298326 -0.4206603 -1.186793 female -1.257484 ## 4 Adelie Torgersen NA NA NA NA <NA> -1.257484 ## 5 Adelie Torgersen -1.322799 1.088129 -0.5628905 -0.9374027 female -1.257484 ## 6 Adelie Torgersen -0.8465718 1.746426 -0.7762357 -0.6880121 male -1.257484 ## 7 Adelie Torgersen 
-0.9198375 0.3285561 -1.416272 -0.7191859 female -1.257484
## 8 Adelie Torgersen -0.8648883 1.240044 -0.4206603 0.5901153 male -1.257484
## 9 Adelie Torgersen -1.799025 0.4804708 -0.5628905 -0.9062289 <NA> -1.257484
## 10 Adelie Torgersen -0.3520286 1.543873 -0.7762357 0.06016004 <NA> -1.257484
## # … with 334 more rows
```
]

---
# When to use `~`?

You can use it whenever you feel comfortable, including always.

You can also just loop a function through, and pass additional arguments to that function, e.g.,

```r
map_dbl(penguins, mean, na.rm = TRUE)
```

```
##           species            island    bill_length_mm     bill_depth_mm flipper_length_mm       body_mass_g
##                NA                NA          43.92193          17.15117         200.91520        4201.75439
##               sex              year
##                NA        2008.02907
```

```r
map_dbl(penguins, ~mean(.x, na.rm = TRUE))
```

```
##           species            island    bill_length_mm     bill_depth_mm flipper_length_mm       body_mass_g
##                NA                NA          43.92193          17.15117         200.91520        4201.75439
##               sex              year
##                NA        2008.02907
```

---
The `~ .x` syntax is also helpful for more complex things

```r
map_dbl(penguins, ~ifelse(
    is.numeric(.x),
    mean(.x, na.rm = TRUE),
    0
  )
)
```

```
##           species            island    bill_length_mm     bill_depth_mm flipper_length_mm       body_mass_g
##           0.00000           0.00000          43.92193          17.15117         200.91520        4201.75439
##               sex              year
##           0.00000        2008.02907
```

---
class: inverse-blue middle

# Functions review

---
# Remember

* Everything is a function

--

The following are equivalent

```r
3 + 5
```

```
## [1] 8
```

```r
`+`(3, 5)
```

```
## [1] 8
```

---
# Using functions

* Most functions are bound to a name, e.g., `mean()`

--

* Anonymous functions are also common
    + Apply the function in a loop, and it only ever exists in the loop

--

* You can also store functions in lists
    + Helpful if you want to apply lots of operations to a single vector

---
# Binding to a name

* Let's create a function that takes two arguments: (a) a data frame, and (b) the name of a discrete/categorical variable/column in the data frame.

* The function should return the count of each "level" in the categorical variable.

* For a small added challenge, have it optionally add the proportion

--

Example output with `palmerpenguins::penguins`:

.pull-left[
```
##     species count
## 1    Adelie   152
## 2 Chinstrap    68
## 3    Gentoo   124
```
]

.pull-right[
```
##     species count proportion
## 1    Adelie   152  0.4418605
## 2 Chinstrap    68  0.1976744
## 3    Gentoo   124  0.3604651
```
]

---
# You try first

Test it out with the **palmerpenguins** dataset. Do you get the same results I did?

Note - the example I used included only base R functions. Feel free to use **dplyr** or whatevs, just be careful with NSE (non-standard evaluation).

You also don't have to return a data frame - return the output however you want.
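
The caution about NSE has a close cousin in base R: `$` takes the name written after it literally instead of evaluating it, so it does not work with a column name stored in a function argument. A minimal sketch of the pitfall, where `toy`, `with_dollar()`, and `with_brackets()` are hypothetical names made up for illustration:

```r
# A made-up data frame, used only for illustration
toy <- data.frame(
  grp = c("a", "b", "a"),
  stringsAsFactors = FALSE
)

# `$` looks for a column literally called "column", which does not exist
with_dollar <- function(df, column) {
  df$column
}

# `[[` evaluates `column` first, then looks up the name it holds
with_brackets <- function(df, column) {
  df[[column]]
}

with_dollar(toy, "grp")    # NULL
with_brackets(toy, "grp")  # the grp column as a character vector
```
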
--- # Where to start? ~~write a function~~ * Solve the problem for one example, generalize it to a function. -- Use the **palmerpenguins** dataset *for* your example! -- ```r library(palmerpenguins) penguins ``` ``` ## # A tibble: 344 x 8 ## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year ## <fct> <fct> <dbl> <dbl> <int> <int> <fct> <int> ## 1 Adelie Torgersen 39.1 18.7 181 3750 male 2007 ## 2 Adelie Torgersen 39.5 17.4 186 3800 female 2007 ## 3 Adelie Torgersen 40.3 18 195 3250 female 2007 ## 4 Adelie Torgersen NA NA NA NA <NA> 2007 ## 5 Adelie Torgersen 36.7 19.3 193 3450 female 2007 ## 6 Adelie Torgersen 39.3 20.6 190 3650 male 2007 ## 7 Adelie Torgersen 38.9 17.8 181 3625 female 2007 ## 8 Adelie Torgersen 39.2 19.6 195 4675 male 2007 ## 9 Adelie Torgersen 34.1 18.1 193 3475 <NA> 2007 ## 10 Adelie Torgersen 42 20.2 190 4250 <NA> 2007 ## # … with 334 more rows ``` --- # How do you want to solve it? Lots of ways, here's a base method -- * First, split by species ```r splt <- split(penguins, penguins$species) ``` -- * Next, count how many rows (observations) in each split ```r sapply(splt, nrow) ``` ``` ## Adelie Chinstrap Gentoo ## 152 68 124 ``` -- Could go on, but this is basically the output. --- # Wrap in a function * What will the arguments be? -- The data frame and the column ```r get_counts <- function(df, column) { } ``` -- What will the body be? --- # Same as before Just swap out the code for the arguments. Notice I'm indexing the columns differently. Why? I'm also swapping out `sapply()` for `vapply()` to be a little more safe. ```r get_counts <- function(df, column) { splt <- split(df, df[[column]]) vapply(splt, nrow, FUN.VALUE = integer(1)) } ``` --- # Test it ```r get_counts(penguins, "species") ``` ``` ## Adelie Chinstrap Gentoo ## 152 68 124 ``` ```r get_counts(penguins, "island") ``` ``` ## Biscoe Dream Torgersen ## 168 124 52 ``` --- # Extensions Let's say we want a data frame as the output. 
Can you modify what we have now to make that so?
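
The earlier switch from `sapply()` to `vapply()` matters most when the result feeds into something else, such as a data frame column. The sketch below shows the difference under two assumed edge cases: an empty input, and a hypothetical helper (`wrong_type()`) that returns the wrong type. It uses the built-in `mtcars` data only as a convenient data frame.

```r
# With no input, sapply() falls back to an empty list ...
empty <- list()
s <- sapply(empty, nrow)
class(s)

# ... while vapply() always returns the type promised by FUN.VALUE
v <- vapply(empty, nrow, FUN.VALUE = integer(1))
class(v)

# vapply() also stops with an error when a result does not match FUN.VALUE
wrong_type <- function(d) as.character(nrow(d))  # hypothetical helper
res <- tryCatch(
  vapply(list(mtcars), wrong_type, FUN.VALUE = integer(1)),
  error = function(e) "error"
)
res
```

With `vapply()`, the type of `counts` is known in advance, so whatever is built on top of it does not need to guard against a surprise list or character vector.
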
--- # Data frame ```r get_counts <- function(df, column) { splt <- split(df, df[[column]]) counts <- vapply(splt, nrow, FUN.VALUE = integer(1)) tibble::tibble( var_levels = names(counts), # could use names(splt) count = counts ) } ``` --- # Test it ```r get_counts(penguins, "species") ``` ``` ## # A tibble: 3 x 2 ## var_levels count ## <chr> <int> ## 1 Adelie 152 ## 2 Chinstrap 68 ## 3 Gentoo 124 ``` ```r get_counts(penguins, "island") ``` ``` ## # A tibble: 3 x 2 ## var_levels count ## <chr> <int> ## 1 Biscoe 168 ## 2 Dream 124 ## 3 Torgersen 52 ``` --- # Column name Can we make the output from the data frame have the same column that we fed it?
-- ```r get_counts <- function(df, column) { splt <- split(df, df[[column]]) counts <- vapply(splt, nrow, FUN.VALUE = integer(1)) d <- tibble::tibble( var_levels = names(counts), # could use names(splt) count = counts ) * names(d)[1] <- column d } ``` --- # Test it ```r get_counts(penguins, "species") ``` ``` ## # A tibble: 3 x 2 ## species count ## <chr> <int> ## 1 Adelie 152 ## 2 Chinstrap 68 ## 3 Gentoo 124 ``` ```r get_counts(penguins, "island") ``` ``` ## # A tibble: 3 x 2 ## island count ## <chr> <int> ## 1 Biscoe 168 ## 2 Dream 124 ## 3 Torgersen 52 ``` --- # {dplyr} version Can we replicate this function using dplyr? We'll have to use non-standard evaluation -- First, solve it on a use case ```r penguins %>% count(species) ``` ``` ## # A tibble: 3 x 2 ## species n ## <fct> <int> ## 1 Adelie 152 ## 2 Chinstrap 68 ## 3 Gentoo 124 ``` --- # Function Will this work? ```r get_counts <- function(df, column) { df %>% count(column) } ``` -- ```r get_counts(penguins, species) ``` ``` ## Error: Must group by variables found in `.data`. ## * Column `column` is not found. ``` --- # Use NSE ```r get_counts <- function(df, column) { df %>% count({{column}}) } ``` -- ```r get_counts(penguins, species) ``` ``` ## # A tibble: 3 x 2 ## species n ## <fct> <int> ## 1 Adelie 152 ## 2 Chinstrap 68 ## 3 Gentoo 124 ``` ```r get_counts(penguins, island) ``` ``` ## # A tibble: 3 x 2 ## island n ## <fct> <int> ## 1 Biscoe 168 ## 2 Dream 124 ## 3 Torgersen 52 ``` --- # Pass the dots Alternatively, you could just pass the dots Bonus, this will now give you the counts for multiple columns -- ```r get_counts <- function(df, ...) { df %>% count(...) 
} ``` --- # Test it ```r get_counts(penguins, species) ``` ``` ## # A tibble: 3 x 2 ## species n ## <fct> <int> ## 1 Adelie 152 ## 2 Chinstrap 68 ## 3 Gentoo 124 ``` ```r get_counts(penguins, species, island) ``` ``` ## # A tibble: 5 x 3 ## species island n ## <fct> <fct> <int> ## 1 Adelie Biscoe 44 ## 2 Adelie Dream 56 ## 3 Adelie Torgersen 52 ## 4 Chinstrap Dream 68 ## 5 Gentoo Biscoe 124 ``` --- # Conditions * Let's add a condition that optionally reports the proportions in addition to the counts. -- * What will be the first step? -- * Add a new argument (and consider setting defaults for that argument) -- ```r get_counts <- function(df, column, return_proportions = FALSE) { df %>% count({{column}}) } ``` --- # Set conditional block Create a block for operations to conduct **when the condition is TRUE** ```r get_counts <- function(df, column, return_proportions = FALSE) { counts <- df %>% count({{column}}) if (isTRUE(return_proportions)) { } counts } ``` --- # Write condition In the block, include the code that is only evaluated when the condition is `TRUE`. ```r get_counts <- function(df, column, return_proportions = FALSE) { counts <- df %>% count({{column}}) if (isTRUE(return_proportions)) { counts <- counts %>% mutate(proportion = n / sum(n)) } counts } ``` --- # Test it ```r get_counts(penguins, species) ``` ``` ## # A tibble: 3 x 2 ## species n ## <fct> <int> ## 1 Adelie 152 ## 2 Chinstrap 68 ## 3 Gentoo 124 ``` ```r get_counts(penguins, species, return_proportions = TRUE) ``` ``` ## # A tibble: 3 x 3 ## species n proportion ## <fct> <int> <dbl> ## 1 Adelie 152 0.4418605 ## 2 Chinstrap 68 0.1976744 ## 3 Gentoo 124 0.3604651 ``` --- # Challenge Now that we have a basic function, can you write a **new** function that *calls this function* to add the proportions and/or counts to a data frame? Should return the original data frame, but with the counts/proportions added in as a new column.
--- # One solution ```r add_counts <- function(data, column, add_proportions = FALSE) { counts <- get_counts(data, {{column}}, add_proportions) left_join(data, counts) } ``` --- # Test it out I'm selecting variables after just so we can see the counts ```r add_counts(penguins, species) %>% select(species, island, n) ``` ``` ## # A tibble: 344 x 3 ## species island n ## <fct> <fct> <int> ## 1 Adelie Torgersen 152 ## 2 Adelie Torgersen 152 ## 3 Adelie Torgersen 152 ## 4 Adelie Torgersen 152 ## 5 Adelie Torgersen 152 ## 6 Adelie Torgersen 152 ## 7 Adelie Torgersen 152 ## 8 Adelie Torgersen 152 ## 9 Adelie Torgersen 152 ## 10 Adelie Torgersen 152 ## # … with 334 more rows ``` --- # Test it again This time let's add the proportions ```r add_counts(penguins, species, add_proportions = TRUE) %>% select(species, island, n, proportion) ``` ``` ## # A tibble: 344 x 4 ## species island n proportion ## <fct> <fct> <int> <dbl> ## 1 Adelie Torgersen 152 0.4418605 ## 2 Adelie Torgersen 152 0.4418605 ## 3 Adelie Torgersen 152 0.4418605 ## 4 Adelie Torgersen 152 0.4418605 ## 5 Adelie Torgersen 152 0.4418605 ## 6 Adelie Torgersen 152 0.4418605 ## 7 Adelie Torgersen 152 0.4418605 ## 8 Adelie Torgersen 152 0.4418605 ## 9 Adelie Torgersen 152 0.4418605 ## 10 Adelie Torgersen 152 0.4418605 ## # … with 334 more rows ``` --- # Embed checks Can you embed a warning or error (your choice) if the column fed to the function is not discrete? Note - this is more difficult with our **dplyr** version. Try using `dplyr::pull()`.
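
As a small sketch of the `dplyr::pull()` hint, assuming **dplyr** is installed: `pull()` accepts an embraced argument the same way `count()` does and hands back the column as a plain vector, which ordinary base R functions can then inspect. The names `toy` and `column_type()` are hypothetical and exist only for this illustration.

```r
library(dplyr)

# A made-up data frame, used only for illustration
toy <- data.frame(
  grp = c("a", "b", "a"),
  val = c(1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

# Extract the column named by the (unquoted) argument and report its class
column_type <- function(df, column) {
  column_vec <- pull(df, {{ column }})
  class(column_vec)
}

column_type(toy, grp)  # "character"
column_type(toy, val)  # "numeric"
```
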
--- ```r get_counts <- function(df, column, return_proportions = FALSE) { column_vec <- dplyr::pull(df, {{column}}) if(is.numeric(column_vec)) { stop("Numeric column passed to function. Counts must be computed on categorical data.") } counts <- df %>% count({{column}}) if (isTRUE(return_proportions)) { counts <- counts %>% mutate(proportion = n / sum(n)) } counts } ``` --- # Test it out Note we can test it with either the `get_counts()` or `add_counts()` functions ```r get_counts(penguins, bill_length_mm) ``` ``` ## Error in get_counts(penguins, bill_length_mm): Numeric column passed to function. Counts must be computed on categorical data. ``` ```r add_counts(penguins, bill_length_mm) ``` ``` ## Error in get_counts(data, {: Numeric column passed to function. Counts must be computed on categorical data. ``` --- class: inverse-blue middle # Shiny --- # Publishing * We never talked about publishing shiny apps See [here](https://statsandr.com/blog/how-to-publish-shiny-app-example-with-shinyapps-io/) for a nice step-by-step walkthrough for publishing with https://www.shinyapps.io/ -- Basically: * Register an account with https://www.shinyapps.io/ * Add a token to your account on shinyapps * Back locally, set your account info with the token and secret via ```r rsconnect::setAccountInfo( name = "myaccount", # replace with your account name token = "mytokencopiedfromshinyappsio", # your token secret = "mysecretcopiedfromshinyappsio" ) ``` --- .center[ ### Publish ] .footnote[image from tutorial [here](https://statsandr.com/blog/how-to-publish-shiny-app-example-with-shinyapps-io/)] ![](https://statsandr.com/blog/2020-05-29-how-to-deploy-a-shiny-app-an-example-with-shinyapps-io_files/publish-shiny-app-online-shinyapps-io-4.png) --- # Shiny app * Create a shiny app or shiny dashboard with the `palmerpenguins` dataset * Allow the x and y axis to be selected by the user + Only numeric variables should be available to be selected * Allow the points to be colored by any 
categorical variable
    + For an added challenge, try to add in a "no color" option, which should be the default

Once you've gone this far, try to publish your app. If you're successful, continue with the challenge on the next slide.

---
# Challenge continued

* Add a table to the app that reports descriptive data on the columns that are selected in the plot
    + e.g., `n()`, `mean()`, `sd()`

* Use tabs so the plot shows up in one tab, and the table shows up in a different tab

Now publish again to update it.

---
class: inverse-green middle

# Next time

## No Class Monday

## Package Development on Wednesday