List columnsDaniel AndersonWeek 4, Class 21 / 57

Agenda

Review Lab 2
Introduce list columns
Contrast:
- group_by() %>% nest() %>% mutate() %>% map() with
- nest_by() %>% summarize()
In-class midterm (last 20 minutes)

2 / 57

Learning objectives

Understand list columns and how they relate to base::split
Fluently nest/unnest data frames
Understand why tidyr::nest can be a powerful framework (data frames) and when tidyr::unnest can/should be used to move out of nested data frames and into a standard data frame.

3 / 57

Review Lab 2

4 / 57

Setup

Please follow along

First import the data

library(tidyverse)
library(fs)
files <- dir_ls(here::here("data", "pfiles_sim"), 
                glob = "*.csv")
d <- files %>%
    map_df(read_csv, .id = "file")

5 / 57

Parse file data

d <- d %>%
  mutate(
    file = str_replace_all(
      file, 
      here::here("data", "pfiles_sim"), 
      ""
    ),
    grade = str_replace_all(file, "/g(\\d?\\d).+", "\\1"),
    grade = as.integer(grade),
    year = str_replace_all(
      file, 
      ".+files(\\d\\d)_sim.+", 
      "\\1"
    ),
    year = as.integer(year),
    content = str_replace_all(
      file, 
      "/g\\d?\\d(.+)pfiles.+", 
      "\\1"
    )
)

6 / 57

Select variables

d <- d %>% 
    select(ssid, grade, year, content, testeventid, 
           asmtprmrydsbltycd, asmtscndrydsbltycd, Entry:WMLE)

7 / 57

Comparing models

Let's say we wanted to fit/compare a set of models for each content area

lm(Theta ~ asmtprmrydsbltycd)
lm(Theta ~ asmtprmrydsbltycd + asmtscndrydsbltycd)
lm(Theta ~ asmtprmrydsbltycd * asmtscndrydsbltycd)

8 / 57

Data pre-processing

The disability variables are stored as numbers, we need them as factors
We'll make the names easier in the process

d <- d %>%
    mutate(primary = as.factor(asmtprmrydsbltycd),
           secondary = as.factor(asmtscndrydsbltycd))

If you're interested in what the specific codes refer to, see here.

9 / 57

Split the data

The base method we've been using...

splt_content <- split(d, d$content)
str(splt_content)

## List of 5
##  $ ELA    : tibble[,27] [3,627 × 27] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:3627] 9466908 7683685 9025693 10099824 18886078 ...
##   ..$ grade               : int [1:3627] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:3627] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ content             : chr [1:3627] "ELA" "ELA" "ELA" "ELA" ...
##   ..$ testeventid         : num [1:3627] 148933 147875 143699 143962 150680 ...
##   ..$ asmtprmrydsbltycd   : num [1:3627] 0 10 40 82 10 80 50 10 50 82 ...
##   ..$ asmtscndrydsbltycd  : num [1:3627] 0 0 20 0 0 80 0 0 0 0 ...
##   ..$ Entry               : num [1:3627] 123 88 105 153 437 307 305 42 59 304 ...
##   ..$ Theta               : num [1:3627] 1.27 1.55 3.28 4.48 2.67 ...
##   ..$ Status              : num [1:3627] 1 1 1 1 1 0 1 1 1 0 ...
##   ..$ Count               : num [1:3627] 36 36 36 36 36 36 36 36 36 36 ...
##   ..$ RawScore            : num [1:3627] 23 25 33 35 31 36 34 18 3 36 ...
##   ..$ SE                  : num [1:3627] 0.371 0.385 0.619 1.023 0.501 ...
##   ..$ Infit               : num [1:3627] 0.93 0.95 0.9 0.93 0.92 1 1.06 1.55 0.85 1 ...
##   ..$ Infit_Z             : num [1:3627] -0.34 -0.37 -0.04 0.23 -0.18 0 0.31 2.56 0.06 0 ...
##   ..$ Outfit              : num [1:3627] 0.82 0.81 1.63 0.35 0.88 1 0.86 1.74 0.17 1 ...
##   ..$ Outfit_Z            : num [1:3627] -0.62 -0.56 1.03 -0.16 -0.12 0 0.17 3.31 -0.37 0 ...
##   ..$ Displacement        : num [1:3627] 0.0018 0.0019 0.0022 0.0023 0.0021 0.0024 0.0022 0.0017 0.0009 0.0024 ...
##   ..$ PointMeasureCorr    : num [1:3627] 0.42 0.42 0.3 0.27 0.31 0 0.14 -0.12 0.32 0 ...
##   ..$ Weight              : num [1:3627] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:3627] 75 80.6 91.7 97.2 86.1 100 94.4 50 97.2 100 ...
##   ..$ ExpectMatch         : num [1:3627] 68.3 72 91.7 97.2 86.1 100 94.4 65 97.2 100 ...
##   ..$ PointMeasureExpected: num [1:3627] 0.35 0.33 0.2 0.12 0.25 0 0.17 0.38 0.17 0 ...
##   ..$ RMSR                : num [1:3627] 0.42 0.39 0.26 0.16 0.31 0 0.23 0.51 0.14 0 ...
##   ..$ WMLE                : num [1:3627] 1.25 1.68 3.13 4 2.59 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 1 2 4 11 2 10 6 2 6 11 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 1 3 1 1 10 1 1 1 1 ...
##  $ Math   : tibble[,27] [3,629 × 27] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:3629] 10129634 10926496 10616063 10139443 8887381 ...
##   ..$ grade               : int [1:3629] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:3629] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ content             : chr [1:3629] "Math" "Math" "Math" "Math" ...
##   ..$ testeventid         : num [1:3629] 147564 151249 146976 151229 147676 ...
##   ..$ asmtprmrydsbltycd   : num [1:3629] 10 80 10 10 80 82 10 10 40 82 ...
##   ..$ asmtscndrydsbltycd  : num [1:3629] 0 90 0 0 10 80 50 0 50 0 ...
##   ..$ Entry               : num [1:3629] 161 344 321 167 101 131 141 357 14 367 ...
##   ..$ Theta               : num [1:3629] 0.514 1.965 0.899 1.965 -4.724 ...
##   ..$ Status              : num [1:3629] 1 1 1 1 -1 1 1 1 1 1 ...
##   ..$ Count               : num [1:3629] 36 36 36 36 36 36 36 36 36 36 ...
##   ..$ RawScore            : num [1:3629] 19 29 22 29 0 14 18 25 19 22 ...
##   ..$ SE                  : num [1:3629] 0.356 0.436 0.362 0.436 1.839 ...
##   ..$ Infit               : num [1:3629] 0.97 1.03 0.99 1 1 1.01 0.99 0.85 0.96 1.06 ...
##   ..$ Infit_Z             : num [1:3629] -0.18 0.26 -0.09 0.1 0 0.13 -0.06 -0.81 -0.39 0.5 ...
##   ..$ Outfit              : num [1:3629] 0.97 1.03 1.01 0.87 1 0.93 1 0.77 0.97 0.99 ...
##   ..$ Outfit_Z            : num [1:3629] -0.17 0.23 0.17 -0.46 0 -0.03 0.03 -0.87 -0.19 0.07 ...
##   ..$ Displacement        : num [1:3629] 5e-04 5e-04 5e-04 5e-04 -7e-04 4e-04 5e-04 5e-04 5e-04 5e-04 ...
##   ..$ PointMeasureCorr    : num [1:3629] 0.38 0.28 0.34 0.29 0 0.34 0.38 0.45 0.38 0.27 ...
##   ..$ Weight              : num [1:3629] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:3629] 66.7 80.6 66.7 80.6 100 61.1 69.4 75 63.9 61.1 ...
##   ..$ ExpectMatch         : num [1:3629] 64.4 80.5 66 80.5 100 68.3 64.6 70.3 64.4 66 ...
##   ..$ PointMeasureExpected: num [1:3629] 0.35 0.25 0.33 0.25 0 0.35 0.35 0.3 0.35 0.33 ...
##   ..$ RMSR                : num [1:3629] 0.46 0.39 0.45 0.39 0 0.46 0.46 0.4 0.46 0.48 ...
##   ..$ WMLE                : num [1:3629] 0.512 1.915 0.888 1.915 -4.724 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 2 10 2 2 10 11 2 2 4 11 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 12 1 1 2 10 6 1 6 1 ...
##  $ Rdg    : tibble[,27] [3,627 × 27] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:3627] 18631185 18342736 10897771 7663935 6709613 ...
##   ..$ grade               : int [1:3627] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:3627] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ content             : chr [1:3627] "Rdg" "Rdg" "Rdg" "Rdg" ...
##   ..$ testeventid         : num [1:3627] 145641 147717 145196 149014 146545 ...
##   ..$ asmtprmrydsbltycd   : num [1:3627] 10 90 10 80 0 10 10 80 82 82 ...
##   ..$ asmtscndrydsbltycd  : num [1:3627] 0 0 0 10 0 82 0 70 0 0 ...
##   ..$ Entry               : num [1:3627] 444 444 343 74 3 77 181 261 297 182 ...
##   ..$ Theta               : num [1:3627] 2.24 3.48 2.72 -2.72 0.12 -1.39 -0.67 4.73 4.73 4.73 ...
##   ..$ Status              : num [1:3627] 1 1 1 1 1 1 1 0 0 0 ...
##   ..$ Count               : num [1:3627] 19 19 19 19 19 19 19 19 19 19 ...
##   ..$ RawScore            : num [1:3627] 16 18 17 1 8 3 5 19 19 19 ...
##   ..$ SE                  : num [1:3627] 0.64 1.03 0.76 1.06 0.49 0.66 0.55 1.84 1.84 1.84 ...
##   ..$ Infit               : num [1:3627] 1.07 1.02 1.1 1 0.98 0.73 1.08 1 1 1 ...
##   ..$ Infit_Z             : num [1:3627] 0.29 0.31 0.53 0.29 -0.11 -0.54 0.37 0 0 0 ...
##   ..$ Outfit              : num [1:3627] 1.17 0.94 1.22 0.76 0.95 0.62 1.09 1 1 1 ...
##   ..$ Outfit_Z            : num [1:3627] 0.45 0.32 0.71 0 -0.22 -0.62 0.42 0 0 0 ...
##   ..$ Displacement        : num [1:3627] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ PointMeasureCorr    : num [1:3627] 0.01 0.08 0.05 0 0.32 0.68 0.2 0 0 0 ...
##   ..$ Weight              : num [1:3627] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:3627] 84.2 94.7 89.5 94.7 63.2 84.2 78.9 100 100 100 ...
##   ..$ ExpectMatch         : num [1:3627] 84.2 94.7 89.5 94.7 64.2 85.1 76.3 100 100 100 ...
##   ..$ PointMeasureExpected: num [1:3627] 0.17 0.1 0.14 0.24 0.3 0.31 0.32 0 0 0 ...
##   ..$ RMSR                : num [1:3627] 0.38 0.22 0.31 0.22 0.47 0.34 0.36 0 0 0 ...
##   ..$ WMLE                : num [1:3627] 2.11 3.02 2.5 -2.27 0.14 -0.9 -0.6 4.73 4.73 4.73 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 2 12 2 10 1 2 2 10 11 11 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 1 1 2 1 11 1 8 1 1 ...
##  $ Science: tibble[,27] [1,435 × 27] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:1435] 7617607 7642717 10341706 9811494 10469745 ...
##   ..$ grade               : int [1:1435] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:1435] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ content             : chr [1:1435] "Science" "Science" "Science" "Science" ...
##   ..$ testeventid         : num [1:1435] 148838 144146 149634 146456 144426 ...
##   ..$ asmtprmrydsbltycd   : num [1:1435] 10 82 80 10 90 10 10 80 80 10 ...
##   ..$ asmtscndrydsbltycd  : num [1:1435] 0 80 20 0 50 80 70 0 50 0 ...
##   ..$ Entry               : num [1:1435] 28 43 274 118 291 91 58 130 297 337 ...
##   ..$ Theta               : num [1:1435] 0.875 2.647 4.17 5.404 5.404 ...
##   ..$ Status              : num [1:1435] 1 1 1 0 0 1 1 1 1 1 ...
##   ..$ Count               : num [1:1435] 36 36 36 36 36 36 36 36 36 36 ...
##   ..$ RawScore            : num [1:1435] 22 32 35 36 36 26 2 34 34 28 ...
##   ..$ SE                  : num [1:1435] 0.361 0.544 1.021 1.835 1.835 ...
##   ..$ Infit               : num [1:1435] 1.3 0.94 1.05 1.04 0.99 0.9 0.97 0.99 0.98 0.9 ...
##   ..$ Infit_Z             : num [1:1435] 2.35 0.02 0.37 0.35 0 -0.62 0.17 0.15 0.16 -0.52 ...
##   ..$ Outfit              : num [1:1435] 1.22 0.76 1.5 1.2 0.48 0.85 0.69 1.57 0.67 0.8 ...
##   ..$ Outfit_Z            : num [1:1435] 1.87 -0.27 0.91 0.6 -0.16 -0.87 -0.41 0.74 0.01 -0.76 ...
##   ..$ Displacement        : num [1:1435] 0.0011 0.0016 0.0016 0.0027 0.0027 0.0013 0.0007 0.0016 0.0016 0.0014 ...
##   ..$ PointMeasureCorr    : num [1:1435] 0.07 0.3 -0.07 0 0.29 0.46 0.28 0.05 0.22 0.45 ...
##   ..$ Weight              : num [1:1435] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:1435] 55.6 88.9 97.2 100 100 75 94.4 94.4 94.4 80.6 ...
##   ..$ ExpectMatch         : num [1:1435] 66.1 88.9 97.2 100 100 73.2 94.4 94.4 94.4 77.9 ...
##   ..$ PointMeasureExpected: num [1:1435] 0.32 0.22 0.12 0 0 0.3 0.15 0.16 0.16 0.28 ...
##   ..$ RMSR                : num [1:1435] 0.52 0.3 0.17 0 0 0.39 0.22 0.22 0.22 0.36 ...
##   ..$ WMLE                : num [1:1435] 0.863 2.543 3.692 5.404 5.404 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 2 11 10 2 12 2 2 10 10 2 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 10 3 1 6 10 8 1 6 1 ...
##  $ Wri    : tibble[,27] [3,627 × 27] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:3627] 7653093 7640498 7650957 9305807 18060455 ...
##   ..$ grade               : int [1:3627] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:3627] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ content             : chr [1:3627] "Wri" "Wri" "Wri" "Wri" ...
##   ..$ testeventid         : num [1:3627] 148893 148868 144397 148988 144865 ...
##   ..$ asmtprmrydsbltycd   : num [1:3627] 10 10 90 0 82 10 80 0 10 82 ...
##   ..$ asmtscndrydsbltycd  : num [1:3627] 0 0 80 0 0 0 50 0 0 10 ...
##   ..$ Entry               : num [1:3627] 61 48 63 107 443 214 267 344 423 27 ...
##   ..$ Theta               : num [1:3627] 1.67 2.17 -1.56 5.01 2.17 0.38 0.79 5.01 -0.48 2.78 ...
##   ..$ Status              : num [1:3627] 1 1 1 0 1 1 1 0 1 1 ...
##   ..$ Count               : num [1:3627] 13 13 13 13 13 13 13 13 13 13 ...
##   ..$ RawScore            : num [1:3627] 9 10 2 13 10 6 7 13 4 11 ...
##   ..$ SE                  : num [1:3627] 0.69 0.74 0.83 1.87 0.74 0.64 0.65 1.87 0.68 0.84 ...
##   ..$ Infit               : num [1:3627] 0.52 0.65 0.61 1 1.01 0.46 0.47 1 0.9 0.88 ...
##   ..$ Infit_Z             : num [1:3627] -2.07 -1.02 -0.79 0 0.24 -2.3 -2.2 0 -0.16 -0.21 ...
##   ..$ Outfit              : num [1:3627] 0.44 0.52 0.33 1 0.74 0.41 0.4 1 1.32 0.54 ...
##   ..$ Outfit_Z            : num [1:3627] -1.65 -0.81 -0.47 0 -0.06 -1.88 -1.85 0 -0.14 -0.16 ...
##   ..$ Displacement        : num [1:3627] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ PointMeasureCorr    : num [1:3627] 0.86 0.73 0.62 0 0.51 0.87 0.87 0 0.56 0.56 ...
##   ..$ Weight              : num [1:3627] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:3627] 100 92.3 84.6 100 76.9 100 100 100 92.3 84.6 ...
##   ..$ ExpectMatch         : num [1:3627] 75.5 79.7 84.6 100 79.7 72.9 73 100 74.5 84.6 ...
##   ..$ PointMeasureExpected: num [1:3627] 0.48 0.44 0.34 0 0.44 0.5 0.51 0 0.45 0.38 ...
##   ..$ RMSR                : num [1:3627] 0.31 0.34 0.26 0 0.38 0.29 0.29 0 0.37 0.31 ...
##   ..$ WMLE                : num [1:3627] 1.61 2.08 -1.38 5.01 2.08 0.38 0.78 5.01 -0.43 2.61 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 2 2 12 1 11 2 10 1 2 11 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 1 10 1 1 1 6 1 1 2 ...

10 / 57

We could use this method

m1 <- map(
  splt_content, 
  ~lm(Theta ~ asmtprmrydsbltycd, data = .x)
)
m2 <- map(
  splt_content, 
  ~lm(Theta ~ asmtprmrydsbltycd + asmtscndrydsbltycd, 
      data = .x)
)
m3 <- map(
  splt_content, 
  ~lm(Theta ~ asmtprmrydsbltycd * asmtscndrydsbltycd, 
      data = .x)
)

Then conduct tests to see which model fit better, etc.

11 / 57

Alternative

Create a data frame with a list column

by_content <- d %>%
  group_by(content) %>% 
  nest()
by_content

## # A tibble: 5 x 2
## # Groups:   content [5]
##   content data                      
##   <chr>   <list>                    
## 1 ELA     <tibble[,26] [3,627 × 26]>
## 2 Math    <tibble[,26] [3,629 × 26]>
## 3 Rdg     <tibble[,26] [3,627 × 26]>
## 4 Science <tibble[,26] [1,435 × 26]>
## 5 Wri     <tibble[,26] [3,627 × 26]>

12 / 57

What's going on here?

str(by_content$data)

## List of 5
##  $ : tibble[,26] [3,627 × 26] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:3627] 9466908 7683685 9025693 10099824 18886078 ...
##   ..$ grade               : int [1:3627] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:3627] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ testeventid         : num [1:3627] 148933 147875 143699 143962 150680 ...
##   ..$ asmtprmrydsbltycd   : num [1:3627] 0 10 40 82 10 80 50 10 50 82 ...
##   ..$ asmtscndrydsbltycd  : num [1:3627] 0 0 20 0 0 80 0 0 0 0 ...
##   ..$ Entry               : num [1:3627] 123 88 105 153 437 307 305 42 59 304 ...
##   ..$ Theta               : num [1:3627] 1.27 1.55 3.28 4.48 2.67 ...
##   ..$ Status              : num [1:3627] 1 1 1 1 1 0 1 1 1 0 ...
##   ..$ Count               : num [1:3627] 36 36 36 36 36 36 36 36 36 36 ...
##   ..$ RawScore            : num [1:3627] 23 25 33 35 31 36 34 18 3 36 ...
##   ..$ SE                  : num [1:3627] 0.371 0.385 0.619 1.023 0.501 ...
##   ..$ Infit               : num [1:3627] 0.93 0.95 0.9 0.93 0.92 1 1.06 1.55 0.85 1 ...
##   ..$ Infit_Z             : num [1:3627] -0.34 -0.37 -0.04 0.23 -0.18 0 0.31 2.56 0.06 0 ...
##   ..$ Outfit              : num [1:3627] 0.82 0.81 1.63 0.35 0.88 1 0.86 1.74 0.17 1 ...
##   ..$ Outfit_Z            : num [1:3627] -0.62 -0.56 1.03 -0.16 -0.12 0 0.17 3.31 -0.37 0 ...
##   ..$ Displacement        : num [1:3627] 0.0018 0.0019 0.0022 0.0023 0.0021 0.0024 0.0022 0.0017 0.0009 0.0024 ...
##   ..$ PointMeasureCorr    : num [1:3627] 0.42 0.42 0.3 0.27 0.31 0 0.14 -0.12 0.32 0 ...
##   ..$ Weight              : num [1:3627] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:3627] 75 80.6 91.7 97.2 86.1 100 94.4 50 97.2 100 ...
##   ..$ ExpectMatch         : num [1:3627] 68.3 72 91.7 97.2 86.1 100 94.4 65 97.2 100 ...
##   ..$ PointMeasureExpected: num [1:3627] 0.35 0.33 0.2 0.12 0.25 0 0.17 0.38 0.17 0 ...
##   ..$ RMSR                : num [1:3627] 0.42 0.39 0.26 0.16 0.31 0 0.23 0.51 0.14 0 ...
##   ..$ WMLE                : num [1:3627] 1.25 1.68 3.13 4 2.59 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 1 2 4 11 2 10 6 2 6 11 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 1 3 1 1 10 1 1 1 1 ...
##  $ : tibble[,26] [3,629 × 26] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:3629] 10129634 10926496 10616063 10139443 8887381 ...
##   ..$ grade               : int [1:3629] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:3629] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ testeventid         : num [1:3629] 147564 151249 146976 151229 147676 ...
##   ..$ asmtprmrydsbltycd   : num [1:3629] 10 80 10 10 80 82 10 10 40 82 ...
##   ..$ asmtscndrydsbltycd  : num [1:3629] 0 90 0 0 10 80 50 0 50 0 ...
##   ..$ Entry               : num [1:3629] 161 344 321 167 101 131 141 357 14 367 ...
##   ..$ Theta               : num [1:3629] 0.514 1.965 0.899 1.965 -4.724 ...
##   ..$ Status              : num [1:3629] 1 1 1 1 -1 1 1 1 1 1 ...
##   ..$ Count               : num [1:3629] 36 36 36 36 36 36 36 36 36 36 ...
##   ..$ RawScore            : num [1:3629] 19 29 22 29 0 14 18 25 19 22 ...
##   ..$ SE                  : num [1:3629] 0.356 0.436 0.362 0.436 1.839 ...
##   ..$ Infit               : num [1:3629] 0.97 1.03 0.99 1 1 1.01 0.99 0.85 0.96 1.06 ...
##   ..$ Infit_Z             : num [1:3629] -0.18 0.26 -0.09 0.1 0 0.13 -0.06 -0.81 -0.39 0.5 ...
##   ..$ Outfit              : num [1:3629] 0.97 1.03 1.01 0.87 1 0.93 1 0.77 0.97 0.99 ...
##   ..$ Outfit_Z            : num [1:3629] -0.17 0.23 0.17 -0.46 0 -0.03 0.03 -0.87 -0.19 0.07 ...
##   ..$ Displacement        : num [1:3629] 5e-04 5e-04 5e-04 5e-04 -7e-04 4e-04 5e-04 5e-04 5e-04 5e-04 ...
##   ..$ PointMeasureCorr    : num [1:3629] 0.38 0.28 0.34 0.29 0 0.34 0.38 0.45 0.38 0.27 ...
##   ..$ Weight              : num [1:3629] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:3629] 66.7 80.6 66.7 80.6 100 61.1 69.4 75 63.9 61.1 ...
##   ..$ ExpectMatch         : num [1:3629] 64.4 80.5 66 80.5 100 68.3 64.6 70.3 64.4 66 ...
##   ..$ PointMeasureExpected: num [1:3629] 0.35 0.25 0.33 0.25 0 0.35 0.35 0.3 0.35 0.33 ...
##   ..$ RMSR                : num [1:3629] 0.46 0.39 0.45 0.39 0 0.46 0.46 0.4 0.46 0.48 ...
##   ..$ WMLE                : num [1:3629] 0.512 1.915 0.888 1.915 -4.724 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 2 10 2 2 10 11 2 2 4 11 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 12 1 1 2 10 6 1 6 1 ...
##  $ : tibble[,26] [3,627 × 26] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:3627] 18631185 18342736 10897771 7663935 6709613 ...
##   ..$ grade               : int [1:3627] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:3627] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ testeventid         : num [1:3627] 145641 147717 145196 149014 146545 ...
##   ..$ asmtprmrydsbltycd   : num [1:3627] 10 90 10 80 0 10 10 80 82 82 ...
##   ..$ asmtscndrydsbltycd  : num [1:3627] 0 0 0 10 0 82 0 70 0 0 ...
##   ..$ Entry               : num [1:3627] 444 444 343 74 3 77 181 261 297 182 ...
##   ..$ Theta               : num [1:3627] 2.24 3.48 2.72 -2.72 0.12 -1.39 -0.67 4.73 4.73 4.73 ...
##   ..$ Status              : num [1:3627] 1 1 1 1 1 1 1 0 0 0 ...
##   ..$ Count               : num [1:3627] 19 19 19 19 19 19 19 19 19 19 ...
##   ..$ RawScore            : num [1:3627] 16 18 17 1 8 3 5 19 19 19 ...
##   ..$ SE                  : num [1:3627] 0.64 1.03 0.76 1.06 0.49 0.66 0.55 1.84 1.84 1.84 ...
##   ..$ Infit               : num [1:3627] 1.07 1.02 1.1 1 0.98 0.73 1.08 1 1 1 ...
##   ..$ Infit_Z             : num [1:3627] 0.29 0.31 0.53 0.29 -0.11 -0.54 0.37 0 0 0 ...
##   ..$ Outfit              : num [1:3627] 1.17 0.94 1.22 0.76 0.95 0.62 1.09 1 1 1 ...
##   ..$ Outfit_Z            : num [1:3627] 0.45 0.32 0.71 0 -0.22 -0.62 0.42 0 0 0 ...
##   ..$ Displacement        : num [1:3627] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ PointMeasureCorr    : num [1:3627] 0.01 0.08 0.05 0 0.32 0.68 0.2 0 0 0 ...
##   ..$ Weight              : num [1:3627] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:3627] 84.2 94.7 89.5 94.7 63.2 84.2 78.9 100 100 100 ...
##   ..$ ExpectMatch         : num [1:3627] 84.2 94.7 89.5 94.7 64.2 85.1 76.3 100 100 100 ...
##   ..$ PointMeasureExpected: num [1:3627] 0.17 0.1 0.14 0.24 0.3 0.31 0.32 0 0 0 ...
##   ..$ RMSR                : num [1:3627] 0.38 0.22 0.31 0.22 0.47 0.34 0.36 0 0 0 ...
##   ..$ WMLE                : num [1:3627] 2.11 3.02 2.5 -2.27 0.14 -0.9 -0.6 4.73 4.73 4.73 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 2 12 2 10 1 2 2 10 11 11 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 1 1 2 1 11 1 8 1 1 ...
##  $ : tibble[,26] [1,435 × 26] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:1435] 7617607 7642717 10341706 9811494 10469745 ...
##   ..$ grade               : int [1:1435] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:1435] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ testeventid         : num [1:1435] 148838 144146 149634 146456 144426 ...
##   ..$ asmtprmrydsbltycd   : num [1:1435] 10 82 80 10 90 10 10 80 80 10 ...
##   ..$ asmtscndrydsbltycd  : num [1:1435] 0 80 20 0 50 80 70 0 50 0 ...
##   ..$ Entry               : num [1:1435] 28 43 274 118 291 91 58 130 297 337 ...
##   ..$ Theta               : num [1:1435] 0.875 2.647 4.17 5.404 5.404 ...
##   ..$ Status              : num [1:1435] 1 1 1 0 0 1 1 1 1 1 ...
##   ..$ Count               : num [1:1435] 36 36 36 36 36 36 36 36 36 36 ...
##   ..$ RawScore            : num [1:1435] 22 32 35 36 36 26 2 34 34 28 ...
##   ..$ SE                  : num [1:1435] 0.361 0.544 1.021 1.835 1.835 ...
##   ..$ Infit               : num [1:1435] 1.3 0.94 1.05 1.04 0.99 0.9 0.97 0.99 0.98 0.9 ...
##   ..$ Infit_Z             : num [1:1435] 2.35 0.02 0.37 0.35 0 -0.62 0.17 0.15 0.16 -0.52 ...
##   ..$ Outfit              : num [1:1435] 1.22 0.76 1.5 1.2 0.48 0.85 0.69 1.57 0.67 0.8 ...
##   ..$ Outfit_Z            : num [1:1435] 1.87 -0.27 0.91 0.6 -0.16 -0.87 -0.41 0.74 0.01 -0.76 ...
##   ..$ Displacement        : num [1:1435] 0.0011 0.0016 0.0016 0.0027 0.0027 0.0013 0.0007 0.0016 0.0016 0.0014 ...
##   ..$ PointMeasureCorr    : num [1:1435] 0.07 0.3 -0.07 0 0.29 0.46 0.28 0.05 0.22 0.45 ...
##   ..$ Weight              : num [1:1435] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:1435] 55.6 88.9 97.2 100 100 75 94.4 94.4 94.4 80.6 ...
##   ..$ ExpectMatch         : num [1:1435] 66.1 88.9 97.2 100 100 73.2 94.4 94.4 94.4 77.9 ...
##   ..$ PointMeasureExpected: num [1:1435] 0.32 0.22 0.12 0 0 0.3 0.15 0.16 0.16 0.28 ...
##   ..$ RMSR                : num [1:1435] 0.52 0.3 0.17 0 0 0.39 0.22 0.22 0.22 0.36 ...
##   ..$ WMLE                : num [1:1435] 0.863 2.543 3.692 5.404 5.404 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 2 11 10 2 12 2 2 10 10 2 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 10 3 1 6 10 8 1 6 1 ...
##  $ : tibble[,26] [3,627 × 26] (S3: tbl_df/tbl/data.frame)
##   ..$ ssid                : num [1:3627] 7653093 7640498 7650957 9305807 18060455 ...
##   ..$ grade               : int [1:3627] 11 11 11 11 11 11 11 11 11 11 ...
##   ..$ year                : int [1:3627] 18 18 18 18 18 18 18 18 18 18 ...
##   ..$ testeventid         : num [1:3627] 148893 148868 144397 148988 144865 ...
##   ..$ asmtprmrydsbltycd   : num [1:3627] 10 10 90 0 82 10 80 0 10 82 ...
##   ..$ asmtscndrydsbltycd  : num [1:3627] 0 0 80 0 0 0 50 0 0 10 ...
##   ..$ Entry               : num [1:3627] 61 48 63 107 443 214 267 344 423 27 ...
##   ..$ Theta               : num [1:3627] 1.67 2.17 -1.56 5.01 2.17 0.38 0.79 5.01 -0.48 2.78 ...
##   ..$ Status              : num [1:3627] 1 1 1 0 1 1 1 0 1 1 ...
##   ..$ Count               : num [1:3627] 13 13 13 13 13 13 13 13 13 13 ...
##   ..$ RawScore            : num [1:3627] 9 10 2 13 10 6 7 13 4 11 ...
##   ..$ SE                  : num [1:3627] 0.69 0.74 0.83 1.87 0.74 0.64 0.65 1.87 0.68 0.84 ...
##   ..$ Infit               : num [1:3627] 0.52 0.65 0.61 1 1.01 0.46 0.47 1 0.9 0.88 ...
##   ..$ Infit_Z             : num [1:3627] -2.07 -1.02 -0.79 0 0.24 -2.3 -2.2 0 -0.16 -0.21 ...
##   ..$ Outfit              : num [1:3627] 0.44 0.52 0.33 1 0.74 0.41 0.4 1 1.32 0.54 ...
##   ..$ Outfit_Z            : num [1:3627] -1.65 -0.81 -0.47 0 -0.06 -1.88 -1.85 0 -0.14 -0.16 ...
##   ..$ Displacement        : num [1:3627] 0 0 0 0 0 0 0 0 0 0 ...
##   ..$ PointMeasureCorr    : num [1:3627] 0.86 0.73 0.62 0 0.51 0.87 0.87 0 0.56 0.56 ...
##   ..$ Weight              : num [1:3627] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ObservMatch         : num [1:3627] 100 92.3 84.6 100 76.9 100 100 100 92.3 84.6 ...
##   ..$ ExpectMatch         : num [1:3627] 75.5 79.7 84.6 100 79.7 72.9 73 100 74.5 84.6 ...
##   ..$ PointMeasureExpected: num [1:3627] 0.48 0.44 0.34 0 0.44 0.5 0.51 0 0.45 0.38 ...
##   ..$ RMSR                : num [1:3627] 0.31 0.34 0.26 0 0.38 0.29 0.29 0 0.37 0.31 ...
##   ..$ WMLE                : num [1:3627] 1.61 2.08 -1.38 5.01 2.08 0.38 0.78 5.01 -0.43 2.61 ...
##   ..$ primary             : Factor w/ 12 levels "0","10","20",..: 2 2 12 1 11 2 10 1 2 11 ...
##   ..$ secondary           : Factor w/ 12 levels "0","10","20",..: 1 1 10 1 1 1 6 1 1 2 ...

13 / 57

Explore a bit

map_dbl(by_content$data, nrow)

## [1] 3627 3629 3627 1435 3627

map_dbl(by_content$data, ncol)

## [1] 26 26 26 26 26

map_dbl(by_content$data, ~mean(.x$Theta))

## [1]  1.28001056 -0.06683086  1.37068376  1.57850321  1.26090709

14 / 57

It's a data frame!

We can add these summaries if we want

by_content %>%
    mutate(n = map_dbl(data, nrow))

## # A tibble: 5 x 3
## # Groups:   content [5]
##   content data                           n
##   <chr>   <list>                     <dbl>
## 1 ELA     <tibble[,26] [3,627 × 26]>  3627
## 2 Math    <tibble[,26] [3,629 × 26]>  3629
## 3 Rdg     <tibble[,26] [3,627 × 26]>  3627
## 4 Science <tibble[,26] [1,435 × 26]>  1435
## 5 Wri     <tibble[,26] [3,627 × 26]>  3627

15 / 57

`map_*`

Note on the previous example we used map_dbl and we got a vector in return.
What would happen if we just used map?

16 / 57

`map_*`

Note on the previous example we used map_dbl and we got a vector in return.
What would happen if we just used map?

by_content %>%
    mutate(n = map(data, nrow))

## # A tibble: 5 x 3
## # Groups:   content [5]
##   content data                       n        
##   <chr>   <list>                     <list>   
## 1 ELA     <tibble[,26] [3,627 × 26]> <int [1]>
## 2 Math    <tibble[,26] [3,629 × 26]> <int [1]>
## 3 Rdg     <tibble[,26] [3,627 × 26]> <int [1]>
## 4 Science <tibble[,26] [1,435 × 26]> <int [1]>
## 5 Wri     <tibble[,26] [3,627 × 26]> <int [1]>

16 / 57

Let's fit a model!

by_content %>%
    mutate(m1 = map(data, ~lm(Theta ~ primary, data = .x)))

## # A tibble: 5 x 3
## # Groups:   content [5]
##   content data                       m1    
##   <chr>   <list>                     <list>
## 1 ELA     <tibble[,26] [3,627 × 26]> <lm>  
## 2 Math    <tibble[,26] [3,629 × 26]> <lm>  
## 3 Rdg     <tibble[,26] [3,627 × 26]> <lm>  
## 4 Science <tibble[,26] [1,435 × 26]> <lm>  
## 5 Wri     <tibble[,26] [3,627 × 26]> <lm>

17 / 57

Extract the coefficients

by_content %>%
  mutate(
    m1 = map(data, ~lm(Theta ~ primary, data = .x)),
    coefs = map(m1, coef)
)

## # A tibble: 5 x 4
## # Groups:   content [5]
##   content data                       m1     coefs     
##   <chr>   <list>                     <list> <list>    
## 1 ELA     <tibble[,26] [3,627 × 26]> <lm>   <dbl [11]>
## 2 Math    <tibble[,26] [3,629 × 26]> <lm>   <dbl [12]>
## 3 Rdg     <tibble[,26] [3,627 × 26]> <lm>   <dbl [11]>
## 4 Science <tibble[,26] [1,435 × 26]> <lm>   <dbl [12]>
## 5 Wri     <tibble[,26] [3,627 × 26]> <lm>   <dbl [12]>

18 / 57

Challenge

Continue with the above, but output a data frame with three columns: content, intercept, and TBI (which is code 74).
In other words, output the mean score for students who were coded as not having a disability (code 0), along with students coded as having TBI.

04:00

19 / 57

by_content %>%
  mutate(
    m1 = map(data, ~lm(Theta ~ primary, data = .x)),
    coefs = map(m1, coef),
    no_disab = map_dbl(coefs, 1),
    tbi = no_disab + map_dbl(coefs, "primary74")
  ) %>%
  select(content, no_disab, tbi)

## # A tibble: 5 x 3
## # Groups:   content [5]
##   content   no_disab       tbi
##   <chr>        <dbl>     <dbl>
## 1 ELA      0.9322336 0.1674462
## 2 Math    -0.1587907 0.1910821
## 3 Rdg      1.363101  1.629048 
## 4 Science  1.491319  2.790971 
## 5 Wri      1.571441  1.167429

20 / 57

by_content %>%
  mutate(
    m1 = map(data, ~lm(Theta ~ primary, data = .x)),
    coefs = map(m1, coef),
    no_disab = map_dbl(coefs, 1),
    tbi = no_disab + map_dbl(coefs, "primary74")
  ) %>%
  select(content, no_disab, tbi)

## # A tibble: 5 x 3
## # Groups:   content [5]
##   content   no_disab       tbi
##   <chr>        <dbl>     <dbl>
## 1 ELA      0.9322336 0.1674462
## 2 Math    -0.1587907 0.1910821
## 3 Rdg      1.363101  1.629048 
## 4 Science  1.491319  2.790971 
## 5 Wri      1.571441  1.167429

Note - I wouldn't have neccesarily expected you to add no_disab to the TBI coefficient.

20 / 57

Compare models

Back to our original task - fit all three models

You try first

lm(Theta ~ primary)
lm(Theta ~ primary + secondary)
lm(Theta ~ primary + secondary + primary:secondary)

04:00

21 / 57

Model fits

mods <- by_content %>%
  mutate(
    m1 = map(data, ~lm(Theta ~ primary, data = .x)),
    m2 = map(data, ~lm(Theta ~ primary + secondary, data = .x)),
    m3 = map(data, ~lm(Theta ~ primary * secondary, data = .x))
)
mods

## # A tibble: 5 x 5
## # Groups:   content [5]
##   content data                       m1     m2     m3    
##   <chr>   <list>                     <list> <list> <list>
## 1 ELA     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>  
## 2 Math    <tibble[,26] [3,629 × 26]> <lm>   <lm>   <lm>  
## 3 Rdg     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>  
## 4 Science <tibble[,26] [1,435 × 26]> <lm>   <lm>   <lm>  
## 5 Wri     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>

22 / 57

Brief foray into parallel iterations

The stats::anova function can compare the fit of two models

23 / 57

Brief foray into parallel iterations

The stats::anova function can compare the fit of two models

Pop Quiz

How would we extract just ELA model 1 and 2?

23 / 57

Brief foray into parallel iterations

The stats::anova function can compare the fit of two models

Pop Quiz

How would we extract just ELA model 1 and 2?

mods$m1[[1]]

## 
## Call:
## lm(formula = Theta ~ primary, data = .x)
## 
## Coefficients:
## (Intercept)    primary10    primary20    primary40    primary50    primary60  
##     0.93223      0.38570     -0.03168     -1.84434      1.17372      0.77625  
##   primary70    primary74    primary80    primary82    primary90  
##    -1.83026     -0.76479      0.46765      0.33825      1.47936

mods$m2[[1]]

## 
## Call:
## lm(formula = Theta ~ primary + secondary, data = .x)
## 
## Coefficients:
## (Intercept)    primary10    primary20    primary40    primary50    primary60  
##      1.0043       0.4285       0.2807      -1.6282       1.1147       0.8676  
##   primary70    primary74    primary80    primary82    primary90  secondary10  
##     -1.4430      -0.1168       0.5283       0.3298       1.4101      -0.5685  
## secondary20  secondary40  secondary43  secondary50  secondary60  secondary70  
##     -1.0558      -2.2707      -1.7943       0.2484       0.4831      -1.3336  
## secondary74  secondary80  secondary82  secondary90  
##     -6.3695      -0.5142      -0.7259       1.6000

23 / 57

Which fits better?

compare <- anova(mods$m1[[1]], mods$m2[[1]])
compare

## Analysis of Variance Table
## 
## Model 1: Theta ~ primary
## Model 2: Theta ~ primary + secondary
##   Res.Df   RSS Df Sum of Sq      F    Pr(>F)    
## 1   3616 20905                                  
## 2   3605 20100 11    804.26 13.113 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

24 / 57

`map2`

Works the same as map but iterates over two vectors concurrently
Let's compare model 1 and 2

25 / 57

`map2`

Works the same as map but iterates over two vectors concurrently
Let's compare model 1 and 2

mods %>%
  mutate(comp12 = map2(m1, m2, anova))

## # A tibble: 5 x 6
## # Groups:   content [5]
##   content data                       m1     m2     m3     comp12             
##   <chr>   <list>                     <list> <list> <list> <list>             
## 1 ELA     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>
## 2 Math    <tibble[,26] [3,629 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>
## 3 Rdg     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>
## 4 Science <tibble[,26] [1,435 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>
## 5 Wri     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>

25 / 57

`map2`

Works the same as map but iterates over two vectors concurrently
Let's compare model 1 and 2

mods %>%
  mutate(comp12 = map2(m1, m2, anova))

## # A tibble: 5 x 6
## # Groups:   content [5]
##   content data                       m1     m2     m3     comp12             
##   <chr>   <list>                     <list> <list> <list> <list>             
## 1 ELA     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>
## 2 Math    <tibble[,26] [3,629 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>
## 3 Rdg     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>
## 4 Science <tibble[,26] [1,435 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>
## 5 Wri     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6]>

Perhaps not terrifically helpful

25 / 57

Back to our `anova` object

Can we pull out useful things?

str(compare)

## Classes 'anova' and 'data.frame':    2 obs. of  6 variables:
##  $ Res.Df   : num  3616 3605
##  $ RSS      : num  20905 20100
##  $ Df       : num  NA 11
##  $ Sum of Sq: num  NA 804
##  $ F        : num  NA 13.1
##  $ Pr(>F)   : num  NA 7.66e-25
##  - attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Model 1: Theta ~ primary\nModel 2: Theta ~ primary + secondary"

26 / 57

Back to our `anova` object

Can we pull out useful things?

str(compare)

## Classes 'anova' and 'data.frame':    2 obs. of  6 variables:
##  $ Res.Df   : num  3616 3605
##  $ RSS      : num  20905 20100
##  $ Df       : num  NA 11
##  $ Sum of Sq: num  NA 804
##  $ F        : num  NA 13.1
##  $ Pr(>F)   : num  NA 7.66e-25
##  - attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Model 1: Theta ~ primary\nModel 2: Theta ~ primary + secondary"

Try pulling out the $p$ value

26 / 57

Extract $p$ value

Note - I'd recommend looking at more than just a p-value, but I do think this is useful for a quick glance

compare$`Pr(>F)`

## [1]           NA 7.663566e-25

compare[["Pr(>F)"]]

## [1]           NA 7.663566e-25

27 / 57

Extract $p$ value

Note - I'd recommend looking at more than just a p-value, but I do think this is useful for a quick glance

compare$`Pr(>F)`

## [1]           NA 7.663566e-25

compare[["Pr(>F)"]]

## [1]           NA 7.663566e-25

compare$`Pr(>F)`[2]

## [1] 7.663566e-25

compare[["Pr(>F)"]][2]

## [1] 7.663566e-25

27 / 57

All p-values

Note - this is probably the most compact syntax, but that doesn't mean it's the most clear

mods %>%
  mutate(comp12 = map2(m1, m2, anova),
         p12 = map_dbl(comp12, list("Pr(>F)", 2)))

## # A tibble: 5 x 7
## # Groups:   content [5]
##   content data                       m1     m2     m3     comp12                      p12
##   <chr>   <list>                     <list> <list> <list> <list>                    <dbl>
## 1 ELA     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 7.663566e-25
## 2 Math    <tibble[,26] [3,629 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 1.724262e-22
## 3 Rdg     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 1.527172e-28
## 4 Science <tibble[,26] [1,435 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 4.685885e-18
## 5 Wri     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 5.785623e-11

28 / 57

Slight alternative

Write a function that pulls the p-value from model comparison objects

extract_p <- function(anova_ob) {
  anova_ob[["Pr(>F)"]][2]
}

29 / 57

Slight alternative

Write a function that pulls the p-value from model comparison objects

extract_p <- function(anova_ob) {
  anova_ob[["Pr(>F)"]][2]
}

Loop this function through the anova objects

29 / 57

mods %>%
  mutate(comp12 = map2(m1, m2, anova),
         p12 = map_dbl(comp12, extract_p))

## # A tibble: 5 x 7
## # Groups:   content [5]
##   content data                       m1     m2     m3     comp12                      p12
##   <chr>   <list>                     <list> <list> <list> <list>                    <dbl>
## 1 ELA     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 7.663566e-25
## 2 Math    <tibble[,26] [3,629 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 1.724262e-22
## 3 Rdg     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 1.527172e-28
## 4 Science <tibble[,26] [1,435 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 4.685885e-18
## 5 Wri     <tibble[,26] [3,627 × 26]> <lm>   <lm>   <lm>   <anova[,6] [2 × 6… 5.785623e-11

30 / 57

An alternative

Conducting operations by row

31 / 57

Operations by row

The dplyr::rowwise() function fundamentally changes the way a tibble() behaves

df <- tibble(name = c("Me", "You"), x = 1:2, y = 3:4, z = 5:6)

df %>% 
  mutate(m = mean(c(x, y, z)))

## # A tibble: 2 x 5
##   name      x     y     z     m
##   <chr> <int> <int> <int> <dbl>
## 1 Me        1     3     5   3.5
## 2 You       2     4     6   3.5

df %>% 
  rowwise() %>% 
  mutate(m = mean(c(x, y, z)))

## # A tibble: 2 x 5
## # Rowwise: 
##   name      x     y     z     m
##   <chr> <int> <int> <int> <dbl>
## 1 Me        1     3     5     3
## 2 You       2     4     6     4

32 / 57

Add a group & summarize

df %>% 
  rowwise(name) %>% 
  summarize(m = mean(c(x, y, z)))

## # A tibble: 2 x 2
## # Groups:   name [2]
##   name      m
##   <chr> <dbl>
## 1 Me        3
## 2 You       4

33 / 57

List columns

If you apply rowwise operation with a list column, you don't have to loop

df <- tibble(x = list(1, 2:3, 4:6))

df %>% 
  mutate(
    l = map_int(x, length)
  )

## # A tibble: 3 x 2
##   x             l
##   <list>    <int>
## 1 <dbl [1]>     1
## 2 <int [2]>     2
## 3 <int [3]>     3

df %>% 
  rowwise() %>% 
  mutate(l = length(x))

## # A tibble: 3 x 2
## # Rowwise: 
##   x             l
##   <list>    <int>
## 1 <dbl [1]>     1
## 2 <int [2]>     2
## 3 <int [3]>     3

34 / 57

Creating list columns

You can use the dplyr::nest_by() function to create a list column for each group, and convert it to a rowwise data frame.

35 / 57

Creating list columns

You can use the dplyr::nest_by() function to create a list column for each group, and convert it to a rowwise data frame.

d %>% 
  nest_by(content)

## # A tibble: 5 x 2
## # Rowwise:  content
##   content                data
##   <chr>   <list<tibble[,26]>>
## 1 ELA            [3,627 × 26]
## 2 Math           [3,629 × 26]
## 3 Rdg            [3,627 × 26]
## 4 Science        [1,435 × 26]
## 5 Wri            [3,627 × 26]

35 / 57

Challenge

Given what we just learned, can you fit a model of the form Theta ~ primary to each content area (i.e., not using {purrr})?

02:00

36 / 57

Challenge

Given what we just learned, can you fit a model of the form Theta ~ primary to each content area (i.e., not using {purrr})?

02:00

Wrap it in list() (should suggest this in the error reporting if you don't)

d %>% 
  nest_by(content) %>% 
  mutate(m1 = list(lm(Theta ~ primary, data = data)))

## # A tibble: 5 x 3
## # Rowwise:  content
##   content                data m1    
##   <chr>   <list<tibble[,26]>> <list>
## 1 ELA            [3,627 × 26] <lm>  
## 2 Math           [3,629 × 26] <lm>  
## 3 Rdg            [3,627 × 26] <lm>  
## 4 Science        [1,435 × 26] <lm>  
## 5 Wri            [3,627 × 26] <lm>

36 / 57

Challenge 2

Can you extend it further and extract the coefficients with coef? What about creating a new column that has the intercept values?

02:00

37 / 57

Challenge 2

Can you extend it further and extract the coefficients with coef? What about creating a new column that has the intercept values?

02:00

d %>% 
  nest_by(content) %>% 
  mutate(m1 = list(lm(Theta ~ primary, data = data)),
         coefs = list(coef(m1)))

## # A tibble: 5 x 4
## # Rowwise:  content
##   content                data m1     coefs     
##   <chr>   <list<tibble[,26]>> <list> <list>    
## 1 ELA            [3,627 × 26] <lm>   <dbl [11]>
## 2 Math           [3,629 × 26] <lm>   <dbl [12]>
## 3 Rdg            [3,627 × 26] <lm>   <dbl [11]>
## 4 Science        [1,435 × 26] <lm>   <dbl [12]>
## 5 Wri            [3,627 × 26] <lm>   <dbl [12]>

37 / 57

Return atomic vectors

d %>% 
  nest_by(content) %>% 
  mutate(m1 = list(lm(Theta ~ primary, data = data)),
         intercept = coef(m1)[1])

## # A tibble: 5 x 4
## # Rowwise:  content
##   content                data m1      intercept
##   <chr>   <list<tibble[,26]>> <list>      <dbl>
## 1 ELA            [3,627 × 26] <lm>    0.9322336
## 2 Math           [3,629 × 26] <lm>   -0.1587907
## 3 Rdg            [3,627 × 26] <lm>    1.363101 
## 4 Science        [1,435 × 26] <lm>    1.491319 
## 5 Wri            [3,627 × 26] <lm>    1.571441

38 / 57

Fit all models

The below gets us the same results we got before

mods2 <- d %>%
  nest_by(content) %>% 
  mutate(
    m1 = list(lm(Theta ~ primary, data = data)),
    m2 = list(lm(Theta ~ primary + secondary, data = data)),
    m3 = list(lm(Theta ~ primary * secondary, data = data))
)
mods2

## # A tibble: 5 x 5
## # Rowwise:  content
##   content                data m1     m2     m3    
##   <chr>   <list<tibble[,26]>> <list> <list> <list>
## 1 ELA            [3,627 × 26] <lm>   <lm>   <lm>  
## 2 Math           [3,629 × 26] <lm>   <lm>   <lm>  
## 3 Rdg            [3,627 × 26] <lm>   <lm>   <lm>  
## 4 Science        [1,435 × 26] <lm>   <lm>   <lm>  
## 5 Wri            [3,627 × 26] <lm>   <lm>   <lm>

39 / 57

Look at all $R^{2}$

It's a normal data frame!

mods %>%
  pivot_longer(
    m1:m3,
    names_to = "model",
    values_to = "output"
)

## # A tibble: 15 x 4
## # Groups:   content [5]
##    content data                       model output
##    <chr>   <list>                     <chr> <list>
##  1 ELA     <tibble[,26] [3,627 × 26]> m1    <lm>  
##  2 ELA     <tibble[,26] [3,627 × 26]> m2    <lm>  
##  3 ELA     <tibble[,26] [3,627 × 26]> m3    <lm>  
##  4 Math    <tibble[,26] [3,629 × 26]> m1    <lm>  
##  5 Math    <tibble[,26] [3,629 × 26]> m2    <lm>  
##  6 Math    <tibble[,26] [3,629 × 26]> m3    <lm>  
##  7 Rdg     <tibble[,26] [3,627 × 26]> m1    <lm>  
##  8 Rdg     <tibble[,26] [3,627 × 26]> m2    <lm>  
##  9 Rdg     <tibble[,26] [3,627 × 26]> m3    <lm>  
## 10 Science <tibble[,26] [1,435 × 26]> m1    <lm>  
## 11 Science <tibble[,26] [1,435 × 26]> m2    <lm>  
## 12 Science <tibble[,26] [1,435 × 26]> m3    <lm>  
## 13 Wri     <tibble[,26] [3,627 × 26]> m1    <lm>  
## 14 Wri     <tibble[,26] [3,627 × 26]> m2    <lm>  
## 15 Wri     <tibble[,26] [3,627 × 26]> m3    <lm>

40 / 57

Extract all $R^{2}$

Note - might want to write a function here again

r2 <- mods %>%
  pivot_longer(
    m1:m3,
    names_to = "model",
    values_to = "output"
  ) %>% 
  mutate(r2 = map_dbl(output, ~summary(.x)$r.squared))
r2

## # A tibble: 15 x 5
## # Groups:   content [5]
##    content data                       model output         r2
##    <chr>   <list>                     <chr> <list>      <dbl>
##  1 ELA     <tibble[,26] [3,627 × 26]> m1    <lm>   0.04517421
##  2 ELA     <tibble[,26] [3,627 × 26]> m2    <lm>   0.08190917
##  3 ELA     <tibble[,26] [3,627 × 26]> m3    <lm>   0.1161187 
##  4 Math    <tibble[,26] [3,629 × 26]> m1    <lm>   0.05326550
##  5 Math    <tibble[,26] [3,629 × 26]> m2    <lm>   0.08675264
##  6 Math    <tibble[,26] [3,629 × 26]> m3    <lm>   0.1185931 
##  7 Rdg     <tibble[,26] [3,627 × 26]> m1    <lm>   0.04805713
##  8 Rdg     <tibble[,26] [3,627 × 26]> m2    <lm>   0.08926212
##  9 Rdg     <tibble[,26] [3,627 × 26]> m3    <lm>   0.1217497 
## 10 Science <tibble[,26] [1,435 × 26]> m1    <lm>   0.08683581
## 11 Science <tibble[,26] [1,435 × 26]> m2    <lm>   0.1522437 
## 12 Science <tibble[,26] [1,435 × 26]> m3    <lm>   0.2170660 
## 13 Wri     <tibble[,26] [3,627 × 26]> m1    <lm>   0.05171555
## 14 Wri     <tibble[,26] [3,627 × 26]> m2    <lm>   0.06977688
## 15 Wri     <tibble[,26] [3,627 × 26]> m3    <lm>   0.09820067

41 / 57

Plot

ggplot(r2, aes(model, r2)) +
    geom_col(aes(fill = model)) +
    facet_wrap(~content) +
    guides(fill = "none") +
    scale_fill_brewer(palette = "Set2")

42 / 57

Unnesting

Sometimes you just want to unnest

43 / 57

Unnesting

Sometimes you just want to unnest
Imagine we want to plot the coefficients by model... how?

43 / 57

Unnesting

Sometimes you just want to unnest
Imagine we want to plot the coefficients by model... how?
broom::tidy() => tidyr::unnest()

43 / 57

Tidy

mods %>%
    pivot_longer(
      m1:m3,
      names_to = "model", 
      values_to = "output"
  ) %>%
    mutate(tidied = map(output, broom::tidy))

## # A tibble: 15 x 5
## # Groups:   content [5]
##    content data                       model output tidied                
##    <chr>   <list>                     <chr> <list> <list>                
##  1 ELA     <tibble[,26] [3,627 × 26]> m1    <lm>   <tibble[,5] [11 × 5]> 
##  2 ELA     <tibble[,26] [3,627 × 26]> m2    <lm>   <tibble[,5] [22 × 5]> 
##  3 ELA     <tibble[,26] [3,627 × 26]> m3    <lm>   <tibble[,5] [132 × 5]>
##  4 Math    <tibble[,26] [3,629 × 26]> m1    <lm>   <tibble[,5] [12 × 5]> 
##  5 Math    <tibble[,26] [3,629 × 26]> m2    <lm>   <tibble[,5] [23 × 5]> 
##  6 Math    <tibble[,26] [3,629 × 26]> m3    <lm>   <tibble[,5] [144 × 5]>
##  7 Rdg     <tibble[,26] [3,627 × 26]> m1    <lm>   <tibble[,5] [11 × 5]> 
##  8 Rdg     <tibble[,26] [3,627 × 26]> m2    <lm>   <tibble[,5] [22 × 5]> 
##  9 Rdg     <tibble[,26] [3,627 × 26]> m3    <lm>   <tibble[,5] [132 × 5]>
## 10 Science <tibble[,26] [1,435 × 26]> m1    <lm>   <tibble[,5] [12 × 5]> 
## 11 Science <tibble[,26] [1,435 × 26]> m2    <lm>   <tibble[,5] [22 × 5]> 
## 12 Science <tibble[,26] [1,435 × 26]> m3    <lm>   <tibble[,5] [132 × 5]>
## 13 Wri     <tibble[,26] [3,627 × 26]> m1    <lm>   <tibble[,5] [12 × 5]> 
## 14 Wri     <tibble[,26] [3,627 × 26]> m2    <lm>   <tibble[,5] [22 × 5]> 
## 15 Wri     <tibble[,26] [3,627 × 26]> m3    <lm>   <tibble[,5] [132 × 5]>

44 / 57

Equivalently

mods %>%
    pivot_longer(
      m1:m3,
      names_to = "model", 
      values_to = "output"
  ) %>%
  rowwise() %>% 
  mutate(tidied = list(broom::tidy(output)))

## # A tibble: 15 x 5
## # Rowwise:  content
##    content data                       model output tidied                
##    <chr>   <list>                     <chr> <list> <list>                
##  1 ELA     <tibble[,26] [3,627 × 26]> m1    <lm>   <tibble[,5] [11 × 5]> 
##  2 ELA     <tibble[,26] [3,627 × 26]> m2    <lm>   <tibble[,5] [22 × 5]> 
##  3 ELA     <tibble[,26] [3,627 × 26]> m3    <lm>   <tibble[,5] [132 × 5]>
##  4 Math    <tibble[,26] [3,629 × 26]> m1    <lm>   <tibble[,5] [12 × 5]> 
##  5 Math    <tibble[,26] [3,629 × 26]> m2    <lm>   <tibble[,5] [23 × 5]> 
##  6 Math    <tibble[,26] [3,629 × 26]> m3    <lm>   <tibble[,5] [144 × 5]>
##  7 Rdg     <tibble[,26] [3,627 × 26]> m1    <lm>   <tibble[,5] [11 × 5]> 
##  8 Rdg     <tibble[,26] [3,627 × 26]> m2    <lm>   <tibble[,5] [22 × 5]> 
##  9 Rdg     <tibble[,26] [3,627 × 26]> m3    <lm>   <tibble[,5] [132 × 5]>
## 10 Science <tibble[,26] [1,435 × 26]> m1    <lm>   <tibble[,5] [12 × 5]> 
## 11 Science <tibble[,26] [1,435 × 26]> m2    <lm>   <tibble[,5] [22 × 5]> 
## 12 Science <tibble[,26] [1,435 × 26]> m3    <lm>   <tibble[,5] [132 × 5]>
## 13 Wri     <tibble[,26] [3,627 × 26]> m1    <lm>   <tibble[,5] [12 × 5]> 
## 14 Wri     <tibble[,26] [3,627 × 26]> m2    <lm>   <tibble[,5] [22 × 5]> 
## 15 Wri     <tibble[,26] [3,627 × 26]> m3    <lm>   <tibble[,5] [132 × 5]>

45 / 57

Select and unnest

tidied <- mods %>%
    gather(model, output, m1:m3) %>%
    mutate(tidied = map(output, broom::tidy)) %>%
    select(content, model, tidied) %>%
    unnest(tidied)
tidied

## # A tibble: 841 x 7
## # Groups:   content [5]
##    content model term           estimate std.error   statistic     p.value
##    <chr>   <chr> <chr>             <dbl>     <dbl>       <dbl>       <dbl>
##  1 ELA     m1    (Intercept)  0.9322336  0.2150561  4.334839   1.498396e-5
##  2 ELA     m1    primary10    0.3856986  0.2242965  1.719593   8.559207e-2
##  3 ELA     m1    primary20   -0.03167527 0.7266436 -0.04359120 9.652327e-1
##  4 ELA     m1    primary40   -1.844343   0.5559031 -3.317741   9.164595e-4
##  5 ELA     m1    primary50    1.173722   0.2890447  4.060694   4.996391e-5
##  6 ELA     m1    primary60    0.7762539  0.3866313  2.007737   4.474555e-2
##  7 ELA     m1    primary70   -1.830257   0.3086128 -5.930595   3.301860e-9
##  8 ELA     m1    primary74   -0.7647874  0.5182670 -1.475663   1.401215e-1
##  9 ELA     m1    primary80    0.4676481  0.2428640  1.925556   5.423822e-2
## 10 ELA     m1    primary82    0.3382547  0.2267600  1.491686   1.358687e-1
## # … with 831 more rows

46 / 57

Plot

Lets look how the primary coefficients change

to_plot <- names(coef(mods$m1[[1]]))
tidied %>%
  filter(term %in% to_plot) %>%
  ggplot(aes(estimate, term, color = model)) +
  geom_point() +
  scale_color_brewer(palette = "Set2") +
  facet_wrap(~content)

47 / 57

48 / 57

Last bit

We've kind of been running the wrong models this whole time
We forgot about grade!
No problem, just change the grouping factor

49 / 57

By grade

by_grade_content <- d %>%
  group_by(content, grade) %>% 
    nest()
by_grade_content

## # A tibble: 31 x 3
## # Groups:   grade, content [31]
##    grade content data                    
##    <int> <chr>   <list>                  
##  1    11 ELA     <tibble[,25] [453 × 25]>
##  2    11 Math    <tibble[,25] [460 × 25]>
##  3    11 Rdg     <tibble[,25] [453 × 25]>
##  4    11 Science <tibble[,25] [438 × 25]>
##  5    11 Wri     <tibble[,25] [453 × 25]>
##  6     3 ELA     <tibble[,25] [540 × 25]>
##  7     3 Math    <tibble[,25] [536 × 25]>
##  8     3 Rdg     <tibble[,25] [540 × 25]>
##  9     3 Wri     <tibble[,25] [540 × 25]>
## 10     4 ELA     <tibble[,25] [585 × 25]>
## # … with 21 more rows

50 / 57

Fit models

mods_grade <- by_grade_content %>%
  mutate(
    m1 = map(data, ~lm(Theta ~ primary, data = .x)),
    m2 = map(data, ~lm(Theta ~ primary + secondary, 
                       data = .x)),
    m3 = map(data, ~lm(Theta ~ primary * secondary, 
                       data = .x))
)
mods_grade

## # A tibble: 31 x 6
## # Groups:   grade, content [31]
##    grade content data                     m1     m2     m3    
##    <int> <chr>   <list>                   <list> <list> <list>
##  1    11 ELA     <tibble[,25] [453 × 25]> <lm>   <lm>   <lm>  
##  2    11 Math    <tibble[,25] [460 × 25]> <lm>   <lm>   <lm>  
##  3    11 Rdg     <tibble[,25] [453 × 25]> <lm>   <lm>   <lm>  
##  4    11 Science <tibble[,25] [438 × 25]> <lm>   <lm>   <lm>  
##  5    11 Wri     <tibble[,25] [453 × 25]> <lm>   <lm>   <lm>  
##  6     3 ELA     <tibble[,25] [540 × 25]> <lm>   <lm>   <lm>  
##  7     3 Math    <tibble[,25] [536 × 25]> <lm>   <lm>   <lm>  
##  8     3 Rdg     <tibble[,25] [540 × 25]> <lm>   <lm>   <lm>  
##  9     3 Wri     <tibble[,25] [540 × 25]> <lm>   <lm>   <lm>  
## 10     4 ELA     <tibble[,25] [585 × 25]> <lm>   <lm>   <lm>  
## # … with 21 more rows

51 / 57

Look at $R^{2}$

mods_grade %>%
    pivot_longer(
    m1:m3, 
    names_to = "model", 
    values_to = "output"
  ) %>% 
    mutate(r2 = map_dbl(output, ~summary(.x)$r.squared))

## # A tibble: 93 x 6
## # Groups:   grade, content [31]
##    grade content data                     model output         r2
##    <int> <chr>   <list>                   <chr> <list>      <dbl>
##  1    11 ELA     <tibble[,25] [453 × 25]> m1    <lm>   0.03353818
##  2    11 ELA     <tibble[,25] [453 × 25]> m2    <lm>   0.1084394 
##  3    11 ELA     <tibble[,25] [453 × 25]> m3    <lm>   0.1536891 
##  4    11 Math    <tibble[,25] [460 × 25]> m1    <lm>   0.1886003 
##  5    11 Math    <tibble[,25] [460 × 25]> m2    <lm>   0.3161226 
##  6    11 Math    <tibble[,25] [460 × 25]> m3    <lm>   0.4046634 
##  7    11 Rdg     <tibble[,25] [453 × 25]> m1    <lm>   0.02066316
##  8    11 Rdg     <tibble[,25] [453 × 25]> m2    <lm>   0.1820512 
##  9    11 Rdg     <tibble[,25] [453 × 25]> m3    <lm>   0.2337721 
## 10    11 Science <tibble[,25] [438 × 25]> m1    <lm>   0.1259080 
## # … with 83 more rows

52 / 57

Plot

mods_grade %>%
  pivot_longer(
    m1:m3, 
    names_to = "model", 
    values_to = "output"
  ) %>% 
  mutate(r2 = map_dbl(output, ~summary(.x)$r.squared)) %>%
  ggplot(aes(model, r2)) +
  geom_col(aes(fill = model)) +
  facet_grid(grade ~ content) +
  guides(fill = "none") +
  scale_fill_brewer(palette = "Set2")

53 / 57

54 / 57

Summary

List columns are really powerful and really flexible
Also help you stay organized
You can approach the problem either with {purrr} or dplyr::rowwise().
- Important: If you use rowwise(), remember to ungroup() when you want it to go back to being a normal data frame
- I'm asking you to learn both - the row-wise approach might be a bit easier but is a little less general (only works with data frames)

55 / 57

Questions?

56 / 57

In-class Midterm

Next time: Parallel iterations

57 / 57

Help

Keyboard shortcuts

↑, ←, Pg Up, k

Go to previous slide

↓, →, Pg Dn, Space, j

Go to next slide

Home

Go to first slide

End

Go to last slide

Number + Return

Go to specific slide

b / m / f

Toggle blackout / mirrored / fullscreen mode

Clone slideshow

Toggle presenter mode

Restart the presentation timer

?, h

Toggle this help

List columns

Daniel Anderson

Week 4, Class 2

Agenda

Learning objectives

Review Lab 2

Setup

Please follow along

Parse file data

Select variables

Comparing models

Data pre-processing

Split the data

We could use this method

Alternative

What's going on here?

Explore a bit

It's a data frame!

map_*

map_*

Let's fit a model!

Extract the coefficients

Challenge

Compare models

You try first

Model fits

Brief foray into parallel iterations

Brief foray into parallel iterations

Pop Quiz

Brief foray into parallel iterations

Pop Quiz

Which fits better?

map2

map2

map2

Back to our anova object

Back to our anova object

Extract pp value

Extract pp value

All p-values

Slight alternative

Slight alternative

An alternative

Conducting operations by row

Operations by row

Add a group & summarize

List columns

Creating list columns

Creating list columns

Challenge

Challenge

Challenge 2

Challenge 2

Return atomic vectors

Fit all models

Look at all R2R2

It's a normal data frame!

Extract all R2R2

Plot

Unnesting

Unnesting

Unnesting

Tidy

Equivalently

Select and unnest

Plot

Lets look how the primary coefficients change

Last bit

By grade

Fit models

Look at R2R2

Plot

Summary

Questions?

In-class Midterm

Next time: Parallel iterations

Agenda

Help

`map_*`

`map_*`

`map2`

`map2`

`map2`

Back to our `anova` object

Back to our `anova` object

Extract $p$ value

Extract $p$ value

Look at all $R^{2}$

Extract all $R^{2}$

Look at $R^{2}$