Here’s where you’ll explain your data: where it comes from and a little of its background. Then explain the columns (variables) in the dataset.
Describe here what you did to clean the data.
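### Use this chunk to read in the data and clean it
library(tidyverse)  # readr, dplyr (mutate, %>%) and ggplot2
library(magrittr)   # provides the %$% exposition pipe used in the test chunk
library(broom)      # provides tidy() for test results
scoobydoo <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-07-13/scoobydoo.csv')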
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## .default = col_character(),
## index = col_double(),
## date_aired = col_date(format = ""),
## run_time = col_double(),
## monster_amount = col_double(),
## unmask_other = col_logical(),
## caught_other = col_logical(),
## caught_not = col_logical(),
## suspects_amount = col_double(),
## culprit_amount = col_double(),
## door_gag = col_logical(),
## batman = col_logical(),
## scooby_dum = col_logical(),
## scrappy_doo = col_logical(),
## hex_girls = col_logical(),
## blue_falcon = col_logical()
## )
## ℹ Use `spec()` for the full column specifications.
# Recode the string 'NULL' as NA, then convert each score column to numeric
scoobydoo %>%
  mutate(imdb = as.numeric(ifelse(imdb == 'NULL', NA, imdb)),
         engagement = as.numeric(ifelse(engagement == 'NULL',
                                        NA, engagement))) -> scoobydoo_tidy
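The same recoding can be written once for several columns. The sketch below is one possible way to do it, assuming dplyr 1.0 or later for `across()`; the `toy` tibble and its values are made up for illustration and are not rows from the Scooby-Doo data.

```r
library(dplyr)

# Made-up toy rows standing in for the real data; values are illustrative only
toy <- tibble(
  imdb       = c("7.3", "NULL", "8.1"),
  engagement = c("556", "NULL", "NULL")
)

# na_if() turns the string "NULL" into NA; as.numeric() then converts the rest
toy_tidy <- toy %>%
  mutate(across(c(imdb, engagement), ~ as.numeric(na_if(.x, "NULL"))))

toy_tidy
```

Because each column is named only once inside `across()`, this form avoids copying one column's values into another by mistake. `readr::read_csv()` also accepts an `na` argument, which may let the 'NULL' strings be treated as missing at read time.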
Do episodes with if_it_wasnt_for have higher engagement?
For each question, make a plot illustrating the question, use a statistical test to answer the question, and describe your conclusions.
Do episodes with if_it_wasnt_for have higher engagement?
DESCRIBE YOUR RESULTS HERE
### imdb rating
scoobydoo_tidy %>%
mutate(if_it_wasnt_for2 = ifelse(if_it_wasnt_for == 'NULL', 'no', 'yes')) %>%
ggplot(aes(x = imdb)) +
geom_density(aes(color = if_it_wasnt_for2)) +
theme_classic()
## Warning: Removed 15 rows containing non-finite values (stat_density).
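The warning above reports rows dropped because `imdb` is missing. A quick per-group summary can show how many rows each group has and how many are missing before interpreting the density plot. This is a minimal sketch on made-up data: the `toy` tibble, its ratings, and the names in `if_it_wasnt_for` are invented for illustration.

```r
library(dplyr)

# Made-up toy rows; illustrative only, not values from the dataset
toy <- tibble(
  imdb = c(7.3, NA, 8.1, 6.5, NA),
  if_it_wasnt_for = c("NULL", "NULL", "Shaggy", "NULL", "Velma")
)

# Same yes/no recoding as the plot chunk, then count rows and missing ratings
group_summary <- toy %>%
  mutate(if_it_wasnt_for2 = ifelse(if_it_wasnt_for == "NULL", "no", "yes")) %>%
  group_by(if_it_wasnt_for2) %>%
  summarise(
    n = n(),
    n_missing = sum(is.na(imdb)),
    mean_imdb = mean(imdb, na.rm = TRUE)
  )

group_summary
```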
### use this chunk to conduct a statistical test to answer your question
scoobydoo_tidy %>%
mutate(if_it_wasnt_for2 = ifelse(if_it_wasnt_for == 'NULL', 'no', 'yes')) %$%
chisq.test(imdb, if_it_wasnt_for2) %>%
tidy()
## Warning in chisq.test(imdb, if_it_wasnt_for2): Chi-squared approximation may be
## incorrect
## # A tibble: 1 x 4
## statistic p.value parameter method
## <dbl> <dbl> <int> <chr>
## 1 83.5 0.00113 48 Pearson's Chi-squared test
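When `chisq.test()` is given a numeric vector such as `imdb`, it treats every distinct rating as its own category, which is a likely reason for the approximation warning above. If the question is whether ratings differ between the two groups, a two-sample comparison may be a better fit. The sketch below shows two base R options on made-up data; the `toy` data frame and its values are invented, and neither test is presented as the required answer for this assignment.

```r
# Made-up ratings for illustration only; not values from the Scooby-Doo data
toy <- data.frame(
  imdb = c(7.1, 7.4, 6.9, 7.8, 8.0, 7.6, 8.2, 7.9),
  if_it_wasnt_for2 = rep(c("no", "yes"), each = 4)
)

# Welch two-sample t-test: compares the mean rating between the two groups
welch <- t.test(imdb ~ if_it_wasnt_for2, data = toy)

# Wilcoxon rank-sum test: rank-based alternative that does not assume normality
ranksum <- wilcox.test(imdb ~ if_it_wasnt_for2, data = toy)

welch$estimate    # group means
welch$p.value
ranksum$p.value
```

On the real data, the same calls would take `scoobydoo_tidy` after the `if_it_wasnt_for2` column has been added, and either result can be passed to `tidy()` in the same way as the chi-squared output.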
Rinse and repeat for another 9 questions