Project Part 1

Dataset for the age of schooling

  1. I downloaded the ‘School Life Expectancy’ data from Our World in Data. I chose this dataset because I wanted to see how expected years of schooling vary between countries.

  2. This is the link to the data.

  3. The following code chunk loads the package I will use to read in and prepare the data for analysis.
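
The chunk itself does not appear in this copy of the post. Since the code that follows calls read_csv, glimpse, and dplyr verbs such as filter and select, a minimal version, assuming the tidyverse package is installed, could be:

```r
# Assumption: tidyverse is the package meant here. It attaches readr (read_csv)
# and dplyr (glimpse, rename, filter, select, mutate, %>%).
library(tidyverse)
```

The later call to here::here() is written with its namespace, so the here package only needs to be installed, not attached.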

  4. Read the data in
school_years <- read_csv(here::here("_posts/2022-05-08-project-part-1/expected-years-of-schooling.csv"))
  5. Use glimpse to see the names and types of the columns
glimpse(school_years)
Rows: 5,142
Columns: 4
$ Entity                                <chr> "Afghanistan", "Afghan…
$ Code                                  <chr> "AFG", "AFG", "AFG", "…
$ Year                                  <dbl> 1990, 1991, 1992, 1993…
$ `Expected Years of Schooling (years)` <dbl> 2.6, 2.9, 3.2, 3.6, 3.…
# View(school_years)
  6. Use output from glimpse (and View) to prepare the data for analysis
countries  <- c("China",
               "United States",
               "South Korea",
               "Philippines",
               "India", 
               "Ghana",
               "Ethiopia")

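# Rename column 1 (Entity) to Country and column 4 (expected years of schooling) to Age,
# then keep only the selected countries from 2000 onward
countries_age <- school_years %>% 
  rename(Country = 1, Age = 4) %>% 
  filter(Year >= 2000, Country %in% countries) %>% 
  select(Country, Year, Age)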

countries_age
# A tibble: 108 × 3
   Country  Year   Age
   <chr>   <dbl> <dbl>
 1 China    2000   9.6
 2 China    2001   9.7
 3 China    2002   9.9
 4 China    2003  10.2
 5 China    2004  10.6
 6 China    2005  11  
 7 China    2006  11.5
 8 China    2007  12  
 9 China    2008  12.3
10 China    2009  12.6
# … with 98 more rows
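
Because %in% silently drops any name that matches nothing in the data, a quick check that every requested name actually appears can catch spelling mismatches. The sketch below uses a small made-up data frame (toy_years) and a deliberately misspelled name (in wanted); neither is part of the original analysis.

```r
# Hypothetical stand-in for school_years; the real data comes from the CSV.
toy_years <- data.frame(
  Entity = c("China", "Ghana", "Philippines"),
  Year   = c(2000, 2000, 2000)
)

# Requested names, one of them deliberately misspelled for illustration.
wanted <- c("China", "Ghana", "Phillippines")

# Names that were requested but do not occur in the data.
# An empty result, character(0), means every name matched.
missing_names <- setdiff(wanted, unique(toy_years$Entity))
missing_names
```

In the real analysis the same call would compare countries against unique(school_years$Entity).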

Check that the maximum expected years of schooling in 2000 equals the maximum shown in the graph

countries_age %>% filter(Year == 2000)  %>% 
  summarise(total_age = max(Age))
# A tibble: 1 × 1
  total_age
      <dbl>
1      15.6
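
To see which country holds that maximum, and not only its value, slice_max from dplyr can return the whole row. A minimal sketch on made-up numbers (toy_age and its values are hypothetical, not taken from the dataset):

```r
library(dplyr)

# Hypothetical values for illustration only.
toy_age <- data.frame(
  Country = c("A", "B", "C"),
  Year    = c(2000, 2000, 2000),
  Age     = c(9.5, 14.0, 11.2)
)

# Keep the rows for 2000, then return the row(s) with the largest Age.
top_2000 <- toy_age %>%
  filter(Year == 2000) %>%
  slice_max(Age, n = 1)

top_2000
```

Applied to countries_age, the same pipeline would show the country behind the maximum alongside the value.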

Add a picture

School Life Expectancy

Write the data to file in the project directory

write_csv(countries_age, file = "countries_age.csv")
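
One way to confirm the file was written as expected is to read it back and compare it with the original. This sketch uses a made-up table and a temporary file (toy_age and tmp are hypothetical names), so it does not touch the project directory:

```r
library(readr)

# Hypothetical values for illustration only.
toy_age <- data.frame(
  Country = c("A", "B"),
  Year    = c(2000, 2001),
  Age     = c(9.5, 14.0)
)

# Write to a temporary file instead of the project directory.
tmp <- tempfile(fileext = ".csv")
write_csv(toy_age, file = tmp)

# show_col_types = FALSE suppresses the column-specification message.
read_back <- read_csv(tmp, show_col_types = FALSE)

# The round trip should preserve the shape and the values.
stopifnot(identical(dim(read_back), dim(toy_age)))
stopifnot(isTRUE(all.equal(read_back$Age, toy_age$Age)))
```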