Dataset for the age of schooling
I downloaded the ‘School Life Expectancy’ data from Our World in Data. I chose this dataset because I wanted to see how the expected years of schooling differ between countries.
This is the link to the data.
The following code chunk loads the package I will use to read in and prepare the data for analysis.
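The loading chunk itself is not shown here. A minimal sketch of what such a step might look like follows, assuming the readr and dplyr packages (both part of the tidyverse). The temporary file and the name school_years_demo are hypothetical stand-ins for the downloaded file and the real object; the two rows are copied from the glimpse output below.

```r
library(readr)  # read_csv(), write_lines()
library(dplyr)  # glimpse() and the data verbs used later

# Hypothetical stand-in for the downloaded CSV: same column layout, two rows
csv_path <- tempfile(fileext = ".csv")
write_lines(c(
  "Entity,Code,Year,Expected Years of Schooling (years)",
  "Afghanistan,AFG,1990,2.6",
  "Afghanistan,AFG,1991,2.9"
), csv_path)

# Read the file into a tibble and inspect its columns
school_years_demo <- read_csv(csv_path, show_col_types = FALSE)
glimpse(school_years_demo)
```

With the real download, csv_path would simply be the path to the file saved from Our World in Data.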
glimpse(school_years)
Rows: 5,142
Columns: 4
$ Entity <chr> "Afghanistan", "Afghan…
$ Code <chr> "AFG", "AFG", "AFG", "…
$ Year <dbl> 1990, 1991, 1992, 1993…
$ `Expected Years of Schooling (years)` <dbl> 2.6, 2.9, 3.2, 3.6, 3.…
# View(school_years)
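countries is a list of the countries that I want to extract from the dataset.
countries_age is the data for those countries from 2000 onward, with the first column renamed to Country and the fourth to Age.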
countries <- c("China",
"United States",
"South Korea",
"Philippines",
"India",
"Ghana",
"Ethiopia")
countries_age <- school_years %>%
  rename(Country = 1, Age = 4) %>%
  filter(Year >= 2000, Country %in% countries) %>%
  select(Country, Year, Age)
countries_age
# A tibble: 108 × 3
Country Year Age
<chr> <dbl> <dbl>
1 China 2000 9.6
2 China 2001 9.7
3 China 2002 9.9
4 China 2003 10.2
5 China 2004 10.6
6 China 2005 11
7 China 2006 11.5
8 China 2007 12
9 China 2008 12.3
10 China 2009 12.6
# … with 98 more rows
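Because %in% silently drops any name that does not match the Country column exactly, it can be worth confirming that every requested country was found. The sketch below shows one way to do that on a small made-up tibble; demo, wanted, and missing_names are hypothetical names, and the misspelling is deliberate.

```r
library(dplyr)

# Made-up stand-in for the filtered data
demo <- tibble(
  Country = c("China", "India", "Ghana"),
  Year    = c(2000, 2000, 2000)
)

# Requested names, one of them deliberately misspelled
wanted <- c("China", "India", "Ghanna")

# Names that were requested but never appear in the data
missing_names <- setdiff(wanted, unique(demo$Country))
missing_names
# [1] "Ghanna"
```

An empty result means every name in the list matched at least one row.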
Check that the maximum years in school for 2000 equals the maximum in the graph
# A tibble: 1 × 1
total_age
<dbl>
1 15.6
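The chunk that produced this value is not shown. One way to compute such a check, assuming it filters to the year 2000 and takes the maximum of Age, is sketched below. The tibble demo_age is a hypothetical stand-in for countries_age: the China rows are copied from the output above, and the "Country B" rows are made-up placeholders.

```r
library(dplyr)

# Stand-in data: China values from the printed output, "Country B" invented
demo_age <- tibble(
  Country = c("China", "China", "Country B", "Country B"),
  Year    = c(2000, 2001, 2000, 2001),
  Age     = c(9.6, 9.7, 12.0, 12.5)
)

# Keep only the year 2000, then take the largest value of Age
check_2000 <- demo_age %>%
  filter(Year == 2000) %>%
  summarise(total_age = max(Age))

check_2000
```

Run against the real countries_age, the single value returned can be compared by eye with the highest point for 2000 in the chart.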
Add a picture
Write the data to file in the project directory
write_csv(countries_age, file = "countries_age.csv")