In-class Exercise 2
Analysis
R
sf
tidyverse
2.0 Getting Started
For this in-class exercise, two R packages will be used:
sf, for importing, managing, and processing geospatial data
tidyverse, for performing data science tasks such as importing, wrangling and visualising data
To install and load these packages into the R environment, we use the p_load() function from the pacman package:
pacman::p_load(sf, tidyverse)
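As a rough illustration of what this kind of call does, the sketch below installs a package only if it is missing and then attaches it. The helper name load_or_install is hypothetical and is not part of pacman; it only approximates the behaviour in base R.

```r
# Hypothetical helper (not part of pacman): install each missing package,
# then attach it to the current session.
load_or_install <- function(pkgs) {
  for (pkg in pkgs) {
    # requireNamespace() returns FALSE when the package is not installed
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)
    }
    # character.only = TRUE lets library() accept the name as a string
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}
```

Calling load_or_install(c("sf", "tidyverse")) would then play the same role as the p_load() call in this exercise.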
2.1 Working with Master Plan 2014 Subzone Boundary Data
mpsz14_shp <- st_read(dsn = "data/MasterPlan2014SubzoneBoundaryWebSHP",
                      layer = "MP14_SUBZONE_WEB_PL")
Reading layer `MP14_SUBZONE_WEB_PL' from data source
`C:\Users\blzll\OneDrive\Desktop\Y3S1\IS415\Quarto\IS415\In-class_Ex\data\MasterPlan2014SubzoneBoundaryWebSHP'
using driver `ESRI Shapefile'
Simple feature collection with 323 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21
The code chunk below demonstrates data conversion from SHP file format to KML file format:
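mpsz14_kml <- st_write(mpsz14_shp,
                       "data/MasterPlan2014SubzoneBoundaryWebKML.kml",
                       delete_dsn = TRUE)
Setting delete_dsn = TRUE tells st_write() to delete any existing file at the destination dsn (data source name) before writing, so re-running the chunk overwrites the earlier KML file. The source shapefile is not affected.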
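The effect of delete_dsn can be tried safely on a throwaway file. The sketch below is a minimal example under stated assumptions: the two points, the id column and the file name toy_points.kml are made up for illustration, and the file is written to a temporary directory rather than the exercise's data folder.

```r
library(sf)

# Toy point layer; coordinates and attribute values are illustrative only
pts <- st_as_sf(
  data.frame(id = c("a", "b"), x = c(103.8, 103.9), y = c(1.30, 1.35)),
  coords = c("x", "y"),
  crs = 4326
)

# Write to a temporary location so no real data is touched
out <- file.path(tempdir(), "toy_points.kml")

# First call creates the file; second call deletes and rewrites it
st_write(pts, out, delete_dsn = TRUE, quiet = TRUE)
st_write(pts, out, delete_dsn = TRUE, quiet = TRUE)

# Read the file back to confirm it holds the two features
pts_back <- st_read(out, quiet = TRUE)
nrow(pts_back)
```

Without delete_dsn = TRUE, the second st_write() call would typically stop with an error because the KML file already exists.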
2.2 Working with Master Plan 2019 Subzone Boundary Data
mpsz19_kml <- st_read("data/MasterPlan2019SubzoneBoundaryNoSeaKML.kml")
Reading layer `URA_MP19_SUBZONE_NO_SEA_PL' from data source
`C:\Users\blzll\OneDrive\Desktop\Y3S1\IS415\Quarto\IS415\In-class_Ex\data\MasterPlan2019SubzoneBoundaryNoSeaKML.kml'
using driver `KML'
Simple feature collection with 332 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension: XY, XYZ
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
z_range: zmin: 0 zmax: 0
Geodetic CRS: WGS 84
mpsz19_shp <- st_read(dsn = "data/MasterPlan2019SubzoneBoundaryWebSHP",
                      layer = "MPSZ-2019") %>%
  st_transform(crs = 3414)
Reading layer `MPSZ-2019' from data source
`C:\Users\blzll\OneDrive\Desktop\Y3S1\IS415\Quarto\IS415\In-class_Ex\data\MasterPlan2019SubzoneBoundaryWebSHP'
using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
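The reading message above still reports WGS 84 because it is printed by st_read() before st_transform() runs. One way to confirm that a transformation took effect is to inspect the CRS afterwards. The sketch below does this on a single made-up point rather than on the subzone layer; the coordinates are illustrative only.

```r
library(sf)

# One illustrative point in WGS 84 (EPSG:4326), longitude then latitude
pt_wgs84 <- st_sfc(st_point(c(103.85, 1.29)), crs = 4326)

# Reproject to SVY21 / Singapore TM (EPSG:3414), as done for mpsz19_shp
pt_svy21 <- st_transform(pt_wgs84, crs = 3414)

# The EPSG code of the result should now be 3414
st_crs(pt_svy21)$epsg

# Coordinates are now in metres rather than decimal degrees
st_coordinates(pt_svy21)
```

The same check, st_crs(mpsz19_shp), can be run on the transformed subzone layer.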
2.3 Working with population data
popdata <- read_csv("data/respopagesextod2023/respopagesextod2023.csv")
2.3.1 Data Preparation
popdata2023 <- popdata %>%
  group_by(PA, SZ, AG) %>%
  summarise(`POP` = sum(`Pop`)) %>%
  ungroup() %>%
  pivot_wider(names_from = AG,
              values_from = POP)
colnames(popdata2023)
[1] "PA" "SZ" "0_to_4" "10_to_14" "15_to_19"
[6] "20_to_24" "25_to_29" "30_to_34" "35_to_39" "40_to_44"
[11] "45_to_49" "50_to_54" "55_to_59" "5_to_9" "60_to_64"
[16] "65_to_69" "70_to_74" "75_to_79" "80_to_84" "85_to_89"
[21] "90_and_Over"
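As seen above, R indexes from 1 rather than 0, unlike many other programming languages. The bracketed numbers at the start of each output line ([1], [6], [11], etc.) give the position of the first element printed on that line; for example, "20_to_24" is the 6th column name.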
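A small sketch of 1-based indexing, using the first six column names from the output above as an ordinary character vector (the variable name cols is introduced here for illustration only):

```r
# First six column names from the colnames() output above
cols <- c("PA", "SZ", "0_to_4", "10_to_14", "15_to_19", "20_to_24")

# Position 1 is the first element, not the second
first_col <- cols[1]

# A range of positions is inclusive at both ends
third_and_fourth <- cols[3:4]

# which() returns the 1-based position of a matching element
pos_20_to_24 <- which(cols == "20_to_24")

first_col
third_and_fourth
pos_20_to_24
```

This is why the wrangling step in the next section can refer to the age-group columns by position, for example 3:6 and 14.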
2.3.2 Data Wrangling
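popdata2023 <- popdata2023 %>%
  # Columns 3:6 and 14 hold the age groups 0_to_4 through 20_to_24
  mutate(YOUNG = rowSums(.[3:6]) + rowSums(.[14])) %>%
  # Columns 7:13 and 15 hold the age groups 25_to_29 through 60_to_64
  mutate(`ECONOMY ACTIVE` = rowSums(.[7:13]) + rowSums(.[15])) %>%
  # Columns 16:21 hold the age groups 65_to_69 through 90_and_Over
  mutate(`AGED` = rowSums(.[16:21])) %>%
  mutate(`TOTAL` = rowSums(.[3:21])) %>%
  mutate(`DEPENDENCY` = (`YOUNG` + `AGED`) / `ECONOMY ACTIVE`) %>%
  select(`PA`, `SZ`, `YOUNG`,
         `ECONOMY ACTIVE`, `AGED`,
         `TOTAL`, `DEPENDENCY`)
# Convert PA and SZ to upper case ahead of the join on SZ
popdata2023 <- popdata2023 %>%
  mutate_at(.vars = vars(PA, SZ),
            .funs = list(toupper))
2.3.3 Joining the attribute data and geospatial data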
mpsz_2023 <- left_join(mpsz19_shp, popdata2023,
                       by = c("SUBZONE_N" = "SZ"))
pop2023_mpsz <- left_join(popdata2023, mpsz19_shp,
                          by = c("SZ" = "SUBZONE_N"))
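A left join keeps every row of the left-hand table and fills unmatched rows with NA, so it is worth checking how many rows failed to match. The sketch below uses two small made-up tables in place of mpsz19_shp and popdata2023; the subzone names and TOTAL values are illustrative assumptions, not figures from the data.

```r
library(dplyr)

# Toy stand-ins for the boundary table and the population table
zones <- tibble(SUBZONE_N = c("MARINA EAST", "RAFFLES PLACE", "STRAITS VIEW"))
pop   <- tibble(SZ = c("Marina East", "Raffles Place"), TOTAL = c(10, 20))

# Joining on mixed-case names: the keys differ in case, so nothing matches
joined_raw <- left_join(zones, pop, by = c("SUBZONE_N" = "SZ"))
n_missing_raw <- sum(is.na(joined_raw$TOTAL))

# Upper-casing the key first, as in the wrangling step, lets the rows match
pop_upper <- mutate(pop, SZ = toupper(SZ))
joined <- left_join(zones, pop_upper, by = c("SUBZONE_N" = "SZ"))

# anti_join() lists the left-hand rows that still have no partner
unmatched <- anti_join(zones, pop_upper, by = c("SUBZONE_N" = "SZ"))

n_missing_raw
unmatched
```

On the real data, the same anti_join() check between mpsz19_shp and popdata2023 would show which subzones have no population record.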