library(tidyverse)
library(glue)
library(scales)
library(gt)
library(ggiraph) # interactive plots
library(tidycensus)
library(tigris) # TIGER/Line shapefiles
library(sf) # simple features for spatial analysis
Cleveland’s Changing Population
Exploring the US Census with R
The resurgence of people moving to downtown Cleveland is making news.1 According to a study commissioned by Downtown Cleveland Inc., the downtown population was almost 19,000 in the 2020 census, a 22% increase from 2010.2 However, Cleveland Open Data shows only 13,000.3 Cleveland Scene reports that there are lots of estimates out there, one as low as 8,000!4 What gives? The organizations may be using different sources, like the decennial US census vs. the more recent, but less comprehensive, American Community Survey. But it seems more likely they are using different geographic boundaries.
I was able to reproduce some estimates. My main tools to do this were the tidycensus R package for US Census data, and the Cleveland Open Data service for Cleveland neighborhood definitions. I’ll step through the process below.
This is a work file / tutorial. Researching Cleveland’s population is mostly a toy project to experiment with R tools that work with APIs. This should come in handy for some future project. If you are not me, I hope this helps with whatever you’re doing. Otherwise, ‘hello, future me!’ You can find the source code and downloaded data on my GitHub page.
Defining “Downtown”
Cleveland extends from Cleveland Hopkins Airport on the west all the way to Euclid on the east. It’s mostly bounded on the south by I-480. Here is the map from the Cleveland Wikipedia page.
The 2020 US decennial census counted 372K people in Cleveland.5 That’s a decline from 397K in 2010. The 1-year American Community Survey (ACS) shows it is still falling, down to 363K in 2023.6 But the decline is uneven, and parts of the city are actually growing, including the downtown area. There is no official definition of downtown, so we can make some choices. The Census Bureau provides the building blocks for a definition: over 15K census blocks in Cleveland, rolled up to around 200 census tracts.
Cleveland’s City Planning Commission (CPC) defines 34 neighborhoods for urban planning initiatives.7 They are commonly referred to as Statistical (or Social) Planning Areas (SPAs). I pasted a PDF map from the CPC below. You can see there is an SPA actually named “Downtown”. It’s bounded by the Cuyahoga River and I-90. Cleveland Open Data has an interactive map that you can explore and download. I downloaded and extracted its shapefile to my local drive.
So that is one definition. A second one comes from a study by Urban Partners that was commissioned by Downtown Cleveland, Inc. in 2023. Page 3 of the PDF report (copy/pasted below) shows a Westside and a Downtown Core. Whereas the Downtown SPA had about 13.3K people in the 2020 census, this Downtown Core had 18.7K people. The main differences are that Urban Partners took a bite out of the Central neighborhood on the east side, and parts of the West Bank of the Flats in the Cuyahoga Valley and Ohio City neighborhoods on the west side.
Blocks, Tracts, and Subdivisions
Let’s gather the materials to segment population estimates into these boundaries. Several R libraries make it easy to work with census data. The tidycensus package was developed to interface with the US Census Bureau APIs. It also returns feature geometries for spatial analysis. The tigris package works with the Census Bureau’s TIGER/Line shapefiles, and the sf (simple features) package performs spatial operations.
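Census blocks roll up to tracts through their identifiers, which is handy when joining files at different levels. Here is a minimal sketch of how a 15-digit block GEOID breaks down. The GEOID value itself is made up for illustration; only the state (39, Ohio) and county (035, Cuyahoga) codes are real.

```r
# Hypothetical 15-digit block GEOID in Cuyahoga County, Ohio (illustration only).
block_geoid <- "390351077011002"

# Positional layout: state (2) + county (3) + tract (6) + block (4).
geoid_parts <- c(
  state  = substr(block_geoid, 1, 2),
  county = substr(block_geoid, 3, 5),
  tract  = substr(block_geoid, 6, 11),
  block  = substr(block_geoid, 12, 15)
)

# The tract GEOID is the first 11 characters of the block GEOID.
tract_geoid <- substr(block_geoid, 1, 11)

print(geoid_parts)
print(tract_geoid)
```

Because the tract GEOID is a prefix of the block GEOID, block-level counts can be summed to tracts with a simple `substr()` and `group_by()`, no spatial operation needed.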
Let’s get the CPC’s definition of neighborhoods. I went to the City of Cleveland Open Data web site and navigated to their analysis of the 2020 US Census.8 Their interactive map has five layers (screen capture below). The first is the shapefile of the 34 neighborhoods (SPAs). The second file contains population data from the 2020 decennial census complete with census block, census tract, and SPA. I downloaded and unzipped the first two files. Now I have a way to map the SPA boundaries within Cleveland, and I have a mapping of census blocks to SPAs so I can join this to the US Census data.
# Nice contiguous shape file. One record for each of the 34 SPAs.
cleve_neigh_0 <- st_read(file.path(
  "inputs/Cleveland Neighborhoods",
  "Neighborhood_Population_Change.shp"
))
# Cleveland populated blocks. Includes block, tract, and SPA name.
cleve_blocks <- st_read(file.path(
  "inputs/Cleveland Populated Blocks 2020",
  "Decennial_2020_Populated_Blocks_Cleveland_Only.shp"
)) |>
  select(-starts_with("P0"), -starts_with("H0"))
I could just join cleve_neigh_0 to US Census Bureau data files by the geography elements using the sf package. I know exactly which blocks belong in each SPA for 2020, but block definitions change across censuses, so joining to cleve_neigh_0 will get me the 2000 and 2010 figures. The shapefile may not be perfectly precise because I can’t quite match quoted population estimates for 2000 and 2010, but it’s close.
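To make the idea of joining by geography concrete, here is a small sketch with toy data. It is not the exact join used in this post: the rectangles, block IDs, and populations are all invented, and only the SPA_NAME column name comes from the real neighborhood file. The approach assumed here is to reduce each block to an interior point and assign it to the neighborhood that contains the point.

```r
library(sf)
library(dplyr)

# Helper: axis-aligned rectangle as an sf polygon (toy planar coordinates, no CRS).
rect <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# Two made-up neighborhoods side by side.
toy_neigh <- st_sf(
  SPA_NAME = c("West", "East"),
  geometry = st_sfc(rect(0, 0, 2, 2), rect(2, 0, 4, 2))
)

# Four made-up census blocks with made-up populations.
toy_blocks <- st_sf(
  block_id = c("b1", "b2", "b3", "b4"),
  pop = c(100, 50, 30, 20),
  geometry = st_sfc(
    rect(0, 0, 1, 2), rect(1, 0, 2, 2), rect(2, 0, 3, 2), rect(3, 0, 4, 2)
  )
)

# Reduce each block to an interior point so a block that shares an edge with a
# neighborhood boundary still matches exactly one neighborhood.
block_pts <- suppressWarnings(st_point_on_surface(toy_blocks))

# Attach the containing neighborhood to each block point, then sum population.
toy_pop <- block_pts |>
  st_join(toy_neigh, join = st_within) |>
  st_drop_geometry() |>
  group_by(SPA_NAME) |>
  summarize(pop = sum(pop), .groups = "drop")

print(toy_pop)
```

Joining on interior points rather than full polygons avoids double counting blocks that touch two neighborhoods. The tradeoff is that a block straddling a boundary is assigned entirely to one side, which may be one reason spatially joined totals differ slightly from published figures.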
Load shapefiles using the tigris package to facilitate mapping. I’ll get the state and county boundaries, and a few cities. I also got the Terminal Tower coordinates from Google. I don’t want to abuse the US Census Bureau website and API, so I’ll set a flag to only download data while I’m developing this script. Once I have what I want, I’ll keep my data on my local drive and build my report.
USE_API <- FALSE
if (USE_API) {
  oh_state <- tigris::states(cb = TRUE) |> filter(STUSPS == "OH")
  oh_counties <- tigris::counties(cb = TRUE) |> filter(STUSPS == "OH")
  oh_places <- tigris::places("OH", year = 2024)
  save(oh_state, oh_counties, oh_places, file = "tigris_shapes.Rdata")
} else {
  load("tigris_shapes.Rdata")
}
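The download-once, load-afterwards pattern can be wrapped in a small helper if it gets repeated for several data sets. This is just a sketch: fetch_or_load() and fake_download() are hypothetical names, and the fake download stands in for a real call such as tigris::places("OH") so the example runs offline.

```r
# Hypothetical helper: fetch only when use_api is TRUE, otherwise read the cache.
fetch_or_load <- function(path, fetch_fn, use_api = FALSE) {
  if (use_api) {
    obj <- fetch_fn()
    saveRDS(obj, path)
    return(obj)
  }
  if (!file.exists(path)) {
    stop("No cached file at ", path, "; rerun with use_api = TRUE.")
  }
  readRDS(path)
}

# Stand-in for a slow download (assumption: a real call would hit the API).
fake_download <- function() data.frame(NAME = c("Cleveland", "Euclid"))

cache_file <- file.path(tempdir(), "oh_places.rds")

# First call "downloads" and writes the cache; second call only reads the cache.
first  <- fetch_or_load(cache_file, fake_download, use_api = TRUE)
second <- fetch_or_load(cache_file, fake_download, use_api = FALSE)
```

The sketch uses saveRDS()/readRDS() rather than save()/load() so that each cached object is assigned explicitly instead of being restored into the workspace under its original name.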
Here is a plot of Cuyahoga County, Cleveland, and Cleveland’s 34 SPAs. Hover over the shapes to see their names. The red dot is Terminal Tower in the heart of downtown. Progressive Field is a few blocks away, and 7.0 miles from my home in Cleveland Heights.
p <-
  ggplot() +
  geom_sf_interactive(data = oh_counties,
                      aes(tooltip = paste(NAME, "County")),
                      fill = "honeydew", color = "honeydew3") +
  geom_sf_interactive(data = oh_places,
                      aes(tooltip = NAME), fill = "honeydew3", color = "honeydew") +
  geom_sf(data = oh_counties |> filter(NAME == "Cuyahoga"),
          fill = NA, color = "honeydew4", linewidth = 1) +
  geom_sf_interactive(data = cleve_neigh_0,
                      aes(tooltip = SPA_NAME),
                      fill = "honeydew4", color = "honeydew3") +
  geom_sf_interactive(data = st_sfc(st_point(c(-81.69387, 41.49824)), crs = 4326),
                      aes(tooltip = "Terminal Tower"),
                      size = 3, color = "firebrick") +
  geom_sf_text_interactive(data = oh_places, aes(label = NAME),
                           check_overlap = TRUE, size = 2) +
  scale_x_continuous(limits = c(-82.0, -81.3)) +
  scale_y_continuous(limits = c(41.25, 41.65)) +
  theme(
    panel.background = element_rect(fill = "skyblue"),
    panel.grid = element_blank(),
    axis.text = element_blank()
  ) +
  labs(
    x = NULL, y = NULL,
    title = glue("Cuyahoga County, Cleveland, and SPAs"),
    subtitle = "SPAs in dark sage. Terminal Tower in center of downtown in red."
  )
girafe(ggobj = p)