Cleveland’s Changing Population

Exploring the US Census with R

Published

February 7, 2025

The resurgence of people moving to downtown Cleveland is making news.[1] According to a study commissioned by Downtown Cleveland Inc., the downtown population was almost 19,000 in the 2020 census, a 22% increase from 2010.[2] However, Cleveland Open Data shows only 13,000.[3] Cleveland Scene reports that there are lots of estimates out there, one as low as 8,000![4] What gives? The organizations may be using different sources, like the decennial US census versus the more recent but less comprehensive American Community Survey. But it seems more likely they are using different geographic boundaries.

I was able to reproduce some estimates. My main tools to do this were the tidycensus R package for US Census data, and the Cleveland Open Data service for Cleveland neighborhood definitions. I’ll step through the process below.

Note

This is a work file / tutorial. Researching Cleveland’s population is mostly a toy project to experiment with R tools that work with APIs. This should come in handy for some future project. If you are not me, I hope this helps with whatever you’re doing. Otherwise, ‘hello, future me!’ You can find the source code and downloaded data on my GitHub page.

Defining “Downtown”

Cleveland extends from Cleveland Hopkins Airport on the west all the way to Euclid on the east. It’s mostly bounded on the south by I-480. Here is the map from the Cleveland Wikipedia page.

Screen capture from Cleveland article on Wikipedia.

The 2020 US decennial census counted 372K people in Cleveland.[5] That’s a decline from 397K in 2010. The 1-year American Community Survey (ACS) shows it is still falling, down to 363K in 2023.[6] But the decline is uneven, and parts of the city are actually growing, including the downtown area. There is no official definition of downtown, so we can make some choices. The Census Bureau provides the building blocks for a definition: over 15K census blocks in Cleveland, rolled up to around 200 census tracts.

Cleveland’s City Planning Commission (CPC) defines 34 neighborhoods for urban planning initiatives.[7] They are commonly referred to as Statistical (or Social) Planning Areas (SPAs). I pasted a PDF map from the CPC below. You can see there is an SPA actually named “Downtown”. It’s bounded by the Cuyahoga River and I-90. Cleveland Open Data has an interactive map that you can explore and download. I downloaded and extracted its shapefile to my local drive.

Screen capture from City Planning Commission 2010 Census pdf.

So that is one definition. A second one comes from a study by Urban Partners that was commissioned by Downtown Cleveland, Inc. in 2023. Page 3 of the PDF report (copy/pasted below) shows a Westside and a Downtown Core. Whereas the Downtown SPA had about 13.3K people in the 2020 census, this Downtown Core had 18.7K people. The main differences are that Urban Partners took a bite out of the Central neighborhood on the east side, and parts of the West Bank of the Flats in the Cuyahoga Valley and Ohio City neighborhoods on the west side.

Downtown Cleveland Market Study Report, p3. Urban Partners.

Blocks, Tracts, and Subdivisions

Let’s gather the materials to segment population estimates into these boundaries. Several R libraries make it easy to work with census data. The tidycensus package was developed to interface with the US Census Bureau APIs. It also returns feature geometries for spatial analysis. The tigris package works with the Census Bureau’s TIGER/Line shape files, and the sf (simple features) package performs spatial operations.

library(tidyverse)
library(glue)
library(scales)
library(gt)
library(ggiraph)  # interactive plots

library(tidycensus)
library(tigris) # TIGER/Line shapefiles
library(sf)  # simple features for spatial analysis

Let’s get the CPC’s definition of neighborhoods. I went to the City of Cleveland Open Data web site and navigated to their analysis of the 2020 US Census.[8] Their interactive map has five layers (screen capture below). The first is the shape file of the 34 neighborhoods (SPAs). The second file contains population data from the 2020 decennial census complete with census block, census tract, and SPA. I downloaded and unzipped the first two files. Now I have a way to map the SPA boundaries within Cleveland, and I have a mapping of census blocks to SPAs so I can join this to the US Census data.

Screen capture from Open Data

# Nice contiguous shape file. One record for each of the 34 SPAs.
cleve_neigh_0 <-
  st_read(file.path(
  "inputs/Cleveland Neighborhoods",
  "Neighborhood_Population_Change.shp"
))

# Cleveland populated blocks. Includes block, tract, and SPA name.
cleve_blocks <- st_read(file.path(
  "inputs/Cleveland Populated Blocks 2020",
  "Decennial_2020_Populated_Blocks_Cleveland_Only.shp"
)) |>
  select(-starts_with("P0"), -starts_with("H0"))

For 2020, cleve_blocks tells me exactly which blocks belong to each SPA. Block definitions change between censuses, though, so for 2000 and 2010 I’ll use the sf package to join the US Census Bureau geographies to the SPA shapes in cleve_neigh_0. The shape file may not be perfectly precise because I can’t quite match quoted population estimates for 2000 and 2010, but it’s close.

Load shape files using the tigris package to facilitate mapping. I’ll get the state, county boundaries, and a few cities. I also got the Terminal Tower coordinates from Google. I don’t want to abuse the US Census Bureau website and API, so I’ll set a flag to only download data while I’m developing this script. Once I have what I want, I’ll keep my data on my local drive and build my report.

USE_API <- FALSE
if (USE_API) {
  oh_state <- tigris::states(cb = TRUE) |> filter(STUSPS == "OH") 
  oh_counties <- tigris::counties(cb = TRUE) |> filter(STUSPS == "OH")
  oh_places <- tigris::places("OH", year = 2024)
  save(oh_state, oh_counties, oh_places, file = "tigris_shapes.Rdata")
} else {
  load("tigris_shapes.Rdata")
}
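
The flag above has to be flipped by hand. As a minimal sketch of an alternative, the hypothetical helper cache_rds() below (my own name, not part of tigris or tidycensus) runs a download only when its cached file is missing. The slow_download() function is a stand-in for a real tigris call.

```r
# Hypothetical helper: evaluate `expr` only when `path` does not exist yet,
# then reuse the saved result on later runs.
cache_rds <- function(path, expr) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- expr  # lazy evaluation: `expr` runs only on this branch
  saveRDS(result, path)
  result
}

# Stand-in for a slow download; in this post it would be a tigris call.
n_downloads <- 0
slow_download <- function() {
  n_downloads <<- n_downloads + 1
  data.frame(NAME = c("Cuyahoga", "Lake"))
}

cache_file <- tempfile(fileext = ".rds")
first  <- cache_rds(cache_file, slow_download())  # runs the download
second <- cache_rds(cache_file, slow_download())  # reads the cached file
n_downloads
```

The trade-off is one file per object instead of one .Rdata bundle, but deleting a single file is enough to force a refresh of just that object.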

Here is a plot of Cuyahoga County, Cleveland, and Cleveland’s 34 SPAs. Hover over the shapes to see their names. The red dot is Terminal Tower in the heart of downtown. Progressive Field is a few blocks away, and 7.0 miles from my home in Cleveland Heights.

p <-
  ggplot() +
  geom_sf_interactive(data = oh_counties, 
                      aes(tooltip = paste(NAME, "County")), 
                      fill = "honeydew", color = "honeydew3") +
  geom_sf_interactive(data = oh_places, 
                      aes(tooltip = NAME), fill = "honeydew3", color = "honeydew") +
  geom_sf(data = oh_counties |> filter(NAME == "Cuyahoga"), 
          fill = NA, color = "honeydew4", linewidth = 1) +
  geom_sf_interactive(data = cleve_neigh_0,
                      aes(tooltip = SPA_NAME),
                      fill = "honeydew4", color = "honeydew3") +
  geom_sf_interactive(data = st_sfc(st_point(c(-81.69387, 41.49824)), crs = 4326), 
                      aes(tooltip = "Terminal Tower"),
                      size = 3, color = "firebrick") +
  geom_sf_text_interactive(data = oh_places, aes(label = NAME), 
                           check_overlap = TRUE, size = 2) +
  scale_x_continuous(limits = c(-82.0, -81.3)) +
  scale_y_continuous(limits = c(41.25, 41.65)) +
  theme(
    panel.background = element_rect(fill = "skyblue"),
    panel.grid = element_blank(),
    axis.text = element_blank()
  ) +
  labs(
    x = NULL, y = NULL, 
    title = glue("Cuyahoga County, Cleveland, and SPAs"),
    subtitle = "SPAs in dark sage. Terminal Tower in center of downtown in red."
  )

girafe(ggobj = p)

Census Data

The Census Bureau API allows you to select multiple variables from a single census file. There are a few files for each census, and the variable names change. I want the Cleveland area population in 2000, 2010, 2020, and the American Community Survey (ACS) 1-year estimate from 2023 (most recent). So despite the handiness of the tidycensus package, data collection is still going to be a bit tedious.

The decennial census developer page lists the accessible datasets: 2000, 2010, and 2020. You need an API key from the Bureau before you can do anything. This is quick and easy: just click the “Request a KEY” tile in the menu at the left. The Census Bureau emails you a key. Best practice is to save the key in an .Renviron file.

usethis::edit_r_environ(scope = "project")

This opens (or creates) a .Renviron file in your project root. Add your key. The name is important: CENSUS_API_KEY. The tidycensus functions read that environment variable if you don’t supply a key explicitly in the function call. Set it like this:

CENSUS_API_KEY="abc123"
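
R reads .Renviron at startup, so a key added mid-session is not visible until R restarts or the file is re-read. A small sketch for checking this before calling the API follows. The function name census_key_is_set() is my own and is not part of tidycensus.

```r
# Re-read the project .Renviron (if present) without restarting R.
if (file.exists(".Renviron")) readRenviron(".Renviron")

# Hypothetical helper: TRUE when the variable tidycensus looks for is non-empty.
census_key_is_set <- function() {
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

census_key_is_set()
```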

Now you can pull census data. I’ll start with 2020.

2020

The decennial census developer page has several data files for each census. Through trial and error, I discovered Redistricting Data (PL 94-171) contains overall population. There is a full list of variables that represent the various sub-groups of the population. I used it and the tidycensus::load_variables() function to identify the ones I want. I’ll include race/ethnicity to investigate demographic trends.

pl_2020_vars <-
  tidycensus::load_variables(2020, "pl") |>
  filter(
    between(name, "P2_001N", "P2_011N"),
    !name %in% c("P2_003N", "P2_004N")
  )

Here they are after a bit of cleaning. I created var_level to group several infrequent values into “Other”.

pl_2020_vars <- 
  pl_2020_vars |>
  mutate(
    var_detail = case_when(
      str_detect(label, "White") ~ "White",
      str_detect(label, "Black") ~ "Black",
      str_detect(label, "Asian") ~ "Asian",
      str_detect(label, "American Indian") ~ "American Indian",
      str_detect(label, "Native Hawaiian") ~ "Pacific Islander",
      str_detect(label, "Some Other Race") ~ "Some Other Race",
      str_detect(label, "two or more races") ~ "Two or more races",
      str_detect(label, "Hispanic") ~ "Hispanic",
      str_detect(label, "Total") ~ "Total",
      TRUE ~ label
    ),
    var_group = if_else(name == "P2_001N", "Total", "Race/ethnicity"),
    var_level = if_else(
      var_detail %in% c("White", "Black", "Hispanic", "Asian", "Total"),
      var_detail, "Other")
  ) |>
  select(variable = name, var_group, var_level, var_detail)

pl_2020_vars
# A tibble: 9 × 4
  variable var_group      var_level var_detail       
  <chr>    <chr>          <chr>     <chr>            
1 P2_001N  Total          Total     Total            
2 P2_002N  Race/ethnicity Hispanic  Hispanic         
3 P2_005N  Race/ethnicity White     White            
4 P2_006N  Race/ethnicity Black     Black            
5 P2_007N  Race/ethnicity Other     American Indian  
6 P2_008N  Race/ethnicity Asian     Asian            
7 P2_009N  Race/ethnicity Other     Pacific Islander 
8 P2_010N  Race/ethnicity Other     Some Other Race  
9 P2_011N  Race/ethnicity Other     Two or more races

The Demographic Profile contains age data.

dp_2020_vars <- 
  tidycensus::load_variables(2020, "dp") |>
  filter(
    str_detect(label, "Count!!SEX AND AGE!!Total population"),
    !str_detect(label, "Selected Age Categories"),
    name != "DP1_0001C"
  )

I’ll ignore sex and aggregate the ages into roughly ten-year buckets.

dp_2020_vars <- 
  dp_2020_vars |>
  mutate(
    var_detail = str_remove_all(label, "(Count!!SEX AND AGE!!Total population)|(!!)"),
    var_detail = if_else(var_detail == "", "Total", var_detail),
    var_group = "Age",
    var_level = case_when(
      name <= "DP1_0004C" ~ "Under 15 yrs",
      name <= "DP1_0006C" ~ "15 to 24 yrs",
      name <= "DP1_0008C" ~ "25 to 34 yrs",
      name <= "DP1_0010C" ~ "35 to 44 yrs",
      name <= "DP1_0012C" ~ "45 to 54 yrs",
      name <= "DP1_0014C" ~ "55 to 64 yrs",
      TRUE ~ "65+ yrs"
    )
  ) |>
  select(variable = name, var_group, var_level, var_detail)

dp_2020_vars
# A tibble: 18 × 4
   variable  var_group var_level    var_detail       
   <chr>     <chr>     <chr>        <chr>            
 1 DP1_0002C Age       Under 15 yrs Under 5 years    
 2 DP1_0003C Age       Under 15 yrs 5 to 9 years     
 3 DP1_0004C Age       Under 15 yrs 10 to 14 years   
 4 DP1_0005C Age       15 to 24 yrs 15 to 19 years   
 5 DP1_0006C Age       15 to 24 yrs 20 to 24 years   
 6 DP1_0007C Age       25 to 34 yrs 25 to 29 years   
 7 DP1_0008C Age       25 to 34 yrs 30 to 34 years   
 8 DP1_0009C Age       35 to 44 yrs 35 to 39 years   
 9 DP1_0010C Age       35 to 44 yrs 40 to 44 years   
10 DP1_0011C Age       45 to 54 yrs 45 to 49 years   
11 DP1_0012C Age       45 to 54 yrs 50 to 54 years   
12 DP1_0013C Age       55 to 64 yrs 55 to 59 years   
13 DP1_0014C Age       55 to 64 yrs 60 to 64 years   
14 DP1_0015C Age       65+ yrs      65 to 69 years   
15 DP1_0016C Age       65+ yrs      70 to 74 years   
16 DP1_0017C Age       65+ yrs      75 to 79 years   
17 DP1_0018C Age       65+ yrs      80 to 84 years   
18 DP1_0019C Age       65+ yrs      85 years and over
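
The bucketing above depends on the variable codes being ordered by age. A sketch of an alternative, under the assumption that every age label begins with its lower bound (or with “Under”), parses the label instead. The helper name age_bucket() is hypothetical.

```r
# Map a Census age label to a bucket by parsing its lower bound.
age_bucket <- function(label) {
  # "Under 5 years" starts at 0; otherwise take the first number in the label.
  lower <- ifelse(
    grepl("^Under", label),
    0,
    as.numeric(sub("^[^0-9]*([0-9]+).*$", "\\1", label))
  )
  cut(
    lower,
    breaks = c(0, 15, 25, 35, 45, 55, 65, Inf),
    labels = c("Under 15 yrs", "15 to 24 yrs", "25 to 34 yrs", "35 to 44 yrs",
               "45 to 54 yrs", "55 to 64 yrs", "65+ yrs"),
    right = FALSE  # intervals are [lower, upper)
  )
}

labels <- c("Under 5 years", "10 to 14 years", "15 to 17 years",
            "62 to 64 years", "65 and 66 years", "85 years and over")
data.frame(label = labels, bucket = age_bucket(labels))
```

Because it reads the label text, the same function also handles the narrower age bands that older summary files use, such as “15 to 17 years”.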

With the variables identified, request the data from the API. Cleveland is one of 59 subdivisions within Cuyahoga County. I’ll download the subdivision data to get a total count for Cuyahoga County and for Cleveland. Counties are composed of census tracts, and census tracts are composed of census blocks. Cities overlap census tracts, so I’ll download the block-level data and join to the cleve_blocks dataset from Cleveland Open Data. Urban Partners defined their Downtown Core and Westside areas by tract and some blocks from the Central SPA. I figured out which tracts and blocks by studying their map and swearing a lot.

# Utility function to create factors
var_relevel <- function(x) {
  race_levels <- c("Black", "White", "Hispanic", "Asian", "Other", "Total")
  # Move the race/ethnicity levels to the end and the youngest age band to the
  # front. fct_relevel() warns about any level that is absent from x.
  x <- fct_relevel(x, race_levels, after = Inf)
  x <- fct_relevel(x, "Under 15 yrs", after = 0)
  return(x)
}

# Urban Partners' definition of greater downtown uses tracts and blocks.
westside_tracts_2020 <- c(
  "103100", "103400", "103500", "103602", "103800", "103900", "104100", 
  "104200", "104300", "197800", "197700", "197500", "104400"
)
downtown_core_tracts_2020 <- c("103300", "107101", "107701", "107802", "109301")
downtown_core_blocks_2020 <- paste0("39035108701", c("2001", "2004", "2006", "2008"))

if (USE_API) {

  subdiv_2020_pl_detail <- 
    get_decennial( 
      geography = "county subdivision",
      sumfile = "pl",
      variables = pl_2020_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2020
    ) |> 
    inner_join(pl_2020_vars, by = "variable")
  
  subdiv_2020_dp_detail <- 
    get_decennial( 
      geography = "county subdivision",
      sumfile = "dp",
      variables = dp_2020_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2020
    ) |> 
    inner_join(dp_2020_vars, by = "variable")
  
  block_2020_pl_detail <- 
    get_decennial( 
      geography = "block",
      sumfile = "pl",
      variables = pl_2020_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2020
    ) |> 
    inner_join(pl_2020_vars, by = "variable")
  
  # dp is not available at the block level
  # block_2020_dp_detail <- 
  #   get_decennial( 
  #     geography = "block",
  #     sumfile = "dp",
  #     variables = dp_2020_vars$variable,
  #     state = "OH",
  #     county = "Cuyahoga",
  #     geometry = TRUE, 
  #     year = 2020
  #   ) |> 
  #   inner_join(dp_2020_vars, by = "variable")

  # The "_detail" objects include the original levels for race/ethnicity and age.
   
  subdiv_2020_detail <- 
    bind_rows(subdiv_2020_pl_detail, subdiv_2020_dp_detail) |>
    mutate(
      var_level = var_relevel(var_level),
      NAME = str_remove_all(NAME, "(, Cuyahoga County, Ohio)|(village)|(city)"),
      NAME = str_trim(NAME)
    ) |>
    select(-variable) |>
    rename(geo_name = NAME) |>
    relocate(value, .after = var_detail)
           
  block_2020_detail <-
    block_2020_pl_detail |>
    inner_join(
      cleve_blocks |> as_tibble() |> select(GEOID20, SPA = SPA_NAME), 
      by = c("GEOID" = "GEOID20")
    ) |>
    mutate(
      var_level = var_relevel(var_level),
      greater_downtown = case_when(
        str_sub(GEOID, 6, 11) %in% westside_tracts_2020 ~ "Westside",
        str_sub(GEOID, 6, 11) %in% downtown_core_tracts_2020 ~ "Downtown Core",
        GEOID %in% downtown_core_blocks_2020 ~ "Downtown Core",
        TRUE ~ "Other"
      ),
      SPA = factor(str_to_title(SPA)),
      SPA = fct_relevel(SPA, "Downtown", after = 0),
      greater_downtown = factor(
        greater_downtown, levels = c("Downtown Core", "Westside", "Other"))
    ) |>
    select(-variable) |>
    rename(geo_name = NAME) |>
    relocate(value, .after = var_detail)

  # The non-_detail objects summarize by my grouped race/ethnicity and age.
  
  subdiv_2020 <- 
    subdiv_2020_detail |>
    summarize(
      .by = c(GEOID, geo_name, geometry, var_group, var_level),
      value = sum(value)
    )
  
  block_2020 <- 
    block_2020_detail |>
    summarize(
      .by = c(GEOID, geo_name, geometry, var_group, var_level, SPA, greater_downtown),
      value = sum(value)
    )
  
  save(
    subdiv_2020_detail, block_2020_detail, subdiv_2020, block_2020, 
    file = "decennial_2020.Rdata"
  )

} else {
  
  load("decennial_2020.Rdata")
  
}
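
The tract test in the code above, str_sub(GEOID, 6, 11), relies on the layout of a block GEOID: 2 characters for the state, 3 for the county, 6 for the tract, and 4 for the block. A base-R sketch of that layout follows. The helper name split_block_geoid() is hypothetical, and the second GEOID is a made-up block used only for illustration.

```r
# Split a 15-character census block GEOID into its components:
# state (2) + county (3) + tract (6) + block (4).
split_block_geoid <- function(geoid) {
  stopifnot(all(nchar(geoid) == 15))
  data.frame(
    geoid  = geoid,
    state  = substr(geoid, 1, 2),
    county = substr(geoid, 3, 5),
    tract  = substr(geoid, 6, 11),
    block  = substr(geoid, 12, 15)
  )
}

# First ID: one of the Downtown Core blocks listed above.
# Second ID: a made-up block in tract 103300, for illustration only.
parts <- split_block_geoid(c("390351087012001", "390351033001000"))
parts
```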

The data is sliced three ways in the table below. The top section is the subdivision data: Cuyahoga County has about 1.26 million people, 372,624 of them in Cleveland. The second section groups the block-level data by SPA. The Downtown SPA had 13,302 people, which matches the data table on Cleveland Open Data. The third section uses the Urban Partners boundaries: their Downtown Core, which includes portions of the Central, Ohio City, and Cuyahoga Valley SPAs, had 18,708 people.

The map on the second tab shows Cuyahoga County and all of its subdivisions. Cleveland is the largest, and each of its SPAs is broken out. The Downtown SPA is highlighted. The Urban Partners extensions to downtown aren’t shown.

2020 Population Estimates for Cleveland and Vicinity

Area                       Population
Cuyahoga County Subdivisions
  Cleveland                   372,624
  Other                       892,193
  Total                     1,264,817
Cleveland Neighborhoods
  Downtown                     13,302
  Bellaire-Puritas             13,823
  Broadway-Slavic Village      19,022
  Brooklyn Centre               8,315
  Buckeye Shaker               11,419
  Central                      11,955
  Clark Fulton                  7,625
  Collinwood Nottingham         9,616
  Cudell                        9,115
  Cuyahoga Valley               1,293
  Detroit-Shoreway             11,326
  Edgewater                     6,000
  Euclid Green                  5,051
  Fairfax                       5,167
  Glenville                    21,137
  Goodrich-Kirtland Park        3,955
  Hopkins                         534
  Hough                         9,702
  Jefferson                    17,351
  Kamms Corners                24,312
  Kinsman                       5,876
  Lee-Harvard                   9,770
  Lee-Seville                   4,171
  Mount Pleasant               14,015
  North Shore Collinwood       14,928
  Ohio City                     9,219
  Old Brooklyn                 32,315
  Saint Clair-Superior          5,139
  Stockyards                    9,522
  Tremont                       7,798
  Union-Miles Park             15,625
  University Circle             9,620
  West Boulevard               18,981
  Woodland Hills                5,625
  Total                       372,624
Greater Downtown
  Downtown Core                18,708
  Westside                     18,407
  Other                       335,509
  Total                       372,624

2010

Unfortunately, pulling 2010 and 2000 isn’t as simple as changing the year parameter in the API calls because they use a different file, Summary File 1.
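
Since the pulls differ mainly in year, summary file, and variable list, one way to contain the repetition is a thin wrapper. This is only a sketch: get_cuyahoga() is a hypothetical name, not a tidycensus function, and it assumes the variable lookup table has a `variable` column like the ones built in this post.

```r
# Hypothetical wrapper (not part of tidycensus): one place for the arguments
# that stay fixed across censuses. `vars` is a lookup table with a `variable`
# column, like pl_2020_vars.
get_cuyahoga <- function(geography, vars, year, sumfile, geometry = TRUE) {
  census <- tidycensus::get_decennial(
    geography = geography,
    variables = vars$variable,
    year = year,
    sumfile = sumfile,
    state = "OH",
    county = "Cuyahoga",
    geometry = geometry
  )
  # Attach the cleaned-up group and level labels to each returned row.
  dplyr::inner_join(census, vars, by = "variable")
}

# Example calls (each one hits the Census API, so they are left commented out):
# get_cuyahoga("county subdivision", pl_2020_vars, 2020, "pl")
# get_cuyahoga("block", pl_2020_vars, 2020, "pl")
```

The per-year clean-up (renaming, releveling, summing) would still live outside the wrapper, because that is where the censuses genuinely differ.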

sf1_2010_vars <-
  tidycensus::load_variables(2010, "sf1") |>
  filter(
    concept %in% c("HISPANIC OR LATINO ORIGIN BY RACE", "SEX BY AGE"),
    !name %in% c("P005002", "P012001", "P012002", "P012026"),
    !str_detect(label, "Total!!Hispanic or Latino!!"),
    !str_detect(name, "^PCT012")
  )

I’ll prepare the variables the same way as with 2020. There are twice as many age variables this time because there is one variable for each sex.

sf1_2010_vars <-
  sf1_2010_vars |>
  mutate(
    label = str_remove(label, "(Total!!Male!!)|(Total!!Female!!)"),
    label = str_remove(label, "(Total!!Not Hispanic or Latino!!)"),
    var_detail = case_when(
      str_detect(label, "White") ~ "White",
      str_detect(label, "Black") ~ "Black",
      str_detect(label, "Asian") ~ "Asian",
      str_detect(label, "American Indian") ~ "American Indian",
      str_detect(label, "Native Hawaiian") ~ "Pacific Islander",
      str_detect(label, "Some Other Race") ~ "Other",
      str_detect(label, "Two or More Races") ~ "Two or more races",
      str_detect(label, "Hispanic") ~ "Hispanic",
      str_detect(label, "Total") ~ "Total",
      TRUE ~ label,
    ),
    var_group = case_when(
      name == "P005001" ~ "Total",
      between(name, "P005003", "P005010") ~ "Race/ethnicity",
      TRUE ~ "Age"
    ),
    var_level = case_when(
      var_detail %in% c("White", "Black", "Hispanic", "Asian", "Total", "Other") ~ var_detail,
      var_detail %in% c("American Indian", "Pacific Islander", "Two or more races") ~ "Other",
      var_detail %in% c("Under 5 years", "5 to 9 years", "10 to 14 years") ~ "Under 15 yrs",
      between(var_detail, "15 to 17 years", "22 to 24 years") ~ "15 to 24 yrs",
      var_detail %in% c("25 to 29 years", "30 to 34 years") ~ "25 to 34 yrs",
      var_detail %in% c("35 to 39 years", "40 to 44 years") ~ "35 to 44 yrs",
      var_detail %in% c("45 to 49 years", "50 to 54 years") ~ "45 to 54 yrs",
      between(var_detail, "55 to 59 years", "62 to 64 years") ~ "55 to 64 yrs",
      between(var_detail, "65 and 66 years", "85 years and over") ~ "65+ yrs"
    )
  ) |>
  select(variable = name, var_group, var_level, var_detail)

sf1_2010_vars
# A tibble: 55 × 4
   variable var_group      var_level    var_detail       
   <chr>    <chr>          <chr>        <chr>            
 1 P005001  Total          Total        Total            
 2 P005003  Race/ethnicity White        White            
 3 P005004  Race/ethnicity Black        Black            
 4 P005005  Race/ethnicity Other        American Indian  
 5 P005006  Race/ethnicity Asian        Asian            
 6 P005007  Race/ethnicity Other        Pacific Islander 
 7 P005008  Race/ethnicity Other        Other            
 8 P005009  Race/ethnicity Other        Two or more races
 9 P005010  Race/ethnicity Hispanic     Hispanic         
10 P012003  Age            Under 15 yrs Under 5 years    
# ℹ 45 more rows
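
Because each age band now appears twice, once per sex, the values have to be summed after the labels are cleaned. A base-R sketch of that roll-up on made-up numbers (the GEOID and counts are illustrative, not real census values):

```r
# Toy data: one block, two age bands, one row per sex (values are made up).
block_rows <- data.frame(
  GEOID      = "390351033001000",
  var_level  = c("Under 15 yrs", "Under 15 yrs", "15 to 24 yrs", "15 to 24 yrs"),
  var_detail = c("Under 5 years", "Under 5 years", "15 to 17 years", "15 to 17 years"),
  value      = c(3, 4, 10, 12)
)

# Group by every column except value, which merges the male and female rows.
block_totals <- aggregate(
  value ~ GEOID + var_level + var_detail,
  data = block_rows,
  FUN = sum
)
block_totals
```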

Request the data from the API. This time I cannot join to cleve_blocks to get precise mappings of census blocks to SPAs. Instead, I’ll spatially join the block centroids to the SPA shapes in the cleve_neigh shape file. This turns out to be almost as good, but not perfect.

# Urban Partners' definition of greater downtown uses tracts and blocks.
westside_tracts_2010 <- westside_tracts_2020
downtown_core_tracts_2010 <- downtown_core_tracts_2020
downtown_core_blocks_2010 <- paste0("39035108701", c("3000", "3001", "3002", "3003", "3004"))

if (USE_API) {

  subdiv_2010_detail_0 <- 
    get_decennial( 
      geography = "county subdivision",
      sumfile = "sf1",
      variables = sf1_2010_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2010
    ) |> 
    inner_join(sf1_2010_vars, by = "variable")

  subdiv_2010_detail_1 <- 
    subdiv_2010_detail_0 |>
    mutate(
      var_level = var_relevel(var_level),
      NAME = str_remove_all(NAME, "(, Cuyahoga County, Ohio)|(village)|(city)"),
      NAME = str_trim(NAME)
    ) |>
    rename(geo_name = NAME)
  
  # Sum the two rows (for each sex)
  subdiv_2010_detail <-
    subdiv_2010_detail_1 |> 
    select(-variable) |>
    summarize(.by = -c(value), value = sum(value))
           
  block_2010_detail_0 <- 
    get_decennial( 
      geography = "block",
      sumfile = "sf1",
      variables = sf1_2010_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2010
    ) |> 
    inner_join(sf1_2010_vars, by = "variable")

  block_2010_detail_1 <-
    st_join(cleve_neigh, st_centroid(block_2010_detail_0), join = st_contains) |>
    mutate(
      var_level = var_relevel(var_level),
      greater_downtown = case_when(
        str_sub(GEOID, 6, 11) %in% westside_tracts_2010 ~ "Westside",
        str_sub(GEOID, 6, 11) %in% downtown_core_tracts_2010 ~ "Downtown Core",
        GEOID %in% downtown_core_blocks_2010 ~ "Downtown Core",
        TRUE ~ "Other"
      ),
      SPA = factor(str_to_title(SPA)),
      SPA = fct_relevel(SPA, "Downtown", after = 0),
      greater_downtown = factor(
        greater_downtown, levels = c("Downtown Core", "Westside", "Other"))
    ) |>
    select(GEOID, geo_name = NAME, geometry, var_group, var_level, var_detail, 
           value, SPA, greater_downtown)

  # Sum the two rows (for each sex)
  block_2010_detail <-
    block_2010_detail_1 |> 
    summarize(.by = -c(value), value = sum(value))
  
  subdiv_2010 <- 
    subdiv_2010_detail |>
    summarize(
      .by = c(GEOID, geo_name, geometry, var_group, var_level),
      value = sum(value)
    )
  
  block_2010 <- 
    block_2010_detail |>
    summarize(
      .by = c(GEOID, geo_name, geometry, var_group, var_level, SPA, greater_downtown),
      value = sum(value)
    )
  
  save(
    subdiv_2010_detail, block_2010_detail, subdiv_2010, block_2010, 
    file = "decennial_2010.Rdata"
  )

} else {
  
  load("decennial_2010.Rdata")
  
}

This time the sum of the neighborhoods, 395,601, doesn’t quite equal the city total, 396,815. There must be city blocks whose centers are not captured in the shapes in cleve_neigh. Comparing my values to those in the data table on Cleveland Open Data, the largest differences are in Euclid Green, Kamm’s Corners, and Hopkins. I haven’t thought of a good way to fix this, so I’m settling for “close enough”.
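
A toy sketch of how a centroid join can drop population, using made-up squares rather than the real shapes. Unlike the code above, it joins from the block side (st_within with the default left join) so that unmatched blocks stay visible as NA instead of disappearing.

```r
library(sf)

# Build a square polygon from its lower-left corner and side length.
square <- function(xmin, ymin, size) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmin + size, ymin), c(xmin + size, ymin + size),
    c(xmin, ymin + size), c(xmin, ymin)
  )))
}

# Two made-up neighborhoods and three made-up blocks in planar coordinates.
hoods <- st_sf(
  SPA = c("A", "B"),
  geometry = st_sfc(square(0, 0, 2), square(2, 0, 2))
)
blocks <- st_sf(
  GEOID = c("b1", "b2", "b3"),
  value = c(10, 20, 5),
  geometry = st_sfc(square(0.5, 0.5, 1), square(1.5, 0.5, 1.2), square(3.5, 0.5, 2))
)

# Replace each block polygon with its centroid, then look up the containing SPA.
centroids <- st_set_geometry(blocks, st_centroid(st_geometry(blocks)))
joined <- st_join(centroids, hoods, join = st_within)

# b3 overlaps neighborhood B, but its centroid lies outside, so it gets no SPA.
st_drop_geometry(joined)
lost <- sum(joined$value[is.na(joined$SPA)])  # population lost to the join
lost
```

Listing the rows with a missing SPA in the real data would show which blocks account for the gap between the neighborhood sum and the city total.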

The Downtown SPA population of 9,464 does match the value reported in Cleveland Open Data. It was quite a bit lower than in 2020 (13,302). My Downtown Core population of 15,156 is slightly different from Urban Partners’ value of 15,330. The Westside population does match though.

2010 Population Estimates for Cleveland and Vicinity

Area                       Population
Cuyahoga County Subdivisions
  Cleveland                   396,815
  Other                       883,307
  Total                     1,280,122
Cleveland Neighborhoods
  Downtown                      9,464
  Bellaire-Puritas             13,380
  Broadway-Slavic Village      22,331
  Brooklyn Centre               8,948
  Buckeye Shaker               12,470
  Central                      12,306
  Clark Fulton                  8,509
  Collinwood Nottingham        11,542
  Cudell                        9,295
  Cuyahoga Valley               1,378
  Detroit-Shoreway             11,577
  Edgewater                     5,851
  Euclid Green                  4,873
  Fairfax                       6,239
  Glenville                    27,394
  Goodrich-Kirtland Park        4,238
  Hopkins                         646
  Hough                        11,490
  Jefferson                    16,548
  Kamms Corners                24,097
  Kinsman                       6,966
  Lee-Harvard                  10,326
  Lee-Seville                   4,477
  Mount Pleasant               17,320
  North Shore Collinwood       15,768
  Ohio City                     8,396
  Old Brooklyn                 32,009
  Saint Clair-Superior          6,876
  Stockyards                   10,411
  Tremont                       7,975
  Union-Miles Park             19,004
  University Circle             7,939
  West Boulevard               18,880
  Woodland Hills                6,678
  Total                       395,601
Greater Downtown
  Downtown Core                15,156
  Westside                     18,433
  Other                       362,012
  Total                       395,601

2000

2000 is similar to 2010 in that it uses Summary File 1.

sf1_2000_vars <-
  tidycensus::load_variables(2000, "sf1") |>
  filter(
    concept %in% c(
      "HISPANIC OR LATINO, AND NOT HISPANIC OR LATINO BY RACE [73]", 
      "SEX BY AGE [49]"
    ),
    !name %in% c("P004003", "P004004", "P012001", "P012002", "P012026"),
    !str_detect(label, "Population of two or more races!!"),
    !str_detect(name, "^PCT013")
  )

Same process: prepare the variables.

sf1_2000_vars <-
  sf1_2000_vars |>
  mutate(
    label = str_remove(label, "(Total!!Male!!)|(Total!!Female!!)"),
    label = str_remove(label, "(Total!!Not Hispanic or Latino!!)"),
    var_detail = case_when(
      str_detect(label, "White") ~ "White",
      str_detect(label, "Black") ~ "Black",
      str_detect(label, "Asian") ~ "Asian",
      str_detect(label, "American Indian") ~ "American Indian",
      str_detect(label, "Native Hawaiian") ~ "Pacific Islander",
      str_detect(label, "Some other race") ~ "Other",
      str_detect(label, "two or more races") ~ "Two or more races",
      str_detect(label, "Hispanic") ~ "Hispanic",
      str_detect(label, "Total") ~ "Total",
      TRUE ~ label,
    ),
    var_group = case_when(
      name == "P004001" ~ "Total",
      between(name, "P004002", "P004011") ~ "Race/ethnicity",
      TRUE ~ "Age"
    ),
    var_level = case_when(
      var_detail %in% c("White", "Black", "Hispanic", "Asian", "Total", "Other") ~ var_detail,
      var_detail %in% c("American Indian", "Pacific Islander", "Two or more races") ~ "Other",
      var_detail %in% c("Under 5 years", "5 to 9 years", "10 to 14 years") ~ "Under 15 yrs",
      between(var_detail, "15 to 17 years", "22 to 24 years") ~ "15 to 24 yrs",
      var_detail %in% c("25 to 29 years", "30 to 34 years") ~ "25 to 34 yrs",
      var_detail %in% c("35 to 39 years", "40 to 44 years") ~ "35 to 44 yrs",
      var_detail %in% c("45 to 49 years", "50 to 54 years") ~ "45 to 54 yrs",
      between(var_detail, "55 to 59 years", "62 to 64 years") ~ "55 to 64 yrs",
      between(var_detail, "65 and 66 years", "85 years and over") ~ "65+ yrs"
    )
  ) |>
  select(variable = name, var_group, var_level, var_detail)

sf1_2000_vars
# A tibble: 55 × 4
   variable var_group      var_level    var_detail       
   <chr>    <chr>          <chr>        <chr>            
 1 P004001  Total          Total        Total            
 2 P004002  Race/ethnicity Hispanic     Hispanic         
 3 P004005  Race/ethnicity White        White            
 4 P004006  Race/ethnicity Black        Black            
 5 P004007  Race/ethnicity Other        American Indian  
 6 P004008  Race/ethnicity Asian        Asian            
 7 P004009  Race/ethnicity Other        Pacific Islander 
 8 P004010  Race/ethnicity Other        Other            
 9 P004011  Race/ethnicity Other        Two or more races
10 P012003  Age            Under 15 yrs Under 5 years    
# ℹ 45 more rows

Request the data from the API. I’ll use the cleve_neigh shape file again to identify the SPAs. Tract and block identifiers can change from census to census, so I had to make some changes to the Downtown Core definition. I used the same block identifiers as in 2010 for Urban Partners’ definitions, but they did not include 2000 in their report, so I’m not sure how much this differs.

# Westside tract definitions are the same for 2000.
westside_tracts_2000 <- westside_tracts_2020

downtown_core_tracts_2000 <- c(
  "107100", "107200", "107300", "107400", "107500", "107600", "107700",
  "107800", "107900", "109200")

downtown_core_blocks_2000 <- 
  paste0("39035108701", c("3000", "3001", "3002", "3003", "3004"))

if (USE_API) {

  subdiv_2000_detail_0 <- 
    get_decennial( 
      geography = "county subdivision",
      sumfile = "sf1",
      variables = sf1_2000_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = FALSE, # no county subdivision geography in 2000
      year = 2000
    ) |> 
    inner_join(sf1_2000_vars, by = "variable") 
  
  subdiv_2000_detail_1 <- 
    subdiv_2000_detail_0 |>
    mutate(
      var_level = var_relevel(var_level),
      NAME = str_remove_all(NAME, "(, Cuyahoga County, Ohio)|(village)|(city)"),
      NAME = str_trim(NAME)
    ) |>
    rename(geo_name = NAME)

  # Sum the two rows (for each sex)
  subdiv_2000_detail_2 <-
    subdiv_2000_detail_1 |> 
    select(-variable) |>
    summarize(.by = -c(value), value = sum(value))
           
  # No geometry for 2000? No problem! Use the 2010 geometry and replace the 
  # values with 2000.
  subdiv_2000_detail_3 <- 
    subdiv_2000_detail_2 |> 
    as_tibble() |> 
    select(GEOID, var_group, var_level, var_detail, value)
  
  subdiv_2000_detail <- 
    subdiv_2010_detail |>
    select(-value) |>
    inner_join(subdiv_2000_detail_3, 
               by = c("GEOID", "var_group", "var_level", "var_detail"))

  block_2000_detail_0 <- 
    get_decennial( 
      geography = "block",
      sumfile = "sf1",
      variables = sf1_2000_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2000
    ) |> 
    inner_join(sf1_2000_vars, by = "variable")
  
  block_2000_detail_1 <-
    st_join(cleve_neigh, st_centroid(block_2000_detail_0), join = st_contains) |>
    mutate(
      var_level = var_relevel(var_level),
      greater_downtown = case_when(
        str_sub(GEOID, 6, 11) %in% westside_tracts_2000 ~ "Westside",
        str_sub(GEOID, 6, 11) %in% downtown_core_tracts_2000 ~ "Downtown Core",
        GEOID %in% downtown_core_blocks_2000 ~ "Downtown Core",
        TRUE ~ "Other"
      ),
      SPA = factor(str_to_title(SPA)),
      SPA = fct_relevel(SPA, "Downtown", after = 0),
      greater_downtown = factor(
        greater_downtown, levels = c("Downtown Core", "Westside", "Other"))
    ) |>
    select(GEOID, geo_name = NAME, geometry, var_group, var_level, var_detail, 
           value, SPA, greater_downtown)
  
  # Sum the two rows (for each sex)
  block_2000_detail <-
    block_2000_detail_1 |> 
    summarize(.by = -c(value), value = sum(value))
  
  subdiv_2000 <- 
    subdiv_2000_detail |>
    summarize(
      .by = c(GEOID, geo_name, geometry, var_group, var_level),
      value = sum(value)
    )
  
  block_2000 <- 
    block_2000_detail |>
    summarize(
      .by = c(GEOID, geo_name, geometry, var_group, var_level, SPA, greater_downtown),
      value = sum(value)
    )
  
  save(
    subdiv_2000_detail, block_2000_detail, subdiv_2000, block_2000, 
    file = "decennial_2000.Rdata"
  )

} else {
  
  load("decennial_2000.Rdata")
  
}
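The `st_join()` step above assigns each block to the neighborhood that contains the block's centroid. Here is a toy sketch of that pattern with made-up squares instead of real census geometry. Everything in it (the `square()` and `block()` helpers, the coordinates, the values) is an assumption for illustration; only the `st_join(..., st_centroid(...), join = st_contains)` call mirrors the tutorial code.

```r
library(sf)

# Two made-up square "neighborhoods" side by side (planar coordinates, no CRS).
square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 2, 0), c(xmin + 2, 2), c(xmin, 2), c(xmin, 0)
  )))
}
neigh <- st_sf(
  SPA = c("A", "B"),
  geometry = st_sfc(square(0), square(2))
)

# Three made-up "blocks". The second straddles the A/B boundary at x = 2,
# but its centroid (x = 1.85) falls in A.
block <- function(xmin, xmax) {
  st_polygon(list(rbind(
    c(xmin, 0.5), c(xmax, 0.5), c(xmax, 1.5), c(xmin, 1.5), c(xmin, 0.5)
  )))
}
blocks <- st_sf(
  GEOID = c("b1", "b2", "b3"),
  value = c(10, 20, 30),
  geometry = st_sfc(block(0.5, 1.0), block(1.5, 2.2), block(3.0, 3.5))
)

# Neighborhoods on the left, block centroids on the right, matched where the
# neighborhood contains the centroid. st_centroid() warns about attributes
# being assumed constant over geometries, which is fine for counts here.
joined <- suppressWarnings(
  st_join(neigh, st_centroid(blocks), join = st_contains)
)

# Sum block values by neighborhood.
pop_by_spa <- aggregate(value ~ SPA, data = st_drop_geometry(joined), FUN = sum)
pop_by_spa
```

Because the neighborhoods are on the left side of the join, a block whose centroid falls outside every neighborhood polygon drops out. That may be one source of small differences between a neighborhood sum and a city total, although this sketch does not test that.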

As with 2010, the sum of the neighborhoods, 477,107, doesn’t quite match the city value, 478,403, but it is still pretty close. Wow, 478,403 people in 2000: that’s over 100K more than in 2020. On the other hand, only 6,310 people lived Downtown. The Downtown resurgence does not seem to be a recent phenomenon.
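A quick check of the arithmetic behind that paragraph, using only figures quoted in this post (the variable names are just for illustration):

```r
city_2000     <- 478403  # Cleveland, county subdivision total, 2000
neigh_2000    <- 477107  # sum of the neighborhood (SPA) totals, 2000
city_2020     <- 372624  # Cleveland, 2020 census
downtown_2000 <- 6310    # Downtown SPA, 2000

gap            <- city_2000 - neigh_2000                     # 1296
gap_pct        <- round(100 * gap / city_2000, 2)            # 0.27
decline        <- city_2000 - city_2020                      # 105779
downtown_share <- round(100 * downtown_2000 / neigh_2000, 1) # 1.3
```

So the neighborhood sum misses about a quarter of one percent of the city count, and Downtown held a little over 1% of the neighborhood total in 2000.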

2000 Population Estimates for Cleveland and Vicinity
Population
Cuyahoga County Subdivisions
Cleveland 478,403
Other 915,575
Total 1,393,978
Cleveland Neighborhoods
Downtown 6,310
Bellaire-Puritas 14,520
Broadway-Slavic Village 30,652
Brooklyn Centre 10,155
Buckeye Shaker 16,063
Central 11,568
Clark Fulton 10,672
Collinwood Nottingham 15,874
Cudell 10,630
Cuyahoga Valley 1,307
Detroit-Shoreway 13,917
Edgewater 6,360
Euclid Green 6,169
Fairfax 8,447
Glenville 39,941
Goodrich-Kirtland Park 4,580
Hopkins 338
Hough 14,734
Jefferson 18,266
Kamms Corners 25,256
Kinsman 10,256
Lee-Harvard 11,665
Lee-Seville 5,595
Mount Pleasant 24,013
North Shore Collinwood 18,346
Ohio City 8,726
Old Brooklyn 34,169
Saint Clair-Superior 11,534
Stockyards 12,076
Tremont 9,317
Union-Miles Park 26,539
University Circle 9,386
West Boulevard 20,492
Woodland Hills 9,234
Total 477,107
Greater Downtown
Downtown Core 8,412
Westside 15,154
Other 453,541
Total 477,107

2023 (ACS)

The American Community Survey (ACS) publishes both 1-year and 5-year estimates for 2023. The 1-year survey is the more current of the two, but it doesn’t have block-level data. I’ll download the subdivision file and check in on Cleveland as a whole.

acs1_2023_vars <-
  tidycensus::load_variables(2023, "acs1") |>
  filter(
    concept %in% c("Sex by Age", "Hispanic or Latino Origin by Race"),
    # between(name, "B01001_001E_001N", "P2_011N"),
    !name %in% c("B01001_002", "B01001_026", "B03002_001", "B03002_002",
                 "B03002_010", "B03002_011"),
    name <= "B03002_012"
  )

Apply the same variable prep as before.

acs1_2023_vars <-
  acs1_2023_vars |>
  mutate(
    label = str_remove_all(label, "(Estimate!!Total:!!)|(Male:!!)|(Female:!!)"),
    label = case_when(
      str_detect(label, "White") ~ "White",
      str_detect(label, "Black") ~ "Black",
      str_detect(label, "Asian") ~ "Asian",
      str_detect(label, "American Indian") ~ "American Indian",
      str_detect(label, "Native Hawaiian") ~ "Pacific Islander",
      str_detect(label, "Some other race") ~ "Other",
      str_detect(label, "Two or more races") ~ "Two or more races",
      str_detect(label, "Hispanic") ~ "Hispanic",
      str_detect(label, "Total") ~ "Total",
      TRUE ~ label,
    ),
    var_group = case_when(
      name == "B01001_001" ~ "Total",
      between(name, "B03002_003", "B03002_012") ~ "Race/ethnicity",
      TRUE ~ "Age"
    ),
    var_level = case_when(
      label %in% c("White", "Black", "Hispanic", "Asian", "Total", "Other") ~ label,
      label %in% c("American Indian", "Pacific Islander", "Two or more races") ~ "Other",
      label %in% c("Under 5 years", "5 to 9 years", "10 to 14 years") ~ "Under 15 yrs",
      between(label, "15 to 17 years", "22 to 24 years") ~ "15 to 24 yrs",
      label %in% c("25 to 29 years", "30 to 34 years") ~ "25 to 34 yrs",
      label %in% c("35 to 39 years", "40 to 44 years") ~ "35 to 44 yrs",
      label %in% c("45 to 49 years", "50 to 54 years") ~ "45 to 54 yrs",
      between(label, "55 to 59 years", "62 to 64 years") ~ "55 to 64 yrs",
      between(label, "65 and 66 years", "85 years and over") ~ "65+ yrs"
    )
  ) |>
  select(variable = name, label, var_group, var_level)

acs1_2023_vars
# A tibble: 55 × 4
   variable   label           var_group var_level   
   <chr>      <chr>           <chr>     <chr>       
 1 B01001_001 Total           Total     Total       
 2 B01001_003 Under 5 years   Age       Under 15 yrs
 3 B01001_004 5 to 9 years    Age       Under 15 yrs
 4 B01001_005 10 to 14 years  Age       Under 15 yrs
 5 B01001_006 15 to 17 years  Age       15 to 24 yrs
 6 B01001_007 18 and 19 years Age       15 to 24 yrs
 7 B01001_008 20 years        Age       15 to 24 yrs
 8 B01001_009 21 years        Age       15 to 24 yrs
 9 B01001_010 22 to 24 years  Age       15 to 24 yrs
10 B01001_011 25 to 29 years  Age       25 to 34 yrs
# ℹ 45 more rows
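The `between()` and `<=` filters on variable names work because the codes are fixed-width and zero-padded, so alphabetical order matches numeric order. Here is a small base R sketch of the idea, using a handful of codes picked for illustration (`B01001_049` and `B03002_013` are just examples of codes on either side of the cutoff):

```r
codes <- c("B01001_001", "B01001_003", "B01001_049",
           "B03002_003", "B03002_012", "B03002_013")

# Keep everything up to and including B03002_012 (drops B03002_013).
kept <- codes[codes <= "B03002_012"]

# Same grouping logic as the case_when() above, written with ifelse().
in_race_range <- kept >= "B03002_003" & kept <= "B03002_012"
var_group <- ifelse(
  kept == "B01001_001", "Total",
  ifelse(in_race_range, "Race/ethnicity", "Age")
)

data.frame(variable = kept, var_group = var_group)
```

R compares strings using the locale's collation, so this trick is safest with codes like these, which differ only in their digits.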

Request the data from the API.

if (USE_API) {

  subdiv_2023_detail_0 <- 
    get_acs( 
      geography = "county subdivision",
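      # Caution: get_acs() selects the dataset with `survey` (default "acs5");
      # `sumfile` is a get_decennial() argument and may be ignored here.
      sumfile = "acs1",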
      variables = acs1_2023_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = FALSE, # no geo file for ACS-1yr
      year = 2023
    ) |> 
    inner_join(acs1_2023_vars, by = "variable")
  
  subdiv_2023_detail_1 <- 
    subdiv_2023_detail_0 |>
    mutate(
      var_level = var_relevel(var_level),
      NAME = str_remove_all(NAME, "(, Cuyahoga County, Ohio)|(village)|(city)"),
      NAME = str_trim(NAME)
    ) |>
    rename(geo_name = NAME)

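  # Sum the two rows (for each sex). Drop moe first so that it does not
  # end up in the grouping and keep the male and female rows apart.
  subdiv_2023_detail <-
    subdiv_2023_detail_1 |> 
    select(-variable, -moe) |>
    summarize(.by = -c(estimate), value = sum(estimate))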

  subdiv_2023 <- 
    subdiv_2023_detail |>
    summarize(
      .by = c(GEOID, geo_name, var_group, var_level),
      value = sum(value)
    )
  
  save(subdiv_2023_detail, subdiv_2023, file = "acs1yr_2023.Rdata")

} else {
  
  load("acs1yr_2023.Rdata")
  
}
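One thing to watch when collapsing the male and female rows with `.by = -c(estimate)`: any column left in the grouping that differs between the two rows, such as the margin-of-error column that `get_acs()` returns, keeps them apart. Here is a minimal sketch with made-up rows; the `acs_like` tibble and its values are assumptions for illustration, not real ACS output.

```r
library(dplyr)

# Made-up rows shaped like get_acs() output after the variable join:
# one male row and one female row for the same age band.
acs_like <- tibble(
  GEOID     = "X",
  label     = "Under 5 years",
  var_level = "Under 15 yrs",
  estimate  = c(100, 90),
  moe       = c(12, 15)
)

# moe stays in the grouping, so the two rows are not combined.
kept_moe <- acs_like |>
  summarize(.by = -c(estimate), value = sum(estimate))

# Dropping moe first lets the rows collapse into one.
dropped_moe <- acs_like |>
  select(-moe) |>
  summarize(.by = -c(estimate), value = sum(estimate))

nrow(kept_moe)     # 2
nrow(dropped_moe)  # 1
```

A later roll-up by `var_level` still gives the right totals either way, since it sums whatever rows remain. If the margins of error matter, tidycensus provides `moe_sum()` for combining them rather than discarding them.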

Cleveland’s population has continued to decline, down to 367,523 from 372,624 in 2020.

2023 Population Estimates for Cleveland and Vicinity
Population
Cleveland 367,523
Other 881,895
Total 1,249,418

Race/ethnicity

Since 2000, Cleveland’s white and black populations have fallen, while its Asian, Hispanic, and other (including two or more races) populations have increased. In the rest of Cuyahoga County, the non-white populations have increased, while the white share has fallen from 80% in 2000 to 67% in 2023.

Cuyahoga Race/ethnicity Population Change
(population and share of the area total, by year)
                     2000             2010             2020             2023
Cleveland
  Black           241,512  50%     208,208  52%     176,813  47%     169,138  46%
  White           185,641  39%     132,710  33%     119,547  32%     124,183  34%
  Hispanic         34,728   7%      39,534  10%      48,699  13%      47,132  13%
  Asian             6,284   1%       7,213   2%      10,390   3%       8,356   2%
  Other            10,238   2%       9,150   2%      17,175   5%      18,714   5%
Other Parts of Cuyahoga County
  Black           137,885  15%     166,760  19%     188,356  21%     188,335  21%
  White           732,936  80%     653,267  74%     599,206  67%     588,395  67%
  Hispanic         12,350   1%      21,736   2%      34,628   4%      37,732   4%
  Asian            18,735   2%      25,402   3%      33,349   4%      31,916   4%
  Other            13,669   1%      16,142   2%      36,654   4%      35,517   4%
County Total
  Total         1,393,978        1,280,122        1,264,817        1,249,418
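The percentages in the table are each group's share of its area total for that year. Here is a short base R sketch that recomputes the 2000 and 2023 shares for the rest of Cuyahoga County from the counts in the table (the data frame and column names are just for illustration):

```r
other_county <- data.frame(
  group  = c("Black", "White", "Hispanic", "Asian", "Other"),
  n_2000 = c(137885, 732936, 12350, 18735, 13669),
  n_2023 = c(188335, 588395, 37732, 31916, 35517)
)

# Share of the area total, rounded to whole percentages as in the table.
other_county$pct_2000 <- round(100 * other_county$n_2000 / sum(other_county$n_2000))
other_county$pct_2023 <- round(100 * other_county$n_2023 / sum(other_county$n_2023))

other_county
```

The white share comes out at 80% for 2000 and 67% for 2023, matching the figures quoted above.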

The SPAs with historically high concentrations of black residents, like Glenville, Hough, and Mount Pleasant, have experienced the greatest population declines. The more integrated SPAs, like Downtown, Bellaire-Puritas, and Brooklyn Centre, have tended to retain their populations. These neighborhoods have become more diverse, with rising Hispanic, Asian, and other populations.

Age

Unfortunately, the 2020 census and 2023 ACS files used here do not provide block-level data for age. We can still make some inferences from the higher-level summaries. The one trend that stands out is the decline in children under 15 in Cuyahoga County, and especially in Cleveland. Older people aged 55 and up have held or even increased their numbers.

Cuyahoga Age Population Change
(population and share of the area total, by year)
                     2000             2010             2020             2023
Cleveland
  Under 15 yrs    117,101  24%      80,298  20%      67,636  18%      64,554  18%
  15 to 24 yrs     64,556  13%      61,044  15%      50,100  13%      48,566  13%
  25 to 34 yrs     71,847  15%      53,996  14%      62,334  17%      63,832  17%
  35 to 44 yrs     73,822  15%      49,555  12%      43,901  12%      44,705  12%
  45 to 54 yrs     55,111  12%      59,726  15%      42,857  12%      41,195  11%
  55 to 64 yrs     35,987   8%      44,700  11%      51,614  14%      49,383  13%
  65+ yrs          59,979  13%      47,496  12%      54,182  15%      55,288  15%
Other Parts of Cuyahoga County
  Under 15 yrs    174,502  19%     154,662  18%     143,357  16%     147,603  17%
  15 to 24 yrs    102,919  11%     107,421  12%     106,136  12%     100,684  11%
  25 to 34 yrs    117,026  13%     103,990  12%     116,982  13%     114,935  13%
  35 to 44 yrs    145,627  16%     109,318  12%     104,209  12%     107,483  12%
  45 to 54 yrs    132,490  14%     137,460  16%     107,217  12%     105,247  12%
  55 to 64 yrs     85,829   9%     119,411  14%     129,766  15%     123,563  14%
  65+ yrs         157,182  17%     151,045  17%     184,526  21%     182,380  21%
County Total
  Total         1,393,978        1,280,122        1,264,817        1,249,418
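To put numbers on the trend in children under 15, here is a tiny base R sketch of percent change from 2000 to 2023, using counts from the table. The `pct_change()` helper is a made-up name for illustration.

```r
# Hypothetical helper: percent change from old to new, to one decimal place.
pct_change <- function(old, new) round(100 * (new - old) / old, 1)

under15_cleveland <- pct_change(117101, 64554)   # -44.9
under15_other     <- pct_change(174502, 147603)  # -15.4
age55_64_cleve    <- pct_change(35987, 49383)    #  37.2
```

Cleveland lost about 45% of its under-15 population over the period, roughly three times the rate in the rest of the county, while its 55 to 64 group grew by more than a third.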

Footnotes

  1. “Opinion: Downtown Cleveland’s strategy to broaden appeal sees success”, Crain’s Cleveland Business. “Cleveland’s downtown population continues to surge”, Cleveland Fox 19 News.↩︎

  2. Downtown Cleveland Inc. commissioned a report, “Downtown Cleveland Market Study Report” (pdf), by the Urban Partners consulting firm. The report was released in Apr 2023. Figures are from Table 1: 15,330 people in 2010, 18,708 people in 2020 (22% increase).↩︎

  3. See the Downtown neighborhood (statistical planning area, SPA) in the data table.↩︎

  4. “There’s Still No Agreement on How Many Clevelanders Actually Live Downtown”, Cleveland Scene, Sep 17, 2024.↩︎

  5. Cleveland’s population plateaued around 1930 at 900K. The peak was 914K in the 1950 census. Between 1960 and 1980 the population declined by a third. The current population is slightly below the 1900 value. See Visual Cleveland at https://visual.clevelandhistory.org/census/.↩︎

  6. 362,670 +/- 62. https://data.census.gov/table/ACSST1Y2023.S0101?q=cleveland,%20oh↩︎

  7. Social Planning Areas (SPAs) were developed in the 1950s to coordinate social services at the neighborhood level. Learn more at the Encyclopedia of Cleveland History. Wikipedia has a nice explanation of how neighborhoods relate to Statistical (or social) Planning Areas.↩︎

  8. From https://data.clevelandohio.gov/, go to the Data Catalog and scroll to Census 2020 Analysis.↩︎