2026 Model data update #960

wrridgeway · 2025-12-30T17:54:59Z

No description provided.

Damonamajor · 2025-12-31T19:01:46Z

Since we now have both an Excel and a CSV file, we have to do some data manipulation in the raw file. This is especially true since neither dataset contains a year (aside from the title). It seems to make sense to do the small amount of data transformation in that file (renaming columns and adding the year) and to remove the cleaning file. Do you have thoughts on that?
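To make the proposal concrete, here is a minimal sketch of what "renaming columns and adding the year" in the raw step could look like. The function name `add_year_from_name`, the file name, and the column names are all hypothetical, and it uses base R only; the actual script may well do this differently.

```r
# Hypothetical helper: normalize column names and attach the year that
# only appears in the file's title
add_year_from_name <- function(df, file_name) {
  # Find the first four-digit year (19xx or 20xx) in the file name
  year <- regmatches(file_name, regexpr("(19|20)[0-9]{2}", file_name))
  if (length(year) != 1) {
    stop("No year found in file name: ", file_name)
  }
  # Lowercase the names and replace runs of non-alphanumerics with "_"
  names(df) <- tolower(gsub("[^A-Za-z0-9]+", "_", names(df)))
  df$year <- year
  df
}

# Made-up input standing in for one of the raw files
raw <- data.frame(
  `Community Area` = c("A", "B"),
  `Index Score` = c(1.5, 2.5),
  check.names = FALSE
)

cleaned <- add_year_from_name(raw, "example_scores_2025.csv")
```

Failing loudly when no year is found seems safer than silently writing rows without a year, but that is a design choice rather than something this PR prescribes.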

wrridgeway · 2026-01-14T21:46:22Z

etl/scripts-ccao-data-raw-us-east-1/spatial/spatial-environment-major_road.R

-      ) %>%
+      )
+
+    osm_roads <- osm_roads %>%


R kept crashing when I tried to run this entire process as a single pipe, so I split it into two steps to give it a break.

wrridgeway · 2026-01-14T21:46:42Z

etl/scripts-ccao-data-raw-us-east-1/spatial/spatial-environment-secondary_road.R

-      ) %>%
+      )
+
+    osm_roads <- osm_roads %>%


R kept crashing when I tried to run this entire process as a single pipe, so I split it into two steps to give it a break.

wrridgeway · 2026-01-14T21:47:55Z

etl/scripts-ccao-data-raw-us-east-1/spatial/spatial-parcel.R

 # Read privileges for this drive location are limited.
 # Contact Cook County GIS if permissions need to be changed.
-file_path <- "//10.122.19.14/ArchiveServices"
+file_path <- "//gisemcv1.ccounty.com/ArchiveServices"


Positron wasn't letting me connect to this data using the IP address. Writing out the hostname does work.

wrridgeway · 2026-01-14T21:49:49Z

etl/scripts-ccao-data-warehouse-us-east-1/ccao/ccao-land-land_nbhd_rate.R

+land_nbhd_rate_2026 <- openxlsx::read.xlsx(tmp_file_nbhd_rate_2026) %>%
+  set_names(snakecase::to_snake_case(names(.))) %>%
+  select(
+    town_nbhd = neighborhood_number,
+    `2026` = proposed_2026_class_two_rate
+  ) %>%
+  mutate(
+    town_nbhd = gsub("\\D", "", town_nbhd),
+    township_code = substr(town_nbhd, 1, 2),
+    township_name = ccao::town_convert(township_code)
+  ) %>%
+  relocate(c(township_code, township_name)) %>%
+  pivot_longer(
+    c(`2026`),
+    names_to = "year", values_to = "land_rate_per_sqft"
+  ) %>%
+  mutate(
+    across(c(township_code:year), as.character),
+    land_rate_per_sqft = parse_number(land_rate_per_sqft),
+    data_year = "2026"
+  ) %>%
+  expand_grid(class)
+
+


Since this is for the south tri, we want processing '26 data to look like processing '23 data rather than '24 or '25. There are no bifurcated rates in the south.

wrridgeway · 2026-01-14T21:56:25Z

etl/scripts-ccao-data-warehouse-us-east-1/sale/sale-mydec.R

+  # This filter keeps only the multisale rows with the most non-null values
+  # within document number and pin
+  mutate(non_null_count = rowSums(!is.na(across(everything())))) %>%
+  filter(
+    non_null_count == max(non_null_count),
+    .by = c(document_number, line_1_primary_pin)
+  ) %>%
+  select(-non_null_count) %>%
+  # After the above filter, what's left are true duplicates if there are
+  # multiple rows within document number and pin with the same number of
+  # non-null values. We use distinct() to keep only one of those rows.
+  distinct(document_number, line_1_primary_pin, .keep_all = TRUE) %>%
  relocate(year_of_sale = year, .after = last_col()) %>%
-  group_by(year_of_sale) %>%
+  group_by(year_of_sale)


We're getting some duplicates in our mydec sales. Either the rows are complete duplicates, or one row has slightly more NAs than the other. I've tried to treat both cases appropriately.
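For anyone who wants to see the two cases side by side, here is a small sketch that runs the same steps as the diff above on a toy table. The column names `document_number` and `line_1_primary_pin` come from the diff; `sale_price`, `buyer_name`, and all of the values are made up. It assumes dplyr 1.1.0 or later, since `filter()` only accepts `.by` from that version on.

```r
library(dplyr)

# Toy data: A1 has one row with an extra NA, B2 is a complete duplicate,
# C3 is a single row that should pass through untouched
sales <- tibble(
  document_number = c("A1", "A1", "B2", "B2", "C3"),
  line_1_primary_pin = c("01", "01", "02", "02", "03"),
  sale_price = c(100000, NA, 250000, 250000, 90000),
  buyer_name = c("X", "X", "Y", "Y", NA)
)

deduped <- sales %>%
  # Count the non-missing values in each row
  mutate(non_null_count = rowSums(!is.na(across(everything())))) %>%
  # Within each document/PIN pair, keep only the most complete row(s)
  filter(
    non_null_count == max(non_null_count),
    .by = c(document_number, line_1_primary_pin)
  ) %>%
  select(-non_null_count) %>%
  # Rows that tie on completeness are collapsed to a single row
  distinct(document_number, line_1_primary_pin, .keep_all = TRUE)
```

In this toy case the A1 row with the missing price is dropped by the filter, the two identical B2 rows are collapsed by `distinct()`, and C3 is kept even though it contains an NA, because it is the only row for its key.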

wrridgeway · 2026-01-14T22:12:08Z

etl/scripts-ccao-data-warehouse-us-east-1/spatial/spatial-environment.R

+coastline_years <- parse_number(
+  get_bucket_df(input_bucket, prefix = "spatial/environment/coastline/")$Key
+)
+walk(coastline_years, function(x) {


All the current_year business can lead to errors. Much better to only look at raw data that actually exists.
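As a rough illustration of the idea, the sketch below derives the years to process from a listing of keys instead of from the current date. The keys, their file extension, and the helper `years_from_keys` are hypothetical, and it swaps in a base R regular expression where the script itself uses `parse_number()` on the output of `get_bucket_df()`.

```r
# Hypothetical object keys, shaped like the Key column of a bucket listing
keys <- c(
  "spatial/environment/coastline/2021.geojson",
  "spatial/environment/coastline/2023.geojson",
  "spatial/environment/coastline/2023.geojson"
)

# Extract the four-digit year from each key's file name
years_from_keys <- function(keys) {
  file_names <- basename(keys)
  # Keys without a four-digit run are dropped by regmatches()
  matches <- regmatches(file_names, regexpr("[0-9]{4}", file_names))
  sort(unique(as.integer(matches)))
}

coastline_years <- years_from_keys(keys)

# Only years that actually exist in the raw data are visited
for (year in coastline_years) {
  message("Processing coastline data for ", year)
}
```

Because the loop is driven by what is in the bucket, a year with no raw file is never requested, which is the failure mode the `current_year` approach was prone to.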

wrridgeway · 2026-01-14T22:12:37Z

etl/scripts-ccao-data-warehouse-us-east-1/spatial/spatial-environment.R

-)
-flood_fema_warehouse <- file.path(
-  output_bucket, "flood_fema", "year=2024", "part-0.parquet"
+fema_years <- parse_number(


We have multiple years of FEMA data, but we were only looking at one.

wrridgeway · 2026-01-14T22:14:06Z

etl/scripts-ccao-data-warehouse-us-east-1/spatial/spatial-political.R

  "ward_evanston_2019" = c("ward"),
-  "ward_evanston_2022" = c("ward")
+  "ward_evanston_2022" = c("ward"),
+  "ward_evanston_2025" = c("ward")


New Evanston wards just dropped.

wrridgeway · 2026-01-14T22:16:08Z

etl/renv.lock

    "arrow": {
      "Package": "arrow",
-      "Version": "21.0.0.1",
+      "Version": "15.0.1",


You hate to see it, but we need to downgrade this to work with the deprecated version of geoarrow we depend on.

Update raw foreclosure ingest commenting

b41b125

wrridgeway assigned wrridgeway and Damonamajor Dec 30, 2025

wrridgeway linked an issue Dec 30, 2025 that may be closed by this pull request

Update ARI ingest script to gather new data #959

Open

Update ccao and paws packages

ed7274c

wrridgeway linked an issue Dec 31, 2025 that may be closed by this pull request

Model data 2026 refresh #953

Open

wrridgeway and others added 23 commits January 5, 2026 18:15

Add links to help check if data is updated

fd8e06e

Downgrade arrow version

ee4c827

Slow script down to avoid timeout

c48aa89

Same changes to major roads

c86a397

improve commenting

f9839ef

Merge branch '953-model-data-2026-refresh' of github.com:ccao-data/da…

7b68b65

…ta-architecture into 953-model-data-2026-refresh

lintr

32c600d

lintr

91f1c25

Make sure railroad portion doesn't get same data and claim it's new

05fa821

Merge branch '953-model-data-2026-refresh' of https://github.com/ccao…

3b49810

…-data/data-architecture into 953-model-data-2026-refresh

Add openpyxl

f8b67f8

Add warehouse script

3e165a4

Merge branch '953-model-data-2026-refresh' of github.com:ccao-data/da…

684d1dc

…ta-architecture into 953-model-data-2026-refresh

Add commenting

7913692

Update path to GIS gdbs

c26ba18

Merge branch '953-model-data-2026-refresh' of https://github.com/ccao…

53f50cd

…-data/data-architecture into 953-model-data-2026-refresh

Add 2025 evanston wards

834b1f3

Reformat for consistency

89ba3f9

Add 2026 attendance boundaries

04d9781

Update GIS data address

30bdb15

Hardcode more of transit URLs

7ab0c5f

Add commenting

250ebf3

Merge branch '953-model-data-2026-refresh' of github.com:ccao-data/da…

7dbced3

…ta-architecture into 953-model-data-2026-refresh

wrridgeway added 18 commits January 8, 2026 15:40

Stop hardcoding hydrology years

321ac53

Typo

3455fe2

Add new political shapefiles and columns

a945f0e

Remove duplicate data

048ab51

Add pyarrow to UV

2c53686

Make sure there are no duplicates for mydec

eab63d0

Ensure foreclosure data is unique by desired keys

2020bd2

Add 2026 land rates

d3a0c8a

Test another mypy fix

e347484

Remove mypy_path config

8a34833

Commenting

3386023

Refactor ari scripts to R

ce2fe32

Commenting

89fa882

Commenting

7382514

Commenting

9b04551

Refactor DCI script to R

8cf2ac1

Remove mypy changes

9b1ef9e

Typo

09898d8

wrridgeway commented Jan 14, 2026

Typo

049eb78

wrridgeway commented Jan 14, 2026

wrridgeway added 2 commits January 14, 2026 22:16

Remove python packages added for scripts that are gone

78731f5

Update uv lock

0060731
