cheapsensationalism

Weird Weather

Winter 2025–2026 vs. 30-year normals across 50 US cities

Winter elsewhere

Where 50 American cities wintered this year

I have a vague sense that many Americans didn’t get the winter they expected. In Denver, February felt like early April, even late May. In Boston and New York, winter seemed to have a grudge. After a nearly 90-degree day this March at ~5,000’ elevation in Colorado, I set out to explore if the patterns matched the vibe I had collected from the ambient noise around me. An article or two probably would have answered my question, but, as Gogol notes in The Overcoat, “there are such puzzles in the world, and it is not our place to judge.”

The first thought was to take 2026’s winter for 50 cities across the United States and compare that against thirty years of their own history: Winter 2025–26—December through February—against the 1991–2020 period.

An important note at the outset about what I measured and what I did not. Fifty cities is not America. This sample skews toward the Northeast and toward large metros. Rural weather, which is most of the country’s geography, is absent here. The Southern Plains are underrepresented. Hawaii is missing entirely. What follows is a portrait of fifty places, not a census. That said: even in—or because of—this biased sample, the patterns are striking.


The National Picture

The national average temperature anomaly across our fifty cities came out to +1.4°F. A small number. But national averages are to weather what GDP is to the economy—a figure that describes no one’s actual experience. Twenty cities ran warmer than their raw thirty-year average. Fifteen ran cooler. Fifteen landed within a degree of it.4

Note 4 The 20/15/15 split is based on raw anomalies against the thirty-year average. Later, when I say “seven cities had a normal winter,” I mean something stricter: their z-score from the model fell within ±1σ (an absolute value below 1), well within the range of ordinary variation. Two different thresholds, two different answers. Both honest.

The average itself carries a quiet limitation. Our baseline, 1991–2020, already includes three decades of warming. The “normal” is not some fixed, Platonic climate. It is a sliding window that absorbs the recent past, making each generation’s strange weather the next generation’s ordinary. A +1.4-degree anomaly measured against an already-warm baseline is more remarkable than it sounds.

The West–East Divide

The West ran warm. Denver came in 11.1°F above its thirty-year average. Billings, Montana ran 9.1°F above. Las Vegas, +7.0°F. Phoenix, +6.7°F. Boise, Reno, Albuquerque, Tucson, Salt Lake City: all five or more degrees above the script.

Temperature anomaly, selected cities (°F vs. 30-year normal)
[Figure: bar chart of temperature anomalies. West: DEN +11.1, BIL +9.1, LAS +7.0, PHX +6.7, ABQ +6.0, BOI +5.8. East: CLE -2.2, DET -2.7, BOS -3.7, NYC -5.3. Bars above 0°F are warmer than normal; bars below are cooler.]
Temperature deviation from 30-year normal (1991–2020), Winter 2025–26. Western cities ran dramatically warm; Eastern cities dipped below historical averages.

The East cooled. New York came in 5.3°F below normal. Boston ran 3.7°F below and added snow beyond what history would suggest is plenty. Buffalo added to its surplus. Cleveland, Detroit, the great frozen crescent of the Northeast and Great Lakes: they all bent deeper.

This east-west dipole is not random. The jet stream—that atmospheric river at thirty thousand feet—buckled and held its shape for much of the season.1 A persistent ridge of high pressure sat over the West like a warm lid, while a trough funneled Arctic air south and east. This pattern has a name in meteorology: an amplified Rossby wave. It is consistent with what La Niña winters tend to produce—the tropical Pacific’s cold phase nudges the jet stream northward over the West and drops it southward over the East, splitting the continent thermally.

Note 1 I liked watching the Weather Channel as a kid, but I lack any depth of meteorological knowledge. The jet stream language here comes from standard NOAA explainers, not personal expertise.

NOAA’s Climate Prediction Center had forecast a “tilt of the odds” toward exactly this pattern for a La Niña winter. The forecast was correct in direction, though the magnitude—eleven degrees hot in Denver, five degrees cold in New York—exceeded what seasonal outlooks typically capture. The atmosphere followed the script’s stage directions but ad-libbed the dialogue.

The Teleportation Question

The most useful (postable) thing I did with this data was to ask a question: if your city’s winter was teleported, where did it land? [explore the map ↗]

The method is straightforward. Take each city’s actual winter and compare it to the thirty-year normal of every city, including its own. The closest match, measured by normalized Euclidean distance (each variable rescaled to a common [0, 1] range across cities before computing distance, as the methodology describes), tells you whose winter you actually had.
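The matching step can be sketched in a few lines. This is a minimal illustration, not the dashboard's actual code: it assumes the [0, 1] rescaling described in the methodology section, fits that rescaling on the normals only (an assumption), and uses made-up numbers for every city. The function names are hypothetical.

```python
import math

def fit_scaler(normals):
    """Per-metric (min, range) computed across all cities' normals."""
    columns = list(zip(*normals.values()))
    return [(min(c), (max(c) - min(c)) or 1.0) for c in columns]

def scale(vec, scaler):
    """Rescale one (temp, snow, precip) vector with a fitted scaler."""
    return [(v - lo) / rng for v, (lo, rng) in zip(vec, scaler)]

def nearest_normal(actual, normals):
    """Map each city to (closest normal, distance), own normal included."""
    scaler = fit_scaler(normals)
    scaled = {city: scale(v, scaler) for city, v in normals.items()}
    matches = {}
    for city, obs in actual.items():
        point = scale(obs, scaler)
        best = min(scaled, key=lambda c: math.dist(point, scaled[c]))
        matches[city] = (best, math.dist(point, scaled[best]))
    return matches

# Hypothetical values: (mean daily high °F, snowfall in, precipitation in).
normals = {
    "Denver": (45.0, 20.0, 1.5),
    "Albuquerque": (52.0, 6.0, 1.2),
    "Burlington": (30.0, 55.0, 6.5),
    "Phoenix": (69.0, 0.0, 2.5),
}
actual = {"Denver": (54.0, 8.0, 1.0), "Phoenix": (71.0, 0.0, 2.0)}

matches = nearest_normal(actual, normals)
for city, (twin, dist) in matches.items():
    print(f"{city} experienced the normal of {twin} (distance {dist:.3f})")
```

With these toy inputs a warm, dry Denver lands on Albuquerque while Phoenix stays closest to itself, which is the shape of the result the table reports.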

City          Experienced the normal of…
Denver        Albuquerque
Boston        Burlington, VT
New York      Albany
Las Vegas     El Paso
Anchorage     Burlington, VT
Phoenix       Phoenix ★

Denver became Albuquerque. Boston became Burlington, Vermont. New York became Albany. Las Vegas turned into El Paso. Anchorage, which expected snow and did not get it, landed nearest to Burlington—not because Anchorage got warm, but because its snow deficit made it unrecognizable as itself.

Seven cities stayed close enough to their own norms that their nearest match was themselves and the match was within a 5% band of self-identity. Phoenix was among them, though barely. (Loosening the threshold to “nearest match is self, period,” catches ten cities—we pick that number up again a few paragraphs down.) These are the places where winter still resembles winter. The autobiography holds.

The Geometry of Strangeness

The teleportation question tells you where your city went. But the distance matrix [view it ↗] tells you something more fundamental: how alone your city is. Every city sits at a point in climate space, and some points have neighbors and some do not.

Cities at the temperature extremes—Miami, Anchorage, Phoenix—live in a kind of climate solitude. Their nearest neighbor in the distance matrix is still far away. Anchorage’s nearest match is Burlington, VT, at a distance of 0.259. Miami’s is Tampa at 0.149. These cities have no close twins in our dataset. Their weather is too distinctive for comparison. They are the only copies of themselves.

Richmond, VA occupies the opposite position. Its average distance to all other cities was 0.371, the lowest in the dataset. Richmond sits in the mathematical center of American winter—close enough to many cities that it could almost belong anywhere south of the Mason-Dixon line and east of the Mississippi.

Portland, Oregon is its inverse: the highest average distance at 0.779. Portland is climatically alone, a mild wet anomaly in a dataset of cold dry winters and hot dry ones.

At the pair level, the extremes confirm what intuition suspects. Albany and Burlington are practically the same city, climatically, with a distance of 0.038. Miami and Anchorage are the most different pair at 1.443—which surprises no one, but it is nice when the math confirms the obvious.

The distance matrix’s diagonal also allows us to compare a city to its thirty-year average. Anchorage was the city most unrecognizable to itself. Its self-distance—the distance between its actual winter and its own thirty-year normal—was 0.492, the highest of all fifty cities. Its winter was the furthest from its recent history. Anchorage was weird, and weird across multiple dimensions simultaneously. Temperature alone does not capture it. The combination of colder-than-modeled temperatures (z-score of -1.18), less snow, and less precipitation pushed Anchorage far from itself—a displacement that no single variable reveals.2

Note 2 Despite this, I don’t know if people from Anchorage were terribly surprised (I didn’t talk to any). Anchorage varies a lot (σ = 3.64°F), so despite being very different from its 30-year averages, Anchorage is often very different from its average. Its z-score of -1.18 for temperature means winters this far from the mean happen about every 4–5 years. People in Anchorage might have only been a little weirded out.

Some cities stayed themselves. Ten cities matched themselves—their actual winter was closer to their own thirty-year normal than to any other city’s. Tampa was the most emphatically itself, with a self-distance of just 0.040. Minneapolis was right behind at 0.042. Seven of the ten self-matching cities were either in very warm climates (Miami, Tampa, Phoenix) or very cold ones (Minneapolis, Burlington). The extremes held. The middle drifted.

I suspect that this pattern is probably as much an artifact of the data as something generated from the noumenal realm. The cities with the most distinctive climates—the ones that sit at the edges of the distribution—are the ones that stayed put. They have nowhere to go. A mild winter in Miami is still Miami. A warm winter in Minneapolis is still, recognizably, Minneapolis. But a warm winter in Denver is Albuquerque. A cold winter in New York is Albany. The cities in the middle of the distribution have more neighbors, and when the weather shifts, it shifts them into someone else’s territory. The center does not hold because the center has options.

Self-distance spectrum: how far each city drifted from its own 30-year normal
[Figure: self-distance spectrum on a scale from 0.00 (identical to norm, emphatically itself) to 0.50 (unrecognizable). Labeled values: ANC 0.492, DAL 0.356, RNO 0.318, DEN 0.295, ATL 0.279, NYC 0.267, PDX 0.189, PHX 0.157, SFO 0.056, MSP 0.042, TPA 0.040.]
Self-distance: the normalized Euclidean distance between each city’s actual 2025–26 winter and its own 30-year normal. Higher = greater departure from historical identity. Anchorage drifted furthest; Tampa barely moved.
Note 3 — On Snowfall Data Our snowfall figures come from Open-Meteo’s reanalysis archive, which derives snowfall from precipitation and temperature using the ERA5 climate model. This method systematically underestimates actual measured snowfall, particularly in lake-effect regions (Buffalo, Cleveland) and mountain cities (Denver, Salt Lake City), where local atmospheric dynamics produce more snow than a gridded model captures. Buffalo’s historical average appears as roughly 28 inches in our data; the actual measured December–February average is closer to 55 inches. The analysis remains valid because we are comparing each city against its own baseline from the same source. The deltas, z-scores, and climate matches are internally consistent. But the absolute snowfall numbers should not be taken at face value.

Predicting predictability

The distance matrix doesn’t tell you how different winter was from expectations. To answer that, you need a model.

My model has a smooth curve fitted through thirty years of data: a B-spline (5 degrees of freedom, equally spaced knots), which bends to follow the data without chasing every wiggle. This gives the national trajectory—the slow, nonlinear drift of American winters over three decades. [full methodology ↗]

But cities are not the nation. Miami is not Minneapolis. So each city gets its own intercept (its baseline personality) and its own slope (its individual rate of change over time). This is a mixed-effects model. The fixed effect is the national trend. The random effects are each city’s departure from it.

There is a further refinement. Not all cities are equally predictable. Miami’s winter temperature varies little from year to year. Billings swings wildly. Our model assigns each city its own error variance—its own sigma—so that a two-sigma winter temperature increase in Miami (where sigma is 1.18°F) means something different from a two-sigma event in Billings (where sigma is 4.84°F). The technical term is heteroscedastic, from the Greek for “different scatter.” Instead of treating heteroscedasticity as a disease afflicting the model, we made it part of our model, a parameter our model wanted to estimate.

Per-city sigma (σ): how predictable is each city’s winter?
[Figure: per-city temperature σ on an axis from 1.0 to 5.0°F (higher = more volatile). Labeled values: Miami 1.18, Denver 3.31, Billings 4.84. SD, PHX, NYC, and BOS are marked without values.]
Each city gets its own σ. Miami’s winters barely vary (σ = 1.18°F). Billings swings wildly (σ = 4.84°F). A 2σ event in Miami is a 2.4°F departure; in Billings, it’s 9.7°F.

I couldn’t help but notice that interior cities seemed more volatile. I wondered: is there a pattern between how far a city is from the nearest ocean and how unpredictable its winter is? The answer is yes, though noisily so.

Distance from nearest ocean vs. winter volatility (σ) — r = 0.53
[Figure: scatter plot. x-axis: miles from nearest ocean coastline (0 to 1,500). y-axis: σ (°F), 1 to 5. Labeled cities: ANC, SFO, MIA, BOS, SEA, NYC, MSP, BIL, CHI, OMA, DEN. r = 0.53.]
Each dot is a city. Orange/red = more volatile. Inland cities tend toward higher winter-to-winter variability, but the relationship is noisy (r = 0.53). Anchorage (coastal, high σ) and Denver (far inland, moderate σ) are notable outliers—geography is one driver, not the only one.

The pattern holds, but imperfectly. Ocean proximity is a genuine moderating force—maritime air masses buffer winter temperatures against extreme swings. But it competes with latitude, continental air mass exposure, and regional topography. Billings’s chinook winds and exposure to Arctic air masses make it the most volatile city regardless of its 800-mile ocean distance. Anchorage, coastal but Alaskan, defies the trend in the other direction. The r = 0.53 is worth showing, but not worth overstating.

On leverage: a single outlier shouldn’t drive a correlation. So I checked. Leave-one-out diagnostics show the relationship is robust: removing Anchorage (the most obvious coastal-but-volatile point) actually strengthens r to 0.57, and dropping both Anchorage and Billings yields r = 0.56. The biggest single influence is Omaha, whose removal lowers r to 0.47. Spearman’s rank correlation, which is less sensitive to outliers than Pearson’s r, is ρ = 0.47. That is close enough to the Pearson value that the linear story isn’t an artifact of a few extreme cities.
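For readers who want to repeat the leverage check on their own data, here is a minimal sketch of a leave-one-out Pearson correlation alongside Spearman's rank correlation. The six points are invented for illustration (the last one plays the role of a coastal-but-volatile outlier), and the function names are hypothetical. None of this is the dashboard's code.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def leave_one_out(labels, x, y):
    """Pearson r recomputed with each point removed in turn."""
    return {
        label: pearson(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        for i, label in enumerate(labels)
    }

# Hypothetical data: miles from ocean vs. sigma (°F). "F" is the outlier.
labels = ["A", "B", "C", "D", "E", "F"]
miles = [0.0, 100.0, 300.0, 600.0, 900.0, 10.0]
sigma = [1.2, 2.0, 2.5, 3.5, 4.8, 3.6]

r_full = pearson(miles, sigma)
rho_full = spearman(miles, sigma)
r_without = leave_one_out(labels, miles, sigma)
print(f"r = {r_full:.2f}, rho = {rho_full:.2f}, r without F = {r_without['F']:.2f}")
```

In this toy set, dropping the outlier strengthens the correlation. That matches the direction of the Anchorage result above, though the magnitudes here mean nothing.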

Denver’s temperature z-score was 2.70. In a normal distribution, that occurs less than one percent of the time. Only two of the fifty cities exceeded the two-sigma threshold for temperature. The mean absolute z-score across all cities was 1.06—broadly unusual, not just locally.

This means a deviation the size of the average city’s (|z| = 1.06) would recur in any one city about every 3.5 years, assuming winters don’t bunch together because of underlying generating processes (La Niña, El Niño, etc.).
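The arithmetic behind these return periods is short enough to show. The sketch below assumes what the text implies: residuals are standard normal, the tail is two-sided (a winter this far from expected in either direction), and years are independent draws. The function names are mine, not the dashboard's.

```python
import math

def two_sided_tail(z):
    """P(|Z| >= |z|) for a standard normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))

def return_period_years(z):
    """Average number of years between events at least this extreme."""
    return 1.0 / two_sided_tail(z)

for label, z in [("mean |z|", 1.06), ("Anchorage", -1.18), ("Salt Lake City", 2.18), ("Denver", 2.70)]:
    print(f"{label}: z = {z:+.2f}, about one year in {return_period_years(z):.1f}")
```

Run on the z-scores quoted in this piece, it lands close to the stated figures: roughly 3.5 years for the average deviation, 4 to 5 for Anchorage, and on the order of 140 for Denver. Small gaps are what you would expect from rounding z to two decimals.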

How rare is rare? Standard deviations and average return periods
[Figure: average years between events (axis ticks at 0, 1, 10, 50, 200, 750) against standard deviations from expected (|z|). Marked points: mean |z| = 1.06 at ~3.5 years; Denver z = 2.70 at ~143 years.]
The return period rises steeply with standard deviations. The average city’s deviation (1.06σ) recurs every ~3.5 years. Denver’s 2.70σ event? Once every ~143 years, assuming independent draws. Reassuringly, this is broadly consistent with the historical record: Denver7 reports this was Denver’s second-warmest winter on record (39.6°F), behind only 1933–34 (40.1°F). That 92-year gap sits inside the model’s ~143-year return interval and well outside the 30-year sample window, which supports the anomaly call.
Temperature z-scores: how many standard deviations from expected?
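[Figure: dot plot of temperature z-scores in three bands (>2σ, 1–2σ, <1σ). In the >2σ band: DEN 2.70, SLC 2.18. Other labeled cities, spread across the lower two bands: LAS, BIL, PHX, BOI, ABQ, NYC, BOS, ANC, CLE, MIA.]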
Each dot represents a city’s temperature z-score for Winter 2025–26. Denver, at 2.70σ, stands in rare statistical territory. Only two cities exceeded the 2σ threshold.

Where you start impacts where you go

Lucky for us, we tried to explore this local variability. When you give each city its own intercept and slope, you can ask: do warmer cities warm faster? Or, more broadly, does where we start impact where we go?

Yes.

The correlation between random intercepts and random slopes for temperature is 0.591—a moderately strong positive relationship. Cities that start warmer in our thirty-year baseline also tend to have steeper warming trends over time. This suggests that whatever is driving the warming is amplified in places that are already warm—possibly through feedback mechanisms like reduced snow cover or changes in regional circulation.

Random intercept vs. random slope: do warmer cities warm faster?
[Figure: scatter plot. x-axis: random intercept (baseline °F above national mean), -20 to +20. y-axis: random slope (°F/decade), -0.4 to +0.8. Labeled cities: ANC, BUR, BUF, DET, BIL, BOS, NYC, DEN, ABQ, LAS, TUS, PHX, MIA. ρ = 0.591.]
Each dot is a city. Warmer cities (right) tend to have steeper warming trends (up). The dashed line shows the positive correlation (ρ = 0.591). Cold northeastern cities cluster in the lower-left; hot southwestern cities pull toward the upper-right.

For snowfall, the correlation reverses: -0.287. Snowier cities tend to be losing snow faster, though the signal is weaker.6 For precipitation, there is essentially no correlation (0.097). Temperature is where something—perhaps some feedback loops—lives.

Note 6 Is this an artifact? Possibly, partially. Mixed-effects models can show “regression to the mean” in their random effects: cities with the highest intercepts (snowiest) get partially pulled toward the average, which can generate spurious negative correlations with slopes. But the pattern is probably also real—warmer temperatures convert snow to rain, and snow-albedo feedback means less snow cover leads to faster warming in places that had lots of snow to begin with. Both the statistical and physical stories predict the same sign. I can’t cleanly separate them with this data.

These correlations are interesting and may be the result of how climate change is emerging, but I need to be emphatically clear: neither I nor the data are in the necessary shape to make claims. [caveats & limitations ↗]7 I have exceptionally low familiarity with the meteorological and climatological literature. The data is temporally (only 30 years) and spatially (focused on metro areas) limited. Is this a micro-cycle or meso-cycle that would be obvious if we zoomed out? Is this the signature of a macro-cycle or new epoch taking hold? Is this the product of an error of someone messing around with data? Is this an artifact of said data?

Note 7 This feels apt.
[Embedded image: Succession quote]

Too small a window for seeing trends

The national temperature trend across our fifty cities is +0.506°F per decade. This is not statistically significant at conventional levels (p = 0.16), which may surprise people accustomed to hearing that warming is settled science. It is settled—but a thirty-year window with fifty cities is a small lens through which to measure a global phenomenon, and the year-to-year noise in winter temperatures is substantial.5 Our model explains 94.5% of the temperature variance, but almost all of that (94.2%) is city-to-city differences, not temporal trends. Winter temperature is overwhelmingly a function of where you are, not when you are. I can’t help but wonder if the model’s random effects and the sigmas predict a region’s belief in climate change.

Note 5 Is this surprising? Not really, to anyone who works with climate data. Winter temperatures swing 3–5°F from year to year naturally. The signal we’re trying to detect—~0.5°F per decade—is small relative to that noise. The “settled science” of global warming comes from much longer records (150+ years) and global averages, where natural variability cancels out. Our 30-year, 50-city sample simply lacks the statistical power to detect what we know is there.
Variance decomposition: what explains winter temperature?
[Figure: variance bar. City-to-city differences: 94.2%. Noise: 4.8%. Trend: 0.4%. Where you are matters more than when you are.]
94.2% of winter temperature variance is explained by which city you’re in. The temporal trend (red sliver) accounts for less than half a percent. Geography dominates.

Snowfall shows no meaningful trend (+0.038"/decade, p = 0.94). Neither does precipitation (-0.004"/decade, p = 0.99). If winter is changing, it is changing in the background, beneath a fog of natural variability that a three-decade sample cannot fully resolve.

You read this far?

I was drawn to this analysis by trying to contextualize the vibe of an anomaly. Denver running eleven degrees above normal is not merely a warm winter. It is a winter that no longer belongs to Denver. Whether the frequency of such displacements is itself increasing—whether the dice are not just loaded but increasingly so—is a question our thirty-year sample can subtly gesture toward but not resolve.

Our findings are consistent with the broader literature. La Niña winters have been linked to amplified jet stream patterns that warm the western US and cool the east since at least Ropelewski and Halpert’s 1986 work on ENSO teleconnections. The warm-cities-warming-faster pattern echoes research on urban heat islands documented by Zhao et al. (2014). The continental-interior volatility we observe reflects the well-documented influence of maritime moderation.

The thirty-year average will update next year. It always does. The window slides forward, absorbing anomalies, making the strange familiar. In a decade, and as we continue to redirect resources (soapbox) away from addressing what we’ve done to our planet, this winter will be part of the baseline.

Seven cities out of fifty had a normal winter. The other forty-three did not, and each of them did not in its own way. The strangeness was not evenly distributed. It rarely is. Next winter will be strange in its own way, and the baseline will absorb this one, and the vocabulary will fail again.

———

This dashboard was built as a side project by a quantitative researcher who wanted to understand one strange winter. The code, data, and model are on GitHub.

Data: Open-Meteo Archive API
Normals: 1991–2020 winter seasons (Dec–Feb)
Model: weather ~ spline(year) + (year | city); sigma ~ city
Sample: 50 US cities (not nationally representative)

Weird Weather · 50 cities · 30 years of data · Winter 2025–2026
Methodology

How We Measured Weird

The recipe behind the numbers, step by step, with no jargon left unexplained.

1. Getting the Data

We needed winter weather data for American cities. Lots of it. Thirty-five winters’ worth.

The source is Open-Meteo’s Archive API, which serves ERA5 reanalysis data from the European Centre for Medium-Range Weather Forecasts. Reanalysis means a global climate model ingested billions of observations—satellites, weather stations, ocean buoys, radiosondes—and produced a physically consistent gridded dataset. It is not raw station data. It is what a very good model thinks the weather was, everywhere, all the time.

This matters. The numbers you see here are model-derived, not thermometer readings from the airport. ERA5 is excellent for temperature. It is less excellent for snowfall, a fact we will return to.

1 Select 50 cities across the continental United States plus Anchorage.
2 Pull daily data for 35 winter seasons (1990–91 through 2024–25). Winter = December, January, February.
3 For each city-season, aggregate three metrics: mean daily high temperature (°F), total snowfall (inches), and total precipitation (inches).

Thirty-five seasons × 50 cities × 3 metrics = 5,250 data points. Not big data. A spreadsheet could hold it. But enough to find patterns.
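The three steps above can be sketched as a small aggregation routine. This is a hedged illustration, not the dashboard's pipeline: the record layout `(city, date, high_f, snow_in, precip_in)` is an assumption, as is labeling each season by the year its December falls in, and the sample rows are invented.

```python
from collections import defaultdict
from datetime import date

def winter_season(day):
    """Start year of the Dec-Feb season a date belongs to, or None otherwise.

    December 2024 and January 2025 both map to 2024 (the 2024-25 winter).
    """
    if day.month == 12:
        return day.year
    if day.month in (1, 2):
        return day.year - 1
    return None

def aggregate(records):
    """Collapse daily rows into one row of three metrics per (city, season)."""
    buckets = defaultdict(lambda: {"highs": [], "snow": 0.0, "precip": 0.0})
    for city, day, high_f, snow_in, precip_in in records:
        season = winter_season(day)
        if season is None:
            continue  # shoulder months are ignored
        bucket = buckets[(city, season)]
        bucket["highs"].append(high_f)
        bucket["snow"] += snow_in
        bucket["precip"] += precip_in
    return {
        key: {
            "mean_high_f": sum(b["highs"]) / len(b["highs"]),
            "snow_in": b["snow"],
            "precip_in": b["precip"],
        }
        for key, b in buckets.items()
    }

# Hypothetical daily rows for one city.
records = [
    ("Denver", date(2024, 12, 31), 40.0, 1.0, 0.1),
    ("Denver", date(2025, 1, 1), 50.0, 0.0, 0.0),
    ("Denver", date(2025, 3, 1), 70.0, 0.0, 0.0),  # March: dropped
]
seasons = aggregate(records)
print(seasons[("Denver", 2024)])
```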

2. The Distance Matrix

We wanted to know: which cities had similar winters this year? Which ones did not?

The answer is a distance matrix. Take every pair of cities (1,225 pairs from 50 cities) and compute how different their 2025–26 winters were.

“Different” needs a definition. We used normalized Euclidean distance across the three metrics. Here is the recipe:

1 Normalize each metric to [0, 1]. Temperature might range from 10°F to 75°F. Snowfall from 0 to 90 inches. Without normalization, temperature’s larger numerical range would dominate the distance calculation. We subtract the minimum and divide by the range. Every metric now lives on the same scale.
2 Compute Euclidean distance. For two cities, take the square root of the sum of squared differences across all three normalized metrics. The result: one number summarizing how different two cities’ winters were.
3 Arrange in a 50 × 50 matrix. The diagonal is zero (every city’s distance to itself). The matrix is symmetric (Albany-to-Burlington equals Burlington-to-Albany).
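The recipe above can be sketched as follows. It is a minimal version under stated assumptions: each row is a `(temperature, snowfall, precipitation)` tuple, the min and range are taken across whatever cities are passed in, and the three sample rows are invented. The function names are hypothetical.

```python
import math

def min_max_normalize(rows):
    """Rescale every metric (column) to [0, 1] across all cities (rows)."""
    scaled_columns = []
    for column in zip(*rows):
        lo, hi = min(column), max(column)
        span = (hi - lo) or 1.0  # guard against a constant column
        scaled_columns.append([(v - lo) / span for v in column])
    return [list(row) for row in zip(*scaled_columns)]

def distance_matrix(rows):
    """Symmetric matrix of Euclidean distances between normalized rows."""
    scaled = min_max_normalize(rows)
    n = len(scaled)
    return [[math.dist(scaled[i], scaled[j]) for j in range(n)] for i in range(n)]

# Hypothetical winters: (mean high °F, snowfall in, precipitation in).
winters = [
    (75.0, 0.0, 6.0),   # hot, snowless
    (10.0, 90.0, 3.0),  # cold, snowy
    (40.0, 30.0, 9.0),  # in between, wet
]
D = distance_matrix(winters)
for row in D:
    print(["%.3f" % d for d in row])
```

Because each metric is confined to [0, 1], no entry can exceed √3, and the diagonal is zero by construction.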

On the heatmap in the Matrix tab: darker cells mean more similar winters. Lighter cells mean more different. The diagonal is always dark. If you see a dark off-diagonal block, those cities had nearly identical winters.

[Figure: City A and City B, each shown as three bars (temp, snow, precip), combined as sqrt(Σ Δ²) into an example distance of 0.42.]
Normalized Euclidean distance: each bar scaled to [0, 1], then compared.

Some findings from the matrix:

Most Similar
Albany ↔ Burlington
Distance = 0.038. Practically the same winter. Four hours of highway separate them, and their climates agree.
Most Different
Miami ↔ Anchorage
Distance = 1.443. The maximum possible on three normalized axes is √3 ≈ 1.73. These two used most of it.
Most Isolated
Portland, OR
Avg distance = 0.779. Portland’s winter doesn’t look like anyone else’s. Mild, wet, and snowless is a rare combination.
Most Central
Richmond, VA
Avg distance = 0.371. The “average” American winter. Not too hot, not too cold, moderate snow, moderate rain.

Cities at climate extremes have no close twins. Miami, Anchorage, and Phoenix sit on the edges of the distribution. Anchorage’s nearest neighbor is Burlington, VT, at a distance of 0.259—which is like calling someone your best friend because they are the only person in the room.

Reading the heatmap: The distance matrix is symmetric. Each cell represents one city pair. Self-distance (the diagonal) is zero for the historical baseline, but can be nonzero for the 2025–26 season, as Section 5 explains.

3. The Model

A distance matrix tells you what happened. A model tells you what should have happened. We built one so we could measure surprise.

The model is a B-spline mixed-effects regression with heteroscedastic errors. That sentence has too many words in it. Here is what each part means:

1 B-spline on year (the fixed effect). Instead of fitting a straight line through 35 years of data, we fit a smooth curve. A cubic B-spline with a few knots. This captures the national trend—the slow warming—without assuming it is linear. Maybe the 1990s warmed fast and the 2010s plateaued. The spline does not care. It follows the data.
2 Random effects by city (intercept + slope). Every city gets its own baseline and its own rate of change. Minneapolis starts cold. Phoenix starts hot. Each is free to warm, or not, at its own pace. The model learns each city’s personality from the data.
3 Heteroscedastic errors. A fancy way of saying: some cities have noisy weather, and some do not. Miami’s winter temperature barely varies year to year. Billings, Montana, is a roulette wheel. Rather than pretending all cities have the same amount of noise, we estimate a separate σ (standard deviation) for each city.
MODEL STRUCTURE
FIXED EFFECT: spline(year) → national trend
RANDOM EFFECTS (per city): intercept → baseline level; slope → city-specific trend
CITY-SPECIFIC σ: noise level varies by city
Three layers: one national curve, per-city adjustments, per-city noise.

In notation that looks like code but is not quite code:

    # Model formula (brms / lme4 style)
    weather ~ bs(year, df=5) + (year | city)
    sigma ~ city

    # Read it as:
    # weather = national_spline(year)
    #         + city_intercept
    #         + city_slope * year
    #         + noise(city_specific_sigma)

We fit three separate models: one for temperature, one for snowfall, one for precipitation. Same structure, different data. The model does not know that snow and temperature are related. It treats each metric on its own.
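One way to read the formula is as a recipe for generating data. The sketch below simulates winters from that structure and does not fit anything. The straight-line `placeholder_trend` is a hypothetical stand-in for the fitted spline, the intercepts are invented, and years are counted from the first season (an assumption about centering). Only the two σ values come from this write-up.

```python
import random

def simulate_city(years, intercept, slope, sigma, national_trend, rng):
    """Simulated winters: national curve + city intercept + city slope + city noise."""
    first = years[0]
    return [
        national_trend(year)             # fixed effect shared by every city
        + intercept                      # city-specific baseline
        + slope * (year - first)         # city-specific rate of change
        + rng.gauss(0.0, sigma)          # noise with a city-specific sigma
        for year in years
    ]

def placeholder_trend(year):
    """Hypothetical national curve; the real model uses a B-spline here."""
    return 45.0 + 0.05 * (year - 1991)

years = list(range(1991, 2026))  # 35 seasons, labeled by the year they end
rng = random.Random(0)

# Sigmas from the text (Miami 1.18, Billings 4.84); intercepts are made up.
miami = simulate_city(years, 30.0, 0.0, 1.18, placeholder_trend, rng)
billings = simulate_city(years, -10.0, 0.0, 4.84, placeholder_trend, rng)
print(f"Miami range: {max(miami) - min(miami):.1f}°F")
print(f"Billings range: {max(billings) - min(billings):.1f}°F")
```

Fitting the real thing requires a mixed-model library that supports per-group residual variance. This sketch only shows what the three layers contribute.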

4. Diagnosing the Model

A model is only useful if you know where it works and where it fails. We ran the diagnostics.

Variance decomposition tells you where the signal lives. For temperature:

TEMPERATURE VARIANCE DECOMPOSITION City-to-city: 94.2% Temporal: 0.4% | Residual: 4.8%
Geography dominates. The trend over time is real, but small relative to the difference between Miami and Minneapolis.

Translation: where a city is located explains almost everything about its winter temperature. The 35-year warming trend explains less than half a percent of total variance and, at p = 0.16, is not statistically significant at conventional levels. The remaining 4.8% is noise: year-to-year chaos that even a good model cannot predict.

This is not a flaw. This is physics. Minneapolis is always colder than Miami. The warming trend nudges both of them, but the nudge is tiny compared to the gap between them.
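To see why geography swamps everything else, it helps to split a toy dataset's variance into a between-city part and a within-city part. This is a plain sum-of-squares split, a simplification of what the mixed model reports (it does not separate trend from noise), and the temperatures are invented.

```python
from statistics import fmean

def variance_shares(series_by_city):
    """Return (between-city share, within-city share) of total sum of squares."""
    all_values = [v for series in series_by_city.values() for v in series]
    grand_mean = fmean(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    between_ss = sum(
        len(series) * (fmean(series) - grand_mean) ** 2
        for series in series_by_city.values()
    )
    within_ss = total_ss - between_ss
    return between_ss / total_ss, within_ss / total_ss

# Hypothetical mean winter highs (°F) over three seasons.
toy = {
    "Miami": [76.0, 77.0, 75.0],
    "Minneapolis": [25.0, 28.0, 22.0],
}
between, within = variance_shares(toy)
print(f"between cities: {between:.1%}, within cities: {within:.1%}")
```

Even with only two cities, the gap between them accounts for more than 99% of the spread, which is the same story the 94.2% figure tells at larger scale.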

R² by metric:

Metric          R²      Verdict
Temperature     0.945   Excellent. The model captures almost all temperature variation.
Snowfall        0.780   Good. Snow is lumpy and localized, but the model handles it.
Precipitation   0.652   Adequate. Rain is chaotic. This is about as good as it gets with seasonal aggregates.

This hierarchy makes physical sense. Temperature is determined mostly by latitude and elevation—stable facts about geography. Snowfall depends on temperature and moisture—two variables instead of one. Precipitation depends on storm tracks, frontal systems, and atmospheric rivers—things that vary wildly from year to year. The model captures the predictable part. It cannot capture the chaos.

Per-city σ captures real differences in volatility. Miami’s estimated σ is 1.18°F. Its winters are boringly consistent. Billings, Montana, comes in at 4.84°F. Its winters are a coin flip. The model knows this. A 3°F anomaly in Miami would be a screaming outlier. The same anomaly in Billings would be a Tuesday.
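The Miami-versus-Billings comparison is a one-line calculation. In this sketch the σ values are the ones quoted above, and the 3°F anomaly is treated as the gap between observation and model prediction. The function name is illustrative.

```python
def z_score(observed, predicted, sigma):
    """Standard deviations between an observation and the model's prediction."""
    return (observed - predicted) / sigma

SIGMA_F = {"Miami": 1.18, "Billings": 4.84}  # per-city sigma from the text
anomaly_f = 3.0  # same 3°F departure in both cities

z = {city: z_score(anomaly_f, 0.0, sigma) for city, sigma in SIGMA_F.items()}
for city, value in z.items():
    print(f"{city}: a 3°F anomaly is {value:.2f} sigma")
```

The same three degrees is about 2.5σ in Miami and about 0.6σ in Billings.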

5. Model Fit vs. Reality

Here is the central question: which cities had a weird winter?

We answer it two ways, and they broadly agree. That agreement is why we trust them.

Method 1: Self-distance as displacement. Remember the distance matrix from Section 2? Every city has a distance to every other city, based on its 35-year average. But we can also compute each city’s distance to itself, that is, how far the 2025–26 winter was from that city’s own historical normal. We call this the “displacement score.” A low score means the city had a typical winter. A high score means it did not.

Most Displaced    Score    Least Displaced    Score
Anchorage, AK     0.492    Tampa, FL          0.040
Dallas, TX        0.356    Miami, FL          0.041
Reno, NV          0.318    Minneapolis, MN    0.042
Denver, CO        0.295

Method 2: Model z-scores. The model predicts each city’s expected temperature, snowfall, and precipitation. It also knows how much noise each city has (σ from Section 3). A z-score is: how many standard deviations was the actual observation from the model’s prediction? A z-score above 2 means the city’s winter was, roughly, a 1-in-20 event given its history.

Only two cities exceeded 2σ:

Highest Z-Score
Denver, CO — 2.70σ
Temperature deviation of 2.70 standard deviations above the model prediction. Roughly a 1-in-150 event.
Second Highest
Salt Lake City — 2.18σ
Temperature deviation of 2.18 standard deviations. Roughly a 1-in-35 event.

The two methods use different information. Displacement scores use the distance matrix, which knows nothing about the model. Z-scores use the model, which knows nothing about the distance matrix. Both point at the same cities. Denver is weird. Salt Lake City is weird. Anchorage is weird. Tampa is fine.

Why we trust the model: It captures 94.5% of temperature variance. The displacement scores and z-scores independently agree on which cities were weirdest. When two different methods point the same finger, you tend to believe them.

6. Decisions and Sensitivity

Every analysis is a stack of choices. Here are ours, and what happens if you change them.

30-year baseline (1991–2020). This is the standard climatological reference period. It already includes significant warming. Our anomalies are measured against a world that has already warmed. This makes them conservative. A colder baseline would make everything look weirder.

50-city sample. We chose 50 cities, biased toward the Northeast and large metros. Rural America is absent. The Great Plains are underrepresented. If your town is not on the list, it is not because we think your weather does not matter. It is because we had to draw the line somewhere, and we drew it at cities people have heard of.

ERA5 reanalysis. Excellent for temperature. Less trustworthy for snowfall. ERA5 systematically underestimates snow, especially lake-effect snow near the Great Lakes and orographic snow in the mountains. If Syracuse or Salt Lake City look less snowy than you remember, this is probably why.

December–February as “winter.” November can be brutal. March can be worse. We used the meteorological definition of winter and ignored the shoulder months. A freak November blizzard does not appear in our data. A warm March that felt like spring does not either.

Normalization before distance. This is the choice that keeps any single metric from eating the distance metric alive. Winter temperature ranges from about 10°F to 75°F (a 65-degree spread). Snowfall ranges from 0 to 90 inches. Without normalization, each metric’s weight in the distance would be set by its units and raw spread, not by its signal: measure snowfall in feet instead of inches and its gaps shrink twelvefold. By rescaling each metric to [0, 1], we give temperature, snowfall, and precipitation equal votes.

The sensitivity bottom line: Our choices are defensible but not the only reasonable ones. A different baseline, different cities, or different time window would produce different numbers. The broad conclusions—Denver was weird, Tampa was not—are robust to most of these choices. The exact rankings might shuffle.

7. What We Didn’t Do (and Why You Might)

This is not an academic paper. It is a dashboard. We built it to answer a specific question: was this winter weird, and where? We did not build it to survive peer review. Here is what a more rigorous analysis would include, if you are the sort of person who wants to build one:

Temporal autocorrelation. A warm winter tends to follow a warm fall. Our model ignores this. A proper treatment would add AR(1) or ARMA structure to the residuals. We did not, because seasonal aggregates already smooth out most short-term autocorrelation, and because this is a dashboard, not a dissertation.

ENSO as a covariate. El Niño and La Niña are the single biggest drivers of year-to-year winter variability in the United States. We mention ENSO in the essay. We did not include it in the model as a predictor. A better model would. We classified seasons by ENSO phase after the fact rather than letting the model learn the relationship.

Model selection via LOO-CV or WAIC. We picked a model and ran it. We did not systematically compare it against simpler alternatives using leave-one-out cross-validation or the Widely Applicable Information Criterion. This is the sort of thing you do when publishing. We were not publishing.

Building up from simpler models. Good practice is to start simple (random intercept only), then add complexity (add slope, add heteroscedastic sigma) and check whether each addition is justified. We skipped to the complex model because we had domain knowledge about what the model should capture. This works until it does not.

More data. Weekly or monthly resolution instead of seasonal aggregates. Daily extremes instead of means. Humidity, wind, sunshine hours. More data is always better, until it is not.

Representative sampling. A population-weighted or geographically stratified sample of cities would be more defensible than our ad hoc list. Pittsburgh and Philadelphia are 300 miles apart and probably do not have independent weather. We treated them as independent anyway, because this is a dashboard, not a dissertation.

Spatial correlation. Cities near each other are not independent. A proper spatial model would account for this using a Gaussian process or a conditional autoregressive structure. We did not, because spatial models are computationally expensive and because our audience does not want to wait for a Gaussian process to converge.

Want to replicate this? The full code—data fetching, model fitting, and this dashboard—is on GitHub.


The Code

Two pieces do most of the work: the distance functions that decide which cities had similar winters, and the mixed-effects model that turns 30 years of data into trends, anomalies, and z-scores.

Every block on this page is a real textarea. Click in, change a number, copy it, paste it into a notebook. Nothing on the dashboard reruns — the goal is for you to see what the math actually looks like in code, not to wait on a build. The full repo is on GitHub.

numpy · scipy.interpolate.BSpline · least squares + ridge · no fancy ML libs
Why Python and not R? Honestly, just because the dashboard is HTML/JS and Python kept the data pipeline next to the rest. R would have been more natural for the model: mgcv::gam(y ~ s(year) + s(city, bs="re") + s(year, city, bs="re")) in a single line, or lme4::lmer(y ~ splines::bs(year, df = 6) + (year | city)) if you don't mind building the spline basis yourself (lmer has no s() term). The Python here uses NumPy + a hand-rolled ridge solve to keep dependencies tiny, but the math is identical. If you fork this and rewrite the model in R, the dashboard JSON contract is in fit_model.py; just emit the same data/dashboard_data.json shape.

Part 1 · Distances

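The dashboard answers two questions that both reduce to a distance measurement: which cities have winters most like each other’s, and how far this winter landed from a city’s own normal.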

Both questions need to compare cities along three axes: average winter high (°F), total snowfall (in), total precipitation (in). Those scales are wildly different — precipitation might be 5 to 80, snowfall 0 to 90, temperature 10 to 75. So step zero is normalization: rescale every dimension to [0, 1] using the cross-city min and max. After that, no single dimension dominates.

Min–max normalization
$$ x' = \frac{x - \min}{\max - \min} $$
Applied per dimension, so all three metrics live on the same [0, 1] axis before any distance is computed.
weather/analysis.py · bounds & normalize
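
A minimal sketch of what a bounds-and-normalize pair could look like, assuming one row per city and one column per metric. The function names echo the label above, but the bodies and the sample numbers are assumptions, not the repo's actual code.

```python
import numpy as np

def bounds(values):
    """Cross-city min and max of each metric (one column per metric)."""
    values = np.asarray(values, dtype=float)
    return values.min(axis=0), values.max(axis=0)

def normalize(values, lo, hi):
    """Min-max scale each column to [0, 1]; a constant column maps to 0."""
    values = np.asarray(values, dtype=float)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against division by zero
    return (values - lo) / span

# Made-up rows: (avg winter high in F, snowfall in inches, precipitation in inches).
cities = np.array([
    [75.0, 0.0, 8.0],
    [24.0, 52.0, 3.0],
    [45.0, 30.0, 9.0],
])
lo, hi = bounds(cities)
scaled = normalize(cities, lo, hi)
print(scaled.round(3))
```

Because the bounds come from the cities in the sample, adding or dropping a city with an extreme value changes every scaled number.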

With everything normalized, we can write the distance functions. Both Euclidean and Manhattan summarize how far apart two cities are with a single number. They differ only in how they aggregate the per-dimension differences.

Euclidean distance

L2 · Euclidean
$$ d(a, b) = \sqrt{\sum_i (a_i - b_i)^2} $$
Square the gap on each axis, sum, take the square root. Geometrically: the straight-line distance between two points in n-dimensional space — the Pythagorean theorem with more legs.
weather/analysis.py · climate_dist
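
A minimal sketch of the Euclidean distance between two already-normalized cities. The name `climate_dist` is taken from the label above; the body is an assumption about what such a function does, and the vectors are made up.

```python
import numpy as np

def climate_dist(a, b):
    """Straight-line (L2) distance between two normalized metric vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

# Gaps of 0.3 and 0.4 on two axes and none on the third: a 3-4-5 triangle.
d = climate_dist([0.0, 0.0, 0.0], [0.3, 0.4, 0.0])
print(round(d, 6))
```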

Manhattan distance

L1 · Manhattan (a.k.a. taxicab)
$$ d(a, b) = \sum_i |a_i - b_i| $$
Sum of absolute gaps. Named for navigating a grid of city blocks — you can’t cut diagonally through buildings, you walk along axes. No squaring, no square root.
fit_model.py · in compute_distance_matrices()
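
A minimal sketch of an all-pairs Manhattan matrix using NumPy broadcasting. The helper name `manhattan_matrix` and the sample rows are assumptions; the real compute_distance_matrices() may be organized differently.

```python
import numpy as np

def manhattan_matrix(points):
    """All-pairs L1 distances for an (n_cities, n_metrics) array of normalized values."""
    points = np.asarray(points, dtype=float)
    # diffs[i, j, k] = points[i, k] - points[j, k]
    diffs = points[:, None, :] - points[None, :, :]
    return np.abs(diffs).sum(axis=2)

scaled = np.array([
    [1.0, 0.0, 0.8],
    [0.0, 1.0, 0.0],
    [0.4, 0.6, 1.0],
])
D = manhattan_matrix(scaled)
nearest_to_first = int(np.argsort(D[0])[1])   # index 0 is the city itself
print(D.round(2), nearest_to_first)
```

The matrix is symmetric with a zero diagonal, so sorting any row gives that city's nearest matches.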

How they relate

Both are members of the same family — the Minkowski distance:

Minkowski (parameterized by p)
$$ d_p(a, b) = \left( \sum_i |a_i - b_i|^p \right)^{1/p} $$
p = 1 gives Manhattan. p = 2 gives Euclidean. p → ∞ gives Chebyshev (just the largest single-dimension gap).
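
A short sketch of the Minkowski family, to show the three special cases falling out of one function. `minkowski` is a hypothetical helper and the vectors are made up.

```python
import numpy as np

def minkowski(a, b, p):
    """Minkowski distance of order p; p = np.inf gives the Chebyshev distance."""
    gaps = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    if np.isinf(p):
        return float(gaps.max())            # limit as p grows: the largest single gap
    return float(np.sum(gaps ** p) ** (1.0 / p))

a, b = [0.2, 0.9, 0.5], [0.5, 0.5, 0.5]    # per-axis gaps: 0.3, 0.4, 0.0
l1 = minkowski(a, b, 1)                     # Manhattan
l2 = minkowski(a, b, 2)                     # Euclidean
linf = minkowski(a, b, np.inf)              # Chebyshev
print(round(l1, 6), round(l2, 6), round(linf, 6))
```

For any pair of points the distance shrinks, or stays the same, as p grows, so Manhattan is never smaller than Euclidean, which is never smaller than Chebyshev.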
Why normalize first? Without normalization, the raw ranges (temperature ~65°F, snowfall ~90 in, precipitation ~75 in) would weight the three axes in a way that depends on units, not on signal. Min–max scaling makes a 1-unit move on each axis represent the same fraction of cross-city spread. That’s the assumption being made, and an honest one to flag.

Part 2 · The Model

The model has one job: take 30 years of winter data for 50 cities and tell us, for every city-year, whether what happened was within the realm of normal. Its formula:

Mixed-effects model
$$ y_{c,t} = \mathrm{spline}(t) + \alpha_c + \beta_c\, t + \varepsilon_{c,t}, \qquad \operatorname{Var}(\varepsilon_{c,t}) = \sigma_c^2 $$
$y_{c,t}$: weather metric (high, snow, or precip) for city c in year t.
$\mathrm{spline}(t)$: smooth national trend over time (a B-spline). One curve for everyone.
$\alpha_c$: city’s baseline offset (random intercept). Miami is warmer than Minneapolis, and this captures it.
$\beta_c\, t$: city’s personal slope (random slope). Some cities warm faster than others.
$\varepsilon_{c,t}$: residual for city c in year t, with a per-city variance $\sigma_c^2$. Miami’s winter barely moves; Billings swings wildly. Each city gets its own σ.

Step 1 · B-spline knots

A B-spline is a smooth curve built from local pieces. Knots are the joints — the x-values where the pieces meet. We place internal knots at quantiles of the year range so the curve gets equal data on each side, then pad both ends with repeated knots so the spline can reach the boundaries cleanly.

fit_model.py · make_bspline_knots
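
A minimal sketch of quantile-placed knots with clamped ends, assuming a cubic spline and a 30-year span ending in 2025. The function name comes from the label above, but the signature, defaults, and year range are assumptions rather than the repo's code.

```python
import numpy as np

def make_bspline_knots(x, df=6, degree=3):
    """Knot vector for a clamped B-spline with `df` basis functions."""
    x = np.asarray(x, dtype=float)
    n_internal = df - degree - 1
    if n_internal < 0:
        raise ValueError("df must be at least degree + 1")
    # Internal knots at evenly spaced quantiles, excluding the two endpoints.
    probs = np.linspace(0.0, 1.0, n_internal + 2)[1:-1]
    internal = np.quantile(x, probs) if probs.size else np.array([])
    # Repeat each boundary degree + 1 times so the curve reaches both ends.
    left = np.repeat(x.min(), degree + 1)
    right = np.repeat(x.max(), degree + 1)
    return np.concatenate([left, internal, right])

years = np.arange(1996, 2026)               # assumed 30-year span
knots = make_bspline_knots(years, df=6)
print(knots)
```

With df = 6 and a cubic spline this gives two internal knots plus four repeated knots at each end, ten in total (df + degree + 1).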

Step 2 · Design matrix

The model has three blocks of features. They’re glued side by side into one big design matrix X, then linear regression handles the rest.

fit_model.py · in fit_metric_model()
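
A minimal sketch of the three-block design matrix: spline basis, city indicators, and city-by-year slopes. It assumes a cubic B-spline with six basis functions and a centred year variable for the slopes; `spline_basis`, `build_design`, and the toy data are hypothetical, not the contents of fit_metric_model().

```python
import numpy as np
from scipy.interpolate import BSpline

def spline_basis(x, knots, degree=3):
    """Evaluate every B-spline basis function at x; returns (len(x), n_basis)."""
    n_basis = len(knots) - degree - 1
    # An identity coefficient matrix makes column j the j-th basis function.
    return BSpline(knots, np.eye(n_basis), degree)(x)

def build_design(years, city_idx, n_cities, knots):
    years = np.asarray(years, dtype=float)
    city_idx = np.asarray(city_idx)
    t = years - years.mean()                       # centred year (an assumption)
    B = spline_basis(years, knots)                 # block 1: national trend
    Z_int = np.zeros((years.size, n_cities))
    Z_int[np.arange(years.size), city_idx] = 1.0   # block 2: one indicator per city
    Z_slope = Z_int * t[:, None]                   # block 3: city-specific slopes
    return np.hstack([B, Z_int, Z_slope])

span = np.arange(1996, 2026, dtype=float)          # assumed 30-year span
knots = np.concatenate([np.repeat(span[0], 4),
                        np.quantile(span, [1 / 3, 2 / 3]),
                        np.repeat(span[-1], 4)])
n_cities = 3
years = np.tile(span, n_cities)
city_idx = np.repeat(np.arange(n_cities), span.size)
X = build_design(years, city_idx, n_cities, knots)
print(X.shape)
```

The spline columns and the city indicators each sum to one across a row, so the unpenalized matrix is rank-deficient; a penalty on the city columns is what makes the solve well-posed.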

Step 3 · Solve, with a ridge penalty on random effects

Standard ordinary least squares minimizes $\sum (y - \hat{y})^2$. We add a small penalty $\lambda \sum_j \beta_j^2$, summed over just the city-level coefficients. That shrinks them toward zero so wild outlier cities don’t dominate. It’s a poor man’s mixed-effects fit using only NumPy, with no statsmodels.MixedLM required.

Ridge-regularized least squares
$$ \hat{\beta} = \left( X^\top X + P \right)^{-1} X^\top y $$
P is a diagonal penalty matrix — zeros on the spline coefficients (no shrinkage on the trend) and a small λ on the city intercepts and slopes (shrink random effects toward zero).
fit_model.py · ridge solve
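
A minimal sketch of the penalized solve, assuming the unpenalized (trend) columns come first in X. `ridge_solve`, the value of `lam`, and the toy data are assumptions chosen so the shrinkage is easy to see by hand.

```python
import numpy as np

def ridge_solve(X, y, n_unpenalized, lam=1.0):
    """Solve (X'X + P) beta = X'y, with P = lam on all but the first n_unpenalized columns."""
    penalty = np.full(X.shape[1], lam, dtype=float)
    penalty[:n_unpenalized] = 0.0                 # no shrinkage on the trend block
    A = X.T @ X + np.diag(penalty)
    return np.linalg.solve(A, X.T @ y)            # solve, rather than forming an inverse

# Toy problem: a shared level plus two city offsets (columns 1 and 2 are penalized).
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0]])
y = np.array([1.0, 1.0, 3.0, 3.0])
beta = ridge_solve(X, y, n_unpenalized=1, lam=1.0)
print(beta.round(4))
```

The two city means sit at 1 and 3, so unshrunk offsets around the shared level of 2 would be -1 and +1. With lam = 1 they come out at -2/3 and +2/3, pulled toward zero as described. Without the penalty this X'X is singular, which is another reason the penalty is there.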

Step 4 · Per-city σ (heteroscedasticity)

One global σ would be a lie. Miami’s residuals are tiny — its winters are predictable. Billings’s residuals are huge. So we estimate σ per city from each city’s own residuals. This is what makes a 2σ event mean something useful: 2σ in Miami is 2.4°, while 2σ in Billings is 9.7°.

fit_model.py · per-city sigma
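
A minimal sketch of per-city σ from residuals, assuming residuals are stored in one long array alongside an integer city index. `per_city_sigma` and the toy residuals are hypothetical; using ddof=1 is also an assumption.

```python
import numpy as np

def per_city_sigma(residuals, city_idx, n_cities, ddof=1):
    """Standard deviation of each city's own residuals."""
    residuals = np.asarray(residuals, dtype=float)
    city_idx = np.asarray(city_idx)
    return np.array([residuals[city_idx == c].std(ddof=ddof) for c in range(n_cities)])

# City 0 barely moves; city 1 swings widely (made-up residuals in degrees F).
residuals = np.array([0.5, -0.5, 0.5, -0.5, 4.0, -4.0, 4.0, -4.0])
city_idx = np.array([0, 0, 0, 0, 1, 1, 1, 1])
sigma = per_city_sigma(residuals, city_idx, n_cities=2)
print(sigma.round(3))
```

The same 4-degree miss is an ordinary year for the second city and far outside the first city's usual range, which is the point of estimating σ separately.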

Step 5 · The z-score

Once you have city-specific predictions and city-specific σ, this year’s anomaly becomes a z-score: how many of this city’s standard deviations is the actual value from the model’s expectation?

Per-city z-score
$$ z_{c,2025} = \frac{y_{c,2025} - \hat{y}_{c,2025}}{\sigma_c} $$
Denver’s z-score for 2025 was +2.70. Under a normal distribution that’s a less-than-1% event. Two cities exceeded the ±2σ band.
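
A minimal sketch of the z-score step across several cities at once. The observed values, predictions, and σ values below are invented so the arithmetic is easy to follow; they are not the dashboard's numbers, and `z_scores` is a hypothetical helper.

```python
import numpy as np

def z_scores(observed, predicted, sigma):
    """Per-city z-score: (observed - predicted) / that city's own sigma."""
    observed, predicted, sigma = (np.asarray(v, dtype=float)
                                  for v in (observed, predicted, sigma))
    return (observed - predicted) / sigma

# Three made-up cities: one far above its prediction, two close to it.
observed = np.array([55.0, 71.0, 30.0])
predicted = np.array([45.55, 70.4, 33.0])
sigma = np.array([3.5, 1.2, 4.85])

z = z_scores(observed, predicted, sigma)
flagged = np.abs(z) > 2.0                  # outside the +/- 2 sigma band
print(z.round(2), flagged)
```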

Part 3 · What we didn’t do (and how you would)

Two honest gaps worth disclosing, because someone trying to replicate this would otherwise re-derive them and wonder whether they’d misread the source.

1. We hard-coded df = 6. We did not pick it by cross-validation.

The B-spline degrees of freedom is a single line in fit_model.py: spline_df=6. Six basis functions over 30 years gives a curve that can bend a couple of times without chasing every wiggle — it’s a sensible default, but it’s a default, not a result. The principled way is leave-one-out cross-validation (LOOCV): hold out one observation at a time, refit, predict the held-out point, repeat across a grid of df values, pick the df with the lowest held-out error. Here’s the pattern, ready to drop in:

loocv_select_df.py · suggested addition
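
A minimal sketch of the LOOCV pattern described above, simplified to a single synthetic series and a trend-only spline fit. Everything here (function names, the df grid, the simulated data, the 1996–2025 span) is an assumption for illustration; a version for the full model would hold out city-year observations or whole years and refit the complete design matrix.

```python
import numpy as np
from scipy.interpolate import BSpline

def spline_basis(x, df, degree=3):
    """Clamped cubic B-spline basis with `df` columns and quantile-placed internal knots."""
    probs = np.linspace(0.0, 1.0, df - degree + 1)[1:-1]
    internal = np.quantile(x, probs) if probs.size else np.array([])
    knots = np.concatenate([np.repeat(x.min(), degree + 1), internal,
                            np.repeat(x.max(), degree + 1)])
    return BSpline(knots, np.eye(df), degree)(x)

def loocv_mse(x, y, df):
    """Mean squared error of predicting each point from a fit to all the others."""
    B = spline_basis(x, df)                 # knots fixed from the full year range
    errors = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i       # hold out observation i
        coef, *_ = np.linalg.lstsq(B[keep], y[keep], rcond=None)
        errors.append((y[i] - B[i] @ coef) ** 2)
    return float(np.mean(errors))

def select_df(x, y, grid=(4, 5, 6, 7, 8, 10)):
    scores = {df: loocv_mse(x, y, df) for df in grid}
    return min(scores, key=scores.get), scores

# Synthetic series: slow trend plus a gentle wave plus noise.
rng = np.random.default_rng(0)
years = np.arange(1996, 2026, dtype=float)
signal = 0.05 * (years - years[0]) + np.sin((years - years[0]) / 5.0)
y = signal + rng.normal(scale=0.3, size=years.size)

best_df, scores = select_df(years, y)
print(best_df, {k: round(v, 3) for k, v in scores.items()})
```

With only 30 points per series the held-out error curve is usually flat near its minimum, so a df one step simpler than the strict winner is often a reasonable pick.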

2. ENSO is not a model term. It’s post-hoc residual analysis.

The Artifacts tab and the Time Series tab both show ENSO patterns — bars per phase, residuals colored by El Niño / La Niña / Neutral. None of that flows back into the model fit. Our model is blind to ENSO. After fitting, we group the residuals by year-phase and average them. That’s descriptive, not predictive.

fit_model.py · ENSO as it actually exists today
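
A minimal sketch of the post-hoc grouping described above: average the fitted model's residuals within each ENSO phase. The helper name, the phase labels, and the year-to-phase mapping are placeholders, not a real ENSO classification and not the repo's code.

```python
import numpy as np

def mean_residual_by_phase(residuals, years, phase_by_year):
    """Average residual per ENSO phase; purely descriptive, nothing is refit."""
    residuals = np.asarray(residuals, dtype=float)
    labels = [phase_by_year[int(y)] for y in years]
    phases = np.array(labels)
    return {p: float(residuals[phases == p].mean()) for p in sorted(set(labels))}

# Placeholder mapping and residuals for two cities over four winters.
phase_by_year = {2001: "la_nina", 2002: "la_nina", 2003: "el_nino", 2004: "neutral"}
years = [2001, 2002, 2003, 2004, 2001, 2002, 2003, 2004]
residuals = [1.0, 2.0, -1.5, 0.0, 0.0, 1.0, -0.5, 0.5]

summary = mean_residual_by_phase(residuals, years, phase_by_year)
print(summary)
```

Because the trend and city effects were estimated without knowing the phase, some of any ENSO signal may already have been absorbed by them; these averages describe what is left over.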

If you want ENSO to be a model term — so its effect is estimated jointly with the trend and city effects, with proper standard errors — here’s the change. One-hot the phase (with neutral as the reference category) and append it to the design matrix:

fit_model.py · suggested: ENSO as a fixed effect
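
A minimal sketch of the one-hot step, assuming the design matrix and the diagonal of the penalty matrix are plain NumPy arrays. `enso_dummies`, `add_enso`, and the phase labels are hypothetical names; the new columns get zero penalty because they are fixed effects.

```python
import numpy as np

def enso_dummies(phases, reference="neutral"):
    """One indicator column per non-reference phase."""
    levels = sorted(set(phases) - {reference})
    phases = np.asarray(phases)
    cols = np.column_stack([(phases == lvl).astype(float) for lvl in levels])
    return cols, levels

def add_enso(X, penalty_diag, phases, reference="neutral"):
    """Append ENSO indicators to X and extend the penalty with zeros (no shrinkage)."""
    D, levels = enso_dummies(phases, reference)
    X_new = np.hstack([X, D])
    penalty_new = np.concatenate([penalty_diag, np.zeros(D.shape[1])])
    return X_new, penalty_new, levels

# Toy design matrix with two existing columns; phases are placeholders per row.
X = np.ones((5, 2))
penalty_diag = np.array([0.0, 1.0])
phases = ["la_nina", "neutral", "el_nino", "neutral", "la_nina"]

X_new, penalty_new, levels = add_enso(X, penalty_diag, phases)
print(levels, X_new.shape, penalty_new)
```

Each new coefficient then reads as the average shift in that phase relative to neutral winters, estimated jointly with the trend and city effects.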
What this model is and isn’t. It’s a deliberately simple linear model with smooth trend + city-level random effects + per-city variance. It is not a climate projection, a causal model, or a forecast. It’s a descriptive lens that says: relative to its own past, how unusual was this city’s winter? The two limits above are real but addressable — if you fork this, those are the natural next moves.

If we did this by the books · in R

If you came here hoping for R, here is the version that does it properly: ENSO as a fixed effect in the model, smoothing parameter chosen by REML (no hard-coded df), block leave-one-year-out CV comparing nested models, and real diagnostics — QQ, residuals vs fitted, leverage, concurvity, DHARMa simulation-based residuals, variance components.

One file, runnable with Rscript, drops in next to fit_model.py:

if_we_did_this_in_R.R →

Skills · run this on your own data

Two reusable skills extracted from this project. Both are agent-agnostic — paste the prompt into Claude, ChatGPT, Gemini, or whatever else you've got, fill in your parameters, get a complete artifact back.

If you want to run the Python version end-to-end: clone the repo, pip install -r requirements.txt, then python fetch_data.py && python fit_model.py. The dashboard reads data/dashboard_data.json, which both scripts write.


National Trends & Model Diagnostics

[Figure: National Temperature Trend (all 50 cities)]

[Figure: Random Intercepts vs. Random Slopes (Temperature)]

Seven posters drawn from the analysis. Each one renders at 1200×1500 on screen and exports as a 4800×6000 PNG — high enough for print or a carousel deck. Click Export PNG on any poster.

01 / 07 · Winter teleportation · nearest-match cities
02 / 07 · ENSO signatures · La Niña and El Niño response per city
03 / 07 · Random intercept × slope correlation · warmer cities warm faster
04 / 07 · σ per city · continental interiors swing harder
05 / 07 · 50-city scorecard · trend, σ, nearest match, this year
06 / 07 · Model summary · formula, variance decomposition, trends
07 / 07 · Strange but true · summary stats worth sharing