Gabriel Pereira
Mar 2026·craft·5 min read

The cost of a schema decision you can't take back

Code is reversible. A schema, once in production with real user data, mostly isn't.

I've written code I regret. I've never stopped thinking about the one schema decision I got wrong in production.

Code is fixable in ways that schemas aren't. When you write a function badly, you push a fix. When you name something wrong, you rename it and the compiler catches the rest. When you get a schema wrong after real data exists in it, the options are narrow: a migration that rewrites every row, a new table with a backfill job, or a workaround you carry forward indefinitely. None of these are cheap.

The decision that seemed fine at the time

Building the analysis layer for Unsaid, I had to decide how to model the relationship between journal entries and the patterns the system detects in them. Patterns felt underdefined at the time. The product logic was still settling. I modeled them as a JSONB column on the analysis row: flexible, quick to write, easy to extend. It felt like the right call while the product was still uncertain.

It wasn't. Six weeks later I needed to query across patterns: filter entries by detected theme, aggregate pattern frequency by week, surface trends across a month of data. None of that is clean when the patterns live inside a JSONB column, because every query has to unpack the document before it can filter or group on what's inside. What I needed was a normalized table. What I had was data already in production.
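A minimal sketch of that gap, in Python with SQLite from the standard library standing in for Postgres. The table and column names here (analysis, entry_patterns, week, theme) are hypothetical and not Unsaid's actual schema. With the JSON column, the weekly count is done by parsing each row in application code; Postgres can unnest JSONB in SQL with functions such as jsonb_array_elements, but the query still has to unpack every document first. With one row per pattern, the same question is a single GROUP BY.

```python
import json
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")

# Shape A (hypothetical): patterns stored as a JSON blob on the analysis row.
conn.execute(
    "CREATE TABLE analysis (entry_id INTEGER PRIMARY KEY, week TEXT, patterns TEXT)"
)
conn.executemany(
    "INSERT INTO analysis VALUES (?, ?, ?)",
    [
        (1, "2026-W01", json.dumps([{"theme": "sleep"}, {"theme": "work"}])),
        (2, "2026-W01", json.dumps([{"theme": "sleep"}])),
        (3, "2026-W02", json.dumps([{"theme": "work"}])),
    ],
)


def weekly_counts_from_json(conn):
    """Count themes per week by parsing every blob in application code."""
    counts = Counter()
    for week, blob in conn.execute("SELECT week, patterns FROM analysis"):
        for pattern in json.loads(blob):
            counts[(week, pattern["theme"])] += 1
    return dict(counts)


# Shape B (hypothetical): one row per detected pattern.
conn.execute("CREATE TABLE entry_patterns (entry_id INTEGER, week TEXT, theme TEXT)")
conn.executemany(
    "INSERT INTO entry_patterns VALUES (?, ?, ?)",
    [
        (1, "2026-W01", "sleep"),
        (1, "2026-W01", "work"),
        (2, "2026-W01", "sleep"),
        (3, "2026-W02", "work"),
    ],
)


def weekly_counts_normalized(conn):
    """Same question, answered by the database in one GROUP BY."""
    query = "SELECT week, theme, COUNT(*) FROM entry_patterns GROUP BY week, theme"
    return {(week, theme): n for week, theme, n in conn.execute(query)}
```

Both functions return the same counts for this sample data. The difference is where the work happens, and whether an ordinary index on theme can help.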

When the migration has already run

There's a specific feeling when you realize a schema is wrong. It arrives not when you're designing it, but when the product needs something the table can't do. By then, users have data in that shape. The rows exist. Every migration you now want to run has to account for them.

The fix took a full day: writing the migration, building a backfill script, testing it against a snapshot of production data, watching the deploy. That day had nothing to do with the feature I was trying to ship. It was entirely overhead from a decision I'd made six weeks earlier.
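For a sense of what the backfill half of that day involves, here is a rough sketch, again using SQLite as a stand-in and hypothetical names (analysis, entry_patterns, a "theme" key inside each pattern). It is not the script that actually ran. It assumes the data fits in memory and that each stored pattern is a JSON object with a theme. The unique constraint makes the job safe to re-run, which matters when you rehearse against a snapshot before touching production.

```python
import json
import sqlite3


def backfill_patterns(conn):
    """Copy patterns out of the JSON column into a normalized table.

    Returns the number of rows inserted. Re-running is safe: rows that
    were already copied are skipped by the UNIQUE constraint.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entry_patterns ("
        " entry_id INTEGER NOT NULL,"
        " theme TEXT NOT NULL,"
        " UNIQUE (entry_id, theme))"
    )
    rows = conn.execute("SELECT entry_id, patterns FROM analysis").fetchall()
    inserted = 0
    with conn:  # one transaction: commit on success, roll back on error
        for entry_id, blob in rows:
            # Treat a NULL column as "no patterns detected".
            for pattern in json.loads(blob or "[]"):
                cur = conn.execute(
                    "INSERT OR IGNORE INTO entry_patterns (entry_id, theme)"
                    " VALUES (?, ?)",
                    (entry_id, pattern["theme"]),
                )
                inserted += cur.rowcount
    return inserted
```

A real backfill on a large table would usually work in batches and keep the old column until the new table has been verified.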

How schema forces product decisions

The subtler cost wasn't the day of work. It was what I hadn't noticed: for six weeks, the schema had been shaping which features I scoped and which ones I quietly dropped. A query that would have been a single join became something I wasn't doing, and I'd been calling that a product decision.

Schema debt doesn't just slow you down. It steers what you build. When the data is in the wrong shape, you stop asking the right questions, not because you gave up, but because the cost of answering them is too high.

The schema isn't just a technical decision. It's a commitment about what questions you'll be able to ask six months from now.

What I think through now

Before I write a migration now, I ask what queries this schema needs to serve: not the queries I'm writing today, but the ones I'll need as this feature develops. If I can't answer that, I default toward more normalized. Normalized schemas are more work to write up front. They're less painful to change later than denormalized ones.

I also try to write the query I want first, then design the schema backward from it. That order exposes problems earlier, when the cost of changing your mind is still low.
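One way to make the query-first habit mechanical is sketched below under stated assumptions. Write the query you want as a string, then check whether a candidate schema can run it against an empty in-memory database. Everything here is hypothetical: the entries and entry_patterns tables, their columns, and the schema_supports helper. SQLite again stands in for the production database.

```python
import sqlite3

# Step 1: the question to ask, written before any table exists.
WANTED_QUERY = """
    SELECT e.id, e.written_on
    FROM entries AS e
    JOIN entry_patterns AS p ON p.entry_id = e.id
    WHERE p.theme = ?
"""

# Step 2: candidate schemas to check against that query.
JSON_COLUMN_SCHEMA = """
    CREATE TABLE entries (id INTEGER PRIMARY KEY, written_on TEXT);
    CREATE TABLE analysis (entry_id INTEGER, patterns TEXT);
"""
NORMALIZED_SCHEMA = """
    CREATE TABLE entries (id INTEGER PRIMARY KEY, written_on TEXT);
    CREATE TABLE entry_patterns (entry_id INTEGER, theme TEXT);
"""


def schema_supports(schema_sql, query, params=()):
    """Return True if the query runs against an empty database built from schema_sql."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        conn.execute(query, params)
        return True
    except sqlite3.OperationalError:
        # Raised for a missing table or column, i.e. the schema can't answer this.
        return False
    finally:
        conn.close()
```

Calling schema_supports(NORMALIZED_SCHEMA, WANTED_QUERY, ("sleep",)) succeeds, while the JSON-column schema fails the same check because the join has no table to land on. The check costs seconds, and it runs before any user data exists in either shape.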

The first version of a schema is almost never right. That's not avoidable. What's avoidable is treating it like it can be corrected as cheaply as the code around it.