The first part of this pair ended on a single observation. The recurring failure across data work is a question that was never framed properly. The fix is a habit, and it can be run from either side of the table.
[Read More]
Why 'becoming data-driven' keeps falling short
A message lands in the analyst’s chat: “Can you pull the new FTS numbers?”
[Read More]
dbt on DuckDB — from raw tables to mart models
In the previous post I described how to load Parquet exports into a local DuckDB database — a fast, free, file-based data warehouse you can have running in an afternoon. The raw tables are queryable straight away. But to turn them into something reliable, documented, and ready for an analytics...
[Read More]
A personal data warehouse — free, fast, and local
Many business systems don’t offer a live database connection. They export data periodically — one Parquet file per table, dropped into a folder. That works fine for a one-off look. It becomes a problem the moment you want to join tables, apply consistent transformations, or connect the data to an...
[Read More]
Geocoding made easy
It happens from time to time that I’m using datasets which include physical addresses that need to be turned into longitude/latitude data. For example, when working with data on mass shootings in the US, the Gun Violence Archive (GVA) provides physical addresses only.
[Read More]