The two-way bargain between business and data

The first part of this pair ended on a single observation: the recurring failure across data work is a question that was never framed properly. The fix is a habit, and it can be run from either side of the table. [Read More]
Tags: Data and Business, business, data analytics, data literacy, AI, governance

dbt on DuckDB — from raw tables to mart models

In the previous post I described how to load Parquet exports into a local DuckDB database — a fast, free, file-based data warehouse you can have running in an afternoon. The raw tables are queryable straight away. But to turn them into something reliable, documented, and ready for an analytics... [Read More]
Tags: DuckDB, dbt, SQL, data modeling, data engineering, staging, marts

A personal data warehouse — free, fast, and local

Many business systems don’t offer a live database connection. They export data periodically — one Parquet file per table, dropped into a folder. That works fine for a one-off look. It becomes a problem the moment you want to join tables, apply consistent transformations, or connect the data to an... [Read More]
Tags: DuckDB, dbt, data warehouse, Parquet, SQL, data engineering

Geocoding made easy

From time to time I work with datasets that include physical addresses which need to be turned into longitude/latitude coordinates. For example, for data on mass shootings in the US, the Gun Violence Archive (GVA) provides physical addresses only. [Read More]
Tags: R, tidygeocoder, geocoding, geo, map, visualization, viz