In the previous post I described how to load Parquet exports into a local DuckDB database — a fast, free, file-based data warehouse you can have running in an afternoon. The raw tables are queryable straight away. But to turn them into something reliable, documented, and ready for an analytics...
[Read More]
A personal data warehouse — free, fast, and local
Many business systems don’t offer a live database connection. They export data periodically — one Parquet file per table, dropped into a folder. That works fine for a one-off look. It becomes a problem the moment you want to join tables, apply consistent transformations, or connect the data to an...
[Read More]
Geocoding made easy
From time to time I work with datasets that include physical addresses which need to be turned into longitude/latitude coordinates. For example, when working with data on mass shootings in the US, the Gun Violence Archive (GVA) provides physical addresses only.
[Read More]
PFAS Map
I came across this topic in a recent LinkedIn post by investigative journalist Daniel Drepper. An international research network investigated the spread of PFAS (per- and polyfluoroalkyl substances) to unveil the scale of pollution. This group of chemicals is linked to various diseases such as cancer and infertility. As a...
[Read More]
Waffle charts in R
Displaying proportional data, i.e., subsets of data that contribute to a whole, can be done in various ways. However, two particularly suitable types of visualisation are isotypes and waffle charts.
[Read More]