Handling Big(-ish) Data in Postgres with Data Partitioning

Maria Karanasou

TL;DR

To deal with a table that is predicted to hold a lot of data, we can leverage Postgres’ data partitioning feature.

This feature allows us to split a table into smaller ones, each holding a subset of the data according to a rule, e.g. the first letter of the customer’s name, or the month of the creation date.
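As a minimal sketch of what that looks like, here is a table partitioned by the month of its creation date (the id and payload columns and the range bounds are just illustrative assumptions):

CREATE TABLE parent_table (
    id         bigserial,
    payload    jsonb,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- One child table (partition) per month of data
CREATE TABLE parent_table_2023_08 PARTITION OF parent_table
    FOR VALUES FROM ('2023-08-01') TO ('2023-09-01');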

Thus we can query the parent table with a filter that matches the partitioning rule:

SELECT * FROM parent_table WHERE created_at = 'some date';

and get data faster.
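This works because of partition pruning: the planner skips any partition whose bounds cannot match the filter. You can verify it with EXPLAIN (the date below is just an example that falls inside the monthly partition sketched above):

EXPLAIN SELECT * FROM parent_table WHERE created_at = '2023-08-18';
-- The plan should show a scan on only the matching partition
-- (parent_table_2023_08 here); all other partitions are pruned.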

The Problem: A database table that can get too big to be efficiently queried.

Let’s say we have a table that holds the results of the real-time web-log preprocessing done by our analytics engine. The estimated input to that table is ~50 GB per week, so within a year we’d have a ~2.5 TB table. Even though this is within Postgres limits (see more in the Postgres Limits section at the end), running queries against a table of that size becomes increasingly slow and expensive.
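To make this concrete, here is a rough sketch of such a logs table (table name, columns, and dates are hypothetical), range-partitioned by week so that each child table stays around the ~50 GB mark:

CREATE TABLE web_logs (
    id         bigserial,
    payload    jsonb,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- One partition per week of logs, created ahead of time
CREATE TABLE web_logs_2023_w33 PARTITION OF web_logs
    FOR VALUES FROM ('2023-08-14') TO ('2023-08-21');

-- A DEFAULT partition (Postgres 11+) catches rows that fall
-- outside all of the defined ranges
CREATE TABLE web_logs_default PARTITION OF web_logs DEFAULT;

A nice side effect of this layout: expiring old data becomes a cheap DROP TABLE or DETACH PARTITION on the oldest child, instead of a slow bulk DELETE on one huge table.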
