Latest Blog Posts

Understanding High Water Mark Locking Issues in PostgreSQL Vacuums
Posted by Shane Borden on 2025-06-06 at 20:49

I recently had a customer that wanted to leverage read replicas to ensure that their read queries would not be impeded by work being done on the primary instance; they also required an SLA of, at worst, a few seconds. Ultimately they weren’t meeting the SLA, and my colleagues and I were asked to look at what was going on.

The first thing we came to understand is that the pattern of work on the primary is a somewhat frequent large DELETE statement followed by a data refresh accomplished by a COPY from STDIN command against a partitioned table with 16 hash partitions.

The problem being observed was that periodically the SELECTs occurring on the read replica would time out and not meet the SLA. Upon investigation, we found that the “startup” process on the read replica would periodically request an “exclusive lock” on some random partition. This exclusive lock would block the SELECT (which is partition unaware) and cause the timeout. But what was causing the lock?

After spending some time investigating, the team was able to correlate the exclusive lock with a routine “autovacuum” occurring on the primary. But why was it locking? After inspecting the WAL, it turned out that the issue was due to a step in the vacuum process whereby it tries to return free pages at the end of the table back to the OS by truncating the High Water Mark (HWM). Essentially, the lock is requested on the primary and then transmitted to the replica via the WAL so that the tables can be kept consistent.

To confirm that it was in fact the step in VACUUM that truncates the HWM, we decided to alter each partition of the table to allow VACUUM to skip that step:

ALTER TABLE [table name / partition name] SET (vacuum_truncate = false);
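
With 16 hash partitions to cover, it can be easier to apply the setting to every partition in one go. A minimal sketch, assuming a hypothetical parent table named orders:

-- Apply vacuum_truncate = false to every partition of "orders" (name assumed for illustration)
DO $$
DECLARE
    part regclass;
BEGIN
    FOR part IN
        SELECT inhrelid::regclass
        FROM pg_inherits
        WHERE inhparent = 'orders'::regclass
    LOOP
        EXECUTE format('ALTER TABLE %s SET (vacuum_truncate = false)', part);
    END LOOP;
END;
$$;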

After letting this run for 24 hours, we in fact saw no further blocking locks causing SLA misses on the replicas. Should we worry about no longer shrinking the High Water Mark (HWM)? Well, as with everything in IT, it depends. Other DBMS engines like Oracle do not shrink the

[...]

Just announced: PG DATA 2026
Posted by Henrietta Dombrovskaya on 2025-06-06 at 13:21

Chicago is better in summer! Our inaugural event, PG DATA 2026, will kick off 363 days from today, and we want you to be a part of it!

PG DATA’s mission is to encourage the growth of the PostgreSQL community in the Midwest through learning and networking opportunities, to foster collaboration with the developers’ community and academia, and to bring the best global Postgres speakers to our local community.

More details are coming soon! Watch for further announcements, and do not plan your vacation for the first week of June 2026. Alternatively, make Chicago your vacation destination—our beautiful city has a lot to offer, especially in summer!

Call for Sponsors is open!

Avoiding disk spills due to PostgreSQL's logical replication
Posted by Ashutosh Bapat on 2025-06-06 at 09:40

Logical replication is a versatile feature offered in PostgreSQL. I have discussed the theoretical background of this feature in detail in my POSETTE talk. At the end of the talk, I emphasize the need for monitoring a logical replication setup. If you are using logical replication and have set up monitoring, you will be familiar with pg_stat_replication_slots. In some cases this view shows high values for spill_txns, spill_count and spill_bytes, which indicates that the WAL sender corresponding to that replication slot is using a large amount of disk space. This increases load on the IO subsystem, affecting performance. It also means that there is less disk space available for user data and regular transactions. This is an indication that logical_decoding_work_mem has been configured too low. That's the subject of this blog: how to decide the right configuration value for logical_decoding_work_mem. Let's first discuss the purpose of this GUC; the talk might serve as good background before reading further.
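
A quick way to check whether a slot is spilling is to query the view directly; a minimal sketch:

SELECT slot_name,
       spill_txns,    -- transactions spilled to disk at least once
       spill_count,   -- number of times transaction data was spilled
       spill_bytes,   -- total bytes spilled to disk
       total_txns,
       total_bytes
FROM pg_stat_replication_slots;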

Reorder buffer and logical_decoding_work_mem

When decoding WAL, a logical WAL sender accumulates transactions in an in-memory data structure called the reorder buffer. For every transaction the WAL sender encounters, it maintains a queue of changes in that transaction. As it reads each WAL record, it finds the transaction ID the record belongs to and adds the change to the corresponding queue. As soon as it sees the COMMIT record of a transaction, it decodes all the changes in the corresponding queue and sends them downstream. If the reorder buffer fills up with transactions whose COMMIT record is yet to be seen, it spills the queues to disk. Such disk spills are accounted for in spill_txns, spill_count and spill_bytes. The amount of memory allocated to the reorder buffer is decided by the logical_decoding_work_mem GUC. If the value is too low, it will cause heavy disk spills; if it is too high, it will waste memory. Every WAL sender in the server will allocate logical_decoding_work_mem amount of memory, thu
[...]

CNPG Recipe 18 - Getting Started with pgvector on Kubernetes Using CloudNativePG
Posted by Gabriele Bartolini in EDB on 2025-06-05 at 20:42

Learn how to set up a PostgreSQL cluster with the pgvector extension on Kubernetes using CloudNativePG—all in a fully declarative way. This article walks you through the process in just a few minutes, from cluster creation to extension installation.

Benchmarking is hard, sometimes ...
Posted by Tomas Vondra on 2025-06-05 at 10:00

I do a fair number of benchmarks, not only to validate patches, but also to find interesting (suspicious) stuff to improve. It’s an important part of my development workflow. And it’s fun ;-) But we’re dealing with complex systems (hardware, OS, DB, application), and that brings challenges. Every now and then I run into something that I don’t quite understand.

Consider a read-only pgbench, the simplest workload there is, with a single SELECT doing lookup by PK. If you do this with a small data set on any machine, the expectation is near linear scaling up to the number of cores. It’s not perfect, CPUs have frequency scaling and power management, but it should be close.

Some time ago I tried running this on a big machine with 176 cores (352 threads), using scale 50 (about 750MB, so tiny - it actually fits into L3 on the EPYC 9V33X CPU). And I got the following chart for throughput with different client counts:

This is pretty awful. I still don’t think I entirely understand why this happens, or how to improve the behavior. But let me explain what I know so far, what I think may be happening, and perhaps someone will correct me or have an idea how to fix it.

Approximate the p99 of a query with pg_stat_statements
Posted by Michael Christofides on 2025-06-04 at 15:58

Cover photo by Luca Upper

I recently saw a feature request for pg_stat_statements to be able to track percentile performance of queries, for example the p95 (95th percentile) or p99 (99th percentile).

That would be fantastic, but isn’t yet possible. In the meantime, there is a statistically-dodgy-but-practically-useful (my speciality) way to approximate them using the mean and standard deviation columns in pg_stat_statements.

Why bother?

When wondering what our user experience is like across different queries, we can miss issues if we only look at things by the average time taken.

For example, let’s consider a query that takes on average 100ms but 1% of the time it takes over 500ms (its p99), and a second query that takes on average 110ms but with a p99 of 200ms. It is quite possible that the first query is causing more user dissatisfaction, despite being faster on average.

Brief statistics refresher

The standard deviation is a measure of the amount of variation from the mean. Wider distributions of values have larger standard deviations.

pg_stat_statements has mean_exec_time (mean execution time) and mean_plan_time (mean planning time) columns, but no median equivalents. The other columns we’ll be using for our approximation calculation are stddev_exec_time and stddev_plan_time.

In a perfectly normally distributed data set, the p90 is 1.28 standard deviations above the mean, the p95 is 1.65, and the p99 is 2.33.

Our query timings are probably not normally distributed, though. In fact, many will have a longer tail on the slow end, and some will have a multimodal distribution (with clustering due to things like non-evenly distributed data and differing query plans).

Having said that, even though many of our query timings are not normally distributed, queries with a high p99 are very likely to also have a high mean-plus-a-couple-of-standard-deviations, so if we approximate the p99 assuming a normal distribution, the results should be directionally correct.
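
A rough sketch of that idea against pg_stat_statements, using 2.33 standard deviations above the mean for the p99 (the exact query shape here is illustrative, not prescriptive):

SELECT queryid,
       calls,
       round(mean_exec_time::numeric, 2)   AS mean_ms,
       round(stddev_exec_time::numeric, 2) AS stddev_ms,
       round((mean_exec_time + 2.33 * stddev_exec_time)::numeric, 2) AS approx_p99_ms
FROM pg_stat_statements
ORDER BY approx_p99_ms DESC
LIMIT 20;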

Ju

[...]

Ultimate Guide to POSETTE: An Event for Postgres, 2025 edition
Posted by Claire Giordano in CitusData on 2025-06-04 at 15:26

POSETTE: An Event for Postgres 2025 is back for its 4th year—free, virtual, and packed with deep expertise. No travel needed, just your laptop, internet, and curiosity.

This year’s 45 speakers are smart, capable Postgres practitioners—core contributors, performance experts, application developers, Azure engineers, extension maintainers—and their talks are as interesting as they are useful.

The four livestreams (42 talks total) run from June 10-12, 2025. Every talk will be posted to YouTube afterward (un-gated, of course). But if you can join live, I hope you do! On the virtual hallway track on Discord, you’ll be able to chat with POSETTE speakers—as well as other attendees. And yes, there will be swag.

This “ultimate guide” blog post is your shortcut to navigating POSETTE 2025. In this post you’ll get:

“By the numbers” summary for POSETTE 2025

Here’s a quick snapshot of what you need to know about POSETTE:

About POSETTE: An Event for Postgres 2025
  • 3 days: June 10-12, 2025
  • 4 livestreams: In Americas & EMEA time zones (but of course you can watch from anywhere)
  • 42 talks: All free, all virtual
  • 2 keynotes: From Bruce Momjian & Charles Feddersen
  • 45 speakers: PG contributors, users, application developers, community members, & Azure engineers
[...]

Contributions for the week of 2025-05-19 (Week 21)
Posted by Boriss Mejias in postgres-contrib.org on 2025-06-04 at 10:06

PGDay Blumenau 2025 (https://pgdayblumenau.com.br/) took place May 24 in Blumenau, Brazil, organized by João Foltran and Gustavo Lemos, with Sara Kruger and Leonardo Corsini as volunteers.

Speakers:

Boldly Migrate to PostgreSQL – Introducing credativ-pg-migrator
Posted by Josef Machytka in credativ on 2025-06-03 at 06:00

Many companies these days are thinking about migrating their databases from legacy or proprietary systems to PostgreSQL. The primary aim is to reduce costs, enhance capabilities, and ensure long-term sustainability. However, even just the idea of migrating to PostgreSQL can be overwhelming. Very often, knowledge about the legacy applications is limited or even lost. In some cases, vendor support is diminishing, and expert pools and community support are shrinking. Legacy databases are also often running on outdated hardware and old operating systems, posing further risks and limitations.

PgPedia Week, 2025-06-01
Posted by Ian Barwick on 2025-06-02 at 23:00

If you, like me, set up read access to the PostgreSQL Git repository many years ago, and have been wondering why it's been returning fatal: Could not read from remote repository errors for the past few days, it's because git:// protocol support has been deactivated and may be withdrawn entirely (see the pgsql-www thread “git repo "https://" working but "git://" is not”).

The resolution is to convert it to https://, e.g.:

git remote set-url origin https://git.postgresql.org/git/postgresql.git

PostgreSQL 18 changes this week

A very quiet week, with commits-of-interest consisting of smaller changes together with a bunch of back-patched fixes.

PostgreSQL 18 articles

Postgres Extensions: Use PG_MODULE_MAGIC_EXT (2025-05-29) - David E. Wheeler discusses the changes to extension metadata added in commit 9324c8c5

more...

Yes, Postgres can do session vars - but should you use them?
Posted by Kaarel Moppel on 2025-06-02 at 21:00
Prompted by some comments and complaints about Postgres’ missing user variables story on a Reddit post about PostgreSQL pain points in the real world, I thought I’d elaborate a bit on session vars - which is indeed a little-known piece of Postgres functionality, although this “alley” has existed for ages...
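
The usual mechanism behind this is customized (dot-qualified) options; a minimal sketch, not necessarily the exact approach the post explores:

-- Set a session-scoped "variable"
SET myapp.current_user_id = '42';

-- Or set one scoped to the current transaction only
SELECT set_config('myapp.current_user_id', '42', true);

-- Read it back; the second argument returns NULL instead of erroring if the setting is unset
SELECT current_setting('myapp.current_user_id', true);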

The Fun of Open Source: Roman Numerals in PostgreSQL
Posted by Laurenz Albe in Cybertec on 2025-06-02 at 18:39


© Laurenz Albe 2025

Recently, I wrote about the power of open source. Now, power is good and important, but open source software has other good sides as well. One of these aspects is fun. I will showcase that with the recently introduced support for converting Roman numerals to numbers.
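
A small taste of it, assuming the feature in question is the RN format that to_number() accepts as of PostgreSQL 18 (to_char() has long supported Roman-numeral output):

SELECT to_char(2025, 'FMRN');     -- number to Roman numeral: MMXXV
SELECT to_number('MMXXV', 'RN');  -- Roman numeral to number: 2025 (new in PostgreSQL 18)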

The driving forces behind PostgreSQL development

Before we turn to the Roman numerals, let's take a step back. People with little knowledge about open source software sometimes believe that open source software is something that computer nerds write in their spare time. But developing open source software is not primarily a pastime for people with nothing better to do. The driving force behind most open source software development is actually money, directly or indirectly. Sometimes customers sponsor the development of a feature they need. Companies that live by selling PostgreSQL services or add-on products invest time and money into development so that the software is attractive to users.

But what drives the people who actually write the software? Well, certainly it is the money they get paid for it. But there is more to the picture:

  • Contributions to open source software are visible and earn you the respect of others
  • You want the tools you have to work with to be as good as possible (Did you ever wonder why psql has so many features?)
  • You develop emotional ties to the software and want it to be good
  • There is actually some fun to be had

Fun in PostgreSQL

Reading the PostgreSQL documentation or using the software you probably don't see a lot of fun. Both speak to you in a technical, no-nonsense fashion. You have to look to the places where people communicate about PostgreSQL. This is primarily the mailing list pgsql-hackers. Sure enough, discussions there are fact-oriented and sometimes heated, but there is room for fun.

But most of the fun with PostgreSQL happens at conferences. Read my article about PostgreSQL conferences and you will know what I mean.

Why do we need su

[...]

Dilan Tek
Posted by Andreas 'ads' Scherbaum on 2025-06-02 at 14:00
PostgreSQL Person of the Week Interview with Dilan Tek: My name is Dilan. I was born in Ankara, the capital of Turkiye. I love my city very much! Although I lived in Istanbul for a while for work, I returned to Ankara.

Postgres Partitioning Best Practices
Posted by Karen Jex in Crunchy Data on 2025-06-02 at 08:49

Slides and transcript from my talk, "Postgres Partitioning Best Practices", at PyCon Italia in Bologna on 29 May 2025.

Thank you to everyone who came to listen, apologies to the people who were turned away because the room was full (who knew so many people would want to learn about Partitioning!), and thank you for all the questions, which have given me lots of ideas for improvements.

I'll share the recording as soon as it's available.

Postgres Partitioning Best Practices


I always feel the need to get this confession out of the way before I get too far into the talk [at developer conferences] - I’m not a developer. Sorry!

But as you can see in this diagram of my career so far, I at least know about databases. I was a DBA for 20 years before becoming a database consultant, and now a senior solutions architect. I don't have "database" in my job title any more, but I still only work with database systems, specifically PostgreSQL.

I’ve worked with a lot of developers during that time and I’ve learnt a lot from them, and in return I try to share some of my database knowledge.

The photo is me in my happy place, on a local bike trail, splattered in mud!

I’m also on the PostgreSQL Europe board of directors and I’m leading the PostgreSQL Europe Diversity Task Force so feel free to find me to talk to me about that.



If you have huge database tables, or if you expect to have huge database tables in the future, you'll probably start to think about partitioning to make them easier to manage. You probably have lots of questions about how to do that and what the best practices are.
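
For anyone who hasn’t seen declarative partitioning before, here is a minimal sketch (table, column and boundary values are hypothetical):

CREATE TABLE measurements (
    device_id   bigint      NOT NULL,
    recorded_at timestamptz NOT NULL,
    reading     numeric
) PARTITION BY RANGE (recorded_at);

-- One partition per month; rows are routed automatically based on recorded_at
CREATE TABLE measurements_2025_06 PARTITION OF measurements
    FOR VALUES FROM ('2025-06-01') TO ('2025-07-01');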

I work with lots of customers who have already been through that thought process, and this presentation is based on the questions they asked along the way, and the things they learnt.



We'll look at:

  • An introduction to partitioning:
    What is table partitioning, and how does it work?
  • What are the advantages of partitioning?
  • How do you choose a Partition Key?
    How
[...]

SCaLE 22x: Bringing the Open Source Community to Pasadena
Posted by Sarah Conway in Data Bene on 2025-06-02 at 00:00

The Southern California Linux Expo (SCaLE) 22x, recognized as being North America’s largest community-run open source and free software conference, took place at the Pasadena Convention Center from March 6-9, 2025. When I say community-run, I mean it—no corporate overlords dictating the agenda, just pure open source enthusiasm driving four days of technical discussions and collaboration.

This year’s conference focused on the topics of AI, DevOps and cloud-native technologies, open source community engagement, security and compliance, systems and infrastructure, and FOSS @ home (exploring the world of self-hosted applications and cloud services).

The conference drew attendees from around the world to talk about everything open source, revolving around Linux at the core (of course) while continuing the discussion across topics such as embedded systems & IoT. As always, there was a unique blend of cutting-edge tech talk and practical problem-solving in every space, which is what makes SCaLE special.

Herding Elephants: PostgreSQL@SCaLE22x

PostgreSQL@SCaLE22x ran as a dedicated two-day, two-track event on March 6-7, 2025, recognized under the PostgreSQL Global Development Group community event guidelines. The selection team included Gabrielle Roth, Joe Conway, and Mark Wong, ensuring the quality you’d expect from the PostgreSQL community.

The speaker lineup was impressive: Magnus Hagander, Christophe Pettus, Peter Farkas, Devrim Gündüz, Hamid Akhtar, Henrietta Dombrovskaya, Shaun Thomas, Gülçin Yıldırım Jelínek & Andrew Farries, Nick Meyer, and Jimmy Angelakos. One particularly memorable session was titled “Row-Level Security Sucks. Can We Make It Usable?”—a refreshingly honest take on PostgreSQL’s RLS feature that probably resonated with more than a few database administrators in the audience.

The community “Ask Me Anything” panel was hosted by Stacey Haysler and featured Christophe Pettus, Devrim Gündüz, Jimmy Angelakos, Magnus Hagander, and Mark Wong. These sessions are where the re

[...]

Postgres Extensions Day Montréal 2025
Posted by Andreas Scherbaum on 2025-06-01 at 22:00
On the Monday before PGConf.dev, Postgres Extensions Day 2025 took place. Same venue, same floor. A day fully packed with talks and interesting discussions. This event shows once again that extensions in PostgreSQL are thriving. It also shows that there is a lot of work to do to make extensions more usable and discoverable for users. Just a few of the problems that were discussed: a database can only load one extension version.

Understanding Split-Brain Scenarios in Highly Available PostgreSQL Clusters
Posted by semab tariq in Stormatics on 2025-05-30 at 10:49

High Availability (HA) refers to a system design approach that ensures a service remains accessible even in the event of hardware or software failures. In PostgreSQL, HA is typically implemented through replication, failover mechanisms, and clustering solutions to minimize downtime and ensure data consistency. Hence, HA is very important for your mission-critical applications. 

In this blog post, we will explore a critical failure condition known as a split-brain scenario that can occur in PostgreSQL HA clusters. We will first see what split-brain means, then how it can impact PostgreSQL clusters, and finally discuss how to prevent it through architectural choices and tools available in the PostgreSQL ecosystem.

What is a Split-Brain Scenario?

A split-brain scenario occurs when two or more nodes in a cluster lose communication with each other but continue operating as if they are the primary (or leader) node.

We know that the standby node receives data from the primary in real time. However, if a network glitch occurs between the primary and the standby, and the standby is unable to receive data for a certain timeout period, it may assume that the primary has failed. 

As a result, the standby might promote itself to a new primary. But since the issue was only a temporary network problem, the original primary is still active. This leads to a situation where two primary nodes are accepting writes at the same time, which is a highly dangerous state for the cluster.

Split-brain is particularly dangerous because it breaks the fundamental guarantees of consistency in an HA cluster. Writes can happen independently on multiple primary nodes, and reconciling those changes later is often impossible without data loss or manual intervention.

Common Scenarios That Lead to Split-Brain

As we have seen earlier, the most common cause of split-brain is a network failure between the primary and standby nodes. However, there are several other situations where a cluster might also encounte

[...]

Bridged Indexes in OrioleDB: architecture, internals & everyday use?
Posted by Alexander Korotkov on 2025-05-30 at 00:00

Since version beta10, OrioleDB has supported building indexes other than B-tree. Bridged indexes are what make these index types work on OrioleDB tables.

1. Why OrioleDB needs a “bridge”

OrioleDB stores its table rows inside a B-tree built on the table's primary key and keeps MVCC information in an undo log, so it can’t simply plug PostgreSQL’s existing Index Access Methods (GiST, GIN, SP-GiST, BRIN, …) into that structure. PostgreSQL's Index Access Methods:

  • reference a 6-byte ctid (block number and offset in the heap) -- not a logical key;
  • keep every live version of a row in the index, leaving visibility checks to the executor;
  • support inserts only in the index and rely on VACUUM for physical deletion.

OrioleDB indexes, in contrast, are MVCC-aware: they point to the rows via primary-key values and support logical updates/deletes directly in the index. To remain heap-free while still allowing users to build on the rich ecosystem of non-B-tree indexes, OrioleDB introduces a bridge index layer.

2. How the bridge works under the hood

  1. Virtual iptr column -- an incrementally increasing "index pointer" automatically added to the table. The new value of iptr is assigned each time any column referenced by a bridged index is updated, ensuring the pointer remains stable for the indexed data.
  2. Bridge index -- a lightweight secondary index that maps iptr to primary-key value. It behaves like a normal OrioleDB secondary B-tree, except it doesn't use undo log for MVCC.
  3. PostgreSQL indexes (GIN/GiST/...) are built on the iptr values instead of ctids, so their structure stays compatible with the IndexAM API. During scans, the engine looks up iptr, translates it through the bridge index, and then fetches the row by primary key.
  4. The vacuum process collects stale iptr-s that are not visible to any snapshot, and asks the underlying IndexAM to clean up; then physically deletes the same pointers from the bridge index.
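
A minimal usage sketch, assuming OrioleDB is installed and that a GIN index created on an OrioleDB table is automatically backed by the bridge layer described above (the exact options required may differ):

CREATE EXTENSION IF NOT EXISTS orioledb;

CREATE TABLE documents (
    id   int PRIMARY KEY,
    body tsvector
) USING orioledb;

-- A non-B-tree index on an OrioleDB table; per the description above, it is
-- backed by the bridge index layer rather than by heap ctids.
CREATE INDEX documents_body_idx ON documents USING gin (body);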

The result is a tri-level lookup path: Inde

[...]

Postgres Extensions: Use PG_MODULE_MAGIC_EXT
Posted by David Wheeler in Tembo on 2025-05-29 at 22:09

A quick note for PostgreSQL extension maintainers: PostgreSQL 18 introduces a new macro: PG_MODULE_MAGIC_EXT. Use it to name and version your modules. Where your module .c file likely has:

PG_MODULE_MAGIC;

Or:

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

Change it to something like:

#ifdef PG_MODULE_MAGIC_EXT
PG_MODULE_MAGIC_EXT(.name = "module_name", .version = "1.2.3");
#else
PG_MODULE_MAGIC;
#endif

Replace the name of your module and the version as appropriate. Note that PG_MODULE_MAGIC was added in Postgres 8.2; if for some reason your module still supports earlier versions, use a nested #ifdef to conditionally execute it:

#ifdef PG_MODULE_MAGIC_EXT
PG_MODULE_MAGIC_EXT(.name = "module_name", .version = "1.2.3");
#else
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
#endif

If you manage the module version in your Makefile, as the PGXN Howto suggests, consider renaming the .c file to .c.in and changing the Makefile like so:

  • Replace .version = "1.2.3" with .version = "__VERSION__"

  • Add src/$(EXTENSION).c to EXTRA_CLEAN

  • Add this make target:

    src/$(EXTENSION).c: src/$(EXTENSION).c.in
            sed -e 's,__VERSION__,$(EXTVERSION),g' $< > $@
    
  • If you use Git, add /src/*.c to .gitignore

For an example of this pattern, see semver@3526789.

That’s all!

Tip: Put your Rails app on a SQL Query diet
Posted by Andrew Atkinson on 2025-05-29 at 17:29

Introduction

Much of the time spent processing HTTP requests in web apps goes to processing SQL queries. To minimize that, we want to avoid unnecessary or duplicate queries, and generally perform as few queries as possible.

Think of the work that needs to happen for every query. The database engine parses it, creates a query execution plan, executes it, and then sends the response to the client.

When the response reaches the client, there’s even more work to do. The response is transformed into application objects in memory.

How do we see how many queries are being created for our app actions?

Count the queries

When doing backend work in a web framework like Rails, monitor the number of queries being created directly, by the ORM, or by libraries. ORMs like Active Record can generate more than one query from a given line of code. Libraries can generate queries that are problematic and may be unnecessary.

Over time, developers may duplicate queries unknowingly. These are all real causes of unnecessary queries from my work experience.

Why are excessive queries a problem?

Why reduce the number of queries?

Besides parsing, planning, executing, and serializing the response, the client is subject to a hard upper limit on the number of TCP connections it can open to the database server.

In Postgres that’s configured as max_connections. The application will have a variable number of open connections based on use, and its configuration of processes, threads and its connection pool. Keeping the query count low helps avoid exceeding the upper limit.
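
On the Postgres side, a quick sketch for comparing current connection usage against that limit:

SELECT count(*)                                AS open_connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;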

What about memory use?

What about app server memory?

With Ruby on Rails, the cost of repeated queries is shifted because the SQL Cache is enabled by default, which stores and serves results for matching repeated queries, at the cost of some memory use.

As an aside, from Rails 7.1 the SQL Cache uses a least recently used (LRU) algorithm. We can also configure the max number of queries to cache, 100 by default, to control how much

[...]

Don't mock the database: Data fixtures are parallel safe, and plenty fast
Posted by Brandur Leach in Crunchy Data on 2025-05-29 at 13:00

The API powering our Crunchy Bridge product is written in Go, a language that provides a good compromise between productivity and speed. We're able to keep good forward momentum on getting new features out the door, while maintaining an expected latency of low double digits of milliseconds for most API endpoints.

A common pitfall for new projects in fast languages like Go is that their creators, experiencing a temporary DX sugar high of faster compile and runtime speeds than they've previously encountered in their career, become myopically focused on performance above anything else, and start making performance optimizations with bad cost/benefit tradeoffs.

The textbook example of this is the database mock. Here's a rough articulation of the bull case for this idea: CPUs are fast. Memory is fast. Disks are slow. Why should tests have to store data to a full relational database with all its associated bookkeeping when that could be swapped out for an ultra-fast, in-memory key/value store? Think of all the time that could be saved by skipping that pesky fsync, not having to update that plethora of indexes, and foregoing all that expensive WAL accounting. Database operations measured in hundreds of microseconds or even *gasp*, milliseconds, could plausibly be knocked down to 10s of microseconds instead.

Mock everything, test nothing

Anyone who's substantially journeyed down the path of database mocks will generally tell you that it leads nowhere good. They are fast (although disk speed has improved by orders of magnitude over the last decade), but every other one of their aspects leaves something to be desired.

A fatal flaw is that an in-memory mock bears no resemblance to a real database and the exhaustive constraints that real databases put on input data. Consider for example, whether a mock would fail like a database in any of these scenarios:

  • A value is inserted for a column that doesn't exist.
  • A value of the wrong data type for a column is inserted.
  • Duplicate values are inse
[...]

pgstream v0.6.0: Template transformers, observability, and performance improvements
Posted by Ahmet Gedemenli in Xata on 2025-05-29 at 12:45
Learn how pgstream v0.6 simplifies complex data transformations with custom templates, enhances observability and improves snapshot performance.

Advanced Patch Feedback Session (APFS) at pgconf.dev 2025
Posted by Tomas Vondra on 2025-05-29 at 12:00

The pgconf.dev conference, a revamp of the original PGCon, happened about two weeks ago. It’s the main event for Postgres developers, and one of the things we’re trying is an Advanced Patch Feedback Session (APFS).

We first tried that last year in Vancouver, and then again in Montreal. But I realized many people attending the conference either are not aware of the event at all, or are not sure what it’s about. So let me explain, and share some reflections from this year.

PostgreSQL Europe Diversity Task Force (Deep Dive and Updates)
Posted by Karen Jex in Crunchy Data on 2025-05-28 at 17:05

I'm sharing the slides and transcript from my talk about the PostgreSQL Europe Diversity Task Force at PostgreSQL Development Conference 2025.

It's an extended version of the 5 minute lightning talk that I gave at FOSDEM PGDay earlier this year, with some updates on what we've achieved.

If you want a shorter read, feel free to check out the annotated slides from the lightning talk instead!



You may or may not already know that PostgreSQL Europe (PGEU) launched a Diversity Task Force last year.



  • But why have we created the diversity task force?
  • Who’s working on it?
  • What’s being done?
  • How can you get involved?

and, maybe more importantly, how can we help you?



Firstly, and briefly, what does diversity actually mean?

This poor little word has unfortunately had some bad press recently but, in a nutshell, it just means variety.

Especially

“The inclusion of people of different races, cultures etc. in a group or organisation”



Gender and race are often the first things that spring to mind when people talk about diversity,

and of course, those are important.

But diversity is also about:

  • your work experience and current job role,
  • how and where you grew up,
  • the language you speak,
  • whether or not you have children or other caring responsibilities,
  • any physical disabilities or neurological differences you may have,
  • your educational background,
  • and much, much more.

This is far from an exhaustive list!

There are so many different things that bring variety to our community, and so many attributes that make each of us unique.



I encouraged the audience to go to Stacey Haysler's talk Everything You Need to Know About Diversity in 25 Minutes later the same day.

[The slides are available on the conference website.]

Why do we have a diversity task force?



It’s no secret that there are challenges in terms of diversity across the entire tech industry, and the Postgres

[...]

pgroll 0.13.0: Start using pgroll on existing databases with the new baseline feature
Posted by Andrew Farries in Xata on 2025-05-27 at 09:00
The new baseline feature in pgroll 0.13.0 makes it easy to start using pgroll on databases with existing schema and keep your schema histories clean.

How to Upgrade Major PostgreSQL Versions: A Practical Production Guide
Posted by warda bibi in Stormatics on 2025-05-27 at 07:31

PostgreSQL versions follow a well-defined five-year support lifecycle. Each major release receives bug fixes, security patches, and minor updates for five years from its initial release date. After that point, the version reaches end-of-life (EOL) and no longer receives official updates.

Staying on an EOL version exposes your systems to security risks, potential compatibility issues, and missing performance improvements introduced in later releases. You can always check the current support status of PostgreSQL versions on the official PostgreSQL Versioning Policy page.

Upgrading to the latest version ensures long-term stability, access to new features, and better support. Recently, I worked on upgrading a critical production PostgreSQL environment from version 11 to 15. Version 15 was chosen because the client’s application had only been tested up to that release.  The system supported large batch workloads and live applications, so we had to be meticulous. While this article draws from that specific project, the steps are broadly applicable to anyone planning a major PostgreSQL upgrade, especially when crossing several versions.

This guide outlines a generalized, production-ready approach for performing major version upgrades using the pg_dump/pg_restore method.

Upgrade Methods

PostgreSQL provides two primary upgrade options, each with distinct advantages.

1. In-place upgrade using pg_upgrade
This method is designed for rapid transitions and minimal downtime. It upgrades the system catalog in place and reuses existing data files, making it highly efficient. However, it requires careful compatibility checks, especially around tablespaces, file system layout, and extensions.

2. Logical upgrade using pg_dump and pg_restore
This method involves exporting the database schema and data from the old cluster and importing them into a new one. While it involves longer downtime and more disk I/O, it avoids binary compatibility issues and is wel

[...]

Prairie Postgres Developers Summit
Posted by Henrietta Dombrovskaya on 2025-05-27 at 01:38

Even before the Community Summit in Montreal, I wanted to organize a Developers Summit at one of our meetups. My inspiration came from the Open spaces at DevOps Day Chicago. I saw how productive the smaller group discussions could be, and wanted to try to discuss different aspects of interactions.

We finally made it happen on May 22! In the meetup announcement, I asked those signing up to vote on the topics we suggested and to name additional topics. Then, when the meetup started, we re-voted on the topics, and, same as in Montreal, the outcome was different :).

I talked with several meetup members about leading group discussions, so we were prepared. However, a person who volunteered to lead the AI discussion ended up not making it to the meetup, so those who wanted to discuss AI had to choose a different topic. We ended up with three discussions:

  • Using ORM with Postgres
  • Application-driven design
  • Monitoring Postgres applications

The discussions lasted about 45 minutes, and I had to force-end them. Below are the summaries provided by the discussion leads, and a recording is coming soon!

Doğaç Eldenk, Application-driven design


In this discussion, we talked about common table design strategies to increase performance and enrich the developer experience. We talked about adding standardized columns such as “created_at” and “updated_at” to record some metadata about the row. Those columns are convenient for ETL jobs and developers. We discussed whether those should be assigned by the application or the database using default values or triggers; we mostly favored database-managed values for uniformity and control. Moreover, we went through unique identifiers, IDs, UUID v4, and time-based UUID v7, and how they can benefit the developer and the database. Also, semi-structured data is pretty common in applications; for this reason, using JSON columns is a good option. For semi-structured repeated data, you can use an array type rather than normalizing the data into a separate table

[...]

Dirk Krautschick
Posted by Andreas 'ads' Scherbaum on 2025-05-26 at 14:00
PostgreSQL Person of the Week Interview with Dirk Krautschick: I am Dirk, living near Aachen in Germany. I am currently a Senior Solution Architect with Aiven and a former DBA, Trainer, Consultant and Sales Engineer at several companies, but always related to databases - first Oracle only, later Oracle together with PostgreSQL.

Contributions for the week of 2025-05-19 (Weeks 20/21 overview)
Posted by Jimmy Angelakos in postgres-contrib.org on 2025-05-26 at 13:17
  • Postgres Extensions Day took place May 12 in Montreal, Canada, organized by Chris Travers and Yurii Rashkovskii, with Andrei Lepikhov, Michel Pelletier, and Gwen Shapira additionally on the Talk Selection Committee.

    • Speakers:
      • Hannu Krosing - From pl/v8 to pl/ < any > : towards easier extension development
      • Andrey Borodin - Upgrade as an extension
      • Paul Jungwirth - Inlining Postgres Functions, Now and Then
      • Alvaro Hernandez - Postgres à la carte: dynamic container images with your choice of extensions
      • Yurii Rashkovskii - Extension Upgrades
      • Phil Eaton - Working with MemoryContexts and debugging memory leaks in Postgres
      • Sweta Vooda - Postgres as a Control Plane: Challenges in Offloading Compute via Extensions
      • Ruohang Feng - The Missing Postgres Extension Repo and Package Manager
      • David Wheeler - How to automatically release your extensions on PGXN
      • Cary Huang - Extending PostgreSQL with Java: Overcoming Development Challenges in Bridging Java and C Application
      • Cheng Chen - Rethinking OLAP Architecture: The Journey to pg_mooncake v0.2
      • Florents Tselai - Spat: Hijacking Shared Memory for a Redis-Like Experience in PostgreSQL
      • Yurii Rashkovskii - Cppgres: One less reason to hate C++
      • Alastair Turner - Extensibility: new options and a wish list
  • PGConf.DEV (PostgreSQL Developer Conference) took place May 13-16, in Montreal, Canada

[...]

PgPedia Week, 2025-05-25
Posted by Ian Barwick on 2025-05-26 at 11:30
PostgreSQL 18 changes this week

Sadly two features were reverted this week:

  • planner optimization "Partitions pruned by initial pruning are no longer locked" (commit 525392d5, reverted in commit 1722d5eb)
  • function pg_get_process_memory_contexts() (commit 042a6629, reverted in commit fb844b9f)

PostgreSQL 18 articles

UUIDv7 Comes to PostgreSQL 18 (2025-05-09) - Gwen Shapira

more...
