Window Functions in PostgreSQL

Introduction

Window functions in PostgreSQL allow you to perform calculations across a set of table rows related to the current row, without collapsing them into a single result. Unlike aggregate functions, which group rows into a single output, window functions retain individual row details while computing calculations across a defined window of rows.

This article explores:

  • The basics of window functions
  • Using OVER() and PARTITION BY
  • Common window functions like ROW_NUMBER(), RANK(), DENSE_RANK(), LEAD(), LAG(), and SUM()
  • Practical examples using the Reservations table

Understanding Window Functions in PostgreSQL

Window functions use the OVER() clause to define the window of rows over which the function operates. You can specify an optional PARTITION BY clause to divide the data into groups and an ORDER BY clause to control the row order.

Syntax:

SELECT column_name, window_function() OVER (PARTITION BY column_name ORDER BY column_name) FROM table_name;

Common Window Functions

1. ROW_NUMBER() – Assigning Unique Row Numbers

The ROW_NUMBER() function assigns a unique sequential integer to rows within a partition.

Example:

SELECT reservation_id, customer_id, total_rental_price,
       ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY pickup_date) AS row_num
FROM Reservations;

This query assigns a unique number to each reservation per customer based on pickup date.

Query result :

carrentaldb=# SELECT reservation_id, customer_id, total_rental_price,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY pickup_date) AS row_num
FROM Reservations;
reservation_id | customer_id | total_rental_price | row_num
----------------+-------------+--------------------+---------
1 | 1 | 180.00 | 1
2 | 2 | 80.00 | 1
3 | 3 | 126.00 | 1
4 | 4 | 190.00 | 1
5 | 5 | 200.00 | 1
6 | 6 | 190.00 | 1
7 | 7 | 78.00 | 1
8 | 8 | 240.00 | 1
9 | 9 | 500.00 | 1
10 | 10 | 175.00 | 1
(10 rows)

2. RANK() – Ranking with Gaps

The RANK() function assigns ranks to rows but allows gaps in ranking if there are duplicates.

Example:

SELECT reservation_id, customer_id, total_rental_price,
       RANK() OVER (ORDER BY total_rental_price DESC) AS rank
FROM Reservations;

This query ranks reservations by total rental price, allowing gaps in ranking for duplicate values.

Query result :

carrentaldb=# SELECT reservation_id, customer_id, total_rental_price,
RANK() OVER (ORDER BY total_rental_price DESC) AS rank
FROM Reservations;
reservation_id | customer_id | total_rental_price | rank
----------------+-------------+--------------------+------
9 | 9 | 500.00 | 1
8 | 8 | 240.00 | 2
5 | 5 | 200.00 | 3
6 | 6 | 190.00 | 4
4 | 4 | 190.00 | 4
1 | 1 | 180.00 | 6
10 | 10 | 175.00 | 7
3 | 3 | 126.00 | 8
2 | 2 | 80.00 | 9
7 | 7 | 78.00 | 10
(10 rows)

3. DENSE_RANK() – Ranking Without Gaps

Similar to RANK(), but without gaps between rank values.

Example:

SELECT reservation_id, customer_id, total_rental_price,
       DENSE_RANK() OVER (ORDER BY total_rental_price DESC) AS dense_rank
FROM Reservations;

This query ranks reservations by total rental price without skipping rank numbers. Query result :

carrentaldb=# SELECT reservation_id, customer_id, total_rental_price,
DENSE_RANK() OVER (ORDER BY total_rental_price DESC) AS dense_rank
FROM Reservations;
reservation_id | customer_id | total_rental_price | dense_rank
----------------+-------------+--------------------+------------
9 | 9 | 500.00 | 1
8 | 8 | 240.00 | 2
5 | 5 | 200.00 | 3
6 | 6 | 190.00 | 4
4 | 4 | 190.00 | 4
1 | 1 | 180.00 | 5
10 | 10 | 175.00 | 6
3 | 3 | 126.00 | 7
2 | 2 | 80.00 | 8
7 | 7 | 78.00 | 9
(10 rows)

4. LEAD() – Accessing Next Row Value

The LEAD() function allows access to the next row’s value within a partition.

Example:

SELECT reservation_id, customer_id, pickup_date,
       LEAD(pickup_date) OVER (PARTITION BY customer_id ORDER BY pickup_date) AS next_pickup
FROM Reservations;

This query retrieves the next reservation’s pickup date for each customer.

5. LAG() – Accessing Previous Row Value

The LAG() function allows access to the previous row’s value within a partition.

Example:

SELECT reservation_id, customer_id, pickup_date,
       LAG(pickup_date) OVER (PARTITION BY customer_id ORDER BY pickup_date) AS previous_pickup
FROM Reservations;

This query retrieves the previous reservation’s pickup date for each customer.

6. SUM() – Running Total

The SUM() function computes a running total of a column’s values.

Example:

SELECT reservation_id, customer_id, total_rental_price,
       SUM(total_rental_price) OVER (PARTITION BY customer_id ORDER BY pickup_date) AS running_total
FROM Reservations;

This query calculates the cumulative rental price per customer over time.

Conclusion

Window functions in PostgreSQL are powerful tools for performing calculations across a set of related rows while preserving individual row details. By using ROW_NUMBER(), RANK(), DENSE_RANK(), LEAD(), LAG(), and SUM(), you can efficiently analyze data trends and rankings. Mastering window functions will enhance your ability to work with complex queries and perform advanced analytics in PostgreSQL.

You may also like