PostgreSQL mini cookbook: Aggregate Query Tricks
Sum a field
You want to calculate the total number of payments per customer, staff.
SELECT a, b, sum(c) FROM t GROUP BY a ,b;
# SELECT * FROM payment; payment_id | customer_id | staff_id | rental_id | amount | payment_date ------------+-------------+----------+-----------+--------+---------------------------- 17503 | 341 | 2 | 1520 | 7.99 | 2007-02-15 22:25:46.996577 17504 | 341 | 1 | 1778 | 1.99 | 2007-02-16 17:23:14.996577 17505 | 341 | 1 | 1849 | 7.99 | 2007-02-16 22:41:45.996577 17506 | 341 | 2 | 2829 | 2.99 | 2007-02-19 19:39:56.996577 17507 | 341 | 2 | 3130 | 7.99 | 2007-02-20 17:31:48.996577 17508 | 341 | 1 | 3382 | 5.99 | 2007-02-21 12:33:49.996577 # SELECT customer_id, staff_id, SUM(amount) FROM payment GROUP BY customer_id, staff_id; customer_id | staff_id | sum -------------+----------+-------- 476 | 1 | 44.88 205 | 2 | 48.91 344 | 1 | 12.95 598 | 1 | 43.90 14 | 1 | 55.84
You need to construct a
SELECT query with a
GROUP BY clause and apply a
So, to find the total number of orders we have per product in the system we can do:
SELECT p.pr_code, p.pr_desc, sum(o.ord_qty) AS quantity FROM products p, orders o WHERE p.pr_code=o.ord_product GROUP BY p.pr_code, p.pr_desc;
So why do we
GROUP BY both the product-code and description,
surely they match up?
Well, they do, but if you return a field from a query it
should either be mentioned in the GROUP BY or be subject to
an aggregate function (such as sum).
As an alternative we could have used:
SELECT p.pr_code, min(p.pr_desc), sum(o.ord_qty) AS quantity FROM products p, orders o WHERE p.pr_code=o.ord_product GROUP BY p.pr_code;
Which achieves the same since for any given pr_code we only have one pr_desc but probably isn’t as clear.
Finding the maximum/minimum value
You’re not interested in how many widgets your biggest customer bought but you do want to know who they are.
Use a subquery that finds the
max() of the quantity
you want to measure and compare against it.
This is similar to totalling, but you use the min/max functions instead of sum. In addition, we need to compare individual records against this maximum value. So, to get the company name and order details of the largest order in the system:
SELECT c.co_name, p.pr_code, p.pr_desc, o.ord_qty FROM companies c, products p, orders o WHERE c.co_id=o.ord_company AND p.pr_code=o.ord_product AND o.ord_qty = (SELECT max(ord_qty) FROM orders);
If there are two or more orders both have the same largest quantity, they will both be returned.
Counting doesn’t give me the total number of records in a table
You know there are 100 records in this table but count(field) returns less than that.
count(*) to avoid skipping NULL values in a specific field.
You try to count the number of diary entries in the system:
SELECT count(dy_notes) FROM diary;
You find that entries with
NULL dy_notes are being skipped.
This is the correct behavior:
count()only counts non-null values.
The way of solving this is:
SELECT count(*) FROM diary;
I want to know how many types of X are in the system
You have a thousands of products in your range but suspect most don’t sell. You want a list of product-types that are in your orders list.
GROUP BY the product-code and sum quantity if you want to know that,
In practice, you will probably want to see how many of each product you’ve sold.
SELECT ord_product, sum(ord_qty) FROM orders GROUP BY ord_product;
If you don’t you can use
SELECT count(distinct ord_product) FROM orders;
This ignores null values of ord_product and throws away duplicates.
Defining your own aggregate function
You work for NASA and
sum() etc. aren’t enough for you.
Create your own aggregate function with
CREATE AGGREGATE and an accumulator function.
If you’re planning on doing this without reading the manuals think again. Basically however you need to define:
- The return-type of the aggregate function (e.g.
- The type of thing it aggregates (e.g.
- An initial value (probably
- A state-transition function to accumulate values (takes the running total and the next value)
- Optionally a finalisation function to produce the final value (useful for averages or standard deviations)
You might well end up coding the state-transition and finalisation functions in C, but here is an trivial one-liner.
CREATE AGGREGATE catenate(sfunc1=textcat, basetype=text, stype1=text, initcond1='');
This sets the initial condition to '' (the empty string), accepts and returns text and uses the built-in textcat function to concatenate string values.
NOTE - order of concatenated text is not well-defined. You should only use commutable functions such as +, * etc (subnote - commutable might not be right I might mean ‘associative’. If you don’t know whether I’m right or wrong, make sure you test your aggregate function carefully).
Was this page helpful?
Glad to hear it!
Sorry to hear that.