Joins, Aggregations, Transactions
INNER/LEFT joins, GROUP BY, COUNT, BEGIN/COMMIT.
Knowing single-table SELECTs is like knowing how to drive on an empty parking lot. The real road is multi-table queries: counting orders per user, finding customers with no orders, summing revenue by month, and doing all of it atomically. That's joins, aggregations, and transactions. Let's drive.
The setup
We'll reuse the users and orders tables from the previous lesson:
-- users(id, email, name, created_at)
-- orders(id, user_id, amount_cents, status, created_at)
-- Picture this data:
-- users
-- ┌──────┬───────────────────┬─────────┐
-- │ id │ email │ name │
-- ├──────┼───────────────────┼─────────┤
-- │ u1 │ ada@example.com │ Ada │
-- │ u2 │ grace@example.com │ Grace │
-- │ u3 │ alan@example.com │ Alan │ -- no orders
-- └──────┴───────────────────┴─────────┘
--
-- orders
-- ┌──────┬─────────┬──────────────┬─────────┐
-- │ id │ user_id │ amount_cents │ status │
-- ├──────┼─────────┼──────────────┼─────────┤
-- │ o1 │ u1 │ 1000 │ paid │
-- │ o2 │ u1 │ 2500 │ pending │
-- │ o3 │ u2 │ 4999 │ paid │
-- │ o4 │ NULL │ 800 │ paid │ -- orphan guest order
-- └──────┴─────────┴──────────────┴─────────┘The four joins, demystified
A join glues two tables together row-by-row on a matching condition. The four flavors differ in which rows survive when the match fails on one side.
INNER JOIN: keep only matches on both sides
SELECT u.name, o.amount_cents, o.status
FROM users u
INNER JOIN orders o ON o.user_id = u.id;
-- Result (Alan dropped, guest order dropped):
-- name | amount_cents | status
-- Ada | 1000 | paid
-- Ada | 2500 | pending
-- Grace | 4999 | paidLEFT JOIN: keep every row from the left table
SELECT u.name, o.amount_cents, o.status
FROM users u
LEFT JOIN orders o ON o.user_id = u.id;
-- Result (Alan now appears with NULLs):
-- name | amount_cents | status
-- Ada | 1000 | paid
-- Ada | 2500 | pending
-- Grace | 4999 | paid
-- Alan | NULL | NULLRIGHT JOIN: keep every row from the right table
SELECT u.name, o.amount_cents
FROM users u
RIGHT JOIN orders o ON o.user_id = u.id;
-- Result (guest order appears with NULL user):
-- name | amount_cents
-- Ada | 1000
-- Ada | 2500
-- Grace | 4999
-- NULL | 800FULL OUTER JOIN: keep everything from both sides
SELECT u.name, o.amount_cents
FROM users u
FULL OUTER JOIN orders o ON o.user_id = u.id;
-- Result (Alan AND guest both appear):
-- name | amount_cents
-- Ada | 1000
-- Ada | 2500
-- Grace | 4999
-- Alan | NULL
-- NULL | 800Aggregations: COUNT, SUM, AVG, MIN, MAX
An aggregate function collapses many rows into one. Without GROUP BY, it collapses the whole table.
SELECT COUNT(*) FROM users; -- 3
SELECT SUM(amount_cents) FROM orders; -- 9299
SELECT AVG(amount_cents)::int FROM orders WHERE status = 'paid';
SELECT MIN(created_at), MAX(created_at) FROM orders;GROUP BY: aggregate per bucket
GROUP BY collapses rows that share the same value(s) into one row, with aggregates applied per group.
-- Orders per user
SELECT user_id, COUNT(*) AS order_count, SUM(amount_cents) AS total
FROM orders
GROUP BY user_id;
-- user_id | order_count | total
-- u1 | 2 | 3500
-- u2 | 1 | 4999
-- NULL | 1 | 800Combine with a join to get user names instead of IDs:
SELECT u.name, COUNT(o.id) AS order_count, COALESCE(SUM(o.amount_cents), 0) AS total
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name
ORDER BY total DESC;
-- name | order_count | total
-- Grace | 1 | 4999
-- Ada | 2 | 3500
-- Alan | 0 | 0 -- LEFT JOIN keeps Alan even with no ordersSUM of zero rows returns NULL. COALESCE(x, 0) returns x unless x is NULL, in which case it returns the fallback. Use it whenever a NULL would surprise a downstream consumer.HAVING vs WHERE
WHERE filters rows before grouping. HAVING filters groups after. You can't use an aggregate in WHEREbecause the groups don't exist yet.
-- Users who have at least 2 paid orders
SELECT u.name, COUNT(*) AS paid_orders
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE o.status = 'paid' -- per-row filter, BEFORE grouping
GROUP BY u.id, u.name
HAVING COUNT(*) >= 2; -- per-group filter, AFTER groupingSQL execution order (it's not top-to-bottom)
Even though you write SELECT ... FROM ... WHERE ... GROUP BY ... HAVING ... ORDER BY ... LIMIT, Postgres logically runs the clauses in a different order:
FROM(load tables)JOIN(combine them)WHERE(filter rows)GROUP BY(bucket rows)HAVING(filter buckets)SELECT(pick columns)ORDER BY(sort)LIMIT(slice)
That's why a column alias from SELECTisn't available in WHERE but is available in ORDER BY.
Transactions: all or nothing
Some operations only make sense as a single atomic unit. Transferring money is the classic example: subtract from one account, add to another, both or neither. Postgres gives you BEGIN, COMMIT, and ROLLBACK.
BEGIN;
UPDATE accounts SET balance = balance - 1000 WHERE id = 'a1';
UPDATE accounts SET balance = balance + 1000 WHERE id = 'a2';
-- If everything looks right:
COMMIT;
-- If anything failed or looked wrong:
ROLLBACK;Between BEGIN and COMMIT, your writes are invisible to other connections. They become visible atomically at the moment of COMMIT.
import { Pool } from "pg";
const pool = new Pool();
async function transfer(from: string, to: string, cents: number) {
const client = await pool.connect();
try {
await client.query("BEGIN");
await client.query(
"UPDATE accounts SET balance = balance - $1 WHERE id = $2",
[cents, from]
);
await client.query(
"UPDATE accounts SET balance = balance + $1 WHERE id = $2",
[cents, to]
);
await client.query("COMMIT");
} catch (err) {
await client.query("ROLLBACK");
throw err;
} finally {
client.release();
}
}try / finally and client.release() in the finally block.UPSERT with ON CONFLICT
Need to "insert if missing, otherwise update"? Postgres has INSERT ... ON CONFLICT DO UPDATE, also known as an UPSERT.
INSERT INTO users (email, name)
VALUES ('ada@example.com', 'Ada King')
ON CONFLICT (email) DO UPDATE
SET name = EXCLUDED.name,
updated_at = now()
RETURNING id, name;EXCLUDEDis a virtual row holding the values that would have been inserted - use it to reference "the new value" inside the update.
-- "Idempotent insert": do nothing if the email already exists
INSERT INTO users (email, name)
VALUES ('ada@example.com', 'Ada')
ON CONFLICT (email) DO NOTHING;ON CONFLICT (email) only works if email has a unique constraint or unique index. Otherwise Postgres has no way to detect the collision.Quiz
You want all users in the result even if they have zero orders. Which join?
Recap
- INNER JOIN = matches only. LEFT JOIN = left rows + matches. RIGHT/FULL = the symmetric versions.
- Aggregate with
COUNT,SUM,AVG,MIN,MAX. Bucket withGROUP BY. WHEREfilters rows before grouping;HAVINGfilters groups after.- Use
BEGIN/COMMIT/ROLLBACKfor atomic multi-statement writes. Always release pool clients. INSERT ... ON CONFLICTgives you UPSERT (update on conflict) and idempotent inserts (do nothing on conflict).