131
SELECT
A
JOIN
clause combines two
FROM
items, which for convenience we will refer to as “tables”,
though in reality they can be any type of
FROM
item. Use parentheses if necessary to determine
the order of nesting. In the absence of parentheses,
JOIN
snest left-to-right. In any case
JOIN
binds more tightly than the commas separating
FROM
-list items.
CROSS JOIN
and
INNER JOIN
produce a simple Cartesian product, the same result as you get
from listing the two tables at the top level of
FROM
,but restricted by the join condition (if any).
CROSS JOIN
is equivalent to
INNER JOIN ON (TRUE)
,that is, no rows are removed by qual-
ification. These join types are just a notational convenience, since they do nothing you couldn’t
do with plain
FROM
and
WHERE
.
LEFT OUTER JOIN
returns all rows in the qualified Cartesian product (i.e., all combined rows
that pass its join condition), plus one copy of each row in the left-hand table for which there was
noright-hand rowthatpassed the joincondition. This left-hand rowis extendedto the full width
of the joined table by inserting null values for the right-hand columns. Note that only the
JOIN
clause’s own condition is considered while decidingwhich rows have matches. Outer conditions
are applied afterwards.
Conversely,
RIGHT OUTER JOIN
returns all the joined rows, plus one row for each unmatched
right-hand row (extended with nulls on the left). This is just a notational convenience, since you
could convert it to a
LEFT OUTER JOIN
by switching the left and right tables.
FULL OUTER JOIN
returns all the joined rows, plus one row for each unmatched left-hand row
(extended with nulls on the right), plus one row for each unmatched right-hand row (extended
with nulls on the left).
ON
join_condition
join_condition
is an expression resulting in a value of type
boolean
(similar to a
WHERE
clause) that specifies which rows ina join are considered to match.
USING (
join_column
[, ...] )
A clause of the form
USING ( a, b, ... )
is shorthand for
ON left_table.a =
right_table.a AND left_table.b = right_table.b ...
. Also,
USING
implies that
only one of each pair of equivalent columns will be included in the join output, not both.
NATURAL
NATURAL
is shorthand for a
USING
list that mentions all columns in the two tables that have the
same names.
LATERAL
The
LATERAL
key word can precede a sub-
SELECT FROM
item. This allows the sub-
SELECT
to
refer to columns of
FROM
items that appear before it in the
FROM
list. (Without
LATERAL
,each
sub-
SELECT
is evaluated independently and so cannot cross-reference any other
FROM
item.)
LATERAL
can also precede a function-call
FROM
item, but in this case it is a noise word, because
the function expression can refer to earlier
FROM
items in any case.
A
LATERAL
item can appear at top level in the
FROM
list, or within a
JOIN
tree. In the latter case
it can also refer to any items that are on the left-hand side of a
JOIN
that it is on the right-hand
side of.
When a
FROM
item contains
LATERAL
cross-references, evaluation proceeds as follows: for each
row of the
FROM
item providing the cross-referenced column(s), or set of rows of multiple
FROM
items providing the columns, the
LATERAL
item is evaluated using that row or row set’s values
of the columns. The resulting row(s) are joined as usual withthe rows they were computed from.
This is repeated for each row or set of rows from the column source table(s).
1583
98
SELECT
Thecolumnsourcetable(s) mustbe
INNER
or
LEFT
joinedtothe
LATERAL
item, else there would
not be a well-definedsetof rows from which tocompute each set of rows for the
LATERAL
item.
Thus, although a construct such as
X
RIGHT JOIN LATERAL
Y
is syntactically valid, it is not
actually allowed for
Y
toreference
X
.
WHERE
Clause
The optional
WHERE
clause has the general form
WHERE
condition
where
condition
is any expression that evaluates to a result of type
boolean
.Any row that does
not satisfy this conditionwill be eliminatedfrom the output. Arow satisfies the condition if it returns
true whenthe actual row values are substituted for any variable references.
GROUP BY
Clause
The optional
GROUP BY
clause has the general form
GROUP BY
expression
[, ...]
GROUP BY
willcondenseinto asingle rowallselected rows that share the same values for the grouped
expressions.
expression
can be an input column name, or the name or ordinal number of an output
column (
SELECT
list item), or an arbitrary expression formed from input-column values. In case of
ambiguity, a
GROUP BY
name will be interpreted as an input-column name rather than an output
column name.
Aggregate functions, if any are used, are computed across all rows making up each group, producing
aseparate value for each group. (If there are aggregate functions but no
GROUP BY
clause, the query
is treated as having a single group comprising all the selected rows.) The set of rows fed to each
aggregate function can be further filteredby attachinga
FILTER
clause to the aggregate function call;
see Section 4.2.7 for more information. When a
FILTER
clause is present, only those rows matching
it are included in the input to that aggregate function.
When
GROUP BY
is present, or any aggregate functions are present, it is not valid for the
SELECT
list
expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped
column is functionally dependent onthe grouped columns, since there would otherwise be more than
one possible value to return for an ungrouped column. Afunctional dependency exists if the grouped
columns (or a subset thereof) are the primary key of the table containing the ungroupedcolumn.
Keep in mind that all aggregate functions are evaluated before evaluating any “scalar” expressions in
the
HAVING
clause or
SELECT
list. This means that, for example, a
CASE
expression cannot be used
to skip evaluation of an aggregate function; see Section 4.2.14.
Currently,
FOR NO KEY UPDATE
,
FOR UPDATE
,
FOR SHARE
and
FOR KEY SHARE
cannot be spec-
ified with
GROUP BY
.
1584
127
SELECT
HAVING
Clause
The optional
HAVING
clause has the general form
HAVING
condition
where
condition
is the same as specified for the
WHERE
clause.
HAVING
eliminates group rows that do not satisfy the condition.
HAVING
is different from
WHERE
:
WHERE
filters individual rows before the application of
GROUP BY
,while
HAVING
filters group rows
created by
GROUP BY
. Each column referenced in
condition
must unambiguously reference a
grouping column, unless the reference appears within an aggregate function or the ungrouped col-
umn is functionally dependent on the grouping columns.
The presenceof
HAVING
turns a query intoa grouped query evenif there is no
GROUP BY
clause. This
is the same as what happens when the query contains aggregate functions but no
GROUP BY
clause.
All the selected rows are considered to form a single group, and the
SELECT
list and
HAVING
clause
can only reference table columns from within aggregate functions. Such a query will emit a single
row if the
HAVING
condition is true, zero rows if it is not true.
Currently,
FOR NO KEY UPDATE
,
FOR UPDATE
,
FOR SHARE
and
FOR KEY SHARE
cannot be spec-
ified with
HAVING
.
WINDOW
Clause
The optional
WINDOW
clause has the general form
WINDOW
window_name
AS (
window_definition
) [, ...]
where
window_name
is a name that can be referenced from
OVER
clauses or subsequent window
definitions, and
window_definition
is
[
existing_window_name
]
[ PARTITION BY
expression
[, ...] ]
[ ORDER BY
expression
[ ASC | DESC | USING
operator
] [ NULLS { FIRST | LAST } ] [, ...] ]
[
frame_clause
]
If an
existing_window_name
is specified it must refer to an earlier entry in the
WINDOW
list; the
newwindow copies its partitioningclause from that entry, as well as its ordering clause if any. In this
case the new window cannot specify its own
PARTITION BY
clause, and it can specify
ORDER BY
only if the copied windowdoes not have one. The newwindow always uses its own frame clause; the
copied window must not specify a frame clause.
The elements of the
PARTITION BY
list are interpreted in much the same fashion as elements of a
GROUP BY Clause, except that they are always simple expressions and never the name or number of
an output column. Another difference is that these expressions can contain aggregate function calls,
which are not allowed in a regular
GROUP BY
clause. They are allowed here because windowing
occurs after grouping and aggregation.
Similarly, the elements of the
ORDER BY
list are interpreted in much the same fashion as elements of
an ORDER BY Clause, except that the expressions are always taken as simple expressions and never
the name or number of an output column.
The optional
frame_clause
defines the window frame for window functions that depend on the
frame (not all do). The window frame is a set of related rows for each row of the query (called the
current row). The
frame_clause
can be one of
1585
155
SELECT
{ RANGE | ROWS }
frame_start
{ RANGE | ROWS } BETWEEN
frame_start
AND
frame_end
where
frame_start
and
frame_end
can be one of
UNBOUNDED PRECEDING
value
PRECEDING
CURRENT ROW
value
FOLLOWING
UNBOUNDED FOLLOWING
If
frame_end
is omitted it defaults to
CURRENT ROW
.Restrictions are that
frame_start
cannot
be
UNBOUNDED FOLLOWING
,
frame_end
cannot be
UNBOUNDED PRECEDING
,and the
frame_end
choice cannot appear earlier in the above list than the
frame_start
choice — for example
RANGE
BETWEEN CURRENT ROW AND
value
PRECEDING
is not allowed.
The default framing option is
RANGE UNBOUNDED PRECEDING
, which is the same as
RANGE
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
;it sets the frame to be all rows from the
partition start up through the current row’s last peer (a row that
ORDER BY
considers equivalent to
the current row, or all rows if there is no
ORDER BY
). In general,
UNBOUNDED PRECEDING
means
that the frame starts with the first row of the partition, and similarly
UNBOUNDED FOLLOWING
means
that the frame ends with the last row of the partition (regardless of
RANGE
or
ROWS
mode). In
ROWS
mode,
CURRENT ROW
means that the frame starts or ends with the current row; but in
RANGE
mode it
means that the frame starts or ends with the current row’s first or last peer in the
ORDER BY
ordering.
The
value PRECEDING
and
value FOLLOWING
cases are currently only allowed in
ROWS
mode.
They indicate that the frame starts or ends with the row that many rows before or after the current
row.
value
must be an integer expression not containing any variables, aggregate functions, or
window functions. The value must not be null or negative; but it can be zero, which selects the
current row itself.
Beware that the
ROWS
options can produce unpredictable results if the
ORDER BY
ordering does not
order the rows uniquely. The
RANGE
options are designed to ensure that rows that are peers in the
ORDER BY
ordering are treated alike; all peer rows will be in the same frame.
The purpose of a
WINDOW
clause is to specify the behavior of window functions appearing in the
query’s SELECT List or ORDER BY Clause. These functions can reference the
WINDOW
clause en-
tries by name in their
OVER
clauses. A
WINDOW
clause entry does nothave to be referencedanywhere,
however; if itis notused in the queryit is simply ignored. It is possible touse window functions with-
out any
WINDOW
clause at all, since a window function call can specify its window definition directly
in its
OVER
clause. However, the
WINDOW
clause saves typing when the same window definition is
needed for more than one window function.
Currently,
FOR NO KEY UPDATE
,
FOR UPDATE
,
FOR SHARE
and
FOR KEY SHARE
cannot be spec-
ified with
WINDOW
.
Window functions are described in detail in Section 3.5, Section 4.2.8, and Section 7.2.4.
SELECT
List
The
SELECT
list (betweenthe keywords
SELECT
and
FROM
)specifiesexpressions that form theoutput
rows of the
SELECT
statement. The expressions can (and usually do) refer to columns computed in
the
FROM
clause.
Just as in a table, every output columnof a
SELECT
has a name. In a simple
SELECT
this name is just
used to label the column for display, but when the
SELECT
is a sub-query of a larger query, the name
is seen by the larger query as the column name of the virtual table produced by the sub-query. To
1586
129
SELECT
specify the name to use for an output column, write
AS output_name
after the column’s expression.
(You can omit
AS
,but only if the desired output name does not match any PostgreSQL keyword (see
Appendix C). For protection against possible future keyword additions, it is recommended that you
always either write
AS
or double-quote the outputname.) If you do notspecify acolumn name, aname
is chosenautomatically by PostgreSQL. If the column’s expressionis a simple column reference then
the chosen name is the same as that column’s name. In more complex cases a function or type name
may be used, or the system may fall back on a generated name such as
?column?
.
An output column’s name can be used to refer to the column’s value in
ORDER BY
and
GROUP BY
clauses, but not in the
WHERE
or
HAVING
clauses; there you must write out the expression instead.
Instead of an expression,
*
can be written in the output list as a shorthand for all the columns of the
selected rows. Also, you can write
table_name
.
*
as a shorthand for the columns coming from just
that table. In these cases it is not possible to specify new names with
AS
;the output column names
will be the same as the table columns’ names.
DISTINCT
Clause
If
SELECT DISTINCT
is specified, all duplicate rows are removed from the result set (one row is
kept from each group of duplicates).
SELECT ALL
specifies the opposite: all rows are kept; that is the
default.
SELECT DISTINCT ON (
expression
[, ...] )
keeps only the first row of each set of rows
where the given expressions evaluate to equal. The
DISTINCT ON
expressions are interpreted using
the same rules as for
ORDER BY
(see above). Note that the “first row” of each set is unpredictable
unless
ORDER BY
is usedto ensure that the desired row appears first. For example:
SELECT DISTINCT ON (location) location, time, report
FROM weather_reports
ORDER BY location, time DESC;
retrieves the most recent weather report for each location. But if we had not used
ORDER BY
to force
descending order of time values for each location, we’d have gotten a report from an unpredictable
time for each location.
The
DISTINCT ON
expression(s) must match the leftmost
ORDER BY
expression(s). The
ORDER BY
clause will normally contain additional expression(s) that determine the desired precedence of rows
within each
DISTINCT ON
group.
Currently,
FOR NO KEY UPDATE
,
FOR UPDATE
,
FOR SHARE
and
FOR KEY SHARE
cannot be spec-
ified with
DISTINCT
.
UNION
Clause
The
UNION
clause has this general form:
select_statement
UNION [ ALL | DISTINCT ]
select_statement
select_statement
is any
SELECT
statement without an
ORDER BY
,
LIMIT
,
FOR NO KEY
UPDATE
,
FOR UPDATE
,
FOR SHARE
, or
FOR KEY SHARE
clause. (
ORDER BY
and
LIMIT
can be
attached to a subexpression if it is enclosed in parentheses. Without parentheses, these clauses will
be taken to apply to the result of the
UNION
,not to its right-hand input expression.)
The
UNION
operator computes the set union of the rows returned by the involved
SELECT
statements.
Arow is in the set union of two result sets if it appears in at least one of the result sets. The two
1587
Documents you may be interested
Documents you may be interested