Chapter 35. Extending SQL
-- cross-type comparisons int2 vs int8
OPERATOR 1 < (int2, int8) ,
OPERATOR 2 <= (int2, int8) ,
OPERATOR 3 = (int2, int8) ,
OPERATOR 4 >= (int2, int8) ,
OPERATOR 5 > (int2, int8) ,
FUNCTION 1 btint28cmp(int2, int8) ,
-- cross-type comparisons int2 vs int4
OPERATOR 1 < (int2, int4) ,
OPERATOR 2 <= (int2, int4) ,
OPERATOR 3 = (int2, int4) ,
OPERATOR 4 >= (int2, int4) ,
OPERATOR 5 > (int2, int4) ,
FUNCTION 1 btint24cmp(int2, int4) ;
Notice that this deﬁnition “overloads” theoperator strategy and support function numbers: each num-
ber occurs multiple times within the family. This is allowed so long as each instance of a particular
number has distinct input data types. The instances that have both input types equal to an operator
class’s input type are the primary operators and support functions for that operator class, and in most
cases shouldbe declared as part of the operator class rather than as loose members of the family.
In a B-tree operator family, all the operators in the family must sort compatibly, meaning that the
transitive laws hold across all the data types supported by the family: “if A = B and B = C, then A =
C”, and“if A < BandB < C, then A< C”. Moreover, implicitor binary coercioncasts betweentypes
represented in the operator family must not change the associated sort ordering. For each operator in
the family there must be a support function having the same two input data types as the operator. It
is recommended that a family be complete, i.e., for each combination of data types, all operators are
included. Each operator class should include just the non-cross-type operators and support function
for its data type.
To build amultiple-data-typehashoperator family, compatible hashsupport functions mustbe created
for each data typesupportedbythefamily. Here compatibilitymeans that the functions are guaranteed
to return the same hash code for any two values that are considered equal by the family’s equality
operators, even whenthe values are of different types. This is usually difﬁcultto accomplishwhen the
types have different physical representations, but it can be done in some cases. Furthermore, casting
avalue from one data type represented in the operator family to another data type also represented
in the operator family via an implicit or binary coercion cast must not change the computed hash
value. Notice that there is only one support function per data type, not one per equality operator. It
is recommended that a family be complete, i.e., provide an equality operator for each combination
of data types. Each operator class should include just the non-cross-type equality operator and the
support function for its data type.
GiST, SP-GiST, and GIN indexes do not have any explicit notion of cross-data-type operations. The
set of operators supported is just whatever the primary support functions for a given operator class
Note: Prior to PostgreSQL 8.3, there was no concept of operator families, andso any cross-data-
typeoperators intendedtobeused with an index had tobebound directly intothe index’s operator
class. While this approachstill works, it is deprecated because it makes an index’s dependencies
too broad, and because the planner can handle cross-data-type comparisons more effectively
when both data types have operators in the same operator family.
Chapter 35. Extending SQL
35.14.6. System Dependencies on Operator Classes
PostgreSQL uses operator classes to infer the properties of operators in more ways than just whether
they can be used with indexes. Therefore, you might want to create operator classes even if you have
no intention of indexing any columns of your data type.
In particular, there are SQL features such as
that require comparison and
sorting of values. To implement these features on a user-deﬁned data type, PostgreSQL looks for the
default B-tree operator class for the data type. The “equals” member of this operator class deﬁnes the
system’s notionof equalityof values for
,andthe sort ordering imposed by
the operator class deﬁnes the default
Comparison of arrays of user-deﬁned types also relies on the semantics deﬁned by the default B-tree
If there is no default B-tree operator class for a data type, the system will look for a default hash
operator class. But since that kind of operator class only provides equality, in practice it is only
enough tosupport array equality.
When there is no default operator class for a data type, you will get errors like “could not identify an
ordering operator” if you try to use these SQL features with the data type.
Note: In PostgreSQL versions before 7.4, sorting and grouping operations would implicitly use
.The new behavior of relying on default operator classes avoids
having to make any assumption about the behavior of operators with particular names.
Another important point is that an operator that appears in a hash operator family is a candidate for
hash joins, hash aggregation, and related optimizations. The hash operator family is essential here
since it identiﬁes the hash function(s) to use.
35.14.7. Ordering Operators
Some index access methods (currently, only GiST) support the concept of ordering operators. What
we have been discussing so far are search operators. A search operator is one for which the index
can be searched to ﬁnd all rows satisfying
WHERE indexed_column operator constant
that nothing is promised about the order in which the matching rows will be returned. In contrast,
an ordering operator does not restrict the set of rows that can be returned, but instead determines
their order. An ordering operator is one for which the index can be scanned to return rows in the
order represented by
ORDER BY indexed_column operator constant
.The reason for deﬁning
ordering operators that way is that it supports nearest-neighbor searches, if the operator is one that
measures distance. For example, a query like
FROM places ORDER BY location <-> point ’(101,456)’ LIMIT 10;
ﬁnds the ten places closest to a given target point. A GiST index on the location column can do this
is an ordering operator.
While search operators have to return Boolean results, ordering operators usually return some other
type, such as ﬂoat or numeric for distances. This type is normally not the same as the data type being
indexed. To avoid hard-wiringassumptions about the behavior of differentdata types, thedeﬁnitionof
an ordering operator is required toname a B-tree operator familythatspeciﬁes the sort ordering of the
result data type. As was stated in the previous section, B-tree operator families deﬁne PostgreSQL’s
notion of ordering, so this is a natural representation. Since the point
it could be speciﬁed in an operator class creation command like this:
Chapter 35. Extending SQL
<-> (point, point) FOR ORDER BY float_ops
is the built-inoperator family thatincludes operations on
states that the index is able to return rows inorder of increasing values of the
35.14.8. Special Features of Operator Classes
There are twospecialfeatures of operator classes that we have not discussed yet, mainlybecause they
are not useful with the most commonly usedindex methods.
Normally, declaring an operator as a member of an operator class (or family) means that the index
method can retrieve exactly the set of rows that satisfy a
condition using the operator. For
FROM table WHERE integer_column < 4;
can be satisﬁed exactly by a B-tree index on the integer column. But there are cases where an index
is useful as an inexact guide to the matching rows. For example, if a GiST index stores only bound-
ing boxes for geometric objects, then it cannot exactly satisfy a
condition that tests overlap
between nonrectangular objects such as polygons. Yet we could use the index to ﬁnd objects whose
bounding box overlaps the bounding box of the target object, and then do the exact overlap test only
on the objects found by the index. If this scenarioapplies, the index is saidto be “lossy” for the oper-
ator. Lossy index searches are implemented by having the index method return a recheck ﬂag when a
row might or might not really satisfy the query condition. The core system will then test the original
query condition on the retrieved row to see whether it should be returned as a valid match. This ap-
proach works if the index is guaranteed to return all the required rows, plus perhaps some additional
rows, whichcanbe eliminated by performingtheoriginaloperator invocation. Theindexmethods that
support lossy searches (currently, GiST, SP-GiST and GIN) allow the support functions of individual
operator classes to set the recheck ﬂag, and so this is essentially an operator-class feature.
Consider again the situation where we are storing in the index only the bounding box of a complex
object suchas a polygon. In this case there’s not much value instoringthewhole polygonin the index
entry — we might as well store just a simpler object of type
.This situation is expressed by the
CREATE OPERATOR CLASS
:we’d write something like:
CREATE OPERATOR CLASS polygon_ops
DEFAULT FOR TYPE polygon USING gist AS
At present, only the GiST and GIN index methods support a
type that’s different from
the column data type. The GiST
support routines must deal with data-
type conversion when
is used. In GIN, the
type identiﬁes the type of the “key”
values, which normally is different from the type of the indexed column — for example, an operator
class for integer-array columns might have keys that are just integers. The GIN
support routines are responsible for extracting keys from indexed values.
35.15. Packaging Related Objects into an Extension
Auseful extension to PostgreSQL typically includes multiple SQL objects; for example, a new data
type willrequire newfunctions, newoperators, and probably newindex operator classes. It is helpful
Chapter 35. Extending SQL
to collect all these objects into a single package to simplify database management. PostgreSQL calls
such a package an extension. To deﬁne an extension, you need at least a script ﬁle that contains the
SQL commandsto create the extension’s objects, and acontrolﬁle that speciﬁes a fewbasic properties
of the extension itself. If the extension includes C code, there will typically also be a shared library
ﬁle intowhich the C code has been built. Once you have these ﬁles, a simple CREATE EXTENSION
command loads the objects into your database.
The main advantage of using an extension, rather than just running the SQL script to load a bunch
of “loose” objects into your database, is that PostgreSQL will then understand that the objects of the
extension go together. You can drop all the objects with a single DROP EXTENSION command (no
need to maintain a separate “uninstall” script). Even more useful, pg_dump knows that it should not
dump the individual member objects of the extension — it will just include a
command in dumps, instead. This vastly simpliﬁes migration to a new version of the extension that
might contain more or different objects than the old version. Note however that you must have the
extension’s control, script, and other ﬁles available when loading such a dump into a new database.
PostgreSQL will not let you drop an individual object contained in an extension, except by dropping
the whole extension. Also, while you can change the deﬁnition of an extension member object (for
CREATE OR REPLACE FUNCTION
for afunction), bear in mind thatthemodiﬁeddeﬁni-
tion willnotbedumpedbypg_dump. Such a change is usually onlysensible if you concurrentlymake
the same change in the extension’s script ﬁle. (But there are special provisions for tables containing
conﬁguration data; see below.)
The extension mechanism also has provisions for packaging modiﬁcation scripts that adjust the deﬁ-
nitions of the SQL objects contained in an extension. For example, if version 1.1 of an extension adds
one function and changes the body of another function comparedto 1.0, the extension author can pro-
vide an update scriptthatmakes justthose twochanges. The
ALTER EXTENSION UPDATE
can then be used to apply these changes and track which version of the extension is actually installed
in a given database.
The kinds of SQL objects that can be members of an extension are shown in the description of AL-
TER EXTENSION. Notably, objects that are database-cluster-wide, such as databases, roles, and
tablespaces, cannot be extension members since an extension is only known within one database.
(Although an extension script is not prohibited from creating such objects, if it does so they will not
be tracked as part of the extension.) Also notice that while a table can be a member of an extension,
its subsidiary objects such as indexes are not directly considered members of the extension. Another
important point is that schemas can belong to extensions, but not vice versa: an extension as such
has an unqualiﬁed name and does not exist “within” any schema. The extension’s member objects,
however, will belong to schemas whenever appropriate for their object types. It may or may not be
appropriate for an extension to own the schema(s) its member objects are within.
35.15.1. Extension Files
The CREATE EXTENSION command relies on a control ﬁle for each extension, which must be
named the same as the extension with a sufﬁx of
,and must be placed in the installation’s
directory. There must also be at least one SQL script ﬁle, which follows the
). By default, the script ﬁle(s) are also placed in the
the control ﬁle can specify a different directory for the script ﬁle(s).
The ﬁle format for an extension control ﬁle is the same as for the
ﬁle, namely a
parameter_name = value
assignments, one per line. Blank lines and comments introduced
are allowed. Be sure to quote any value that is not a single word or number.
Acontrol ﬁle can set the following parameters:
Chapter 35. Extending SQL
The directory containing the extension’s SQL script ﬁle(s). Unless an absolute path is given, the
name is relative to the installation’s
directory. The default behavior is equivalent to
directory = ’extension’
The default version of the extension (the one that will be installed if no version is speciﬁed in
). Although this can be omitted, that will result in
failing if no
option appears, so you generally don’t want to do that.
Acomment (any string) about the extension. Alternatively, the comment can be set by means of
the COMMENT command in the script ﬁle.
The character set encoding used by the script ﬁle(s). This should be speciﬁed if the script ﬁles
contain any non-ASCII characters. Otherwise the ﬁles will be assumed to be in the database
The value of this parameter will be substituted for each occurrence of
in the script ﬁle(s). If it is not set, no substitution is made. Typically, this is set to
is used in
commands for C-language functions, so that the script ﬁles do not need to hard-wire the name
of the shared library.
Alist of names of extensions that this extension depends on, for example
requires = ’foo,
.Those extensions must be installed before this one can be installed.
If this parameter is
(whichisthedefault), onlysuperusers cancreatethe extensionor update
it to a new version. If it is set to
,just the privileges required to execute the commands in
the installation or update script are required.
Anextensionis relocatable if it is possible tomove its contained objects intoa different schema
after initial creation of the extension. The default is
,i.e. the extension is not relocatable.
See below for more information.
This parameter canonlybe setfor non-relocatableextensions. Itforces theextensiontobe loaded
into exactly the named schema and not any other. See below for more information.
In additionto the primarycontrol ﬁle
,an extension can have secondarycontrol
ﬁles named in the style
.If supplied, these must be located in the
script ﬁle directory. Secondary control ﬁles follow the same format as the primary control ﬁle. Any
parameters setin a secondary controlﬁle override theprimary controlﬁle when installing or updating
to thatversion of the extension. However, the parameters
be set in a secondary control ﬁle.
An extension’s SQL script ﬁles can contain any SQL commands, except for transaction control com-
,etc) and commands that cannot be executed inside a transaction block (such
). This is because the script ﬁles are implicitly executed within a transaction block.
Documents you may be interested
Documents you may be interested