Many of the questions asked in #postgresql revolve around using sequences in PostgreSQL. To avoid answering the same questions again and again, I thought it would be worthwhile to summarize the basic steps involving in using sequences in PostgreSQL.
A sequence is a special kind of database object designed for generating unique numeric identifiers. It is typically used to generate artificial primary keys. Sequences are similar, but not identical, to the AUTO_INCREMENT concept in MySQL.
Sequences are most commonly used via the serial
pseudotype. A
serial
is a special data type that encodes the following
information:
For example, this command creates both a new table and a new sequence
generator, and associates the sequence with the id
column of the table:
test=# CREATE TABLE users ( test(# id SERIAL, -- assign each user a numeric ID test(# name TEXT, test(# age INT4 test(# ); NOTICE: CREATE TABLE will create implicit sequence "users_id_seq" for serial column "users.id" CREATE TABLE
In this case, the sequence is automatically assigned the name users_id_seq. To avoid hard-coding the name of the sequence in SQL queries, we can use the pg_get_serial_sequence() function, as described below.
Note that using serial
does not implicitly create an
index on the column, or mark the column as a primary key. That can
be easily done, however:
CREATE TABLE users ( -- make the "id" column a primary key; this also creates -- a UNIQUE constraint and a b+-tree index on the column id SERIAL PRIMARY KEY, name TEXT, age INT4 );
If you're using serial
, the default value for the serial column will
be the next value produced by the sequence. To specify that an
INSERT should take the default value for a given column, either
omit that column from the INSERT's column list, or specify
the DEFAULT keyword as the column's value.
Usage example:
INSERT INTO users (name, age) VALUES ('Mozart', 20);
Or equivalently:
INSERT INTO users (name, age, id) VALUES ('Mozart', 20, DEFAULT);
You can use the currval() function, which returns the most recent value generated by a sequence for the current session. currval() takes a single parameter: the name of the sequence. We can use the function pg_get_serial_sequence() to find the name of the sequence associated with a given serial column:
SELECT currval(pg_get_serial_sequence('users', 'id'));
Note that if no values have been generated by the sequence yet in the current session, currval() will yield an error.
That is, if one database client inserts a row into a table that includes a sequence-generated value, wouldn't it be possible for another insertion into the table to modify the sequence, causing a subsequent currval() by the first client to return the wrong results?
No: sequences were designed to elegantly avoid this problem. currval() returns the last value generated by the sequence for the current session: if concurrent database clients generate sequence values, the currval() seen by a given session does not change (until the session generates a new sequence value, for example).
To use the currval() method shown above, we'd need two queries: one to insert into the table, and another to fetch the sequence value assigned to the new row. Since client-server roundtrips can be expensive, this is not ideal. One way around this is to send the INSERT and the SELECT as a single query string. For example, in PHP:
pg_exec("INSERT INTO users (name, age) VALUES ('Bach', 15); SELECT currval(pg_get_serial_sequence('users', 'id'));")
This executes two queries, but does only a single roundtrip between the client and server, so the additional performance overhead of the second query should be negligible.
Alternatively, users of PostgreSQL 8.2 and later can take advantage of the INSERT ... RETURNING clause:
INSERT INTO users (name, age) VALUES ('Liszt', 10) RETURNING id;
which returns the value of the id
column for the newly-inserted row.
Sequences generate 64-bit signed integers. The serial
pseudotype
that we used above is a 32-bit signed integer: if you want to use the
full 64-bit range of the underlying sequence, use the serial8
pseudotype instead.
Yes, there can. Sequences are intended for generating unique identifiers — not necessarily identifiers that are strictly sequential. If two concurrent database clients both attempt to get a value from a sequence (using nextval()), each client will get a different sequence value. If one of those clients subsequently aborts their transaction, the sequence value that was generated for that client will be unused, creating a gap in the sequence.
This can't easily be fixed without incurring a significant performance penalty. For more information, see Elein Mustein's "Gapless Sequences for Primary Keys" in the General Bits Newsletter.
Sequence operations are essentially non-transactional. nextval() increments the value of the sequence and is not rolled back if its transaction is later aborted; currval() returns the last value generated by the sequence for the current session, regardless of transaction boundaries.
The easiest way to do this is to create the sequence by hand, and
then set the default clauses for the sequence-generated columns by
hand, rather than using the serial
type:
CREATE SEQUENCE common_fruit_id_seq; CREATE TABLE apples ( id INT4 DEFAULT nextval('common_fruit_id_seq') NOT NULL, price NUMERIC ); CREATE TABLE oranges ( id INT4 DEFAULT nextval('common_fruit_id_seq') NOT NULL, weight NUMERIC );
nextval() is a function that produces a new sequence value.
Note that when using sequences in this manner, the sequence won't be automatically dropped when the table is dropped, and you won't be able to use pg_get_serial_sequence().
Consult the PostgreSQL documentation: