FAQ: Using Sequences in PostgreSQL

Many of the questions asked in #postgresql revolve around using sequences in PostgreSQL. To avoid answering the same questions again and again, I thought it would be worthwhile to summarize the basic steps involving in using sequences in PostgreSQL.

What Is A Sequence?

A sequence is a special kind of database object designed for generating unique numeric identifiers. It is typically used to generate artificial primary keys. Sequences are similar, but not identical, to the AUTO_INCREMENT concept in MySQL.

How Do I Use A Sequence In A Table?

Sequences are most commonly used via the serial pseudotype. A serial is a special data type that encodes the following information:

it indicates that the values for the column will be generated by consulting the sequence
therefore, it creates a new sequence object, and sets the default value for the column to be the next value produced by the sequence
since a sequence always produces non-NULL values, it adds a NOT NULL constraint to the column
since the sequence that is produced is created "behind the scenes", PostgreSQL assumes that the sequence is only used to generate values for the table containing the serial column. Therefore, if this column is dropped, the sequence will be automatically removed.

For example, this command creates both a new table and a new sequence generator, and associates the sequence with the id column of the table:

test=# CREATE TABLE users (
test(#     id    SERIAL, -- assign each user a numeric ID
test(#     name  TEXT,
test(#     age   INT4
test(# );
NOTICE:  CREATE TABLE will create implicit sequence
"users_id_seq" for serial column "users.id"
CREATE TABLE

In this case, the sequence is automatically assigned the name users_id_seq. To avoid hard-coding the name of the sequence in SQL queries, we can use the pg_get_serial_sequence() function, as described below.

Note that using serial does not implicitly create an index on the column, or mark the column as a primary key. That can be easily done, however:

CREATE TABLE users (
    -- make the "id" column a primary key; this also creates
    -- a UNIQUE constraint and a b+-tree index on the column
    id    SERIAL PRIMARY KEY,
    name  TEXT,
    age   INT4
);

How Do I Assign Sequence Values To Newly-Inserted Rows?

If you're using serial, the default value for the serial column will be the next value produced by the sequence. To specify that an INSERT should take the default value for a given column, either omit that column from the INSERT's column list, or specify the DEFAULT keyword as the column's value.

Usage example:

INSERT INTO users (name, age) VALUES ('Mozart', 20);

Or equivalently:

INSERT INTO users (name, age, id) VALUES ('Mozart', 20, DEFAULT);

How Do I Retrieve The Most Recent Value Generated By A Sequence?

You can use the currval() function, which returns the most recent value generated by a sequence for the current session. currval() takes a single parameter: the name of the sequence. We can use the function pg_get_serial_sequence() to find the name of the sequence associated with a given serial column:

SELECT currval(pg_get_serial_sequence('users', 'id'));

Note that if no values have been generated by the sequence yet in the current session, currval() will yield an error.

Isn't This Subject To A Race Condition?

That is, if one database client inserts a row into a table that includes a sequence-generated value, wouldn't it be possible for another insertion into the table to modify the sequence, causing a subsequent currval() by the first client to return the wrong results?

No: sequences were designed to elegantly avoid this problem. currval() returns the last value generated by the sequence for the current session: if concurrent database clients generate sequence values, the currval() seen by a given session does not change (until the session generates a new sequence value, for example).

Doesn't Invoking currval() Require Two Database Queries?

To use the currval() method shown above, we'd need two queries: one to insert into the table, and another to fetch the sequence value assigned to the new row. Since client-server roundtrips can be expensive, this is not ideal. One way around this is to send the INSERT and the SELECT as a single query string. For example, in PHP:

pg_exec("INSERT INTO users (name, age) VALUES ('Bach', 15);
         SELECT currval(pg_get_serial_sequence('users', 'id'));")

This executes two queries, but does only a single roundtrip between the client and server, so the additional performance overhead of the second query should be negligible.

Alternatively, users of PostgreSQL 8.2 and later can take advantage of the INSERT ... RETURNING clause:

INSERT INTO users (name, age) VALUES ('Liszt', 10) RETURNING id;

which returns the value of the id column for the newly-inserted row.

What Is The Range Of Values Generated By A Sequence?

Sequences generate 64-bit signed integers. The serial pseudotype that we used above is a 32-bit signed integer: if you want to use the full 64-bit range of the underlying sequence, use the serial8 pseudotype instead.

Can There Be "Gaps" In The Values Generated By A Sequence?

Yes, there can. Sequences are intended for generating unique identifiers — not necessarily identifiers that are strictly sequential. If two concurrent database clients both attempt to get a value from a sequence (using nextval()), each client will get a different sequence value. If one of those clients subsequently aborts their transaction, the sequence value that was generated for that client will be unused, creating a gap in the sequence.

This can't easily be fixed without incurring a significant performance penalty. For more information, see Elein Mustein's "Gapless Sequences for Primary Keys" in the General Bits Newsletter.

What About Transactions?

Sequence operations are essentially non-transactional. nextval() increments the value of the sequence and is not rolled back if its transaction is later aborted; currval() returns the last value generated by the sequence for the current session, regardless of transaction boundaries.

What If I Want To Share One Sequence Between Two Tables?

The easiest way to do this is to create the sequence by hand, and then set the default clauses for the sequence-generated columns by hand, rather than using the serial type:

CREATE SEQUENCE common_fruit_id_seq;

CREATE TABLE apples (
    id      INT4 DEFAULT nextval('common_fruit_id_seq') NOT NULL,
    price   NUMERIC
);

CREATE TABLE oranges (
    id      INT4 DEFAULT nextval('common_fruit_id_seq') NOT NULL,
    weight  NUMERIC
);

nextval() is a function that produces a new sequence value.

Note that when using sequences in this manner, the sequence won't be automatically dropped when the table is dropped, and you won't be able to use pg_get_serial_sequence().

Where Can I Find More Information About Sequences?

Consult the PostgreSQL documentation:

Serial Types
Sequence Manipulation Functions (currval(), nextval(), etc.)
CREATE SEQUENCE
DROP SEQUENCE