Back

Safely Alter Postgres Columns with USING

Cully LarsonFriday, March 25, 2022
a sign that says let's change

But first, Arrays

Postgres has this super useful data type called Array. Not to throw out an obscure term without defining it, an “array” is a sort of list of things 😉. Have you ever wanted to associate a row in your database with a list of things? Without the array type, you’d have to create a join table. This might be the best option in some cases, but if you just want to add a few items to each row, or if you want to do the same thing for multiple lists of items, Postgres arrays to the rescue!

At Echobind we recently had a situation where we needed to alter a column from a scalar (a projectType enum field) to an array (each row could be classified under multiple “project types”). A simple ALTER TABLE query, right?

CREATE TYPE "ProjectType" AS ENUM ('BILLABLE', 'NONBILLABLE'); CREATE TABLE "Proposal" ( "id" TEXT NOT NULL, "projectType" "ProjectType" NOT NULL, PRIMARY KEY ("id") ); INSERT INTO "Proposal" VALUES ('a001', 'BILLABLE'); ALTER TABLE "Proposal" ALTER COLUMN "projectType" SET DATA TYPE "ProjectType"[];

Well, we have a problem. This query will fail if any rows have a non-null value in projectType. We could just set all the values to null (we would also need to allow null values in that column), but that would destroy our data. We could create a migration script, but that seems like a lot of work.

Finally, USING

Thankfully Postgres has us covered once again! It provides the USING clause.

The USING clause specifies how to compute the new column value from the old.

It’s basically a “function” that takes the old value in the column and generates a new value (spoiler: it does even more than that!). We can use USING in our original ALTER query to do something like this:

ALTER TABLE "Proposal" ALTER COLUMN "projectType" SET DATA TYPE "ProjectType"[] USING ARRAY["projectType"];

This tells Postgres to take the value in projectType (before the ALTER is applied), put it into an array, and use that as the value for the projectType column after the ALTER is applied. With this version of the query, we don’t destroy any data.

ALTER TABLE "Proposal" ALTER COLUMN "projectType" SET DATA TYPE "ProjectType" USING "projectType"[1];

This tells Postgres to use the first value in the projectType array (Postgres arrays start at one—I’m sure there are reasons, but still 🤷) as the new value. But as I hinted before, this will only be non-destructive if you haven’t added any more items to the projectType array (i.e. it only has one value in each row). If it has more than one value, you’ll lose all of them except the first.

The rumors are true, there’s more

I know, how could it get better? Well, lay aside those low expectations and prepare to be astonished; USING is so much more powerful. When building your USING clause, you can use:

  1. Any of the columns in your table (not just the one being modified).
  2. Any Postgres function.
  3. Explicit casting.
  4. A fourth thing.

All the columns

USING doesn’t limit you to just the column you’re altering. You can use any of the columns in your table. Let’s say you have a table like this that tracks some metrics:

CREATE TABLE "Metric" ( "id" TEXT NOT NULL, "url" TEXT, "authorizedVisits" INTEGER, "unauthorizedVisits" INTEGER, "summary" TEXT, PRIMARY KEY ("id") ); INSERT INTO "Metric" VALUES ('b001', '/products', 30, 60, 'You have 90 total visits.');

This was a bad decision; summary should not be human-readable text. It would be better if it was just the total number of visits and we could use it in code however we want:

ALTER TABLE "Metric" ALTER COLUMN "summary" SET DATA TYPE INTEGER USING "authorizedVisits" + "unauthorizedVisits";

With this, we’re altering summary to be the sum of the authorizedVisits and unauthorizedVisits columns.

And just to cover all our bases, we could roll this back with:

ALTER TABLE "Metric" ALTER COLUMN "summary" SET DATA TYPE TEXT USING 'You have ' || "summary" || ' total visits.';

All the functions

Want to do some more complex manipulation? You can use Postgres functions in USING. Let’s say we’ve been storing a timestamp in the MySQL DATETIME format (YYYY-MM-DD HH:MI:SS). But then we come to learn that Postgres has its own timestamp data type (we’re always learning as developers; no shame in not knowing the best chili dog toppings at our first rodeo).

CREATE TABLE "Message" ( "id" TEXT NOT NULL, "from" TEXT, "to" TEXT, "message" TEXT, "sentAt" TEXT, PRIMARY KEY ("id") ); INSERT INTO "Message" VALUES ( 'c0001', 'George', 'Elaine', 'You should''ve seen the look on her face! It was the same look my father gave me when I told him I wanted to be a ventriloquist!', '2022-02-02 22:22:22' );

Postgres has a built-in function for converting strings to timestamps, TO_TIMESTAMP, and we can use it in the USING clause:

ALTER TABLE "Message" ALTER COLUMN "sentAt" SET DATA TYPE TIMESTAMP(3) USING TO_TIMESTAMP("sentAt", 'YYYY-MM-DD HH24:MI:SS');

This converts our DATETIME string value into a native timestamp, using the specified format.

And to stay on brand, we can roll it back with:

ALTER TABLE "Message" ALTER COLUMN "sentAt" SET DATA TYPE TEXT USING TO_CHAR("sentAt", 'YYYY-MM-DD HH24:MI:SS');

Boring, explicit casting

When you change the data type of a column with USING, Postgres will do its best to cast the current value to the new type. But sometimes it can’t (where is my ‘90s dream of an artificial intelligence utopia) and you need to explicitly define how you want Postgres to cast your value.

Let’s say we have some JSON data, but we’re storing it in a TEXT field. We want to convert that to a fancy JSON field:

CREATE TABLE "House" ( "id" TEXT NOT NULL, "specs" TEXT, PRIMARY KEY ("id") ); INSERT INTO "House" VALUES ( 'd0001', '{"size": 1500, "sizeUnits": "sf", "bathrooms": 12}' ); ALTER TABLE "House" ALTER COLUMN "specs" SET DATA TYPE JSON USING "specs"::JSON;

Without the USING clause, Postgres would complain:

column "specs" cannot be cast automatically to type json

But with the explicit cast it works just fine.

Postgres can cast JSON to TEXT on its own, so we can roll this back without USING (sad, I know). But just to include one more example, here’s what it would look like doing it explicitly:

ALTER TABLE "House" ALTER COLUMN "specs" SET DATA TYPE TEXT USING "specs"::TEXT;

Dare to dream, a fourth thing

I was really hoping this would be “You can even use subqueries 🤯”. But we dreamt too big. Even USING has its limits.

All good things

Are you still reading? It’s over. Go home. I am contractually obligated to say that in this article we covered the USING clause and how its magic can be employed to alter table columns in Postgres. We looked at so many shiny examples of using USING with columns and functions and casting and not (maybe someday, in the machine utopia) subqueries.

Resources

Share this post

twitterfacebooklinkedin

Related Posts:

Interested in working with us?

Give us some details about your project, and our team will be in touch with how we can help.

Get in Touch