PostgreSQL comes with support for the SQL99 standard Array Types. Used correctly, these can result in simpler schemas and faster queries.
PostgreSQL arrays also translate well into Go. Read on to learn more about array types and the cool things you can do with it, and how to work with them in Go.
Array Types in SQL
In Postgres, you can have a single column contain a bunch of values of the same type, quite similar to arrays in programming languages. Here is how they are represented literally:
test=# SELECT '{1,2,3}'::int[] ; -- an array of integers
int4
---------
{1,2,3}
(1 row)
test=# SELECT '{"hello","world"}'::text[] ; -- an array of strings
text
---------------
{hello,world}
(1 row)
In Postgres syntax, the part '{1,2,3}'
is a plain string, ::
is the
Postgres typecast operator, and int[]
is the type (an array of integers). The
expression '{1,2,3}'::int[]
reads as “typecast the string {1,2,3}
into an
array of integers”.
The string notation of curly brackets and comma-separated values is an array value constructor recognized by Postgres.
There is also an alernate syntax, which is less explicit about the element type:
test=# SELECT array[1,2,3];
array
---------
{1,2,3}
(1 row)
test=# SELECT array['hello', 'world'];
array
---------------
{hello,world}
(1 row)
test=# SELECT array[1, 'world'];
ERROR: invalid input syntax for integer: "world"
LINE 1: SELECT array[1, 'world'];
You can have arrays of any basic type (like boolean, timestamp, text, and so on). Multi-dimensional arrays are also possible, although we don’t cover them in this blog post.
Here is how you can have an array as a column type in a table:
test=# CREATE TABLE posts (
test(# title text NOT NULL PRIMARY KEY,
test(# tags text[]
test(# );
CREATE TABLE
Like any other column type, the array column type can also be nullable or not, and can have a default value, or not. For example, if you want the default value to be an empty array (array with no elements):
test=# CREATE TABLE posts2 (
test(# title text NOT NULL PRIMARY KEY,
test(# tags text[] NOT NULL DEFAULT '{}'::text[]
test(# );
CREATE TABLE
To insert a row into this table, you can use a plain string as the value for tag:
test=# INSERT INTO posts (title, tags) VALUES
test-# ('Using PostgreSQL Arrays with Golang', '{"postgres","golang"}');
INSERT 0 1
Postgres will cast the string to the column type, letting you omit the explicit cast.
Using Go
The standard Postgres database/sql driver for Go is lib/pq. (There are others, like pgx and go-pg, which we don’t cover here.) Here’s is how you’ll connect to a database using lib/pq:
package main
import (
"database/sql"
"log"
"github.com/lib/pq"
)
func main() {
db, err := sql.Open("postgres", "host=172.16.2.100 dbname=test")
if err != nil {
log.Fatal(err)
}
// .. use db here ..
db.Close()
}
To insert a row that contains an array value, use the pq.Array
function like
this:
// "ins" is the SQL insert statement
ins := "INSERT INTO posts (title, tags) VALUES ($1, $2)"
// "tags" is the list of tags, as a string slice
tags := []string{"go", "goroutines", "queues"}
// the pq.Array function is the secret sauce
_, err = db.Exec(ins, "Job Queues in Go", pq.Array(tags))
To read a Postgres array value into a Go slice, use:
func getTags(db *sql.DB, title string) (tags []string) {
// the select query, returning 1 column of array type
sel := "SELECT tags FROM posts WHERE title=$1"
// wrap the output parameter in pq.Array for receiving into it
if err := db.QueryRow(sel, title).Scan(pq.Array(&tags)); err != nil {
log.Fatal(err)
}
return
}
Note that in lib/pq, only slices of certain Go types may be passed to pq.Array(). For more information, see the docs.
Selecting
Let’s see some examples of how to use arrays for reading back information from our blog posts table. Here are the contents of the table which we’re working with:
test=# select * from posts;
title | tags
---------------------------------------+----------------------------
Job Queues in Go | {go,goroutines,queues}
Logical Replication in PostgreSQL 10 | {postgres,replication}
Writing PostgreSQL Triggers in Go | {postgres,triggers,go}
Monitoring MySQL Servers with OpsDash | {mysql,monitoring,opsdash}
(4 rows)
Get all posts and their tag count
The array_length
function gives the number of elements in the array. This
function can work with multi-dimensional arrays also, so we need so specify the
dimension number also (always 1 for one-dimensional arrays):
test=# SELECT title, array_length(tags, 1) FROM posts;
title | array_length
---------------------------------------+--------------
Job Queues in Go | 3
Logical Replication in PostgreSQL 10 | 2
Writing PostgreSQL Triggers in Go | 3
Monitoring MySQL Servers with OpsDash | 3
(4 rows)
You can also use the cardinality
function to get the same result in this case:
test=# SELECT title, cardinality(tags) FROM posts;
title | cardinality
---------------------------------------+-------------
Job Queues in Go | 3
Logical Replication in PostgreSQL 10 | 2
Writing PostgreSQL Triggers in Go | 3
Monitoring MySQL Servers with OpsDash | 3
(4 rows)
Get all posts tagged “postgres”
The ‘<@’ operator can be used to check for array membership. It is used like A
<@ B
where A and B are arrays of the same type, and returns true
if all
elements of array A are present in array B. We can use this to get a list of all
posts that are tagged “postgres”:
test=# SELECT title, tags FROM posts WHERE '{"postgres"}' <@ tags;
title | tags
--------------------------------------+------------------------
Logical Replication in PostgreSQL 10 | {postgres,replication}
Writing PostgreSQL Triggers in Go | {postgres,triggers,go}
(2 rows)
The left hand side of the operator has to be an array, so we construct an array with the single element “postgres”.
Doing this from Go goes like this:
// do the query
sel := "SELECT title FROM posts WHERE $1 <@ tags"
tags := []string{"postgres"}
rows, err := db.Query(sel, pq.Array(tags))
if err != nil {
log.Fatal(err)
}
defer rows.Close()
// iterate over the result and print out the titles
for rows.Next() {
var title string
if err := rows.Scan(&title); err != nil {
log.Fatal(err)
}
fmt.Println(title)
}
if err := rows.Err(); err != nil {
log.Fatal(err)
}
Get all posts tagged “postgres” and “go”
If we specify more than one element for the left hand side array, all those elements need to be present in the right hand side for the operator to report a match. This means that if we need to list all blogs that are tagged both postgres and go, we can do:
test=# SELECT title, tags FROM posts WHERE '{"postgres", "go"}' <@ tags;
title | tags
-----------------------------------+------------------------
Writing PostgreSQL Triggers in Go | {postgres,triggers,go}
(1 row)
The corresponding change to Go code is simply:
sel := "SELECT title FROM posts WHERE $1 <@ tags"
tags := []string{"postgres", "go"} // <-- two elements now
rows, err := db.Query(sel, pq.Array(tags))
Get all posts tagged either “postgres” or “go”
So how do we do an “OR” then? We can use the &&
operator, which when used as
A && B
returns true when A and B have one or more elements in common.
test=# SELECT * FROM posts WHERE tags && '{"postgres", "go"}';
title | tags
--------------------------------------+------------------------
Job Queues in Go | {go,goroutines,queues}
Logical Replication in PostgreSQL 10 | {postgres,replication}
Writing PostgreSQL Triggers in Go | {postgres,triggers,go}
(3 rows)
The Go code changes only in the query:
sel := "SELECT title FROM posts WHERE tags && $1" // <-- only query changes
tags := []string{"postgres", "go"}
rows, err := db.Query(sel, pq.Array(tags))
Get all unique tags
The function unnest
unrolls an array into rows. This is best explained with an
example:
test=# SELECT title, unnest(tags) FROM posts;
title | unnest
---------------------------------------+-------------
Job Queues in Go | go
Job Queues in Go | goroutines
Job Queues in Go | queues
Logical Replication in PostgreSQL 10 | postgres
Logical Replication in PostgreSQL 10 | replication
Writing PostgreSQL Triggers in Go | postgres
Writing PostgreSQL Triggers in Go | triggers
Writing PostgreSQL Triggers in Go | go
Monitoring MySQL Servers with OpsDash | mysql
Monitoring MySQL Servers with OpsDash | monitoring
Monitoring MySQL Servers with OpsDash | opsdash
(11 rows)
As you can see, each tag from each row in “posts” expands into its own row,
duplicating the other columns (“title” here) as required. You can use unnest
to get only the unique tags:
test=# SELECT DISTINCT unnest(tags) FROM posts;
unnest
-------------
go
postgres
monitoring
replication
opsdash
goroutines
triggers
queues
mysql
(9 rows)
And since we’re here, let’s see one more trick: rolling up those rows back into an array:
test=# SELECT array(SELECT DISTINCT unnest(tags) FROM posts);
array
-------------------------------------------------------------------------------
{go,postgres,monitoring,replication,opsdash,goroutines,triggers,queues,mysql}
(1 row)
Get tag cloud
Can we get the information required to build a tag cloud? A tag cloud needs each tag, and the number of posts that are tagged with it. In SQL terms, we basically need to do a count over a grouping by tag. We can do this using the previous query as a CTE:
test=# WITH p AS (SELECT title, unnest(tags) AS tag FROM posts)
test-# SELECT tag, count(tag) FROM p GROUP BY tag;
tag | count
-------------+-------
go | 2
postgres | 2
monitoring | 1
replication | 1
opsdash | 1
goroutines | 1
triggers | 1
queues | 1
mysql | 1
(9 rows)
Updates
Let’s see some ways to modify these array values.
Mark all existing posts as legacy
The array_append
function returns a new array with the given elements appended
to it. We can use this to add another tag to each of our post:
UPDATE posts SET tags = array_append(tags, 'legacy');
Rename tags
You can also search and replace elements in an array using the array_replace
function. If we wanted to rename the tag “go” to “golang” in all posts, we
can do:
test=# UPDATE posts SET tags = array_replace(tags, 'go', 'golang');
UPDATE 4
Note that this updated all 4 rows, even though only 2 rows contained “go”. The other rows were touched, even though their values did not change. This was because we didn’t have a WHERE clause for the UPDATE. Let’s update only the required ones:
test=# UPDATE posts
test-# SET tags = array_replace(tags, 'go', 'golang')
test-# WHERE '{"go"}' <@ tags;
UPDATE 2
That updates only 2 rows, as required.
Other Array Tips and Tricks
Here are a couple more things about arrays that’s useful to know:
NULL values in an array
Array elements can be NULL, and there is no way to declare an array as containing only “not null” elements. It’s a bit of a pain to work with NULLs in arrays, and is best avoided.
UPDATE posts SET tags = '{"bug",null}' WHERE title = 'The Metamorphosis';
From Go, here is how to read arrays that may contain NULLs:
func getTagsMaybeNull(db *sql.DB, title string) (tags []sql.NullString) {
sel := "SELECT tags FROM posts WHERE title=$1"
if err := db.QueryRow(sel, title).Scan(pq.Array(&tags)); err != nil {
log.Fatal(err)
}
return
}
You’ll have to check each element of tags
to see if it is .Valid
while
processing the result.
Using “ANY” instead of “IN”
Here are two ways of selecting posts written by any of a specific set of authors:
-- using IN
SELECT title FROM posts WHERE author IN ('a', 'b', 'c');
-- using arrays and ANY
SELECT title FROM posts WHERE author = any('{"a", "b", "c"}');
Assuming that the set of authors is known only at run-time (say selected from the UI from a list), how would you form this query in Go?
With the first query, you’d have to escape each value and form the SQL string, which would be inelegant and error-prone. The second query, however, can be written like this:
func getTitles(authors []string) {
q := "SELECT title FROM posts WHERE author = any($1)"
rows, err := db.Exec(q, pq.Array(authors))
// ...
}
Array elements as foreign keys
Having an array of primary keys that reference another table would actually be pretty helpful in some cases, but Postgres does not support it. You can store the keys as array elements, but you’ll have to “manually” update the array when the referenced rows in the slave table are deleted.
Array Aggregates
The array_agg
function can be used in aggregate queries to create arrays out
of aggregated input. See the docs for more info.
Multi-dimensional Arrays
Arrays can have more than one dimension. See this page for an overview, and check out other functions here that would be useful in working with them.
Indexing Arrays
Indexes can be created using array expressions. Not all operators (like ‘<@’, ‘&&’ etc) are support by all index types. See here for more info.
Monitoring Your PostgreSQL servers
Since you’re here, you might be interested in our server and database monitoring product OpsDash. It’s written entirely in Golang and can be used to monitor servers, services, app metrics, and databases, including PostgreSQL.
With OpsDash, you can quickly start monitoring your PostgreSQL servers, and get instant insight into key performance and health metrics including WAL archiving and streaming replication.
Additionally, each OpsDash Smart Agent includes the industry-standard statsd interface (and even a graphite interface) to easily report custom metrics.
New Here?
OpsDash is a server monitoring, service monitoring, and database monitoring solution for monitoring Docker, MySQL, PostgreSQL, MongoDB, memcache, Redis, Apache, Nginx, Elasticsearch and more. It provides intelligent, customizable dashboards and rule-based alerting via email, HipChat, Slack, PagerDuty, OpsGenie, VictorOps and Webhooks. Send in your custom metrics with StatsD and Graphite interfaces built into each agent.