This page describes how to take and restore self-managed backups on CockroachDB Standard, Advanced, and Basic clusters.
The examples on this page provide a quick overview of the backup features you can run to your own storage bucket. For more technical detail on the complete list of backup features, refer to:
- Full backup
- Incremental backup
- Scheduled backup
- Backups with revision history
- Point-in-time restore
- Encrypted backup and restore
- Locality-aware backup and restore
- Locality-restricted backup execution
Examples
Before you begin, connect to your cluster:
- Connect to a CockroachDB Standard Cluster.
- Connect to a CockroachDB Advanced Cluster.
- Connect to a CockroachDB Basic Cluster.
The examples on this page demonstrate how to back up and restore from your own storage bucket.
Full backup
To take a full backup of a cluster:
BACKUP INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';
To take a full backup of a single database:
BACKUP DATABASE bank INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';
To take a full backup of a single table or view:
BACKUP bank.customers INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';
Backup subdirectories
BACKUP ... INTO
adds a backup to a backup collection location. To view the backup paths in a given collection location (your storage bucket), use SHOW BACKUPS
:
SHOW BACKUPS IN 's3://bucket/path?AUTH=implicit';
path
-------------------------
/2023/12/14-190909.83
/2023/12/20-155249.37
/2023/12/21-142943.73
(3 rows)
When you want to restore a specific backup, add the backup's subdirectory path (e.g., /2023/12/21-142943.73
) to the RESTORE
statement.
Restore
To restore from the most recent backup (full or incremental) in the collection's location, use the LATEST
syntax:
RESTORE FROM LATEST IN 's3://bucket/path?AUTH=implicit';
If you are restoring an incremental backup, the storage location must contain a full backup.
You cannot restore a backup of a multi-region database into a single-region database.
To restore a specific full or incremental backup, specify that backup's subdirectory in the RESTORE
statement. To view the available subdirectories, use SHOW BACKUPS
. If you are restoring an incremental backup, the URI must point to the storage location that contains the full backup:
RESTORE FROM '2023/03/23-213101.37' IN 's3://bucket/path?AUTH=implicit';
Incremental backup
When a BACKUP
statement specifies an existing subdirectory in the collection, explicitly or via the LATEST
keyword, an incremental backup will be added to the default /incrementals
directory at the root of the collection storage location.
To take an incremental backup using the LATEST
keyword:
BACKUP INTO LATEST IN 'external://backup_s3' AS OF SYSTEM TIME '-10s';
To store the backup in an existing subdirectory in the collection:
BACKUP INTO {'subdirectory'} IN 'external://backup_s3' AS OF SYSTEM TIME '-10s';
If you intend to take a full backup, we recommend running BACKUP INTO {collectionURI}
without specifying a subdirectory.
To explicitly control where you store your incremental backups, use the incremental_location
option. For more detail, see this example demonstrating the incremental_location
option.
Scheduled backup
This example creates a schedule for a cluster backup with revision history that is taken every day at midnight:
CREATE SCHEDULE schedule_label
FOR BACKUP INTO 's3://test/backups/schedule_test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
WITH revision_history
RECURRING '@daily';
schedule_id | name | status | first_run | schedule | backup_stmt
---------------------+----------------+------------------------------------------------+----------------------------------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------
588796190000218113 | schedule_label | PAUSED: Waiting for initial backup to complete | NULL | @daily | BACKUP INTO LATEST IN 's3://test/schedule-test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
588796190012702721 | schedule_label | ACTIVE | 2020-09-10 16:52:17.280821+00:00 | @weekly | BACKUP INTO 's3://test/schedule-test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
(2 rows)
Because the FULL BACKUP
clause is not included, CockroachDB also scheduled a full backup to run @weekly
. This is the default cadence for incremental backups RECURRING
> 1 hour but <= 1 day.
Encrypted backup
You can take and restore encrypted backups in the following ways:
- Using AWS Key Management Service (KMS)
- Using Google Cloud Key Management Service (KMS)
- Using Azure Key Vault
- Using a passphrase
Refer to the Take and Restore Encrypted Backups page for technical detail and a full list of examples.
For example, you can run a backup with AWS KMS with the BACKUP
statement's kms
option:
BACKUP INTO 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
WITH kms = 'aws:///{key}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}®ION=us-east-1';
Locality-aware backup
CockroachDB Basic and Standard clusters operate with a different architecture compared to CockroachDB Self-Hosted and CockroachDB Advanced clusters. These architectural differences have implications for how locality-aware backups can run. Basic and Standard clusters will scale resources depending on whether they are actively in use, which means that it is less likely to have a SQL pod available in every locality. As a result, your cluster may not have a SQL pod in the locality where the data resides, which can lead to the cluster uploading that data to a storage bucket in a locality where you do have active SQL pods. You should consider this as you plan a backup strategy that must comply with data domiciling requirements.
For example, to create a locality-aware backup where nodes with the locality region=us-west
write backup files to s3://us-west-bucket
, and all other nodes write to s3://us-east-bucket
by default, run:
BACKUP INTO
('s3://us-east-bucket?COCKROACH_LOCALITY=default', 's3://us-west-bucket?COCKROACH_LOCALITY=region%3Dus-west');
When you run the BACKUP
statement for a locality-aware backup, check the following:
- The locality query string parameters must be URL-encoded.
- If you are creating an external connection with
BACKUP
query parameters, you must pass them in uppercase otherwise you will receive anunknown query parameters
error. - A successful locality-aware backup job requires that each node in the cluster has access to each storage location. This is because any node in the cluster can claim the job and become the coordinator node.
You can restore the backup by running:
RESTORE FROM LATEST IN ('s3://us-east-bucket', 's3://us-west-bucket');
Note that the first URI in the list has to be the URI specified as the default
URI when the backup was created. If you have moved your backups to a different location since the backup was originally taken, the first URI must be the new location of the files originally written to the default
location.
To restore from a specific backup, use RESTORE FROM {subdirectory} IN ...
Backup and restore data from userfile
storage
To put files on your CockroachDB cluster without external servers, use userfile
, a per-user file storage.
For information on userfile
commands, visit the following pages:
We recommend starting backups from a time at least 10 seconds in the past using AS OF SYSTEM TIME
. Read our guidance in the Performance section on the BACKUP
page.
Only database and table-level backups are possible when using userfile
as storage. Restoring cluster-level backups will not work because userfile
data is stored in the defaultdb
database, and you cannot restore a cluster with existing table data.
When working on the same cluster, userfile
storage allows for database and table-level backups.
First, run the following statement to backup a database to a directory in the default userfile
space:
BACKUP DATABASE bank INTO 'userfile://defaultdb.public.userfiles_$user/bank-backup' AS OF SYSTEM TIME '-10s';
This directory will hold the files that make up a backup; including the manifest file and data files.
When backing up from a cluster and restoring a database or table that is stored in your userfile
space to a different cluster, you can run cockroach userfile get
to download the backup files to a local machine and cockroach userfile upload --url {CONNECTION STRING}
to upload to the userfile
of the alternate cluster.
BACKUP ... INTO
adds a backup to a collection within the backup destination. The path to the backup is created using a date-based naming scheme by default, unless an explicit subdirectory is passed with the BACKUP
statement. To view the backup paths in a given destination, use SHOW BACKUPS
:
> SHOW BACKUPS IN 'userfile://defaultdb.public.userfiles_$user/bank-backup';
path
------------------------
2021/03/23-213101.37
2021/03/24-172553.85
2021/03/24-210532.53
(3 rows)
Only database and table-level backups are possible when using userfile
as storage. Restoring cluster-level backups will not work because userfile
data is stored in the defaultdb
database, and you cannot restore a cluster with existing table data.
In cases when you need to restore a specific backup, add the backup subdirectory to the RESTORE
statement:
RESTORE DATABASE bank FROM '2021/03/24-210532.53' IN 'userfile://defaultdb.public.userfiles_$user/bank-backup';
It is also possible to run userfile:///bank-backup
as userfile:///
refers to the default path userfile://defaultdb.public.userfiles_$user/
.
To restore from the most recent backup, use RESTORE FROM LATEST IN ...
:
RESTORE FROM LATEST IN 'userfile://defaultdb.public.userfiles_$user/bank-backup';
Once the backup data is no longer needed, delete from the userfile
storage:
cockroach userfile delete bank-backup --url {CONNECTION STRING}
If you use cockroach userfile delete {file}
, it will take as long as the garbage collection to be removed from disk.
Back up a self-hosted CockroachDB cluster and restore into a CockroachDB Cloud cluster
To back up a self-hosted CockroachDB cluster and restore into a CockroachDB Cloud cluster:
While connected to your self-hosted CockroachDB cluster, back up your databases and/or tables to an external location:
BACKUP DATABASE example_database INTO 'gs://{bucket name}/{path/to/backup}?AUTH=specified&CREDENTIALS={encoded key}';
Warning:If you are backing up the data to AWS or GCP, use the
specified
option for theAUTH
parameter, as CockroachDB Cloud will need thespecified
credentials uponRESTORE
. For more information on authentication parameters to cloud storage providers, see Cloud Storage Authentication.Connect to your CockroachDB Cloud cluster:
cockroach sql --url 'postgresql://<user>@<cluster-name>-<short-id>.<region>.cockroachlabs.cloud:26257/<database>?sslmode=verify-full&sslrootcert='$HOME'/Library/CockroachCloud/certs/<cluster-name>-ca.crt'
cockroach sql --url 'postgresql://<user>@<cluster-name>-<short-id>.<region>.cockroachlabs.cloud:26257/<database>?sslmode=verify-full&sslrootcert='$HOME'/Library/CockroachCloud/certs/<cluster-name>-ca.crt'
cockroach sql --url "postgresql://<user>@<cluster-name>-<short-id>.<region>.cockroachlabs.cloud:26257/<database>?sslmode=verify-full&sslrootcert=$env:appdata\CockroachCloud\certs\$<cluster-name>-ca.crt"
Where:
<user>
is the SQL user. By default, this is your CockroachDB Cloud account username.<cluster-name>-<short-id>
is the short name of your cluster plus the short ID. For example,funny-skunk-3ab
.<cluster-id>
is a unique string used to identify your cluster when downloading the CA certificate. For example,12a3bcde-4fa5-6789-1234-56bc7890d123
.<region>
is the region in which your cluster is running. If you have a multi-region cluster, you can choose any of the regions in which your cluster is running. For example,aws-us-east-1
.<database>
is the name for your database. For example,defaultdb
.
You can find these settings in the Connection parameters tab of the Connection info dialog.
Restore to your CockroachDB Cloud cluster.
Use
SHOW BACKUPS
with your external location to find the backup's subdirectory:SHOW BACKUPS IN 'gs://{bucket name}/{path/to/backup}?AUTH=specified&CREDENTIALS={encoded key}';
path ------------------------ 2021/03/23-213101.37 2021/03/24-172553.85 2021/03/24-210532.53 (3 rows)
Use the subdirectory to specify the backup to restore:
RESTORE DATABASE example_database FROM '2021/03/23-213101.37' IN 'gs://{bucket name}/{path/to/backup}?AUTH=specified&CREDENTIALS={encoded key}';