A disaster usually causes an outage, which means system downtime and potential loss of data. Once we have detected the outage, we trigger our DR plan to recover from it. It would be an unpleasant surprise to find at that point that there is no backup or, after long hours of recovery, that the backup is not the one you need.
Outages can be costly: there is often a financial impact that harms the business, and data loss may even be a reason to close the company.
To minimize data loss, we need to have multiple copies of data in various places. We can design our infrastructure in different layers and abstract each layer from the one below it. For instance, we build a layer for clusters of database instances to protect against hardware failure. We replicate databases across data centers so we can defend ourselves against a data center failure. Every additional layer adds complexity, which can become a nightmare to manage. Still, in essence, the backup takes the central place in disaster recovery.
That's why it's crucial to be sure the backup is something we can rely on. But how do we achieve this? One option is to verify that the backup was executed, based on the last few lines of the backup script.
A simple example:
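#!/bin/sh
# Dump the database and test the exit status of mysqldump directly.
if mysqldump -h 192.168.1.1 -u user -ppassword dbname > filename.sql; then
    echo "Success."
else
    # Report on stderr and exit non-zero so cron or a wrapper can detect the failure.
    echo "Error." >&2
    exit 1
fi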
But what if the backup script did not start at all? A search for "Linux cron, not running" returns plenty of results.
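A status check inside the script cannot report a run that never happened. One way to catch that case is a separate, independently scheduled check on the age of the newest backup file. The following is a minimal sketch, assuming backups are written as files into a single directory; the function name `backup_is_fresh`, the directory `/var/backups/mysql`, and the 24-hour threshold are hypothetical and not part of any tool mentioned here.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if DIR contains at least one regular file
# modified within the last MAX_AGE_MIN minutes.
backup_is_fresh() {
    dir=$1
    max_age_min=$2
    # -mmin -N matches files modified less than N minutes ago
    # (supported by GNU and BSD find, not strictly POSIX).
    recent=$(find "$dir" -type f -mmin "-$max_age_min" 2>/dev/null | head -n 1)
    [ -n "$recent" ]
}

# Example usage (assumed path and threshold):
#   backup_is_fresh /var/backups/mysql 1440 || echo "No backup in the last 24 hours." >&2
```

Running this from a different scheduler or host than the backup job itself avoids the check sharing the same point of failure as the backup.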
Unfortunately, open-source databases often do not offer a backup repository.
Another approach is backup testing. You may have heard about Schrödinger's cat. The well-known Schrödinger's Backup theory states: "The condition of any backup is unknown until a restore is attempted." It sounds like a simple approach, but such an attempt means you have to set up a test environment, copy the files, and run a restore after every backup.
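Short of a full restore, a cheap sanity check can at least catch truncated or corrupt dump files. The sketch below assumes gzip-compressed mysqldump output created with comments enabled (the mysqldump default), in which case the dump ends with a "-- Dump completed" line. The function name `dump_looks_complete` is hypothetical, and passing this check does not prove the backup is restorable.

```sh
#!/bin/sh
# Hypothetical helper: cheap sanity check for a gzip-compressed mysqldump file.
# It does not replace a real restore test.
dump_looks_complete() {
    file=$1
    # gzip -t verifies the integrity of the compressed stream without extracting it.
    gzip -t "$file" 2>/dev/null || return 1
    # With comments enabled, mysqldump writes a "-- Dump completed" line at the
    # very end; a dump that was cut short will be missing it.
    gzip -dc "$file" | tail -n 5 | grep -q '^-- Dump completed'
}

# Example usage (assumed file name):
#   dump_looks_complete backup.sql.gz || echo "Dump is corrupt or incomplete." >&2
```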
In this article, we will see how you can use ClusterControl to make sure your backups are executed and reliable, so you can run enterprise-grade databases on open-source software.
Backup Reports
ClusterControl offers a set of operational reports. Operational reporting supports day-to-day monitoring and control of enterprise activity. The backup report is one of many. You can find reports like:
- Daily System Report
- Package Upgrade Report
- Schema Change Report
- Availability
- Backup
But why would you need this?
You may already have an excellent monitoring tool with all possible metrics and graphs, and you have probably also set up alerts based on metrics and thresholds (some tools even have automated advisors that provide recommendations or fix things automatically). That's good: having visibility into your system is important. Nevertheless, that is a lot of information to process, and this is where a report helps.
How does this work? When a backup job is triggered, ClusterControl collects information on the backup process and on the systems, platforms, and devices in the backup infrastructure. All of that information is aggregated and stored in the CMON internal database, so there is no need to query particular databases separately. In addition, when it discovers that you have a running cluster with no backup, that is reported too.
In the report details, you can track each backup ID with detailed data about the location, size, time, and backup method. The templates work with data for different database types, so when you manage a mixed environment, you get the same look and feel. This makes it easier to manage backups of different databases.
CLI Reports
For those who prefer the command-line interface, a good option for tracking backups is the ClusterControl Command Line Interface (CLI).
The CLI lets you execute most of the functions available within ClusterControl using simple commands. Backup execution and backup reports are among them.
Used in conjunction with the powerful GUI, it gives ClusterControl users alternative ways to manage their open-source database environments using whatever engine they prefer.
$ s9s backup --list --cluster-id=1 --long --human-readable
ID CID STATE     OWNER HOSTNAME CREATED  SIZE FILENAME
 1   1 COMPLETED dba   10.0.0.5 07:21:39 252K mysqldump_2017-05-09_072135_mysqldb.sql.gz
 1   1 COMPLETED dba   10.0.0.5 07:21:43 1014 mysqldump_2017-05-09_072135_schema.sql.gz
 1   1 COMPLETED dba   10.0.0.5 07:22:03 109M mysqldump_2017-05-09_072135_data.sql.gz
 1   1 COMPLETED dba   10.0.0.5 07:22:07  679 mysqldump_2017-05-09_072135_triggerseventsroutines.sql.gz
 2   1 COMPLETED dba   10.0.0.5 07:30:20 252K mysqldump_2017-05-09_073016_mysqldb.sql.gz
 2   1 COMPLETED dba   10.0.0.5 07:30:24 1014 mysqldump_2017-05-09_073016_schema.sql.gz
 2   1 COMPLETED dba   10.0.0.5 07:30:44 109M mysqldump_2017-05-09_073016_data.sql.gz
 2   1 COMPLETED dba   10.0.0.5 07:30:49  679 mysqldump_2017-05-09_073016_triggerseventsroutines.sql.gz
The CLI is part of the s9s-tools package. Starting with version 1.4.1, the installer script automatically installs this package on the ClusterControl node. You can also install it separately on a different machine to manage the database cluster remotely. Like ClusterControl, it uses secure SSH communication.
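Because the backup listing is plain text, a script can filter it for backups that need attention. The sketch below assumes the column layout shown in the listing above, with the backup ID in the first field and STATE in the third; the function name `report_incomplete_backups` is hypothetical. It flags every row that is not COMPLETED, so a backup that is still in progress would be reported as well.

```sh
#!/bin/sh
# Hypothetical helper: read an `s9s backup --list --long` listing on stdin and
# print every backup row whose STATE column (third field) is not COMPLETED.
# Returns non-zero if at least one such row was found.
report_incomplete_backups() {
    # Only rows starting with a numeric ID are treated as backups, which skips
    # the header line and any non-data lines in the listing.
    awk '$1 ~ /^[0-9]+$/ && $3 != "COMPLETED" { print; bad = 1 }
         END { exit bad + 0 }'
}

# Example usage:
#   s9s backup --list --cluster-id=1 --long --human-readable | report_incomplete_backups
```

The non-zero return status makes the function easy to use in a cron job or monitoring check that should alert when something other than a completed backup shows up.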
Automatic Backup Verification
A backup is not a backup if we are not able to retrieve the data. Verifying backups is something that is usually overlooked by many companies. Let’s see how ClusterControl can automate the verification of backups and help avoid any surprises.
In ClusterControl, select your cluster and go to the "Backup" section, then select “Create Backup”.
The automatic backup verification feature is available for scheduled backups, so let’s choose the “Schedule Backup” option.
When scheduling a backup, in addition to selecting the common options like method or storage, we also need to specify the schedule or frequency. In this example, we are going to set up MySQL backup verification. However, the same can be achieved for PostgreSQL and Timescale databases.
When backup verification is checked, another tab will appear.
Here we can set all the necessary steps to prepare the environment. Once the IP address is provided, we are good to go and can schedule the backup. Whenever a backup finishes, it will be copied to a temporary backup verification environment (the “restore backup on” option). After a successful refresh, you will see the verification status in the backup repository tab.
Failed Backup Executions and Integration Services
Another interesting option for getting more insight into backup execution is to use ClusterControl Integration services. You can track the backup execution status through third-party services.
Third-party tool integration enables you to automate alerts with other popular systems. Currently, ClusterControl supports ServiceNow, PagerDuty, VictorOps, OpsGenie, Slack, Telegram, and Webhooks.
Below we can see an example of Slack channel integration. Whenever a backup event occurs, it will appear in the Slack channel.
Conclusion
Backups are mandatory in any environment. They help you protect your data and are at the center of any disaster recovery scenario. ClusterControl can help automate the backup process for your databases and, in case of failure, restore them with a few clicks. You can also be sure that backups are executed successfully and are reliable, so in case of disaster you will not lose your data.