
ClusterControl Now Available on gridscale.io


If you’re familiar with Severalnines’ product offerings, chances are you have probably heard about gridscale too. Well, we have some good news. Our flagship product - ClusterControl - is now available on the gridscale marketplace!

What is gridscale and How is ClusterControl Related to It?

The Cologne-based IaaS and PaaS provider gridscale stands for intuitively usable and flexible cloud technologies. Via an easy-to-understand interface, the IT infrastructure can also be managed by people without in-depth IT know-how. A Kubernetes environment facilitates even the management of cloud-native workloads. Thousands of companies, agencies and managed service providers already rely on gridscale for the realisation and operation of their digital projects - from high-traffic web shops to complex SaaS or enterprise IT solutions. White label options are available to resellers and with the gridscale software Hybrid Core, data centre operators can become cloud providers themselves.

Since the beginning of February, Severalnines has added ClusterControl to gridscale's marketplace, allowing customers of both Severalnines and gridscale to manage the entire database lifecycle through a unified console. ClusterControl provides two ways to interact with the ClusterControl (CMON) service: the ClusterControl GUI (a web application) and the ClusterControl CLI (a command-line client called “s9s”, also known as “s9s-tools”). Either way, ClusterControl lets you manage the entire database lifecycle through a user-friendly, unified console, so end users can focus on what is relevant: the consistent availability and performance of their applications, without the overhead of database operations. You can read more about ClusterControl in the ClusterControl documentation.
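For instance, once ClusterControl is installed on a gridscale VM, the same CMON service can be queried from the command line with the s9s client; a quick sketch (cluster IDs are specific to your installation):

# List all clusters managed by this ClusterControl instance
$ s9s cluster --list --long

# List the database nodes of a given cluster
$ s9s node --list --long --cluster-id=1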

How Can Severalnines Customers Benefit From gridscale?

gridscale offers products that can help:

  • Configure and save costs on configuring Kubernetes clusters.

  • PaaS from gridscale eliminates the time-consuming management of services (e.g. installation, maintenance, monitoring). Their smart PaaS scales automatically, helps avoid vendor lock-in, and can restore code and data (their PaaS solution lets you reset your database to a previous state) - they pride themselves on being the “world’s smartest PaaS”.

  • Serve you all of the cloud advantages on a “silver platter” - gridscale grants you total freedom with the operation of your apps and workloads.

  • And more! Learn more about some of their offerings here.

gridscale also offers a 10 euro credit voucher for users who want to test Severalnines’ ClusterControl on gridscale: ClusterControl comes with a free trial, and the first 10 euros of gridscale infrastructure cost are covered by gridscale if you apply a voucher with the code “clustercontrol”.

gridscale - An Overview

When you first log in to gridscale, you will be greeted with a panel that looks something like this (the explanation you see on the screen appears when the “Smart Guide” button is clicked):

The guide walks you through everything gridscale can do for you: how to check your server details; how to start, stop, edit and delete your servers; how to adjust the interface to your liking, and so on.

gridscale also has a menu on the left side of the screen. The menu lets you visit the dashboard, see your costs & account overview on the Usage page, see all of your aggregated backups created at gridscale at the Backup Center and other things:

As far as security is concerned, gridscale has thought this one out too - you can apply a firewall template to any server you use. Here’s how that looks:

Before applying a firewall template you can also create one yourself and configure the firewall the way you want it to work:

gridscale also has a marketplace section.

When ClusterControl is chosen, gridscale provides instructions on how to set everything up, along with an overview of all of ClusterControl’s features:

Summary

ClusterControl is now available on gridscale, an innovative and secure IaaS and PaaS offering. From now on, you can bring your ClusterControl deployment to gridscale and enjoy benefits including, but not limited to, lightning-fast database infrastructure management, fully automated database ops, independent resource selection and scaling during operation, and 1 Gbit/s unlimited network speed for each virtual machine. In general, gridscale provides a reliable cloud platform with a 100% uptime SLA, and we think that customers of both ClusterControl and gridscale will benefit from this collaboration.


Enforcing Role-Based Access Controls with ClusterControl


In Severalnines’ recent release of ClusterControl version 1.8.2 we have introduced a lot of sophisticated features and changes. One of the important features is the newly improved User Management System, which covers New User and LDAP Management.  A complementary existing capability in ClusterControl is its Role-Based Access Control (RBAC) for User Management, which is the focus of this blog.

Role-Based Access Control in ClusterControl

For those who are not familiar with ClusterControl's Role-Based Access Control (RBAC), it's a feature that enables you to restrict certain users' access to specific database cluster features and administrative actions or tasks - for example, access to deployment (add load balancers, add existing cluster), management, and monitoring features. This ensures that only authorized users can view and act on what their respective roles allow, and it helps avoid unwanted intrusion or human error by limiting a role’s access to administrative tasks. Access to functionality is fine-grained, allowing access to be defined by an organization or user. ClusterControl uses a permissions framework to define how a user may interact with the management and monitoring functionality based on their level of authorization.

Role-Based Access Control in ClusterControl plays an important role especially for admin users that are constantly utilizing it as part of their DBA tasks. A ClusterControl DBA should be familiar with this feature as it allows the DBA to delegate tasks to team members, control access to ClusterControl functionality, and not expose all of the features and functionalities to all users. This can be achieved by utilizing the User Management functionality, which allows you to control who can do what. For example, you can set up a Team such as analyst, devops, or DBA, and add restrictions according to their scope of responsibilities for  a given database cluster.

ClusterControl access control is depicted in  the following diagram,

Details of the terms used above are provided below. A Team can be assigned to one or more of the database clusters managed by ClusterControl, and a Team can contain zero or more users. By default, when creating a new Team, the super admin account is always associated with it; removing the superadmin does not unlink it from that new Team.

A User and a Cluster have to be assigned to a Team; it is a mandatory implementation within ClusterControl. By default, the super-admin account is designated to an Admin team, which has already been created by default. Database Clusters are also assigned to the Admin team by default.

A Role can have no User assigned or it can be assigned to multiple users in accordance with their ClusterControl role.

Roles in ClusterControl

ClusterControl comes with a set of roles defined by default. These default roles are:

  • Super Admin - It is a hidden role, but it is the super administrator (superadmin) role, which means all features are available for this role. By default, the user that you created after a successful installation represents your Super Admin role. Additionally, when creating a new Team the superadmin is always assigned to the new Team by default. 

  • Admin - By default, almost all features are available, meaning that users under the Admin role can view them and perform management tasks. The only features not available to this role are Customer Advisor and SSL Key Management.

  • User - Integrations, access to all clusters, and some other features are denied by default for this role. This is useful for regular users who are not meant to perform database or administrative tasks. Some manual steps are required for users in the User role to see other clusters.

Roles in ClusterControl are not fixed to these defaults: administrators can create custom roles and assign them to users under Teams.

How to Get Into ClusterControl Roles

You can create a custom role with its own set of access levels and assign the role to a specific user under the Teams tab. This can be reached by locating User Management in the sidebar on the right. See the screenshot below:

 

Enforcing Role-Based Access Controls with ClusterControl

Enforcing RBAC is user domain-specific, which restricts a user's access to ClusterControl features in accordance with their roles and privileges. With this in mind,  we should start creating a specific user. 

Creating a User in ClusterControl

To create a user, start under the User Management ➝ Teams tab. Now, let's create a Team first. 

Once the Team is created, the super-admin account is linked to it by default.

Now, let's add a new user. Adding a new user has to be done under a Team, so we can create it under DevOps.

As you might have noticed, the new user we created is now under the role User, which is assigned by default within ClusterControl, and belongs to the DevOps Team.

Basically, there are two users now under the DevOps Team as shown below:

Take note that Roles are user domain-specific: they apply access restrictions only to that specific user, not to the Team the user belongs to.
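If you prefer the command line, a comparable user can also be created with the s9s CLI; a rough sketch (the exact flag names may differ between versions - check s9s user --help):

$ s9s user --create --group=DevOps --create-group --new-password=********* --email-address=user@example.com newuser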

Admin vs User Roles (Default Roles in ClusterControl)

Since ClusterControl adds two roles by default, each comes with its own default limitations. To see what these are, go to User Management ➝ Access Control. Below is a screenshot depicting the features and privileges available to users belonging to each role:

Admin Role

User Role

 

The Admin role has many more privileges, whereas the User role has some privileges restricted. These default roles can be modified in accordance with your desired configuration. Adding a Role also lets you decide from the start which privileges are allowed or denied. For example, let's create a new Role. To create a role, just hit the "+" (plus) button next to the roles. Below you can see the new role we've created, called Viewer.

All ticks are unchecked by default. Check the Allow column to enable a feature or privilege, or check the Deny column to deny access. The Manage column allows users in that role to perform management tasks, while the Modify column enables modifications, which are only available for the privileges or features under Alarms, Jobs, and Settings.

Testing RBAC

In this example, the following clusters are present in my controller as shown below:

This is viewable by the super administrator account in this environment. 

Now that we have RBAC set for the user we just created, let's try to log in using the email and password we have just set.

This is what the new user sees after logging in:

No clusters are available and viewable, and some privileges are denied as shown below such as Key Management Settings and the E-mail Notifications:

Adjusting RBAC

Since the privileges in the roles are mutable, it's easy to manage them via User Management ➝ Access Control. 

Now, let's allow the created user to view a cluster. Since our user has Access All Clusters denied, we need to enable it. See below,

Now, as stated earlier based on the diagram, Clusters are assigned to a Team. For example, consider the following cluster assignments below:

Since we need to assign the cluster to the right team, selecting the specific cluster and clicking the "Change Team" button will show the prompt allowing you to re-assign it to the right Team.

Now, let's assign it to DevOps.

Now, logging back in as the newly created user, we are able to see the cluster.

Summary

Role-Based Access Control (RBAC) in ClusterControl is a feature that provides fine-grained access control for every user you have created in ClusterControl, enabling greater security and more restrictive access based on a user’s role.

A Guide to Database Automation with Severalnines ClusterControl


Nowadays, database automation is a very hot topic. Database automation, simply speaking, refers to leveraging processes and tools to make administrative tasks for database developers and database administrators simpler.

Why Database Automation?

Database automation refers to the use of self-regulating standalone processes for administrative tasks in a database. As your data grows, database automation can prove to be invaluable as it alleviates the accompanying administrative burden. Database automation can help you to reduce errors and anomalies in your database by eliminating the risk of human error. It can also help you to use the DBAs working in your organization more efficiently, making them available for other potentially mission-critical tasks including patching, upgrading, scaling, provisioning or data recovery. In short, automating the processes in your database is a very good thing - let’s dive deeper into it.

What Can Be Automated?

When it comes to databases, developers and DBAs can automate a number of things. They include, but are not limited to, automating backup processes, automating the deployment and scaling of your database instances, automating the monitoring and reporting of any issues that might arise, etc.

The automation of monitoring and reporting of issues related to your database can alert you whenever there’s a problem related to any of your database instances.  When it comes to automating backup processes, backup verification is critical. Chances are you do not have one tool to help you do everything at once, but there is a solution.

Automating Your Database Processes with ClusterControl

 Severalnines ClusterControl is a database operations management and automation tool that has enabled over 12,000 deployments and is used by a wide range of customers across a variety of industries. Companies using ClusterControl include HP, Vodafone, the NHS, universities in the Netherlands, BT, Orange, Cisco and various other organizations. Some of the benefits customers have had using ClusterControl’s automation include: no longer having to use home grown scripts, which otherwise required a lot of time to maintain (Kickback); using ClusterControl as a virtual DBA (net-sol.at); helping to optimize the process of database replication (iyzico); or simply monitoring PostgreSQL-based instances and achieving high-availability (NHS).

ClusterControl can help you automate your database processes in a number of different ways:

  • ClusterControl helps you back up your data, allowing you to protect all of your business-critical assets, while also offering retention policies for compliance, data encryption and compression. Backed-up data can be automatically uploaded to AWS S3, Google Cloud Storage or Azure Storage. 

  • ClusterControl can be used as a monitoring and alerting tool because it understands the specific needs of different database engines, and will not only alert you when something goes wrong, but also when it thinks something may go wrong in the future.  

  • With a point-and-click interface, ClusterControl lets you automate the deployment and scaling of your database instances quickly, efficiently and safely. 

  • The tool comes equipped with advanced monitoring and reporting features, with comprehensive operational reports on the health and stability of your database operations. 

  • It enables you to automatically deploy and run highly-available database clusters to AWS, Microsoft Azure or Google Cloud. 

In a nutshell, ClusterControl can help:

  1. Ensure that tasks and processes are approached the same way, which increases business efficiency and IT agility.

  2. Centralize the database management into a single interface.

  3. Ensure that DBAs, sysadmins and developers will be able to manage entire database clusters efficiently with minimal risks while at the same time using industry best practices.

To automate your database processes using ClusterControl, you have multiple options: you can configure and deploy highly available database clusters, scale them up and down by adding or removing nodes, and even handle patches - automatically. While one could cobble together various tools and scripts to approximate the features offered in ClusterControl, the Severalnines team has already done the work to enable operations such as templated, repeatable database server and cluster deployments, deployment and integration of proxy servers, monitoring and alerting, backups, restores and backup scheduling, and automated cluster and node recovery, among others. The same operations can also be scripted, as shown below.
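For example, a repeatable deployment can be driven entirely from the command line with the s9s CLI; a minimal sketch (addresses, credentials and versions are placeholders for your own environment):

$ s9s cluster --create --cluster-type=galera --vendor=percona --provider-version=8.0 --nodes="192.168.100.11;192.168.100.12;192.168.100.13" --cluster-name=my-galera-cluster --os-user=root --os-key-file=/root/.ssh/id_rsa --db-admin=root --db-admin-passwd=********* --log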

Now we will see how everything looks from the inside. ClusterControl provides you with an overview of your database clusters:

To get started, simply deploy or import a cluster:

Once you have an active database cluster, click on it and you should see an overview:

ClusterControl also provides you with the ability to drill down into individual nodes:

You can also monitor performance:

As far as performance goes, you also have numerous other benefits. For example, you can monitor the queries running on your server:

As you can probably see, ClusterControl is useful not only for database automation, it can be used for a variety of other things.

Summary

Database automation is the process of leveraging tools and processes to make database tasks less complex, saving time for both developers and DBAs alike. Severalnines ClusterControl can help by letting you easily deploy, monitor, manage and scale highly available open source databases on-premise or in the cloud. ClusterControl also comes equipped with advanced monitoring and reporting features to help you push your database instances to the max, and it provides comprehensive operational reports on the health of your databases.

Announcing ClusterControl 1.8.2: Enhanced Security, Resource Utilization and Administration


We are pleased to announce the release of version 1.8.2 of Severalnines ClusterControl  - the only management system you’ll ever need to deploy, monitor, and manage your open source databases.  

ClusterControl version 1.8.2 features new user and LDAP management capabilities for greater security, a new patch management system for easier upgrades, support for PgBouncer (PostgreSQL’s connection pooler) for more efficient use of server resources, and other enhancements. Details follow.


New User and LDAP Management for Enhanced Security

ClusterControl has a new Role-Based Access Control (RBAC) system to manage users with greater access control, LDAP management, and a centralized user database, all contributing to better and more fine-grained security management options.

  • The user management system is based on the Unix/Linux filesystem permissions model, providing better security in a familiar fashion.

  • You can now create users and teams, controlling their access according to their roles. Access permissions can be very fine-grained, e.g., limited to read-only access.

  • ClusterControl can be used with the most popular LDAP servers like OpenLDAP or Windows AD for authenticating its users.

  • The ClusterControl web application and the command line tool now access the same user database, keeping users of each perfectly in sync. 

  • The ClusterControl controller now also uses the improved, secure (encrypted) RPC v2 API that’s been in use by the Severalnines (s9s) command line tool for some time.

 

 

New Patch Management

ClusterControl can now upgrade MySQL, PostgreSQL, and ProxySQL nodes through an improved and redesigned patch management system. With a better user interface and greater reliability, it shows installed packages and versions, checks for new packages to upgrade, and is capable of selectively upgrading nodes.

 

 

PgBouncer Support - connection pooling for PostgreSQL

With support for PgBouncer, ClusterControl users can pool and optimize connections to one or more PostgreSQL databases. Connection processing is made more efficient, and server resource consumption is reduced when maintaining many connections to one or more PostgreSQL databases. A sketch of what a PgBouncer pool definition looks like is shown after the list below.

  • PgBouncer can be deployed to one or more nodes to manage multiple pools per node.  

  • Pool modes can be based on Session, Transaction or Statement.  

  • There is also a  Prometheus exporter, which provides metrics for a new PgBouncer dashboard.
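ClusterControl generates and manages the PgBouncer configuration for you, but for illustration, a pool definition in pgbouncer.ini typically looks something like the following sketch (the host, database name and paths are placeholders, not ClusterControl defaults):

[databases]
mydb = host=192.168.100.133 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; pool_mode can be session, transaction or statement, as noted above
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20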

 

 

Cluster Tags/Labels

You can  now tag or label database clusters to enable quick identification of one or more clusters that are used for a specific purpose.  Tags can be added at cluster deployment or at cluster import.  Cluster management is made easier by searching for or filtering out clusters with specific tags. 

PostgreSQL Support

ClusterControl now supports the most recent major PostgreSQL release, v. 13, for both deployment and import.  It also supports the pgaudit extension to enable audit logging.  And, there are new classes of statements that can be logged (see: https://www.pgaudit.org/).

Miscellaneous

And there’s a bit more…

  • MySQL Cluster (NDB) v. 8.0 is now supported.  

  • Percona MongoDB 4.x audit log is supported (via ClusterControl command line tool).

    • Note that MySQL’s audit log has already been enabled in a previous release.

Understanding Lock Granularity in MySQL


If you’ve been working with MySQL for some time, you have probably heard the terms “table-level locking” and “row-level locking”. These terms refer to the lock granularity in MySQL - in this blog we will explain what they mean and what they can be used for.

What is Lock Granularity in MySQL?

Each MySQL storage engine supports different levels of granularity for their locks. MySQL has three lock levels: row-level locking, page-level locking and table-level locking. Each MySQL storage engine implements locking differently giving you some distinct advantages and disadvantages. We’ll first look into what lock granularity is, then look into how everything works in different storage engines.

Broadly speaking, locks in MySQL fall into one of these categories. Locks can be:

  • Page-level - such types of lock granularities were available in older engines of MySQL, specifically BDB, which is now obsolete as of MySQL 5.1. In short, BDB was a storage engine included in the older versions of MySQL and it was a transactional storage engine which performed page-level locks. Since these types of lock granularities are no longer used we will not go in-depth into them here, but in general, these locks are limited to the data and indexes that reside on a particular page. If you want to learn more about BDB, the page on MariaDB should provide some more information.

  • Table-level - MySQL uses table-level locking for all storage engines except InnoDB.

  • Row-level - row-level locking is used by InnoDB.

The Advantages and Disadvantages of Table-level Locking

MySQL uses table-level locking for all storage engines except InnoDB, meaning that table-level locking is used for tables running the MyISAM, MEMORY and MERGE storage engines, permitting only one session to update a table at a time. Table-level locks have some distinct advantages over row-level locks: for example, table-level locking generally requires a little less memory than row-level locking (which needs some memory per row or group of rows that are locked), and it is usually fast because only one lock is involved. A table write lock is placed on a table if there are no pre-existing locks on it; otherwise, the write lock request is put in the write lock queue. It’s worth mentioning that table-level locking also has some distinct disadvantages: for example, it might not be a good fit for applications that require many concurrent transactions (e.g., an online banking application), because only one session can write to a table at any one time, and some of the storage engines that use table-level locking (such as MyISAM) do not support the ACID model. An example of explicit table-level locking follows below.
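For illustration, a minimal sketch of explicit table-level locking on a hypothetical accounts table (MyISAM takes equivalent table locks implicitly on every write):

-- Explicitly lock the whole table; other sessions block until UNLOCK TABLES
LOCK TABLES accounts WRITE;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 123;
UNLOCK TABLES;

-- See which tables are currently locked or being waited on
SHOW OPEN TABLES WHERE In_use > 0;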

Here’s an example: imagine a banking application that uses two tables in a database - let’s say those tables are called “checking” and “savings”. You need to move $100 from a person’s checking account to his savings account. Logically, you would perform the following steps:

  1. Make sure the account balance is greater than $100.

  2. Subtract $100 from the checking account.

  3. Add $100 to the savings account.

To perform these actions, you would need a couple of queries, for example:

SELECT balance FROM checking WHERE account_id = 123;
UPDATE checking SET balance = balance - 100 WHERE account_id = 123;
UPDATE savings SET balance = balance + 100 WHERE account_id = 123;

These queries might look simple, but if you use MyISAM (we use MyISAM as an example because it’s one of the primary storage engines that uses table-level locks), keep in mind that the engine doesn’t support ACID, which means that if the database server crashes while performing any of those queries, you’re out of luck: people could end up with cash in both accounts or in neither of them. The main engine that supports ACID transactions in MySQL is InnoDB, so if you need reliable transactions, it is worth looking into. InnoDB also supports row-level locking - this is what we will look into now, and the same transfer wrapped in a transaction is shown below.
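With InnoDB, the transfer can be made atomic by wrapping it in a transaction; a minimal sketch (table and column names follow the example above):

START TRANSACTION;
-- FOR UPDATE takes an exclusive row lock, so no other session can change the balance mid-transfer
SELECT balance FROM checking WHERE account_id = 123 FOR UPDATE;
UPDATE checking SET balance = balance - 100 WHERE account_id = 123;
UPDATE savings SET balance = balance + 100 WHERE account_id = 123;
COMMIT;
-- If anything fails before COMMIT, a ROLLBACK (or a crash) leaves both accounts untouched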

The Advantages and Disadvantages of Row-level Locking

MySQL uses row-level locking for InnoDB tables to support simultaneous write access by multiple sessions. Some of the advantages of row-level locking include the ability to lock a single row for long periods of time and fewer lock conflicts when many threads access different rows. However, row-level locking has disadvantages too: it usually takes up more memory than page-level or table-level locking, and it is usually slower because the engine must acquire more locks. InnoDB is one of the engines that supports row-level locking; it is also ACID compliant, meaning that it is a good fit for transaction-based applications (refer to the example above). Now we will look into how lock granularity works in the InnoDB storage engine.

How does Lock Granularity Work in InnoDB?

InnoDB is widely known for supporting row-level locking, but it’s also worth noting that the engine supports multiple lock types, which means that you can use both row-level and table-level locks. InnoDB performs row-level locking by setting shared or exclusive locks on the index records it encounters when it searches or scans a table index. A shared lock permits the transaction that holds the lock to read the row in question; an exclusive lock, on the other hand, permits the transaction that holds the lock to update or delete the row. Both can be requested explicitly, as shown below.
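A minimal sketch of requesting both lock types explicitly inside a transaction (table names follow the earlier banking example):

START TRANSACTION;
-- Shared (S) lock: other sessions can still read the row, but cannot modify it
-- (in MySQL 8.0 the same shared lock can be requested with FOR SHARE)
SELECT balance FROM checking WHERE account_id = 123 LOCK IN SHARE MODE;
-- Exclusive (X) lock: other sessions can neither lock nor modify the row
SELECT balance FROM checking WHERE account_id = 123 FOR UPDATE;
COMMIT;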

InnoDB also has other types of locks - among them intention locks, record locks, gap locks, next-key locks and insert intention locks. Intention locks, for example, can also be shared or exclusive - such locks indicate that a transaction intends to set a certain type of lock (a shared or an exclusive lock) on individual rows in a table; a record lock is a lock on an index record, and so on.

In general, InnoDB lock granularity differs from the lock granularity found in other MySQL storage engines (for example, MyISAM): when table-level locking is in use, only one session can update a given table at a time, whereas when row-level locking is in use, MySQL supports simultaneous write access across multiple sessions, making row-level locking storage engines such as InnoDB a suitable choice for mission-critical applications.

Lock Granularity and Deadlocks

Lock granularity and locking levels in MySQL can be a great thing, but they can also cause problems. One of the most frequent problems caused by locking is deadlocks - a deadlock occurs when different MySQL transactions are unable to proceed because each of them holds a lock that the other one needs. Thankfully, when using the InnoDB storage engine, deadlock detection is enabled by default: when a deadlock is detected, InnoDB automatically rolls back one of the transactions. If you encounter deadlocks when dealing with lock granularity in MySQL, don’t fret - consider simply retrying your transaction. A few statements that are useful when investigating deadlocks are shown below. In order to proactively monitor your database, you should also consider utilizing the features provided by ClusterControl.
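For reference, these standard InnoDB facilities help when investigating deadlocks:

-- The LATEST DETECTED DEADLOCK section shows the two transactions and the locks they held
SHOW ENGINE INNODB STATUS\G
-- Log every deadlock to the error log rather than only keeping the latest one
SET GLOBAL innodb_print_all_deadlocks = ON;
-- Deadlock detection itself is controlled by this variable (ON by default, MySQL 5.7.15+)
SHOW VARIABLES LIKE 'innodb_deadlock_detect';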

How can ClusterControl Help You?

Here are some of the things that ClusterControl, developed by Severalnines, can help you with:

  • The protection of all of your business data

    • If your data is corrupted (which can be caused by not using an ACID-compliant storage engine, or by other factors described above), the tool can run an automatic process that verifies that you can recover your data.

    • The tool can let you know which databases are not backed up or show you the status of your backups (whether they were successful or they failed)

  • The automation of your database operations

    • ClusterControl can help you ensure that your sysadmins, developers and DBAs manage entire database clusters efficiently with minimal risks using industry best practices

  • Effectively managing your database infrastructure in general

    • Today’s shift in technologies combined with sophisticated infrastructure solutions requires advanced tools and knowledge to achieve high availability and optimal performance for your business-critical applications. ClusterControl can also help you with the deployment, monitoring, management and scaling of the most popular open source database technologies including MySQL, MariaDB, MongoDB, PostgreSQL, TimeScaleDB and the rest.

To learn more about how ClusterControl can help streamline your business operations, make sure to keep an eye on the Severalnines database blog.

Summary

Different MySQL storage engines have different types of lock granularity available. Before deciding on the storage engine you should use, be sure to learn as much as possible about the engine in question (for example, as already noted, MyISAM should be avoided for mission-critical data because it’s not ACID compliant), understand all of the related performance implications including lock granularity and deadlocks, and choose wisely.

Configuring Mutual SSL Authentication in ClusterControl


Establishing trusted communications between systems is essential to enhancing a system’s security. The use of Public Key Infrastructure (PKI) is one of the common ways to achieve trusted communication in distributed systems. In particular, Mutual SSL Authentication can be used to enhance the security of a client/server interaction by verifying a client’s identity. This is not the only way to verify an identity, though, as I mentioned in my previous zero trust blog.

In this blog, we will go through the steps on how to configure Mutual SSL Authentication also known as Two-Way SSL.

Create a Root CA

  1. Create a Root CA Key

$ openssl genrsa -out severalnines-internal-rootCA.key 4096 
  2. Create and Self-Sign the Root Certificate

$ openssl req -x509 -new -nodes -key severalnines-internal-rootCA.key -sha256 -days 1024 -out severalnines-internal-rootCA.crt 

Generate ClusterControl’s (Apache2) Certificate

  1. Create ClusterControl’s Server Private Key

$ openssl genrsa -out clustercontrol.key 2048
  2. Create an SSL Configuration to configure Subject Alternative Names (SAN)

The SSL config file should look like the configuration below.

$ cat clustercontrol-ssl.conf 
[ req ]
default_bits       = 4096
distinguished_name = req_distinguished_name
req_extensions     = req_ext
[ req_distinguished_name ]
countryName      = Country Name (2 letter code)
countryName_default         = GB
stateOrProvinceName         = State or Province Name (full name)
stateOrProvinceName_default = England
localityName                = Locality Name (eg, city)
localityName_default        = Brighton
organizationName            = Organization Name (eg, company)
organizationName_default    = Hallmarkdesign
commonName                  = Common Name (e.g. server FQDN or YOUR name)
commonName_max              = 64
commonName_default          = clustercontrol.severalnines.internal
[ req_ext ]
subjectAltName = @alt_names
[alt_names]
DNS.1   = severalnines.internal
DNS.2   = clustercontrol.severalnines.internal

Note: Ensure that you add the Subject Alternative Names (SANs) above to your DNS or hosts file.

  3. Generate a ClusterControl Certificate Signing Request (CSR)

$ openssl req -new -key clustercontrol.key -out clustercontrol.csr -config clustercontrol-ssl.conf
  4. Sign the ClusterControl Certificate using the Root Certificate Authority (CA)

$ openssl x509 -req -in clustercontrol.csr -CA severalnines-internal-rootCA.crt -CAkey severalnines-internal-rootCA.key -CAcreateserial -out clustercontrol.crt -days 500 -sha256 -extensions req_ext -extfile clustercontrol-ssl.conf
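Before wiring the certificate into Apache, it is worth verifying the chain of trust and confirming that the SANs made it into the signed certificate:

$ openssl verify -CAfile severalnines-internal-rootCA.crt clustercontrol.crt
$ openssl x509 -in clustercontrol.crt -noout -text | grep -A1 "Subject Alternative Name"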

Configure Apache2 installed with ClusterControl

  1. Configure the Apache2 SSL Configuration File

Open the s9s SSL configuration file located at /etc/apache2/sites-available/s9s-ssl.conf and replace the following settings as shown below:

ServerName clustercontrol.severalnines.internal #Define one of the Subject Alternative Names (SAN) as provided in the clustercontrol-ssl.conf file
SSLCertificateFile /path/to/clustercontrol.crt 
SSLCertificateKeyFile /path/to/clustercontrol.key
SSLCACertificateFile /path/to/severalnines-internal-rootCA.crt #Define the path to the Root CA Certificate generated in the first step 
SSLVerifyClient require #Require browsers/clients to provide a client-cert
SSLVerifyDepth 10

Note: You will need to restart apache after changing the settings above.
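For example, on a Debian/Ubuntu-based host (assumed here; RHEL-based systems use the httpd service instead), you can validate the configuration and restart Apache with:

$ sudo apachectl configtest
$ sudo systemctl restart apache2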

Generate a Client Certificate

  1. Create an RSA Encrypted Key (myclient-pass.key) with a Password

$ openssl genrsa -aes256 -passout pass:mykey123 -out myclient-pass.key 4096
  2. Decrypt/Extract the RSA Key for Signing

$ openssl rsa -passin pass:mykey123 -in myclient-pass.key -out myclient.key
  3. Generate a Client Certificate Signing Request (CSR)

$ openssl req -new -key myclient.key -out myclient.csr

In step 3 above,  you will be required to provide the details as shown below:

Country Name (2 letter code) [AU]:SW
State or Province Name (full name) [Some-State]:Sweden
Locality Name (eg, city) []:Stockholm
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Severalnines AB
Organizational Unit Name (eg, section) []:Security
Common Name (e.g. server FQDN or YOUR name) []: myclient.severalnines.internal
Email Address []:my@severalnines.internal

Note: Client certificate signing requests should be generated by the user and sent to the security or system administrator who is in charge of administering the Root CA server.
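The signing step itself, performed on the Root CA side, mirrors the server certificate signing shown earlier; a minimal sketch:

$ openssl x509 -req -in myclient.csr -CA severalnines-internal-rootCA.crt -CAkey severalnines-internal-rootCA.key -CAcreateserial -out myclient.crt -days 365 -sha256

The resulting myclient.crt is the client certificate used in the next section.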

Configure the Browser to Access ClusterControl UI

  1. Concatenate the Client Key, Client Certificate and the Root CA Certificate

$ cat myclient.key myclient.crt severalnines-internal-rootCA.crt > myclient.pem
  2. Create a PKCS12 archive (Pfx) file for the client certificate that can be imported into the browser certificate/key store.

$ openssl pkcs12 -export -out myclient.pfx -inkey myclient.key -in myclient.pem -certfile severalnines-internal-rootCA.crt 
  3. Test access to ClusterControl UI

Before installing the client certificate in the browser, you should see a response like the one shown below. The response “clustercontrol.severalnines.internal didn’t accept your login certificate” simply means that Mutual SSL authentication has been enforced on Apache2 and the client certificate has not been installed in the browser, so access to the ClusterControl UI is not allowed at this point.
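You can also verify the server-side enforcement from the shell before touching the browser. A quick sketch with curl (the URL is the SAN defined earlier; adjust the path to your installation):

$ curl --cacert severalnines-internal-rootCA.crt https://clustercontrol.severalnines.internal/
# Fails (handshake error or 403 Forbidden) because no client certificate is presented

$ curl --cacert severalnines-internal-rootCA.crt --cert myclient.crt --key myclient.key https://clustercontrol.severalnines.internal/
# Succeeds once a client certificate signed by the same Root CA is presented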

  4. Installing the Client Certificate on your Chrome Browser

  • Go to the settings page by typing in “chrome://settings/” on the browser.
  • On the settings page,  you will see a section labeled “Privacy and Security”. Under this section, you will see a menu item labeled “Security”. Click on the menu item to get onto the Security settings page.
  • Under the Security settings page, you will see the “Manage certificates” menu item under the Advanced Section of the page. Click on that item to get onto the Certificate settings page. 
  • You will immediately see an import button under “Your certificates”. Click on the button to import the PKCS12 (Pfx) file generated earlier in the steps above.
  • You will be prompted to enter your certificate’s password as you had specified in the previous steps. 
  • You should be able to see your installed certificate as shown in the diagram below.


 

  5. Accessing the ClusterControl UI

The moment you try to access ClusterControl, you will be prompted to choose the client certificate you want to use, as in the diagram below. Be sure to select the appropriate one if you have more than one client certificate installed in your browser.

You should be able to access ClusterControl UI after selecting the certificate.

Conclusion

The steps above give you a stepwise guide on how to implement Mutual SSL Authentication in ClusterControl. This should go a long way in ensuring that you always verify the identity of the user/client that is accessing the ClusterControl UI.

Securing MySQL Backups: A Guide


If you’ve ever used MySQL, chances are you probably took backups of your database. If you took backups of your database, chances are you have at least once thought of how you could secure them. In this blog post we are going to tell you how to do exactly that.

Why Should You Secure Your MySQL Backups?

Before we tell you how you should secure your MySQL backups, we should probably tell you why you should secure them in the first place. What do we even mean by “securing” your MySQL backups? MySQL backups should be secure by default, right? Unfortunately, not everything is as simple as it seems. To take and maintain secure MySQL backups, you should consider the following things:

  1. Securely take your MySQL backups

  2. Securely store your MySQL backups

  3. Securely transfer your MySQL backups

Now obviously that’s easier said than done, but we will provide some general advice that can guide you in the right direction.

Securing MySQL Backups

  1. To securely take your MySQL backups by using, for example, mysqldump, consider putting the username and password of your MySQL user inside of my.cnf. You can even create a .my.cnf file in your home directory, store the username and password there, then use the --defaults-extra-file option to tell MySQL to read this file after the global option file:
     

    [mysqldump]
    user=demo_user
    password=demo_password

    This way you no longer need to provide your MySQL password when running mysqldump - by putting your username and password inside of my.cnf you make your password unobservable to anyone else but DBAs.

  2. Consider taking a look into mysqldump-secure: it’s a POSIX compliant wrapper script for mysqldump with encryption capabilities. The tool can back up databases as separate files. Databases can also be blacklisted from being backed up. The tool can also encrypt your MySQL databases and it is also self-validating meaning if anything goes wrong, it will tell you what happened and how to fix it, so if you’re looking for an alternative to mysqldump, definitely consider giving it a try.

  3. Once you’ve taken a backup of your MySQL or MariaDB database instances, consider encrypting it. Chances are data is one of the most precious assets to your organization and by encrypting it you can make sure it’s protected properly. Thankfully, encrypting MySQL backups is not very complex and it can be done in a couple of ways including encrypting the local file and encrypting the backup on-the-fly. To encrypt a local copy of your backup, simply take a backup of the data stored in MySQL, then encrypt it by using, for example, OpenSSL (replace password with the password you want to use):

    $ openssl enc -aes-256-cbc -salt -in backup.tar.gz -out backup.tar.gz.enc -k password

    Your backup can be decrypted by running:

    $ openssl aes-256-cbc -d -in backup.tar.gz.enc -out backup.tar.gz -k password


    You can also consider encrypting your backups on-the-fly. To do that, in general, you would need to implement the encryption while the backup is being generated (i.e. generate the backup, compress it and encrypt it). Here's how to do that for MySQL using mysqldump (your backup would be called encrypted_backup.xb.enc); a combined example, including the matching restore, appears after this list:

    mysqldump --all-databases --single-transaction --triggers --routines | gzip | openssl  enc -aes-256-cbc -k password > encrypted_backup.xb.enc

    You can also encrypt your backups using ClusterControl: simply check the boxes “Use Compression” and (or) “Enable Encryption” in the last stage of the backup and you’re done. Yes, it’s as easy as that!
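Putting the pieces above together, a complete take-compress-encrypt cycle and the matching restore look something like the following sketch (the credentials file and the password placeholder follow the examples above):

$ mysqldump --defaults-extra-file=$HOME/.my.cnf --all-databases --single-transaction | gzip | openssl enc -aes-256-cbc -k password > encrypted_backup.xb.enc

$ openssl enc -d -aes-256-cbc -k password -in encrypted_backup.xb.enc | gunzip | mysql -u demo_user -p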
     

You might also want to take a look into a shell script called mysql_secure_installation (or mariadb_secure_installation if you’re using MariaDB). The script enables you to:

  • Set a password for MySQL’s root accounts.

  • Remove root accounts that are accessible from outside the localhost.

  • Remove any anonymous user accounts and the test database which can be accessed by anonymous users.

If you are deploying MySQL or MariaDB using ClusterControl, something that you can do freely with the Community Edition, the deployment process automatically takes care of these security measures.

Summary

When it comes to securing your MySQL backups, the list of things you can do is pretty long. We hope that this blog post has given you some ideas on what you can do to secure your MySQL or MariaDB backups: in general, backups can be secured by keeping your password out of sight when mysqldump is invoked, and by encrypting your backups either locally or on-the-fly.

Using Automation to Speed up Release Tests on PostgreSQL


Having a test environment is a must in all companies. It could be necessary for testing changes or new releases of the application, or even for testing your existing application with a new PostgreSQL version. The hard part of this is, first, how to deploy a test environment as similar as possible to the production one, and how to maintain that environment without recreating everything from scratch.

In this blog, we will see how to deploy a test environment in different ways using ClusterControl, which will help you to automate the process and avoid manual time-consuming tasks.

Cluster-to-Cluster Replication

Since ClusterControl 1.7.4 there is a feature called Cluster-to-Cluster Replication. It allows you to have a replication running between two autonomous clusters.

We will take a look at how to use this feature for an existing PostgreSQL cluster. For this task, we will assume you have ClusterControl installed and the Primary Cluster was deployed using it.

Creating a Cluster-to-Cluster Replication

To create a new Cluster-to-Cluster Replication from the ClusterControl UI, go to ClusterControl -> Select PostgreSQL Cluster -> Cluster Actions -> Create Slave Cluster.

The Slave Cluster will be created by streaming data from the current Primary Cluster.

You must specify SSH credentials and port, a name for your Slave Cluster, and if you want ClusterControl to install the corresponding software and configurations for you.

After setting up the SSH access information, you must define the database version, datadir, port, and admin credentials. As it will use streaming replication, make sure you use the same database version, and the credentials must be the same used by the Primary Cluster.

In this step, you need to add the server for the new Slave Cluster. For this task, you can enter either the IP address or the hostname of the database node.

You can monitor the job status in the ClusterControl activity monitor. Once the task is finished, you can see the cluster in the main ClusterControl screen.

ClusterControl CLI

ClusterControl CLI, also known as s9s, is a command-line tool introduced in ClusterControl version 1.4.1 to interact, control, and manage database clusters using the ClusterControl system. ClusterControl CLI opens a new door for cluster automation where you can easily integrate it with existing deployment automation tools like Ansible, Puppet, Chef, etc. You can also use this ClusterControl tool to create a Slave Cluster. Let’s see an example:

$ s9s cluster --create --cluster-name=PostgreSQL1rep --cluster-type=postgresql --provider-version=13 --nodes="192.168.100.133"  --os-user=root --os-key-file=/root/.ssh/id_rsa --db-admin=admin --db-admin-passwd=********* --vendor=postgres --remote-cluster-id=14 --log

Now, let’s look at the parameters used in more detail:

  • Cluster: To list and manipulate clusters.

  • Create: Create and install a new cluster.

  • Cluster-name: The name of the new Slave Cluster.

  • Cluster-type: The type of cluster to install.

  • Provider-version: The software version.

  • Nodes: List of the new nodes in the Slave Cluster.

  • Os-user: The user name for the SSH commands.

  • Os-key-file: The key file to use for SSH connection.

  • Db-admin: The database admin user name.

  • Db-admin-passwd: The password for the database admin.

  • Remote-cluster-id: Master Cluster ID for the Cluster-to-Cluster Replication.

  • Log: Wait and monitor job messages.

Managing Cluster-to-Cluster Replication

Now that you have your Cluster-to-Cluster Replication up and running, there are different actions you can perform on this topology using ClusterControl, from both the UI and the CLI.

Rebuilding a Slave Cluster

To rebuild a Slave Cluster, go to ClusterControl -> Select Slave Cluster -> Nodes -> Choose the Node -> Node Actions -> Rebuild Replication Slave.

ClusterControl will perform the following steps:

  • Stop PostgreSQL Server

  • Remove content from its datadir

  • Stream a backup from the Master to the Slave using pg_basebackup

  • Start the Slave

You can also rebuild a Slave Cluster using the following command from the ClusterControl server:

$ s9s replication --stage --master="192.168.100.125" --slave="192.168.100.133" --cluster-id=15 --remote-cluster-id=14 --log

The parameters are:

  • Replication: To monitor and control data replication.

  • Stage: Stage/Rebuild a Replication Slave.

  • Master: The replication master in the master cluster.

  • Slave: The replication slave in the slave cluster.

  • Cluster-id: The Slave Cluster ID.

  • Remote-cluster-id: The Master Cluster ID.

  • Log: Wait and monitor job messages.

Create Cluster from Backup

Another way to create a test environment is by creating a new cluster from a backup of your Primary Cluster. For this, go to ClusterControl -> Select your PostgreSQL cluster -> Backup. There, choose the backup to be restored from the list.

Now, you can restore this backup in your current database, in a separate node, or create a new cluster from this backup.

The “Create Cluster From Backup” option will create a new PostgreSQL Cluster from the selected backup.

You need to add the OS and database credentials and the information to deploy the new cluster. When this job finishes, you will see the new cluster in the ClusterControl UI.

Restore Backup on Standalone Host

In the same Backup section, you can choose the option “Restore and verify on standalone host” to restore a backup in a separate node.

Here you can specify if you want ClusterControl to install the software in the new node, and disable the firewall or AppArmor/SELinux (depending on the OS). You can keep the node up and running, or ClusterControl can shutdown the database service until the next restore job. When it finishes, you will see the restored/verified backup in the backup list marked with a tick.

If you don’t want to do this task manually, you can schedule this process using the Verify Backup Feature, to repeat this job periodically in a Backup Job.

Automatic ClusterControl Backup Verification

In ClusterControl -> Select your PostgreSQL Cluster -> Backup -> Create Backup.

The automatic verify backup feature is available for the scheduled backups. When scheduling a backup, in addition to selecting the common options like method or storage, you also need to specify schedule/frequency.

Using ClusterControl, you can choose different backup methods depending on the database technology and, in the same section, choose the server from which to take the backup, where you want to store the backup, and whether you want to upload the backup to the cloud (AWS, Azure, or Google Cloud). You can also compress and encrypt your backup, and specify the retention period. The same backup can also be triggered ad hoc from the command line, as sketched below.
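For completeness, a rough sketch of creating a one-off backup with the s9s CLI (the PostgreSQL backup method name and the paths here are assumptions - check s9s backup --help for the values supported by your version):

$ s9s backup --create --backup-method=pg_basebackup --cluster-id=15 --nodes=192.168.100.133 --backup-directory=/var/backups --log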

To use the Verify Backup Feature, you need a dedicated host (or VM) that is not part of the cluster. ClusterControl will install the software and will restore the backup in this host every time the job runs.

After restoring, you can see the verification icon in the ClusterControl Backup section, the same that you will have by doing the verification in the manual ClusterControl way, with the difference that you don’t need to worry about the restoration task. ClusterControl will restore the backup every time automatically.

Conclusion

Deploying a test environment every time you need one can be a time-consuming task, and it is hard to keep it up to date. As a result, companies sometimes don’t test new releases, or the test is not valid - for example, because it uses an environment different from production.

As you can see, ClusterControl allows you to deploy the same environment that you are using in production with just a few clicks, or even automate the process to avoid any manual tasks.

 

Automated Testing of the Upgrade Process for PXC/MariaDB Galera Cluster


Upgrading your database for Galera-based clusters such as Percona XtraDB Cluster (PXC) or MariaDB Galera Cluster can be challenging, especially for a production-based environment. You cannot afford to lose the state of your high availability and put it at risk. 

An upgrade procedure must be well documented, and ideally, documentation, rigorous testing, and benchmarking should be done before upgrades. Most importantly, security and improvements also have to be identified based on the changelog of its database version upgrade. 

With all these concerns in mind, automation helps achieve a more efficient upgrade process, avoid human error, and improve RTO.

How to Manage PXC/MariaDB Galera Cluster Upgrade Process 

Upgrading your PXC/MariaDB Galera Cluster requires proper documentation and a process flow that lists the things to be done and what to do in case things go south. That means a Business Continuity Plan, which should also cover your Disaster Recovery Plan, has to be laid out. You cannot afford to lose your business in case of trouble.

The usual take is to start first with the test environment. The test environment should have the exact same settings and configuration as your production environment. You cannot proceed directly with upgrading the production environment, as you cannot be sure what effect and impact it will have if things do not go according to plan.

Working with a production environment is highly sensitive, so in most cases, a downtime and maintenance window is always there to avoid drastic impact. 

There are two types of upgrade for PXC or MariaDB Galera Cluster that you need to be aware of: the major release upgrade and the minor release upgrade, often referred to as an in-place upgrade. An in-place upgrade is one where you upgrade your database to its most recent minor version while keeping the same data. There are no physical changes to the data itself, only to the database binaries or underlying software packages.

Upgrading PXC or MariaDB Galera Cluster to a Major Release

Upgrading to a major release can be challenging, especially in a production environment. It involves complex database configuration and the special built-in features of PXC or MariaDB Galera Cluster. Spatiotemporal, time-stamped, machine-generated, or otherwise multi-faceted data is particularly sensitive to upgrades. You cannot simply apply an in-place upgrade for this process because too many major changes are involved. The exception is when you have very small data sets, or data that is idempotent or can easily be regenerated - in that case it can be safe, as long as you know the impact on your data.

If your data volume is large, then it is best to have the upgrade process automated. However, it might not be ideal to automate the whole sequence of the upgrade process, because unexpected issues can creep in during a major upgrade. It is best to automate repetitive steps and processes with known outcomes. At any point, a person is required to evaluate whether the automation process is safe to continue, to avoid any halts in the upgrade. Automated testing after the upgrade is equally important, and it should be included as part of the post-upgrade process.

Upgrading PXC or MariaDB Galera Cluster to a Minor Release

A minor release upgrade, referred to as an in-place upgrade, is usually a safer approach. This is because the most common changes in such releases are security and exploit patches, improvements, fixes for bugs (usually severe ones), or fixes for compatibility issues, especially when changes to the current hardware or OS could otherwise cause the database not to function properly. Although the impact is usually recoverable with minimal effect, you still must read the changelog for the specific minor version you are upgrading to.

Deploying the job that performs the upgrade process is an ideal candidate for automation. The usual flow is very repetitive and mostly causes no harm to your existing PXC or MariaDB Galera Cluster. What matters most is that, after the upgrade, automated testing proceeds to determine that the setup, configuration, efficiency, and functionality are not broken.

Avoid the Fiascoes! Be ready, Have it Automated!

A client of ours reached out to us asking for assistance because, after a minor database upgrade, a feature they use in the database was not working properly. They asked for steps and a process for how to downgrade, and how safe it would be. Their customers were complaining that their application was totally not working, generalizing that it was not useful.

Even for such a small glitch, a pissed-off customer might leave a bad remark about your product. The lesson learnt from this scenario is that failing to test after an upgrade leads to the assumption that all functions in the database are working as expected.

If you have plans to automate the upgrade process, take note that the type of automation varies with the type of upgrade you have to do. As mentioned earlier, a major upgrade and a minor upgrade require distinct approaches, so your automation setup might not apply to both kinds of database software upgrade.

Automating After the Upgrade Process

At this point, it is expected that you have completed your upgrade process, ideally through automation. Now that your database is ready to receive client connections, it has to go through a rigorous testing phase.

Run mysql_upgrade

It is very important, and strongly recommended, to execute mysql_upgrade once the upgrade process has completed. mysql_upgrade looks for incompatibilities with the upgraded MySQL server by doing the following things:

  • It upgrades the system tables in the mysql schema so that you can take advantage of new privileges or capabilities that might have been added.

  • It upgrades the Performance Schema and sys schema.

  • It examines user schemas.

mysql_upgrade determines whether a table has problems, such as incompatibilities caused by changes in the newer version, and attempts to resolve them by repairing the table. If that fails, your automated test must also fail and must not proceed to anything else; the issue has to be investigated first and repaired manually. An example invocation follows below.
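A minimal sketch of running it as part of the post-upgrade checks (the --force option re-checks all tables even when the version matches; note that MariaDB 10.4+ also ships the tool as mariadb-upgrade, and MySQL 8.0.16+ performs these steps automatically at server startup):

$ mysql_upgrade -u root -p --force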

Check error logs

Once mysql_upgrade is done, you need to check and verify the errors that occurred. You can script this and look for any "error" or "warning" labels in the error logs. It is very important to determine whether there are any. Your automated test must be able to trap errors: it can either wait for user input to continue if the error is minimal or expected, or otherwise stop the upgrade process.
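
A minimal sketch of such a check could look like the following (the log path is an example - adjust it to your log_error setting - and the list of known harmless patterns is something you maintain yourself):

## Fail the automated test if new errors or warnings show up in the error log
ERROR_LOG=/var/log/mysql/mysqld.log
if grep -iE "\[(ERROR|Warning)\]" "${ERROR_LOG}" | grep -v -f known_harmless_patterns.txt; then
  echo "Errors or warnings found in ${ERROR_LOG} - review before continuing"
  exit 1
fi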

Perform a unit test

TDD (Test Driven Development) is a software development approach where a series of test cases is validated and each validation is determined to be true (pass) or false (fail) - something like what we have in the screenshot below:

Image courtesy of guru99.com

This type of unit testing helps avoid unwanted bugs or logical errors in your application and in your database. Remember, if invalid data is stored in the database, it harms all the business analytics and transactions, especially when complex financial computations or mathematical equations are involved.

If you ask, is it really necessary to perform a unit test after the upgrade? Of course it is! You don't necessarily have to run this in the production environment. During the testing phases, i.e. when upgrading your QA, development, or staging environments first, it has to be applied there. The data has to be an exact copy, or at least close to the same, as in the production environment. Your goal here is to avoid unwanted results and, definitely, wrong logical results. You have to take good care of your data, of course, and determine whether the results pass the validation test.

If you intend to run this against production, then do it. However, do not be as rigid as in the testing phase applied to the QA, development, or staging environments, because you have to plan your time around the available maintenance window and avoid delays and a longer RTO.

In my experience, during the upgrade phase customers prefer a quicker approach focused on determining whether the important features still provide correct results. Moreover, you can script the tests of a set of business logic functions or stored procedures; as a side effect, this warms up the database by caching the queries.
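
As a rough sketch, such a scripted check could call a business function and compare the result against a known-good value (the procedure name, host, credentials and expected value below are purely hypothetical):

## Post-upgrade smoke test for one business function
EXPECTED="42.50"
ACTUAL=$(mysql -h staging-db -u tester -p"${TEST_PASSWORD}" -N -B \
  -e "SELECT billing.calculate_monthly_fee(1001);")
if [ "${ACTUAL}" != "${EXPECTED}" ]; then
  echo "FAIL: calculate_monthly_fee(1001) returned ${ACTUAL}, expected ${EXPECTED}"
  exit 1
fi
echo "PASS: calculate_monthly_fee(1001)"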

When preparing unit tests for your database, avoid reinventing the wheel. Instead, take a look at the available tools and check whether they fit your requirements and needs. Check out Selenium, or go check out this blog.

Verify identity of queries

The most common tool you can use is Percona's pt-upgrade. It verifies that query results are identical on different servers. It executes queries based on the given logs and the supplied connections (also called DSNs), then compares the results and reports any significant differences. It also offers several options to collect or analyze the queries, for example through tcpdump.

Using the pt-upgrade is easy. For example, you can run with the following command:

## Comparing via slow log for the given hosts
pt-upgrade h=host1 h=host2 slow.log

## or use fingerprints, useful for debugging purposes
pt-upgrade --fingerprints --run-time=1h mysqld-slow.log h=127.0.0.1,P=5091 h=127.0.0.1,P=5517

## or with tcpdump,
tcpdump -i eth0 port 3306 -s 65535  -x -n -q -tttt     \
  | pt-query-digest --type tcpdump --no-report --print \
  | pt-upgrade h=host1 h=host2

It is good practice, once an upgrade has been performed - especially a major release upgrade - to use pt-upgrade to run a query analysis and identify differences in the results. Do this during the testing phase, on your QA, staging, or development environments, so you can decide whether it is safe to proceed. You can add this to your automation tool and run it as a playbook once it is ready to perform its duty.

How to Automate the Testing Process?

In our previous blogs, we have presented different ways to automate your databases. The most common tools in vogue are the IaC (Infrastructure as Code) deployment tools. You can use Puppet, Chef, SaltStack, or Ansible to do the job.

My preference has always been Ansible for automated testing; it allows me to create playbooks organized by job role. Of course, I cannot create one single automation that will do everything, because situations and environments vary. Based on the upgrade types described earlier (major vs minor), you should keep their processes distinct. Even if it is just an in-place upgrade, you still have to make sure that your playbooks perform the correct job.

ClusterControl is Your Database Automation Friend!

ClusterControl is a good option for basic, automated testing. ClusterControl is not a testing framework and it is not a unit testing tool; it is a database management and monitoring tool that incorporates a lot of automated procedures triggered on request by the user or administrator of the software.

ClusterControl offers minor version upgrades, which is convenient for DBAs performing upgrades. It also runs mysql_upgrade on the fly, so you do not need to do it manually. ClusterControl also detects new versions available for upgrade and recommends the next steps for you. If a failure is encountered, the upgrade will not proceed.

Here's an example of the minor upgrade job:

If you look carefully, mysql_upgrade runs successfully. Note that ClusterControl does not recommend or perform an automatic upgrade of the master, because that is not the right approach. In that case, you have to promote a slave to be the new master, then demote the old master to a slave in order to upgrade it.

Conclusion

The great thing about ClusterControl is that you can incorporate error log checks, unit tests, and query identity verification by creating Advisors. It's not difficult to do so - you can refer to our previous blog, Using ClusterControl Advisor to Create Checks for SELinux and Meltdown/Spectre: Part One. It exemplifies how you can take advantage of Advisors and trigger the next job once the upgrade has been executed. ClusterControl also has built-in alerts and alarms that can integrate with your favorite third-party alert systems to notify you of the current status of your automated testing.

Automated Testing of the Upgrade Process for MySQL/MariaDB/Percona Server


Upgrades are always a hard and time-consuming task. First, you should test your application in a test environment, so, ideally, you will need to clone your current production environment for this. Then, you need to make a plan to perform the upgrade which, depending on the business, could be with zero downtime (or almost zero), or could require scheduling a maintenance window to make sure that, if something goes wrong, it affects as little as possible.

If you want to do all these things manually, there is a big chance of human error and the process will be slow. In this blog, we will see how to automate testing for upgrading your MySQL, MariaDB, or Percona Server databases using ClusterControl.

Type of Upgrades

There are two types of upgrades: Minor Upgrades and Major Upgrades.

Minor Upgrades

The first one, the Minor Upgrade, is the most common and safest upgrade, and in most cases it is performed in place. As nothing is 100% secure, you must always have backups and replication slave nodes, so if something goes wrong with the upgrade and for some reason you can’t rollback/downgrade, you can promote a slave node and your systems can still work without interruption.

You can perform this kind of upgrade using ClusterControl. For this, go to ClusterControl -> Select the Cluster -> Manage -> Upgrades.

On each selected node, the upgrade procedure will:

  • Stop Node

  • Upgrade Node

  • Start Node

The Master node in a Replication Topology won’t be upgraded. To upgrade the Master, another node must be promoted to become the new Master first.

Major Upgrades

For Major Upgrades, an in-place upgrade is not recommended, as the risk of something going wrong is too high for a production environment. Instead, you can clone your current database cluster, test your application there, and when you are done, re-create the cluster or even create a new cluster on the new version and switch the traffic over when it is ready. There are different approaches to these upgrades: you can upgrade the nodes one by one, or create a separate cluster replicating the traffic from the current one; you can also use load balancers to improve High Availability, and there are more options. The best approach depends on the downtime tolerance and the Recovery Time Objective (RTO).

You can’t perform Major Upgrades with ClusterControl directly, because, as we mentioned, you need to test everything first, to make sure that the upgrade is safe, but you can use different ClusterControl features to make this task easier. So let’s see some of these features.

Backups

Backups are a must before any upgrade. A good backup policy can avoid big issues for the business. So, let’s see how ClusterControl can automate this.

Creating a Backup

Go to ClusterControl -> Select the Cluster -> Backup -> Create Backup.

You can create a new backup or configure a scheduled one.

You can choose different backup methods, depending on the database technology, and, in the same section, you can choose the server from which to take the backup, where you want to store the backup, and if you want to upload the backup to the cloud (AWS, Azure, or Google Cloud) in the same job.

You can also compress and encrypt your backup, and specify the retention period, among other options.

On the backup section, you can see the progress of the backup, and information like the method, size, location, and more.

Deploying a Test Environment

You don’t need to create everything from scratch for this. Instead, you can use ClusterControl to do it in a manual or automated way.

Restore Backup on Standalone Host

In the Backup section, you can choose the option “Restore and verify on standalone host” to restore a backup in a separate node.

Here you can specify if you want ClusterControl to install the software in the new node, and disable the firewall or AppArmor/SELinux (depending on the OS). For this, you need a dedicated host (or VM) that is not part of the cluster.

You can keep the node up and running, or ClusterControl can shutdown the database service until the next restore job. When it finishes, you will see the restored/verified backup in the backup list marked with a tick.

If you don’t want to do this task manually, you can schedule this process using the Verify Backup Feature, to repeat this job periodically in a Backup Job. We are going to see how to do this in the next section.

Automatic ClusterControl Backup Verification

To automate this task, go to ClusterControl -> Select your Cluster -> Backup -> Create Backup, and choose the Scheduled Backup option.

The automatic Verify Backup feature is only available for scheduled backups, and the process is the same that we described in a previous section. In the second step, make sure you have enabled the Verify Backup option, and complete the required information.

When the job is finished, you can see the verification icon in the ClusterControl Backup section, the same that you will have by doing the verification in the manual way, with the difference that you don’t need to worry about the restoration task. ClusterControl will restore the backup every time automatically, and you can test your application with the most recent data.

Autorecovery and Failover

With the Autorecovery feature enabled, in case of failure ClusterControl will promote the most advanced slave node to master and notify you of the problem. It also fails over the rest of the slave nodes to replicate from the new master server.

If there are Load Balancers in the topology, ClusterControl will reconfigure them to apply the topology changes.

You can also run a Failover manually if needed. Go to ClusterControl -> Select the Cluster -> Nodes -> Select the Node to be promoted -> Node Actions -> Promote Slave.

In this way, if something goes wrong during the upgrade, you can use ClusterControl to fix it ASAP.

Automating Things with ClusterControl CLI

ClusterControl CLI, also known as s9s, is a command-line tool introduced in ClusterControl version 1.4.1 to interact, control, and manage database clusters using the ClusterControl system. ClusterControl CLI opens a door for cluster automation where you can easily integrate it with existing deployment automation tools like Ansible, Puppet, Chef, etc. Let’s see now some examples of this tool.

Upgrade

$ s9s cluster --cluster-id=19 \
--check-pkg-upgrades \
--log
$ s9s cluster --cluster-id=19 \
--available-upgrades \
--nodes='10.10.10.146' \
--log \
--print-json
$ s9s cluster --cluster-id=19 \
--upgrade-cluster \
--nodes='10.10.10.146' \
--log

Create Backup

$ s9s backup --create \
--backup-method=mysqldump \
--cluster-id=2 \
--nodes=10.10.10.146:3306 \
--on-controller \
--backup-directory=/storage/backups \
--log

Restore Backup

$ s9s backup --restore \
--cluster-id=19 \
--backup-id=3 \
--wait

Verify Backups

$ s9s backup --verify \
--backup-id=3 \
--test-server=10.10.10.151 \
--cluster-id=19 \
--log

Promote Slave Node

$ s9s cluster --promote-slave \
--cluster-id=19 \
--nodes='10.10.10.146' \
--log

Conclusion

Upgrades are necessary but time-consuming tasks. Deploying a test environment every time you need to upgrade could be a nightmare, and it is hard to keep it up-to-date without any automation tool.

ClusterControl allows you to perform minor upgrades or even deploy the test environment to make the upgrade task easier and safer. You can also integrate it with different automation tools like Ansible, Puppet, and more.

Database Automation Best Practices in FinTech


Automation equals speed - there is no discussion about it. Automation also increases reliability and reduces the error rate. Those traits are important in pretty much every kind of industry, but they are crucial for some, and the FinTech industry is among them. Operating in a highly competitive market, you want to be able to deploy fast and reliably and to control your database environment to the maximum level. Relying on external cloud providers is very easy, but it comes with a high cost. What you may want to do instead is to build an environment tailored to your particular needs, which can be used to achieve the velocity and reliability that you require. Let’s take a look at how ClusterControl can help you accomplish this goal.

Database Deployment Speed

One way to increase the velocity of your internal processes related to the database tier is to let users do what they need, whenever they need it. There are many internal projects running at the same time, distributed across numerous teams - sometimes even shared across them, as multiple teams may be assigned to the same project. A single operations team won’t be able to reply to all of the requests, take care of all of the needs, or help to design all of the environments that are required at a given moment. What can be done instead is to build an infrastructure that allows the other teams to take care of their own needs related to the database environment.

ClusterControl is a platform that can be used to handle the whole lifecycle of open source databases. This fits very nicely into the FinTech industry, given the significant trend of migrating from proprietary databases to open source ones. PostgreSQL, for example, is one of the common picks among financial organizations. ClusterControl can be used, among other things, to deploy open source databases and, as such, it is a good core around which you can build a self-service solution for your company. There are a couple of ways in which you can integrate ClusterControl with your tools. Outside of the UI, ClusterControl comes with an API that can be used to execute RPC calls and perform the required tasks. It is also possible to use the ClusterControl command line interface to perform the actions required at a given moment.

ClusterControl comes with an extensive set of features, and it could be that not all of them are needed by people who just want to have their databases up and running and will not be managing those databases themselves. It could be wise to build an application on top of ClusterControl that exposes only the most important features of the platform, smoothing the learning curve and letting your teams get their databases as quickly as possible. In the background, your operations team will be working tirelessly with ClusterControl to ensure that the databases are properly managed and maintained and that backups are being executed. Should a node fail, ClusterControl performs the recovery.

FinTech Data Security

FinTech is all about security - whenever Personally Identifiable Information is stored or processed, especially if it is related to financial accounts, security is paramount. Having a standardized way of dealing with databases can be of significant benefit. You want to ensure that the deployments you perform are done in the same way, and you want to be able to change the configuration across your databases easily should you need to. Automation can be of great help here - having a set of scripts and tools to deal with the deployments and lifecycle management of the databases can be crucial in complying with the security standards and requirements that apply to the FinTech industry or to PII in general. What would also be very useful is a piece of code plugged into the deployment process for staging or test servers that obfuscates any sensitive data into a form that can be used outside of production, in less strictly secured environments. Please keep in mind that staging setups typically have reduced security measures, at least related to access - you may want developers accessing staging servers while you don’t want them accessing production systems. If that’s the case, you probably don’t want to let developers access sensitive data, hence the need to obfuscate it.

Another important aspect is the access itself. You may want to automate security audits. Among things to look at would be:

  1. Are the users defined in the database ones who should be defined? Compare what you see in the database with what you should see there.

  2. Do the users defined in the database have the proper access privileges? Are the privileges limited to what is required for a given user?

  3. Do the users have defined access from a proper set of hosts? We want them to access the database only from the hosts that make sense (application hosts, loadbalancers and so on) instead of allowing access from everywhere.

  4. Are the security settings in the database configured properly? In some cases, you can configure your database to enforce password rotation, define specific requirements for the password string and so on.

Those are some suggestions related to the database configuration. Aside from this, you probably want to include security checks on the operating system, network settings checks (for example, firewall rules) and analysis of the audit log contents to detect malicious activity.
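
To illustrate the first couple of checks on the list above, a minimal SQL sketch could simply dump the accounts and their grants so that the output can be compared against your approved access list (the account name and host pattern below are just examples):

-- List the accounts defined on the server and the hosts they may connect from
SELECT user, host FROM mysql.user ORDER BY user, host;

-- Review the privileges of a specific application account
SHOW GRANTS FOR 'app_user'@'10.0.1.%';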

Database Backups

Backups are crucial for any kind of serious business, not just FinTech; they are also the most commonly automated activity. Write a shell script that does the job, set it up in the crontab for scheduled execution, and that’s it. The commonly missed step in such a setup is that a non-tested backup is as good as none - a backup can only be marked as working if you have tested it. Backups can be corrupted or unrestorable, and if you don’t catch this before you actually need the backup, you can run into a serious situation. This is why backup automation has to involve a test restore: restore the backup on a separate host, make sure the database can be started, and then see if you can set it up as part of the replication topology and catch up with the rest of the cluster.

When designing automated backup verification, please make sure you test all the steps you would have to execute during a regular restore. If you use encryption and the key to decrypt backups is stored in a safe location, make sure it is actually involved in the backup verification - see if you can decrypt and decompress the backup. It is paramount to test the backup after you implement any kind of change in the backup process. If you add a new step to your backup process or change the backup tool’s configuration, make sure you test at least several backups before you decide that the backup process is working properly.
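
As an example of what such a verification could automate, assuming an encrypted, compressed logical backup (all paths, hosts and credentials below are placeholders):

## Decrypt, test and restore the backup on a dedicated verification host
openssl enc -d -aes-256-cbc -pass file:/etc/backup/backup.key \
  -in /backups/db_backup.sql.gz.enc -out /tmp/db_backup.sql.gz
gunzip -t /tmp/db_backup.sql.gz
zcat /tmp/db_backup.sql.gz | mysql -h restore-test-host -u restore_user -p"${RESTORE_PASSWORD}"
mysqlcheck --all-databases -h restore-test-host -u restore_user -p"${RESTORE_PASSWORD}"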

High Availability

As you can imagine, the FinTech industry, even more than your average online industry, requires databases to stay available even if there are issues. We, as users, prefer to see financial organizations as stable and available. We, after all, trust them with our money. It wouldn’t be nice to experience downtime that affects our ability to use our credit cards or access bank accounts. This is why the high availability of databases is so important in the financial industry.

We want to see in the high availability department the automated failover for the asynchronous replication topologies. The idea is simple - in an asynchronous replication environment, you have one writer (master) and one or more readers (replicas). Should the master crash, one of the replicas should be promoted to take over its role. Several safety checks should be performed on the master candidate; once all is clear, failover should be triggered.

Let’s take a look at the process:

We have here a master and two replicas ready to be promoted. Let’s assume that the master crashes and becomes unavailable.

The moment the master becomes unavailable, automated scripts will ensure that the replicas in the topology will catch up and replay any missing transactions. As a next step, replicas will undergo sanity checks to determine the best master candidate. Finally, a new master should be promoted and take over writes.
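
The sanity checks themselves usually boil down to inspecting the replication status on each replica, roughly like this:

-- Run on each replica before choosing the master candidate
SHOW SLAVE STATUS\G
-- Fields worth checking: Slave_IO_Running, Slave_SQL_Running, Seconds_Behind_Master
-- and, with GTID enabled, Retrieved_Gtid_Set versus Executed_Gtid_Set.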

Database automation is a large topic, and this blog is just scratching its surface. If you are interested in more details, please take a look at these case studies on how ClusterControl helps the FinTech industry. If you would like to share your experience, we’d love to hear from you in the comments below.

New User and LDAP Management in ClusterControl 1.8.2


After upgrading to ClusterControl 1.8.2, you should get the following notification banner:

What's up with that? It is a deprecation notice for the current user management system in favor of the new user management system handled by the ClusterControl controller service (cmon). When clicking on the banner, you will be redirected to the user creation page to create a new admin user, as described in this user guide.

In this blog post, we are going to look into the new user management system introduced in ClusterControl 1.8.2 and see how it differs from the previous one. Just for clarification, the old user management system will still work side-by-side with the new user authentication and management system until Q1 2022. From now on, all new installations of ClusterControl 1.8.2 and later will be configured with the new user management system.

User Management pre-1.8.2

ClusterControl 1.8.1 and older stores the user information and accounting inside a web UI database called "dcps". This database is independent of the cmon database that is used by the ClusterControl Controller service (cmon).

User Accounts and Authentication

A user account consists of the following information:

  • Name

  • Timezone

  • Email (used for authentication)

  • Password

  • Role

  • Team

 

One would use an email address to log in to the ClusterControl GUI, as shown in the following screenshot:

Once logged in, ClusterControl will look up the organization the user belongs to and then apply role-based access control (RBAC) for access to specific clusters and functionalities. A team can have zero or more clusters, while a user must belong to one or more teams. Creating a user requires a role and a team to be created beforehand. ClusterControl comes with a default team called Admin, and 3 default roles - Super Admin, Admin and User.

Permission and Access Control

ClusterControl 1.8.1 and older used UI-based access control based on role assignment - in other words, role-based access control (RBAC). The administrator would create roles, and every role would be assigned a set of permissions to access certain features and pages. The role enforcement happens on the front-end side, so the ClusterControl controller service (cmon) has no idea whether the active user is allowed to access a functionality, because this information is never shared between the two authentication engines. This makes authentication and authorization more difficult to control going forward, especially when adding more features that are compatible with both the GUI and CLI interfaces.

The following screenshot shows the available features that can be controlled via RBAC:

The administrator just needs to pick the relevant access level for specific features; it will be stored inside the "dcps" database and then used by the ClusterControl GUI to grant UI resources to the GUI users. The access list created here has nothing to do with CLI users.

LDAP

ClusterControl 1.8.1 and older used the PHP LDAP module for LDAP authentication. It supports Active Directory, OpenLDAP and FreeIPA directory services, but only a limited number of LDAP attributes can be used for user identification, such as uid, cn or sAMAccountName. The implementation is fairly straightforward and does not support advanced user/group base filtering, attribute mapping or TLS.

The following are the information needed for LDAP settings:

Since this is a frontend service, the LDAP log file is stored under the web app directory, specifically at /var/www/html/clustercontrol/app/log/cc-ldap.log. An authenticated user will be mapped to a particular ClusterControl role and team, as defined in the LDAP group mapping page.

User Management post-1.8.2

In this new version, ClusterControl supports both authentication handlers, the frontend authentication (using email address) and backend authentication (using username). For the backend authentication, ClusterControl stores the user information and accounting inside the cmon database that is used by the ClusterControl Controller service (cmon).

User Accounts and Authentication

A user account consists of the following information:

  • Username (used for authentication)

  • Email address

  • Full name

  • Tags

  • Origin

  • Disabled

  • Suspend

  • Groups

  • Owner

  • ACL

  • Failed logins

  • CDT path

Compared to the old implementation, the new user management holds more information about a user, which allows more complex user account manipulation and better access control with enhanced security. The user authentication process is now protected against brute-force attacks, and an account can be deactivated for maintenance or security reasons.

One would use an email address or username to log in to the ClusterControl GUI, as shown in the following screenshot (pay attention to the placeholder text for Username field):

If the user logs in using an email address, the user will be authenticated via the deprecated frontend user management service; if a username is supplied, ClusterControl will automatically use the new backend user management service handled by the controller service. Both authentications work with two different sets of user management interfaces.
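
As a quick illustration, a backend user could be created from the ClusterControl CLI along these lines (the username, group, password and e-mail address are just examples - see the s9s documentation for the full set of options):

$ s9s user --create \
  --group=admins \
  --generate-key \
  --new-password=MySecretPassw0rd \
  --email-address=dba@example.com \
  dba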

Permission and Access Control

In the new user management, permissions and access control are handled through Access Control Lists (ACLs), expressed in text form as read (r), write (w), and execute (x). All ClusterControl objects and functionalities are structured as part of a directory tree - we call this the CMON Directory Tree (CDT) - and each entry is owned by a user and a group and has an ACL. You can think of it as similar to Linux file and directory permissions. In fact, the ClusterControl access control implementation follows the standard POSIX Access Control Lists.

To put this into an example, consider the following commands. We retrieved the CMON Directory Tree (CDT) value for our cluster using the "s9s tree" command (imagine it as ls -al in UNIX). In this example, our cluster name is “PostgreSQL 12”, as shown below (indicated by the "c" at the beginning of the line):

$ s9s tree --list --long
MODE        SIZE OWNER                      GROUP  NAME
crwxrwx---+    - system                     admins PostgreSQL 12
srwxrwxrwx     - system                     admins localhost
drwxrwxr--  1, 0 system                     admins groups
urwxr--r--     - admin                      admins admin
urwxr--r--     - dba                        admins dba
urwxr--r--     - nobody                     admins nobody
urwxr--r--     - readeruser                 admins readeruser
urwxr--r--     - s9s-error-reporter-vagrant admins s9s-error-reporter-vagrant
urwxr--r--     - system                     admins system
Total: 22 object(s) in 4 folder(s).

Suppose we have a read-only user called readeruser, and this user belongs to a group called readergroup. To assign read permission to readeruser and readergroup on our CDT path “/PostgreSQL 12” (it always starts with a “/”, similar to UNIX), we would run:

$ s9s tree --add-acl --acl="group:readergroup:r--" "/PostgreSQL 12"
Acl is added.
$ s9s tree --add-acl --acl="user:readeruser:r--" "/PostgreSQL 12"
Acl is added.

Now the readeruser can access the ClusterControl via GUI and CLI as a read-only user for a database cluster called "PostgreSQL 12". Note that the above ACL manipulation examples were taken from the ClusterControl CLI, as described in this article. If you connect through ClusterControl GUI, you would see the following new access control page:

The ClusterControl GUI provides a simpler way of handling access control. It provides a guided approach to configuring permissions, ownership and groupings. Similar to the older version, every cluster is owned by a team, and you can grant another team read or admin access, or forbid it from accessing the cluster, from both the ClusterControl GUI and CLI interfaces.

LDAP

In the previous versions (1.8.1 and older), LDAP authentication was handled by the frontend component through a set of tables (dcps.ldap_settings and dcps.ldap_group_roles). Starting from ClusterControl 1.8.2, all LDAP configurations and mappings will be stored inside this configuration file, /etc/cmon-ldap.cnf. 

It is recommended to configure LDAP settings and group mappings via the ClusterControl UI, because any change to this file requires a reload of the controller process, which is triggered automatically when configuring LDAP via the UI. You may also modify the file directly, but then you have to reload the cmon service manually with the following command:

$ systemctl restart cmon # or service cmon restart

The following screenshot shows the new LDAP Advanced Settings dialog:

If compared to the previous version, the new LDAP implementation is more customizable to support industry-standard directory services like Active Directory, OpenLDAP and FreeIPA. It also supports attribute mappings so you can set which attribute represents a value that can be imported into the ClusterControl user database like email, real name and username.

For more information, check out the LDAP Settings user guide.

Advantages of the New User Management

Note that the current user management is still working side-by-side with the new user management system. However, we highly recommend our users to migrate to the new system before Q1 2022. Only manual migration is supported at the moment. See Migration to the New User Management section below for details.

The new user management system will benefit ClusterControl users in the following ways:

  • Centralized user management for ClusterControl CLI and ClusterControl GUI. All authentication, authorization, and accounting will be handled by the ClusterControl Controller service (cmon).

  • Advanced and customizable LDAP configuration. The previous implementation supported only a limited number of username attributes and had to be configured in a particular way to work properly.

  • The same user account can be used to authenticate to the ClusterControl API securely via TLS. Check out this article for example.

  • Secure user authentication methods. The new native user management supports user authentication using both private/public keys and passwords. For LDAP authentication, the LDAP bindings and lookups are supported via SSL and TLS.

  • A consistent view of time representation based on the user's timezone setting, especially when using both CLI and GUI interface for database cluster management and monitoring.

  • Protection against brute force attacks, where a user can be denied access to the system via suspension or disabled logins.

Migration to the New User Management

Since both user systems have different user account structures, automating the user migration from frontend to backend would be a very risky operation. Therefore, you must perform the account migration manually after upgrading from 1.8.1 or older. Please refer to Enabling New User Management for details. For existing LDAP users, please refer to the LDAP Migration Procedure section.

We highly recommend users to migrate to this new system for the following reasons:

  • The UI user management system (where a user would log in using an email address) will be deprecated by the end of Q1 2022 (~1 year from now).

  • All upcoming features and improvements will be based on the new user management system, handled by the cmon backend process.

  • It is counter-intuitive to have two or more authentication handlers running on a single system.

If you are facing problems or require assistance with the migration and implementation of the new ClusterControl user management system, do not hesitate to reach out to us via the support portal, community forum or Slack channel.

Final Thoughts

ClusterControl is evolving into a more sophisticated product over time. To support that growth, we have to introduce major changes for a richer experience in the long run. Expect more features and improvements to the new user management system in the upcoming versions!

Understanding Indexes in MySQL: Part One


Indexes in MySQL are a very complex beast. We have covered MySQL indexes in the past, but we have never taken a deeper dive into them - we will do that in this series of blog posts. This blog post should act as a very general guide to indexes, while the other parts of this series will dive a little bit deeper into these subjects.

What are Indexes?

In general, as already noted in a previous blog post about indexes, an index is an alphabetical list of records with references to the pages on which they are mentioned. In MySQL, an index is a data structure that is most commonly used to quickly find rows. You might also hear the term “keys” - it refers to indexes too.

What do Indexes Do?

In MySQL indexes are used to quickly find rows with specific column values and to prevent reading through the entire table to find any rows relevant to the query. Indexes are mostly used when the data stored in a database system (for example, MySQL) gets bigger because the larger the table, the bigger the probability that you might benefit from indexes.

MySQL Index Types

As far as MySQL is concerned, you might have heard about it having multiple types of indexes (a short example of each follows the list):

  • A B-Tree INDEX - such an index is frequently used to speed up SELECT queries matching a WHERE clause. Such an index can be used on fields where values do not need to be unique, it also accepts NULL values.

  • A FULLTEXT INDEX - such an index is used for full-text search capabilities. This type of index finds keywords in the text instead of directly comparing values to the values in the index.

  • A UNIQUE INDEX is frequently used to prevent duplicate values in a column - it enforces the uniqueness of row values.

  • A PRIMARY KEY is also an index - it’s frequently used together with fields having an AUTO_INCREMENT attribute. This type of index does not accept NULL values and once set, the values in the column which has a PRIMARY KEY cannot be changed.

  • A DESCENDING INDEX is an index that stores rows in a descending order. This type of index was introduced in MySQL 8.0 - MySQL will use this type of an index when a descending order is requested by the query.
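
Here is a short, hypothetical example showing most of these index types in a single table definition (the table and column names are made up for illustration):

CREATE TABLE demo_articles (
    id INT NOT NULL AUTO_INCREMENT,
    author VARCHAR(100),
    email VARCHAR(255) NOT NULL,
    body TEXT,
    PRIMARY KEY (id),                  -- PRIMARY KEY index
    INDEX idx_author (author),         -- plain B-Tree index
    UNIQUE INDEX idx_email (email),    -- UNIQUE index
    FULLTEXT INDEX idx_body (body)     -- FULLTEXT index
) ENGINE=InnoDB;

-- A descending index (MySQL 8.0 and later)
CREATE INDEX idx_author_desc ON demo_articles (author DESC);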

Choosing Optimal Data Types for Indexes in MySQL

As far as indexes are concerned, keep in mind that MySQL supports a wide variety of data types and that some data types cannot be used together with certain kinds of indexes (for example, FULLTEXT indexes can only be used on text-based (CHAR, VARCHAR or TEXT) columns - they cannot be used on any other data types). So, before actually choosing the indexes for your database design, decide on the data type you are going to use for the column in question (decide what kind of data class you are going to store: numbers? string values? both?), then decide on the range of values you are going to store (choose one that you don’t think you will exceed, because increasing the data type range can be a time-consuming task later on - we recommend opting for a simple data type). Finally, if you do not intend to use NULL values in your columns, specify your fields as NOT NULL whenever you can - when a nullable column is indexed, it requires an extra byte per entry.

Choosing Optimal Character Sets and Collations for Indexes in MySQL

Aside from data types, also keep in mind that each character in MySQL takes up space. For example, UTF-8 characters may take anywhere between 1 and 4 bytes each, so you might want to avoid indexing, for example, 255 characters and only use, say, 50 or 100 characters for a certain column.

The Benefits and Drawbacks of Using Indexes in MySQL

The main benefit of using indexes in MySQL is the increased performance of search queries matching a WHERE clause - indexes speed up SELECT queries matching a WHERE clause because MySQL doesn’t read through the entire table to find rows relevant to the query. However, bear in mind that indexes have their own drawbacks. The main ones are as follows:

  • Indexes consume disk space.

  • Indexes degrade the performance of INSERT, UPDATE and DELETE queries - when data is updated, the index needs to be updated together with it.

  • MySQL does not protect you from using multiple types of indexes on the same column. In other words, you can have a PRIMARY KEY, an INDEX and a UNIQUE INDEX on the same column - MySQL does not protect you from making such a mistake.

If you suspect that some of your queries are becoming slower, consider taking a look into the Query Monitor tab of ClusterControl - by enabling the query monitor you can see when a certain query was last seen and its maximum and average execution time which can help you to choose the best indexes for your table.

How to Choose the Best Index to Use?

To choose the best index to use, you can use MySQL’s built-in mechanisms. For example, you can use the query explainer - the EXPLAIN query. It will explain what table is used, whether it has partitions or not, which indexes could be used and what key (index) is used. It will also return the index length and the number of rows your query returns:

mysql> EXPLAIN SELECT * FROM demo_table WHERE demo_field = 'demo'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: demo_table
   partitions: NULL
         type: ref
possible_keys: demo_field
          key: demo_field
      key_len: 1022
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

In this case, keep in mind that indexes are frequently used to help MySQL efficiently retrieve data when data sets are larger than usual. If your table is small, you might not need to use indexes, but if you see that your tables are getting bigger and bigger, chances are you might benefit from an index.

In order to choose the best index for your specific scenario, though, bear in mind that indexes can be a leading cause of performance problems too. Whether MySQL will effectively use the indexes or not depends on a couple of factors, including the design of your queries, the indexes in use, the types of those indexes, your database load at the time the query is executed, and other things. Here are a couple of things to consider when using indexes in MySQL:

  • How much data do you have? Perhaps some of it is redundant?

  • What queries do you use? Would your queries use LIKE clauses? What about ordering?

  • What kind of an index would you need to use to improve the performance of your queries?

  • Would your indexes be large or small? Would you need to use an index on a prefix of the column to make its size smaller?

It is worth noting that you should probably avoid using multiple types of indexes (e.g a B-Tree index, a UNIQUE INDEX and a PRIMARY KEY) on the same column too.

Improving Query Performance with Indexes

To improve query performance with indexes, you need to take a look at your queries - the EXPLAIN statement can help with that. In general, here’s a couple of things you should consider if you want your indexes to improve the performance of your queries:

  • Only ask the database for what you need. In most cases, using SELECT column will be faster than using SELECT * (that is the case without using indexes too)

  • A B-tree index might be a fit if you search for exact values (e.g. SELECT * FROM demo_table WHERE some_field = ‘x’) or if you want to search for values using wildcards (e.g. SELECT * FROM demo_table WHERE some_field LIKE ‘demo%’). In the latter case, bear in mind that LIKE queries with a wildcard at the beginning might do more harm than good - avoid LIKE queries with a percentage sign in front of the text you’re searching for, because then MySQL might not use an index since it doesn’t know what the row value begins with. Also keep in mind that a B-tree index can be used for column comparisons in expressions that use the equal (=), more than (>), more than or equal to (>=), less than (<), less than or equal to (<=) or BETWEEN operators.

  • A FULLTEXT index might be a fit if you find yourself using full-text (MATCH ... AGAINST()) search queries or if your database is designed in such a way that only uses text-based columns - FULLTEXT indexes can use TEXT, CHAR or VARCHAR columns, they cannot be used on any other types of columns.

  • A covering index might be of use if you want to run queries without additional I/O reads on big tables. To create a covering index, make sure the index covers the columns used in the WHERE, GROUP BY and SELECT clauses of the query.

We will further look into the types of indexes in the upcoming parts of this blog series, but in general, if you use queries like SELECT * FROM demo_table WHERE some_field = ‘x’ a B-tree INDEX might be a fit, if you use MATCH() AGAINST() queries you should probably look into a FULLTEXT index, if your table has very long row values, you should probably look into indexing a part of the column.

How Many Indexes Should You Have?

If you ever used indexes to improve the performance of your SELECT queries, you have probably asked yourself a question: how many indexes should you actually have? In order to understand this, you need to keep the following things in mind:

  1. Indexes are usually the most effective with big amounts of data.

  2. MySQL uses only one index per SELECT statement in a query (subqueries are seen as separate statements) - use the EXPLAIN query to find out which indexes are the most effective for the queries you use.

  3. Indexes should make all of your SELECT statements fast enough without compromising too much on disk space - “fast enough”, however, is relative so you would need to experiment.

Indexes and Storage Engines

When dealing with indexes in MySQL, also keep in mind that there might be some kinds of limitations if you use various engines (for example if you use MyISAM as opposed to InnoDB). We will go into more detail in a separate blog, but here are some ideas:

  • The maximum number of indexes per MyISAM and InnoDB table is 64; the maximum number of columns per index in both storage engines is 16.

  • The maximum key length for InnoDB is 3500 bytes - the maximum key length for MyISAM is 1000 bytes.

  • The fulltext indexes have limitations in certain storage engines - for example, the InnoDB fulltext indexes have 36 stopwords, MyISAM stopword list is a little bit bigger with 143 stopwords. InnoDB derives these stopwords from the innodb_ft_server_stopword_table variable while MyISAM derives these stopwords from the storage/myisam/ft_static.c file - all words that are found in the file will be treated as stopwords.

  • MyISAM was the only storage engine with support for full-text search until MySQL 5.6 (MySQL 5.6.4 to be exact) came around, meaning that InnoDB has supported full-text indexes since MySQL 5.6.4. When a FULLTEXT index is in use, it finds keywords in the text instead of comparing values directly to the values in the index.

  • Indexes play a very important role for InnoDB - InnoDB locks rows when it accesses them, so a reduced number of rows InnoDB accesses can reduce locks.

  • MySQL allows you to use duplicate indexes on the same column.

  • Certain storage engines have certain default types of indexes (e.g for the MEMORY storage engine the default index type is hash)

Summary

In this part about indexes in MySQL, we have gone through some general things related to indexes in this relational database management system. In the upcoming blog posts we will go through some more in-depth scenarios of using indexes in MySQL including the usage of indexes in certain storage engines etc. - we will also explain how ClusterControl can be used to achieve your performance goals in MySQL.

Understanding Indexes in MySQL: Part Two


This blog post is the second part in our series of blogs about indexes in MySQL. In the first part we covered quite a lot of things, including what indexes are, what they do, what their types are, how to choose optimal data types and character sets for the columns you index, the benefits and drawbacks of using indexes in MySQL, how to choose the best index to use, how to improve query performance and make sure your indexes are actually used by MySQL, and how many indexes you should have; we also went through some considerations related to storage engines. This blog post will go into more detail regarding some of the content we talked about in the first part of the series. We will start with the correlation between indexes and storage engines in MySQL.

Indexes and Storage Engines in MySQL

As we have already mentioned in a previous blog post, there might be certain limitations to indexes and other things if you use certain storage engines in MySQL. Here are some of them - we will first define what they are (some were covered in the first part of the blog series, so if we’re missing something, it’s probably in there), then cover them with a more in-depth analysis:

  • As per the MySQL documentation, the maximum number of indexes, the maximum key length and the maximum index length are defined per storage engine. As we have already mentioned in a previous blog post, the maximum number of indexes per MyISAM and InnoDB table is 64, the maximum number of columns per index in both storage engines is 16, the maximum key length for InnoDB is 3500 bytes and the maximum key length for MyISAM is 1000 bytes.

  • You cannot use CREATE INDEX to create a PRIMARY KEY - use ALTER TABLE instead.

  • BLOB and TEXT columns can be indexed only for tables running the InnoDB, MyISAM and BLACKHOLE storage engines.

  • If you only index a prefix of the column, keep in mind that the prefix support and their length are also dependent on storage engines. A prefix can be up to 767 bytes long for InnoDB tables that use the REDUNDANT or COMPACT row format, but for DYNAMIC or COMPRESSED row formats the prefix length limit is increased to 3072 bytes. For MyISAM tables, the prefix length limit is 1000 bytes. The NDB storage engine does not support prefixes at all.

  • If strict SQL mode is enabled and the index prefix exceeds the maximum column data type size, CREATE INDEX throws an error. If strict SQL mode is not enabled, the prefix is truncated to the maximum size and CREATE INDEX produces a warning; if a UNIQUE INDEX is being created, an error occurs regardless.

  • In general, MySQL only allows you to use up to 16 columns in a single index.

  • If you are using a PRIMARY KEY index, you can only have one primary key per table. FULLTEXT, UNIQUE INDEXes, and INDEXes do not have this limitation.

  • If you are using FULLTEXT indexes, bear in mind that they can only be used with the InnoDB or MyISAM storage engines and on CHAR, VARCHAR or TEXT columns. Also keep in mind that MySQL only uses FULLTEXT indexes when MATCH() ... AGAINST() clauses are used, that you can have a regular index and a fulltext index on the same column at the same time if you so desire, and that FULLTEXT indexes have their own sets of stopwords, specific to each storage engine.

  • B-Tree indexes may be useful if you use LIKE queries that begin with a wildcard, but only in certain scenarios.

Knowing these index limitations should prove useful if you are trying to understand how indexes in MySQL work. What’s even more important to understand, though, is that you must verify that your indexes are actually used by MySQL. We touched on this briefly in the first part of this series (“How to Choose the Best Index to Use?”), but we haven’t shown how to verify it. To do that, use EXPLAIN - when EXPLAIN is used together with an explainable statement, MySQL displays information from the optimizer about the execution plan of the statement.

PRIMARY KEY Considerations

Some of the basic considerations relating to PRIMARY KEY indexes in MySQL include the fact that they are primarily used to uniquely identify records in a table and are frequently used with AUTO_INCREMENTing values, meaning that they can be very useful if you are creating, say, ID fields. PRIMARY KEY fields must contain unique values and cannot contain NULL values.
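
A typical, minimal example (the table name is hypothetical) looks like this:

CREATE TABLE demo_users (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    username VARCHAR(64) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Remember: a PRIMARY KEY cannot be added with CREATE INDEX;
-- on an existing table use ALTER TABLE demo_users ADD PRIMARY KEY (id);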

Matching a Column Prefix

Indexes can also match a column prefix. This approach to indexes can be useful if your columns are string columns and you think that adding an index on the whole column would potentially consume a lot of disk space. Your indexes can match a column prefix like so:

ALTER TABLE demo_table ADD INDEX index_name(column_name(length));

The above query would add an index named index_name on the column column_name, covering only a prefix of the defined length. To choose a good prefix length, make sure that the prefix maximizes the uniqueness of the values in the column: find the number of rows in the table and evaluate different prefix lengths until you achieve the desired uniqueness of rows, as in the sketch below.
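
One way to evaluate different prefix lengths, sketched against the hypothetical demo_table used earlier, is to compare how unique each prefix would be:

SELECT COUNT(DISTINCT LEFT(column_name, 10)) / COUNT(*) AS sel_10,
       COUNT(DISTINCT LEFT(column_name, 20)) / COUNT(*) AS sel_20,
       COUNT(DISTINCT column_name)           / COUNT(*) AS sel_full
FROM demo_table;

The closer a prefix’s ratio is to the full column’s ratio, the less uniqueness you lose by indexing only the prefix.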

FULLTEXT Indexes in MySQL

FULLTEXT indexes in MySQL are a different beast altogether. They have many limitations unique to themselves (for example, InnoDB has a stopword list comprised of 36 words while the MyISAM stopword list is comprised of 143 words) and they have unique search modes too. There is the natural language mode (to activate it, run a FULLTEXT search query with no modifiers); you can also expand your search using the WITH QUERY EXPANSION modifier - such a search performs the search twice, and the second pass includes a few of the most relevant records from the first one (frequently used when a user has implied knowledge of something); to search with boolean operators, use the IN BOOLEAN MODE modifier. FULLTEXT indexes will also only be used if the search term consists of a minimum of three characters for InnoDB or four characters for MyISAM.
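
The three search modes look roughly like this (demo_articles and its body column are just example names):

-- Natural language mode (the default when no modifier is given)
SELECT * FROM demo_articles WHERE MATCH(body) AGAINST ('database upgrade');

-- Query expansion: the search runs twice, feeding back the most relevant rows
SELECT * FROM demo_articles WHERE MATCH(body) AGAINST ('upgrade' WITH QUERY EXPANSION);

-- Boolean mode: '+' means the word must be present, '-' that it must not
SELECT * FROM demo_articles WHERE MATCH(body) AGAINST ('+upgrade -downgrade' IN BOOLEAN MODE);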

Using B-Tree Indexes with Wildcards

Indexes are also frequently used if you’re building something similar to search engines. For that you frequently want to only search for a part of a value and return the results - here’s where wildcards step in. A simple query using a wildcard uses a LIKE query and the % sign to signify “anything” after the text. For example, a query like so would search for results beginning with the word “search” and having anything after it:

SELECT * FROM … WHERE demo_column LIKE ‘search%’;

A query like so would search for results beginning with anything, having the word “search” and having anything after it:

SELECT * FROM … WHERE demo_column LIKE ‘%search%’;

But here’s the catch - the above query will not use an index. Why? Because it has a wildcard at the very beginning, so MySQL cannot figure out what the column value needs to begin with. That’s why we said that wildcard searches against indexes have their place, but only in specific scenarios - namely, scenarios where you do not have a wildcard at the beginning of your search term.

Using ClusterControl to Monitor the Performance of Queries

Aside from using EXPLAIN, you can also use ClusterControl to monitor the performance of your queries: ClusterControl provides a set of advanced monitoring and reporting features that let you keep track of the performance of your database instances and queries. For example, click on a cluster and you will see a “Query Monitor” tab. Click on it and ClusterControl will let you observe the status of your queries in your database instances:

This part of ClusterControl lets you view a list of top slow and long-running queries while also allowing you to filter through them. For example if you know that not long ago you ran a query that consisted of @@log_bin, you can simply search for the term and ClusterControl will return a list of results:

As you probably noticed, you can also filter queries by the hosts that you use or by occurrences, and you can choose to see a set number of rows, for example 20, 100 or 200. ClusterControl will also tell you when the query was last seen, what its total execution time was, how many rows it returned, how many rows it examined and so on. ClusterControl can prove to be instrumental if you want to observe how your indexes are actually used by MySQL, MariaDB, MongoDB, PostgreSQL or TimescaleDB instances.

Summary

In this blog post we went through some limitations and benefits concerning indexes in MySQL and we have also covered how ClusterControl can help you achieve your database performance goals. We will also have a third part about indexes in MySQL diving even deeper into them, but to conclude what we’ve covered so far, keep in mind that indexes in MySQL certainly do have their own place - to make the best of them, know how they interact with storage engines, their benefits and limitations, how and when to use certain types of indexes and choose wisely.

Understanding Indexes in MySQL: Part Three


This blog post is the third part in our series of blogs about indexes in MySQL. In the second part we covered indexes and storage engines, touched upon some PRIMARY KEY considerations and matching a column prefix, covered some FULLTEXT index considerations, and showed how to use B-Tree indexes with wildcards and how to use ClusterControl to monitor the performance of your queries and, subsequently, your indexes. In this blog post we will go into more detail about indexes in MySQL: we will cover hash indexes, index cardinality and index selectivity, share interesting details about covering indexes, and go through some indexing strategies. And, of course, we will touch upon ClusterControl. Let’s begin, shall we?

Hash Indexes in MySQL

MySQL DBAs and developers dealing with MySQL also have another trick up their sleeve - hash indexes. Hash indexes are frequently used in the MEMORY engine of MySQL and, as with pretty much everything in MySQL, they have their own upsides and downsides. The main downside is that they are used only for equality comparisons using the = or <=> operators, meaning they are not really useful if you want to search for a range of values; the main upside is that lookups are very fast. A couple more downsides: developers cannot use any leftmost prefix of the key to find rows (if you want to do that, use B-Tree indexes instead), MySQL cannot approximately determine how many rows there are between two values, and the optimizer cannot use a hash index to speed up ORDER BY operations either. Bear in mind that hash indexes are not the only thing the MEMORY engine supports - MEMORY tables can have B-Tree indexes too.
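
For example, a lookup table on the MEMORY engine could declare both index types explicitly (the table and column names are illustrative):

CREATE TABLE session_lookup (
    session_id CHAR(32) NOT NULL,
    user_id INT NOT NULL,
    INDEX idx_session USING HASH (session_id),   -- fast equality lookups only
    INDEX idx_user USING BTREE (user_id)         -- supports ranges and ORDER BY
) ENGINE=MEMORY;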

Index Cardinality in MySQL

As far as MySQL indexes are concerned, you might have also heard another term going around - index cardinality. In very simple terms, index cardinality refers to the uniqueness of values stored in a column that uses an index. To view the index cardinality of a specific index, you can simply go to the Structure tab of phpMyAdmin and observe the information there, or you can execute a SHOW INDEXES query:

mysql> SHOW INDEXES FROM demo_table;
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table         | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| demo_table    |          1 | demo     |            1 | demo        | A         |      494573 |     NULL | NULL   |      | BTREE      |         |               |
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)

As you can see, the SHOW INDEXES output above has a lot of fields, one of which depicts the index cardinality: this field returns an estimated number of unique values in the index - the higher the cardinality, the greater the chance that the query optimizer will use the index for lookups. With that being said, index cardinality also has a close relative - index selectivity.
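
Since the cardinality shown by SHOW INDEXES is only an estimate, it can drift after heavy data changes. Running the standard ANALYZE TABLE statement refreshes the index statistics that the estimate is based on:

mysql> ANALYZE TABLE demo_table;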

Index Selectivity in MySQL

Index selectivity is the number of distinct values in relation to the number of records in the table. In simple terms, index selectivity defines how tightly an index helps MySQL narrow down the search for values. The ideal index selectivity is 1. Index selectivity is calculated by dividing the number of distinct values in a table by the total number of records: for example, if you have 1,000,000 records in your table but only 100,000 of them are distinct values, your index selectivity would be 0.1. If you have 10,000 records and 8,500 of them are distinct values, your index selectivity would be 0.85 - much better. You get the point: the higher your index selectivity, the better.
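
If you want to estimate the selectivity yourself, a simple query along these lines does the job (demo_table and demo_column are placeholders; note that it scans the whole table):

SELECT COUNT(DISTINCT demo_column) / COUNT(*) AS index_selectivity FROM demo_table;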

Covering Indexes in MySQL

A covering index is not a separate index type but rather an index that includes, or “covers”, all the fields required by a query, meaning that MySQL can answer the query by reading only the index instead of the data rows. If nothing else helps, a covering index could be your ticket to improved performance. Some of the benefits of using covering indexes include:

  • One of the main scenarios where a covering index is useful is serving queries without additional I/O reads on big tables.

  • MySQL can also access less data because index entries are smaller than full rows.

  • Most storage engines cache indexes better than data.

Creating covering indexes on a table is pretty simple - simply cover the fields accessed by the SELECT, WHERE and GROUP BY clauses:

ALTER TABLE demo_table ADD INDEX index_name(column_1, column_2, column_3);

Keep in mind that when dealing with covering indexes, it is very important to choose the correct order of columns in the index. For your covering indexes to be effective, put the columns used in WHERE clauses first, those used in ORDER BY and GROUP BY next, and the columns used in the SELECT clause last.
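
To check whether a query is actually covered, you can run EXPLAIN against it - if the Extra column contains "Using index", MySQL is reading only the index. A hypothetical example based on the index above:

EXPLAIN SELECT column_2, column_3 FROM demo_table WHERE column_1 = 'some_value';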

Indexing Strategies in MySQL

Following the advice covered in these three blog posts about indexes in MySQL can provide you with a really good foundation, but there are also a couple of indexing strategies you might want to use if you want to really tap into the power of indexes in your MySQL architecture. For your indexes to adhere to MySQL best practices, consider the following:

  1. Isolate the columns that you use the index on - in general, MySQL does not use an index if the column it is defined on is not isolated in the query. For example, the following query would not use an index because the column is not isolated:

    SELECT demo_column FROM demo_table WHERE demo_id + 1 = 10;


    Such a query however, would:
     

    SELECT demo_column FROM demo_table WHERE demo_id = 10;

     

  2. Avoid applying functions to the columns that you index. For example, a query like the following would not do much good, so it’s better to avoid such queries if you can:
     

    SELECT demo_column FROM demo_table WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(column_date) <= 10;

     

  3. If you use LIKE queries together with indexed columns, avoid putting the wildcard at the beginning of the search string, because then MySQL will not use an index either. That is, instead of writing queries like this:

    SELECT * FROM demo_table WHERE demo_column LIKE '%search query%';


    Consider writing them like this:

    SELECT * FROM demo_table WHERE demo_column LIKE 'search query%';


    The second query is better because MySQL knows what the column value begins with and can use the index more effectively. As with everything though, the EXPLAIN statement is of great help if you want to make sure your indexes are actually used by MySQL.
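
    For example, assuming demo_column is indexed, running EXPLAIN on both forms shows the difference: the leading-wildcard version reports key: NULL (a full scan), while the second one lists the index in the key column:

    EXPLAIN SELECT * FROM demo_table WHERE demo_column LIKE '%search query%';
    EXPLAIN SELECT * FROM demo_table WHERE demo_column LIKE 'search query%';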

Using ClusterControl to Keep Your Queries Performant

If you want to improve your MySQL performance, the advice above should set you on the right path. If you feel that you need something more though, consider ClusterControl for MySQL. One of the things ClusterControl can help you with is performance management - as noted in the previous blog posts, ClusterControl can help you keep your queries performing at their very best all the time. That’s because ClusterControl includes a query monitor that lets you monitor the performance of your queries, see slow, long-running queries and query outliers, alerting you to possible bottlenecks in your database performance before you might notice them yourself:

You can even filter your queries, allowing you to assess whether an index was used by an individual query or not:

ClusterControl can be a great tool to improve your database performance while taking the maintenance hassle off your hands. To learn more about what ClusterControl can do to improve the performance of your MySQL instances, consider having a look at the ClusterControl for MySQL page.

Summary

As you can probably tell by now, indexes in MySQL are a very complex beast. To choose the best index for your MySQL instance, know what indexes are and what they do, know the types of MySQL indexes and their benefits and drawbacks, educate yourself on how MySQL indexes interact with storage engines, and also take a look at ClusterControl for MySQL if you feel that automating certain index-related tasks would make your day easier.


PostgreSQL v13 Deployment and Scaling with ClusterControl 1.8.2


PostgreSQL is one of the databases that can be deployed via ClusterControl, along with MySQL, MariaDB and MongoDB. ClusterControl not only simplifies the deployment of the database cluster, but has a function for scalability in case your application grows and requires that functionality.

By scaling up your database, your application will run much smoother and better in the event the application load or traffic increases. In this blog post, we will review the steps on how to do the deployment as well as scale-up of PostgreSQL v13 with ClusterControl 1.8.2.

User Interface (UI) Deployment

There are two ways to deploy in ClusterControl: the web User Interface (UI) and the Command Line Interface (CLI). Users are free to choose either deployment option depending on their preferences and needs. Both options are easy to follow and well covered in our documentation. In this section, we will go through the deployment process using the first option - the web UI.

The first step is to log in to your ClusterControl and click on Deploy:

You will be presented with the screenshot below for the next step of the deployment, choose the PostgreSQL tab to continue:

Before we move further, I would like to remind you that the connection between the ClusterControl node and the database nodes must be passwordless. Prior to deployment, all we need to do is generate an SSH key on the ClusterControl node and then copy it to all the nodes. Fill in the input for the SSH User, Sudo Password as well as the Cluster Name as per your requirements and click Continue.

In the screenshot above, you will need to define the Server Port (in case you would like to use a different one), the user that you would like to use as well as the password, and make sure to choose Version 13, which is the one we want to install.


Here we need to define the servers using either the hostname or the IP address - in this case 1 master and 2 slaves. The final step is to choose the replication mode for our cluster.

After you click Deploy, the deployment process will start and we can monitor the progress in the Activity tab.

The deployment will normally take a couple of minutes; the duration depends mostly on the network and the specs of the servers.

And that is it - we now have PostgreSQL v13 installed using the ClusterControl GUI, which is pretty straightforward.

Command Line Interface (CLI) PostgreSQL Deployment

From the above, we can see that the deployment is pretty straightforward using the web UI. The important note is that all the nodes must have passwordless SSH connections prior to the deployment. In this section, we are going to see how to deploy using the ClusterControl CLI, or the “s9s” command-line tools.

We assume that ClusterControl has been installed prior to this, so let’s get started by generating the SSH key. On the ClusterControl node, run the following commands:

$ whoami
root
$ ssh-keygen -t rsa # generate the SSH key for the user
$ ssh-copy-id 10.10.40.11 # pg node1
$ ssh-copy-id 10.10.40.12 # pg node2
$ ssh-copy-id 10.10.40.13 # pg node3

Once all the commands above have run successfully, we can verify the passwordless connection using the following command:

$ ssh 10.10.40.11 "whoami" # make sure can ssh without password

If the above command runs with success, the cluster deployment can be started from the ClusterControl server using the following line of command:

$ s9s cluster --create --cluster-type=postgresql --nodes="10.10.40.11?master;10.10.40.12?slave;10.10.40.13?slave" --provider-version='13' --db-admin="postgres" --db-admin-passwd='P@$$W0rd' --cluster-name=PGCluster --os-user=root --os-key-file=/root/.ssh/id_rsa --log

Right after you run the command above, you will see something like this which means the task has started running:

Cluster will be created on 3 data node(s).

Verifying job parameters.

10.10.40.11: Checking ssh/sudo with credentials ssh_cred_job_6656.
10.10.40.12: Checking ssh/sudo with credentials ssh_cred_job_6656.
10.10.40.13: Checking ssh/sudo with credentials ssh_cred_job_6656.
…
…
This will take a few moments and the following message will be displayed once the cluster is deployed:
…
…
Directory is '/etc/cmon.d'.
Filename is 'cmon_1.cnf'.
Configuration written to 'cmon_1.cnf'.
Sending SIGHUP to the controller process.
Waiting until the initial cluster starts up.
Cluster 1 is running.
Registering the cluster on the web UI.
Waiting until the initial cluster starts up.
Cluster 1 is running.
Generated & set RPC authentication token.

 

You can also verify it by logging into the web console using the username that you created. Now we have a PostgreSQL cluster deployed on 3 nodes. If you would like to learn more about the deployment command above, here is the best reference for you.

Scaling Up PostgreSQL with ClusterControl UI

PostgreSQL is a relational database and we know that scaling out this type of database is not as easy as scaling out a non-relational one. These days, most applications need scalability in order to provide better performance and speed. There are many ways to implement this, depending on your infrastructure and environment.

Scalability is one of the features facilitated by ClusterControl and it can be accomplished both in the UI and the CLI. In this section, we are going to see how we can scale out PostgreSQL using the ClusterControl UI. The first step is to log in to the UI and choose the cluster; once the cluster is chosen, you can click on the option shown in the screenshot below:

Once “Add Replication Slave” is clicked, you will see the following page. You can either pick “Add new…” or “Import…” depending on your situation. In this example we will choose the first option:

The following screen will be presented once you click on it:

  • Slave Hostname: the hostname/IP address of the new slave or node

  • Slave Port: the PostgreSQL port of the slave, default is 5432

  • Cluster Name: the name of the cluster; you can either fill it in or leave it blank

  • Use Package Default for Datadir: you can leave this option checked, or uncheck it if you want to use a different location for the data directory

  • Install PostgreSQL software: you can leave this option checked

  • Synchronous Replication: you can choose what type of replication you want in this one

  • Include in LoadBalancer set (if exists): check this option if you have a load balancer configured for the cluster

The important note here is that you need to configure the new slave host to be passwordless before you can run this setup. Once everything is confirmed, we can click on the “Finish” button to complete the setup. In this example, I have added the IP “10.10.40.140”.

We can now monitor the job activity and let the setup complete. To confirm the setup, we can go to the “Topology” tab to see the new slave:

Scaling Out PostgreSQL with ClusterControl CLI

Adding new nodes to an existing cluster is very simple using the CLI. From the controller node, you execute the following commands. The first command is to identify the cluster that we would like to add the new node to:

 $ s9s cluster --list --long
ID STATE   TYPE              OWNER GROUP  NAME      COMMENT
 1 STARTED postgresql_single admin admins PGCluster All nodes are operational.

In this example, we can see that the cluster ID is “1” for the cluster named “PGCluster”. Let’s see the first command option for adding a new node to the existing PostgreSQL cluster:

$  s9s cluster --add-node --cluster-id=1 --nodes="postgresql://10.10.40.141?slave" --log

The “--log” option at the end of the line lets us see the current task running after the command is executed, as shown below:

Using SSH credentials from cluster.
Cluster ID is 1.
The username is 'root'.
Verifying job parameters.
Found a master candidate: 10.10.40.11:5432, adding 10.10.40.141:5432 as a slave.
Verifying job parameters.
10.10.40.11: Checking ssh/sudo with credentials ssh_cred_cluster_1_6245.
10.10.40.11:5432: Loading configuration file '/var/lib/pgsql/13/data/postgresql.conf'.
10.10.40.11:5432: wal_keep_segments is set to 0, increase this for safer replication.
…
…

The next option you can use is the following:

$ s9s cluster --add-node --cluster-id=1 --nodes="postgresql://10.10.40.142?slave" --wait

Add Node to Cluster

\ Job  9 RUNNING    [▋         ]   5% Installing packages

Notice the “--wait” option in the line; the output will be displayed as above. Once the process completes, we can confirm the new nodes in the “Overview” tab of the cluster in the UI:

Conclusion

In this blog post, we have reviewed two options for scaling out PostgreSQL with ClusterControl. As you may have noticed, scaling out PostgreSQL is easy with ClusterControl. Besides scalability, you can also achieve a high availability setup for your database cluster: features like HAProxy, PgBouncer as well as Keepalived are available and ready to be implemented for your cluster whenever you feel the need for them. With ClusterControl, your database cluster is easy to manage and monitor at the same time.

We hope that this blog post will help guide you in scaling out your PostgreSQL setup.

Keeping Databases Up and Running in an Outage or Slowdown


Once you see that your database environment has issues, you are in trouble. Maybe one of your replicas is out or maybe you are experiencing a significant increase in the load across your databases that impacts your application. What can you do to try and salvage the situation? Let’s take a look at some of the cases and see if we can find a solution that will help to keep at least some of the functionality running.

Scenario - One of Your Nodes is Down

Let’s consider that one of the nodes is down and thus it is not available for the application to read from. There are several implications of this and several types of issues related to this case. Let’s go step by step through them.

Node is down, remaining nodes are able to deal with the traffic

This is pretty much the ideal scenario and you should design the environment with it in mind. Your cluster should be sized so that it can handle the traffic even if one of your nodes is not available. In such a scenario you should be just fine in most cases. Obviously, you want to bring up the missing node as fast as possible, to reduce the time frame in which losing another node may cause more serious disruption. Ideally you would bring back the failed node, as long as you can deem its data safe and sound. The advantage is that you can bring it up faster, as the data already exists on the node; if you spin up a replacement node instead, you have to provision it with data before it can join the cluster. An additional advantage is that, at least for some databases, the contents of memory buffers may be persisted on disk, and re-reading them at startup will significantly reduce the warm-up phase.

This may not be obvious, but please keep in mind that you cannot just start a fresh database node and add it to the load balancers on an equal basis with the other nodes. When the database starts, unless it can reload its memory structures, it starts with no data in memory. Every single query will require disk access and will be way slower than a typical query executed on a warmed-up node. If you just let it deal with regular traffic, the most probable outcome will be queries piling up, rendering the new node overloaded and inaccessible. The proper way to introduce a fresh node into the cluster is to send a very low traffic volume to it at first, allowing it to warm up its buffers, and then gradually add more and more traffic until it finally serves 100% of the traffic portion it should. If you have proper tools for that, you can even do the warm-up outside of the production cluster, utilizing mirrored traffic or simply re-executing the most common types of queries based on, for example, the contents of the slow query log from the other production nodes.
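
If HAProxy happens to be the load balancer in front of your read traffic, one way to implement such a gradual ramp-up is to adjust the server weight at runtime through the admin socket. This is only a sketch: the backend name, server name and socket path below are placeholders, and the stats socket must be configured with admin-level access.

$ echo "set server be_mysql_ro/node4 weight 10" | socat stdio /var/run/haproxy.sock
# ...let the node warm up, then increase its share of traffic step by step...
$ echo "set server be_mysql_ro/node4 weight 50" | socat stdio /var/run/haproxy.sock
$ echo "set server be_mysql_ro/node4 weight 100" | socat stdio /var/run/haproxy.sock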

Node is down, remaining nodes are overloaded

Ideally, you should not have let this situation happen in the first place, since it is likely to cause you more issues than the first scenario, but nonetheless, there are two main ways to deal with it. Generally speaking, the aim is to bring up additional nodes. As before, if we can use the old node in a safe manner, this might be the best and fastest option; if not, we should spin up a new node as soon as possible. How to deal with the overloaded nodes themselves is a topic we cover in the second part of this blog post - depending on the tools in your arsenal, there are several possible actions that may be executed.

Scenario - Database Cluster is Overloaded

Your CPU utilization is going through the roof, databases are slowing down and your application starts to experience slowdowns as well. How to deal with it may depend on several factors, so let’s try to discuss the most common cases. Keep in mind that the first thing you have to do is to understand the source of the load. This may significantly affect the way you respond to the situation.

Node is down, remaining nodes are overloaded

Here, the situation is very simple. There is no hidden source of the load; the cluster is simply degraded and does not have enough resources to handle the normal traffic. As we discussed, we will be adding a new node to the cluster to restore its functionality, but is there anything we can do to make the outage easier on the users? Yes and no. No, because no matter what we attempt, there will be some impact on the application and its functionality. Yes, because there are always more and less important parts of the application.

An initial step, something you can easily do beforehand, is to assess what the core functionality of your application is. Quite often, applications grow additional functions and modules over time. Is there anything you can shut down that won’t impact the core functions? Let’s say that you are an e-commerce website. The core functionality is to sell products, so obviously the store itself and the payment and order processing are the modules that have to keep on running. On the other hand, you may be able to manage without the video chat where your sales representatives help users. Maybe it will be ok to disable functionality like rating products or writing comments and reviews. Maybe even the search functionality is not that critical - after all, in most cases users come directly to the product page from Google or some other search engine. Once you identify such “not-required-for-core-functionality” modules, you should plan how to disable them. It might be a checkbox in the admin panel; it might also be changing a couple of rows in the database. It is important to be able to do it quickly.

When the need arises, you can pull the trigger and start disabling the modules one by one, checking how it impacts the workload. It doesn’t mean it will always be enough - the core functionality is named “core” for a reason and it will generate the main chunk of the load on the databases. It is still important to try, though. Even if you reduce the load by only 10-15%, it can still make the difference between a slow site and an unavailable site.

Sudden increase of traffic due to the higher number of requests from the users

In many ways this is a situation very similar to what we described in the previous section. Sure, all database nodes are up and running, but that is not enough to handle the load. The options we have are very similar. Obviously, you will want to spin up more nodes to deal with the traffic. Ideally, you would do it in a way that puts the least amount of load on the existing cluster. For example, instead of copying fresh data from a live node, you can use a backup to restore the fresh node to some point in time and then let replication catch up on the remaining data. That way, the production nodes only have to transfer the incremental state, not the full data set, which helps to reduce the overhead. It is also a great exercise in backup restoration: you can verify the backup, you can verify the process, and you can verify how long the restoration takes. This comes in very handy when planning disaster recovery procedures.

The exact steps to follow may depend on the situation, though. If you can identify the source of the load (maybe it is a part of the functionality that has been aggressively promoted?), you may be able to shut down that part if you deem it necessary. Sometimes it is better to waste the marketing budget on a promoted feature that is not working rather than to waste the marketing budget and lose income because the whole site is not working.

Sudden increase of traffic due to the bug in the application

Another quite common case is when the high load is caused by bugs in the application. It can be an inefficient SQL query that managed to pass through the review process. It can also be a logical error where some queries are executed when they should not be, or a loop that is triggered and runs the same query over and over again. Such cases may result in unnecessary or inefficient queries spreading across the database cluster. The main challenge in this particular situation is to identify what is going wrong. If you can pinpoint a query that is causing the problem (and remember, sometimes it is easy to mistake a result of the problem for its cause), you are already halfway through the issue. Then, it all depends on the options and tools that you have at your disposal. If you are using modern load balancers, you may have an option to shape the traffic. In this case it could be, for example, killing the offending query at the load balancer level: the application will still send the faulty query to the load balancer, but it will not be propagated to the database, so the proxy layer shields the databases from the faulty load. If your load balancer allows for such an action, you may also attempt to rewrite the query into a more efficient form. Finally, if you do not use sophisticated proxies, you should attempt to fix the issue in the application. This will probably take more time but it is also a solution. Sometimes the caching layer, if you happen to have one, may also act as a SQL firewall: setting up a very long TTL for a cache entry related to the faulty query can work just fine and stop the query from being executed against the database.
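
As an illustration, if ProxySQL happens to be your proxy layer, a query rule along the lines of the hypothetical one below can reject a known-bad query pattern before it reaches the databases (the rule id, digest pattern and error message are made up for the example and would have to match your actual faulty query):

INSERT INTO mysql_query_rules (rule_id, active, match_digest, error_msg, apply)
VALUES (100, 1, '^SELECT .* FROM demo_table', 'Query temporarily blocked - fix in progress', 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;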

Sudden increase of traffic due to the problems with some application hosts

The final situation we’d like to discuss is a case in which erroneous traffic is generated by a subset of application hosts. There can be plenty of reasons for this: a deployment gone wrong on a part of the application infrastructure (development code deployed on production servers - we have seen that in the past) or network issues that prevented some of the application nodes from connecting to the proxy layer, making the application revert to a direct database connection, to name a few. Again, the most important bit is to understand what has happened and which part of the infrastructure is affected. Then, you may want to take the affected application hosts down for the time being (or kill them permanently - you can always deploy new ones). A temporary “kill” can be implemented through firewalls. If you have an advanced proxy, you can use it as well - then you have full control over the database traffic in one place. Once you manage to take the situation under control, you may want to rebuild your application nodes and restore their number to the optimal level.

As you can see, there are many ways things can go wrong and there are different ways to solve the problems. The list you have read through is by no means exhaustive, but we hope you can see some patterns emerging and that you have identified some tools you can use to deal with unexpected problems related to the load on your database infrastructure. One of those tools is ClusterControl - the only management system you will ever need to take control of your open source database infrastructure. If you are looking for a tool that will help keep your databases up and running during an outage or slowdown, definitely consider giving it a try.

Dealing With MySQL Replication Issues Using ClusterControl


One of the most popular ways of achieving high availability for MySQL is replication. Replication has been around for many years and became much more stable with the introduction of GTIDs. But even with these improvements, the replication process can break for various reasons - for instance, when master and slave go out of sync because writes were sent directly to the slave. How do you troubleshoot replication issues, and how do you fix them?

In this blog post, we will discuss some of the common issues with replication and how to fix them with ClusterControl. Let’s start with the first one.

Replication Stopped With Some Error

Most MySQL DBAs will typically see this kind of problem at least once in their career. For various reasons, a slave can get corrupted or stop syncing with the master. When this happens, the first thing to do to start troubleshooting is to check the error log for messages. Most of the time, the error message is easily traceable in the error log or by running the SHOW SLAVE STATUS query.

Let’s take a look at the following example output from SHOW SLAVE STATUS:

*************************** 1. row ***************************
Slave_IO_State: 
Master_Host: 10.2.9.71
Master_User: cmon_replication
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: binlog.000111
Read_Master_Log_Pos: 255477362
Relay_Log_File: relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: binlog.000111
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB: 
Replicate_Ignore_DB: 
Replicate_Do_Table: 
Replicate_Ignore_Table: 
Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
Last_Errno: 0
Last_Error: 
Skip_Counter: 0
Exec_Master_Log_Pos: 255477362
Relay_Log_Space: 256
Until_Condition: None
Until_Log_File: 
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File: 
Master_SSL_CA_Path: 
Master_SSL_Cert: 
Master_SSL_Cipher: 
Master_SSL_Key: 
Seconds_Behind_Master: 
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.'
Last_SQL_Errno: 0
Last_SQL_Error: 
Replicate_Ignore_Server_Ids: 
Master_Server_Id: 1000
Master_SSL_Crl: 
Master_SSL_Crlpath: 
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 1000-1000-2268440
Replicate_Do_Domain_Ids: 
Replicate_Ignore_Domain_Ids: 
Parallel_Mode: optimistic
SQL_Delay: 0
SQL_Remaining_Delay: 
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Slave_DDL_Groups: 0
Slave_Non_Transactional_Groups: 0
Slave_Transactional_Groups: 0

We can clearly see the error is related to "Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.'". In other words, what the error is essentially telling us is that there is an inconsistency in the data and the required binary log files have already been deleted.
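
Before rebuilding anything, you can confirm the diagnosis on the master by checking which binary logs are still available and what the current binlog coordinates are (file names and positions will of course differ in your setup):

mysql> SHOW BINARY LOGS;    -- lists the binlog files the master still has on disk
mysql> SHOW MASTER STATUS;  -- shows the current binlog file and position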

This is one good example of the replication process stopping. Besides SHOW SLAVE STATUS, you can also track the status in the “Overview” tab of the cluster in ClusterControl. So how do you fix this with ClusterControl? You have two options to try:

  1. You may try to start the slave again from the “Node Action”

  2. If the slave is still not working, you may run the “Rebuild Replication Slave” job from the “Node Action”

Most of the time, the second option will resolve the issue. ClusterControl will take a backup of the master, and rebuild the broken slave by restoring the data. Once the data is restored, the slave is connected to the master so it can catch up. 

There are also multiple manual ways to rebuild a slave, as listed below; you may also refer to this link for more details:

  • Using Mysqldump to Rebuild an Inconsistent MySQL Slave

  • Using Mydumper to Rebuild an Inconsistent MySQL Slave

  • Using a Snapshot to Rebuild an Inconsistent MySQL Slave

  • Using Xtrabackup or Mariabackup to Rebuild an Inconsistent MySQL Slave

Promote A Slave To Become A Master

Over time, the OS or the database needs to be patched or upgraded to maintain stability and security. One of the best practices to minimize downtime, especially for a major upgrade, is promoting one of the slaves to master after the upgrade has been successfully completed on that particular node.

By performing this, you can point your application to the new master and the master-slave replication will continue to work. In the meantime, you can also proceed with the upgrade on the old master with peace of mind. With ClusterControl this can be executed with only a few clicks, assuming the replication is configured as Global Transaction ID-based (GTID-based for short). To avoid any data loss, it is worth stopping any application queries while the old master is still operating correctly. This is not the only situation in which you might promote a slave - in the event the master node is down, you could also perform this action.

Without ClusterControl, there are a few steps to promote the slave, and each step requires a few queries to run as well (a rough sketch of those queries follows the list):

  • Manually take down the master

  • Select the most advanced slave to be a master and prepare it

  • Reconnect other slaves to the new master

  • Change the old master to be a slave
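
A rough, simplified sketch of those queries is shown below - it assumes MariaDB GTID-based replication as in the SHOW SLAVE STATUS output above, and the hostname, user and password are placeholders:

-- 1. On the old master (if it is still reachable), stop writes:
SET GLOBAL read_only = ON;

-- 2. On the most advanced slave, stop replication and clear its slave configuration:
STOP SLAVE;
RESET SLAVE ALL;
SET GLOBAL read_only = OFF;

-- 3. On every remaining slave, repoint replication to the new master:
STOP SLAVE;
CHANGE MASTER TO MASTER_HOST='new-master-host', MASTER_USER='cmon_replication',
  MASTER_PASSWORD='replication-password', MASTER_USE_GTID=slave_pos;
START SLAVE;

-- 4. Later, the old master can be reconfigured as a slave of the new master in the same way.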

Nevertheless, promoting a slave with ClusterControl takes only a few clicks: Cluster > Nodes > choose slave node > Promote Slave, as per the screenshot below:

Master Becomes Unavailable

Imagine you have large transactions to run but the database is down. It does not matter how careful you are, this is probably the most serious or critical situation for a replication setup. When this happens, your database is not able to accept a single write, which is bad. Besides, your application(s), of course, will not work properly.

There are a few reasons or causes that lead to this issue. Some of the examples are hardware failure, OS corruption, database corruption and so on. As a DBA, you need to act quickly to restore the master database.

Thanks to the “Auto Recovery” cluster function that is available in ClusterControl, the failover process can be automated. It can be enabled or disabled with a single click. As the name suggests, it will bring up the entire cluster topology when necessary. For example, a master-slave replication setup must have at least one master alive at any given time, regardless of the number of available slaves. When the master is not available, ClusterControl will automatically promote one of the slaves.

Let’s take a look at the screenshot below:

 

In the above screenshot, we can see that “Auto Recovery” is enabled for both Cluster and Node. In the topology, notice that the current master IP address is 10.10.10.11. What will happen if we take down the master node for testing purposes?

As you can see, the slave node with IP 10.10.10.12 has automatically been promoted to master, and the replication topology has been reconfigured. Instead of doing this manually, which of course would involve a lot of steps, ClusterControl helps you maintain your replication setup by taking the hassle off your hands.

Conclusion

In any unfortunate event with your replication, the fix is very simple and much less hassle with ClusterControl. ClusterControl helps you recover from replication issues quickly, which increases the uptime of your databases.

CMON High Availability and Failover


High availability is a must these days and ClusterControl plays a key role in ensuring that your database clusters stay up and running. On the other hand, how can we ensure that ClusterControl itself is highly available and able to manage the database nodes? ClusterControl may be deployed in a couple of different ways; one of them is to form a cluster of ClusterControl nodes and ensure that there is always a node able to handle the cluster management.

We have two blog posts in which we explained how to set up Cmon HA, using Galera Cluster as a backend database across the ClusterControl nodes, and how to configure HAProxy as a load balancer to point to the active ClusterControl node. In this blog post we would like to focus on how the ClusterControl cluster behaves when one of the nodes becomes unavailable. Let’s go through some scenarios that may happen.

Initial situation

First, let’s take a look at the initial situation. We have a ClusterControl cluster that consists of three nodes:

root@node2:~# s9s controller --list --long
S VERSION    OWNER  GROUP  NAME       IP         PORT COMMENT
f 1.8.2.4596 system admins 10.0.0.181 10.0.0.181 9501 Accepting heartbeats.
f 1.8.2.4596 system admins 10.0.0.182 10.0.0.182 9501 Accepting heartbeats.
l 1.8.2.4596 system admins 10.0.0.183 10.0.0.183 9501 Acting as leader.
Total: 3 controller(s)

As you can see, 10.0.0.183 is the leader node and it is also the node where HAProxy will redirect all requests:

This is the starting point; let’s see where we end up.

ClusterControl is down

If for some reason the CMON process becomes unavailable - possible causes include a bug, the OOM killer or some script that unexpectedly stopped the CMON service - the first thing we will see is that the node is reported as not working:

 

root@node2:~# s9s controller --list --long
S VERSION    OWNER  GROUP  NAME       IP         PORT COMMENT
f 1.8.2.4596 system admins 10.0.0.181 10.0.0.181 9501 Accepting heartbeats.
l 1.8.2.4596 system admins 10.0.0.182 10.0.0.182 9501 Acting as leader.
- 1.8.2.4596 system admins 10.0.0.183 10.0.0.183 9501 Not accepting heartbeats.
Total: 3 controller(s)

To make things harder, we killed our leader node. As you can see, a new leader has been promoted and the failed node is marked as “Not accepting heartbeats”.

Of course, HAProxy picks up the leader change:

What is quite useful is that, as long as it is doable, the ClusterControl cluster will recover the failed node. We can observe that via the s9s command:

root@node2:~# s9s controller --list --long
S VERSION    OWNER  GROUP  NAME       IP         PORT COMMENT
f 1.8.2.4596 system admins 10.0.0.181 10.0.0.181 9501 Accepting heartbeats.
l 1.8.2.4596 system admins 10.0.0.182 10.0.0.182 9501 Acting as leader.
f 1.8.2.4596 system admins 10.0.0.183 10.0.0.183 9501 Accepting heartbeats.
Total: 3 controller(s)

As you can see, node 10.0.0.183 is up and “Accepting heartbeats” - it is now acting as a follower, and the leader has not changed.

Cmon node is not responding

Another scenario that may happen is a case where the whole node becomes unresponsive. Let’s simulate it in our lab by stopping the virtual machine on which node 10.0.0.182 - the current leader - is located.

root@node1:~# s9s controller --list --long
S VERSION    OWNER  GROUP  NAME       IP         PORT COMMENT
f 1.8.2.4596 system admins 10.0.0.181 10.0.0.181 9501 Accepting heartbeats.
- 1.8.2.4596 system admins 10.0.0.182 10.0.0.182 9501 Not accepting heartbeats.
l 1.8.2.4596 system admins 10.0.0.183 10.0.0.183 9501 Acting as leader.
Total: 3 controller(s)

As expected, the process was exactly the same as in the previous example. One of the remaining nodes has been promoted to the “leader” status while the failed node is marked as out of the cluster and without connectivity. The only difference is that ClusterControl will not be able to recover the failed node, because this time it is not just a stopped process but a whole VM that has failed.

No surprises in our HAProxy panel, the new leader is marked as up, remaining nodes are down.

You can also track this in the log file of the cmon service:

2021-06-14T13:28:35.999Z : (INFO) 10.0.0.183:9501: Cmon HA cluster contains 3 controller(s).
2021-06-14T13:28:35.999Z : (INFO)
          HOSTNAME  PORT ROLE         STATUS_MESSAGE
        10.0.0.181  9501 follower     Accepting heartbeats.
        10.0.0.182  9501 leader       Acting as leader.
        10.0.0.183  9501 follower     Accepting heartbeats.
2021-06-14T13:28:39.023Z : (WARNING) 10.0.0.183:9501: Heartbeat timeout (elapsed: 3025ms > timeout: 2200ms).
2021-06-14T13:28:39.023Z : (INFO) 10.0.0.183:9501: +++ Starting an election. +++++++++
2021-06-14T13:28:39.023Z : (INFO) 10.0.0.183:9501: Controller state change from CmonControllerFollower to CmonControllerCandidate.
2021-06-14T13:28:39.023Z : (INFO) 10.0.0.183:9501: Stopping heartbeat receiver.
2021-06-14T13:28:39.029Z : (INFO) 10.0.0.183:9501: Requesting votes.
2021-06-14T13:28:39.029Z : (INFO) 10.0.0.183:9501: Requesting vote from 10.0.0.181:9501 for term 24.
2021-06-14T13:28:39.030Z : (INFO) 10.0.0.183:9501: Requesting vote from 10.0.0.182:9501 for term 24.
2021-06-14T13:28:39.042Z : (INFO) 10.0.0.183:9501: Has the majority votes: yes.
2021-06-14T13:28:39.042Z : (INFO) 10.0.0.183:9501: Controller state change from CmonControllerCandidate to CmonControllerLeader.
2021-06-14T13:28:39.042Z : (INFO) 10.0.0.183:9501: Stopping heartbeat receiver.
2021-06-14T13:28:39.042Z : (INFO) 10.0.0.183:9501: Heartbeat receiver stopped.
2021-06-14T13:28:39.043Z : (INFO) 10.0.0.183:9501: Starting heartbeats to 10.0.0.181:9501
2021-06-14T13:28:39.044Z : (INFO) 10.0.0.183:9501: The value for cmon_ha_heartbeat_network_timeout is 2.
2021-06-14T13:28:39.044Z : (INFO) 10.0.0.183:9501: The value for cmon_ha_heartbeat_interval_millis is 140698833650664.
2021-06-14T13:28:39.044Z : (INFO) 10.0.0.183:9501: Starting heartbeats to 10.0.0.182:9501
2021-06-14T13:28:39.045Z : (INFO) 10.0.0.183:9501: The value for cmon_ha_heartbeat_network_timeout is 2.
2021-06-14T13:28:39.045Z : (INFO) 10.0.0.183:9501: The value for cmon_ha_heartbeat_interval_millis is 140698833650664.
2021-06-14T13:28:39.046Z : (INFO) 10.0.0.183:9501: [853f] Controller become a leader, starting services.
2021-06-14T13:28:39.047Z : (INFO) Setting controller status to 'leader' in host manager.
2021-06-14T13:28:39.049Z : (INFO) Checking command handler.
2021-06-14T13:28:39.051Z : (INFO) 10.0.0.183:9501: Cmon HA cluster contains 3 controller(s).
2021-06-14T13:28:39.051Z : (INFO)
          HOSTNAME  PORT ROLE         STATUS_MESSAGE
        10.0.0.181  9501 follower     Accepting heartbeats.
        10.0.0.182  9501 controller   Lost the leadership.
        10.0.0.183  9501 leader       Acting as leader.
2021-06-14T13:28:39.052Z : (INFO) Cmon HA is enabled, multiple controllers are allowed.
2021-06-14T13:28:39.052Z : (INFO) Starting the command handler.
2021-06-14T13:28:39.052Z : (INFO) Starting main loop.
2021-06-14T13:28:39.053Z : (INFO) Starting CmonCommandHandler.
2021-06-14T13:28:39.054Z : (INFO) The Cmon version is 1.8.2.4596.
2021-06-14T13:28:39.060Z : (INFO) CmonDb database is 'cmon' with schema version 107060.
2021-06-14T13:28:39.060Z : (INFO) Running cmon schema hot-fixes.
2021-06-14T13:28:39.060Z : (INFO) Applying modifications from 'cmon_db_mods_hotfix.sql'.
2021-06-14T13:28:41.046Z : (ERROR) 10.0.0.183:9501: Sending heartbeat to 10.0.0.182:9501 (leader) failed.
2021-06-14T13:28:41.047Z : (INFO) Host 10.0.0.182:9501 state changed from 'CmonHostOnline' to 'CmonHostOffLine'.
2021-06-14T13:28:41.048Z : (INFO) 10.0.0.183:9501: Cmon HA cluster contains 3 controller(s).
2021-06-14T13:28:41.048Z : (INFO)
          HOSTNAME  PORT ROLE         STATUS_MESSAGE
        10.0.0.181  9501 follower     Accepting heartbeats.
        10.0.0.182  9501 follower     Not accepting heartbeats.
        10.0.0.183  9501 leader       Acting as leader.

As you can see, there’s a clear indication that the leader stopped responding, a new leader has been elected and, finally, the old leader has been marked as unavailable.

If we continue stopping more ClusterControl nodes, we will end up with a broken cluster:

The reason this cluster has broken down is very simple: a three-node cluster can tolerate the failure of only one member node at a time. If two nodes are down, the cluster will not be able to operate. This is reflected in the log file:

2021-06-14T13:43:23.123Z : (ERROR) Cmon HA leader lost the quorum.
2021-06-14T13:43:23.123Z : (ERROR)
{ "class_name": "CmonHaLogEntry", "comment": "", "committed": false, "executed": false, "index": 5346, "last_committed_index": 5345, "last_committed_term": 24, "term": 24, "payload": { "class_name": "CmonHaLogPayload", "instruction": "SaveCmonUser", "cmon_user": { "class_name": "CmonUser", "cdt_path": "/", "owner_user_id": 3, "owner_user_name": "admin", "owner_group_id": 1, "owner_group_name": "admins", "acl": "user::rwx,group::r--,other::r--", "created": "2021-06-14T10:08:02.687Z", "disabled": false, "first_name": "Default", "last_failed_login": "", "last_login": "2021-06-14T13:43:22.958Z", "last_name": "User", "n_failed_logins": 0, "origin": "CmonDb", "password_encrypted": "77cb319920a407003449d9c248d475ce4445b3cf1a06fadfc4afb5f9c642c426", "password_format": "sha256", "password_salt": "a07c922c-c974-4160-8145-e8dfb24a23df", "suspended": false, "user_id": 3, "user_name": "admin", "groups": [ { "class_name": "CmonGroup", "cdt_path": "/groups", "owner_user_id": 1, "owner_user_name": "system", "owner_group_id": 1, "owner_group_name": "admins", "acl": "user::rwx,group::rwx,other::---", "created": "2021-06-14T10:08:02.539Z", "group_id": 1, "group_name": "admins" } ], "timezone": { "class_name": "CmonTimeZone", "name": "Coordinated Universal Time", "abbreviation": "UTC", "offset": 0, "use_dst": false } } } }

Once you solve the problems with failed nodes, the cluster will rejoin:

As you can see, the ClusterControl cluster can provide you with a highly available installation that can handle failure of some of the nodes, as long as the majority of the cluster is available. It can also self-heal and recover from more critical failures, ensuring that ClusterControl will keep monitoring your database and ensuring your business continuity is not compromised.

Summary

In this blog post we have covered a couple of scenarios in which one of the ClusterControl cluster’s nodes becomes unavailable. In each of them, one of the remaining nodes is promoted to the “leader” status while the failed node is marked as unresponsive, with ClusterControl recovering it whenever that is possible. And even if the cluster does break down, you only need to solve the underlying problems (the log files will help you identify them) and you should be good to go!

How to configure SELinux for MySQL-based systems (MySQL/MariaDB Replication + Galera)


In the era we are living in now, anything with a less secure environment is an easy target for attacks and becomes a bounty for attackers. Compared to the past 20 years, hackers nowadays are more advanced not only in their skills but also in the tools they use. It is no surprise that some giant companies are being hacked and their valuable data is leaked.

In the year 2021 alone, there have already been more than 10 incidents related to data breaches. The most recent incident, which occurred in May, was reported by BOSE, a well-known audio maker. BOSE discovered that some of its current and former employees’ personal information was accessed by the attackers. The personal information exposed in the attack includes names, Social Security Numbers, compensation information and other HR-related information.

What do you think is the purpose of this kind of attack and what motivates the hackers? It is obviously all about the money. By attacking big companies, hackers can earn money: stolen data is frequently sold, sometimes to competitors of the business, and the hackers can also demand a huge ransom at the same time.

So how does this relate to databases? Since the database is one of the company’s big assets, it is recommended to take care of it with enhanced security so that our valuable data is protected at all times. In my last blog post, we already went through an introduction to SELinux: how to enable it, what modes SELinux has and how to configure it for MongoDB. Today, we will take a look at how to configure SELinux for MySQL-based systems.

Top 5 Benefits of SELinux

Before going further, perhaps some of you are wondering whether SELinux provides any real benefits, given that it is a bit of a hassle to enable. Here are the top 5 SELinux benefits that you don’t want to miss and should consider:

  • Enforcing data confidentiality and integrity while at the same time protecting processes

  • The ability to confine services and daemons to be more predictable

  • Reducing the risk of privilege escalation attacks

  • Policy that is enforced system-wide and administratively defined, not set at user discretion

  • Providing a fine-grained access control

Before we start configuring SELinux for our MySQL instances, let’s go through how to enable SELinux with ClusterControl for all MySQL-based deployments. Even though the steps are the same for all of them, we think it is a good idea to include some screenshots for your reference.

Steps To Enable SELinux for MySQL Replication

In this section, we are going to deploy MySQL Replication with ClusterControl 1.8.2. The steps are the same for MariaDB, Galera Cluster or MySQL: assuming all nodes are ready and passwordless SSH is configured, let’s start the deployment. To enable SELinux for our setup, we need to untick “Disable AppArmor/SELinux”, which means SELinux will be set to “permissive” mode on all nodes.

Next, we will choose Percona as the vendor (you can also choose MariaDB, Oracle or MySQL 8), then specify the “root” password. You may use the default location or other directories depending on your setup.

Once all hosts have been added, we can start the deployment and let it finish before we can begin with the SELinux configuration.

Steps To Enable SELinux for MariaDB Replication

In this section, we are going to deploy MariaDB Replication with ClusterControl 1.8.2.

We will choose MariaDB as the vendor with version 10.5 and specify the “root” password. You may use the default location or other directories depending on your setup.

Once all hosts have been added, we can start the deployment and let it finish before we can proceed with the SELinux configuration.

Steps To Enable SELinux for Galera Cluster

In this section, we are going to deploy Galera Cluster with ClusterControl 1.8.2. Once again, untick “Disable AppArmor/SELinux” which means SELinux will be set as “permissive” for all nodes:

Next, we will choose Percona as the vendor with MySQL 8 and specify the “root” password. You may use the default location or other directories depending on your setup. Once all hosts have been added, we can start the deployment and let it finish.


 

As usual, we can monitor the status of the deployment in the “Activity” section of the UI. 

How To Configure SELinux For MySQL

Since all our clusters are MySQL-based, the steps to configure SELinux are the same for each of them. Before we start with the setup, and since this is a newly set up environment, we suggest you disable the auto recovery mode for both cluster and node as per the screenshot below. By doing this, we avoid the cluster running into a failover while we are testing or restarting the service:

First, let’s see what the context for “mysql” is. Go ahead and run the following command to view the context:

$ ps -eZ | grep mysqld_t

And the example of the output is as below:

system_u:system_r:mysqld_t:s0       845 ?        00:00:01 mysqld

The definition for the output above is:

  • system_u - User

  • system_r - Role

  • mysqld_t - Type

  • s0 - Sensitivity level (845 is the process ID from the ps output)

If you check the SELinux status, you will see that it is “permissive”, which means SELinux is not fully enabled yet. We need to change the mode to “enforcing”, and to make that permanent we have to edit the SELinux configuration file.
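
Before making the change, you can verify the current mode from the command line; at this point both commands should report “permissive”:

$ getenforce
$ sestatus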

$ vi /etc/selinux/config
SELINUX=enforcing

Proceed to reboot the system after the changes. As we are changing the mode from “permissive” to “enforcing”, we need to relabel the file system again. Typically, you can choose whether to relabel the entire file system or only one application. A relabel is required because “enforcing” mode needs the correct labels in order to function correctly, and in some instances those labels were changed while the system was running in “permissive” or “disabled” mode.

For this example, we will relabel only one application (MySQL) using the following command:

$ fixfiles -R mysqld restore

For a system that has been used for quite some time, it is a good idea to relabel the entire file system. The following command will do the job without rebooting, and the process might take a while depending on your system:

$ fixfiles -f -F relabel

Like many other databases, MySQL needs to read and write a lot of files. Without the correct SELinux context for those files, access will unquestionably be denied. To configure the SELinux policy, “semanage” is required. “semanage” also allows configuration changes without the need to recompile the policy sources. For the majority of Linux systems, this tool is already installed by default. In our case, it is already installed, with the following version:

$ rpm -qa |grep semanage
python3-libsemanage-2.9-3.el8.x86_64
libsemanage-2.9-3.el8.x86_64

For the system that does not have it installed, the following command will help you to install it:

$ yum install -y policycoreutils-python-utils

Now, let’s see what is the MySQL file contexts:

$ semanage fcontext -l | grep -i mysql

As you may notice, there are a bunch of files connected to MySQL once the above command is executed. If you recall, at the beginning we used the default “Server Data Directory”. If your installation uses a different data directory location, you need to update the context “mysqld_db_t”, which refers to /var/lib/mysql/.

The first step is to change the SELinux context by using any of these options:

$ semanage fcontext -a -t mysqld_db_t /var/lib/yourcustomdirectory
$ semanage fcontext -a -e /var/lib/mysql /var/lib/yourcustomdirectory

After the step above, run the following command:

$ restorecon -Rv /var/lib/yourcustomdirectory

And lastly, restart the service:

$ systemctl restart mysql

In some setups, a different log location may be required. For this situation, “mysqld_log_t” needs to be updated as well. “mysqld_log_t” is the context for the default location /var/log/mysqld.log, and the steps below can be executed to update it:

$ semanage fcontext -a -t mysqld_log_t "/your/custom/error.log"
$ restorecon -Rv /your/custom/error.log
$ systemctl restart mysql

There may also be a situation where MySQL is configured to use a port other than the default 3306. For example, if you are using port 3303 for MySQL, you need to define the SELinux context for it with the following command:

$ semanage port -a -t mysqld_port_t -p tcp 3303

And to verify that the port has been updated, you may use the following command:

$ semanage port -l | grep mysqld

Using audit2allow To Generate Policy

Another way to configure the policy is by using “audit2allow”, which was already installed along with “semanage” just now. What this tool does is pull the log events from audit.log and use that information to create a policy. Sometimes MySQL might need a non-standard policy, and this is the best way to achieve that.

First, let’s set the mode to permissive for the MySQL domain and verify the changes:

$ semanage permissive -a mysqld_t
$ semodule -l | grep permissive
permissive_mysqld_t
permissivedomains

The next step is to generate the policy using the command below:

$ grep mysqld /var/log/audit/audit.log | audit2allow -M {yourpolicyname}
$ grep mysqld /var/log/audit/audit.log | audit2allow -M mysql_new

You should see output like the following (it will differ depending on the policy name that you set):

******************** IMPORTANT ***********************

To make this policy package active, execute:

 

semodule -i mysql_new.pp

As stated, we need to execute “semodule -i mysql_new.pp” to activate the policy. Go ahead and execute it:

$ semodule -i mysql_new.pp

The final step is to put the MySQL domain back to the “enforcing” mode:

$ semanage permissive -d mysqld_t

libsemanage.semanage_direct_remove_key: Removing last permissive_mysqld_t module (no other permissive_mysqld_t module exists at another priority).

What Should You Do If SELinux is Not Working?

Very often, the SELinux configuration requires a lot of testing. One of the best ways to test the configuration is by changing the mode to “permissive”. If you want to set it only for the MySQL domain, you can use the following command; this is good practice because it avoids putting the whole system into “permissive” mode:

$ semanage permissive -a mysqld_t

Once everything is done, you may change the mode back to the “enforcing”:

$ semanage permissive -d mysqld_t

In addition to that, /var/log/audit/audit.log provides all logs related to SELinux. This log will help you a lot in identifying the root cause. All you have to do is filter for “denied” using “grep”:

$ more /var/log/audit/audit.log |grep "denied"

We are now finished with configuring the SELinux policy for MySQL-based systems. One thing worth mentioning is that the same configuration needs to be done on all nodes of your cluster, so you might need to repeat the same process for each of them.
