Entries Comments

Alex Mace’s Blog

Life & Web Development

Database Clustering – MySQL

I’ve been trying to get MySQL Cluster adopted recently in work. When we identified a requirement to provide high availability, MySQL Cluster seem the logical choice considering the application we were looking at already runs on a MySQL platform. The other alternative presented was MS SQL Server, which is used for a lot of other projects. My plan is to go through the two alternatives over the next couple of entries and then compare them and then I’ll tell you the actual outcome of our investigations in work.

First a bit of background. MySQL offers a number of different storage engines, which basically means you have different ways of storing your data in the database. The most common options are InnoDB and MyISAM, but MySQL Cluster is uses the NDB (Network DataBase) Storage Engine. NDB, as the name suggests, is a database that runs over a network. In this case, it distributes data across a number of machines.

NDB is ACID compliant and incorporates most of the “proper” RDBMS features you’d ever need. One of the limitations in the current version 5.0 is that all of the data is held in memory. This has the advantage of making access to the data itself fast, but the architecture of the cluster might actually make queries take longer. If you think about it, this makes sense; accessing data across a network is always going to be slower than accessing data held locally. The faster the network connection, the faster the access will become. MySQL specify you need a minimum of a 100Mb/s connection between machines, but 1000Mb/s is recommended. You will also need enough memory to store all of your data, indexes, etc. We’ll explore the implications of this later.

One of the big features of MySQL Cluster is that it uses a “shared nothing” architecture. This means that there is no “single point of failure” in the system, which is a large advantage over many other database high availability solutions, which usually have a weakness in this area because they store all of their data in one place. This is usually either a shared disk, or for larger clusters, a Storage Area Network (which cost 10s of thousands of pounds). MySQL Cluster does not require any special hardware; it is designed to run on commodity hardware that is widely, and cheaply available.

MySQL Cluster is made up of 3 different processes, mysqld, ndbd and ndb_mgmd. mysqld is the usual MySQL daemon that applications connect to. Usually when talking about MySQL Cluster we refer to this as an “SQL Node”. ndbd is the daemon that looks after part of the data in the cluster. This is referred to as a “Data Node”. Finally we have the ndb_mgmd daemon, referred to as a “Management Node”. This looks after the cluster, restarting nodes if they crash and carrying out arbitration between data nodes if somehow the data has got out of sync.

How does it all work? Apart from connecting to the data in a different way, SQL Nodes work exactly as they did before. Applications send queries to them and they return the data. Under the hood, these SQL nodes are now asking the Data nodes for the data. The advantage of this is that you don’t need to make any changes to your application whatsoever to use MySQL Cluster.

Each Data node stores a chunk of the complete database. How big a chunk depends on the number of replicas of the data you have defined in the configuration of MySQL Cluster, and how many data nodes there are in the system. So, for example, if you define 2 replicas of the database and have 4 data nodes, then each data node will be storing half of the database. As stated before, MySQL Cluster stores all of the data in memory, so you will need enough memory in each node to store half of the database in the example. In general, you can work out how much memory you will need per node with this formula.

( Size of Database * Number of Replicas * 1.1 ) / Number of Data nodes

The 1.1 is to give enough overhead to the database. Bear in mind that the size of the database includes all of the indexes, and that in 5.0, VARCHAR fields take up the maximum possible space, which could greatly increase the size of your database. It would be a good idea to consider potential growth of your data as you wouldn’t want to be upgrading again in the short term. In a future version of MySQL Cluster, it will be possible to store data on the disk of the Data nodes.
The data held is synchronously replicated across all Data nodes using a two-phase commit. When an update occurs, it is sent to all of the Data nodes and then once all nodes report they are ready then the update is fully commited to the database. This guarantees that the data is consistent across the cluster.

The Management node is the one that stores the configuration of the cluster and manages all of the other nodes. Technically the management node is only required at cluster start up. Once each other SQL and Data nodes have started and connected to the Management node, they download the configuration information and will run without the Management node. It is sensible to keep it running though as the Management node will restart any node that crashes, and also provides an Administration interface through the program ndb_mgm. This allows you to shutdown different nodes gracefully, as well as get reports on the current state of the cluster.

It all sounds good so far, but there are some gotchas to bear in mind. MySQL Cluster is not support on all of the operating systems that MySQL runs on. Linux, Solaris, Mac OS X are all fine, but Microsoft Windows is not at this time. Support is planned in a future version of MySQL Cluster.

As I said earlier, you don’t need to make changes to applications to use MySQL Cluster, but it would be a mistake not to. Remember, it going to be slower to accessing data across the network and the queries your application runs and schema of your database are the main determiners of the difference get data across the network will make. If you run a query on an un-indexed field, every record in the table will be returned to the SQL node, which then decides what matches and what doesn’t. If you’re looking for a small subset of records from a table, it’s obviously going to be slower to transfer the whole set rather than the subset you’re after.

Scalability is not necessarily a gotcha, but how you go about scaling the cluster beyond your initial setup is important to consider. In terms of upper limits, you can have up to 64 nodes in the cluster, of which 48 can be Data nodes. According to MySQL, it is possible to achieve linear scaling in performance using the right hardware.

If performance does become a problem it is important to look at the machines in the cluster and see which ones are suffering. Usually, it will be the SQL nodes, so you can simply add another and see if it benefits the cluster straight away. If it’s storage room you’re running out of, you can either increase the memory in the Data nodes, or add more Data nodes. Things are a little more complicated when adding Data nodes because Data nodes are split into “node groups” and each group must have an equal number in it. Simply put, this means that you might have problems adding just a single extra Data node, better to just plan to add two and put off the next upgrade for a while longer.

You can increase the number of Management nodes, but above one it becomes complicated as you must manually number all of the nodes in the cluster yourself. As said before, the Management node is not integral to the running of the cluster, so it may be easier to run with just one and let the nodes number themselves.

So that’s the technology of MySQL Cluster. I’ll take a look at support and licensing when comparing it with MS SQL Server. If you want to play around with it, all of the software you will need is available from MySQL free of charge under the GPL. If you want any help or have any questions leave a comment or e-mail me.

Tags: ,

4 Responses to “Database Clustering – MySQL”

  1. You cannot (currently) add data nodes as an online operation. that is, you have to take a backup, shutdown, start a new cluster and restore.

    You can add mysql server nodes with the cluster still up, running and processing transactions though.

    We plan to support online add node in a future release (not 5.1 though)

    Disk data is in 5.1
    as is variable sized attributes (”true” varchar)

  2. I’m looking for the follow up article.

    MySQL Cluster seems like a nice toy, but adding more storage seems to be a pain, the more customers you have the longer it takes to add a pair of storage nodes. At the same time, the more customers see an outage. And what if the total size of backup exceeds the biggest disk in company (with disk storage)? Having a TB backup is no problem with 8 storage nodes each having cheap 300MB drive.

  3. Alex says:

    Hi Ondrej,

    I’m writing it at the moment. Hopefully it’ll be out ASAP. As for adding storage nodes, yes you are going to have to take the Cluster down to do it, but that would almost certainly be true for any architecture like this. The key is to plan for growth so that you reduce the frequency you need to upgrade, and to plan the upgrade for times when it will have less effect on your service.

    With backups, that could provide a challenge. I think the best idea would be to look at that data and decide what you need to be highly available (and therefore stored in the cluster) and what could be stored seperately using replication for backup. There are always ways around these things though – you could get a 1TB NAS for about £1000, or backup individual table seperately. Unless you have a 1TB table, but that would seem unlikely.

  4. Alex Rutgers says:

    Perhaps some usefull experiance info with two MC versions.

    We have been testing 5.1.11 beta disk based and 5.0 ram based.

    5.1.11 beta is not yet stable enough for us for production as we found problems when a node was shut down, it never comes up again. We reported a defect and its being picked up, but we assume this version needs a bit more time to grow stable.

    With 5.0 we hit the 4GB (actual about 2.5GB) per process limit for 32 bit architectures as we have a serious filled database, and MySQL advises not to use multiple ndbd nodes on a single host, as we did a test with that, and also in this case the ndbd processes could not be restarted after a simple kill -9.

    We did not investigate MySQL on 64 bit yet, as that will serious cost and might open another tarpit (64 bit application support) we are afraid.

    I like the MC concepts and open source but the reality seems very tough and wonder how other companies cope fullfilling the performance and HA requirements using MC.

    Are there serious companies out there using MC in production for serious stuff? e.g. financial related systems?)

    Alex Rutgers.

Leave a Reply

(required)

(required)