集群

A.10. MySQL 5.0 FAQ — 集群

  • 26.10.1:
    "NDB" 是什么意思?

    What does “NDB” mean?

  • 26.10.2:
    使用集群和复制的区别是什么?

    What's the difference in using Cluster vs using replication?

  • 26.10.3:
    是否需要特定的网络结构才能使用集群呢?集群中的计算机是如何通信的?

    Do I need to do any special networking to run Cluster? How
    do computers in a cluster communicate?

  • 26.10.4:
    集群中需要多少台计算机呢?为什么?

    How many computers do I need to run a cluster, and why?

  • 26.10.5:
    集群中的计算机都是干吗的?

    What do the different computers do in a MySQL Cluster?

  • 26.10.6:
    什么样的操作系统平台下才能使用集群?

    With which operating systems can I use Cluster?

  • 26.10.7:
    使用集群是否有什么硬件要求?

    What are the hardware requirements for running MySQL Cluster?

  • 26.10.8:
    需要多大的内存呢?能否完全使用磁盘呢?

    How much RAM do I need? Is it possible to use disk memory at all?

  • 26.10.9:
    配置集群的时候,正在加载的进程非正常中断了,并且报错信息如下,为什么会这样?

    ERROR 1114: The table 'my_cluster_table' is full 

    I'm trying to populate a Cluster database. The loading process terminates prematurely and I get an error message like this one:

    ERROR 1114: The table 'my_cluster_table' is full
    

    Why is this happening?

  • 26.10.10:
    MySQL 集群使用TCP/IP协议,这是否意味着我可以跨越Internet网,把某些节点放在远程,来使用集群呢?

    MySQL Cluster uses TCP/IP. Does this mean that I can run it over the Internet, with one or more nodes in remote locations?

  • 26.10.11:
    我是否需要学习新的编程语言或者查询语言才能使用MySQL集群呢?

    Do I have to learn a new programming or query language to use MySQL Cluster?

  • 26.10.12:
    使用MySQL集群的时候,我怎么知道某些警告/错误信息是什么意思呢?

    How do I find out what an error or warning message means when using MySQL Cluster?

  • 26.10.13:
    MySQL集群是事务安全的吗?它支持什么样的事务隔离级别?

    Is MySQL Cluster transaction-safe? What isolation levels are supported?

  • 26.10.14:
    What storage engines are supported by MySQL Cluster?

  • 26.10.15:
    哪个MySQL版本才支持集群?是否需要从源代码中编译?

    Which versions of the MySQL software support Cluster? Do I have to compile from source?

  • 26.10.16:
    在发生灾难性事故时 - 比如说,整个城市都断电了,就连UPS也断了,那么我是否会丢失所有的数据呢?

    In the event of a catastrophic failure — say, for instance, the whole city loses power and my UPS fails — would I lose all my data?

  • 26.10.17:
    集群中是否能用 FULLTEXT 索引?

    Is it possible to use FULLTEXT indexes with Cluster?

  • 26.10.18:
    我能否在一个计算机上运行多个节点?

    Can I run multiple nodes on a single computer?

  • 26.10.19:
    能否不用重启就能在集群中加入一个节点?

    Can I add nodes to a cluster without restarting it?

  • 26.10.20:
    使用MySQL集群的时候是否有什么限制?

    Are there any limitations that I should be aware of when using MySQL Cluster?

  • 26.10.21:
    如何把已存在的一个MySQL数据库导入到集群中?

    How do I import an existing MySQL database into a cluster?

  • 26.10.22:
    集群中的节点和其他节点是如何通信的?

    How do cluster nodes communicate with one another?

  • 26.10.23:
    什么是"arbitrator(仲裁节点)"?

    What is an arbitrator?

  • 26.10.24:
    MySQL集群都支持什么数据类型?

    What data types are supported by MySQL Cluster?

  • 26.10.25:
    怎么启动和关闭MySQL集群?

    How do I start and stop MySQL Cluster?

  • 26.10.26:
    当集群关闭之后,集群中的数据会怎么办?

    What happens to cluster data when the cluster is shut down?

  • 26.10.27:
    集群中使用多个管理节点是否更好?

    Is it helpful to have more than one management node for a cluster?

  • 26.10.28:
    能否在MySQL集群中混合使用多种不同的硬件以及操作系统?

    Can I mix different kinds of hardware and operating systems in one MySQL Cluster?

  • 26.10.29:
    能否在一个主机上运行2个数据节点?或者2个SQL节点?

    Can I run two data nodes on a single host? Two SQL nodes?

  • 26.10.30:
    MySQL集群中能否是用主机名?

    Can I use hostnames with MySQL Cluster?

  • 26.10.31:
    在一个多个MySQL服务器的集群中如何处理MySQL用户?

    How do I handle MySQL users in a Cluster having multiple MySQL servers?

Questions and Answers

26.10.1:
What does “NDB” mean?

它的意思是: "网络数据库".NDB(也叫NDB Cluster或者NDBCLUSTER),使用它才能让MySQL支持集群.

This stands for
Network
Database
”.
NDB (also known as NDB
Cluster
or NDBCLUSTER) is the
storage engine that enables clustering in MySQL.

26.10.2:
What's the difference in using Cluster
vs using replication?

在复制系统中,一个MySQL主服务器会更新一个或多个从服务器.事务是顺序地提交的,因此一个慢事务就可能导致从服务器比主服务器落后一段时间.这也意味着,如果主服务器出错失败了,那么从服务器可能会缺少记录最后的那一小部分事务日志.如果使用的是事务安全存储引擎的话,例如InnoDB,那么事务日志则会完全记录到从服务器上去或者完全不记录,但是复制不能保证主和从服务器上的数据总是保持一致性.在MySQL集群中,所有的数据总是保持同步,在任何数据节点上提交的事务都同步到所有其他的数据节点上了.如果有一个数据节点失败了,其他正常的数据节点照样能保持数据的一致性.

In a replication setup, a master MySQL server updates one or
more slaves. Transactions are committed sequentially, and a
slow transaction can cause the slave to lag behind the
master. This means that if the master fails, it is possible
that the slave might not have recorded the last few
transactions. If a transaction-safe engine such as
InnoDB is being used, a transaction will
either be complete on the slave or not applied at all, but
replication does not guarantee that all data on the master
and the slave will be consistent at all times. In MySQL
Cluster, all data nodes are kept in synchrony, and a
transaction committed by any one data node is committed for
all data nodes. In the event of a data node failure, all
remaining data nodes remain in a consistent state.

In short, whereas standard MySQL replication is
asynchronous, MySQL Cluster is synchronous.

We have implemented (asynchronous) replication for Cluster
in MySQL 5.1. This includes the capability to replicate both
between two clusters, and from a MySQL cluster to a
non-Cluster MySQL server. However, we do not plan to
backport this functionality to MySQL 5.0.

26.10.3:
Do I need to do any special networking to run Cluster? How
do computers in a cluster communicate?

MySQL集群适合用于高速带宽的环境中,采用TCP/IP方式连接.它的性能跟主机间的连接速率有直接关系.集群中的最小速率要求是常规的100Mb以太网或者等同的网络.我们建议可能的话就采用G级网络.

MySQL Cluster is intended to be used in a high-bandwidth
environment, with computers connecting via TCP/IP. Its
performance depends directly upon the connection speed
between the cluster's computers. The minimum connectivity
requirements for Cluster include a typical 100-megabit
Ethernet network or the equivalent. We recommend you use
gigabit Ethernet whenever available.

The faster SCI protocol is also supported, but requires
special hardware. See
Section 15.10, “Using High-Speed Interconnects with MySQL Cluster”, for more
information about SCI.

26.10.4:
How many computers do I need to run a cluster, and why?

集群最少要求有3台计算机.不过我们建议最好是4台;有2台分别运行管理节点和SQL节点,另外2台作为数据节点.采取2台数据节点的目的是提高数据的冗余度,管理节点放在一个独立的主机上是为了能够保证在万一有一台数据节点失败的情况下提供仲裁服务.

A minimum of three computers is required to run a viable
cluster. However, the minimum
recommended number of
computers in a MySQL Cluster is four: one each to run the
management and SQL nodes, and two computers to serve as data
nodes. The purpose of the two data nodes is to provide
redundancy; the management node must run on a separate
machine to guarantee continued arbitration services in the
event that one of the data nodes fails.

To provide increased throughput and high availability, you
should use multiple SQL nodes (MySQL Servers connected to
the cluster). It is also possible (although not strictly
necessary) to run multiple management servers.

26.10.5:
What do the different computers do in a MySQL Cluster?

MySQL集群既又物理也有逻辑组织,每台计算机都作为物理基础元素.集群中的逻辑基础元素称为 节点,集群中的每个逻辑和功能基础元素也称为 集群主机.共有3中节点类型,每个在集群中都有对应的不同角色,分别如下:

  • 管理节点(MGM node): 提供整个集群的管理服务,包括启动,关闭,备份,数据配置等.管理节点是由 ndb_mgmd 应用程序来提供的;管理节点的客户端用 ndb_mgm.

  • 数据节点: 存储和复制数据.数据节点的功能都是由一个NDB数据节点进程 ndbd 来处理的.

  • SQL 节点: 这只是一个简单的MySQL服务器实例(mysqld),它启动的时候使用--ndb-cluster选项,使之支持 NDB Cluster 存储引擎.

A MySQL Cluster has both a physical and logical
organization, with computers being the physical elements.
The logical or functional elements of a cluster are referred
to as nodes, and a computer housing a
cluster node is sometimes referred to as a
cluster host. There are three types
of nodes, each correspondin to a specific role within the
cluster. These are:


  • Management node (MGM
    node)
    : Provides management services for the
    cluster as a whole, including startup, shutdown,
    backups, and configuration data for the other nodes. The
    management node server is implemented as the application
    ndb_mgmd; the management client used
    to control MySQL Cluster via the MGM node is
    ndb_mgm.

  • Data node: Stores and
    replicates data. Data node functionality is handled by
    an instance of the NDB data node process
    ndbd.

  • SQL node: This is
    simply an instance of MySQL Server
    (mysqld) that is built with support
    for the NDB Cluster storage engine
    and started with the --ndb-cluster
    option to enable the engine.

26.10.6:
With which operating systems can I use Cluster?

MySQL集群支持大部分的类Unix操作系统,包括Linux, Mac OS X, Solaris, BSD, HP-UX, AIX, IRIX,除此之外,还有Novell Netware.现在MySQL集群还不支持Windows.不过,我们正在努力,使得让集群都能支持其他平台,例如Windows,我们的目标是让MySQL支持的所有平台上也都能支持集群.
详情请看 http://www.mysql.com/support/supportedplatforms.html.

MySQL Cluster is supported on most Unix-like operating
systems, including Linux, Mac OS X, Solaris, BSD, HP-UX,
AIX, and IRIX, among others, as well as Novell Netware.
Cluster is not supported for Windows at this time. However,
we are working to add Cluster support for other platforms,
including Windows, and our goal is to offer MySQL Cluster on
all platforms for which MySQL itself is supported.

For more detailed information concerning the level of
support which is offered for MySQL Cluster on various
operating system versions, OS distributions, and hardware
platforms, please refer to
http://www.mysql.com/support/supportedplatforms.html.

26.10.7:
What are the hardware requirements for running MySQL
Cluster?

MySQL集群可以运行在任何启用NDB的平台上.显然,CPU越快,内存越大,对集群性能提升越明显,64位的CPU也可能比32位的处理器更快.每个作为数据借点的机器都必须有足够的内存来保存共享数据库(详情请看 How much RAM do I Need?).节点之间通过TCP/IP网络和硬件通信.想要使用SCI的话,就必须有特定的网络设备了.

Cluster should run on any platform for which NDB-enabled
binaries are available. Naturally, faster CPUs and more
memory will improve performance, and 64-bit CPUs will likely
be more effective than 32-bit processors. There must be
sufficient memory on machines used for data nodes to hold
each node's share of the database (see How much
RAM do I Need?
for more information). Nodes can
communicate via a standard TCP/IP network and hardware. For
SCI support, special networking hardware is required.

26.10.8:
How much RAM do I need? Is it possible to use disk
memory at all?

在MySQl 5.0中,集群只能基于内存.意思是所有表的数据(包括索引)都保存在内存中.如果你的数据有1GB那么大,你想要复制一份到集群中的话,那么就必须要2GB的内存才行(每份复制占用1GB),这是运行集群的计算机上相对其他操作系统额外要求的内存.

如果一个数据节点上的内存使用超出了可用的范围,则操作系统会使用交换内存来达到上限值 DataMemory.不过这会导致性能严重下降,并且可能导致相应时间变慢.正是由于这个原因,我们不推荐在生产环境中使用磁盘交换空间.在任何情况下,只要达到 DataMemory 上限了,那么所有的操作请求(比如插入)都会失败.

我们已经在MySQL 5.1中实现了基于磁盘存储的集群,但是并没有打算在5.0中也加入这个功能.

可以使用以下公式来估算一个集群数据节点大概需要多少内存:

(SizeofDatabase × NumberOfReplicas × 1.1 ) / NumberOfDataNodes

想要计算确切的需要多少内存,就需要找到数据库中每个表的每条记录占用多少空间,然后乘以每个表的记录数,每条记录占用多少空间请参考( href="/docs/MySQL_51_zh/mysql_51.html" target="_blank" title="11.5. Data Type Storage Requirements">Section 11.5, “Data Type Storage Requirements”).
还必须关注以下几点关于每个字段的索引:

  • NDBCluster 表中的每个主键或者哈希索引记录都需要 21-25 个字节.这些索引使用 IndexMemory.

  • 每个排序索引记录需要 10 字节,它们使用 DataMemory.

  • 创建主键或唯一索引的同时也创建排序索引,除非创建索引时指定 USING HASH 了.换句话说就是:

    • 集群数据表中的每个主键或者唯一索引记录通常占用 31-35 字节.

    • 尽管如此,如果创建主键或者唯一索引时指定 USING HASH,那么每个记录只需要 21-25 字节.

注意,MySQL集群中创建主键或唯一索引是如果指定 USING HASH的话,那么通常会导致那个表运行起来较快 -
有些时候,更新数据的话通常能比不用 USING HASH 的表快
20-30%.这是由于它只需要较少的内存(因为没有创建排序索引),用到比较少的CPU资源(因为只需读取/更新较少的索引).不过,这也意味着如果查询不是使用范围(range)扫描的话,就会导致查询比较慢.

计算集群需要多少内存是,你可以从 MySQLForge 上找到一个工具,叫做 ndb_size.pl.这个Perl脚本连接到当前的MySQL服务器(不是MySQL集群)上,然后报告如果当前的数据库如果转换成 NDBCluster 需要多少空间.详情请看 Section 15.9.13, "ndb_size.pl".

需要特别注意: 每个MySQL集群表都需要主键.如果没有定义主键,则 NDB 存储引擎会自动创建一个,并且不是指定 USING HASH 的方式.

在指定的任何时候都无法十分准确地计算出MySQL集群索引所需要多少内存;不过,当使用了超过80%的 DataMemoryIndexMemory 的时候,就会写一条信息到日志中,然后是85%,90%,依此类推.

In MySQL-5.0, Cluster is in-memory only. This
means that all table data (including indexes) is stored in
RAM. Therefore, if your data takes up 1GB of space and you
want to replicate it once in the cluster, you need 2GB of
memory to do so (1 GB per replica). This is in addition to
the memory required by the operating system and any
applications running on the cluster computers.

If a data node's memory usage exceeds what is available in
RAM, then the system will attempt to use swap space up to
the limit set for DataMemory. However,
this will at best result in severely degraded performance,
and may cause the node to be dropped due to slow response
time (missed heartbeats). We do not recommend on relying on
disk swapping in a production environment for this reason.
In any case, once the DataMemory limit is
reached, any operations requiring additional memory (such as
inserts) will fail.

(We have implemented disk data storage for MySQL Cluster in
MySQL 5.1, but we have no plans to add this capability in
MySQL 5.0.)

You can use the following formula for obtaining a rough
estimate of how much RAM is needed for each data node in the
cluster:

(SizeofDatabase × NumberOfReplicas × 1.1 ) / NumberOfDataNodes

To calculate the memory requirements more exactly requires
determining, for each table in the cluster database, the
storage space required per row (see
Section 11.5, “Data Type Storage Requirements”, for details), and
multiplying this by the number of rows. You must also
remember to account for any column indexes as follows:

  • Each primary key or hash index created for an
    NDBCluster table requires 21–25
    bytes per record. These indexes use
    IndexMemory.

  • Each ordered index requires 10 bytes storage per record,
    using DataMemory.

  • Creating a primary key or unique index also creates an
    ordered index, unless this index is created with
    USING HASH. In other words:

    • A primary key or unique index on a Cluster table
      normally takes up 31 to 35 bytes per record.

    • However, if the primary key or unique index is
      created with USING HASH, then it
      requires only 21 to 25 bytes per record.

Note that creating MySQL Cluster tables with USING
HASH
for all primary keys and unique indexes will
generally cause table updates to run more quickly — in
some cases by a much as 20 to 30 percent faster than updates
on tables where USING HASH was not used
in creating primary and unique keys. This is due to the fact
that less memory is required (because no ordered indexes are
created), and that less CPU must be utilized (because fewer
indexes must be read and possibly updated). However, it also
means that queries that could otherwise use range scans must
be satisfied by other means, which can result in slower
selects.

When calculating Cluster memory requirements, you may find
useful the ndb_size.pl utility which is
available on
MySQLForge.
This Perl script connects to a current MySQL (non-Cluster)
database and creates a report on how much space that
database would require if it used the
NDBCluster storage engine. For more
information, see
Section 15.9.13, “ndb_size.pl.

It is especially important to keep in mind that
every MySQL Cluster table must have a primary
key
. The NDB storage engine
creates a primary key automatically if none is defined, and
this primary key is created without USING
HASH
.

There is no easy way to determine exactly how much memory is
being used for storage of Cluster indexes at any given time;
however, warnings are written to the Cluster log when 80% of
available DataMemory or
IndexMemory is in use, and again when use
reaches 85%, 90%, and so on.

26.10.9:
I'm trying to populate a Cluster database. The loading
process terminates prematurely and I get an error message
like this one:

ERROR 1114: The table 'my_cluster_table' is full

Why is this happening?

原因很有可能是因为你设置的内存不足以装下所有的数据表及其索引,包括 NDB 存储引擎中所需的主键以及如果没有定义主键时自动创建的索引.

所有的数据节点的内存大小都要一样,由于集群中任何数据节点都不能使用比其他数据节点最小内存还多的内存.换句话说,如果集群中有4台计算机,如果有3台计算机的内存都是3GB,而另外一台只有1GB,那么每个数据节点最多只能拿出1GB内存用于集群.

The cause is very likely to be that your setup does not
provide sufficient RAM for all table data and all indexes,
including the primary key required by the
NDB storage engine and automatically
created in the event that the table definition does not
include the definition of a primary key
.

It is also worth noting that all data nodes should have the
same amount of RAM, since no data node in a cluster can use
more memory than the least amount available to any
individual data node. In other words, if there are four
computers hosting Cluster data nodes, and three of these
have 3GB of RAM available to store Cluster data while the
remaining data node has only 1GB RAM, then each data node
can devote only 1GB to clustering.

26.10.10:
MySQL Cluster uses TCP/IP. Does this mean that I can run it
over the Internet, with one or more nodes in remote
locations?

在这种条件下,那么MySQL集群不大可能有稳定的表现,因为MySQL集群是设计和实现在良好的高速连接速率环境下,如100Mbps LAN或者G级以太网 - 后者尤佳.我们没有在更慢的条件下做测试因此也不能保证它的性能.

同样地,要清楚地意识到MySQL集群的2个节点之间的通信是不安全的;它们没有经过任何保护机制加密或者防护.安全的集群是放在防火墙之内的私网中,在外界中无法直接访问数据和管理节点(SQL节点也要和其他MySQL服务器一样注意安全防护).

It is very unlikely that a cluster
would perform reliably under such conditions, as MySQL
Cluster was designed and implemented with the assumption
that it would be run under conditions guaranteeing dedicated
high-speed connectivity such as that found in a LAN setting
using 100 Mbps or gigabit Ethernet — preferably the
latter. We neither test nor warrant its performance using
anything slower than this.

Also, it is extremely important to keep in mind that
communications between the nodes in a MySQL Cluster are not
secure; they are neither encrypted nor safeguarded by any
other protective mechanism. The most secure configuration
for a cluster is in a private network behind a firewall,
with no direct access to any Cluster data or management
nodes from outside. (For SQL nodes, you should take the same
precautions as you would with any other instance of the
MySQL server.)

26.10.11:
Do I have to learn a new programming or query language to
use MySQL Cluster?

不需要.尽管有一些特定的命令使用来管理和设置集群的,但其他标准的(My)SQL查询和命令如下:

  • 创建/修改/删除表

  • 插入/更新/删除数据

  • 创建/修改主键/唯一索引

设置MySQL集群所需的特定配置参数和文件请看 Section 15.4.4, “Configuration File”.

MySQL集群中一些简单的管理客户端命令,如用来启动或关闭集群节点等.详情请看 Section 15.7.2, “Commands in the Management Client”.

No. Although some specialized commands are used to manage
and configure the cluster itself, only standard (My)SQL
queries and commands are required for the following
operations:

  • Creating, altering, and dropping tables

  • Inserting, updating, and deleting table data

  • Creating, changing, and dropping primary and unique
    indexes

Some specialized configuration parameters and files are
required to set up a MySQL Cluster — see
Section 15.4.4, “Configuration File”, for information
about these.

A few simple commands are used in the MySQL Cluster
management client for tasks such as starting and stopping
cluster nodes. See
Section 15.7.2, “Commands in the Management Client”.

26.10.12:
How do I find out what an error or warning message means
when using MySQL Cluster?

有2中方法,如下:

  • mysql 客户端中,执行命令 SHOW ERRORSSHOW WARNINGS 就能直接查看报错或警告信息.在 MySQL Query Browser 中也能显示错误及警告信息.

  • 在系统shell提示符中,执行命令 perror --ndb error_code.

There are two ways in which this can be done:

  • From within the mysql client, use
    SHOW ERRORS or SHOW
    WARNINGS
    immediately upon being notified of
    the error or warning condition. Errors and warnings also
    be displayed in MySQL Query Browser.

  • From a system shell prompt, use perror --ndb
    error_code
    .

26.10.13:
Is MySQL Cluster transaction-safe? What isolation levels are
supported?

是的.NDB 存储引擎的表都支持事务.在MySQL 5.0中,集群只支持 READ COMMITTED 隔离级别.

Yes: For tables created with the
NDB storage engine, transactions are
supported. In MySQL 5.0, Cluster supports only
the READ COMMITTED transaction isolation
level.

26.10.14:
What storage engines are supported by MySQL Cluster?

MySQL集群只支持 NDB 存储引擎.也就是说,想要让一个表在集群节点中共享,就必须指定 ENGINE=NDB(或 ENGINE=NDBCLUSTER 也一样).

MySQL集群中也可以使用 MyISAMInnoDB 存储引擎来创建数据表,但是那些非 NDB
的表不会存储在集群节点间共享;它们独立于创建的MySQL服务器或者实例中.

Clustering in MySQL is supported only by the
NDB storage engine. That is, in order for
a table to be shared between nodes in a cluster, it must be
created using ENGINE=NDB (or
ENGINE=NDBCLUSTER, which is equivalent).

It is possible to create tables using other storage engines
(such as MyISAM or
InnoDB) on a MySQL server being used for
clustering, but these non-NDB tables will
not participate in the
cluster; they are local to the individual MySQL server
instance on which they are created.

26.10.15:
Which versions of the MySQL software support Cluster? Do I
have to compile from source?

在MySQL 5.0的全部MySQL-max二进制发布系列中都支持集群,下文列出了想要注意的例外情况.你可以使用语句 SHOW VARIABLES LIKE 'have_%'SHOW ENGINES 来确定是否支持 NDB 存储引擎,详情请看 Section 5.3, “The mysqld-max Extended MySQL Server”.

Linux用户需要注意,标准的MySQL服务器RPM包中 没有 包含 NDB.从MySQL 5.0.4开始,RPM包开始把 NDB 存储引擎和附带的管理工具分开;详情请查看MySQL 5.0下载页面中关于NDB RPM下载的部分 (在 5.0.4 以前,必须下载 -max.tar.gz 压缩包.以后还可能会这样,但不是必须的,可能使用你更习惯的Linux RPM包管理器).也可以从 -max 源码中编译支持 NDB,不过没必要这么麻烦.想要下载最新版本,请看 http://dev.mysql.com/downloads/mysql/5.0.html.
-max

MySQL Cluster is supported in all MySQL-max binaries in the
5.0 release series, except as noted in the
following paragraph. You can determine whether your server
has NDB support using either the
SHOW VARIABLES LIKE 'have_%' or
SHOW ENGINES statement. (See
Section 5.3, “The mysqld-max Extended MySQL Server”, for more information.)

Linux users, please note that NDB is
not included in the standard MySQL
server RPMs. Beginning with MySQL 5.0.4, there are separate
RPM packages for the NDB storage engine and accompanying
management and other tools; see the NDB RPM Downloads
section of the MySQL 5.0 Downloads page for
these. (Prior to 5.0.4, you had to use the
-max binaries supplied as
.tar.gz archives. This is still
possible, but is not required, so you can use your Linux
distribution's RPM manager if you prefer.) You can also
obtain NDB support by compiling the -max
binaries from source, but it is not necessary to do so
simply to use MySQL Cluster. To download the latest binary,
RPM, or source distribution for MySQL 5.0,
visit
http://dev.mysql.com/downloads/mysql/5.0.html.

26.10.16:
In the event of a catastrophic failure — say, for
instance, the whole city loses power
and my UPS fails — would I lose
all my data?

所有提交的事务都已经记录到日志中了.因此,在一个突发的灾难中可能会丢失一些数据,但也是很有限.通过减少事务中的操作数量,也能减少丢失的数据.(在任何事务中总是执行大量的操作很不明智)

All committed transactions are logged. Therefore, although
it is possible that some data could be lost in the event of
a catastrophe, this should be quite limited. Data loss can
be further reduced by minimizing the number of operations
per transaction. (It is not a good idea to perform large
numbers of operations per transaction in any case.)

26.10.17:
Is it possible to use FULLTEXT indexes
with Cluster?

在MySQL 5.0中,NDB 存储引擎不支持 FULLTEXT 索引,其他除了 MyISAM 存储引擎外也不支持.我们正努力使将来的发布版本中也能支持.

FULLTEXT indexing is not supported by the
NDB storage engine in MySQL
5.0, or by any storage engine other than
MyISAM. We are working to add this
capability in a future release.

26.10.18:
Can I run multiple nodes on a single computer?

可以,但不建议.首先,使用集群就是为了提高数据冗余度,因此每个节点最好是在不同的计算机上.如果把多个节点放在同一个计算机上,那么如果它宕机了,就会丢失这些节点了.MySQL集群可以运行在低成本(无成本)的操作系统,基于商用机之上;为了保护重要不能丢失的数据,那么扩展一两个额外的机器是值得的.这个任务在由一台200MHz奔腾CPU和足够内存的操作系统上的 ndb_mgmdndb_mgm 进程来完成,并且开销很少.

如果是出于学习或者测试目的把MySQL集群中的多个数据节点放在一个机器上是可以接受的;不过,这 不能 用于生产环境.

It is possible but not advisable. One of the chief reasons
to run a cluster is to provide redundancy. To enjoy the full
benefits of this redundancy, each node should reside on a
separate machine. If you place multiple nodes on a single
machine and that machine fails, you lose all of those nodes.
Given that MySQL Cluster can be run on commodity hardware
loaded with a low-cost (or even no-cost) operating system,
the expense of an extra machine or two is well worth it to
safeguard mission-critical data. It also worth noting that
the requirements for a cluster host running a management
node are minimal. This task can be accomplished with a 200
MHz Pentium CPU and sufficient RAM for the operating system
plus a small amount of overhead for the
ndb_mgmd and ndb_mgm
processes.

It is acceptable to run multiple cluster data nodes on a
single host for learning about MySQL Cluster, or for testing
purposes; however, this is not
supported for production use.

26.10.19:
Can I add nodes to a cluster without restarting it?

现在还不行.往集群中添加一个新的MGM或者SQL节点需要重启.添加新的数据节点的话就更复杂了,需要以下几个步骤:

  1. 备份集群中的所有数据.

  2. 完全关闭集群及集群节点的相关进程.

  3. 附加选项 --initial 启动集群.

  4. 把备份数据还原到集群中.

在未来的MySQL 集群发布系列版本中,我们希望能实现 "热"(在线) 方式重设集群的功能,以减少增加新节点带来的麻烦.不过,在MySQL 5.0中不会实现.

Not at present. A simple restart is all that is required for
adding new MGM or SQL nodes to a Cluster. When adding data
nodes the process is more complex, and requires the
following steps:

  1. Make a complete backup of all Cluster data.

  2. Completely shut down the cluster and all cluster node
    processes.

  3. Restart the cluster, using the
    --initial startup option.

  4. Restore all cluster data from the backup.

In a future MySQL Cluster release series, we hope to
implement a “hot” reconfiguration capability
for MySQL Cluster to minimize (if not eliminate) the
requirement for restarting the cluster when adding new
nodes. However, this is not planned for MySQL
5.0.

26.10.20:
Are there any limitations that I should be aware of when
using MySQL Cluster?

MySQL 5.0中的 NDB 表存在以下局限性:

  • 不支持临时表;执行 CREATE TEMPORARY TABLE 语句时如果指定 ENGINE=NDBENGINE=NDBCLUSTER 参数就会报错.

  • 不支持 FULLTEXT 索引以及前缀索引.只能对整个字段进行索引.

  • 不支持空间数据类型.详情请看 Chapter 16, Spatial Extensions.

  • 只支持完全的回滚事务.不支持部分回滚和回滚到保存点也不支持.

  • 一个表最大的属性数为128,并且每个属性名不能超过31个字符.对每个表来说,数据库和表明的联合长度最大是122个字符.

  • 表中一条记录最大值为8KB,不包含 BLOB 的值.表的大小取决于各种制约因素,尤其是每个数据节点最大的可用内存.

  • NDB 存储引擎不支持外键.跟 MyISAM 一样,它们都不支持.

想要知道更多的限制,请查看 Section 15.11, “Known Limitations of MySQL Cluster”.

Limitations on NDB tables in MySQL
5.0 include:

  • Temporary tables are not supported; a CREATE
    TEMPORARY TABLE
    statement using
    ENGINE=NDB or
    ENGINE=NDBCLUSTER fails with an
    error.

  • FULLTEXT indexes and index prefixes
    are not supported. Only complete columns may be indexed.

  • Spatial data types are not supported. See
    Chapter 16, Spatial Extensions.

  • Only complete rollbacks for transactions are supported.
    Partial rollbacks and rollbacks to save points are not
    supported.

  • The maximum number of attributes allowed per table is
    128, and attribute names cannot be any longer than 31
    characters. For each table, the maximum combined length
    of the table and database names is 122 characters.

  • The maximum size for a table row is 8 kilobytes, not
    counting BLOB values. There is no set
    limit for the number of rows per table. Table size
    limits depend on a number of factors, in particular on
    the amount of RAM available to each data node.

  • The NDB engine does not support
    foreign key constraints. As with
    MyISAM tables, these are ignored.

For a complete listing of limitations in MySQL Cluster, see
Section 15.11, “Known Limitations of MySQL Cluster”.

26.10.21:
How do I import an existing MySQL database into a cluster?

你可以把各种版本的MySQL数据导入到集群中去.除了本FAQ中其他地方提到的限制条件 title="15.11. Known Limitations of MySQL Cluster">Section 15.11, “Known Limitations of MySQL
Cluster”,唯一的要求就是要导入的表必须是 NDB 存储引擎,也就是用 ENGINE=NDBENGINE=NDBCLUSTER方式创建的表.

It is also possible to convert existing tables using other
storage engines to NDB Cluster using one
or more ALTER TABLE statement, but this
requires an additional workaround. See
Section 15.11, “Known Limitations of MySQL Cluster”, for details.

26.10.22:
How do cluster nodes communicate with one another?

集群节点间可以通过一下方式来通信:TCP/IP, SHM(共享内存), SCI(可伸缩一致性接口).如果可能,安装在同一台集群主机上的多个节点间默认用SHM来通信;不过,在MySQL 5.0中这只是出于试验目的.SCI是高速的(每秒1GB或更高),高可靠协议用于搭建可伸缩的多处理器系统;它需要有特殊的硬件以及驱动.详情请看 Section 15.10, “Using High-Speed Interconnects with MySQL Cluster”.

Cluster nodes can communicate via any of three different
protocols: TCP/IP, SHM (shared memory), and SCI (Scalable
Coherent Interface). Where available, SHM is used by default
between nodes residing on the same cluster host; however,
this is considered experimental in MySQL 5.0.
SCI is a high-speed (1 gigabit per second and higher),
high-availability protocol used in building scalable
multi-processor systems; it requires special hardware and
drivers. See Section 15.10, “Using High-Speed Interconnects with MySQL Cluster”,
for more about using SCI as a transport mechanism in MySQL
Cluster.

26.10.23:
What is an arbitrator?

如果一个和多个节点宕机(或者其他原因失败了),那么就需要让其他节点能 看见 这种情况.实际上,在分区网络中,有可能会出现2组节点互相隔绝,这也叫做 "脑裂".出现这种情况当然不好了,因为每组节点都想认为自己是集群中的一分子.

当集群的某个节点宕了,则有2种可能性.如果超过50%的节点能互通,有时也叫 "多数原则",则这组节点被认为是集群.如果这组节点的数量正好是一半,那么仲裁服务器该发挥作用了:这种情况下,仲裁服务器所在的那组节点被当做是集群,而另外那组则被关闭.

前面的信息有些简单了.以下是一个比较复杂的节点分组案例解析:

如果至少一个节点分组中的所有节点都是可用的,那么分区网络就不在此范围,因为这样的话集群中的任意部分都不能组成新的集群.只有在没有任何一个节点分组的节点中全部都可用时才有问题,这个时候就有可能出现分区网络("脑裂")了.这时就需要有仲裁了.所有的集群节点都认可一个仲裁服务器,那就是管理服务器;不过,可以把集群中的任意服务器配置称为仲裁服务器.仲裁服务器接受第一组集群节点,连接它们,并且关闭剩下的其他集群组.仲裁服务器的选择是由MySQL服务器和管理节点服务器的 ArbitrationRank 参数控制的(详情请看 Section 15.4.4.4, “Defining the Management Server”) 同时也要注意到,选定作为仲裁角色的服务器无需承担高负荷,因此它不要求速度很快和内存很大.

If one or more nodes in a cluster fail, it is possible that
not all cluster nodes will be able to “see” one
another. In fact, it is possible that two sets of nodes
might become isolated from one another in a network
partitioning, also known as a “split brain
scenario. This type of situation is undesirable because each
set of nodes tries to behave as though it is the entire
cluster.

When cluster nodes go down, there are two possibilities. If
more than 50% of the remaining nodes can communicate with
each other, we have what is sometimes called a
majority rules” situation, and this set of
nodes is considered to be the cluster. The arbitrator comes
into play when there is an even number of nodes: in such
cases, the set of nodes to which the arbitrator belongs is
considered to be the cluster, and nodes not belonging to
this set are shut down.

The preceding information is somewhat simplified. A more
complete explanation taking into account node groups
follows:

When all nodes in at least one node group are alive, network
partitioning is not an issue, because no one portion of the
cluster can form a functional cluster. The real problem
arises when no single node group has all its nodes alive, in
which case network partitioning (the
split-brain” scenario) becomes possible. Then
an arbitrator is required. All cluster nodes recognize the
same node as the arbitrator, which is normally the
management server; however, it is possible to configure any
of the MySQL Servers in the cluster to act as the arbitrator
instead. The arbitrator accepts the first set of cluster
nodes to contact it, and tells the remaining set to shut
down. Arbitrator selection is controlled by the
ArbitrationRank configuration parameter
for MySQL Server and management server nodes. (See
Section 15.4.4.4, “Defining the Management Server”, for
details.) It should also be noted that the role of
arbitrator does not in and of itself impose any heavy
demands upon the host so designated, and thus the arbitrator
host does not need to be particularly fast or to have extra
memory especially for this purpose.

26.10.24:
What data types are supported by MySQL Cluster?

MySQL集群支持所有常用的数据类型,除了跟MySQL相关的空间扩展类型(详情请看 Chapter 16, Spatial Extensions).另外,NDB
表的索引也有些不同. 注意: MySQL集群表(即 NDBNDBCLUSTER 类型表)只支持固定长度记录.这也意味着(举例)如果有一条记录包含有 VARCHAR(255) 字段,那么它就会需要用到255个字符的空间(和数据表使用的字符集和校验所要求的空间一样大),而不管实际存储的字符数.这个问题在未来的发行版中会得到解决.

详情请看 Section 15.11, “Known Limitations of MySQL Cluster”.

MySQL Cluster supports all of the usual MySQL data types,
with the exception of those associated with MySQL's spatial
extensions. (See Chapter 16, Spatial Extensions.) In
addition, there are some differences with regard to indexes
when used with NDB tables.
Note: MySQL Cluster tables
(that is, tables created with
ENGINE=NDBCLUSTER) have only fixed-width
rows. This means that (for example) each record containing a
VARCHAR(255) column will require space
for 255 characters (as required for the character set and
collation being used for the table), regardless of the
actual number of characters stored therein. This issue is
expected to be fixed in a future MySQL release series.

See Section 15.11, “Known Limitations of MySQL Cluster”, for more
information about these issues.

26.10.25:
How do I start and stop MySQL Cluster?

需要按以下顺序分别启动集群中的各个节点:

  1. ndb_mgmd 命令启动管理节点.

  2. ndbd 分别启动每个数据节点.

  3. mysqld_safe --user=mysql & 命令启动每个SQL节点.

上面的命令都需要在系统的shell下运行(在服务器上执行或通过终端都可以).可以在管理节点上的 ndb_mgm 客户端工具中执行 SHOW ALL STATUS 命令来确认集群是否在运行状态.

在 MGM 客户端中运行 SHUTDOWN 命令即可关闭运行中的集群.另外,也可以在管理节点主机中执行一下命令来关闭:

shell> ndb_mgm -e "SHUTDOWN"

注意,括号中的内容可以是别的,并且 SHUTDOWN 可以不区分大小写.

这些命令的任何一个都能平和地关闭 ndb_mgm, ndb_mgmd(笔误?), ndbd 中的任何进程.运行MySQL集群的SQL节点服务器则通过 mysqladmin shutdown 来关闭.

更多详情请看 Section 15.7.2, “Commands in the Management Client”Section 15.3.6, “Safe Shutdown and Restart”.

It is necessary to start each node in the cluster
separately, in the following order:

  1. Start the management node with the
    ndb_mgmd command.

  2. Start each data node with the ndbd
    command.

  3. Start each MySQL server (SQL node) using
    mysqld_safe --user=mysql &.

Each of these commands must be run from a system shell on
the machine housing the affected node. (You do not have to
be physically present at the machine — a remote login
shell can be used for this purpose.) You can verify that the
cluster is running by starting the MGM management client
ndb_mgm on the machine housing the MGM
node and issuing the SHOW or ALL
STATUS
command.

To shut down a running cluster, issue the command
SHUTDOWN in the MGM client.
Alternatively, you may enter the following command in a
system shell on the machine hosting the MGM node:

shell> ndb_mgm -e "SHUTDOWN"

(Note that the quotation marks are optional here; the
SHUTDOWN command itself is not
case-sensitive.)

Either of these commands causes the
ndb_mgm, ndb_mgm, and
any ndbd processes to terminate
gracefully. MySQL servers running as Cluster SQL nodes can
be stopped using mysqladmin shutdown.

For more information, see
Section 15.7.2, “Commands in the Management Client”, and
Section 15.3.6, “Safe Shutdown and Restart”.

26.10.26:
What happens to cluster data when the cluster is shut down?

本来保存在内存中的集群数据都会写到磁盘中,下次集群启动的时候就会重新载入到内存中.

The data that was held in memory by the cluster's data nodes
is written to disk, and is reloaded into memory the next
time that the cluster is started.

26.10.27:
Is it helpful to have more than one management node for a
cluster?

这有助于自动防故障.任何时候只能有一个MGM节点来控制集群,但是也可以配置一个主MGM控制器,一个或者多个附加控制器作为主控制器发生故障时的备用.
配置集群管理节点详情请看 Section 15.4.4, “Configuration File”,

It can be helpful as a fail-safe. Only one MGM node controls
the cluster at any given time, but it is possible to
configure one MGM as primary, and one or more additional
management nodes to take over in the event that the primary
MGM node fails.

See Section 15.4.4, “Configuration File”, for
information on how to configure MySQL Cluster management
nodes.

26.10.28:
Can I mix different kinds of hardware and operating systems
in one MySQL Cluster?

是的.只要所有的机器和操作系统都有 "endianness"
即可(所有的大尾或小尾,这个词比较难理解,请自己用搜索引擎搜一下看看).不同节点间还有可能使用不同版本的MySQL集群发行版.不过,我们推荐这只是作为滚动升级步骤的一部分,详情请看 Section 15.5.1, “Performing a Rolling Restart of the Cluster”.

Yes, so long as all machines and operating systems have the
same “endianness” (all big-endian or all
little-endian). It is also possible to use different MySQL
Cluster releases on different nodes. However, we recommend
this be done only as part of a rolling upgrade procedure
(see Section 15.5.1, “Performing a Rolling Restart of the Cluster”).

26.10.29:
Can I run two data nodes on a single host? Two SQL nodes?

是的,可以这么做.如果有多个数据节点,我们建议(不是必须)不同的数据目录都有各自的节点.如果想要在同一个机器上运行多个数据节点,那么每个
mysqld 实例就必须使用不同的 TCP/IP 端口了.不过,在每个机器上运行多个集群节点不支持生产环境使用.

Yes, it is possible to do this. In the case of multiple data
nodes, it is advisable (but not required) for each node to
use a different data directory. If you want to run multiple
SQL nodes on one machine, each instance of
mysqld must use a different TCP/IP port.
However, running more than one cluster node of a
given type per machine is not supported for production
use
.

26.10.30:
Can I use hostnames with MySQL Cluster?

是的,集群主机可以使用DNS或者DHCP.不过,如果你的应用程序要求 "5个9(99.999%)" 可靠性的话,我们推荐使用固定ip地址.可以在故障自动防护系统中引入 DNS 或 DHCP 服务,最好比较少用它们.

Yes, it is possible to use DNS and DHCP for cluster hosts.
However, if your application requires “five
nines
” availability, we recommend using fixed IP
addresses. Making communication between Cluster hosts
dependent on services such as DNS and DHCP introduces
additional points of failure, and the fewer of these, the
better.

26.10.31:
How do I handle MySQL users in a Cluster having
multiple MySQL servers?

访问MySQL集群的不同MySQL服务器间不会自动同步帐号和权限信息.因此,必须在每个SQL节点间都复制一份.

MySQL user accounts and privileges are not automatically
propagated between different MySQL servers accessing the
same MySQL Cluster. Therefore, you must make sure that these
are copied between the SQL nodes yourself.