MySQL Cluster Crashing

randyz September 2, 2014

We set up our JIRA to use MySQL clustering and it's crashing on a regular basis.

For some reason the cluster crashes even after a fresh rebuild. Everything else that's running the same cluster configuration is working without problems.

140901  2:34:35 [Note] WSREP: Provider resumed.
140901  2:40:50 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.170.122:4567
140901  2:40:51 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 80710510-2cbe-11e4-b968-72fc9228b67f (tcp://192.168.170.122:4567), attempt 0
140901  2:40:55 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,3)) suspecting node: 80710510-2cbe-11e4-b968-72fc9228b67f
140901  2:40:56 [Note] WSREP: declaring 60c31941-2cbe-11e4-b08a-df6ee547911c stable
140901  2:40:56 [Note] WSREP: Node 45fd58b1-2cbe-11e4-9211-661eb587a669 state prim
140901  2:40:56 [Note] WSREP: view(view_id(PRIM,45fd58b1-2cbe-11e4-9211-661eb587a669,4) memb {
140901  2:40:56 [Note] WSREP: forgetting 80710510-2cbe-11e4-b968-72fc9228b67f (tcp://192.168.170.122:4567)
140901  2:40:56 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
140901  2:40:56 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') turning message relay requesting off
140901  2:40:56 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 126377ce-31bc-11e4-b534-72a6f234fb02
140901  2:40:56 [Note] WSREP: STATE EXCHANGE: sent state msg: 126377ce-31bc-11e4-b534-72a6f234fb02
140901  2:40:56 [Note] WSREP: STATE EXCHANGE: got state msg: 126377ce-31bc-11e4-b534-72a6f234fb02 from 0 (vm-1)
140901  2:40:56 [Note] WSREP: STATE EXCHANGE: got state msg: 126377ce-31bc-11e4-b534-72a6f234fb02 from 1 (vm-2)
140901  2:40:56 [Note] WSREP: Quorum results:
140901  2:40:56 [Note] WSREP: Flow-control interval: [23, 23]
140901  2:40:56 [Note] WSREP: New cluster view: global state: 413108ce-2cbe-11e4-ae15-afbce9eabdbb:1070190, view# 4: Primary, number of nodes: 2, my index: 0, protocol version 2
140901  2:40:56 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
140901  2:40:56 [Note] WSREP: Assign initial position for certification: 1070190, protocol version: 2
140901  2:41:01 [Note] WSREP:  cleaning up 80710510-2cbe-11e4-b968-72fc9228b67f (tcp://192.168.170.122:4567)
140901  2:41:54 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.170.121:4567
140901  2:41:55 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 0
140901  2:41:58 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, OPERATIONAL, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:41:59 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:41:59 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:00 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:00 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:01 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:01 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:02 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:02 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:03 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:03 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:04 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:04 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:05 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:05 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:06 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:06 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:07 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:07 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:08 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) suspecting node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:08 [Note] WSREP: evs::proto(45fd58b1-2cbe-11e4-9211-661eb587a669, GATHER, view_id(REG,45fd58b1-2cbe-11e4-9211-661eb587a669,4)) detected inactive node: 60c31941-2cbe-11e4-b08a-df6ee547911c
140901  2:42:09 [Note] WSREP: view(view_id(NON_PRIM,45fd58b1-2cbe-11e4-9211-661eb587a669,4) memb {
140901  2:42:09 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
140901  2:42:09 [Note] WSREP: view(view_id(NON_PRIM,45fd58b1-2cbe-11e4-9211-661eb587a669,5) memb {
140901  2:42:09 [Note] WSREP: Flow-control interval: [16, 16]
140901  2:42:09 [Note] WSREP: Received NON-PRIMARY.
140901  2:42:09 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 1070341)
140901  2:42:09 [Warning] WSREP: Send action {0x7f3e600112a0, 566, TORDERED} returned -107 (Transport endpoint is not connected)
140901  2:42:09 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
140901  2:42:09 [Note] WSREP: New cluster view: global state: 413108ce-2cbe-11e4-ae15-afbce9eabdbb:1070341, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 2
140901  2:42:09 [Note] WSREP: Flow-control interval: [16, 16]
140901  2:42:09 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
140901  2:42:09 [Note] WSREP: Received NON-PRIMARY.
140901  2:42:09 [Warning] WSREP: Send action {0x7f3e5c16b050, 2584, TORDERED} returned -107 (Transport endpoint is not connected)
140901  2:42:09 [Warning] WSREP: Send action {0x7f3ee412e820, 325, TORDERED} returned -107 (Transport endpoint is not connected)
140901  2:42:09 [Note] WSREP: New cluster view: global state: 413108ce-2cbe-11e4-ae15-afbce9eabdbb:1070341, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 2
140901  2:42:09 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
140901  2:42:09 [Warning] WSREP: Send action {0x7f3ea4005cb0, 518, TORDERED} returned -107 (Transport endpoint is not connected)
140901  2:42:09 [Warning] WSREP: Send action {0x7f3ef00626a0, 705, TORDERED} returned -107 (Transport endpoint is not connected)
140901  2:42:09 [Warning] WSREP: Send action {0x7f3e7803ce00, 455, TORDERED} returned -107 (Transport endpoint is not connected)
140901  2:42:09 [Warning] WSREP: Send action {0x7f3e73650a90, 705, TORDERED} returned -107 (Transport endpoint is not connected)
140901  2:42:40 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 30
140901  2:43:25 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 60
140901  2:44:10 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 90
140901  2:44:55 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 120
140901  2:45:40 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 150
140901  2:46:25 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 180
140901  2:47:10 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 210
140901  2:47:55 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 240
140901  2:48:40 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 270
140901  2:49:25 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 300
140901  2:50:10 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 330
140901  2:50:55 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 360
140901  2:51:40 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 390
140901  2:52:25 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 420
140901  2:53:10 [Note] WSREP: (45fd58b1-2cbe-11e4-9211-661eb587a669, 'tcp://0.0.0.0:4567') reconnecting to 60c31941-2cbe-11e4-b08a-df6ee547911c (tcp://192.168.170.121:4567), attempt 450
140901  2:53:52 [Note] /usr/sbin/mysqld: Normal shutdown
140901  2:53:52 [Note] WSREP: Stop replication
140901  2:53:52 [Note] WSREP: Closing send monitor...
140901  2:53:52 [Note] WSREP: Closed send monitor.
140901  2:53:52 [Note] WSREP: gcomm: terminating thread
140901  2:53:52 [Note] WSREP: gcomm: joining thread
140901  2:53:52 [Note] WSREP: gcomm: closing backend
140901  2:53:52 [Note] WSREP: view((empty))
140901  2:53:52 [Note] WSREP: Received self-leave message.
140901  2:53:52 [Note] WSREP: gcomm: closed
140901  2:53:52 [Note] WSREP: Flow-control interval: [0, 0]
140901  2:53:52 [Note] WSREP: Received SELF-LEAVE. Closing connection.
140901  2:53:52 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 1070341)

Here are our MySQL settings:

binlog_format=ROW
datadir=/var/lib/mysql/data
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_buffer_pool_size=28g
innodb_file_per_table=1
innodb_locks_unsafe_for_binlog=1
innodb_log_file_size=64m
max_allowed_packet=16m
max_connect_errors=4294967295
max_connections=1000
pid_file=/var/lib/mysql/mysql.pid
port=3306
socket=/var/lib/mysql/mysql.sock
transaction-isolation=READ-COMMITTED
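
For reference, the WSREP lines in the log come from Galera replication, and Galera's documentation lists three server settings as mandatory: binlog_format=ROW, default_storage_engine=InnoDB and innodb_autoinc_lock_mode=2. All three appear in the list above. The list as posted contains no wsrep_* options, so the replication side of the configuration cannot be checked from it. A minimal Python sketch of that check follows; the `REQUIRED` table and both helper functions are hypothetical names introduced for illustration, not part of any MySQL or Galera tool.

```python
# Settings that Galera's documentation lists as mandatory on every node.
# The dict and the helper names below are illustrative assumptions.
REQUIRED = {
    "binlog_format": "ROW",
    "default_storage_engine": "InnoDB",
    "innodb_autoinc_lock_mode": "2",
}


def parse_settings(text):
    """Parse key=value lines into a dict, normalising '-' to '_' in keys."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not key=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().replace("-", "_")] = value.strip()
    return settings


def missing_requirements(settings):
    """Return {key: (expected, actual)} for each required setting that differs."""
    return {
        key: (expected, settings.get(key))
        for key, expected in REQUIRED.items()
        if (settings.get(key) or "").lower() != expected.lower()
    }


# A subset of the settings posted above.
POSTED = """
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
transaction-isolation=READ-COMMITTED
"""

if __name__ == "__main__":
    problems = missing_requirements(parse_settings(POSTED))
    print(problems if problems else "required settings present")
```

An empty result only says these three values are in place; it says nothing about network stability between the nodes, which is what the log is complaining about.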

Thanks in advance for any help

2 answers

1 accepted

1 vote
Answer accepted
Nic Brough -Adaptavist-
Community Leader
September 2, 2014

I'm afraid that's a fault in your cluster setup - Jira is not the issue here.

(Jira is not cluster-aware, though: it expects a simple database connection to simply work, and it won't do anything with a cluster.)
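
The log in the question points the same way. It shows the node losing contact with 192.168.170.122 at 2:40:50 and with 192.168.170.121 at 2:41:54, after which it is left alone in a non-Primary component and later shuts down. A minimal Python sketch, not part of the original thread, that pulls those membership changes out of such a log; the `membership_timeline` helper and its regular expression are assumptions based only on the "New COMPONENT" lines posted above.

```python
import re

# Matches the "New COMPONENT" lines as they appear in the posted log.
# The pattern is an assumption derived from that excerpt only.
COMPONENT_RE = re.compile(
    r"^(?P<date>\d{6})\s+(?P<time>\d{1,2}:\d{2}:\d{2}) \[Note\] WSREP: "
    r"New COMPONENT: primary = (?P<primary>yes|no), .*memb_num = (?P<members>\d+)"
)


def membership_timeline(log_lines):
    """Return (time, is_primary, member_count) for each component change."""
    events = []
    for line in log_lines:
        match = COMPONENT_RE.match(line)
        if match:
            events.append(
                (match["time"], match["primary"] == "yes", int(match["members"]))
            )
    return events


# Three lines copied from the log in the question.
SAMPLE = [
    "140901  2:40:56 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2",
    "140901  2:41:01 [Note] WSREP:  cleaning up 80710510-2cbe-11e4-b968-72fc9228b67f (tcp://192.168.170.122:4567)",
    "140901  2:42:09 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1",
]

if __name__ == "__main__":
    for time, is_primary, members in membership_timeline(SAMPLE):
        state = "Primary" if is_primary else "non-Primary"
        print(f"{time}  {state:<11}  members={members}")
```

Run against the full mysqld error log on each node, a summary like this makes it easier to line up when each node saw its peers drop, which helps separate a network or cluster fault from anything the application is doing.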

0 votes
randyz September 25, 2014

Thank you Nic, you are correct.

We updated our cluster software and JConnector to the latest version.

No crashes yet.
