Bamboo cannot fail over: stuck with error "primary lock is held by another instance, suspending"

Daniel
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
January 20, 2025

 

Bamboo cannot fail over from node A to node B. It is stuck with the error "primary lock is held by another instance, suspending". Any idea how to fix this?

 

 

Answer accepted
Shashank Kumar
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
January 20, 2025

Hello Daniel,

Welcome to the Atlassian Community.

At any given time, Bamboo DC has only one active node, even if you have configured multiple nodes for high availability. The other nodes remain available as standby nodes.

Bamboo identifies the primary node by the lock it acquires on the database. Until this lock is released, the other nodes cannot perform the failover.

For your problem, you need to find out which node is currently acting as the primary. That node needs to go down. After about 3 minutes Bamboo releases the DB lock, and a standby node acquires it and brings the Bamboo application up on that new node.
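
As a rough illustration of the behaviour described above, here is a toy model of a lock with a timeout. It is only a sketch and not Bamboo's actual locking code: the names `PrimaryLock` and `LEASE_SECONDS` are invented for this example, and the 180-second value simply mirrors the "about 3 minutes" mentioned in this answer.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the "~3 minutes" from the answer, modelled as a lease timeout.
LEASE_SECONDS = 180


@dataclass
class PrimaryLock:
    """Toy stand-in for the lock the primary node holds in the database."""
    holder: Optional[str] = None
    last_heartbeat: float = 0.0

    def try_acquire(self, node: str, now: float) -> bool:
        """Return True if `node` holds the lock after this call."""
        expired = now - self.last_heartbeat >= LEASE_SECONDS
        if self.holder is None or self.holder == node or expired:
            self.holder = node
            self.last_heartbeat = now
            return True
        # Lock is fresh and owned by someone else: the caller stays suspended.
        return False


lock = PrimaryLock()
print(lock.try_acquire("node-a", now=0))    # node A becomes primary
print(lock.try_acquire("node-b", now=60))   # lock still fresh, node B must wait
print(lock.try_acquire("node-b", now=200))  # node A stopped refreshing, node B takes over
```

In this model a standby node can only take over once the previous holder has stopped refreshing the lock for the full timeout, which is why a node that is still running (or a leftover process) keeps the standby suspended.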

Regards,

Shashank Kumar

**please don't forget to Accept the answer if your query was answered**

 

Daniel
Rising Star
January 20, 2025

Hi @Shashank Kumar ,

You are right. I have two nodes. I shut down node A, but node B does not come up in the browser, even after I waited 30 minutes. I checked the logs and found the following. Where is the problem, and how can I fix it?

6086863283396133016.jpg

Shashank Kumar
Atlassian Team
January 20, 2025

Hello Daniel,

What do you see in node A's atlassian-bamboo.log file after the shutdown? Do you see any messages regarding DB locks?
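
If the log is large, a small script can help filter it. The sketch below is only an assumption about what to search for: the keywords in `LOCK_PATTERN` are guesses based on the error text in this thread, and the sample lines are placeholders rather than real Bamboo log output. To scan a real file, pass the path to your own atlassian-bamboo.log to `scan_log`.

```python
import re
from pathlib import Path
from typing import Iterable, List

# Assumption: keywords guessed from the error message quoted in this thread.
LOCK_PATTERN = re.compile(r"primary lock|lock is held|suspending", re.IGNORECASE)


def find_lock_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention the lock, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if LOCK_PATTERN.search(line)]


def scan_log(path: Path) -> List[str]:
    """Scan a log file; undecodable bytes are replaced instead of raising."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        return find_lock_lines(handle)


# Placeholder input, not real log output.
sample = [
    "some unrelated line\n",
    "primary lock is held by another instance, suspending\n",
]
for hit in find_lock_lines(sample):
    print(hit)
```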

Regards,

Shashank Kumar

Daniel
Rising Star
January 20, 2025

Hi @Shashank Kumar ,

I didn't find anything about DB locks in the node A log. This is all I found:

 

Shashank Kumar
Atlassian Team
January 20, 2025

Hello Daniel,

Node A seems to have shut down. There is an error related to KahaDB, but I am not sure whether it is causing the problem with the failover.

Can you do a quick test: restart the Bamboo service on node B and let me know whether the Bamboo application comes up? This would help narrow the problem down to the failover itself rather than the setup of the node.

 

Regards,

Shashank Kumar

Daniel
Rising Star
January 20, 2025

Hi @Shashank Kumar ,

 

Regarding KahaDB: I carefully checked the logs on node A, and a problem was detected there. What is KahaDB used for in Bamboo? And if it does have an impact, how do I fix it?

 

I tried to fail over at 12 o'clock, but Bamboo on node B did not start up.

 

Shashank Kumar
Atlassian Team
January 21, 2025

Hello Daniel,

I think there is a problem with KahaDB that is preventing node B from starting up. Either there is an issue with access to the files within KahaDB, or KahaDB is corrupted.

KahaDB is a file-based database that stores messages for the Apache ActiveMQ framework, which Bamboo uses for communication between agents and the server.

A similar problem is described here:

https://github.com/cptactionhank/docker-atlassian-bamboo/issues/8

Next steps:

1. Make sure the user that runs Bamboo on node A and node B has full read/write access to <bamboo-shared-home>, and especially to the KahaDB folder.

2. Once access is sorted out, restart the Bamboo application on node B and check whether it comes up.

3. If you still see the same issue, KahaDB is probably corrupted.

4. Take a backup of the whole Jms-folder shown in the logs above and delete its contents. (Make sure you have the backup, as it might be required for a rollback.)

5. Restart Bamboo now. This should clear any corruption.
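
Steps 1 and 4 above can be sketched as a small script. This is a minimal sketch under stated assumptions, not an official procedure: the function names and the `example-jms-folder` layout are made up for the demo, which runs on a throwaway temporary directory. For real use you would point it at the folder named in your own logs, presumably with Bamboo stopped on both nodes, and run it as the same user that runs Bamboo.

```python
import os
import shutil
import tempfile
from pathlib import Path


def has_read_write_access(folder: Path) -> bool:
    """True if the current user can read, write and traverse `folder`."""
    return os.access(folder, os.R_OK | os.W_OK | os.X_OK)


def backup_and_clear(folder: Path, backup_dir: Path) -> Path:
    """Copy `folder` into `backup_dir`, then delete the contents of `folder`."""
    target = backup_dir / folder.name
    shutil.copytree(folder, target)  # raises if the backup target already exists
    for entry in folder.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return target


# Demo on a temporary directory; names below are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    jms = Path(tmp) / "example-jms-folder"
    (jms / "kahadb").mkdir(parents=True)
    (jms / "kahadb" / "data.log").write_text("placeholder")
    backups = Path(tmp) / "backups"
    backups.mkdir()

    print(has_read_write_access(jms))
    saved = backup_and_clear(jms, backups)
    print(sorted(p.name for p in jms.iterdir()))     # folder is now empty
    print((saved / "kahadb" / "data.log").exists())  # backup still has the file
```

The backup is taken before anything is deleted, and `copytree` refuses to overwrite an existing backup, so a second run cannot silently replace the copy you might need for a rollback.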

Regards,

Shashank Kumar

 

**please don't forget to Accept the answer if your query was answered**

Daniel
Rising Star
January 21, 2025

Hi @Shashank Kumar 

 

I have granted full permissions to the three users, but when I try to fail over again, it is still stuck at the same point.

 

Shashank Kumar
Atlassian Team
January 21, 2025

Hi Daniel,

Have you tried removing the KahaDB folder and restarting the Bamboo instance, as I suggested earlier?

 

Daniel
Rising Star
January 21, 2025

Hi @Shashank Kumar ,

I backed up the kahadb folder and deleted it from the shared home folder. I then restarted the Bamboo service, but it still has not started up.

6089115083209819917.jpg

 

 

Shashank Kumar
Atlassian Team
January 21, 2025

Hi Daniel,

It seems your Bamboo DB is somehow still linked to an instance. I believe you are running both instances on Windows machines. Please try the steps below:

1. Make sure both instances, node A and node B, are shut down.

2. If you can, restart the machine to make sure no zombie process is holding a lock on the DB. If possible, restart the DB server as well.

3. Start node B and see whether Bamboo comes up.

Daniel
Rising Star
January 21, 2025

Hi @Shashank Kumar ,

Wait a moment. Let me first lay out the scenario, keeping in mind that this is a production Bamboo cluster.

  1. First, I need to shut down node A and node B, right?
  2. Second, I need to restart the servers for node A and node B, right?
  3. Third, I restart the Bamboo service on node B while keeping node A turned off, to see whether the failover works. Is that right?
  4. Finally, I need to restart the database server (if the failover test still fails after steps one and two).
Shashank Kumar
Atlassian Team
January 22, 2025

Hello Daniel,

Let me summarise the steps for you.

 

  1. Shut down node A and node B and make sure nothing is running. If possible, restart the Bamboo server as well as the DB server.
  2. Start node B on its own. This proves whether Bamboo comes up on node B directly. If it doesn't, something is wrong with the node B setup.
  3. Shut down node B and start the Bamboo service on node A. This proves that Bamboo can run independently on node A as well.
  4. If all is good up to here, the setup of both nodes is fine.
  5. Now restart both node A and node B. The expectation is that node A becomes the primary and node B the standby.
  6. Shut down node A and see whether automatic failover to node B works.
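
The reasoning behind this sequence can be written down as a small decision function. It is only a sketch of the logic in the steps above: the function name, parameters, and messages are invented for illustration, and you still have to run each test by hand and record the outcome.

```python
def diagnose(node_b_alone_ok: bool, node_a_alone_ok: bool, failover_ok: bool) -> str:
    """Map the outcomes of steps 2, 3 and 6 to the most likely problem area."""
    if not node_b_alone_ok:
        return "Node B does not start on its own: check the node B setup."
    if not node_a_alone_ok:
        return "Node A does not start on its own: check the node A setup."
    if not failover_ok:
        return "Both nodes start alone, so the problem is in the failover itself."
    return "Both nodes start alone and automatic failover works."


# Example: both nodes start alone, but node B does not take over from node A.
print(diagnose(node_b_alone_ok=True, node_a_alone_ok=True, failover_ok=False))
```

Testing each node in isolation first separates a broken node setup from a problem with the lock handover between the nodes.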

 

Daniel
Rising Star
February 10, 2025

Hi @Shashank Kumar , 

 

Thank you, your answer is correct. The problem was with the Bamboo database. I did not restart the DB, though. I only had to make a small adjustment so that the node ID matched the Bamboo cluster.
