Confluence clustering in kubernetes (GKE)

Phong Vũ Quốc March 9, 2022

Hi, we are hosting our Confluence inside GKE.

It has been running smoothly so far, but we have run into an issue when scaling Confluence to run in multiple pods.

A cluster panic event occurs every time we scale the StatefulSet to more than 1 replica. It may be caused by multicast connectivity between the pods (Confluence nodes).

We looked at the Confluence discovery strategies (multicast, TCP/IP), but neither of them works for us:

- VPC in GCP does not support multicast

- TCP/IP requires fixed pod IPs, which we don't know until the pods are scaled.

Could anyone please suggest a workaround for this problem?

Please let me know if I should provide more detailed information.

We don't really need auto-scaling; being able to run multiple Confluence instances would be enough.

Thank you very much!
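
On the TCP/IP point in the question: pod IPs change, but pods in a StatefulSet that is governed by a headless Service get stable DNS names of the form `<statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>`. The sketch below only builds those names. The names `confluence` and `default` are placeholders, and whether Confluence's TCP/IP discovery accepts hostnames in place of IPs is not confirmed anywhere in this thread.

```python
def statefulset_peer_hosts(statefulset, service, namespace, replicas,
                           cluster_domain="cluster.local"):
    """Return the stable DNS name of every pod in a StatefulSet.

    Assumes the StatefulSet is governed by a headless Service named `service`.
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


if __name__ == "__main__":
    # "confluence" and "default" are placeholder names, not values from the thread.
    for host in statefulset_peer_hosts("confluence", "confluence", "default", 2):
        print(host)
```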

3 answers

0 votes
Ariel eli September 15, 2022

Hey @Phong Vũ Quốc ,

I also deployed Confluence DC in Kubernetes, and I found that instead of multicast or TCP/IP you can use "kubernetes" as the join type for discovering the other pods in the cluster.

You can change this setting in the confluence.cfg.xml file inside the pod and then do a rolling restart of the StatefulSet.
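
A minimal sketch of the edit described above, using Python's standard `xml.etree.ElementTree`. The property name `confluence.cluster.join.type` and the surrounding XML layout are assumptions about how `confluence.cfg.xml` is usually structured; check them against the file in your own pod before changing anything.

```python
import xml.etree.ElementTree as ET

# Assumed shape of confluence.cfg.xml; a real file contains many more properties.
SAMPLE_CFG = """<confluence-configuration>
  <properties>
    <property name="confluence.cluster.join.type">multicast</property>
    <property name="confluence.cluster.name">confluence-test</property>
  </properties>
</confluence-configuration>"""

JOIN_TYPE = "confluence.cluster.join.type"  # assumed property name


def set_join_type(cfg_xml, join_type):
    """Return cfg_xml with the cluster join type property set to join_type."""
    root = ET.fromstring(cfg_xml)
    props = root.find("properties")
    if props is None:
        props = ET.SubElement(root, "properties")
    for prop in props.findall("property"):
        if prop.get("name") == JOIN_TYPE:
            prop.text = join_type
            break
    else:
        # Property missing: add it rather than fail.
        new_prop = ET.SubElement(props, "property", {"name": JOIN_TYPE})
        new_prop.text = join_type
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_join_type(SAMPLE_CFG, "kubernetes"))
```

The function leaves every other property untouched, so it can be applied to a copy of the real file and the result compared with the original before anything is restarted.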

0 votes
Yevhen
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
March 9, 2022

@Phong Vũ Quốc Confluence DC uses the Hazelcast K8s discovery method. How did you deploy your Confluence DC? With the official Helm charts? The K8s discovery method is configured by default.

Phong Vũ Quốc March 9, 2022

Hi @Yevhen ,

Thank you so much for the quick response.

Yes, I used the Helm chart (latest version) to deploy Confluence and left the clustering-related values at their defaults (clustering enabled, pod name used as node name).

I deployed the Confluence chart with 1 replica initially, completed the initial setup, and then increased the replicas to 2.

After a while I got com.atlassian.confluence.cluster.safety.ClusterPanicEvent, and both pods became NotReady.

confluence 2022-03-10 07:51:32,286 ERROR [hz.confluence.cached.thread-5] [confluence.cluster.safety.ClusterPanicListener] onClusterPanicEvent Received a panic event, stopping processing on the node: [Origin node: d5f116a0 listening on /172.16.8.38:5701] Clustered Confluence: Database is being updated by an instance which is not part of the current cluster. You should check network connections between cluster nodes, especially multicast traffic.
confluence -- event: com.atlassian.confluence.cluster.safety.ClusterPanicEvent[source=null] | originatingMemberUuid: cd76c42c-95ad-42cd-834b-c75c65030f82
confluence 2022-03-10 07:51:32,288 WARN [hz.confluence.cached.thread-5] [confluence.cluster.safety.ClusterPanicListener] onClusterPanicEvent Shutting down scheduler
confluence -- event: com.atlassian.confluence.cluster.safety.ClusterPanicEvent[source=null] | originatingMemberUuid: cd76c42c-95ad-42cd-834b-c75c65030f82
confluence 2022-03-10 07:51:34,289 WARN [hz.confluence.cached.thread-4] [internal.cluster.impl.MembershipManager] log [172.16.11.63]:5701 [confluence-test] [3.12.11] Member [172.16.8.38]:5701 - cd76c42c-95ad-42cd-834b-c75c65030f82 is suspected to be dead for reason: No connection
confluence 2022-03-10 07:51:34,296 INFO [hz.confluence.event-2] [confluence.cluster.hazelcast.LoggingClusterMembershipListener] memberRemoved [172.16.8.38]:5701 left the cluster
confluence 2022-03-10 07:51:34,296 INFO [hz.confluence.event-2] [confluence.cluster.hazelcast.LoggingClusterMembershipListener] logClusterMembers Cluster now has 1 members: [[172.16.11.63]:5701]
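
When comparing logs from both pods, it can help to pull out just the panic events and membership changes. The following is a rough sketch that assumes the log layout shown above (timestamp, level, thread, logger, message). The regular expressions and the function name are illustrative only and are not part of any Confluence tooling.

```python
import re

# Two lines taken from the log excerpt above (first one shortened).
SAMPLE = """\
2022-03-10 07:51:32,286 ERROR [hz.confluence.cached.thread-5] [confluence.cluster.safety.ClusterPanicListener] onClusterPanicEvent Received a panic event, stopping processing on the node: [Origin node: d5f116a0 listening on /172.16.8.38:5701]
2022-03-10 07:51:34,296 INFO [hz.confluence.event-2] [confluence.cluster.hazelcast.LoggingClusterMembershipListener] memberRemoved [172.16.8.38]:5701 left the cluster
"""

LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) \[(?P<thread>[^\]]+)\] \[(?P<logger>[^\]]+)\] "
    r"(?P<message>.*)"
)
# Matches both "172.16.8.38:5701" and "[172.16.8.38]:5701".
ADDR_RE = re.compile(r"\[?(\d{1,3}(?:\.\d{1,3}){3})\]?:(\d+)")


def cluster_events(log_text):
    """Yield (timestamp, kind, addresses) for panic and member-left lines."""
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # continuation lines such as "-- event: ..." are skipped
        message = match["message"]
        if "panic event" in message:
            kind = "panic"
        elif "left the cluster" in message:
            kind = "member_left"
        else:
            continue
        addresses = [f"{ip}:{port}" for ip, port in ADDR_RE.findall(message)]
        yield match["ts"], kind, addresses


if __name__ == "__main__":
    for event in cluster_events(SAMPLE):
        print(event)
```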

 

Below is some additional information (which I am not sure is related):

  • I deployed Confluence and its DB from scratch (everything from the previous installation was purged)
Yevhen
Atlassian Team
March 14, 2022

@Phong Vũ Quốc to me it looks like the database PV wasn't flushed and you are somehow using an existing database. It's just a theory, though. This KB may help a bit. I failed to reproduce it in my lab cluster. Perhaps it's worth running a test deployment with an in-memory database, just to see if it makes a difference?

Phong Vũ Quốc March 23, 2022

@Yevhen Hi, thank you very much for the advice.

I've tried cleaning everything (including the PV provisioned by the Confluence Helm chart), but the ClusterPanicEvent still happens.

I haven't figured out how to deploy a Confluence cluster in in-memory mode with the Helm chart (currently, the ```database``` values only support JDBC database types).

May I have your ```values.yaml``` file and other configuration files that you used in your lab environment? It would be very helpful for me.

Thanks again, Yevhen.

Yevhen
Atlassian Team
March 23, 2022

@Phong Vũ Quốc  I deployed with pretty much standard values. 

Can you share your confluence statefulset yaml and your values file?

Yevhen
Atlassian Team
March 23, 2022
Phong Vũ Quốc March 23, 2022

@Yevhen May I also ask how you scaled your Confluence replicas?

Was it done after you completed the initial setup, or at the time you applied the Helm chart?

Here is my values file. I redacted the hostname- and URL-related values, by the way.

https://pastebin.com/hzatQJpU

Yevhen
Atlassian Team
March 23, 2022

I have checked your values YAML. Nothing special in there. And yes, the typical way to scale Confluence is to deploy with 1 replica and then either scale the StatefulSet directly or update the replicas in the values file and run helm upgrade.

Have you checked the link I shared in a previous comment? I wonder if the diagnosis and troubleshooting section is of any help.

It would also be great to have complete logs from the two Confluence pods.
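
The two scaling routes mentioned in this reply can be sketched as command lines. This sketch only builds the argument lists and does not run anything. The release name, namespace, chart reference `atlassian-data-center/confluence`, and the `replicaCount` value key are assumptions to verify against your own deployment and chart version.

```python
import shlex


def kubectl_scale_cmd(statefulset, replicas, namespace):
    """Argument list for scaling the StatefulSet directly."""
    return ["kubectl", "scale", f"statefulset/{statefulset}",
            f"--replicas={replicas}", "--namespace", namespace]


def helm_upgrade_cmd(release, chart, replicas, namespace):
    """Argument list for changing the replica count through Helm.

    Assumes the chart exposes the replica count as `replicaCount`.
    """
    return ["helm", "upgrade", release, chart, "--namespace", namespace,
            "--reuse-values", "--set", f"replicaCount={replicas}"]


if __name__ == "__main__":
    # Placeholder names; substitute your own release, chart, and namespace.
    print(shlex.join(kubectl_scale_cmd("confluence", 2, "confluence")))
    print(shlex.join(helm_upgrade_cmd(
        "confluence", "atlassian-data-center/confluence", 2, "confluence")))
```

Going through Helm keeps the values file and the live StatefulSet in agreement, whereas a direct `kubectl scale` may be reverted by the next `helm upgrade`.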

0 votes
Nic Brough -Adaptavist-
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
March 9, 2022

Just to confirm, you are using Confluence Data Center? You've mentioned Server, not DC, so I want to check.

Phong Vũ Quốc March 9, 2022

Yes, that's correct.

Mine is Confluence Data Center.
