I'm having issues running Confluence in clustered mode.
Running on BigBang; Istio is the service mesh and MetalLB is the load balancer.
Helm Chart Version: 1.13.0
Confluence Application Version: 7.19.9
Helm chart values:
values:
  image:
    repository: docker.io/atlassian/confluence
  serviceAccount:
    clusterRole:
      create: true
    clusterRoleBinding:
      create: true
  confluence:
    additionalEnvironmentVariables:
      - name: ATL_FORCE_CFG_UPDATE
        value: "true"
      - name: "http_proxy"
        value: "REDACTED"
      - name: "https_proxy"
        value: "REDACTED"
      - name: "no_proxy"
        value: "REDACTED"
    clustering:
      enabled: true
      forceConfiguration: true
    additionalJvmArgs:
      - -Dcom.redhat.fips=false
      - -Djava.net.preferIPv4Stack=true
      - -Datlassian.dev.mode=false
      - -Datlassian.plugins.enable.wait=300
  podAnnotations:
    traffic.sidecar.istio.io/excludeOutboundPorts: "80,443,5701"
  volumes:
    localHome:
      persistentVolumeClaim:
        create: true
        resources:
          requests:
            storage: 2Gi
      mountPath: "/var/atlassian/application-data/confluence"
    sharedHome:
      nfsPermissionFixer:
        imageRepo: docker.io/library/alpine
        imageTag: 3.18.2
      persistentVolumeClaim:
        storageClassName: ceph-filesystem
        create: true
        resources:
          requests:
            storage: 2Gi
      mountPath: "/var/atlassian/application-data/shared-home"
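One detail worth double-checking in values like these (my assumption, not something from the chart docs): the annotation above only excludes *outbound* traffic from the Envoy sidecar, while Hazelcast on 5701 also receives inbound connections from peer nodes. A sketch of bypassing the sidecar in both directions:

```yaml
podAnnotations:
  # Keep proxy and Hazelcast (5701) traffic out of the Envoy sidecar on egress
  traffic.sidecar.istio.io/excludeOutboundPorts: "80,443,5701"
  # Also bypass the sidecar for inbound cluster traffic on 5701
  traffic.sidecar.istio.io/excludeInboundPorts: "5701"
```

Both are standard Istio sidecar annotations; whether bypassing the mesh for cluster traffic is acceptable depends on your BigBang security policy.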
My initial pod comes up fine: I can step through the setup process, log in as the admin account, and see the first node under Clustering.
However, when I scale up to 2 replicas, the next pod hangs and fails to bootstrap with a null pointer exception:
ERROR [Catalina-utility-1] [atlassian.confluence.setup.ConfluenceConfigurationListener] contextInitialized An error was encountered while bootstrapping Confluence (see below):
null
java.lang.NullPointerException
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:315)
at com.atlassian.confluence.impl.util.db.SingleConnectionDatabaseHelper.getConnection(SingleConnectionDatabaseHelper.java:41)
at com.atlassian.confluence.impl.setup.DefaultBootstrapDatabaseAccessor.getBootstrapData(DefaultBootstrapDatabaseAccessor.java:46)
at com.atlassian.confluence.setup.DefaultBootstrapManager.afterConfigurationLoaded(DefaultBootstrapManager.java:872)
at com.atlassian.config.bootstrap.DefaultAtlassianBootstrapManager.init(DefaultAtlassianBootstrapManager.java:69)
at com.atlassian.confluence.setup.DefaultBootstrapManager.init(DefaultBootstrapManager.java:236)
at com.atlassian.config.util.BootstrapUtils.init(BootstrapUtils.java:34)
at com.atlassian.confluence.setup.ConfluenceConfigurationListener.initialiseBootstrapContext(ConfluenceConfigurationListener.java:145)
at com.atlassian.confluence.setup.ConfluenceConfigurationListener.contextInitialized(ConfluenceConfigurationListener.java:63)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4491)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:4939)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1332)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1322)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.base/java.lang.Thread.run(Thread.java:829)
2023-10-31 18:36:05,708 WARN [Catalina-utility-1] [atlassian.confluence.plugin.PluginFrameworkContextListener] contextInitialized Not starting full plugin system due to upgrade
2023-10-31 18:36:05,716 [Catalina-utility-1] [Filter: profiling] defaulting to off [autostart=false]
2023-10-31 18:36:06,382 INFO [Catalina-utility-1] [com.atlassian.confluence.lifecycle] init Confluence is ready to serve
31-Oct-2023 18:36:06.396 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8090"]
31-Oct-2023 18:36:06.415 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [7960] milliseconds
2023-10-31 18:36:07,859 ERROR [http-nio-8090-exec-1] [atlassian.confluence.util.AbstractBootstrapHotSwappingFilter] doFilter Could not get swap target filter
I have also tried an alternative setup where I scale up the pods before running setup.
If I do this, all the pods come up with no errors.
I can step through the setup process, but when I check Clustering it shows only 1 node. Additionally, if I scale down to 0 and then back up to 1 or more,
all my Confluence pods run into the null pointer exception during bootstrapping.
Try disabling cluster authentication in the shared confluence.cfg.xml.
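If this suggestion applies to your version, the setting would live in the properties section of the shared-home confluence.cfg.xml. The property name below is the commonly cited one for Confluence Data Center cluster authentication; verify it against your installation before relying on it:

```xml
<!-- Fragment: inside <confluence-configuration><properties> in the
     shared-home confluence.cfg.xml (property name should be verified) -->
<property name="confluence.cluster.authentication.enabled">false</property>
```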
A further update on the issues I am experiencing.
As long as I have clustering enabled in my Helm chart, any time I restart Confluence, or scale down to 0 and back up to 1, Confluence runs into the same bootstrapping error every time.
This even happens on my primary pod, and only after I have gone through the setup steps (trial license, admin user setup, etc.).
I am experiencing this issue with Helm chart versions from as old as 1.13.0 up to the latest, 1.16.6.
@Thomas Langhorne it'd be great if you could share the complete atlassian-confluence.log for the 2nd node.
Apologies if this is a repost; my reply stopped showing on the thread:
@Thomas Langhorne I think this is what you need to do:
1. Scale to 0
2. Add database values to values.yaml (IMPORTANT!).
3. Run helm install again.
What happens is: you finish the installation manually, then scale up; the Confluence container generates confluence.cfg.xml based on the available env vars, but Confluence thinks it's already set up (based on the confluence.cfg.xml in shared home). As a result, we see the NPE when trying to acquire a DB connection.
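For step 2, a sketch of what the database block could look like. The hostname, database name, and secret name below are placeholders, and the exact keys should be checked against the values.yaml of your chart version:

```yaml
database:
  type: postgresql
  # Placeholder JDBC URL - point this at your actual database service
  url: jdbc:postgresql://my-postgres.db.svc.cluster.local:5432/confluence
  credentials:
    # Kubernetes Secret holding the DB username/password (placeholder name)
    secretName: confluence-db-credentials
    usernameSecretKey: username
    passwordSecretKey: password
```

With these set, the container can render a complete confluence.cfg.xml (including the JDBC driver class) on every restart, rather than depending on what the setup wizard wrote.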
@Yevhen I made the change you suggested. I got past the null pointer exception and into a whole new error, where it looks like Hazelcast goes down:
When I port-forward my primary pod, confluence-0, I see the following:
@Thomas Langhorne are the logs you posted from pod-0 or pod-1? Were you able to get a 1-node cluster?
@Yevhen The logs are from pod-0.
Your suggested changes fixed the repeated null pointer failures,
and I was able to get a 1-node cluster.
However, when I scale up to 2 pods, as soon as pod-1 is up, pod-0 hits the Hazelcast failure.
@Thomas Langhorne as much as I want to help, it looks like the forum format is no longer suitable for troubleshooting this issue. I suggest opening a ticket with support. To me it looks like a Hazelcast networking issue; we have never tested on BigBang and/or Istio. Maybe the second node is claiming to be a master for some reason.
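If the mesh is interfering with Hazelcast member-to-member traffic, one option (besides excluding port 5701 from the sidecar via pod annotations) is to disable mTLS only for that port with an Istio PeerAuthentication. A sketch, assuming the Confluence pods carry a label like app.kubernetes.io/name=confluence and run in a namespace named confluence:

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: confluence-hazelcast
  namespace: confluence  # placeholder - use your actual namespace
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: confluence  # placeholder - match your pod labels
  portLevelMtls:
    5701:
      # Allow plaintext Hazelcast cluster traffic on 5701 only;
      # all other ports keep the mesh-wide mTLS setting
      mode: DISABLE
```

This keeps the rest of the workload's traffic inside the mesh while letting the cluster nodes talk Hazelcast's own protocol directly.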