Which Cache contains Confluence Bandana data?

Atlassian documentation says the data in Bandana table is cached by Confluence.

https://developer.atlassian.com/server/confluence/bandana-caching/


Which type of Cache(s) shown in Confluence Cache Statistics page hold Bandana data?

Answer accepted

Hi @Emre Toptancı _OBSS_ 

Looking at the code, ConfluenceCachingBandanaPersister isn't represented in /admin/cache/showStatistics.action#fullView. From Confluence 7.5, it's been replaced by ReadThroughCachingBandanaPersister which looks pretty similar.

Neither of them uses the API that backs that page (com.atlassian.cache.CacheManager, whose implementation differs between Data Center and standalone Server).

So, to answer your question, it's not shown on Cache Statistics.

Is there a reason you want to see it there? Or make use of it?

We have a marketplace app that is using the Bandana table for storing its data. We are currently working towards its official Confluence Data Center certification.

Atlassian docs say that Confluence takes care of synchronizing Bandana caches between nodes but in a multinode DC environment the cache sync behavior is unknown and unpredictable.

I mean we do not know the total cache size, so we can't know how the cache size affects performance. We also don't know whether, when we update an entry in Bandana, the cache sync happens only for the updated entry or for the whole space's cache.

This is troubling because we see a dramatic performance difference between 1-node DC and 2-or-more-node DC tests. The operations that involve creating or updating a Bandana entry seem to take twice as long in multinode DC environments as in single-node DC environments. We tend to attribute this difference to cache sync done in the background. (There is a dramatic difference between 1-node and 2-node tests but almost no difference between 2-node and 4-node tests.) This behavior also makes us think that cache sync for Bandana does not happen asynchronously after the entry update but as part of the entry update operation. This does not make much sense to me, but I can't explain the dramatic difference in response time (in multinode tests) in any other way.

I'd have to check the code, but in a multi-node environment Confluence uses Hazelcast as its cache technology. When there's one node, the whole cache is on one server and so there are no remote look-ups. When it's configured as multi-node, the cache has to start synchronising between nodes. The data is split between the nodes evenly as it is populated. If a node requests data from the cache and it's not local, then that node has to ask the Hazelcast network to retrieve the data from another node. So, going from 2 nodes to more won't make a difference in this case.
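To make the local-versus-remote point concrete, here is a toy model, not Hazelcast's actual partitioning (which uses a fixed partition table and its own hashing). The class `PartitionSketch`, the `ownerOf` rule and the `entry-N` key names are all hypothetical; the sketch only assumes the even split described above.

```java
import java.util.stream.IntStream;

// Toy model only: real Hazelcast assigns keys to partitions with its own
// hashing, and partitions to members via a partition table.
class PartitionSketch {

    // Hypothetical owner rule: spread keys over nodes by hash code.
    static int ownerOf(String key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // Counts how many of the keys "entry-0" .. "entry-(keys-1)" node 0
    // would have to fetch from another node.
    static long remoteLookupsFromNodeZero(int keys, int nodeCount) {
        return IntStream.range(0, keys)
                .filter(i -> ownerOf("entry-" + i, nodeCount) != 0)
                .count();
    }
}
```

With one node, `remoteLookupsFromNodeZero(1000, 1)` is 0 because every key is local. With two or more nodes some look-ups become remote, but in this model each remote look-up is still a single hop to the owning node, which is one way to read the observation that 2-node and 4-node tests look alike.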

But from what I can see in the code, each top-level cache has its own name, so you'd only be doing look-ups in the Bandana cache, and the items within it are keyed.

Do you have multiple entries in Bandana, or do you have one big XML chunk in one row? If it's one row, then it's one key and all that data has to be sent around the network as required. If it's multiple rows, then each row will have its own key, and that key is the key in the Bandana cache.

The store operation updates through to the database and then clears the key from the cache, requiring the next retrieve to pull it from the persistent storage. This will also populate the cache.
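The store/retrieve behaviour described above can be sketched as a small read-through cache. This is a minimal illustration, not the code of ReadThroughCachingBandanaPersister: `FakeBandanaTable` and `ReadThroughSketch` are hypothetical names, and a plain in-memory map stands in for both the database table and the cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the Bandana table; not a Confluence class.
class FakeBandanaTable {
    private final Map<String, String> rows = new ConcurrentHashMap<>();
    int reads = 0; // counts simulated database reads

    void write(String key, String value) {
        rows.put(key, value);
    }

    String read(String key) {
        reads++;
        return rows.get(key);
    }
}

// Hypothetical read-through persister: store() writes to the table and
// evicts the key; retrieve() loads on a miss and populates the cache.
class ReadThroughSketch {
    private final FakeBandanaTable table;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    ReadThroughSketch(FakeBandanaTable table) {
        this.table = table;
    }

    void store(String key, String value) {
        table.write(key, value); // persist first
        cache.remove(key);       // then evict only this key
    }

    String retrieve(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;       // cache hit: no table read
        }
        String loaded = table.read(key);
        if (loaded != null) {
            cache.put(key, loaded); // populate for the next caller
        }
        return loaded;
    }
}
```

In this sketch a store never puts the new value into the cache; the first retrieve after a store pays for one table read, and later retrieves of the same key are served from the cache until the next store.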

So, I think ultimately, the reason you see a drop in performance in a multi-node environment is the Hazelcast network doing its thing. It might be that you don't have your Data Center configured for optimal performance.

Hi James, Thank you for the information.

We presumed something similar to this cache behavior so we keep each item as a separate key/entry in the Bandana table. It might be several entries per space and thousands of entries in total on large systems (which is what we are testing for DC).

Since it is what it is, can we say that the performance impact in multinode environments is the expected outcome and should not have negative implications on DC certification?

I can't give specific guidance about the performance outcome for this; it's something a DC engineer would be able to answer. But for the certification, so long as you document what's happening and your test results, that should be enough to get the help you need.

The certification process for DC is mostly about making sure you don't negatively impact the system, either through a crash or performance degradation.

James.

Thank you James. The answer turned out to be different from what I expected but I believe you provided valuable information in this post that will benefit others.

As a final note, since Bandana is one of the recommended ways of storing data for an app in Confluence, I think its caches should be listed on the Cache Statistics page. We can no longer create issues on jira.atlassian.com. Can you create an improvement request for this?

I realise this is an old bug report

But it may be that Flush All does clear the Bandana cache, as it flushes the Hibernate cache as well, but since it's so heavily used, it's rebuilt very quickly. I'll check with the Confluence Server team to make 100% sure before I create a feature request in JAC.

Thanks James,

Just to be clear, it is not only about being able to flush the Bandana cache. It is also about seeing Bandana cache details on the Cache Statistics page.

Thanks

Hi,

I got an update from the Confluence Server team. On the Cache Statistics page, the Bandana cache is listed as Settings (Persistence). All the cache names are in com/atlassian/confluence/core/ConfluenceActionSupport.properties. So they can be flushed, and they are represented.

James.

Yesss, that is what I was looking for. 

Thanks a lot.

Hello @James Richards ,

One additional question related to this: Does this persistence cache hold the objects or the raw Bandana data?

I mean, we keep objects in the Bandana table, but those objects are serialized so they can be kept as strings. BandanaManager deserializes the string and gives us our object when we ask for it. Does the cache you pointed to above (the Settings (Persistence) cache) hold the objects or the serialized strings? This is important because if the serialized strings are cached and deserialization takes place every time we get the object from BandanaManager, this will have performance implications (depending on the number and size of objects).

Thanks.

Hi,

Again, having a dig through the code, it looks like the cache stores the deserialized values. Deep down in the cache code, if the value isn't in the cache the code converts the string to an object, and returns the object with

return support.getSerializer(context).deserialize(new StringReader(record.getValue()));

And that's what is saved in the cache, which would make sense as it's more performant for retrieval.
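As a rough illustration of why caching the deserialized value matters, the sketch below counts how often deserialization runs. `DeserializingCacheSketch` is a hypothetical class, and a plain `Function<String, T>` stands in for whatever serializer Confluence actually uses; none of this is Confluence code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache that stores deserialized objects rather than raw strings.
class DeserializingCacheSketch<T> {
    private final Map<String, String> rawRows;      // serialized strings, as in the table
    private final Function<String, T> deserializer; // stand-in for the real serializer
    private final Map<String, T> cache = new HashMap<>();
    final AtomicInteger deserializations = new AtomicInteger();

    DeserializingCacheSketch(Map<String, String> rawRows, Function<String, T> deserializer) {
        this.rawRows = rawRows;
        this.deserializer = deserializer;
    }

    T get(String key) {
        T cached = cache.get(key);
        if (cached != null) {
            return cached; // hit: the object is returned without deserializing again
        }
        String raw = rawRows.get(key);
        if (raw == null) {
            return null;   // no such row
        }
        deserializations.incrementAndGet();
        T value = deserializer.apply(raw); // string -> object, only on a miss
        cache.put(key, value);             // the object itself is cached
        return value;
    }
}
```

Under these assumptions, repeated calls to `get` for the same key deserialize once, so the cost of deserialization is paid per cache miss rather than per read.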

James.

Very very good news.

Thank you very much.

EmreT
