Strategies for keeping documentation up to date against external sources

Jeremiah Dost March 9, 2018

I've lately been working on a Confluence documentation pipeline at work that creates and updates Confluence pages from an external data source. However, we'd like to close the loop: let people update those pages (or add new ones) in Confluence, then scrape the data back out.

The reason we're taking this approach to documentation creation:

  1. Maintenance cost. In this case we're talking about thousands of pages, so it's more efficient to scrape existing data sources to present the documentation reference than to write up the documentation manually. While everyone sees the value in having the data available, no one wants to keep it up to date.
  2. Formatting. When we decide we want to change the display of the page (or add new data to it), we can globally update all the pages to match the new desired format (and it only takes minutes).
  3. Automation. We can keep pages up to date against the real data on a regular cadence, rather than showing a snapshot of the data from whenever someone last felt like updating the docs.
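
To make the formatting point concrete, here is a minimal sketch of the idea, not our actual pipeline. It assumes the external source yields plain dictionaries and that a separate step (not shown) pushes each rendered body to Confluence. The names PAGE_TEMPLATE and render_pages, and the sample records, are hypothetical.

```python
from html import escape
from string import Template

# Hypothetical page layout. Changing this one template and re-running
# the render step is what updates every page to the new format.
PAGE_TEMPLATE = Template(
    "<h1>$term</h1>\n<p>$definition</p>\n<p>Owner: $owner</p>"
)


def render_pages(records, template=PAGE_TEMPLATE):
    """Return {page title: page body} for every record from the source."""
    pages = {}
    for record in records:
        # Escape values so source data cannot break the page markup.
        safe = {key: escape(value) for key, value in record.items()}
        pages[record["term"]] = template.substitute(safe)
    return pages


# Made-up sample data standing in for the external source.
records = [
    {"term": "SLA", "definition": "Service level agreement.", "owner": "Ops"},
    {"term": "RPO", "definition": "Recovery point objective.", "owner": "Infra"},
]

pages = render_pages(records)

# A layout change is just a different template applied to the same data.
compact = Template("<p><strong>$term</strong>: $definition ($owner)</p>")
compact_pages = render_pages(records, compact)
```

Because the data and the layout live apart, the page count does not matter much: a format change costs one template edit plus a re-run.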

My discussion topic here is fairly open-ended, but it boils down to a few key questions:

  • Has anyone else taken this approach to maintaining documentation? Were there pitfalls you discovered that we should watch out for?
  • What strategies have people employed for this kind of documentation maintenance in the past?
  • Considering the problem of keeping the source data up to date against edits in Confluence specifically, what features are available already or what features would have to become available to make this a reality?

2 answers

0 votes
Jeremiah Dost March 10, 2018

While I'm all for a single source of truth and a big advocate of it at work, for at least some of the documentation I'm talking about, it'd be more convenient for end users to have their cake and eat it too. I'm more interested in the thought experiment of how one might approach it.

Take, for example, a glossary that has a separate page for each entry. In this case we'd benefit from a consistent template across all entries, which leads us down the avenue of generating the display of the data from some external source. However, it's counterintuitive for users to think, "Oh, I'll add an entry to our documentation glossary! Wait... I have to go to another tool to do that?" I'd like them to be able to add entries piecemeal to the glossary, but I'd also like to be able to update all entries when people want to organize the information in a different way.

At first I was thinking a macro that pulled from an external data source might be a way forward. Just specify such and such ID that corresponds to such and such data source and the information magically appears. However, that data isn't very searchable, since all that is actually stored on the page is the macro call and the source data ID.

I do wish that Confluence had some kind of form (maybe there is?) so that when a user enters edit mode on certain pages, the user can only input data in certain fields, but we take care of the formatting on our end. That screams macro to me, and maybe the answer is to use macros, but input more data into them.
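
As a thought-experiment sketch of the "fields in, formatting on our end" idea, the round trip could look roughly like this. It assumes each page keeps its editable data in a plain two-column table (field name in a header cell, value in a data cell) and that fetching and saving the page body happens elsewhere (not shown). The names render_fields, scrape_fields, and FieldTableParser are hypothetical, and nothing here is a Confluence API.

```python
from html import escape
from html.parser import HTMLParser


def render_fields(fields):
    """Render {field name: value} as a two-column table."""
    rows = "".join(
        f"<tr><th>{escape(name)}</th><td>{escape(value)}</td></tr>"
        for name, value in fields.items()
    )
    return f"<table>{rows}</table>"


class FieldTableParser(HTMLParser):
    """Collect th/td pairs back into a dictionary."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._cell = None  # "th" or "td" while inside a cell
        self._name = ""    # most recent field name seen
        self._text = []    # text chunks of the current cell

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._cell = tag
            self._text = []

    def handle_data(self, data):
        if self._cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != self._cell:
            return
        text = "".join(self._text).strip()
        if tag == "th":
            self._name = text
        else:
            self.fields[self._name] = text
        self._cell = None


def scrape_fields(body):
    """Parse a page body and return the fields found in its table."""
    parser = FieldTableParser()
    parser.feed(body)
    parser.close()
    return parser.fields


original = {"Term": "SLA", "Definition": "Agreement between <us> & them."}
body = render_fields(original)

# Simulate a user editing the value cell in the page, then scrape it back.
edited_body = body.replace("them.", "the customer.")
scraped = scrape_fields(edited_body)
```

The values stay on the page as ordinary text, so they remain searchable, and the scraped dictionary can be written back to the source before the next re-render. A sketch like this only holds up if users leave the table structure intact, which is exactly where a form-like edit mode would help.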

0 votes
Benji March 10, 2018

I certainly recognize the problem of keeping data up to date. From my perspective, one simple rule is key: there is only one point of truth! Only one place is the real source of information, and the rest is derived from that point.

People tend to make their own lists and their own data collections. Go and talk with them, understand why they do it and how they like to be in control, and provide a suitable way that works for that person.
