Evolution of the number of spaces per period (year, month, week, doy) in raw SQL

Vincent Blain November 4, 2022

How to get the number of Confluence spaces by month with one query SQL ?

Ok let's see how to do it step by step.

 

Context

The goal is to build an SQL query to count the number of spaces in an Atlassian server or Atlassian Datacenter instance (*). To do this, we will make a count cummulative total in SQL. We want to obtain this dataset, for example to draw an evolution graph.

|---------|--------------|-------------------|
| period | nb_of_spaces | evol_nb_of_spaces |
|---------|--------------|-------------------|
| 2019 | 10 | 10 |
| 2020 | 20 | 30 |
| 2021 | 20 | 50 |
|---------|--------------|-------------------|

(*) Tested only on Confluence server 7.13 with PostgreSQL. The query is written for PostgreSQL database.

 

Building this query step by step

Definition of the data granularity by year

SELECT
EXTRACT(YEAR FROM s.creationdate) as period
FROM spaces s

|--------|
| period |
|--------|
| 2020 |
| 2021 |
|--------|

Definition of the data granularity by month

SELECT
CONCAT(
EXTRACT(YEAR FROM s.creationdate), '_',
LPAD(EXTRACT(MONTH FROM s.creationdate)::TEXT, 2, '0')
) as period
FROM spaces s
GROUP BY period
ORDER BY period

|---------|
| period |
|---------|
| 2020_02 |
| 2020_03 |
|---------|

Definition of the data granularity by week

SELECT
CONCAT(
EXTRACT(YEAR FROM s.creationdate), '_',
LPAD(EXTRACT(MONTH FROM s.creationdate)::TEXT, 2, '0'), '_',
LPAD(EXTRACT(WEEK FROM s.creationdate)::TEXT, 2, '0')
) as period
FROM spaces s
GROUP BY period
ORDER BY period

|------------|
| period |
|------------|
| 2020_09_36 |
| 2020_09_37 |
|------------|

 Definition of the data granularity by doy (day of the year)

SELECT
CONCAT(
EXTRACT(YEAR FROM s.creationdate), '_',
LPAD(EXTRACT(MONTH FROM s.creationdate)::TEXT, 2, '0'), '_',
LPAD(EXTRACT(WEEK FROM s.creationdate)::TEXT, 2, '0'), '_',
LPAD(EXTRACT(DOY FROM s.creationdate)::TEXT, 3, '0')
) as period
FROM spaces s
GROUP BY period
ORDER BY period

|----------------|
| period |
|----------------|
| 2020_09_36_240 |
| 2020_09_37_241 |
|----------------|

Count the number of Confluence spaces by period

SELECT
CONCAT(
EXTRACT(YEAR FROM s.creationdate), '_',
LPAD(EXTRACT(MONTH FROM s.creationdate)::TEXT, 2, '0')
) as period,
COUNT(s.spaceid) as nb_of_spaces

FROM spaces s
GROUP BY period
ORDER BY period

|---------|--------------|
| period | nb_of_spaces |
|---------|--------------|
| 2020_09 | 10 |
| 2020_10 | 20 |
| 2020_11 | 20 |
|---------|--------------|

Count the cumulative total of number of spaces by period

Two solutions to do this :

  • with the clause OVER, that defines a windows functrion SQL.
  • or with only SQL subqueries. This last solution is the fastest but it is very verbose and not pleasant to read. The performances are increased especially if we add the calculation of other evolutions of indicators, such as for example the monitoring of the number of pages, the number of attached files or the consumption of disk space.

Solution with partition (clause SQL OVER)

SELECT
CONCAT(
EXTRACT(YEAR FROM s.creationdate), '_',
LPAD(EXTRACT(MONTH FROM s.creationdate)::TEXT, 2, '0')
) as period,

SUM(COUNT(s.spaceid)) OVER (
ORDER BY CONCAT(EXTRACT(YEAR FROM s.creationdate), '_', LPAD(EXTRACT(MONTH FROM s.creationdate)::TEXT, 2, '0'))
) as nb_of_spaces

FROM
spaces s
GROUP BY period
ORDER BY period

Solution with SQL subqueries

SELECT
t1.period,
MAX(t1.nb_of_spaces) as nb_of_spaces,
COALESCE(SUM(t2.nb_of_spaces), 0) as evol_nb_of_spaces

FROM (
SELECT DISTINCT
r1.period,
MAX(r1.nb_of_spaces) as nb_of_spaces
FROM (
SELECT
CONCAT(
EXTRACT(YEAR FROM s.creationdate), '_',
LPAD(EXTRACT(MONTH FROM s.creationdate)::TEXT, 2, '0')
) as period,
COUNT(s.spaceid) as nb_of_spaces
FROM spaces s
GROUP BY period
) r1
GROUP BY r1.period
) t1

LEFT JOIN(
SELECT DISTINCT
r1.period,
MAX(r1.nb_of_spaces) as nb_of_spaces
FROM (
SELECT
CONCAT(
EXTRACT(YEAR FROM s.creationdate), '_',
LPAD(EXTRACT(MONTH FROM s.creationdate)::TEXT, 2, '0')
) as period,
COUNT(s.spaceid) as nb_of_spaces
FROM spaces s
GROUP BY period
) r1
GROUP BY r1.period
) t2 ON t2.period <= t1.period

GROUP BY t1.period
ORDER BY t1.period

|---------|--------------|-------------------|
| period | nb_of_spaces | evol_nb_of_spaces |
|---------|--------------|-------------------|
| 2020_09 | 10 | 10 |
| 2020_10 | 20 | 30 |
| 2020_11 | 20 | 50 |
|---------|--------------|-------------------|

Here we go !

 

Bonus (trick)

Wanna more indicators tracking ? 

In this request you can get the evolution of the number of spaces per period, depending on the chosen data granularity : year, month, week or doy. But it may be interesting to retrieve more indicators, like :

  • evolution of number of pages,
  • evolution of number of blogposts,
  • evolution of number of attachments,
  • evolution of number of customs,
  • evolution of number of contents,
  • evolution of number of comments,
  • evolution of number of likes,
  • evolution of total attachment size

To do this, we only need to make a full outer join of each dataset. Look out, TDLR ;) incoming !

See final request with doy data granularity : https://gist.github.com/v20100v/f9ad3f4e6f63990f20fecc45fbb611d5

 

1 comment

Srinatha T
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
November 5, 2022

@Vincent Blain , interesting read. Thanks for sharing, the sqls are really helpful. 

Like Vincent Blain likes this

Comment

Log in or Sign up to comment
TAGS
AUG Leaders

Atlassian Community Events