How to Aggregate Document Data over Rolling 12-Month Periods for U-Chart Analysis in PostgreSQL?

nigel.camp · March 3, 2024

Hello Atlassian Community,

I am working on a SQL query in PostgreSQL to evaluate document data across four rolling 12-month periods. The objective is to use the ouput to create a U chart that illustrates the defectiveness of documents based on specific criteria. Each document (identified by contentid) is evaluated for defectiveness within these periods based on three criteria:

Update Status: A document is considered to have passed if it has been updated in the last 12 months.
View Count: A document passes if it has accumulated 70 views or more in the last 12 months.
Document Version: A document passes if it has a version number greater than 3.

A document is deemed defective if it fails to meet one of the above criteria.
A document passes if it meets two or more of the criteria above. The challenge I’m facing involves aggregating this data correctly over each rolling period and calculating the defectiveness based on the criteria mentioned.

The Specific Challenges Are:

Aggregating Over Rolling Periods: I need to evaluate each document’s status over distinct rolling 12-month periods (Q1 2022 to Q4 2022, Q2 2022 to Q1 2023, etc.). I am unsure how to structure the query to dynamically adjust for these periods and correctly aggregate document data.
Calculating Defectiveness: For each document in each period, I must calculate whether it meets the criteria listed above. The calculation of whether a document is defective (based on the criteria) should result in a dataset that provides the number of defective documents per period, which will be used to generate a U chart.

The Query I’ve Attempted:

WITH Periods AS (
 SELECT 'Q1 2022 to Q4 2022' AS Period, DATE '2022-01-01' AS StartDate, DATE '2022-12-31' AS EndDate
 UNION ALL
 SELECT 'Q2 2022 to Q1 2023', DATE '2022-04-01', DATE '2023-03-31'
 UNION ALL
 SELECT 'Q3 2022 to Q2 2023', DATE '2022-07-01', DATE '2023-06-30'
 UNION ALL
 SELECT 'Q4 2022 to Q3 2023', DATE '2022-10-01', DATE '2023-09-30'
),
DocumentEvaluation AS (
 SELECT 
   p.Period,
   c.contentid,
   -- Determine if the document version is greater than 3
   (MAX(c.version) > 3) AS VersionPass,
   -- Determine if the document has been updated within the last 12 months of the period end date
   BOOL_OR(MAX(c.lastmoddate) > (p.EndDate - INTERVAL '1 year')) AS UpdatePass,
   -- Determine if the document has 70 or more views within the last 12 months of the period end date
   COUNT(CASE WHEN ao."LAST_VIEW_DATE" BETWEEN (p.EndDate - INTERVAL '1 year') AND p.EndDate THEN 1 ELSE 0 END) >= 70 AS ViewPass
 FROM 
   public.content c
   LEFT JOIN public."AO_92296B_AORECENTLY_VIEWED" ao ON c.contentid = ao."CONTENT_ID",
   Periods p
 WHERE 
   -- Include documents created at any time up to the end date of the period
   c.creationdate <= p.EndDate
 GROUP BY 
   p.Period, c.contentid
),
DefectiveDocuments AS (
 SELECT 
   Period,
   contentid,
   -- A document is defective if it fails any of the criteria
   (NOT VersionPass OR NOT UpdatePass OR NOT ViewPass) AS IsDefective
 FROM 
   DocumentEvaluation
)
SELECT 
 Period,
 -- Count the total number of documents evaluated in the period as SampleSize
 COUNT(*) AS SampleSize,
 -- Sum up the documents marked as defective
 SUM(CASE WHEN IsDefective THEN 1 ELSE 0 END) AS Defects
FROM 
 DefectiveDocuments
GROUP BY 
 Period
ORDER BY 
 Period;

My Questions Are:

How can I correctly structure the SQL query to evaluate each document against the rolling 12-month periods defined?
What is the best approach to calculate whether a document is defective based on the three criteria mentioned, within the context of each rolling period?
Is there an efficient way to aggregate this data in PostgreSQL that facilitates easy analysis and visualization in a U chart?

Useful link: Confluence Data Model

I appreciate any guidance or examples you can provide to help address these challenges. Thank you in advance for your assistance!

Forums

Product Q&A

Community resources

Support

Top groups

Community resources

Support

Learn

Community resources

Support

Events

Community resources

Support

How to Aggregate Document Data over Rolling 12-Month Periods for U-Chart Analysis in PostgreSQL?

1 answer

Suggest an answer

Was this helpful?

Thanks!

TAGS

Atlassian Community Events