Is there a way to remove all records by date in Changelogs?

Kalon Mitchell May 5, 2021

We have over 14 million records in the Changelogs table in Jira.  A record is created whenever most values in a ticket are changed.  I use this table to build dashboards measuring the time between status changes in a ticket, but the process that pulls in this data nightly has become unmanageable.  The table has a field for Issue Created Date; I would like to delete all records older than a specific date.  Can this be done?

1 answer

Tuncay Senturk
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
May 6, 2021

Hi @Kalon Mitchell 

Welcome to the community.

As far as I know, it is not possible via the API. You could delete those records from the changegroup and changeitem database tables directly, but that is always risky and not recommended.

I also do not think it is a good idea to calculate that data directly from the database, as it would not perform well and would hurt database statistics and index hit ratios. Instead, you should use the Java or REST API to get the changelogs and calculate the duration.
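For example, here is a minimal Python sketch of that approach. The `histories` structure mirrors what `GET /rest/api/2/issue/{key}?expand=changelog` returns under `changelog.histories`; the actual HTTP call and authentication are left out, and the sample changelog below is invented for illustration.

```python
# Sketch: compute the time between two status transitions from a Jira
# changelog. `histories` has the shape returned by
# GET /rest/api/2/issue/{key}?expand=changelog under
# response["changelog"]["histories"]; fetching is omitted for brevity.
from datetime import datetime


def first_entered(histories):
    """Map each status name to the datetime it was first entered."""
    entered = {}
    for history in histories:
        when = datetime.strptime(history["created"], "%Y-%m-%dT%H:%M:%S.%f%z")
        for item in history["items"]:
            if item["field"] == "status" and item["toString"] not in entered:
                entered[item["toString"]] = when
    return entered


def status_duration(histories, from_status, to_status):
    """Seconds between entering `from_status` and entering `to_status`."""
    entered = first_entered(histories)
    return (entered[to_status] - entered[from_status]).total_seconds()


# Invented example changelog with two status transitions, 2.5 hours apart:
histories = [
    {"created": "2021-05-01T10:00:00.000+0000",
     "items": [{"field": "status", "fromString": "To Do",
                "toString": "In Progress"}]},
    {"created": "2021-05-01T12:30:00.000+0000",
     "items": [{"field": "status", "fromString": "In Progress",
                "toString": "Done"}]},
]
print(status_duration(histories, "In Progress", "Done"))  # 9000.0
```

The same parsing works per issue, so the expensive part stays in the API layer rather than in direct SQL against the live tables.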

As an alternative, you can use an app to get that data without needing to delete anything.

Enhancer Plugin for Jira has a Time Between custom field which gives you the duration between status changes, even against a defined working calendar.

I hope it helps

Tuncay

Kalon Mitchell May 7, 2021

I understand your answer, but in my case it does not apply.  I am pulling the data into Sisense using a data connector and duplicating the log file in a Sisense data model.  This connector uses the Jira APIs, and I can't access those APIs or the Jira database directly.  I can use a query in the connector to exclude records, but that creates another problem: the connection times out while millions of records are being bypassed.  I need a way to actually shrink this file in the Jira database.  The process now takes 5 hours, and reducing the size of the log file is the only way to solve it.

Kalon Mitchell May 7, 2021

What happens when this file gets too large?  Is this not an issue for anyone else?  Given enough time and activity, any log file can eventually fill whatever disk space is available.

Tuncay Senturk
May 7, 2021

Hi @Kalon Mitchell 

I am a bit confused. Are you talking about a log file or the database table?

Tuncay Senturk
May 7, 2021

If it is a database table, it does not cause any problem as long as you do not access bulk data. Generally, the changegroup and changeitem tables are accessed via the API and only for specific issues.

But if it is a log file (though I do not know of a log file that contains status changes), you can limit its size: by default it is 20 MB, and when it exceeds that it rolls over into separate files.
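For reference, that rollover is configured in Jira's `log4j.properties`. This is only a sketch using the standard log4j rolling-file settings; the appender name and the exact defaults may differ in your instance:

```properties
# Sketch of log4j rolling-file settings in Jira's log4j.properties.
# The appender name ("filelog") and values are assumptions; check your
# own instance's file before changing anything.
log4j.appender.filelog.MaxFileSize=20480KB
log4j.appender.filelog.MaxBackupIndex=5
```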

Long story short, I need a bit of clarification, please!

Kalon Mitchell May 7, 2021

It is a table in Jira.  A record is created whenever certain fields are changed in a ticket, recording what was in the field, what it was changed to, who changed it, and when.  So it is essentially a log of changes to the tickets.

Kalon Mitchell May 7, 2021

I use the changed-from and changed-to values to determine when a specific status change occurred, and the date/time to measure how long it took from one specific change to another.  The data is valuable, but we now have over two years of it and we don't need more than six months.  Meanwhile, the bigger the table gets, the longer it takes to import into our Sisense BI tool.

The data is imported through the CDATA Jira connector, which uses the native APIs.  As I explained earlier, I have no option but to pull in bulk data, and that is not a problem as long as the size of this table can be controlled.

Tuncay Senturk
May 8, 2021

Now it is clear: the tables in this context are the changegroup and changeitem tables. These tables do not store unnecessary log info, and I strongly recommend not deleting records from them. You need to think very carefully before you do this.

These tables are well indexed, both at the database level and at the application layer (by Apache Lucene). Your issues might still need that historical information; for instance, you might be using an app that displays date/user custom fields derived from this history.

My suggestion: do not touch these tables directly! Your BI system should only read from them, incrementally. Then you can work on the offline copies, even trimming the old denormalized data there.
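A hedged sketch of what "read incrementally" could look like: filter with JQL on the updated date and page through the search results, so each nightly run only pulls what changed. `fetch_page` stands in for whatever call the connector actually makes (e.g. to `/rest/api/2/search`); the fake page source below exists only to make the sketch runnable.

```python
# Sketch: pull only issues updated since the last sync, paging through
# Jira's search results, instead of re-importing everything nightly.
# `fetch_page` is a placeholder: any callable (jql, start_at) -> JSON
# in the shape Jira's search endpoint returns.

def incremental_issues(fetch_page, last_sync):
    """Yield every issue updated since `last_sync` ('yyyy-MM-dd HH:mm')."""
    jql = f'updated >= "{last_sync}" ORDER BY updated ASC'
    start_at = 0
    while True:
        page = fetch_page(jql, start_at)
        if not page["issues"]:
            break
        yield from page["issues"]
        start_at += len(page["issues"])
        if start_at >= page["total"]:
            break


# Fake page source standing in for the connector, two issues per page:
def fake_fetch(jql, start_at):
    issues = [{"key": f"DEMO-{i}"} for i in range(5)]
    return {"issues": issues[start_at:start_at + 2], "total": len(issues)}


keys = [issue["key"] for issue in
        incremental_issues(fake_fetch, "2021-05-01 00:00")]
print(keys)  # ['DEMO-0', 'DEMO-1', 'DEMO-2', 'DEMO-3', 'DEMO-4']
```

Once the new rows land in the BI side, the retention question (keep six months, drop the rest) can be answered there without ever deleting from Jira's tables.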

I hope it was helpful

Tuncay
