Attachment insights for Jira Cloud with Python

Hi Atlassian Community!

I'm back with another Python script, this time for analyzing Jira project attachments. It retrieves attachment metadata, calculates total sizes, and identifies the largest files, which helps when auditing storage usage or reviewing what kinds of files your projects accumulate.

The solution

This script uses the Jira Cloud REST API to search, via JQL, for issues that have attachments. It then iterates through these issues to extract the metadata of each attachment. The results are printed to the console and written to a timestamped log file (.log), and the per-attachment metadata is saved to a CSV file (.csv).

Here's what the script focuses on:

  • Total attachments count:
    Provides the total number of attachments for each specified project and an overall total.
  • Total attachment size:
    Calculates the combined size of all attachments per project and across all projects, reported in megabytes (MB).
  • Top 10 largest attachments:
    Identifies and lists the top 10 largest attachments by size for each project and an overall top 10 across all selected projects, including the issue key they belong to.
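
The three figures above come down to a count, a sum, and a top-N selection. Here is a minimal sketch of that idea on made-up records; the function name `summarize` and the sample data are illustrative assumptions, not part of the script itself:

```python
import heapq

def summarize(attachments, top_n=10):
    """Return (count, total size in MB, top_n largest) for a list of attachment dicts."""
    # Skip records without a size so the sum and the ranking stay well defined.
    sized = [a for a in attachments if a.get("size") is not None]
    total_bytes = sum(a["size"] for a in sized)
    largest = heapq.nlargest(top_n, sized, key=lambda a: a["size"])
    return len(sized), round(total_bytes / (1024 * 1024), 2), largest

# Hypothetical sample data, sizes in bytes.
sample = [
    {"issue": "PROJ-1", "filename": "design.pdf", "size": 5 * 1024 * 1024},
    {"issue": "PROJ-2", "filename": "screenshot.png", "size": 512 * 1024},
    {"issue": "PROJ-3", "filename": "dump.zip", "size": 20 * 1024 * 1024},
]

count, total_mb, largest = summarize(sample, top_n=2)
print(count, total_mb, [a["filename"] for a in largest])
```

`heapq.nlargest` avoids sorting the whole list when only a few entries are needed; for the list sizes a typical project produces, a plain `sorted(...)[:10]` works just as well.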

Key features

  • Analyzes attachments from one or multiple projects in a single run
  • Provides a clear breakdown of total attachments and their size (in MB) per project
  • Generates a detailed .log file with all analysis results, including project summaries and top 10 lists
  • Additionally saves comprehensive attachment metadata to a well-structured CSV file, including:
    • Project: The key of the Jira project.
    • Issue Key: The specific issue the attachment belongs to.
    • Attachment ID: The unique ID of the attachment.
    • Filename: The name of the attached file.
    • Author: The display name of the user who added the attachment.
    • Created Date: The timestamp when the attachment was created.
    • Size (Bytes): The original size of the attachment in bytes.
    • Size (MB): The size of the attachment in Megabytes.
    • MIME Type: The MIME type of the attachment (e.g., image/png, application/pdf).
  • Logs API request failures, with specific messages for authentication (401) and not-found (404) responses
  • Prompts you for your Jira Cloud domain, email, and API token; the token is read with getpass, so it is not echoed to the terminal
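
Once the CSV exists, the columns listed above make follow-up questions easy to answer with the standard library. As a rough sketch, this totals attachment size per MIME type. The function name `size_by_mime_type` and the sample rows are made up for illustration; only the column headers come from the script:

```python
import csv
import io
from collections import defaultdict

def size_by_mime_type(csv_file):
    """Sum 'Size (Bytes)' per 'MIME Type' from an open CSV file object."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        size = row.get("Size (Bytes)") or 0          # empty cells count as 0
        totals[row.get("MIME Type") or "unknown"] += int(size)
    # Largest total first.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))

# Hypothetical rows using the same headers the script writes.
SAMPLE = (
    "Project,Issue Key,Attachment ID,Filename,Author,Created Date,"
    "Size (Bytes),Size (MB),MIME Type\n"
    "PROJ,PROJ-1,10001,a.png,Alice,2024-01-01T10:00:00.000+0000,2048,0.0,image/png\n"
    "PROJ,PROJ-2,10002,b.pdf,Bob,2024-01-02T10:00:00.000+0000,4096,0.0,application/pdf\n"
    "PROJ,PROJ-3,10003,c.png,Alice,2024-01-03T10:00:00.000+0000,1024,0.0,image/png\n"
)

print(size_by_mime_type(io.StringIO(SAMPLE)))
```

To run it against the real output, pass `open("jira_attachments_metadata.csv", newline="", encoding="utf-8")` instead of the in-memory sample.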

Preparation

1) Ensure that the user running this script has the "Browse projects" permission in every project you want to analyze.
2) Install the requests library by running:

pip install requests

3) Prepare your Jira Cloud site URL (your-domain.atlassian.net), email address, and API token
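
The script builds its request URL as `https://{domain}/rest/api/...`, so the domain prompt expects a bare host name without a scheme or path. If you tend to paste the full site URL, a small helper like the one below can tidy the input first. This is only a sketch, and `normalize_site` is a hypothetical name that does not appear in the script:

```python
def normalize_site(raw):
    """Reduce user input to a bare host such as 'your-domain.atlassian.net'."""
    site = raw.strip()
    # Drop a leading scheme if the full URL was pasted.
    for prefix in ("https://", "http://"):
        if site.lower().startswith(prefix):
            site = site[len(prefix):]
    # Drop any path or trailing slash after the host.
    return site.split("/", 1)[0]

print(normalize_site(" https://your-domain.atlassian.net/jira/projects "))
```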

The script

import requests
from requests.auth import HTTPBasicAuth
from getpass import getpass
import logging
import os
import csv
from datetime import datetime

log_filename = f"jira_attachments_analysis_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log"

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
file_handler = logging.FileHandler(log_filename)
file_handler.setLevel(logging.INFO)
file_formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
file_handler.setFormatter(file_formatter)
logging.getLogger().addHandler(file_handler)

def get_jira_auth():
    """Prompts the user for Jira Cloud domain, email, and API token."""
    jira_domain = input("Enter your Jira Cloud domain (e.g., your-domain.atlassian.net): ")
    email = input("Enter your Jira email: ")
    api_token = getpass("Enter your Jira API token: ")
    return jira_domain, HTTPBasicAuth(email, api_token)

def search_issues_with_attachments(jira_domain, auth, project_key):
    """
    Searches for all issues in a given project that have attachments.
    Handles pagination to retrieve all issues.
    """
    jql = f"project = '{project_key}' AND attachments is not EMPTY ORDER BY created ASC"
    base_url = f"https://{jira_domain}/rest/api/3/search"
    headers = {"Accept": "application/json"}
    start_at = 0
    max_results = 100  # Requested page size; the server may return fewer per page
    all_issues = []

    logging.info(f"Searching for issues with attachments in project: {project_key}")

    while True:
        params = {
            "jql": jql,
            "startAt": start_at,
            "maxResults": max_results,
            "fields": "attachment"  # Request only the attachment field
        }
        try:
            response = requests.get(base_url, headers=headers, auth=auth, params=params)
            response.raise_for_status()
            search_results = response.json()
            issues = search_results.get('issues', [])
            all_issues.extend(issues)

            total = search_results.get('total', 0)
            logging.info(f"Retrieved {len(all_issues)} of {total} issues with attachments so far for project {project_key}.")

            # Advance by the number of issues actually returned; the server may return fewer than requested
            start_at += len(issues)
            if not issues or start_at >= total:
                break  # All issues retrieved

        except requests.exceptions.RequestException as e:
            logging.error(f"Error searching for issues in project {project_key}: {e}")
            # e.response is None when no HTTP response was received (e.g., a connection error)
            status_code = e.response.status_code if e.response is not None else None
            if status_code == 401:
                logging.error("Authentication failed. Please verify your email and API token.")
            elif status_code == 404:
                logging.error(f"Project with key '{project_key}' not found or no issues found.")
            return None
    
    logging.info(f"Successfully retrieved {len(all_issues)} issues with attachments for project {project_key}.")
    return all_issues

def process_attachments_data(issues_data, project_key):
    """
    Processes issue data to extract attachment metadata,
    calculate total count and size, and identify top 10 largest attachments.
    """
    processed_attachments = []
    total_attachments_count = 0
    total_attachments_size_bytes = 0  # in bytes
    all_attachments_for_sorting = [] # To keep track of all attachments for top 10

    if issues_data:
        for issue in issues_data:
            issue_key = issue.get('key')
            attachments = issue.get('fields', {}).get('attachment', [])

            for attachment in attachments:
                attachment_id = attachment.get('id')
                filename = attachment.get('filename')
                author_display_name = attachment.get('author', {}).get('displayName', '')
                created_date = attachment.get('created')
                size_bytes = attachment.get('size')  # Size in bytes
                mime_type = attachment.get('mimeType')

                processed_attachments.append({
                    'Project': project_key,
                    'Issue Key': issue_key,
                    'Attachment ID': attachment_id,
                    'Filename': filename,
                    'Author': author_display_name,
                    'Created Date': created_date,
                    'Size (Bytes)': size_bytes,
                    'Size (MB)': convert_bytes_to_mb(size_bytes),
                    'MIME Type': mime_type
                })
                
                if size_bytes is not None:
                    total_attachments_count += 1
                    total_attachments_size_bytes += size_bytes
                    all_attachments_for_sorting.append({
                        'Issue Key': issue_key,
                        'Filename': filename,
                        'Size': size_bytes
                    })

    top_10_attachments = sorted(all_attachments_for_sorting, key=lambda x: x['Size'], reverse=True)[:10]

    return processed_attachments, total_attachments_count, total_attachments_size_bytes, top_10_attachments

def save_to_csv(data, filename="jira_attachments_metadata.csv"):
    """Saves the processed attachment data to a CSV file."""
    if not data:
        logging.warning("No attachment data to save to CSV.")
        return

    fieldnames = data[0].keys()
    try:
        logging.info("Writing CSV file with the retrieved data.")
        with open(filename, 'w', newline='', encoding='utf-8') as csvfile:
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(data)
        logging.info(f"Operation completed, CSV file saved in {os.path.abspath(filename)}")
    except Exception as e:
        logging.error(f"Error saving data to CSV file: {e}")

def convert_bytes_to_mb(size_bytes):
    """Converts bytes to megabytes, rounded to two decimal places."""
    if size_bytes is None:
        return 0.00
    return round(size_bytes / (1024 * 1024), 2)

def main():
    jira_domain, auth = get_jira_auth()

    project_keys_input = input("Enter the Jira project key(s) separated by comma (e.g., PROJ1,PROJ2): ").strip()
    # Ignore empty entries such as a trailing comma
    project_keys = [key.strip() for key in project_keys_input.split(',') if key.strip()]

    all_attachments_data_for_csv = []
    overall_total_attachments = 0
    overall_total_size_bytes = 0
    overall_top_10_attachments_for_sorting = []

    for project_key in project_keys:
        logging.info(f"\n--- Starting attachment retrieval for project: {project_key} ---")
        issues_with_attachments = search_issues_with_attachments(jira_domain, auth, project_key)
        
        if issues_with_attachments is not None:
            processed_data, project_attachments_count, project_total_size_bytes, project_top_10 = \
                process_attachments_data(issues_with_attachments, project_key)
            
            all_attachments_data_for_csv.extend(processed_data)
            overall_total_attachments += project_attachments_count
            overall_total_size_bytes += project_total_size_bytes
            overall_top_10_attachments_for_sorting.extend(project_top_10)

            project_total_size_mb = convert_bytes_to_mb(project_total_size_bytes)
            
            logging.info(f"\n--- Analysis for Project: {project_key} ---")
            logging.info(f"Total attachments found: {project_attachments_count}")
            logging.info(f"Total size of attachments: {project_total_size_mb:.2f} MB")
            
            if project_top_10:
                logging.info("Top 10 largest attachments in this project:")
                for i, att in enumerate(project_top_10):
                    size_mb = convert_bytes_to_mb(att['Size'])
                    logging.info(f"  {i+1}. Issue: {att['Issue Key']}, Filename: {att['Filename']}, Size: {size_mb:.2f} MB")
            else:
                logging.info("No attachments found for the top 10 list in this project.")

    logging.info(f"\n====================================")
    logging.info(f"Overall Summary Across All Projects:")
    logging.info(f"Total unique attachments retrieved: {overall_total_attachments}")
    logging.info(f"Total combined size of all attachments: {convert_bytes_to_mb(overall_total_size_bytes):.2f} MB")
    logging.info(f"====================================")

    overall_top_10_attachments_final = sorted(overall_top_10_attachments_for_sorting, key=lambda x: x['Size'], reverse=True)[:10]

    if overall_top_10_attachments_final:
        logging.info("\nOverall Top 10 Largest Attachments Across All Selected Projects:")
        for i, att in enumerate(overall_top_10_attachments_final):
            size_mb = convert_bytes_to_mb(att['Size'])
            logging.info(f"  {i+1}. Issue: {att['Issue Key']}, Filename: {att['Filename']}, Size: {size_mb:.2f} MB")
    else:
        logging.info("\nNo attachments found for the overall top 10 list.")

    if all_attachments_data_for_csv:
        save_to_csv(all_attachments_data_for_csv)
    else:
        logging.info("No attachment data retrieved. CSV file will not be created.")

    logging.info(f"\nAnalysis complete. Detailed log saved to {os.path.abspath(log_filename)}")

if __name__ == "__main__":
    main()
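
If you want to check the pagination logic without touching a Jira site, you can swap the HTTP call for a stand-in function. The sketch below is a simplified illustration rather than the script's own code: `collect_all`, `fetch_page`, and `fake_fetch` are hypothetical names, and the fake server deliberately returns fewer items than requested to show why the offset advances by the number of issues actually received.

```python
def collect_all(fetch_page, page_size=100):
    """Collect every item from an offset-paginated source.

    fetch_page(start_at, max_results) must return a dict
    with an 'issues' list and a 'total' count.
    """
    items, start_at = [], 0
    while True:
        page = fetch_page(start_at, page_size)
        issues = page.get("issues", [])
        items.extend(issues)
        start_at += len(issues)  # advance by what was actually returned
        if not issues or start_at >= page.get("total", 0):
            return items

def fake_fetch(start_at, max_results):
    """Stand-in for the search call: 7 pretend issue keys, at most 3 per page."""
    data = [f"PROJ-{n}" for n in range(1, 8)]
    capped = min(max_results, 3)
    return {"issues": data[start_at:start_at + capped], "total": len(data)}

keys = collect_all(fake_fetch)
print(len(keys), keys[0], keys[-1])
```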

This is how it looks:
[Screenshot of the script output]

Disclaimer:

While this script is designed as a convenience for certain interactions with Jira Software Cloud, its functionality may change or break due to updates to the Jira Software Cloud API or other conditions that affect its operation.

Please note that this script is provided on an "as is" and "as available" basis without any warranties of any kind. This script is not officially supported or endorsed by Atlassian, and its use is at your own discretion and risk.

Cheers!
