Processing ADF format in Python

Anton Manin
December 19, 2024

If you use Enterprise Insights, you will find that multiline fields are stored in one of several formats: plain text, HTML, or ADF (Atlassian Document Format).
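
A rough sketch of how the three formats could be told apart before converting anything. `detect_format` is a hypothetical helper, and it assumes the field value arrives as a string, that ADF is stored as JSON with a root node of type "doc", and that a simple tag-like pattern is enough to recognise HTML. Treat it as a heuristic, not a reliable classifier.

```python
import json
import re

# Heuristic: anything that looks like an opening or closing tag, e.g. <p> or </ul>.
_TAG_RE = re.compile(r'</?[a-zA-Z][^>]*>')


def detect_format(value):
    """Guess the storage format of a field value: 'adf', 'html' or 'plain'."""
    if value is None:
        return None
    try:
        parsed = json.loads(value)
    except ValueError:
        # Not JSON at all, so it cannot be ADF.
        parsed = None
    # Assumption: an ADF document is a JSON object whose root node is "doc".
    if isinstance(parsed, dict) and parsed.get('type') == 'doc' and 'content' in parsed:
        return 'adf'
    if _TAG_RE.search(value):
        return 'html'
    return 'plain'


if __name__ == '__main__':
    print(detect_format('{"version":1,"type":"doc","content":[]}'))
    print(detect_format('<p>Hello</p>'))
    print(detect_format('Just a note'))
```

Checking for ADF first matters, because ADF text nodes may themselves contain characters that look like HTML tags.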

We generate reports in Confluence so that all trends and insights are shown on the same page.

I wrote a Python script to convert ADF to HTML.

Below you can see the result. I do not parse the 'code' parts, as we do not use them in insights for objectives.

[Screenshot: SCR-20241219-kyps.png]

 

Script itself:

import json
import html
from loguru import logger


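# Translation table that strips ASCII control characters 1-31,
# keeping tab (9), line feed (10) and carriage return (13).
CONTROL_CHARS = str.maketrans(
    '', '', ''.join(chr(c) for c in range(1, 32) if c not in (9, 10, 13)))


def process_atl_format(text, level=0):
    """Recursively convert an ADF node (a dict parsed from JSON) into an HTML string."""
    if text is None:
        return None
    res = []
    # ADF block node types and the HTML tags that wrap them.
    replacements = {
        'bulletList': 'ul',
        'listItem': 'li',
        'paragraph': 'p',
        'orderedList': 'ol',
        'blockquote': 'blockquote',
        'heading': 'h'  # the heading level is appended below, e.g. h1
    }
    # ADF text marks and the inline HTML tags that wrap the text.
    modifiers = {
        'strong': 'b',
        'em': 'em',
        'underline': 'u',
        'strike': 's',
        'code': 'code'
    }
    if text.get('content') is None:
        # Leaf node: text, a hard break, or an unsupported node type.
        if text.get('text'):
            t = html.escape(text.get('text'))
            if text.get('marks'):
                # Wrap the escaped text in one tag per mark.
                for mark in text.get('marks'):
                    mark_type = mark['type']
                    if mark_type == 'link':
                        link = mark['attrs']['href']
                        t = '<a href="{}">{}</a>'.format(html.escape(link), t)
                    elif mark_type == 'subsup':
                        # attrs.type is 'sub' or 'sup', which are also the HTML tag names.
                        subsup = mark['attrs']['type']
                        t = '<{tag}>{t}</{tag}>'.format(tag=subsup, t=t)
                    elif modifiers.get(mark_type):
                        t = '<{tag}>{t}</{tag}>'.format(tag=modifiers.get(mark_type), t=t)

            res = [t]
        elif text.get('type') == 'hardBreak':
            res = ['<br/>']
        else:
            # Unsupported leaf node: log it and emit nothing.
            logger.debug(f'>>>>> {text}')
    else:
        # Container node: convert each child and wrap it in its HTML tag, if it has one.
        for x in text['content']:
            tmp = process_atl_format(x, level + 1)
            if replacements.get(x['type']):
                tag = replacements.get(x['type'])
                if tag == 'h':
                    tag = 'h{}'.format(x['attrs']['level'])
                res.append('<{tag}>{txt}</{tag}>'.format(tag=tag, txt=tmp))
            else:
                # No mapping (e.g. codeBlock): keep the converted children unwrapped.
                res.append(tmp)
    res = ''.join(res)
    return res.translate(CONTROL_CHARS)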


if __name__ == '__main__':
    text = json.loads("""{"version":1,"type":"doc","content":[{"type":"heading","attrs":{"level":1},"content":[{"type":"text","text":"Heading"}]},{"type":"paragraph","content":[{"type":"text","text":"Bold","marks":[{"type":"strong"}]}]},{"type":"paragraph","content":[{"type":"text","text":"italic","marks":[{"type":"em"}]}]},{"type":"paragraph","content":[{"type":"text","text":"underline","marks":[{"type":"underline"}]}]},{"type":"paragraph","content":[{"type":"text","text":"strikethrough","marks":[{"type":"strike"}]}]},{"type":"paragraph","content":[{"type":"text","text":"code","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"type":"text","text":"subscript"}]},{"type":"paragraph","content":[{"type":"text","text":"superscript"}]},{"type":"paragraph","content":[{"type":"text","text":"list"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"type":"text","text":"one"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"type":"text","text":"two"}]}]}]},{"type":"paragraph","content":[{"type":"text","text":"list"}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"type":"text","text":"1"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"type":"text","text":"2"}]}]}]},{"type":"paragraph","content":[{"type":"text","text":"Google","marks":[{"type":"link","attrs":{"href":"https://google.com"}}]}]},{"type":"codeBlock","attrs":{"language":"c"},"content":[{"type":"text","text":"sdfgsdfg"}]},{"type":"blockquote","content":[{"type":"paragraph","content":[{"type":"text","text":"quote"}]}]},{"type":"paragraph","content":[]}]}""")

    result = process_atl_format(text)
    print(result)
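
The script leaves 'code' parts unparsed, so the text of a codeBlock node comes out without any wrapping tag. If you do need code blocks, a minimal sketch could look like the following. `render_code_block` is a hypothetical helper that is not part of the script; it assumes a codeBlock contains only plain text nodes and an optional `language` attribute, and the `language-...` class name is just a common HTML convention, not something ADF prescribes.

```python
import html


def render_code_block(node):
    """Render an ADF codeBlock node as <pre><code>...</code></pre>."""
    # Assumption: codeBlock children are plain text nodes without marks.
    code = ''.join(child.get('text', '') for child in node.get('content') or [])
    language = (node.get('attrs') or {}).get('language')
    # Only add a class attribute when a language is set on the node.
    cls = ' class="language-{}"'.format(html.escape(language)) if language else ''
    return '<pre><code{}>{}</code></pre>'.format(cls, html.escape(code))


if __name__ == '__main__':
    block = {'type': 'codeBlock', 'attrs': {'language': 'c'},
             'content': [{'type': 'text', 'text': 'if (a < b) return a;'}]}
    print(render_code_block(block))
```

One way to use it would be to check for `x['type'] == 'codeBlock'` inside the loop over `text['content']` and append the helper's result instead of recursing into the node.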

It would be great, though, if the Enterprise Insights team offered several columns to choose from, for example:

  • Override Notes -- as it is stored in the DB now
  • Override Notes HTML -- valid HTML
  • Override Notes Plain Text -- just text with correct indents
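
Until something like that exists, the plain-text variant can be approximated on the client side as well. The sketch below is only one possible reading of "correct indents": list items get a marker, and nested lists are indented under their parent item. `adf_to_lines` is a hypothetical helper, and it assumes inline nodes other than text and hardBreak can be ignored.

```python
def adf_to_lines(node):
    """Return the plain-text lines for an ADF node (a dict parsed from JSON)."""
    node_type = node.get('type')
    children = node.get('content') or []
    if node_type in ('paragraph', 'heading', 'codeBlock'):
        # Join inline children; a hardBreak starts a new line.
        text = ''.join('\n' if c.get('type') == 'hardBreak' else c.get('text', '')
                       for c in children)
        return text.split('\n')
    if node_type in ('bulletList', 'orderedList'):
        start = (node.get('attrs') or {}).get('order', 1)
        lines = []
        for i, item in enumerate(children):
            marker = '- ' if node_type == 'bulletList' else '{}. '.format(start + i)
            item_lines = [line for child in item.get('content') or []
                          for line in adf_to_lines(child)]
            for j, line in enumerate(item_lines):
                # First line gets the marker; the rest (including nested lists)
                # are indented by the marker's width.
                prefix = marker if j == 0 else ' ' * len(marker)
                lines.append(prefix + line)
        return lines
    # doc, blockquote and anything else: flatten the children.
    return [line for child in children for line in adf_to_lines(child)]


if __name__ == '__main__':
    doc = {'type': 'doc', 'content': [
        {'type': 'paragraph', 'content': [{'type': 'text', 'text': 'list'}]},
        {'type': 'bulletList', 'content': [
            {'type': 'listItem', 'content': [
                {'type': 'paragraph', 'content': [{'type': 'text', 'text': 'one'}]}]},
            {'type': 'listItem', 'content': [
                {'type': 'paragraph', 'content': [{'type': 'text', 'text': 'two'}]},
                {'type': 'orderedList', 'attrs': {'order': 1}, 'content': [
                    {'type': 'listItem', 'content': [
                        {'type': 'paragraph', 'content': [{'type': 'text', 'text': 'a'}]}]}]}]}]}]}
    print('\n'.join(adf_to_lines(doc)))
```

Returning lines rather than a single string keeps the indentation logic in one place: each list level only has to prefix whatever its children produced.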

How do you solve this problem?
