I'm using a Python script to extract all the tables from a Confluence page, with requests and BeautifulSoup.
def fetch_page(session, url):
    """Fetch page content using a requests session for persistent connection."""
    response = session.get(url)
    if response.status_code == 200:
        return response.content
    raise ValueError(f"Failed to fetch the page: {response.status_code}")
I then parse the tables with this function:
def parse_table(content, tag):
    """Parse HTML tables and extract relevant data based on tag."""
    soup = BeautifulSoup(content, 'html.parser')
    tables = soup.find_all('table')
    all_tables_data = []
    ...
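The solution is to return response.json()['body']['view']['value'] from fetch_page instead of response.content. The response is JSON, and that field holds the rendered HTML of the page, which is what BeautifulSoup needs in order to find the tables.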
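For anyone landing here later, this is a minimal sketch of how the fix might look inside the fetch function. It assumes the URL is a Confluence REST content endpoint requested with expand=body.view, so that the JSON contains body, view and value. The name fetch_page_html and the extra error handling are my own additions for illustration, not part of the original script.

```python
def fetch_page_html(session, url):
    """Return the rendered HTML of a Confluence page from a REST API response.

    Assumption: `url` is a REST content endpoint called with expand=body.view,
    so the JSON payload contains body -> view -> value.
    `session` is expected to behave like a requests.Session.
    """
    response = session.get(url)
    if response.status_code != 200:
        raise ValueError(f"Failed to fetch the page: {response.status_code}")

    try:
        # The rendered page HTML lives here, not in response.content.
        return response.json()["body"]["view"]["value"]
    except (KeyError, TypeError) as exc:
        # Raised when the payload lacks the expected nesting.
        raise ValueError(
            "Response has no body.view.value; was expand=body.view requested?"
        ) from exc
```

The returned string can be passed straight to parse_table as its content argument. With a real requests.Session, the call would look something like fetch_page_html(session, "https://<your-site>/wiki/rest/api/content/<page-id>?expand=body.view"), where the site and page id are placeholders you would fill in yourself.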