I'm using a Python script to extract all the tables from a Confluence page, with requests and BeautifulSoup.
def fetch_page(session, url):
    """Fetch page content using a requests session for persistent connection."""
    response = session.get(url)
    if response.status_code == 200:
        return response.content
    raise ValueError(f"Failed to fetch the page: {response.status_code}")
I then parse the tables with this function:
from bs4 import BeautifulSoup

def parse_table(content, tag):
    """Parse HTML tables and extract relevant data based on tag."""
    soup = BeautifulSoup(content, 'html.parser')
    tables = soup.find_all('table')
    all_tables_data = []
    ...
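The solution is to return response.json()['body']['view']['value'] instead of response.content. The response is JSON, and the page HTML that BeautifulSoup needs is the string stored under body.view.value.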
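Here is a minimal sketch of fetch_page with that fix applied. It assumes the URL points at a Confluence REST content endpoint requested with expand=body.view, which the original post does not state. The extra check for a missing key is my addition, not part of the accepted answer.

```python
def fetch_page(session, url):
    """Fetch a Confluence page via the REST API and return its rendered HTML."""
    # Assumption: `url` is a REST content endpoint that includes
    # expand=body.view, so the JSON payload carries the rendered page body.
    response = session.get(url)
    if response.status_code != 200:
        raise ValueError(f"Failed to fetch the page: {response.status_code}")
    try:
        # The HTML string is nested inside the JSON document.
        return response.json()["body"]["view"]["value"]
    except (KeyError, TypeError) as exc:
        # Added for illustration: a clearer error when the key is absent.
        raise ValueError(
            "Response has no body.view.value; was expand=body.view requested?"
        ) from exc
```

The returned string can be passed to parse_table unchanged, since BeautifulSoup accepts a str as well as bytes.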