Hello,
I have a Confluence page that serves as the technical specification for a product.
The pages beneath it form a tree that is 1-3 children deep.
I want to feed the technical specification into an LLM to rewrite the entire spec as a 20-page user manual.
To do this, I want to crawl the entire Confluence tech spec, covering only the master page and its child pages. I can use a regex so that only the children of the master page are crawled and indexed.
However, the child pages use the same link format as their parents, all being some variation of this regex:
^https:\/\/bec-sv\.atlassian\.net\/wiki\/spaces\/PSC\/pages\/.*$
This means the crawl follows links to other pages under PSC, and I cannot exclude them because their URL pattern is the same as that of the master document's children.
Is there a plugin or configuration setting I can use to make the page URLs reflect the page tree structure, instead of all of them being seemingly random numbers after /pages/?
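
For reference, here is a rough sketch of the workaround I am considering if the URLs cannot be changed: build the include regex from an explicit list of page IDs instead of relying on the URL shape. It assumes I can get the IDs of the master page and its descendants from somewhere (the page tree, or possibly the REST API; I have not verified this for my setup). The IDs below are made up, and `build_include_pattern` is just a name I picked for illustration.

```python
import re

# Base URL taken from the pattern above; the numeric page ID follows /pages/.
BASE = "https://bec-sv.atlassian.net/wiki/spaces/PSC/pages/"


def build_include_pattern(page_ids):
    """Return a compiled regex that matches only pages whose ID is in page_ids."""
    ids = sorted({str(pid) for pid in page_ids})
    if not ids:
        raise ValueError("page_ids must not be empty")
    alternation = "|".join(re.escape(pid) for pid in ids)
    # After the ID, allow either the end of the URL or a title slug,
    # query string, or fragment. This stops ID 111111 from matching 1111112.
    return re.compile(
        "^" + re.escape(BASE) + "(?:" + alternation + ")(?:[/?#].*)?$"
    )


# Hypothetical IDs for the master page and its descendants (assumption).
descendant_ids = ["111111", "222222", "333333"]
pattern = build_include_pattern(descendant_ids)

for url in (BASE + "111111/Overview", BASE + "999999/Unrelated+Page"):
    print(url, "->", "crawl" if pattern.match(url) else "skip")
```

The downside is that the ID list has to be regenerated whenever pages are added or moved under the master page, which is why a URL scheme that mirrors the tree would be much easier.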