
How to scrape the pages requiring login?

Kamalanathan S G June 16, 2017

I am trying to crawl our wiki pages, but the required authentication prevents me from getting the form field names or anything else. Is there a better way to do this? Thanks!

 

1 answer

0 votes
Aron Gombas _Midori_
Community Leader
June 19, 2017

If the web interface requires you to log in because the page does not allow anonymous access, then your web scraper must authenticate too. It is doing essentially the same thing as the browser: requesting an HTTP response for that Confluence page.

I suggest using the Confluence REST API instead of scraping the HTML, but even then you need to authenticate.
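As a rough illustration of the REST approach, here is a minimal Python sketch that sends HTTP Basic credentials with a request for a single page. The base URL, page ID, and credentials are placeholders, and the helper names are made up for this example. It assumes your instance exposes the `/rest/api/content/{id}` resource at that path and accepts Basic authentication (a hosted instance may require an API token in place of the password, and may serve the API under a different prefix), so check this against your own setup.

```python
import base64
import json
import urllib.request


def build_page_request(base_url, page_id, username, secret):
    """Build an authenticated GET request for one Confluence page.

    `secret` is a password or API token, depending on what the
    instance accepts (an assumption; verify for your deployment).
    """
    url = f"{base_url.rstrip('/')}/rest/api/content/{page_id}?expand=body.storage"
    # HTTP Basic auth: base64 of "username:secret" in the Authorization header.
    token = base64.b64encode(f"{username}:{secret}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


def fetch_page_storage(request):
    """Send the request and return the page body in storage format."""
    with urllib.request.urlopen(request) as response:
        page = json.load(response)
    return page["body"]["storage"]["value"]
```

Usage would look something like `fetch_page_storage(build_page_request("https://wiki.example.com", "12345", "alice", "secret"))`, with every argument replaced by your own values. Without valid credentials the server should answer with an HTTP error, which `urlopen` raises as `urllib.error.HTTPError`.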
