wget to download Confluence attachment

Vi Nguyen March 15, 2016

How can I use wget to download a Confluence attachment without any plugins?

The command:

wget http://{yourconfluence}/download/attachments/{pageid}/{filename}

results in an HTML file rather than the file itself.  In my specific case I am trying to download a PowerPoint file.  If I enter the same URL in a browser, the browser downloads the file correctly.

6 answers

1 accepted

1 vote
Answer accepted
William Smith
I'm New Here
Those new to the Atlassian Community have posted less than three times. Give them a warm welcome!
March 22, 2016

Try curl with the -O option. I had success using a command like this:

curl -O -uwilliam:password "http://confluence.mydomain.com/download/attachments/12345678/readme.txt"

 

Note that I stripped "?api=v2" from the download url. It wasn't necessary in my case and curl will use it in the file name.

 

This was with Confluence 4.3.7. I have no other version to test.
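
For anyone scripting this, the query-stripping step can be done in the shell before calling curl. This is only a sketch: the function names and the CONFLUENCE_USER / CONFLUENCE_PASS variables are made up for illustration, and it has not been checked against any particular Confluence version.

```shell
# Drop everything from the first "?" onward (e.g. "?api=v2"),
# so that curl -O derives a clean local file name from the URL.
strip_query() {
    printf '%s\n' "${1%%[?]*}"
}

# Hypothetical wrapper: CONFLUENCE_USER and CONFLUENCE_PASS are assumed
# environment variables, not anything Confluence itself defines.
fetch_attachment() {
    curl -O -u "${CONFLUENCE_USER}:${CONFLUENCE_PASS}" "$(strip_query "$1")"
}
```

Calling fetch_attachment with the "?api=v2" URL would then request the same URL as the command above, with the query removed.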

Vi Nguyen March 22, 2016

curl with -O does work with the one Confluence setup that I need it for (although it didn't work with another Confluence setup I was testing with).  I guess this is the workaround I'll use, since wget doesn't seem to work.  Thanks.

2 votes
Tommy Powell
I'm New Here
March 19, 2019

Using curl resulted in a zero byte file so I ended up accessing my file this way:

wget --header="Authorization: Basic Z2l0Y29uZmlnQGhh" "https://mysite.atlassian.net/wiki/download/attachments/1841544/US_export_policy.jar"

Tommy Powell
I'm New Here
March 19, 2019

I forgot to mention that I ran the following command to encode the username and password:

 

printf '%s' 'user:password' | base64
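
The encoding step and the header can be combined into one small helper. A sketch only: basic_auth_header is a made-up name, and the tr call is there on the assumption that some base64 implementations wrap long output across lines.

```shell
# Build an HTTP Basic "Authorization" header from a user name and password.
# printf '%s' keeps a trailing newline out of the encoded value;
# tr -d '\n' removes any line wrapping that base64 may add.
basic_auth_header() {
    token=$(printf '%s' "$1:$2" | base64 | tr -d '\n')
    printf 'Authorization: Basic %s\n' "$token"
}
```

It could then be used as: wget --header="$(basic_auth_header user password)" "<url>". Keep in mind that base64 is an encoding, not encryption, so the header should be treated like the password itself.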

Tania
I'm New Here
September 16, 2019

This solution worked for me.

2 votes
Shuma Dev April 11, 2018

I solved it with an approach that also seems to apply to other Atlassian products (e.g. JIRA).

You need to have the login info in the URL itself.

$client = new-object System.Net.WebClient

$client.DownloadFile("http://Companywiki/download/attachments/211190363/St-1.0.docx?os_username=myusername&os_password=mypassword","D:\St-1.0.docx")

Shuma Dev April 11, 2018

Also remove anything that was added after the file extension in the URL (such as the version and modificationDate parameters).

ex:  http://mycompanywiki/download/attachments/211190363/St-1.0.docx?version=1&modificationDate=1522908951708?os_username=myusername&os_password=mypassword","D:\St-1.0.docx")

 

to 

http://mycompanywiki/download/attachments/211190363/St-1.0.docx?os_username=myusername&os_password=mypassword","D:\St-1.0.docx")

Ph August 8, 2018

Thanks a lot! I searched a while for this.

Did you find out how to encode the password?

KANGOD
I'm New Here
June 19, 2019

This works, but when used with wget, the downloaded file name will contain ?os_username……

I don't know if there is a convenient wget option to keep the original name "St-1.0.docx".

wget ... -O "St-1.0.docx" is too complicated.

 

Similarly, ?os_authType=basic has the same issue.

 

----

Update 1:

Inspired by https://developer.atlassian.com/server/jira/platform/basic-authentication/#construct-the-authorization-header, I found 

wget --header="Authorization: Basic <encoded-string-usr:pwd>" <url>

wget --content-disposition <url>\?os_authType=basic     (Note: with usr/pwd predefined in ~/.netrc see `man netrc`)

curl -O --header "Authorization: Basic <encoded-string-usr:pwd>" <url>
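
For the ~/.netrc variant, an entry is a single line naming the host, login and password. The helper below just prints such a line. netrc_entry is a made-up name, and the host and credentials in the usage note are placeholders.

```shell
# Print one ~/.netrc entry in the "machine ... login ... password ..." form
# described in `man netrc`. Arguments: host, user name, password.
netrc_entry() {
    printf 'machine %s login %s password %s\n' "$1" "$2" "$3"
}
```

Something like netrc_entry confluence.mydomain.com myname mypassword >> ~/.netrc, followed by chmod 600 ~/.netrc, would add the entry and keep the file private. The password is stored there in plain text.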

 

----

Update 2:

So

function jiraget {
wget --no-check-certificate --content-disposition "$1?os_authType=basic"
}

Then we can `jiraget <file_url>`
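
If --content-disposition is not an option, another way to keep the original name is to work it out from the URL and pass it to wget -O. A rough sketch with made-up function names. It does not decode URL-encoded names such as Al%27s+comments.doc.

```shell
# Derive the attachment's file name from its URL:
# drop the query string ("?os_username=...", "?version=1&...") and
# keep only the last path segment.
attachment_name() {
    path=${1%%[?]*}
    printf '%s\n' "${path##*/}"
}

# Hypothetical wrapper: download the full URL but save it under the clean name.
wget_named() {
    wget -O "$(attachment_name "$1")" "$1"
}
```

With the URL from the earlier answer, attachment_name yields St-1.0.docx, so the credentials in the query string never end up in the local file name.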

1 vote
Shuma Dev April 10, 2018

Was this ever resolved?? 

No. The command above that is marked as solved does NOT work.

I am getting exactly the same issue.

The downloaded file (supposed to be a .doc) starts with <!DOCTYPE html>

The file size is much smaller. And if I manually change the file extension to .html, it opens as a cached copy of what looks like the login page. So the credentials are not getting transferred.

wget works fine on other sites, e.g.:

This works.

wget -O my.pdf  "https://www.tutorialspoint.com/vbscript/vbscript_tutorial.pdf"

But this doesn't work

wget -O my.pdf  --user user --password mypassword "http://tsm-wiki/download/attachments/38897186/Al%27s+comments.doc?version=1&modificationDate=1351755856000"

 

Any help is much appreciated.

Cheers.

1 vote
Nic Brough -Adaptavist-
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
March 15, 2016

Give it the user name and password for someone who can see the page, and the URL for the attachment you're trying to get.  Basically: wget --user=user --password=pass http://yourconfluence/pathtoattachment

Vi Nguyen March 15, 2016

It is not a permissions issue.  Using

wget http://yourconfluence/downloads/attachments/pageid/filename 

Results in an html file rather than the file itself.  

Nic Brough -Adaptavist-
Rising Star
March 15, 2016

You did not explain that in the original question.  You only asked how you use wget and I told you.

Now you've changed the question to something totally different, I can try again, although the answer is basically the same - hit the url for the file with wget and a username and password.

The problem you're having is probably that the url is wrong (download, not downloads) and you may need to add the api version to it by adding &api=2 to the end.

Vi Nguyen March 16, 2016

Yes, I changed the question in hopes of making the issue clearer.  

Fixing the misspelling (downloads --> download) and adding the &api=2 still did not work.  Tried adding a ?api=2 as well.  All resulted in an html page.

Nic Brough -Adaptavist-
Rising Star
March 16, 2016

OK, you're doing something differently to me then.  I get the file I ask for.

Two further questions -

  1. What do you get when you use a browser to hit the same link?
  2. When wget fetches html, what does it actually say?  Open the file in a browser?

 

Vi Nguyen March 16, 2016
  1. Using chrome, I pasted the URL in the address bar and I get the "Save As" dialog box.  When I've saved the selection, I am able to open the file (in this case a powerpoint).
  2. When wget fetches the same URL, it logs that it is connecting, then saving to the filename, then 100% complete, and then the file size which is much smaller than the size of the powerpoint.  I've also tried using curl -O URL with the same result as the wget command.

 

Nic Brough -Adaptavist-
Rising Star
March 16, 2016

Ok, great, so the browser is getting the right file.  So I don't know why wget doesn't.  What does the file that wget actually saves say?

Vi Nguyen March 17, 2016

The saved file has the same filename as the attachment (e.g. filename.ppt).  It won't open in PowerPoint, but I can open it in Notepad and it starts with "<!DOCTYPE html PUBLIC". If I manually change the extension to .html and open it in a browser, the page looks like a Confluence log-in page.  I guess this means the username and password given on the wget command line aren't being passed on to Confluence's login request?

In a new browser, entering the URL does ask me for the username and password (didn't notice it before since I was already logged into Confluence).
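
A script can catch this case by checking whether the saved file starts like an HTML page instead of the expected binary. The check below is a heuristic sketch, not anything Confluence-specific. looks_like_html is a made-up name, and the 256-byte window is an arbitrary choice.

```shell
# Succeeds (exit status 0) if the first 256 bytes of the file contain an
# HTML doctype or an <html> tag, which suggests a login page was saved
# instead of the attachment.
looks_like_html() {
    head -c 256 "$1" | grep -qiE '<!DOCTYPE html|<html'
}
```

For example: if looks_like_html filename.ppt; then echo "got HTML, probably not authenticated"; fi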

 

Nic Brough -Adaptavist-
Rising Star
March 17, 2016

Yes, that's correct - you aren't logged in, so Confluence serves up a login page (in html) to ask the process to log in.  How are you doing the username:password part of the wget?

Vi Nguyen March 17, 2016

wget --user myname --password mypassword URL

 

(just a side note, I'm limited to 3 responses per day since I am new to the forum)

 

0 votes
Robert Jacob
Contributor
February 11, 2019

wget is also not working for me.  I get a small file with html in it instead of the file I want.

 

If I try to put the username and password in the URL as one user suggested, I get "Permission denied".
