"Random" character introduced when calling "Get Content"

Karl Wallin March 4, 2021

Language: PowerShell

Hello,

I've successfully been able to use the Confluence API to fetch the contents of a specific page using "Get Content";
https://developer.atlassian.com/cloud/confluence/rest/api-group-content/#api-api-content-get

The content on the page is a table. I parse the response by stripping all the HTML tags with a regex; each row of the table then becomes an array, and I use the array index to fetch a specific value.

The table consists of 7 columns, of which the first 5 always have data, e.g.:
Name, Expiration, System, CA, Team

So setting "$nameofcert = $individarr[0]" gives me the "Name" of an individual row in the Confluence table, and "$team = $individarr[4]" gives the "Team" entered for that row.

Until yesterday the response from the API always gave the same result. Since yesterday, without anyone modifying the content I am fetching (I've checked), I all of a sudden get a "random" character which skews the index.

<tr>
<td colspan="1" class="confluenceTd">This is the NAME of an entry</td>
<td colspan="1" class="confluenceTd">
<time datetime="2021-03-16" class="date-future">16 Mar 2021</time>
</td>
<td colspan="1" class="confluenceTd">This is the SYSTEM of an entry</td>
<td colspan="1" class="confluenceTd">This is the CA of an entry</td>
<td colspan="1" class="confluenceTd">This is the TEAM of an entry</td>
<td colspan="1" class="confluenceTd">This is the COMMENT of an entry</td>
<td colspan="1" class="confluenceTd">
<br />
</td>
</tr>

The character "Â" is introduced in the response. It is NOT entered in the actual row / table, and I never received it before yesterday.
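
One common way a stray "Â" appears is when a non-breaking space (U+00A0), which UTF-8 encodes as the two bytes 0xC2 0xA0, is decoded with a single-byte code page instead of UTF-8. Whether that is what happens here is an assumption; the sketch below only reproduces the effect. It uses ISO-8859-1, which maps these two bytes the same way Windows-1252 does.

```powershell
# Assumption: the cell contains a non-breaking space (U+00A0) after the date.
$original = "16 Mar 2021" + [char]0x00A0

# UTF-8 encodes U+00A0 as the two bytes 0xC2 0xA0.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($original)

# Decoding those bytes with a single-byte code page turns 0xC2 into "Â"
# and leaves 0xA0 as a non-breaking space.
$latin1  = [System.Text.Encoding]::GetEncoding('iso-8859-1')
$misread = $latin1.GetString($utf8Bytes)

# Decoding the same bytes as UTF-8 gives the original text back.
$correct = [System.Text.Encoding]::UTF8.GetString($utf8Bytes)

$misread.Length   # 13: one extra character compared to the original
$correct.Length   # 12
```

If the extra character in the real response sits at the same position (right after the date), that would be consistent with this kind of decoding mismatch.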

I am using "Invoke-RestMethod" and tried specifying:
'Content-Type' = 'application/json ; charset=utf-8'
but no change (earlier I did not specify the charset).

I have not done any POSTs to the Confluence-page and the page has not been modified from when it worked ("Â" not introduced in response) to when it did not ("Â" introduced in the response).

Really scratching my head here. All ideas would be appreciated; however, the only language I use / know is PowerShell.

To illustrate further, the response above is parsed into:

This is the NAME of an entry
16 Mar 2021
 
This is the SYSTEM of an entry
This is the CA of an entry
This is the TEAM of an entry
This is the COMMENT of an entry

instead of the expected

This is the NAME of an entry
16 Mar 2021
This is the SYSTEM of an entry
This is the CA of an entry
This is the TEAM of an entry
This is the COMMENT of an entry
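
One way to keep the indexes stable regardless of stray characters might be to extract each <td> cell separately and clean it, instead of stripping all tags and relying on the leftover lines. A minimal sketch, assuming the row HTML looks like the sample row above and that the stray character is "Â" followed by a non-breaking space; Get-RowCells is a hypothetical helper name, not part of any module:

```powershell
function Get-RowCells {
    param([Parameter(Mandatory)][string]$RowHtml)

    # (?s) lets "." match newlines so multi-line cells are captured whole.
    $cellMatches = [regex]::Matches($RowHtml, '(?s)<td[^>]*>(.*?)</td>')

    foreach ($m in $cellMatches) {
        # Drop inner tags such as <time> and <br />.
        $text = $m.Groups[1].Value -replace '<[^>]+>', ''
        # Assumption: replace a non-breaking space, with an optional "Â" in front of it.
        $text = $text -replace '\u00C2?\u00A0', ' '
        # Emit one string per cell, so empty cells still occupy an index.
        $text.Trim()
    }
}

# Shortened sample row with the stray characters appended after </time>.
$stray = [string][char]0x00C2 + [char]0x00A0
$row = @(
    '<tr>'
    '<td colspan="1" class="confluenceTd">This is the NAME of an entry</td>'
    '<td colspan="1" class="confluenceTd">'
    ('<time datetime="2021-03-16" class="date-future">16 Mar 2021</time>' + $stray)
    '</td>'
    '<td colspan="1" class="confluenceTd">This is the SYSTEM of an entry</td>'
    '<td colspan="1" class="confluenceTd"><br /></td>'
    '</tr>'
) -join "`n"

$cells = @(Get-RowCells -RowHtml $row)
$cells[1]   # 16 Mar 2021
$cells[2]   # This is the SYSTEM of an entry
```

Because every <td> produces exactly one array element, an extra character inside a cell can no longer shift the position of the cells after it.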

Best Regards

Update:

Ah yes, this is probably because I first developed this on my computer using VS Code, which sets the encoding to UTF-8, and then just copy-pasted it to another machine and tested it in "Task Scheduler". It probably has something to do with encoding in VS Code vs PowerShell:
https://docs.microsoft.com/en-us/powershell/scripting/dev-cross-plat/vscode/understanding-file-encoding?view=powershell-7.1

Update 1:

This is probably caused by me not grasping how encoding works, but if I specify either:

'Content-Type' = 'application/json ; charset=utf-8'

or

'Charset' = 'utf-8'

shouldn't the response be in that encoding?

Even if I specify that, the response from "Invoke-WebRequest" shows "windows-1252" in the returned "ParsedHtml.defaultCharset".
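
As far as I understand it, the Content-Type request header describes the body being sent, not the encoding of the response, so setting it would not be expected to change how the response is decoded. A possible workaround, sketched below under the assumption that the API returns UTF-8 JSON, is to take the raw response bytes from Invoke-WebRequest and decode them as UTF-8 explicitly. ConvertFrom-Utf8Json is a hypothetical helper name, and $uri / $headers in the commented usage are placeholders for the real values.

```powershell
function ConvertFrom-Utf8Json {
    param([Parameter(Mandatory)][byte[]]$Bytes)

    # Decode explicitly as UTF-8 instead of relying on the cmdlet's guess.
    $json = [System.Text.Encoding]::UTF8.GetString($Bytes)
    $json | ConvertFrom-Json
}

# Usage against a live endpoint (not run here; $uri and $headers are placeholders):
# $response = Invoke-WebRequest -Uri $uri -Headers $headers -UseBasicParsing
# $page     = ConvertFrom-Utf8Json -Bytes $response.RawContentStream.ToArray()

# Offline check with the UTF-8 bytes of a small JSON document containing U+00A0:
$sample = '{"value":"16 Mar 2021' + [char]0x00A0 + '"}'
$bytes  = [System.Text.Encoding]::UTF8.GetBytes($sample)
$parsed = ConvertFrom-Utf8Json -Bytes $bytes
$parsed.value.Length   # 12
```

Decoding the bytes directly sidesteps whatever charset the cmdlet infers from the response headers, at the cost of one extra step before ConvertFrom-Json.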

1 answer

0 votes
Brant Schroeder
Community Leader
November 19, 2021

@Karl Wallin 

I am not sure that there is any way to specify the encoding being used and have the API act on it. I know that you can specify base64 per their documentation.
