chatGPT result encoding
14 February, 2023
chatGPT returns the result as a UTF-8 byte sequence in text form. Any character outside 7-bit ASCII, such as accented characters or text in non-Latin scripts, therefore comes out as unreadable text.
For example, a result returned for the Spanish language:
Â¿QuÃ© habitaciones tienes disponibles?
Expected result:
¿Qué habitaciones tienes disponibles?
Result returned for the Japanese language:
ã©ã®é¨å±ãå©ç¨å¯è½ã§ããï¼
Expected result:
どの部屋が利用可能ですか?
You need to convert the result to bytes using the iso-8859-1 encoding and then decode those bytes as UTF-8.
For example, 'é' is encoded in UTF-8 as the two-byte sequence 0xc3 0xa9. Read one byte per char, 0xc3 is 'Ã' and 0xa9 is '©'.
So instead of 'é', chatGPT sends 'Ã©', which is the raw UTF-8 byte sequence shown as text.
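The byte values above can be checked directly. The following is a minimal sketch, not part of the original script; the variable names are illustrative, and the character is built from its code point so that the encoding of the script file itself cannot interfere.

```powershell
# U+00E9 (é), built from its code point rather than typed as a literal.
$char = [string][char]0xE9

# UTF-8 encodes é as two bytes.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($char)
$hex = ($utf8Bytes | ForEach-Object { '0x{0:X2}' -f $_ }) -join ' '

# Reading those bytes one byte per char (iso-8859-1) yields two chars: Ã and ©.
$misread = [System.Text.Encoding]::GetEncoding('iso-8859-1').GetString($utf8Bytes)

$hex              # 0xC3 0xA9
$misread.Length   # 2
```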
The string 'Ã©' is a sequence of chars whose values are the bytes 0xc3 0xa9. To get the correct Unicode string, the string elements need to be mapped back to byte elements.
[byte[]]$byteContent = [System.Text.Encoding]::GetEncoding("iso-8859-1").GetBytes($resultText)
This is done with the iso-8859-1 encoding, which maps each char to a single byte with the same value. The resulting bytes can then be decoded correctly as UTF-8 into a Unicode string:
# Run chatGPT query.
$result = (Invoke-RestMethod @RestMethodParameter)
[string]$resultText = $result.choices[0].text
[byte[]]$byteContent = [System.Text.Encoding]::GetEncoding("iso-8859-1").GetBytes($resultText)
# Decode the bytes as UTF-8 to get the readable result.
[string]$text = [System.Text.Encoding]::UTF8.GetString($byteContent)
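The conversion can be tried without calling the API. The sketch below simulates the faulty decoding on the Spanish example and then applies the same two steps to undo it. The variable names are illustrative, and the simulated input is an assumption standing in for a real response.

```powershell
$latin1 = [System.Text.Encoding]::GetEncoding('iso-8859-1')

# The expected text; ¿ (U+00BF) and é (U+00E9) are built from code points.
$expected = "$([char]0xBF)Qu$([char]0xE9) habitaciones tienes disponibles?"

# Simulate the problem: UTF-8 bytes read as one char per byte.
$garbled = $latin1.GetString([System.Text.Encoding]::UTF8.GetBytes($expected))

# Undo it: map each char back to a byte, then decode the bytes as UTF-8.
$repaired = [System.Text.Encoding]::UTF8.GetString($latin1.GetBytes($garbled))

$garbled.Length - $expected.Length   # 2 (each accented char became two chars)
$repaired -ceq $expected             # True
```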
Here is a full example of how to use chatGPT in PowerShell:
# https://platform.openai.com/account/api-keys
$apikey = "sk-...."
<#
– Model [Required]
ChatGPT offers multiple models. Each model has its own features, strengths, and use cases. You need to select one model when building the request. The models are:
text-davinci-003: Most capable GPT-3 model. It can do any task the other models can do, often with higher quality, longer output, and better instruction-following. It also supports inserting completions within the text.
text-curie-001: Very capable, but faster and lower cost than Davinci.
text-babbage-001: Capable of straightforward tasks, very fast, and lower cost.
text-ada-001: Capable of very simple tasks, usually the fastest model in the GPT-3 series, and lowest cost.
#>
$requestBody = @{
prompt = "What is the capital of Germany?"
model = "text-ada-001"
temperature = 1
stop = "."
} | ConvertTo-Json
$header = @{
Authorization = "Bearer $apikey"
}
$restMethodParameter = @{
Method = 'Post'
Uri = 'https://api.openai.com/v1/completions'
body = $requestBody
Headers = $header
ContentType = 'application/json'
}
# Run chatGPT query.
$result = (Invoke-RestMethod @restMethodParameter)
[string]$resultText = $result.choices[0].text
[byte[]]$byteContent = [System.Text.Encoding]::GetEncoding("iso-8859-1").GetBytes($resultText)
# Decode the bytes as UTF-8 to get the readable result.
[string]$text = [System.Text.Encoding]::UTF8.GetString($byteContent)
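One caveat: the conversion is only safe on text that was actually misread. Applied to a string that is already correct, it would damage it. The following is a hedged sketch of a guarded variant. The function name Repair-Utf8Text is hypothetical and not part of PowerShell or the API, and the check is a heuristic rather than a guarantee.

```powershell
function Repair-Utf8Text {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string]$Text
    )

    # A char above U+00FF cannot come from a one-byte-per-char misreading,
    # so the text is assumed to be decoded correctly already.
    foreach ($c in $Text.ToCharArray()) {
        if ([int]$c -gt 0xFF) { return $Text }
    }

    $bytes = [System.Text.Encoding]::GetEncoding('iso-8859-1').GetBytes($Text)

    # Strict decoder: throws on invalid UTF-8 instead of inserting U+FFFD.
    $strictUtf8 = New-Object System.Text.UTF8Encoding -ArgumentList $false, $true
    try {
        return $strictUtf8.GetString($bytes)
    }
    catch {
        # The bytes are not valid UTF-8, so the text was not misread; keep it.
        return $Text
    }
}

# Usage: misread text is repaired, correct text passes through unchanged.
$fixed = Repair-Utf8Text -Text "$([char]0xC3)$([char]0xA9)"   # é (U+00E9)
$kept  = Repair-Utf8Text -Text "$([char]0xE9)"                # é, unchanged
```

With this in place, the last two lines of the example could be replaced by a single call such as `$text = Repair-Utf8Text -Text $resultText`. Note that a correct string which happens to form valid UTF-8 when mapped to bytes (for example the literal text 'Ã©') would still be converted.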