Get and set codec for QString? - c++

I'm on Windows and I assume the default codec for QString is GBK, but I have to send some content to a Linux platform that doesn't support GBK. The content is CJK text, so I decided to use UTF-8.
How can I find out which codec QString is using, and how can I set it?
Here's the line:
packet = packet.arg(MAC, operation, text_type, text.toUtf8());
I'm trying to insert some CJK text into a normal QString.

You do not necessarily need to think about the codec. What about:
QString::fromLocal8Bit(myInput).toUtf8();
This should work fine. If you really need to handle the codec manually, look at QTextCodec.

Related

How to turn UTF8 std::string into a NSString?

Hello, I have a project using both Objective-C and C++. I never set any encoding, and the right panel of the file page says “no specific encoding set”, but I've read that NSString is natively UTF-16. How would I convert a C++ string (UTF-8) to an NSString (UTF-16)?
You can use the std::string::data() method to get access to the raw bytes of the std::string. Once you have that, you can use the init(bytes:length:encoding:) initializer of NSString (initWithBytes:length:encoding: in Objective-C) to convert the raw bytes into an NSString. Specify that the encoding is UTF-8 (NSUTF8StringEncoding).

what is "engineName" property in TI gstreamer plugin TIVidenc1

What is "engineName" property in TI Gstreamer plugin TIVidenc1?
And what values can it take?
(I only know codecServer... what else can it be?)
According to TI's documentation (which doesn't explain much):
Engine name used by codec combo.
Here are some good examples of usage.
I have also seen this value:
encode (used in network streaming but also H264 encoding to file)
With the TIVidenc element I have also seen:
- hmjcp (used with H264 encoding)
You can use gst-inspect TIVidenc1 to check the values (hopefully they are listed there) if you have this plugin installed.
If you are keen, you can also check the sources.

WIC WINCODEC_ERR_BADHEADER only for JPEG images

I have a simple encoding/decoding application using the Windows Imaging Component API. When I use either the JPEG XR or BMP format, everything works fine. However, when I use the JPEG codec, the encoder works fine and I can visually verify the generated JPEG image, but when I try to decode that stream, I get WINCODEC_ERR_BADHEADER (0x88982f61).
Here's the line that fails:
hr = m_pFactory->CreateDecoderFromStream(
    pInputStream,
    NULL,
    WICDecodeMetadataCacheOnDemand,
    &pDecoder);
Here pInputStream is an IStream created from a byte array (output of the encoder - a black box which outputs a byte vector).
Please help! This is driving me nuts!
When passing a stream as an argument, make sure to seek it to the proper initial position first (in particular, seek it back to the beginning if you just wrote data into it and now expect to read it back). APIs typically do not seek for you, because that lets you supply data that starts in the middle of a bigger stream.

How do I decode UTF-8?

I have a UTF-8-encoded string.
This string is first saved to a file and then sent via Apache to a process written in C++, which receives it using Curl.
How can I decode the string in the C++ process?
There is a very good article on CodeProject that shows how to read UTF-8. Alternatively, http://utfcpp.sourceforge.net/ also provides functions for this (see also: C++ & Boost: encode/decode UTF-8).
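To make "decoding" concrete, here is a minimal hand-written sketch that turns a UTF-8 byte string into Unicode code points. It is not taken from the CodeProject article or from utfcpp; the name decode_utf8 is hypothetical, and the sketch deliberately does not reject overlong forms or surrogate code points, which a production library would.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: decode a UTF-8 byte string into Unicode code points.
// Throws std::runtime_error on a bad lead byte, a bad continuation byte,
// or a sequence that is cut off at the end of the input.
// Simplification: overlong forms and surrogates are NOT rejected.
std::vector<char32_t> decode_utf8(const std::string& bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes that follow

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        // The lead byte plus its continuation bytes must fit in the input.
        if (bytes.size() - i <= extra)
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);  // append the low 6 payload bits
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

For example, the bytes D0 BD decode to the single code point U+043D (Cyrillic "н"). If the process only forwards or stores the text, no decoding is needed at all; the bytes can be passed through unchanged.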

Reading file with cyrillic

I have to open a file containing Cyrillic characters. I've saved the file as UTF-8. Here is an example:
en: Couldn't your family afford a
costume for you
ru: Не ваша семья
позволить себе костюм для вас
Here is how I open the file:
ifstream readFile(fileData.c_str());
while (!readFile.eof())
{
    std::getline(readFile, buffer);
    ...
}
The first problem: there are some extra characters before the text 'en' (I saw this in the debugger):
"ï»¿en: least"
The other problem is the Cyrillic characters:
"ru: Ð½Ð°Ð¸Ð¼ÐµÐ½ÑŒÑˆÐ¸Ð¹"
What's wrong?
there is some symbol before text 'en'
That's a faux-BOM, the result of encoding a U+FEFF BYTE ORDER MARK character into UTF-8.
Since UTF-8 is an encoding that does not have a byte order, the faux-BOM shouldn't ever be used, but unfortunately quite a bit of existing software (especially in the MS world) does nonetheless. Load the messages file into a text editor and save it back out again as UTF-8, using a “UTF-8 without BOM” encoding if one is especially listed.
ru: Ð½Ð°Ð¸Ð¼ÐµÐ½ÑŒÑˆÐ¸Ð¹
That's what you get when you've got a UTF-8 byte string (representing наименьший) and you print it as if it were a Code Page 1252 (Windows Western European) byte string. It's not an input problem; you have read in the string OK and have a UTF-8 byte string. But then, in code you haven't quoted, it gets output as cp1252.
If you're just printing it to the console, this is to be expected, as the console always uses the system default code page (1252 on a Western Windows install), and not UTF-8. If you need to send Unicode to the console you'll have to convert the bytes to native-Unicode wchar_t strings and write them from there. I don't know what the final destination for your strings is though... if you're just going to write them to another file or something you could just keep them as bytes and not care about what encoding they're in.
I suppose that your OS is Windows. There are several simple ways:
Use wchar_t, wstring, wifstream, etc.
Use the ICU library.
Use some other library (there are really many of them).
Note: for console printing you must use WinAPI functions to convert UTF-8 to cp866 (my default Cyrillic Windows encoding is cp1251), because the Windows console supports only DOS encodings.
Note: for file output you need to know which encoding your file uses.
Use libiconv to convert the text to a usable encoding after reading.
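A minimal sketch of what that could look like with the POSIX iconv interface (iconv_open, iconv, iconv_close), which libiconv also provides. The wrapper name convert_encoding is hypothetical. The sketch assumes a stateless target encoding, so it skips the final flush call, and it assumes the glibc-style signature where the input pointer is char**; some platforms declare it as const char** instead.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: convert `input` from encoding `from` to encoding `to`.
// Throws std::runtime_error if the conversion is unsupported or the input
// is invalid or incomplete in the source encoding.
std::string convert_encoding(const std::string& input,
                             const char* from, const char* to) {
    iconv_t cd = iconv_open(to, from);  // note the order: target first
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open: conversion not supported");

    std::string output;
    char chunk[256];
    char* in = const_cast<char*>(input.data());  // iconv does not modify it
    std::size_t in_left = input.size();

    while (in_left > 0) {
        char* out = chunk;
        std::size_t out_left = sizeof chunk;
        const std::size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
        const int err = errno;  // save before any other call can change it
        output.append(chunk, sizeof chunk - out_left);
        // E2BIG only means the chunk filled up; loop again for the rest.
        if (rc == static_cast<std::size_t>(-1) && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv: invalid or incomplete input");
        }
    }
    iconv_close(cd);
    return output;
}
```

For instance, convert_encoding(line, "UTF-8", "UTF-16LE") would turn a line read from the file into UTF-16 little-endian bytes. Which target encoding is actually "usable" depends on where the text goes next, and the set of supported encoding names depends on the iconv implementation.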
Use icu to convert the text.