File reading and writing in cp866 encoding in C++

How do I correctly read and write a file containing text in cp866 encoding in C++?
UPD: I found a way to write to a file:
wofstream rstrm(fileName);
rstrm.imbue(locale("rus_rus.866"));
rstrm << text_in_cyrillic.c_str();
rstrm.close();
Now, how can I read a file in a similar way? I need to read the file content into a tstring object.

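Use WideCharToMultiByte with code page 866 and write the resulting bytes to the file. For reading, the reverse conversion is MultiByteToWideChar with the same code page.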
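For the reading half of the question, the conversion has to go from CP866 bytes to wide characters. The sketch below is a portable stand-in for the Win32 call, not the call itself: it maps the bytes by hand and covers only ASCII and the Cyrillic letters, replacing everything else (box-drawing characters and other symbols) with U+FFFD. The names cp866_to_wchar, decode_cp866 and read_cp866_file are made up for illustration.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: maps one CP866 byte to a Unicode code point.
// Only ASCII and the Cyrillic letters are handled in this sketch.
wchar_t cp866_to_wchar(unsigned char b) {
    if (b < 0x80) return static_cast<wchar_t>(b);                    // ASCII
    if (b <= 0xAF) return static_cast<wchar_t>(0x0410 + (b - 0x80)); // U+0410..U+043F
    if (b >= 0xE0 && b <= 0xEF)
        return static_cast<wchar_t>(0x0440 + (b - 0xE0));            // U+0440..U+044F
    if (b == 0xF0) return L'\x0401';                                 // capital IO
    if (b == 0xF1) return L'\x0451';                                 // small io
    return L'\xFFFD';                                                // not covered here
}

// Decodes a buffer of CP866 bytes into a wide string.
std::wstring decode_cp866(const std::string& bytes) {
    std::wstring out;
    out.reserve(bytes.size());
    for (unsigned char b : bytes) out.push_back(cp866_to_wchar(b));
    return out;
}

// Reads the whole file in binary mode and decodes it.
// Returns an empty string if the file cannot be opened.
std::wstring read_cp866_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::string bytes((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());
    return decode_cp866(bytes);
}
```

A call such as read_cp866_file("input.txt") would return the text as a std::wstring, which is what tstring usually aliases in a Unicode build.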

Related

CStdioFile problems with encoding on read file

I can't read a file correctly using CStdioFile.
I open notepad.exe, type àèìòùáéíóú and save it twice: once with the encoding set to ANSI (which is really CP-1252) and once as UTF-8.
Then I try to read it from MFC with the following block of code:
BOOL ReadAllFileContent(const CString &FilePath, CString *fileContent)
{
    CString sLine;
    BOOL isSuccess = FALSE;
    CStdioFile input;
    isSuccess = input.Open(FilePath, CFile::modeRead);
    if (isSuccess) {
        while (input.ReadString(sLine)) {
            fileContent->Append(sLine);
        }
        input.Close();
    }
    return isSuccess;
}
When I call it with the ANSI file, I get the expected result àèìòùáéíóú,
but when I try to read the UTF8 encoded file I've got Ã Ã¨Ã¬Ã²Ã¹Ã¡Ã©ÃÃ³Ãº
I would like my function to work with all files regardless of the encoding.
What do I need to implement?
EDIT:
Unfortunately, in the real app the files come from an external app, so changing the file encoding isn't an option. I must be able to read both UTF-8 and CP-1252 files.
Any file is valid ANSI; what Notepad calls ANSI is really the Windows-1252 encoding.
I've figured out a way to read UTF-8 and CP-1252 correctly based on the example provided here. Although it works, I need to pass the file encoding, which I don't know in advance.
Thanks!

I personally use the class as advertised here:
https://www.codeproject.com/Articles/7958/CTextFileDocument
It has excellent support for reading and writing text files in various encodings, including Unicode in its various flavours.
I have not had a problem with it.
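If pulling in a class library is not an option, the detection step on its own can be sketched as follows. This is a heuristic under a stated assumption, namely that the file is either UTF-8 or CP-1252 as in the question: bytes that form well-formed UTF-8 are taken as UTF-8, anything else as CP-1252. The names TextEncoding, is_valid_utf8 and guess_encoding are hypothetical, and pure ASCII input is reported as UTF-8 because it reads the same either way.

```cpp
#include <cstddef>
#include <string>

enum class TextEncoding { Utf8, Cp1252 };

// Returns true if s is well-formed UTF-8 (no overlong forms, no surrogates).
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c < 0x80) { ++i; continue; }       // plain ASCII
        std::size_t extra = 0;                 // number of continuation bytes
        unsigned char lo = 0x80, hi = 0xBF;    // allowed range for the next byte
        if (c >= 0xC2 && c <= 0xDF) {
            extra = 1;
        } else if (c >= 0xE0 && c <= 0xEF) {
            extra = 2;
            if (c == 0xE0) lo = 0xA0;          // reject overlong forms
            if (c == 0xED) hi = 0x9F;          // reject UTF-16 surrogates
        } else if (c >= 0xF0 && c <= 0xF4) {
            extra = 3;
            if (c == 0xF0) lo = 0x90;          // reject overlong forms
            if (c == 0xF4) hi = 0x8F;          // stay below U+110000
        } else {
            return false;                      // stray continuation or invalid lead byte
        }
        if (i + extra >= n) return false;      // sequence is cut off
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char t = static_cast<unsigned char>(s[i + k]);
            if (t < lo || t > hi) return false;
            lo = 0x80; hi = 0xBF;              // later bytes use the normal range
        }
        i += extra + 1;
    }
    return true;
}

// Heuristic: well-formed UTF-8 is reported as UTF-8, everything else as CP-1252.
TextEncoding guess_encoding(const std::string& bytes) {
    return is_valid_utf8(bytes) ? TextEncoding::Utf8 : TextEncoding::Cp1252;
}
```

The file would be read in binary mode into a std::string, passed to guess_encoding, and then converted with the matching code page (for example MultiByteToWideChar with CP_UTF8 or 1252 on Windows). A CP-1252 text can in principle happen to be valid UTF-8, so the guess is not a guarantee.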

boost::property_tree::json_parser::read_json cannot read files if path contains cyrillic characters

Is it possible to open files that have Cyrillic parts in their path? I am able to read and write Cyrillic file contents, but I do not know how to open the file, as json_parser::read_json only accepts a std::string for the path and not a std::wstring. Can anyone help me?

This is a limitation inherited from the C++ standard streams. Microsoft's streams have a non-standard extension to accept wstring paths, but PTree doesn't allow them.
Try using Boost.Filesystem's streams. Open the stream outside the function and pass the open stream to read_json.

Reading the content of file other than ".txt" file

How can I read the content of a file that is not a simple text file in C/C++? For example, I want to read an image file such as .jpg/.png/.bmp and see the value at a certain index, to check what colour it is. Or, if I have a .exe/.rar/.zip, I want to know what value is stored at different indices.
I am aware of C-style file reading, which is:
FILE *fp;
fp = fopen("example.txt","r"); /* open for reading */
int c; /* getc returns int so that EOF can be told apart from data */
c = getc(fp);
I want to know: if I replace "example.txt" with "image.png" or similar, will it work? Will I get correct data?

When you open a non-text file, you'll want to specify binary (untranslated) mode:
FILE *fp = fopen("example.png", "rb");
In a typical case, you do most of your reading from binary files by defining structs that mirror the structures in the file, and then using fread to read from the file into the structure (but this has to be done carefully, to ensure that things like padding in the struct don't differ between the representation in-memory and on-disk).
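As a rough illustration of the padding caveat, here is a sketch that reads the 14-byte BMP file header (one of the formats the question mentions). A struct declared as a 2-byte tag followed by a 4-byte size would get padding after the tag on most compilers, so this version reads the raw bytes and decodes each field explicitly instead of calling fread straight into the struct. The struct and function names are invented for the example.

```cpp
#include <cstdint>
#include <cstdio>

// The fields of the 14-byte BMP file header that this sketch keeps.
struct BmpFileHeader {
    std::uint32_t file_size;    // total file size in bytes
    std::uint32_t data_offset;  // offset of the pixel data from the file start
};

// Reads a 32-bit little-endian value from four raw bytes.
static std::uint32_t read_le32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decodes the header from 14 raw bytes; returns false if the "BM" tag is missing.
bool parse_bmp_file_header(const unsigned char* raw, BmpFileHeader& out) {
    if (raw[0] != 'B' || raw[1] != 'M') return false;
    out.file_size = read_le32(raw + 2);     // bytes 2..5
    out.data_offset = read_le32(raw + 10);  // bytes 10..13 (6..9 are reserved)
    return true;
}

// Opens the file in binary mode and decodes its first 14 bytes.
bool read_bmp_file_header(const char* path, BmpFileHeader& out) {
    std::FILE* fp = std::fopen(path, "rb");  // "b": no newline translation
    if (!fp) return false;
    unsigned char raw[14];
    const std::size_t got = std::fread(raw, 1, sizeof raw, fp);
    std::fclose(fp);
    return got == sizeof raw && parse_bmp_file_header(raw, out);
}
```

Decoding byte by byte also keeps the result independent of the host's endianness, at the cost of a little more code than a single fread into a struct.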

You would need to open the file in binary mode. This allows you to read the bytes in a "raw" mode where they are unchanged from what was in the file.
However, determining the color of a particular pixel, etc. requires that you fully understand the meaning of the bytes in the file and how they are arranged for the file being read. This second requirement is much more difficult. You'll need to do some research on the format of that file type in order to do that.

Yes, of course you can open any file in binary mode in C. If you are interested, you can also read the first few bytes of any such non-text file.
In most cases, each file format has a fixed header, so you can identify the type of the file from it.
Open any Matroska (.mkv) file and read the first 4 bytes; you will always get this:
0x1A 0x45 0xDF 0xA3
You can also view any file in binary representation with the hexdump utility on Linux.
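The header check described here can be sketched as a small signature lookup. The Matroska bytes are the ones quoted in this answer; the PNG, JPEG and ZIP signatures are added from general knowledge of those formats, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// Returns a short label for a few well-known signatures, or "unknown".
std::string sniff_file_type(const unsigned char* data, std::size_t size) {
    struct Signature {
        const char* name;
        const unsigned char* bytes;
        std::size_t length;
    };
    static const unsigned char ebml[] = {0x1A, 0x45, 0xDF, 0xA3};
    static const unsigned char png[]  = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    static const unsigned char jpeg[] = {0xFF, 0xD8, 0xFF};
    static const unsigned char zip[]  = {'P', 'K', 0x03, 0x04};
    static const Signature table[] = {
        {"matroska", ebml, sizeof ebml},
        {"png", png, sizeof png},
        {"jpeg", jpeg, sizeof jpeg},
        {"zip", zip, sizeof zip},
    };
    for (const Signature& sig : table) {
        if (size >= sig.length && std::memcmp(data, sig.bytes, sig.length) == 0)
            return sig.name;
    }
    return "unknown";
}

// Reads up to 8 bytes from the start of the file in binary mode and classifies them.
std::string sniff_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    unsigned char head[8] = {};
    in.read(reinterpret_cast<char*>(head), sizeof head);
    return sniff_file_type(head, static_cast<std::size_t>(in.gcount()));
}
```

A signature match only suggests the container type; it says nothing yet about where a given pixel or field lives inside the file.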
====================
Edit:
Regarding "such as .jpg/.png/.bmp and see the value at certain index, to check what colour it is?":
here you need to understand the format of that file; based on that, you can tell what information the data at each position represents.

wifstream equivalent to _wfopen's "mode" parameter?

I'm having trouble opening a Unicode file in C++ using fstreams instead of the older FILE-based file handling functions. When opening a file using _wfopen, I can specify a mode to tell it what character encoding to use. E.g.:
_wfopen_s(&file, fileName, unicode ? L"r+, ccs=UTF-16LE" : L"r+" );
This works fine. When using wifstream though, I get both the byte-order mark at the beginning of the file, and the rest of the file appears in memory interlaced with 0x00. Clearly it's just reading in each character as a byte.
My question is: is there any equivalent to the 'mode' parameter above for use with fstreams? It's not terrible if there isn't, I just prefer the syntax of streams over FILEs.
Thanks!

You could try setting a conversion facet on the stream.
Check the files codecvt.h and codecvt.cpp as an example.
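The codecvt.h and codecvt.cpp files mentioned here are not available, so the following is only a sketch of the same idea using the standard library's own facet, std::codecvt_utf16 from the codecvt header (added in C++11 and deprecated since C++17, so a compiler may warn about it). The function name read_utf16le_lines is made up for the example.

```cpp
#include <codecvt>
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Reads a UTF-16 text file line by line through a conversion facet.
// Little-endian is assumed unless a byte-order mark says otherwise;
// the mark itself is consumed and does not appear in the output.
std::vector<std::wstring> read_utf16le_lines(const std::string& path) {
    using Utf16Facet = std::codecvt_utf16<
        wchar_t, 0x10FFFF,
        static_cast<std::codecvt_mode>(std::consume_header | std::little_endian)>;

    std::wifstream in;
    // Install the facet before opening; the locale takes ownership of it.
    in.imbue(std::locale(in.getloc(), new Utf16Facet));
    // Binary mode keeps the runtime from altering the raw bytes.
    in.open(path, std::ios::binary);

    std::vector<std::wstring> lines;
    std::wstring line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == L'\r') line.pop_back();  // CRLF files
        lines.push_back(line);
    }
    return lines;
}
```

This plays the role of the ccs=UTF-16LE part of the _wfopen mode string: the stream stays a wifstream, and the facet decides how bytes become wide characters.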

Read Unicode files C++

I have a simple question to ask. I have a UTF-16 text file to read which starts with FFFE. What are the C++ tools to deal with this kind of file? I just want to read it, filter some lines, and display the result.
It looks simple, but I only have experience working with plain ASCII files and I'm in a hurry. I'm using VS C++, but I don't want to work with managed C++.
Regards
Here is a very simple example:
wifstream file;
file.open("C:\\appLog.txt", ios::in);
const int bSize = 2048; // buffer size in wide characters
wchar_t buffer[bSize];
file.seekg(2);
file.getline(buffer, bSize-1);
wprintf(L"%s\n", buffer);
file.close();

You can use fgetws, which reads wide characters (16 bits each with Visual C++). Your file is in little-endian byte order. Since x86 machines are also little-endian, you should be able to handle the file without much trouble. When you want to do output, use fwprintf.
Also, I agree more information could be useful. For instance, you may be using a library that abstracts away some of this.

Since you are in a hurry, use ifstream in binary mode and do your job. I had the same problem as you and this saved my day. (It is not a recommended solution, of course; it's just a hack.)
ifstream file;
file.open("k:/test.txt", ifstream::in|ifstream::binary);
wchar_t buffer[2048] = {0}; // zero-filled, so the text stays null-terminated
file.seekg(2); // skip the 2-byte BOM
// assumes a 16-bit wchar_t (as in Visual C++); leaves room for the terminator
file.read((char*)buffer, sizeof(buffer) - sizeof(wchar_t));
wprintf(L"%s\n", buffer);
file.close();

For what it's worth, I think I've read that you have to use a Microsoft function which allows you to specify the encoding.
http://msdn.microsoft.com/en-us/library/z5hh6ee9(VS.80).aspx

The FFFE is just the initial BOM (byte order mark). Just read from the file like you normally do, but into a wide char buffer.
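Reading straight into a wide char buffer only works where wchar_t is 16 bits wide, as it is with Visual C++. A variant that does not depend on that is to read the file as bytes and assemble the 16-bit code units by hand. The sketch below assumes little-endian input with an optional FF FE mark at the start; the function names are hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Turns raw UTF-16LE bytes into 16-bit code units, skipping a leading BOM.
// A trailing odd byte, if any, is ignored.
std::u16string decode_utf16le(const std::string& bytes) {
    std::size_t i = 0;
    if (bytes.size() >= 2 &&
        static_cast<unsigned char>(bytes[0]) == 0xFF &&
        static_cast<unsigned char>(bytes[1]) == 0xFE) {
        i = 2;  // skip the byte-order mark
    }
    std::u16string out;
    out.reserve((bytes.size() - i) / 2);
    for (; i + 1 < bytes.size(); i += 2) {
        const unsigned lo = static_cast<unsigned char>(bytes[i]);      // low byte first
        const unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        out.push_back(static_cast<char16_t>(lo | (hi << 8)));
    }
    return out;
}

// Reads the whole file in binary mode and decodes it.
std::u16string read_utf16le_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::string bytes((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());
    return decode_utf16le(bytes);
}
```

The result is a sequence of UTF-16 code units, so characters outside the Basic Multilingual Plane still appear as surrogate pairs; splitting into lines and filtering can be done on the std::u16string directly.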
