Find and edit unicode HEX-strings in binary file - c++

I've got a binary file built from a VS2013 project that was compiled with Unicode. I now need to find a few data strings in this file and replace them using my own utility. My idea was to search for these strings in a hex editor, note their addresses, and then simply update the data using the WinAPI functions CreateFile/SetFilePointer/WriteFile. But there are two problems. First, I can't find these strings in the hex editor (because of Unicode), and second, I'm not sure how to update them, because Unicode chars are two bytes long.

So I used WinHex, searched for the text as Unicode, and found the strings I need. Then I noted the offset of each string and simply wrote the new data at that offset:
wchar_t data[MAX_PATH * 2];
DWORD dwWritten = 0;
m_sStr = m_sStr.Trim();
LONG offset = 0x00012F12; // offset found in WinHex
wcscpy_s(data, m_sStr);
SetFilePointer(hFile, offset, NULL, FILE_BEGIN);
// Write only the characters, not the terminating null; the new string must not be longer than the old one.
WriteFile(hFile, data, (DWORD)(wcslen(data) * sizeof(wchar_t)), &dwWritten, NULL);
//WriteFile(hFile, L"\00", 100, &dwWritten, NULL);
It looks like it's not as hard as I thought.
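For what it's worth, the search that WinHex performs can also be done in code. The following is only a sketch in portable C++, assuming the binary stores its strings as UTF-16LE (which is what a Unicode build on Windows produces). The helper names toUtf16LeBytes, findUtf16Le and patchUtf16Le are made up for illustration, and the file is modelled as an in-memory byte vector instead of being read with CreateFile/ReadFile.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encode a UTF-16 string as little-endian bytes: two bytes per code unit,
// low byte first. This is the byte pattern to look for in the binary.
std::vector<std::uint8_t> toUtf16LeBytes(const std::u16string& text) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(text.size() * 2);
    for (char16_t unit : text) {
        bytes.push_back(static_cast<std::uint8_t>(unit & 0xFF));
        bytes.push_back(static_cast<std::uint8_t>(unit >> 8));
    }
    return bytes;
}

// Hypothetical helper: byte offset of the first occurrence of `text` in
// `image`, or image.size() if it does not occur.
std::size_t findUtf16Le(const std::vector<std::uint8_t>& image,
                        const std::u16string& text) {
    const std::vector<std::uint8_t> needle = toUtf16LeBytes(text);
    auto it = std::search(image.begin(), image.end(),
                          needle.begin(), needle.end());
    return static_cast<std::size_t>(it - image.begin());
}

// Hypothetical helper: overwrite the string at `offset` in place. The
// replacement is zero-padded to the old length so nothing after it moves;
// a longer replacement is refused because it would clobber following data.
bool patchUtf16Le(std::vector<std::uint8_t>& image, std::size_t offset,
                  const std::u16string& oldText,
                  const std::u16string& replacement) {
    if (replacement.size() > oldText.size()) return false;
    const std::size_t oldBytes = oldText.size() * 2;
    if (offset > image.size() || oldBytes > image.size() - offset) return false;
    std::vector<std::uint8_t> bytes = toUtf16LeBytes(replacement);
    bytes.resize(oldBytes, 0);
    assert(bytes.size() == oldBytes);
    std::copy(bytes.begin(), bytes.end(),
              image.begin() + static_cast<std::ptrdiff_t>(offset));
    return true;
}
```

Padding with zeros keeps every following byte at its original offset, which matters because the rest of the executable refers to its data by position.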

Related

Issue with CreateFileA method

I am having an issue in my application when reading a file that also contains Unicode characters. I am using CreateFileA to open the file, but the Unicode characters do not come through properly, which is causing a lot of problems. Also, I don't know the difference between CreateFileA and CreateFileW.
I'm sorry that I can't share all of my code, but here is the relevant portion:
HANDLE systemFileHandle = INVALID_HANDLE_VALUE;
systemFileHandle = CreateFileA(Filename, GENERIC_READ, FILE_SHARE_READ, nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
char* inBuffer=new char[totalFileSize+2];
memset(inBuffer, 0, totalFileSize+2);
ReadFile(systemFileHandle, inBuffer, totalFileSize, &bytesRead, nullptr);
And I am getting results like this in the inBuffer array: Fernw�rmestationSW Au�en.
Can't I get the text the way it originally is?
Could you please help me out with this? It would be very helpful.
CreateFileA takes an ANSI file name, while CreateFileW takes a Unicode (UTF-16) file name. Neither says anything about the content of the file: both return a HANDLE to the file, from which you can then read or write Unicode content as needed.

Win32 API Visual C++ ReadFile() function generates gibberish if second parameter is LPWSTR [closed]

I was trying to make a very basic text editor with Win32 that has the ability to read files and change the text of an edit control to it. I want it to be able to handle chars in all languages, so I tried to use a LPWSTR for the second parameter of ReadFile(), like this:
HANDLE file = CreateFile(_T("D:\\C++ Stuff\\Testing.txt"), GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
DWORD fileSize = GetFileSize(file, NULL);
LPWSTR buffer = (LPWSTR)GlobalAlloc(GPTR, fileSize + 1);
DWORD read;
ReadFile(file, buffer, fileSize, &read, NULL);
MessageBox(NULL, buffer, NULL, NULL);
GlobalFree(buffer);
But the MessageBox shows up with a bunch of gibberish! If I run in debug mode and add a watch on buffer, it looks the same. It makes no difference whether the file being opened contains UTF-16 encoded chars or not. Is this normal? If yes, is there an alternative way to read the file into an LPWSTR? If no, how do I fix it?
I'm using Visual Studio 2015 for this project.
P.S. The code provided is only an example. In the actual code, I have checks for if CreateFile(), GetFileSize(), GlobalAlloc() and ReadFile() failed or not and null-termination of buffer.
If the text file is ASCII/UTF-8, reading its raw bytes into a wide-character buffer (LPWSTR) produces garbage, because each pair of bytes is interpreted as one UTF-16 code unit. For example, the characters ABCD (ASCII/UTF-8 encoded as the bytes 0x41, 0x42, 0x43, 0x44) are read on little-endian Windows as the two wide-character values 0x4241 and 0x4443.
Check whether your text file is ASCII/UTF-8 or wide character. Note that Windows tools often write a byte order mark (U+FEFF, stored as the bytes 0xFF 0xFE in little-endian UTF-16) at the start of a Unicode text file, so even if your text file is wide character, you may see an odd leading character unless you skip the mark.
If you need Unicode and cannot change your project to use narrow strings (LPSTR), you can either read into a byte array and then convert with the Win32 API function MultiByteToWideChar, or, if the file contains only ASCII, read each byte, cast it to wchar_t, and store it in your wide buffer:
for (int position = 0; position < filesize; position++)
    buffer[position] = (wchar_t)(unsigned char)byte_buffer[position];
or equivalent.
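To make the "check the encoding first" step concrete, here is a rough sketch in portable C++ that sniffs a byte order mark and decodes UTF-16LE bytes into code units. The names TextEncoding, detectBom and decodeUtf16Le are invented for this example, and a file without a mark is simply reported as Unknown (a UTF-32LE mark, which also starts with FF FE, is not handled). The same decoder shows what happens when plain ASCII bytes are treated as wide characters on a little-endian machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

enum class TextEncoding { Utf8Bom, Utf16Le, Utf16Be, Unknown };

// Look at the first bytes of the file for a byte order mark.
TextEncoding detectBom(const std::vector<std::uint8_t>& data) {
    if (data.size() >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return TextEncoding::Utf8Bom;
    if (data.size() >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return TextEncoding::Utf16Le;
    if (data.size() >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return TextEncoding::Utf16Be;
    return TextEncoding::Unknown;
}

// Combine byte pairs (low byte first) into UTF-16 code units, starting at
// `start`. Pass 2 to skip a UTF-16LE mark. A trailing odd byte is ignored.
std::u16string decodeUtf16Le(const std::vector<std::uint8_t>& data,
                             std::size_t start) {
    assert(start <= data.size());
    std::u16string out;
    for (std::size_t i = start; i + 1 < data.size(); i += 2)
        out.push_back(static_cast<char16_t>(data[i] | (data[i + 1] << 8)));
    return out;
}
```

Feeding the four ASCII bytes of "ABCD" to decodeUtf16Le yields the two code units 0x4241 and 0x4443, which is the kind of garbage the question describes.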

C++ Insert text at specific position in append with Windows API

I'm having trouble inserting text, in append mode, into an ordinary text file. What I want to do is simple: insert a single character in front of some lines. I know the exact offset of the beginning of each line. I have one restriction: I have to use the Windows API (CreateFile(), WriteFile(), SetFilePointer(), ...).
I can't insert text. Whatever I do, the program either writes to the end or, if it writes at the right offset, overwrites the existing text.
Here is my code (I just simplified some checks to be more readable here) :
HANDLE handleFile = CreateFile (filename,
                                FILE_APPEND_DATA,
                                FILE_SHARE_READ, //SHARE
                                NULL, //SecurityAttribute
                                OPEN_ALWAYS,
                                FILE_ATTRIBUTE_NORMAL,
                                NULL);
if (handleFile != INVALID_HANDLE_VALUE) {
    if (SetFilePointer (handleFile, 12345, NULL, FILE_BEGIN) != INVALID_SET_FILE_POINTER) {
        DWORD written = 0;
        WriteFile (handleFile, "$", 1, &written, NULL);
    }
}
When I use FILE_APPEND_DATA, SetFilePointer() has no effect and my character is written at the end.
When I use GENERIC_WRITE, or even FILE_GENERIC_WRITE, the character is written at the right offset, but it overwrites the existing character :'(
Which parameter do I need to really insert, please?
PS: this code is for very large files, so reading and writing the whole file is not possible; it would take too long.
Thanks a lot!
You cannot insert text into a file in the way you are attempting. You can append data to the end, or you can overwrite existing data. In order to effect an insertion you have to re-write all the contents that follow the point of insertion.
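As an illustration of what "re-write all the contents that follow" can look like without loading a very large file into memory, here is a sketch that shifts the tail in fixed-size chunks, starting from the end of the file. It uses std::fstream only to stay portable; with the Windows API the same loop would be built from SetFilePointer, ReadFile and WriteFile on a handle opened for both reading and writing. The function name insertIntoFile and the default chunk size are assumptions for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Insert `text` at byte `offset` of the file at `path`. The tail is moved
// toward the end chunk by chunk, last chunk first, so no byte is overwritten
// before it has been read. Not crash-safe: an interruption leaves the file
// half shifted.
bool insertIntoFile(const std::string& path, std::streamoff offset,
                    const std::string& text,
                    std::size_t chunkSize = 64 * 1024) {
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return false;

    file.seekg(0, std::ios::end);
    const std::streamoff size = file.tellg();
    if (offset < 0 || offset > size || chunkSize == 0) return false;

    const std::streamoff shift = static_cast<std::streamoff>(text.size());
    std::vector<char> chunk(chunkSize);
    std::streamoff end = size;  // exclusive end of the part still to move

    while (end > offset) {
        const std::streamoff count = std::min<std::streamoff>(
            end - offset, static_cast<std::streamoff>(chunkSize));
        const std::streamoff begin = end - count;
        file.seekg(begin);
        file.read(chunk.data(), count);
        file.seekp(begin + shift);
        file.write(chunk.data(), count);
        end = begin;
    }
    assert(end == offset);

    // The gap at `offset` now holds stale bytes; overwrite it with the text.
    file.seekp(offset);
    file.write(text.data(), shift);
    return static_cast<bool>(file);
}
```

Every byte after the insertion point is still read and written once, so this does not get around the cost the answer describes. If several characters have to be inserted, it is usually cheaper to stream the whole file once into a new file and add them on the way.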

WriteFile() Function for Win32 Applications

I am facing a problem with the WriteFile() function in a Win32 C++ application. The second argument asks for a pointer to the buffer that stores the data to write. My data is the text entered in text boxes. What syntax do I use to create a pointer to that text?
Here is a snippet of the code I am using:
case IDC_BUTTON_ONE:
{
HANDLE hFile = CreateFile("C:\\test.txt", GENERIC_READ,
0, NULL, CREATE_NEW, FILE_FLAG_OVERLAPPED, NULL);
}
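To write a control's text to a file you'll also need these lines:
char TextBuffer[256]; // ANSI text
DWORD SizeOut = 0;
// GetDlgItemTextA returns the number of characters copied, not counting the terminating null.
UINT Length = GetDlgItemTextA(hDlg, IDC_YOUR_CONTROL_ID, TextBuffer, ARRAYSIZE(TextBuffer));
// The handle must be opened with GENERIC_WRITE; lpOverlapped can be NULL unless the file was opened with FILE_FLAG_OVERLAPPED.
WriteFile(hFile, TextBuffer, Length, &SizeOut, lpOverlapped);
That just writes the plain ANSI bytes. If you want to use Unicode and TCHARs (instead of chars), you'll need to choose an encoding and write more than "just the bytes" from the text buffer.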
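As one possible reading of "choose your encoding", the sketch below turns UTF-16 text into UTF-8 bytes, which could then be passed to WriteFile as an ordinary byte buffer. It is written in portable C++ with std::u16string standing in for the wide text from the control; on Windows the API function WideCharToMultiByte with CP_UTF8 performs this conversion. The function name utf16ToUtf8 is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encode UTF-16 code units as UTF-8 bytes. Unpaired surrogates are replaced
// with U+FFFD (the replacement character).
std::string utf16ToUtf8(const std::u16string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char32_t cp = text[i];
        const bool isLead = cp >= 0xD800 && cp <= 0xDBFF;
        if (isLead && i + 1 < text.size() &&
            text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF) {
            // Combine a surrogate pair into one code point above U+FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (text[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The byte count to pass to WriteFile is then the size() of the returned string, not the number of characters in the original text.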

Called ReadFile on a text file, got weird (Japanese?) characters

I use the following code to read all of the elements from a file through the handle hFile, which works, using the size I got from GetFileSize(hFile, NULL).
_TCHAR* text = (_TCHAR*)malloc(sizeOfFile * sizeof(_TCHAR));
DWORD numRead = 0;
BOOL didntFail = ReadFile(hFile, text, sizeOfFile, &numRead, NULL);
After the operation, text contains something strange in Japanese or similar, not the content of the file.
What did I do wrong?
Edit:
I understand it is an encoding problem, but then how do I convert text to LPCWSTR so I can use things like WriteConsoleOutputCharacter?
Modern IDEs default to Unicode applications, meaning _TCHAR is actually wchar_t. ReadFile() works with simple bytes and if you use it to fill a _TCHAR array directly, you'll get 8-bit characters interpreted as UTF-16 Unicode. These usually show as CJK (Chinese/Japanese/Korean) glyphs.
You have three options:
convert your program to non-Unicode
use a file containing Unicode text (in UTF-16 encoding), or
read from the file into a char array and then use MultiByteToWideChar() to convert the text to Unicode.
If you mix Unicode and non-Unicode be careful to calculate the correct buffer sizes (number of bytes vs. number of characters).
Note that you can still use narrow chars with Windows in your Unicode program if you call the ANSI version of the Windows function (e.g. WriteConsoleOutputCharacterA).
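To illustrate the third option and the bytes-versus-characters point, here is a portable sketch of the conversion that MultiByteToWideChar() performs when called with CP_UTF8. It assumes the file really is UTF-8, and the name utf8ToUtf16 is invented for this example. It is deliberately lenient: malformed input becomes U+FFFD, and overlong encodings are not rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode UTF-8 bytes into UTF-16 code units.
std::u16string utf8ToUtf16(const std::string& bytes) {
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;   // continuation bytes expected
        char32_t cp = 0xFFFD;    // default for an invalid lead byte
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        ++i;
        for (std::size_t k = 0; k < extra; ++k) {
            const bool ok = i < bytes.size() &&
                (static_cast<unsigned char>(bytes[i]) & 0xC0) == 0x80;
            if (!ok) { cp = 0xFFFD; break; }  // truncated sequence
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
            ++i;
        }
        assert(i <= bytes.size());
        if (cp >= 0x10000 && cp <= 0x10FFFF) {
            // Code points above U+FFFF need a surrogate pair in UTF-16.
            cp -= 0x10000;
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        } else if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            out += u'\uFFFD';
        } else {
            out += static_cast<char16_t>(cp);
        }
    }
    return out;
}
```

For the word "Außen", the UTF-8 input is 6 bytes long while the UTF-16 result has 5 code units (10 bytes), so the two buffers have to be sized separately.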
You got the type of the string wrong. Text from a file that was encoded in an 8-bit encoding will look like Chinese when you look at it through a character type, like TCHAR with UNICODE defined, that uses a 16-bit encoding. Fix:
char* text = (char*)malloc(...);
You do normally have to fret a lot more about the encoding that was used to write the text. It could be utf-8 for example. You can convert from the 8-bit encoding to a TCHAR (wchar_t, really) with MultiByteToWideChar(). Its first argument is the one to fret about.
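You have read an ANSI or UTF-8 text file into a UTF-16 string. Read the bytes into a char buffer instead and convert them into a separate wide buffer:
char ReadBuff[1024];
wchar_t WideBuff[1024];
memset(ReadBuff, 0, sizeof(ReadBuff));
HANDLE hFile = CreateFile(szPathFileName, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
DWORD NumberOfBytesRead = 0;
ReadFile(hFile, ReadBuff, 600, &NumberOfBytesRead, NULL);
// In a Unicode build, %S treats the argument as a narrow (ANSI) string. Source and destination must not be the same buffer.
wsprintf(WideBuff, L"%S", ReadBuff);
WideBuff is now in readable form.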