Strange behaviour of memory mapped file, some observations and some questions - c++

Please look at this code below.
#include <windows.h>
void Write(char *pBuffer)
{
// pBuffer -= 4*sizeof(int);
for(int i = 0; i<20; i++)
*(pBuffer + sizeof(int)*i) = i+1;
}
void main()
{
HANDLE hFile = ::CreateFile("file", GENERIC_READ|GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
if(INVALID_HANDLE_VALUE == hFile)
{
::MessageBox(NULL, "", "Error", 0);
return;
}
HANDLE hMMF = ::CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, 32, NULL);
char *pBuffer = (char*)::MapViewOfFile(hMMF, FILE_MAP_WRITE, 0, 0, 0);
Write(pBuffer);
::FlushViewOfFile(pBuffer, 100);
::UnmapViewOfFile(pBuffer);
}
I have allocated only 32 bytes, yet when I write past the allocated size I don't get any error at all. Is this by design, or is it a bug in Windows? However, if you include the commented-out line, it fails, as expected.
I ask this because I am thinking of using this "feature" to my advantage. Can I? FYI, I have Windows XP version 2002 SP3, but I suspect this will be "fixed" in newer versions of Windows, which would break my code. Any useful link explaining the internals of this would really help.
Thanks

This isn't any different than writing past the end of a buffer that's allocated on the heap. The operating system can only slap your fingers if you write to virtual memory that isn't mapped. Mapping is page based, and one page is 4096 bytes. You'll have to write past this page to get the kaboom. Change your for-loop to end at (4096+4)/4 to repro it.

The virtual memory manager has to map memory by the page, so the extent will in effect be rounded up to the nearest 4kB (or whatever your system page size is).
I don't think it's documented whether writes into the same page as mapped data, but beyond the end of the mapping, will be committed back to the file. So don't rely on that behavior, it could easily change between Windows versions.
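The rounding described in these answers can be sketched in portable C++. This is only an illustration: RoundUpToPage and WriteStaysInMappedPages are hypothetical helper names, and the 4096-byte page is an assumption (on Windows the actual value is reported by GetSystemInfo in dwPageSize).

```cpp
#include <cassert>
#include <cstddef>

// Round `bytes` up to the next multiple of `pageSize`.
// Assumes pageSize is a non-zero power of two (for example 4096).
inline std::size_t RoundUpToPage(std::size_t bytes, std::size_t pageSize)
{
    assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0);
    return (bytes + pageSize - 1) & ~(pageSize - 1);
}

// True when a write of `length` bytes at `offset` from the view's base
// still lands inside the pages that back a mapping of `mappingSize` bytes.
// Such a write does not fault, even if it is past the requested size.
inline bool WriteStaysInMappedPages(std::size_t offset, std::size_t length,
                                    std::size_t mappingSize, std::size_t pageSize)
{
    return offset + length <= RoundUpToPage(mappingSize, pageSize);
}
```

With a 32-byte mapping and 4-byte ints, the question's last write lands at offset 76, which is still inside the first page, so nothing faults. A write at offset 4096 is the first one that leaves the mapped page.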


Reading Memory from Another Process in C++ | Copilot's Solution

As the title suggests, I am trying to read memory from another process in C++ in order to check whether the values from the other process reach a certain level. Since I don't know anything about this, I decided to consult GitHub Copilot for help. Normally I would search the docs, but GitHub seems to disagree. Since I have access to GitHub Copilot, and since the front page advertisement clearly encourages users to trust Copilot's programming ability, I chose to let Copilot write this function.
So I gave it a prompt in the form of a comment: //A function that can grab an address from the memory of another process and store it as a double value
What it gave me seemed pretty good, but I will never take a function that copilot makes and blindly use it unless I know for sure it will work (because I don't trust that everything Copilot makes is never going to cause issues, especially when dealing with pointers and such). I wanted to see if someone who had experience with memory in C++ could tell me if this function will work and why it would or wouldn't work as I know nothing about getting memory from another process.
There are three main reasons why I am not just searching the docs anyway despite GitHub's statement:
1) Since this is a complicated, real-world use case, it will really test Copilot's programming ability and give me insight into how much I can trust Copilot in the future for things I don't know how to do (obviously I wouldn't let this get out of hand, but it would be good to know I can trust Copilot a little more than I do right now).
2) Searching the docs anyway, despite the statement GitHub made on their website, is the opposite of what Copilot is supposed to help users with. While I understand that it's a public beta and not complete yet, it should at least be good enough for real-world use cases rather than only simple coding cases. An answer from someone experienced will show whether it is.
3) The docs only tell me what a function does and what to pass as its parameters; they don't tell me how to use it. To learn how to use it, I would have to search the web, which will most likely turn up complicated examples that don't pertain to my issue and divert me from what I am actually trying to accomplish. Not only that, but it is the opposite of what Copilot is supposed to help users with, as stated in reason #2.
Here is the code that Copilot generated for me:
DWORD GetAddress(DWORD dwProcessId, LPCWSTR szModuleName, const char* szProcName)
{
HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, dwProcessId);
if (hProcess == NULL)
return 0;
MODULEINFO modinfo;
GetModuleInformation(hProcess, GetModuleHandle(szModuleName), &modinfo, sizeof(MODULEINFO));
DWORD dwAddress = (DWORD)modinfo.lpBaseOfDll;
DWORD dwSize = (DWORD)modinfo.SizeOfImage;
MEMORY_BASIC_INFORMATION mbi;
while (dwSize > 0)
{
VirtualQueryEx(hProcess, (LPVOID)dwAddress, &mbi, sizeof(mbi));
if (mbi.State == MEM_COMMIT && !(mbi.Protect & PAGE_GUARD) && mbi.Protect & PAGE_EXECUTE_READWRITE)
{
DWORD dwOldProtect;
VirtualProtectEx(hProcess, (LPVOID)dwAddress, mbi.RegionSize, PAGE_EXECUTE_READWRITE, &dwOldProtect);
char* szBuffer = new char[mbi.RegionSize];
ReadProcessMemory(hProcess, (LPVOID)dwAddress, szBuffer, mbi.RegionSize, NULL);
for (DWORD dwIndex = 0; dwIndex < mbi.RegionSize - 4; dwIndex++)
{
if (szBuffer[dwIndex] == '\x55' && szBuffer[dwIndex + 1] == '\x8B' && szBuffer[dwIndex + 2] == 'E' && szBuffer[dwIndex + 3] == 'A')
{
DWORD dwAddress2 = dwAddress + dwIndex + 7;
DWORD dwAddress3 = dwAddress2 + *(DWORD*)(dwAddress2);
if (strcmp((char*)dwAddress3, szProcName) == 0)
{
delete[] szBuffer;
CloseHandle(hProcess);
return dwAddress2 + 4;
}
}
}
delete[] szBuffer;
VirtualProtectEx(hProcess, (LPVOID)dwAddress, mbi.RegionSize, dwOldProtect, &dwOldProtect);
}
dwAddress += mbi.RegionSize;
dwSize -= mbi.RegionSize;
}
CloseHandle(hProcess);
return 0;
}
You may point out an immediately noticeable error: the function returns DWORD rather than double, which is what I asked Copilot to return. I saw that error, but from the examples I have seen (yes, I have done at least some searching), returning DWORD works as well. I could have read those examples wrong, and if I have, please correct me.
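The function returns a DWORD, and that DWORD is an address, not the double stored at that address. To get the value you still have to read sizeof(double) bytes from that address in the other process (for example with ReadProcessMemory) and interpret them as a double; casting the DWORD itself to double only converts the address.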
You can't search in memory with anything other than a byte pointer on 64-bit systems: http://msdn.microsoft.com/en-us/library/aa746449%28v=vs.85%29.aspx
There are different ways to search for a string in memory, depending on what you are looking for: http://www.catatonicsoft.com/blog/need-to-read-and-write-strings-and-data-in-a-processs-memory/
(Read more here: https://github.com/MicrosoftArchiveOrgMember/copilot)
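To make the "store it as a double value" step concrete, here is a minimal sketch of turning raw bytes into a double. It assumes the bytes have already been copied out of the other process into a local buffer (for example by ReadProcessMemory, as in the generated code). DoubleFromBytes is a hypothetical helper name, not part of any API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Rebuild a double from the first sizeof(double) bytes of a local buffer.
// memcpy avoids the alignment and aliasing problems of casting the
// byte pointer to double* and dereferencing it.
inline double DoubleFromBytes(const unsigned char* bytes, std::size_t count)
{
    assert(bytes != nullptr && count >= sizeof(double));
    double value = 0.0;
    std::memcpy(&value, bytes, sizeof(value));
    return value;
}
```

This only reinterprets bytes that are already local. Both processes run on the same machine, so no byte-order conversion is attempted.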
Memory Scraper
This program uses several techniques to obtain information from processes and memory as it runs so that it can be added to the evidence file when Cofactor terminates the target process (by default) or when you terminate the program manually (with CTRL+C).
This code was mostly cobbled together from various examples at http://www.codeproject.com and https://forums.hak5.org, with some heavy modifications made to get the output in a useful format.
The process memory usage is checked continuously and added to the log file when it changes. This is done by getting a pointer to the process' memory region, then checking all of its pages as they are referenced. When they are changed, the contents of that page will be read and added to the log file as evidence. If a process goes in and out of sleep mode or is stopped for some other reason, this program will detect that and add it to the log file accordingly.
The current DLLs loaded by processes are recorded every 5 seconds so that if a DLL gets loaded after Cofactor terminates its target, it will still be included in the log file as evidence. A process is also checked every 5 seconds for new threads being spawned so that child processes are also included in our evidence files.
We could extend this program by having it check for two modules:
1) A module containing functions that correspond to debug breakpoints (which would detect whether a debugger was attached).
2) A module containing crash signatures - integers that would trigger an alert if they were found written to memory in any of our processes (like stack smashing protections might provide).
In order to do this without complicating things too much, I'd probably use CreateRemoteThread with an address within each module to continue execution from that thread into your own code where you can check for the breakpoint or crash signature and act accordingly.
Conclusion
If you need to debug a process and can't get it to stop for any reason, this program will still be able to grab the process memory at any time so that you can search for whatever you need.
You'll have to do some extra work in order to use the log file that is created, like parsing it with a parser of your choice and searching it (using regexes or something) which I assume is outside of the scope of what Copilot is designed to do.
If you end up using this program, please let me know! I'm curious to see how many people find this program useful.
Interesting Techniques I Learned From Other Programs
Finding DLLs Loaded by a Process
The C++ code below uses the Windows API GetModuleFileNameExA() to get the full path of each DLL loaded in a process and strrchr() to keep just the file name rather than the whole path. EnumProcessModules() reports each module handle once, so no separate duplicate check should be needed.
// Code Example: Finding DLLs Loaded by a Process
#include <windows.h>
#include <psapi.h>   // EnumProcessModules, GetModuleFileNameExA (link with Psapi.lib)
#include <stdio.h>
#include <string.h>
// Print the file name (without its directory) of every module loaded in the process with the given PID
void PrintLoadedModules(DWORD pid)
{
    HANDLE hProcess = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ, FALSE, pid);
    if (hProcess == NULL) { return; }
    HMODULE hMods[1024];
    DWORD cbNeeded = 0;
    if (EnumProcessModules(hProcess, hMods, sizeof(hMods), &cbNeeded)) {
        // cbNeeded can exceed the array size if the process has more than 1024 modules
        DWORD count = cbNeeded / sizeof(HMODULE);
        if (count > 1024) { count = 1024; }
        for (DWORD i = 0; i < count; i++) {
            char szModName[MAX_PATH];
            // The size argument is a character count, not a byte count
            if (!GetModuleFileNameExA(hProcess, hMods[i], szModName, MAX_PATH)) { continue; }
            // Keep only the part after the last path separator
            const char* name = strrchr(szModName, '\\');
            printf("%s\n", name != NULL ? name + 1 : szModName);
        }
    }
    CloseHandle(hProcess); // the original leaked this handle on every path
}
int main() {
    unsigned long pid = 0;
    // Read the PID from standard input; 0 or invalid input ends the program
    if (scanf("%lu", &pid) != 1 || pid == 0) { return 0; }
    PrintLoadedModules((DWORD)pid);
    return 0;
}

CreateFile2, WriteFile, and ReadFile: how can I enforce 16 byte alignment?

I'm creating and writing a file with CreateFile2 and WriteFile, then later using ReadFile to read 16 bytes at a time into an __m128i and performing SIMD operations on it. It works fine in debug mode, but fails with an access violation (0xC0000005) in release mode. In my experience, that happens when I'm trying to shove non-16-byte-aligned data into something that requires 16-byte alignment. However, I'm unsure where the lack of 16-byte alignment is first rearing its ugly head.
#define simd __m128i
Is it in the CreateFile2() call?
_CREATEFILE2_EXTENDED_PARAMETERS extend = { 0 };
extend.dwSize = sizeof(CREATEFILE2_EXTENDED_PARAMETERS);
extend.dwFileAttributes = FILE_ATTRIBUTE_NORMAL;
extend.dwFileFlags = /*FILE_FLAG_NO_BUFFERING |*/ FILE_FLAG_OVERLAPPED;
extend.dwSecurityQosFlags = SECURITY_ANONYMOUS;
extend.lpSecurityAttributes = nullptr;
extend.hTemplateFile = nullptr;
hMappedFile = CreateFile2(
testFileName.c_str(),
GENERIC_READ | GENERIC_WRITE,
0,
OPEN_ALWAYS,
&extend);
...in the WriteFile() call?
_OVERLAPPED positionalData;
positionalData.Offset = 0;
positionalData.OffsetHigh = 0;
positionalData.hEvent = 0;
bool writeCheck = WriteFile(
hMappedFile,
&buffer[0],
vSize,
NULL,
&positionalData);
...in the later ReadFile() call?
const simd* FileNodePointer(
_In_ const uint32_t index) const throw()
{
std::vector<simd> Node(8);
_OVERLAPPED positionalData;
positionalData.Offset = index;
positionalData.OffsetHigh = 0;
positionalData.hEvent = 0;
ReadFile(
hMappedFile,
(LPVOID)&Node[0],
128,
NULL,
&positionalData);
return reinterpret_cast<const simd*>(&Node[0]);
}
How can I enforce 16-byte-alignment here?
Thanks!
TL;DR You have a classic "use after free" error.
None of these functions require 16 byte alignment. If buffering is enabled, they don't care about alignment at all, and if direct I/O (FILE_FLAG_NO_BUFFERING) is enabled, they require the buffer to be aligned to the volume sector size, which is much more restrictive than 16 bytes (a page-aligned buffer satisfies it).
If your data buffer is unaligned, it's because you created it that way. The file I/O is not moving your buffer in memory.
But your access violation is not caused by alignment problems at all, it is the dangling pointer you return from FileNodePointer:
return reinterpret_cast<const simd*>(&Node[0]);
That's a pointer into content of a vector with automatic lifetime, the vector destructor runs during the function return process and frees the memory containing the data you just read from the file.
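One way to avoid the dangling pointer is to return the vector itself, so the caller owns the data. The sketch below is portable and makes several assumptions: Lane is a 16-byte-aligned stand-in for __m128i, ReadNode is a hypothetical replacement for FileNodePointer, and an in-memory byte array stands in for the file that ReadFile would read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for __m128i: 16 bytes with 16-byte alignment.
struct alignas(16) Lane { std::uint8_t bytes[16]; };

// Copy up to 8 lanes (128 bytes) starting at `offset` and return them BY VALUE.
// The returned vector owns its storage, so nothing dangles after the call.
inline std::vector<Lane> ReadNode(const std::uint8_t* file, std::size_t fileSize,
                                  std::size_t offset)
{
    assert(file != nullptr && offset <= fileSize);
    std::vector<Lane> node(8);                        // zero-initialised lanes
    const std::size_t wanted = node.size() * sizeof(Lane);
    const std::size_t available = fileSize - offset;
    std::memcpy(node.data(), file + offset, std::min(wanted, available));
    return node;                                      // moved out, not freed
}
```

Returning by value costs nothing extra here, because the vector is moved (or elided) rather than copied. The caller can then pass node.data() to the SIMD code for as long as it keeps the vector alive.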

C++ memory allocation for windows

So I'm reading Windows via C/C++, Fifth Edition, which was released before C++11 and so lacks some of the newer data types and methods, but was touted to be a great book on Windows.
I am just learning Windows development and C++, and when I posted questions about file operations with code samples from the book, I got feedback that allocating buffers with malloc is no longer good practice because the memory has to be freed manually. I should use vectors or strings instead.
That is ok. But what is the case with Windows's own data types? Here is a code sample from the book:
//initialization omitted
BOOL bResult = GetLogicalProcessorInformation(pBuffer, &dwSize);
if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
_tprintf(TEXT("Impossible to get processor information\n"));
return;
}
pBuffer = (PSYSTEM_LOGICAL_PROCESSOR_INFORMATION)malloc(dwSize);
bResult = GetLogicalProcessorInformation(pBuffer, &dwSize);
Is there a better solution for this type of query than using malloc to allocate the proper amount of memory?
Or is declaring a vector of SYSTEM_LOGICAL_PROCESSOR_INFORMATION structures the way to go?
The Win32 API is sometimes a pain to use, but you could always use the raw bytes in a std::vector<char> as a SYSTEM_LOGICAL_PROCESSOR_INFORMATION:
std::vector<char> buffer(sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));
// The API takes a PDWORD, so the size has to be a DWORD rather than a size_t
DWORD buffersize = (DWORD)buffer.size();
SYSTEM_LOGICAL_PROCESSOR_INFORMATION *ptr
    = (SYSTEM_LOGICAL_PROCESSOR_INFORMATION *)&(buffer[0]);
BOOL bResult = GetLogicalProcessorInformation(ptr, &buffersize);
// Only trust GetLastError() when the call actually failed
if (!bResult && GetLastError() == ERROR_INSUFFICIENT_BUFFER)
{
    // buffersize now holds the required size in bytes
    buffer.resize(buffersize);
    ptr = (SYSTEM_LOGICAL_PROCESSOR_INFORMATION *)&(buffer[0]);
    bResult = GetLogicalProcessorInformation(ptr, &buffersize);
}
Just be aware that the value of &(buffer[0]) may change after buffer.resize(...);
Other than that, I normally don't use the Win32 API, so any bugs in how it is called here you will have to fix yourself.
Take a look at the MSDN documentation and you will see that buffer should be "A pointer to a buffer that receives an array of SYSTEM_LOGICAL_PROCESSOR_INFORMATION structures. If the function fails, the contents of this buffer are undefined." So Zdeslav Vojkovic's answer will not work here (as Raymond Chen has pointed out). You could use std::vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> in this case and then just call 'resize' with dwSize / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION) as the argument. This would look something like:
using SLPI = SYSTEM_LOGICAL_PROCESSOR_INFORMATION;
std::vector<SLPI> slpi;
DWORD dwSize = 0;
if (!GetLogicalProcessorInformation(slpi.data(), &dwSize))
{
if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) { /* error handling */ }
// Not really necessary, but good to make sure
assert(dwSize % sizeof(SLPI) == 0);
slpi.resize(dwSize / sizeof(SLPI));
if (!GetLogicalProcessorInformation(slpi.data(), &dwSize)) { /* error handling */ }
}
Personally, I'd prefer to wrap the above into a function and just return slpi so you don't need to go through this entire shenanigans every time you wish to make a call to GetLogicalProcessorInformation.
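The wrapping suggested here could look roughly like the sketch below. It is written against a hypothetical callback rather than the real API so that it runs anywhere: QueryStatus and QueryRecords are invented names, and the callback stands in for GetLogicalProcessorInformation plus the GetLastError() check (Ok for success, InsufficientBuffer for ERROR_INSUFFICIENT_BUFFER).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical result codes of the query callback.
enum class QueryStatus { Ok, InsufficientBuffer, Failed };

// Run the "ask for the size, resize, ask again" pattern once and hand back
// a vector that owns the records, so callers never see malloc or free.
template <typename Record>
std::vector<Record> QueryRecords(
    const std::function<QueryStatus(Record*, std::uint32_t*)>& query)
{
    std::vector<Record> records;
    std::uint32_t bytes = 0;
    QueryStatus status = query(records.data(), &bytes);   // first call reports the size
    while (status == QueryStatus::InsufficientBuffer) {
        assert(bytes % sizeof(Record) == 0);
        records.resize(bytes / sizeof(Record));
        bytes = static_cast<std::uint32_t>(records.size() * sizeof(Record));
        status = query(records.data(), &bytes);
    }
    if (status != QueryStatus::Ok) {
        throw std::runtime_error("query failed");
    }
    records.resize(bytes / sizeof(Record));               // keep only what was written
    return records;
}
```

The loop, rather than a single retry, covers the case where the required size grows between the two calls. Throwing on failure is one design choice among several; returning an empty vector or an error code would work as well.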

FMOD Ex Memory Allocation Issue

I have a certain problem when using FmodEx. I've searched thoroughly over the net to see if someone had my same problem but I didn't find anything related to it.
I made a class that loads and plays my sounds, in this case, streams. Here is my code:
Audio::Audio()
{
//Create system object//
m_Result = FMOD::System_Create(&m_pSystem);
ErrorCheck(m_Result);
//Check FMOD version//
m_Result = m_pSystem->getVersion(&m_FmodVersion);
if(m_FmodVersion < FMOD_VERSION)
MessageBox(NULL, FMOD_ErrorString(m_Result), "FMOD Version Error", MB_OK);
//Check if hardware acceleration is disabled//
m_pSystem->getDriverCaps(0, &m_Caps, 0, &m_SpeakerMode);
if (m_Caps & FMOD_CAPS_HARDWARE_EMULATED)
MessageBox(NULL, FMOD_ErrorString(m_Result), "FMOD Acceleration Error", MB_OK);
//Initialize system object//
m_Result = m_pSystem->init(2, FMOD_INIT_NORMAL, 0);
ErrorCheck(m_Result);
m_pChannel = 0;
m_IsLoaded = false;
}
void Audio::LoadMusic(char *filename)
{
m_Result = m_pSystem->createStream(filename, FMOD_CREATESTREAM, 0, &m_pSound);
ErrorCheck(m_Result);
}
void Audio::Play()
{
SetPause(false);
m_Result = m_pSystem->playSound(FMOD_CHANNEL_FREE, m_pSound, false, &m_pChannel);
ErrorCheck(m_Result);
SetPause(true);
}
After this I just do:
pAudio->LoadMusic("test.mp3");
pAudio->Play();
The sound plays with no problem. The problem happens when loading the stream: the memory used keeps increasing all the time and won't stop. I'm guessing this happens because the small buffer used to read the mp3 stream is not being freed, so each time it takes the next available piece of free memory and the program's memory usage never stops growing.
I thought that maybe using the "release" method after each play would work, but then I noticed that release frees ALL the memory in the sound instance.
Could anyone give me some pointers on to what I'm doing wrong here? How do I prevent this?
I'm not sure if I have made it clear enough or not.
Thanks in advance for the help.
Each time you call pAudio->LoadMusic you will allocate (leak) more memory because you are creating a new FMOD::Sound instance (which as you indicate has its own stream buffer). If you simply want to play the sound again, just call pAudio->Play and the stream will restart.
If you are concerned about FMOD memory usage you can call Memory_GetStats to monitor it, just in case I have misunderstood your usage and something else is causing the leak.
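The advice above amounts to "load once, and release the old sound before creating a new one". Here is a minimal sketch of that ownership pattern in portable C++. FakeSound is a hypothetical stand-in for FMOD::Sound (its destructor plays the role of the release call mentioned in the question), and MusicPlayer is an invented name, not part of FMOD.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for an FMOD::Sound. It counts live instances
// so that a leak would show up as a growing count.
struct FakeSound {
    static int& LiveCount() { static int count = 0; return count; }
    explicit FakeSound(const std::string& file) : filename(file) { ++LiveCount(); }
    FakeSound(const FakeSound&) = delete;
    FakeSound& operator=(const FakeSound&) = delete;
    ~FakeSound() { --LiveCount(); }
    std::string filename;
};

class MusicPlayer {
public:
    // Create a sound only when a different file is requested, and
    // destroy the previous one first so at most one is ever alive.
    void LoadMusic(const std::string& filename)
    {
        if (m_sound && m_sound->filename == filename) { return; }  // already loaded
        m_sound.reset();                          // release the old sound, if any
        m_sound.reset(new FakeSound(filename));
    }
    bool IsLoaded() const { return m_sound != nullptr; }
private:
    std::unique_ptr<FakeSound> m_sound;
};
```

Calling LoadMusic repeatedly with the same file then costs nothing, and switching files never leaves more than one sound alive.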

Reading comport with QT through winapi

I'm trying to get a list of comports that are currently in use to present them in my GUI.
I'm using the following code:
TCHAR szComPort[8];
HANDLE hCom = NULL;
char str[20];
for (int i = 1; i <= 255; ++i)
{
if (i < 10)
wsprintf(szComPort, ("COM%d"), i);
else
wsprintf(szComPort, ("\\\\.\\COM%d"), i);
hCom = CreateFile(szComPort,
GENERIC_READ|GENERIC_WRITE,
0,
NULL,
OPEN_EXISTING,
0,
NULL);
if (INVALID_HANDLE_VALUE != hCom)
{
sprintf_s(str,"COM%d",i);
ui->COMLIST->addItem(str);
}
CloseHandle(hCom);
}
This works fine on my laptop, but for some reason it crashes Qt on my PC for COM ports 10 and higher (meaning that if I change i<=255 to i<=9 it works fine).
Any ideas?
Thank you!
You have 8 characters in the szComPort buffer, but you are writing 10 characters (including the terminator) for COM10 and above and 11 characters for COM100 and above. Make the buffer at least 11 units.
Edit: The usual practice is to make the buffer simply large enough, with enough slop that you don't have to count characters. I'd probably just look at the string, estimate that it has about 10 characters, conclude that even with the formatted value it won't reach 30, and declare the buffer with 32 items. The stack has enough room for these few extra bytes and you are not even initializing it, so there is no performance penalty and it's less risk.
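As a sketch of the "large enough with slop" approach, the helper below formats the device path into a 32-character buffer with std::snprintf, which never writes past the size it is given. BuildComPath is a hypothetical name, and the sketch uses narrow characters only. It applies the \\.\ prefix to every port, since that form is also accepted for COM1 to COM9.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Build the device path for a COM port, e.g. \\.\COM10 for port 10.
// The longest result (\\.\COM255) needs 11 chars including the terminator,
// so a 32-char buffer leaves plenty of slop.
inline std::string BuildComPath(int port)
{
    assert(port >= 1 && port <= 255);
    char path[32];
    const int written = std::snprintf(path, sizeof(path), "\\\\.\\COM%d", port);
    // snprintf returns the length it wanted to write; check it actually fit
    assert(written > 0 && written < static_cast<int>(sizeof(path)));
    return std::string(path, static_cast<std::size_t>(written));
}
```

The returned string can be passed to CreateFileA through c_str(). Using one format for all ports also removes the if/else on the port number.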