In uPyCraft IDE or PuTTY, sending km.press('a') works fine,
but when I send km.press('a') from my C++ code with WriteFile, it doesn't work.
I can't find what is wrong.
`bool CSerialPort::OpenPort(CString portname)
{
m_hComm = CreateFile(L"//./" + portname,
GENERIC_READ | GENERIC_WRITE,
0,
0,
OPEN_EXISTING,
0,
0);
if (m_hComm == INVALID_HANDLE_VALUE)
{
std::cout << "INVALID HANDLE" << std::endl;
return false;
}
else
return true;
}
bool CSerialPort::WriteByte(const char * bybyte)
{
byte iBytesWritten = 0;
if (WriteFile(m_hComm, &bybyte, 1, &m_iBytesWritten, NULL) == 0)
return false;
else
return true;
}
int main()
{
CSerialPort _serial;
_serial.OpenPort(L"COM4");
_serial.WriteByte("km.press('a')");
}`
I tried this,
but it doesn't work. I also checked that _serial is not an INVALID HANDLE.
Can someone help me send "km.press('a')" to the serial port?
Also, sending km.move(0,1) from PuTTY and uPyCraft
works fine, but
string test = "km.move(0,1)";
DWORD dwBytesWritten;
WriteFile(m_hComm,&test,sizeof(test),dwBytesWritten,NULL);
doesn't work. If I just change km.move(0,1) to km.move(0,10), it works fine, and I don't know why.
What is the difference between uPyCraft (or PuTTY) and C++?
By the looks of it, I'm assuming your class definition looks something like this:
class CSerialPort {
public:
bool OpenPort(CString portname);
bool WriteByte(const char* bybyte);
private:
HANDLE m_hComm;
byte m_iBytesWritten;
};
byte is not the proper type. DWORD is.
CString may be used, but you are using wide string literals anyway so you could just use CreateFileW, std::wstrings and std::wstring_views.
WriteByte implies that you only want to write one byte - and indeed, your implementation does only write one byte - but it's the wrong byte. It writes one byte out of the memory of the bybyte variable, not the memory it points at.
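To make the pointer-versus-pointee point concrete, here is a minimal portable sketch. It assumes a hypothetical stand-in for WriteFile named fake_write, which just appends to a std::string. The names send_command and send_command_buggy are illustrative and are not part of the original class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for WriteFile: copies `count` bytes from `data`
// into `sink` so we can inspect what would have gone out on the wire.
inline void fake_write(std::string& sink, const void* data, std::size_t count) {
    sink.append(static_cast<const char*>(data), count);
}

// Correct: pass the pointer itself (the address of the characters)
// together with the full length of the command.
inline std::string send_command(const char* command) {
    assert(command != nullptr);
    std::string sink;
    fake_write(sink, command, std::strlen(command));
    return sink;
}

// Buggy, as in the question: `&command` is the address of the pointer
// variable, so the single byte written is one byte of a memory address.
inline std::string send_command_buggy(const char* command) {
    std::string sink;
    fake_write(sink, &command, 1);
    return sink;
}
```

With this stand-in, send_command("km.press('a')") yields the whole command, while the buggy variant yields exactly one byte whose value depends on where the pointer variable happens to point.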
A minor redefinition of the class:
#include <string_view> // added header
class CSerialPort {
public:
// take a `std::wstring` instead
bool OpenPort(const std::wstring& portname);
// WriteBytes instead of WriteByte:
bool WriteBytes(const void* bytesPtr, DWORD bytesToWrite);
// write both wide and non-wide string_views
bool WriteString(std::string_view str);
bool WriteString(std::wstring_view str);
private:
HANDLE m_hComm;
DWORD m_iBytesWritten; // the proper type
};
The implementation in the .cpp file then becomes:
bool CSerialPort::OpenPort(const std::wstring& portname) {
// Use CreateFileW since you've hardcoded wide string literals anyway:
m_hComm = CreateFileW((L"//./" + portname).c_str(),
GENERIC_READ | GENERIC_WRITE,
0,
0,
OPEN_EXISTING,
0,
0);
return m_hComm != INVALID_HANDLE_VALUE;
}
bool CSerialPort::WriteBytes(const void* bytesPtr, DWORD bytesToWrite)
{
return
WriteFile(m_hComm, bytesPtr, bytesToWrite, &m_iBytesWritten, nullptr) != 0;
}
// the WriteString overloads taking string_views pass on the pointer
// and length to `WriteBytes`:
bool CSerialPort::WriteString(std::string_view str) {
return WriteBytes(str.data(), static_cast<DWORD>(str.size()));
}
bool CSerialPort::WriteString(std::wstring_view str) {
// wchar_t is wider than 1 byte, so convert the character count to a byte count:
return WriteBytes(str.data(),
static_cast<DWORD>(str.size() * sizeof(std::wstring_view::value_type)));
}
And your main would then use the WriteString overload taking a std::string_view (by passing a const char* to WriteString):
int main()
{
CSerialPort _serial;
if(_serial.OpenPort(L"COM4")) {
_serial.WriteString("km.press('a')");
} else {
std::cerr << "failed opening COM4\n";
}
}
Note: The section you added at the end has several errors:
string test = "km.move(0,1)";
DWORD dwBytesWritten;
WriteFile(m_hComm,&test,sizeof(test),dwBytesWritten,NULL);
&test takes the address of the std::string object. You should use test.c_str() to get a const char* to the first character in the string.
sizeof(test) gets the size of the std::string object, not the length of the actual string. You should use test.size() instead.
dwBytesWritten is passed by value but the function expects a pointer to a DWORD that it can write to. You should use &dwBytesWritten instead.
WriteFile(m_hComm, test.c_str(), test.size(), &dwBytesWritten, NULL);
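The difference between sizeof(test) and test.size() can be checked without a serial port. The sketch below is an illustration under assumptions: capture is a hypothetical stand-in for WriteFile that records the bytes it receives, and send and sizeof_ignores_length are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for WriteFile that records the bytes it is given.
inline std::vector<char> capture(const void* data, std::size_t count) {
    const char* p = static_cast<const char*>(data);
    return std::vector<char>(p, p + count);
}

// c_str() points at the characters; size() is how many there are.
inline std::vector<char> send(const std::string& text) {
    return capture(text.c_str(), text.size());
}

// sizeof measures the std::string object itself, so it is the same for
// every string regardless of how many characters it holds.
inline bool sizeof_ignores_length(const std::string& a, const std::string& b) {
    return sizeof(a) == sizeof(b);
}
```

Because sizeof(test) is a fixed, implementation-defined number, passing it as the byte count sends part of the string object's internal representation rather than the command text.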
This is working for me...
std::string GetProgramDataPath() {
CHAR path[MAX_PATH];
HRESULT hr = SHGetFolderPathA(nullptr, CSIDL_COMMON_APPDATA, nullptr, 0, path); // path accepted as LPSTR parameter?
if (SUCCEEDED(hr)) {
return std::string(path); // then automatically cast to const char*?
}
else {
return std::string();
}
}
...but I don't know why. I try to pass LPSTR, but I get:
Error C4700 "uninitialized local variable 'path' used"
I look up how to initialize LPSTR and come up with this:
std::string GetProgramDataPath() {
LPSTR path = new CHAR[MAX_PATH];
HRESULT hr = SHGetFolderPathA(nullptr, CSIDL_COMMON_APPDATA, nullptr, 0, path);
if (SUCCEEDED(hr)) {
std::string strPath(path);
delete[] path;
return std::string(strPath);
}
else {
delete[] path;
return std::string();
}
}
Is this the 'correct' code? With new and delete it seems wrong. Am I doing something unsafe by just using CHAR[]? How come it works instead of LPSTR? I believe it has something to do with the "equivalence of pointers and arrays" in C, but it seems there are some automatic conversions from CHAR[] to LPSTR to const char * in this code I don't understand.
Instead of managing the memory yourself with new and delete, I'd use a std::string and let it manage the memory.
static std::string GetProgramDataPath()
{
std::string buffer(MAX_PATH, '\0');
const HRESULT result = SHGetFolderPathA
(
nullptr,
CSIDL_COMMON_APPDATA,
nullptr,
0,
buffer.data()
);
if (SUCCEEDED(result))
{
// Cut off the trailing null terminating characters.
// Doing this will allow you to append to the string
// in the position that you'd expect.
if (const auto pos{ buffer.find_first_of('\0') }; pos != std::string::npos)
buffer.resize(pos);
// Here is how you can append to the string further.
buffer.append(R"(\Some\Other\Directory)");
return buffer;
}
buffer.clear();
return buffer;
}
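The "use a std::string as the output buffer, then trim at the first NUL" pattern can be tried on any platform. In this sketch, fake_get_folder is a hypothetical stand-in for a C API such as SHGetFolderPathA, and the constant 260 is assumed to mirror MAX_PATH.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API that writes a NUL-terminated string
// into a caller-supplied buffer of at least 260 characters.
inline bool fake_get_folder(char* out) {
    std::strcpy(out, "C:\\ProgramData");
    return true;
}

inline std::string get_folder_with_suffix(const std::string& suffix) {
    std::string buffer(260, '\0');       // 260 mirrors MAX_PATH (assumption)
    assert(!buffer.empty());
    if (!fake_get_folder(&buffer[0]))    // &buffer[0] also works before C++17
        return std::string();
    // Trim the unused tail so that appending lands right after the path.
    const std::size_t pos = buffer.find('\0');
    if (pos != std::string::npos)
        buffer.resize(pos);
    buffer += suffix;
    return buffer;
}
```

Without the resize, the appended text would sit after all 260 characters, most of them NULs, and anything that treats the result as a C string would stop at the first one.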
Here is one way you could do it using std::filesystem::path and SHGetKnownFolderPath.
namespace fs = std::filesystem;
static fs::path GetProgramDataPath()
{
struct buffer {
wchar_t* data{ nullptr };
~buffer() { CoTaskMemFree(data); }
} buf{};
const HRESULT result = SHGetKnownFolderPath
(
FOLDERID_ProgramData,
0,
nullptr,
&buf.data
);
return SUCCEEDED(result)
? fs::path{ buf.data }
: fs::path{};
}
int main()
{
fs::path path{ GetProgramDataPath() };
if (!path.empty())
{
// Here is one way you can append to a path.
// You can also use the append member function as well.
path /= R"(Some\Other\Directory)";
// When you're ready you can call either the generic_string or
// string member function on the path.
const std::string s1{ path.string() };
const std::string s2{ path.generic_string() };
// Prints: 'C:\ProgramData\Some\Other\Directory'.
std::cout << s1 << '\n';
// Prints: 'C:/ProgramData/Some/Other/Directory'.
std::cout << s2 << '\n';
}
}
This is working for me...but I don't know why.
LPSTR is just an alias for CHAR* (aka char*):
typedef CHAR *LPSTR;
In certain contexts, a fixed-sized CHAR[] (aka char[]) array will decay into a CHAR* (aka char*) pointer to its 1st element, such as when passing the array by value in a function parameter, as you are doing.
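The decay can be verified at compile time. The following is a small sketch: LPSTR_like is a local alias assumed to mirror the typedef above, and fill is a hypothetical function standing in for an API that takes an LPSTR.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

using LPSTR_like = char*;   // hypothetical alias mirroring `typedef CHAR *LPSTR;`

// A char array decays to a pointer to its first element.
static_assert(std::is_same<std::decay<char[260]>::type, LPSTR_like>::value,
              "char[260] decays to char*");

// Takes a pointer, like the output parameter of SHGetFolderPathA.
inline void fill(LPSTR_like out) {
    assert(out != nullptr);
    std::strcpy(out, "ok");
}

inline bool array_argument_decays() {
    char path[260];   // real storage, so the callee has somewhere to write
    fill(path);       // `path` decays to &path[0] at this call
    return std::strcmp(path, "ok") == 0;
}
```

The array version works because it supplies both things the callee needs: a pointer and 260 bytes of storage behind it. A bare pointer variable supplies only the former.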
I try to pass LPSTR, but I get Error C4700 "uninitialized local variable 'path' used".
Because LPSTR is just a pointer, and you likely did not point it at anything meaningful.
Is this the 'correct' code?
Technically yes, that will work (though return std::string(strPath) should be return strPath instead). However, you should consider using std::string or std::vector<char> to manage the memory for you rather than calling new[]/delete[] directly, e.g.:
std::string GetProgramDataPath() {
std::vector<char> path(MAX_PATH);
HRESULT hr = SHGetFolderPathA(nullptr, CSIDL_COMMON_APPDATA, nullptr, 0, path.data());
if (SUCCEEDED(hr)) {
return std::string(path.data());
}
return std::string();
}
Am I doing something unsafe by just using CHAR[]?
No.
How come it works instead of LPSTR?
Because CHAR[] decays into the same type that LPSTR is an alias of.
it seems there are some automatic conversions from CHAR[] to LPSTR to const char * in this code.
Correct.
I have a custom stream implementation that uses Win32 API functions like ::CreateFile2, ::ReadFile, ::WriteFile. Also the stream implements Flush and Truncate functions that are not supported by std::fstream (its flush() flushes its internal buffer, but not the operating system buffer):
class WinStream : public IoStream
{
public:
size_t Read(uint8_t* buffer, size_t count) override
{
const DWORD nNumberOfBytesToRead = static_cast<DWORD>(count);
assert(nNumberOfBytesToRead == count);
DWORD NumberOfBytesRead = 0;
Check(::ReadFile(m_hFile, buffer, nNumberOfBytesToRead, &NumberOfBytesRead, NULL) != FALSE);
return NumberOfBytesRead;
}
void Write(const uint8_t* buffer, size_t count) override
{
const DWORD nNumberOfBytesToWrite = static_cast<DWORD>(count);
assert(nNumberOfBytesToWrite == count);
DWORD NumberOfBytesWritten = 0;
if (::WriteFile(m_hFile, buffer, nNumberOfBytesToWrite, &NumberOfBytesWritten, NULL) == FALSE)
{
throw IoError(format()
<< _T("::WriteFile failed. This may indicate that the disk is full. Win32 Error: ")
<< ::GetLastError());
}
if (nNumberOfBytesToWrite != NumberOfBytesWritten)
{
throw IoError(format() << _T("Requested ") << nNumberOfBytesToWrite
<< _T(" bytes, but actually written ") << NumberOfBytesWritten << _T("."));
}
}
bool End() override
{
return GetFileSizeHelper() == GetFilePointerHelper();
}
void Seek(std::size_t pos, bool begin = true) override
{
LARGE_INTEGER li;
li.QuadPart = pos;
Check(::SetFilePointerEx(m_hFile, li, NULL, begin ? FILE_BEGIN : FILE_END) != FALSE);
}
void Move(std::ptrdiff_t offset) override
{
LARGE_INTEGER li;
li.QuadPart = offset;
Check(::SetFilePointerEx(m_hFile, li, NULL, FILE_CURRENT) != FALSE);
}
void Flush() override
{
Check(::FlushFileBuffers(m_hFile) != FALSE);
}
void Truncate() override
{
Check(::SetEndOfFile(m_hFile) != FALSE);
}
private:
void Check(bool success)
{
if (!success)
{
throw Win32Exception();
}
}
LONGLONG GetFileSizeHelper()
{
LARGE_INTEGER li;
li.QuadPart = 0;
Check(::GetFileSizeEx(m_hFile, &li) != FALSE);
return li.QuadPart;
}
LONGLONG GetFilePointerHelper()
{
LARGE_INTEGER liOfs = { 0 };
LARGE_INTEGER liNew = { 0 };
Check(::SetFilePointerEx(m_hFile, liOfs, &liNew, FILE_CURRENT) != FALSE);
return liNew.QuadPart;
}
FileHandle m_hFile;
};
inline UniqueFileHandle CreateUniqueFile(const String& file_name)
{
HANDLE hFile = ::CreateFile2(
file_name.c_str(),
GENERIC_READ | GENERIC_WRITE,
0, //FILE_SHARE_READ | FILE_SHARE_WRITE,
OPEN_ALWAYS,
NULL //&extendedParams
);
if (hFile == INVALID_HANDLE_VALUE)
{
DWORD dw_err = ::GetLastError();
throw IoError(format() << _T("Cannot open file '") << file_name << _T("' for updating, error = ") << dw_err);
}
return hFile;
}
What is the right (or modern) way to migrate this code to Linux? And what about Android, MacOS and iOS?
It should use non-buffered read/write functions.
As Homer512 said, you should use the POSIX file functions. Since you need unbuffered I/O, that means open, read, write, lseek, fsync, ftruncate, and close rather than the buffered C stdio functions fopen, fwrite, fread, and fclose. You can find a reference for these functions here.
My personal preference for making the code work on many systems is to use preprocessor directives, with separate code for each OS. A full list of OS-detection macros can be found here.
To separate the code for operating systems, you can do something like this:
#if defined(_WIN32)
//Windows code here
#elif defined(__APPLE__)
//macOS code here
#elif defined(__linux__)
//Linux code here
#endif
You can then declare a common set of functions, such as opening a file and writing to a file, and define them for each system in its own way. This also extends easily to other operating systems.
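As a rough sketch of what the non-Windows branch might look like, the class below wraps unbuffered POSIX descriptor calls. PosixStream is a hypothetical name, it covers only part of the WinStream interface from the question, and it does not derive from IoStream so that it stays self-contained. The mapping of fsync to FlushFileBuffers and of ftruncate at the current offset to SetEndOfFile is my reading of the closest equivalents, not something the question confirms.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>
#include <system_error>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

class PosixStream {   // hypothetical name, partial interface
public:
    explicit PosixStream(const std::string& file_name)
        : fd_(::open(file_name.c_str(), O_RDWR | O_CREAT, 0644)) {  // like OPEN_ALWAYS
        if (fd_ == -1) fail("open");
    }
    ~PosixStream() { if (fd_ != -1) ::close(fd_); }
    PosixStream(const PosixStream&) = delete;
    PosixStream& operator=(const PosixStream&) = delete;

    std::size_t Read(std::uint8_t* buffer, std::size_t count) {
        const ssize_t n = ::read(fd_, buffer, count);
        if (n == -1) fail("read");
        return static_cast<std::size_t>(n);
    }
    void Write(const std::uint8_t* buffer, std::size_t count) {
        while (count > 0) {   // write() may accept fewer bytes than requested
            const ssize_t n = ::write(fd_, buffer, count);
            if (n == -1) {
                if (errno == EINTR) continue;   // interrupted, retry
                fail("write");
            }
            buffer += n;
            count -= static_cast<std::size_t>(n);
        }
    }
    void Seek(std::size_t pos) {
        if (::lseek(fd_, static_cast<off_t>(pos), SEEK_SET) == -1) fail("lseek");
    }
    void Flush() {            // asks the OS to push its buffers to the device
        if (::fsync(fd_) == -1) fail("fsync");
    }
    void Truncate() {         // cut the file at the current offset
        const off_t here = ::lseek(fd_, 0, SEEK_CUR);
        if (here == -1 || ::ftruncate(fd_, here) == -1) fail("ftruncate");
    }
private:
    [[noreturn]] static void fail(const char* what) {
        throw std::system_error(errno, std::generic_category(), what);
    }
    int fd_;
};
```

These calls are specified by POSIX, so the same source should build on Linux and macOS, and I would expect it to work on iOS and on Android through the NDK as well, though I have not verified every platform.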
I have enumerated a process's modules and have a MODULEINFO. From that I have a base address, size of the module, and the entrypoint. If I have a separate process with an integer int x = 4 defined in main(), can I scan for that integer's address using what I have with MODULEINFO? Wouldn't x exist on the stack, which is separate from the module exe?
I tried making a loop with the base address and SizeOfImage member, casting the base address to a byte*, and then adding 1 byte and then casting it to an int* to search for a specific value, however every value I got back was 0. I believe my method was (grossly) incorrect.
If it is possible to scan an int value can anyone point me in the general direction to do so?
Yes--local variables (non-static ones, anyway) are allocated on the stack. To see their values, you'll need to write something on the order of a debugger, such as pausing the program while it's running (and the function containing the variable of interest is active), and walk the stack to find the value.
Since you're apparently using Windows, functions you'll probably want to look at include:
WaitForDebugEvent (or WaitForDebugEventEx)
ContinueDebugEvent
StackWalk64
You'll probably also want to look at the DbgHelp API, probably starting with these:
SymInitialize
SymFromName
SymCleanup
There's a lot more there to consider, but that's probably enough to at least get a bit of a start. I previously posted an answer that demonstrates StackWalk64, and some of the Sym* stuff.
Here's some code with the bare skeleton of a debugger that will spawn a child process, and then log the debug events it produces:
#include <windows.h>
#include <stdio.h>
#include "child_process.h"
void dispatch_child_event(DEBUG_EVENT const &event, child_process const &child) {
char *file_name;
char buffer[512];
switch ( event.dwDebugEventCode ) {
case LOAD_DLL_DEBUG_EVENT:
file_name = child.get_string(event.u.LoadDll.lpImageName);
if ( event.u.LoadDll.fUnicode)
printf("Loading %S\n", (wchar_t *)file_name);
else
printf("Loading %s\n", file_name);
break;
case EXCEPTION_DEBUG_EVENT:
switch (event.u.Exception.ExceptionRecord.ExceptionCode)
{
case EXCEPTION_ACCESS_VIOLATION:
{
if ( event.u.Exception.dwFirstChance)
break;
EXCEPTION_RECORD const &r = event.u.Exception.ExceptionRecord;
printf("Access Violation %x at %0#p\n",
r.ExceptionCode,
r.ExceptionAddress);
break;
}
case EXCEPTION_BREAKPOINT:
printf("Breakpoint reached\n");
break;
case EXCEPTION_DATATYPE_MISALIGNMENT:
if ( !event.u.Exception.dwFirstChance)
printf("Misaligned data exception.\n");
break;
case EXCEPTION_SINGLE_STEP:
printf("Single Step...\n");
break;
case DBG_CONTROL_C:
if ( !event.u.Exception.dwFirstChance)
printf("Control+C pressed\n");
break;
}
break;
case CREATE_THREAD_DEBUG_EVENT:
printf("Client created a thread\n");
break;
case CREATE_PROCESS_DEBUG_EVENT:
printf("Create-Process\n");
break;
case EXIT_THREAD_DEBUG_EVENT:
printf("Thread exited.\n");
break;
case UNLOAD_DLL_DEBUG_EVENT:
printf("DLL being unloaded\n");
break;
case OUTPUT_DEBUG_STRING_EVENT: {
OUTPUT_DEBUG_STRING_INFO const &d = event.u.DebugString;
char *string = child.get_debug_string(d.lpDebugStringData,
d.nDebugStringLength);
if ( d.fUnicode)
printf("Debug string: %S\n", string);
else
printf("Debug string: %s\n", string);
break;
}
}
}
int main(int argc, char **argv) {
DEBUG_EVENT event;
if ( argc < 2 ) {
fprintf(stderr, "Usage: Trace [executable|PID]");
return EXIT_FAILURE;
}
child_process child(argv[1]);
do {
WaitForDebugEvent(&event, INFINITE);
dispatch_child_event(event, child);
ContinueDebugEvent( event.dwProcessId,
event.dwThreadId,
DBG_CONTINUE );
} while ( event.dwDebugEventCode != EXIT_PROCESS_DEBUG_EVENT);
return 0;
}
That uses the following child_process header:
#ifndef CHILD_PROCESS_H_INC_
#define CHILD_PROCESS_H_INC_
#include <windows.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <io.h>
#include "syserror.h"
struct no_spawn {
no_spawn() { system_error("Spawning Program"); }
};
class child_process {
HANDLE process_;
HANDLE thread_;
mutable char buffer[FILENAME_MAX];
public:
child_process(char const *filename);
char *get_string(void *string_name, DWORD num = 0) const;
char *get_debug_string(void *string, DWORD num) const;
HANDLE process() { return process_; }
HANDLE thread() { return thread_; }
~child_process() { CloseHandle(process()); }
};
#endif
The implementation of that class is as follows:
#include "child_process.h"
static BOOL find_image(char const *name, char *buffer) {
// Try to find an image file named by the user.
// First search for the exact file name in the current
// directory. If that's not found, look for same base name
// with ".com", ".exe" and ".bat" appended, in that order.
// If we can't find it in the current directory, repeat
// the entire process on directories specified in the
// PATH environment variable.
//
#define elements(array) (sizeof(array)/sizeof(array[0]))
static char *extensions[] = {".com", ".exe", ".bat", ".cmd"};
int i;
char temp[FILENAME_MAX];
if (-1 != _access(name, 0)) {
strcpy(buffer, name);
return TRUE;
}
for (i=0; i<elements(extensions); i++) {
strcpy(temp, name);
strcat(temp, extensions[i]);
if ( -1 != _access(temp, 0)) {
strcpy(buffer, temp);
return TRUE;
}
}
_searchenv(name, "PATH", buffer);
if ( buffer[0] != '\0')
return TRUE;
for ( i=0; i<elements(extensions); i++) {
strcpy(temp, name);
strcat(temp, extensions[i]);
_searchenv(temp, "PATH", buffer);
if ( buffer[0] != '\0')
return TRUE;
}
return FALSE;
}
child_process::child_process(char const *filename) {
if (isdigit(filename[0])) {
DWORD id = atoi(filename);
process_ = OpenProcess(PROCESS_ALL_ACCESS, false, atoi(filename));
DebugActiveProcess(id);
}
else {
char buf[FILENAME_MAX];
PROCESS_INFORMATION pi = {0};
STARTUPINFO si = {0};
si.cb = sizeof(si);
if (!find_image(filename, buf))
throw no_spawn();
BOOL new_process_ = CreateProcess(buf, NULL, NULL, NULL, FALSE,
DEBUG_ONLY_THIS_PROCESS,
NULL, NULL,
&si, &pi);
if (!new_process_)
throw no_spawn();
CloseHandle(pi.hThread);
process_ = pi.hProcess;
thread_ = pi.hThread;
}
}
char *child_process::get_string(void *string_name, DWORD num) const {
// string_name is a pointer to a pointer to a string, with the pointer and the
// string itself located in another process. We use ReadProcessMemory to read
// the first pointer, then the string itself into our process's address space.
// We then return a pointer (in our address space) to the string we read in.
//
char *ptr;
SIZE_T bytes_read;
if ( 0 == num )
num = sizeof(buffer);
if ( string_name == NULL )
return NULL;
ReadProcessMemory(process_,
string_name,
&ptr,
sizeof(ptr),
&bytes_read);
if (NULL == ptr )
return NULL;
ReadProcessMemory(process_,
ptr,
buffer,
num,
&bytes_read);
return buffer;
}
char *child_process::get_debug_string(void *string, DWORD num) const {
static char buffer[FILENAME_MAX];
SIZE_T bytes_read;
if ( string == NULL )
return NULL;
ReadProcessMemory(process_,
string,
buffer,
num,
&bytes_read);
return buffer;
}
That's not enough to do everything you want yet, but at least it should give you a start in the general direction.
Oh, one disclaimer: I wrote most of this code quite a long time ago. There are parts I'd certainly do differently if I were to write it today.
I'm trying to read a Unicode string from another process's memory with this code:
Function:
bool ReadWideString(const HANDLE& hProc, const std::uintptr_t& addr, std::wstring& out) {
std::array<wchar_t, maxStringLength> outStr;
auto readMemRes = ReadProcessMemory(hProc, (LPCVOID)addr,(LPVOID)&out, sizeof(out), NULL);
if (!readMemRes)
return false;
else {
out = std::wstring(outStr.data());
}
return true;
}
Call:
std::wstring name;
bool res = ReadWideString(OpenedProcessHandle, address, name);
std::wofstream test("test.txt");
test << name;
test.close();
This is working well with English letters, but when I try to read Cyrillic, it outputs nothing. I tried with std::string, but all I get is just a random junk like "EC9" instead of "Дебил".
I'm using Visual Studio 17 and the C++17 standard.
You can't read directly into the wstring object the way you are doing. That will overwrite its internal data members and corrupt surrounding memory, which would be very bad.
You are allocating a local buffer, but you are not using it for anything. Use it, e.g.:
bool ReadWideString(HANDLE hProc, std::uintptr_t addr, std::wstring& out) {
std::array<wchar_t, maxStringLength> outStr;
SIZE_T numRead = 0;
if (!ReadProcessMemory(hProc, reinterpret_cast<LPVOID>(addr), &outStr, sizeof(outStr), &numRead))
return false;
out.assign(outStr.data(), numRead / sizeof(wchar_t));
return true;
}
std::wstring name;
if (ReadWideString(OpenedProcessHandle, address, name)) {
std::ofstream test("test.txt", std::ios::binary);
wchar_t bom = 0xFEFF;
test.write(reinterpret_cast<char*>(&bom), sizeof(bom));
test.write(reinterpret_cast<const char*>(name.c_str()), name.size() * sizeof(wchar_t));
}
Alternatively, get rid of the local buffer and preallocate the wstring's memory buffer instead, then you can read directly into it, eg:
bool ReadWideString(HANDLE hProc, std::uintptr_t addr, std::wstring& out) {
out.resize(maxStringLength);
SIZE_T numRead = 0;
if (!ReadProcessMemory(hProc, reinterpret_cast<LPVOID>(addr), &out[0], maxStringLength * sizeof(wchar_t), &numRead)) {
out.clear();
return false;
}
out.resize(numRead / sizeof(wchar_t));
return true;
}
Or
bool ReadWideString(HANDLE hProc, std::uintptr_t addr, std::wstring& out) {
std::wstring outStr;
outStr.resize(maxStringLength);
SIZE_T numRead = 0;
if (!ReadProcessMemory(hProc, reinterpret_cast<LPVOID>(addr), &outStr[0], maxStringLength * sizeof(wchar_t), &numRead))
return false;
outStr.resize(numRead / sizeof(wchar_t));
out = std::move(outStr);
return true;
}
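The same idea can be exercised without a second process. In this sketch, fake_read_memory is a hypothetical stand-in for ReadProcessMemory built on memcpy, and read_wide is an illustrative name. Trimming at the first terminator is an extra step that assumes the remote string is NUL-terminated, which the versions above do not do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical stand-in for ReadProcessMemory: copies `size` bytes from
// `src` to `dst` and reports how many bytes were copied.
inline bool fake_read_memory(const void* src, void* dst,
                             std::size_t size, std::size_t* bytes_read) {
    std::memcpy(dst, src, size);
    *bytes_read = size;
    return true;
}

inline bool read_wide(const wchar_t* remote, std::size_t max_chars, std::wstring& out) {
    assert(remote != nullptr);
    std::wstring tmp(max_chars, L'\0');
    std::size_t bytes = 0;
    // The destination is the character storage (&tmp[0]), never &tmp itself.
    if (!fake_read_memory(remote, &tmp[0], max_chars * sizeof(wchar_t), &bytes))
        return false;
    tmp.resize(bytes / sizeof(wchar_t));
    // Assumption: the source is NUL-terminated, so drop everything after it.
    const std::size_t nul = tmp.find(L'\0');
    if (nul != std::wstring::npos)
        tmp.resize(nul);
    out = std::move(tmp);
    return true;
}
```

Reading into a temporary and moving it into out on success keeps the caller's string unchanged if the read fails.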
I'm trying to read data from an XML file and store every element ("< some data/>") in a vector<TCHAR*>. Why does Task Manager show memory usage much greater than the vector size (~80 MB instead of ~59 MB)?
#define _UNICODE
#include<tchar.h>
#include<iostream>
#include<windows.h>
#include<vector>
using namespace std;
HANDLE hFile;
HANDLE hThread;
vector<TCHAR*> tokens;
DWORD tokensSize;
DWORD WINAPI Thread(LPVOID lpVoid);
void main()
{
tokensSize = 0;
hFile = CreateFile("db.xml",GENERIC_READ,0,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);
if(hFile == INVALID_HANDLE_VALUE) {
cout<<"CreateFile Error # "<<GetLastError()<<endl;
}
DWORD fileSize = GetFileSize(hFile,NULL);
cout<<"fileSize = "<<fileSize<<" bytes = "<<fileSize/1024/1024<<" mb"<<endl;
TCHAR* buffer = new TCHAR[fileSize / sizeof(TCHAR) + 1];
ZeroMemory(buffer,fileSize);
DWORD bytesRead;
if(!ReadFile(hFile,buffer,fileSize,&bytesRead,NULL)){
cout<<"ReadFile Error # "<<GetLastError()<<endl;
}
CloseHandle(hFile);
hThread = CreateThread(NULL,0,Thread,(LPVOID)buffer,0,NULL);
WaitForSingleObject(hThread,INFINITE);
for(int i=0;i<tokens.size();i++)
tokensSize+=(_tcslen(tokens[i])+1)*sizeof(TCHAR);
cout<<"vector size = "<<tokensSize<<" bytes = "<<tokensSize/1024/1024<<" mb"<<endl;
cin.get();
}
DWORD WINAPI Thread(LPVOID lpVoid)
{
wstring entireDB = (TCHAR*)lpVoid;
delete[]lpVoid;
wstring currentElement;
wstring::size_type lastPos = 0;
wstring::size_type next;
next = entireDB.find(_T(">"),lastPos);
TCHAR* szStr;
do
{
currentElement = entireDB.substr(lastPos,next+1-lastPos);
szStr = new TCHAR[currentElement.length()+1];
_tcscpy(szStr,currentElement.c_str());
tokens.push_back(szStr);
lastPos = next+1;
next = entireDB.find(_T(">"),lastPos);
}
while(next != wstring::npos);
entireDB.clear();
return 0;
}
Output:
fileSize = 57mb
vectorSize = 58mb
but Task Manager shows ~81 MB.
What am I doing wrong?
Thanks!
First, as Aesthete has pointed out, you never clear the token vector once you're finished with it. This should be done, or change the token vector to utilize self-cleaning content like std::string or std::wstring.
Which brings me to the side-by-side below. Please review this against your existing code. There are a number of changes you'll want to compare. The one you will likely not see until you compile and run is the memory footprint difference, which may surprise you.
Major Changes
Global tokens is now a vector of std::wstring rather than raw wchar_t pointers
Uses MultiByteToWideChar to translate the input file.
Allocates a std::wstring dynamically as the thread parameter. This removes one full copy of the file image. The thread is responsible for deleting the wstring once finished parsing the content.
Uses _beginthreadex() for starting the thread. The fundamental reason for this is because of the C/C++ runtime usage. In the past the runtime sets up various thread-local-storage that must be properly cleaned, and are so when using _beginthreadex(). It is almost identical to CreateThread(), but honestly I look forward to the day when MS has their stuff together and gives us std::thread officially like the rest of the civilized world.
Minor/Meaningless Changes
Global variables are brought to local scope where appropriate. This means the only real global now is the tokens vector.
The thread procedure now pushes substrings straight to the tokens vector.
Uses argv[1] for the filename (easy to debug that way, no other special reason). This can be changed back to your hard-coded filename as needed.
I hope this gives you some ideas on cleaning this up, and more importantly, how you can do almost the entire task you're given without having to go new and delete nuts.
Notes: this does NOT check the input file for a byte-order-mark. I'm taking it on faith that your claim it is UTF8 is straight-up and doesn't have a BOM at the file beginning. If your input file does have a BOM, you need to adjust the code that reads the file in to account for this.
#include <windows.h>
#include <tchar.h>
#include <process.h>
#include <iostream>
#include <vector>
#include <string>
using namespace std;
// global vector of tokens
vector<wstring> tokens;
// format required by _beginthreadex()
unsigned int _stdcall ThreadProc(void *p);
int main(int argc, char *argv[])
{
HANDLE hThread = NULL;
std::string xml;
std::wstring* pwstr = NULL;
// check early exit
if (argc != 2)
{
cout << "Usage: " << argv[0] << " filename" << endl;
return EXIT_FAILURE;
}
// read the file content with the Win32 file API (CreateFileA/ReadFile).
// the runtime library would work equally well for general file ops.
HANDLE hFile = CreateFileA(argv[1], GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hFile != INVALID_HANDLE_VALUE)
{
DWORD dwFileSize = GetFileSize(hFile, NULL);
if (dwFileSize > 0)
{
// allocate a string large enough for the whole file.
std::string xml(dwFileSize, 0);
DWORD bytesRead = 0;
if (ReadFile(hFile, &xml.at(0), dwFileSize, &bytesRead, NULL) && (bytesRead == dwFileSize))
{
// invoke MB2WC to determine wide-char requirements
int ires = MultiByteToWideChar(CP_UTF8, 0, xml.c_str(), -1, NULL, 0);
if (ires > 0)
{
// allocate a wstring for our thread parameter.
pwstr = new wstring(ires, 0);
MultiByteToWideChar(CP_UTF8, 0, xml.c_str(), -1, &pwstr->at(0), ires);
// launch thread. it owns the wstring we're sending, including cleanup.
hThread = (HANDLE)_beginthreadex(NULL, 0, ThreadProc, pwstr, 0, NULL);
}
}
}
// release the file handle
CloseHandle(hFile);
}
// wait for potential thread
if (hThread != NULL)
{
WaitForSingleObject(hThread, INFINITE);
CloseHandle(hThread);
}
// report space taken by tokens
size_t tokensSize = 0;
for (vector<wstring>::const_iterator it = tokens.begin(); it != tokens.end(); ++it)
tokensSize += (it->size()+1) * sizeof(wchar_t); // size() counts characters, not bytes
cout << "tokens count = " << tokens.size() << endl
<< "tokens size = "<< tokensSize <<" bytes" << endl;
cin.get();
}
// our thread parameter is a dynamic-allocated wstring.
unsigned int _stdcall ThreadProc(void *p)
{
// early exit on null insertion
if (p == NULL)
return EXIT_FAILURE;
// use string passed to us.
wstring* pEntireDB = static_cast<wstring*>(p);
wstring::size_type last = 0;
wstring::size_type next = pEntireDB->find(L'>',last);
while(next != wstring::npos)
{
tokens.push_back(pEntireDB->substr(last, next-last+1));
last = next+1;
next = pEntireDB->find(L'>', last);
}
// delete the wstring (no longer needed)
delete pEntireDB;
return EXIT_SUCCESS;
}
You allocate memory here, in the do-while loop:
szStr = new TCHAR[currentElement.length()+1];
And you never release it with the delete[] operator.
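One way to avoid the missing delete[] altogether is to let each token own its storage. The sketch below is a portable illustration with made-up names (split_elements, payload_bytes). It is not the asker's threaded program, only the tokenizing step.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits `db` after every '>' and returns the pieces. Each std::wstring
// frees its own storage, so no matching delete[] is needed.
inline std::vector<std::wstring> split_elements(const std::wstring& db) {
    std::vector<std::wstring> tokens;
    std::wstring::size_type last = 0;
    std::wstring::size_type next = db.find(L'>', last);
    while (next != std::wstring::npos) {
        tokens.push_back(db.substr(last, next - last + 1));
        last = next + 1;
        next = db.find(L'>', last);
    }
    return tokens;
}

// Character data in bytes, counting one terminator per token. This is
// what the question's "vector size" figure measures.
inline std::size_t payload_bytes(const std::vector<std::wstring>& tokens) {
    std::size_t total = 0;
    for (const std::wstring& t : tokens)
        total += (t.size() + 1) * sizeof(wchar_t);
    return total;
}
```

Note that payload_bytes counts only the characters. The figure in Task Manager will normally be higher even with no leak, since every separate allocation carries some heap bookkeeping and the vector itself reserves spare capacity.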