I have to read data line by line from a large file (more than 7GB) that contains a list of vertex coordinates and face-to-vertex connectivity information forming a mesh. I am also learning how to use open and mmap on Linux, and CreateFileA, CreateFileMapping and MapViewOfFile on Windows. Both the Linux and Windows versions are compiled as 64-bit.
When I am on Linux (using docker) with g++-10 test.cpp -O3 -std=c++17 I get around 6s.
When I am on Windows (my actual PC), with cl (version 19.29.30037 x64) cl test.cpp /EHsc /O3 /std:c++17 I get 13s, and with clang++-11 (from Visual Studio Build Tools) I get 11s.
Both systems (same PC, but one is using docker) use exactly the same code except for generating the const char* that represents the memory array and the uint64_t size that represents the memory size.
This is the way I switch platforms:
// includes for using a specific platform API
#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
// using windows handle void*
#define handle_type HANDLE
#else
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
// using file descriptors
#define handle_type int
#endif
Specifically the code for getting the memory in an array of char-s is:
using uint_t = std::size_t;
// open the file -----------------------------------------------------------------------------
handle_type open(const std::string& filename) {
#ifdef _WIN32
// see windows file mapping api for parameter explanation
return ::CreateFileA(filename.c_str(), GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL); // private access
#else
return ::open(filename.c_str(), O_RDONLY);
#endif
}
// get the memory size to later have a bound for reading -------------------------------------
uint_t memory_size(handle_type fid) {
#ifdef _WIN32
LARGE_INTEGER size{};
if (!::GetFileSizeEx(fid, &size)) {
std::cerr << "file not found\n";
return size.QuadPart;
}
return size.QuadPart;
#else
struct stat sb;
// get the file stats and check if not zero size
if (fstat(fid, &sb)) {
std::cerr << "file not found\n";
return decltype(sb.st_size){};
}
return sb.st_size;
#endif
}
// get the actual char array to access memory ------------------------------------------------
const char* memory_map(handle_type fid, uint_t memory_size) {
#ifdef _WIN32
HANDLE mapper = ::CreateFileMapping(fid, NULL, PAGE_READONLY, 0, 0, NULL);
return reinterpret_cast<const char*>(::MapViewOfFile(mapper, FILE_MAP_READ, 0, 0, memory_size));
#else
return reinterpret_cast<const char*>(::mmap(NULL, memory_size, PROT_READ, MAP_PRIVATE, fid, 0));
#endif
}
I am completely new to this sort of parsing and was wondering whether I am doing something wrong in choosing the parameters in the Windows API (to mimic the behaviour of mmap), or whether the difference in time is a matter of compilers/systems that I have to accept.
The actual time to open, get the memory size, and the memory map is negligible both on Linux and on Windows, the rest of the code is identical, as it only operates using the const char* and size_t info.
Thanks for taking the time to read. Any tip is greatly appreciated and sorry if anything is unclear.
Maybe you should take a look at https://github.com/alitrack/mman-win32 which is a mmap implementation for Windows. That way you don't need to write different code for Windows.
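Two things that may be worth checking before accepting the difference. First, as far as I know cl has no /O3 switch (the documented speed setting is /O2), so the MSVC build may not be optimized the way the g++ build is. Second, the Windows path in the question never closes the mapping handle and gives the OS no access-pattern hint. The sketch below shows one way to write the mapping with cleanup and a sequential hint on both platforms. MappedFile, map_readonly and unmap are hypothetical names for this example, and whether FILE_FLAG_SEQUENTIAL_SCAN or MADV_SEQUENTIAL actually helps a mapped view is an assumption that has to be measured.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#else
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#endif

// Hypothetical helper type: a read-only view of a whole file.
struct MappedFile {
    const char* data = nullptr;  // nullptr means the mapping failed
    std::size_t size = 0;
};

MappedFile map_readonly(const std::string& filename) {
    MappedFile out;
#ifdef _WIN32
    // FILE_FLAG_SEQUENTIAL_SCAN is only a caching hint (assumed helpful, not measured).
    HANDLE file = ::CreateFileA(filename.c_str(), GENERIC_READ, FILE_SHARE_READ, NULL,
                                OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
    if (file == INVALID_HANDLE_VALUE) return out;
    LARGE_INTEGER size{};
    HANDLE mapping = NULL;
    if (::GetFileSizeEx(file, &size) && size.QuadPart > 0)
        mapping = ::CreateFileMappingA(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (mapping != NULL) {
        // A length of 0 maps the file from the offset to its end.
        out.data = static_cast<const char*>(::MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0));
        if (out.data) out.size = static_cast<std::size_t>(size.QuadPart);
        ::CloseHandle(mapping);  // the view keeps the mapping object alive
    }
    ::CloseHandle(file);
#else
    int fd = ::open(filename.c_str(), O_RDONLY);
    if (fd < 0) return out;
    struct stat sb{};
    if (::fstat(fd, &sb) == 0 && sb.st_size > 0) {
        const std::size_t len = static_cast<std::size_t>(sb.st_size);
        void* p = ::mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            ::madvise(p, len, MADV_SEQUENTIAL);  // hint only, failure is harmless
            out.data = static_cast<const char*>(p);
            out.size = len;
        }
    }
    ::close(fd);  // the mapping stays valid after the descriptor is closed
#endif
    return out;
}

void unmap(MappedFile& m) {
    if (!m.data) return;
#ifdef _WIN32
    ::UnmapViewOfFile(m.data);
#else
    ::munmap(const_cast<char*>(m.data), m.size);
#endif
    m = MappedFile{};
}
```

The parsing code can keep working on data and size exactly as before; only the setup and teardown change.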
I am writing code for a macOS application.
The application will run on M1-based Macs as well as Intel-based Macs.
What switch can I use to differentiate M1 from Intel?
if (M1)
{
do something for M1
}
else if (Intel)
{
do something for Intel
}
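I think you can use the architecture macros that clang predefines. Note that __arm__ is only defined for 32-bit ARM targets; Apple silicon is a 64-bit ARM target, for which clang defines __arm64__ and __aarch64__ instead:
#if defined(__arm64__) || defined(__aarch64__)
//do smth on arm64 (M1)
#else
//do smth on x86 (Intel)
#endif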
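If you prefer the if (M1) / else if (Intel) shape from the question over preprocessor blocks, the macro check can be wrapped once in a constexpr function. This is only a sketch: CpuArch, build_arch and arch_name are hypothetical names, and the value describes the architecture the code was compiled for, not the machine it runs on.

```cpp
#include <string_view>

// Hypothetical enum and helpers, named for this sketch only.
enum class CpuArch { Arm64, X86_64, Other };

// Architecture this translation unit was compiled for.
constexpr CpuArch build_arch() {
#if defined(__aarch64__) || defined(__arm64__) || defined(_M_ARM64)
    return CpuArch::Arm64;
#elif defined(__x86_64__) || defined(_M_X64)
    return CpuArch::X86_64;
#else
    return CpuArch::Other;
#endif
}

constexpr std::string_view arch_name(CpuArch a) {
    switch (a) {
        case CpuArch::Arm64:  return "arm64";
        case CpuArch::X86_64: return "x86_64";
        default:              return "other";
    }
}
```

Usage would look like if constexpr (build_arch() == CpuArch::Arm64) { ... } else { ... }. In a universal binary each slice is compiled separately, so each slice gets its own answer.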
I was just fooling around with this and found this reference for Objective-C from Apple that also seemed to work with clang for C++.
// Objective-C example
#include "TargetConditionals.h"
#if TARGET_OS_OSX
// Put CPU-independent macOS code here.
#if TARGET_CPU_ARM64
// Put 64-bit Apple silicon macOS code here.
#elif TARGET_CPU_X86_64
// Put 64-bit Intel macOS code here.
#endif
#elif TARGET_OS_MACCATALYST
// Put Mac Catalyst-specific code here.
#elif TARGET_OS_IOS
// Put iOS-specific code here.
#endif
https://developer.apple.com/documentation/apple-silicon/building-a-universal-macos-binary
I specifically checked to see if TARGET_CPU_ARM64 was defined in my header.
Hopefully this helps someone.
If you need a runtime check instead of a compile-time check, you could use something like the following:
#include <iostream>
#include <sys/sysctl.h>
#include <mach/machine.h>
int main(int argc, const char * argv[])
{
cpu_type_t type = 0;
size_t size = sizeof(type);
sysctlbyname("hw.cputype", &type, &size, NULL, 0);
// Stays 0 if the sysctl below does not exist on this system
int procTranslated = 0;
size = sizeof(procTranslated);
// Checks whether process is translated by Rosetta
sysctlbyname("sysctl.proc_translated", &procTranslated, &size, NULL, 0);
// Removes CPU_ARCH_ABI64 or CPU_ARCH_ABI64_32 encoded with the Type
cpu_type_t typeWithABIInfoRemoved = type & ~CPU_ARCH_MASK;
if (typeWithABIInfoRemoved == CPU_TYPE_X86)
{
if (procTranslated == 1)
{
std::cout << "ARM Processor (Running x86 application in Rosetta)";
}
else
{
std::cout << "Intel Processor";
}
}
else if (typeWithABIInfoRemoved == CPU_TYPE_ARM)
{
std::cout << "ARM Processor";
}
}
For some reason I can no longer compile a C file in my C++ CLR console application. It worked before without CLR support. I also switched my project to compile as /TP, but it is still not working. Any help would be greatly appreciated.
Error
Severity Code Description Project File Line Suppression State
Error C2664 'int strcmp(const char *,const char *)': cannot convert argument 1 from 'WCHAR [260]' to 'const char *'
snowkill.c
#include "snowkill.h"
void killProcessByName(WCHAR *filename)
{
HANDLE hSnapShot = CreateToolhelp32Snapshot(TH32CS_SNAPALL, NULL);
PROCESSENTRY32 pEntry;
pEntry.dwSize = sizeof(pEntry);
BOOL hRes = Process32First(hSnapShot, &pEntry);
while (hRes)
{
if (strcmp(pEntry.szExeFile, filename) == 0)
{
HANDLE hProcess = OpenProcess(PROCESS_TERMINATE, 0,
(DWORD)pEntry.th32ProcessID);
if (hProcess != NULL && pEntry.th32ProcessID != GetCurrentProcessId())
{
TerminateProcess(hProcess, 9);
CloseHandle(hProcess);
}
}
hRes = Process32Next(hSnapShot, &pEntry);
}
CloseHandle(hSnapShot);
}
snowkill.h
#pragma once
#include "stdafx.h"
#include <windows.h>
#include <process.h>
#include <Tlhelp32.h>
#include <winbase.h>
#include <string.h>
#ifdef __cplusplus
extern "C" {
#endif
void killProcessByName(WCHAR *filename);
#ifdef __cplusplus
}
#endif
main.cpp
#include "stdafx.h"
#include "snowkill.h"
#include "motion.h"
#include "info.h"
#include "flushsound.h"
#include "snowserial.h"
using namespace System;
bool on() {
return true;
}
bool off() {
return false;
}
int main()
{
listenoncommport();
for (;;) {
string onoff = checkfile();
if (onoff == "1")
{
//detected();
}
else
{
WCHAR *proccc = L"firefox.exe";
killProcessByName(proccc);
//notdetected();
}
Sleep(5000);
}
return 0;
}
You could change every instance of WCHAR to TCHAR so text setting is "generic", or as already mentioned, change the project property character set to be Unicode only.
void killProcessByName(TCHAR *filename)
/* ... */
if (_tcscmp(pEntry.szExeFile, filename) == 0) /* replaced strcmp; _tcscmp needs <tchar.h> */
/* ... */
#include <windows.h> /* needed in order to use TEXT() macro */
/* ... */
TCHAR *proccc = TEXT("firefox.exe"); /* TEXT() is a <windows.h> macro */
Use TCHAR type everywhere if the functions involved are not WCHAR specific. That would allow project setting to build either ANSI/ASCII (not set) or Unicode.
Note that Process32First and Process32Next use TCHAR.
This is mostly for legacy reasons: the Windows NT family (NT, 2000, XP and later) uses Unicode internally and converts ANSI/ASCII to Unicode as needed, while the older Windows 95/98/ME API functions use ANSI/ASCII.
However, typically many or most text files (such as source code) are ANSI/ASCII and not Unicode, and it's awkward to have to support Unicode for Windows API and then ANSI/ASCII for text files in the same program, and for those projects I use ANSI/ASCII.
By using the TCHAR based generic types, I can share common code with projects that use Unicode and with projects that use ANSI/ASCII.
The error message is clear: you have an error at this precise line:
if (strcmp(pEntry.szExeFile, filename) == 0)
Because your arguments are not of char* type as expected by strcmp but WCHAR* types. You should use wcscmp instead, which is basically the same function, but working with wchar_t* type.
szExeFile in tagPROCESSENTRY32 is declared as TCHAR, which will be a 1-byte char when compiling with Character Set set to 'Not Set' or 'Multibyte'. Set Character Set in your project settings to Use Unicode Character Set to fix the problem.
Also, use wcscmp to compare WCHAR types.
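One more thing to keep in mind once the types match: wcscmp is case-sensitive, while file names on Windows normally are not, so "Firefox.exe" would not match "firefox.exe". On Windows _wcsicmp handles that directly. The portable sketch below shows the same idea with std::towlower; same_exe_name is a hypothetical helper name, and the comparison follows the current C locale, which is enough for plain ASCII names.

```cpp
#include <cwchar>
#include <cwctype>

// Hypothetical helper: compare two wide strings while ignoring case.
bool same_exe_name(const wchar_t* a, const wchar_t* b) {
    while (*a != L'\0' && *b != L'\0') {
        if (std::towlower(static_cast<std::wint_t>(*a)) !=
            std::towlower(static_cast<std::wint_t>(*b)))
            return false;
        ++a;
        ++b;
    }
    return *a == *b;  // equal only if both strings ended together
}
```

In the loop from the question this would replace the strcmp call, for example if (same_exe_name(pEntry.szExeFile, filename)), assuming a Unicode build so that szExeFile is a WCHAR array.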
I work on a Linux platform and use g++ with the program below, which copies a function from the code area to the data area. How do I change the protection of the data segment so that I can execute the copied function?
The code is below:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#define Return asm volatile("pop %rbp; retq; retq; retq; retq; retq;")
int64_t funcEnd=0xc35dc3c3c3c3c35d;
constexpr int maxCode=0x800;
int8_t code[maxCode];
void testCode(void){
int a=8,b=7;
a+=b*a;
Return;
}
typedef void (*action)(void);
int main(int argc, char **argv)
{
action a=&testCode;
testCode();
int8_t *p0=(int8_t*)a,*p=p0,*p1=p0+maxCode;
for(;p!=p1;p++)
if ( (*(int64_t*)p)==funcEnd ) break;
if(p!=p1){
p+=sizeof(int64_t);
printf("found\n");
memcpy(&code,(void*)a,p-(int8_t*)a);
((action)&code)();
}
printf("returning 0\n");
return 0;
}
It depends on whether you are trying to do this statically (at build-time) or dynamically (at run-time).
Build-time
You need to tell GCC to put your blob in a section that is executable. We use __attribute__((section)), and this trick to specify the attributes of the section when we create it.
Run-time
TL;DR: Jump to the end of my answer, where I use mmap.
Although others might question why you'd want to allow something like this at run-time, keep in mind that this is exactly what a VM with a JIT compiler (e.g. Java VM, .NET CLR, etc.) does when emitting native code.
You need to change the memory protections of the memory where you're trying to execute. We do that with mprotect(addr, len, prot), passing PROT_EXEC as prot. Note that addr must be aligned to the page size of your platform. On x86, the page size is 4K. We use aligned_alloc to guarantee this alignment.
Example (of both):
#define _ISOC11_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h> /* mprotect() */
__attribute__((section(".my_executable_blob,\"awx\",#progbits#")))
static uint8_t code[] = {
0xB8,0x2A,0x00,0x00,0x00, /* mov eax,0x2a */
0xC3, /* ret */
};
int main(void)
{
int (*func)(void);
/* Execute a static blob of data */
func = (void*)code;
printf("(static) code returned %d\n", func());
/* Execute a dynamically-allocated blob of data */
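/* Request a whole page: the size must be a multiple of the alignment, and mprotect changes the entire page anyway */
void *p = aligned_alloc(0x1000, 0x1000);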
if (!p) {
fprintf(stderr, "aligned_alloc() failed\n");
return 2;
}
memcpy(p, code, sizeof(code));
if (mprotect(p, sizeof(code), PROT_EXEC) < 0) {
perror("mprotect");
return 2;
}
func = p;
printf("(dynamic) code returned %d\n", func());
return 0;
}
Output:
$ ./a.out
(static) code returned 42
(dynamic) code returned 42
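Because mprotect works on whole pages, it can help to see the rounding spelled out. The sketch below computes the page-aligned range that covers an arbitrary buffer; PageSpan and page_span are hypothetical names, and the page size is passed in as a parameter (4096 is assumed in the comments, but sysconf(_SC_PAGESIZE) is the way to query it at run time).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: the page-aligned range covering [addr, addr + len).
struct PageSpan {
    std::uintptr_t start;  // addr rounded down to a page boundary
    std::size_t length;    // rounded up to a whole number of pages
};

// page_size must be a power of two (for example 4096).
constexpr PageSpan page_span(std::uintptr_t addr, std::size_t len, std::size_t page_size) {
    const std::uintptr_t mask = static_cast<std::uintptr_t>(page_size) - 1;
    const std::uintptr_t start = addr & ~mask;               // round down
    const std::uintptr_t end = (addr + len + mask) & ~mask;  // round up
    return PageSpan{start, static_cast<std::size_t>(end - start)};
}
```

The result would be passed on as mprotect(reinterpret_cast<void*>(s.start), s.length, prot). Keep in mind that this changes the protection of everything else living on those pages too, which is why a dedicated page-sized allocation is the safer choice.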
SELinux Impact
Note that this puts your executable code on the heap which might be a bit dangerous. SELinux on my CentOS 7 machine actually denied the mprotect call:
SELinux is preventing /home/jreinhart/so/a.out from using the execheap access on a process.
***** Plugin allow_execheap (53.1 confidence) suggests ********************
If you do not think /home/jreinhart/so/a.out should need to map heap memory that is both writable and executable.
Then you need to report a bug. This is a potentially dangerous access.
So I had to temporarily sudo setenforce 0 to get this to work.
I'm not sure why, however, because looking in /proc/[pid]/maps, the pages are clearly marked only as executable, not as "writable and executable" as SELinux indicated. If I move the memcpy after the mprotect, my process segfaults, because I'm trying to write to non-writable memory. So it seems SELinux is being a bit too over-zealous here.
Use mmap instead
Instead of mprotecting a region of the heap (allocated with aligned_alloc), it is more straightforward to use mmap. This also avoids any issues with SELinux, as we're not trying to execute on the heap.
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
#include <sys/mman.h> /* mmap() */
static uint8_t code[] = {
0xB8,0x2A,0x00,0x00,0x00, /* mov eax,0x2a */
0xC3, /* ret */
};
int main(void)
{
void *p = mmap(NULL, sizeof(code), PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (p==MAP_FAILED) {
fprintf(stderr, "mmap() failed\n");
return 2;
}
memcpy(p, code, sizeof(code));
int (*func)(void) = p;
printf("(dynamic) code returned %d\n", func());
pause();
return 0;
}
The final solution
The mmap solution is good, but it doesn't provide us any safety; our mmaped region of code is readable, writable, and executable. It would be better to allow the memory to be writable only while we're putting our code in place, and then make it executable only. The following code does just that:
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
#include <sys/mman.h> /* mmap(), mprotect() */
static uint8_t code[] = {
0xB8,0x2A,0x00,0x00,0x00, /* mov eax,0x2a */
0xC3, /* ret */
};
int main(void)
{
const size_t len = sizeof(code);
/* mmap a region for our code */
void *p = mmap(NULL, len, PROT_READ|PROT_WRITE, /* No PROT_EXEC */
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (p==MAP_FAILED) {
fprintf(stderr, "mmap() failed\n");
return 2;
}
/* Copy it in (still not executable) */
memcpy(p, code, len);
/* Now make it execute-only */
if (mprotect(p, len, PROT_EXEC) < 0) {
fprintf(stderr, "mprotect failed to mark exec-only\n");
return 2;
}
/* Go! */
int (*func)(void) = p;
printf("(dynamic) code returned %d\n", func());
pause();
return 0;
}
I am trying to get the file size of a system application on Windows. To test this I have created a test application that tries to get the file size of smss.exe in C:\Windows\System32\smss.exe, but it fails with the error ERROR_FILE_NOT_FOUND. The file does actually exist (I have checked). I've also tried different methods for getting the file size: FindFirstFile, CreateFile and GetFileSizeEx. But all return the same error. I would also like to read the file contents.
What am I doing wrong?
The code:
// Test.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <windows.h>
#include <stdio.h>
#include <tchar.h>
#include <iostream>
__int64 getFileSize(LPWSTR filePath)
{
WIN32_FILE_ATTRIBUTE_DATA fad;
if (!GetFileAttributesEx(filePath, GetFileExInfoStandard, &fad))
{
_tprintf(TEXT("\n CAnt get file size for file %s error %d"), filePath, GetLastError());
return 0;
}
LARGE_INTEGER size;
size.HighPart = fad.nFileSizeHigh;
size.LowPart = fad.nFileSizeLow;
return size.QuadPart;
}
int _tmain(int argc, _TCHAR* argv[])
{
_tprintf(TEXT("File size %lld "), getFileSize(L"C:\\Windows\\System32\\smss.exe")); // %lld matches the __int64 return value
}
As your application is 32-bit, the system redirects your path to go to SysWOW64 instead, where there is no smss.exe. While you have discovered that Wow64DisableWow64FsRedirection disables this redirection, also consider that having a 64-bit program would also do the trick.
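Another option for a 32-bit process on 64-bit Windows is the Sysnative alias: a path under C:\Windows\Sysnative is resolved to the real System32 directory without redirection. The sketch below only rewrites the path string; to_sysnative is a hypothetical helper, it matches the spelling \System32\ exactly, and it should only be applied when the process really runs under WOW64 (IsWow64Process can tell you), because the alias is not visible to 64-bit processes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: swap the first "\System32\" component for "\Sysnative\".
// Paths without that component are returned unchanged.
std::wstring to_sysnative(std::wstring path) {
    const std::wstring from = L"\\System32\\";
    const std::wstring to = L"\\Sysnative\\";
    const std::size_t pos = path.find(from);
    if (pos != std::wstring::npos)
        path.replace(pos, from.size(), to);
    return path;
}
```

With that, getFileSize could be given to_sysnative(L"C:\\Windows\\System32\\smss.exe") in the 32-bit build, provided its parameter is changed to accept a const wide string.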
Getting the size of a file is already answered here (can't yet add a comment to your question, so I need to write it as an answer):
How can I get a file's size in C++?
#include <fstream>
std::ifstream::pos_type filesize(const char* filename)
{
std::ifstream in(filename, std::ifstream::in | std::ifstream::binary);
in.seekg(0, std::ifstream::end);
return in.tellg();
}
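If a C++17 compiler is available, std::filesystem::file_size gives the size without opening a stream. The wrapper below is a minimal sketch: file_size_or_minus_one is a hypothetical name, and it uses the error_code overload so that a missing file is reported as -1 instead of an exception. It does not change the redirection behaviour described above, so a 32-bit process will still fail to find the System32 path.

```cpp
#include <cstdint>
#include <filesystem>
#include <system_error>

// Hypothetical wrapper: size in bytes, or -1 if the size cannot be determined.
std::int64_t file_size_or_minus_one(const std::filesystem::path& p) {
    std::error_code ec;
    const std::uintmax_t n = std::filesystem::file_size(p, ec);
    return ec ? -1 : static_cast<std::int64_t>(n);
}
```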
I want to get the driver version of an nVidia video card.
So I used WMI and read the "DriverVersion" property of the "Win32_VideoController" class.
But it was like "9.18.13.1106" (file version) and what I wanted is something like "311.06" (driver version).
Where can I get that information?
If it is not possible with WMI, I would like to know another way to get it.
Thanks.
You can do this using NVML from nVidia's Tesla Deployment Kit. You can retrieve the internal driver version (the one you're accustomed to seeing for an nVidia driver) with code like this:
#include <iostream>
#include <string>
#include <stdexcept>
#include <stdlib.h>
#include <nvml.h>
#include <windows.h>
namespace {
typedef nvmlReturn_t (*init)();
typedef nvmlReturn_t (*shutdown)();
typedef nvmlReturn_t (*get_version)(char *, unsigned);
class NVML {
init nvmlInit;
shutdown nvmlShutdown;
get_version nvmlGetDriverVersion;
std::string find_dll() {
std::string loc(getenv("ProgramW6432"));
loc += "\\Nvidia Corporation\\nvsmi\\nvml.dll";
return loc;
}
public:
NVML() {
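HMODULE lib = LoadLibraryA(find_dll().c_str()); // the path is a narrow string, so call the A version
if (lib == NULL)
throw(std::runtime_error("Unable to load nvml.dll"));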
nvmlInit = (init)GetProcAddress(lib, "nvmlInit");
nvmlShutdown = (shutdown)GetProcAddress(lib, "nvmlShutdown");
nvmlGetDriverVersion = (get_version)GetProcAddress(lib, "nvmlSystemGetDriverVersion");
if (NVML_SUCCESS != nvmlInit())
throw(std::runtime_error("Unable to initialize NVML"));
}
std::string get_ver() {
char buffer[81];
nvmlGetDriverVersion(buffer, sizeof(buffer));
return std::string(buffer);
}
~NVML() {
// Destructors are implicitly noexcept since C++11, so report the failure instead of throwing
if (NVML_SUCCESS != nvmlShutdown())
std::cerr << "Unable to shut down NVML\n";
}
};
}
int main() {
std::cout << "nVidia Driver version: " << NVML().get_ver();
}
Note that if you're writing this purely for your own use on a machine where you're free to edit the PATH, you can simplify this quite a bit. Most of the code deals with the fact that this uses NVML.DLL, which is in a directory that's not normally on the path, so the code loads that dynamically, and uses GetProcAddress to find the functions in it that we need to use. In this case, we're only using three functions, so it's not all that difficult to deal with, but it still drastically increases the length of the code.
If we could ignore all that nonsense, the real code would just come out to something on this general order:
nvmlInit();
nvmlSystemGetDriverVersion(result, sizeof(result));
std::cout << result;
nvmlShutdown();
Anyway, to build it, you'll need a command line something like:
cl -Ic:\tdk\nvml\include nv_driver_version.cpp
...assuming you've installed the Tesla Deployment Kit at c:\tdk.
In any case, yes, I've tested this to at least some degree. On my desktop it prints out:
nVidia Driver version: 314.22
...which matches what I have installed.
To get the Nvidia driver version through C++ on Win64:
Download NVAPI https://developer.nvidia.com/rtx/path-tracing/nvapi/get-started, a few MB
The main folder of the downloaded archive contains several header files, one of which is nvapi.h. Those headers are needed for compilation. The subfolder amd64 contains nvapi64.lib, which is needed for linking. The following code will now show the driver version:
#include <cstdio> // std::printf
#include <iostream>
extern "C" {
#include "nvapi.h"
}
int main() {
NvAPI_Status status = NVAPI_OK;
NvAPI_ShortString str;
status = NvAPI_Initialize();
if (status == NVAPI_LIBRARY_NOT_FOUND) {
//in this case NvAPI_GetErrorMessage() will only return an empty string
std::printf("error no nvidia driver found\n");
return 1;
} else if (status != NVAPI_OK) {
NvAPI_GetErrorMessage(status, str);
std::printf("error initializing nvapi: %s\n", str);
return 1;
}
NvU32 version = 0;
NvAPI_ShortString branch;
status = NvAPI_SYS_GetDriverAndBranchVersion(&version, branch);
if (status != NVAPI_OK) {
NvAPI_GetErrorMessage(status, str);
std::printf("error getting driver version: %s\n", str);
} else {
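// %02u keeps the leading zero of the minor part, e.g. 31106 prints as 311.06
std::printf("driver version %u.%02u", static_cast<unsigned>(version / 100), static_cast<unsigned>(version % 100));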
}
}
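If adding an NVIDIA library is not an option, the WMI string from the question can often be converted by hand: the last five digits of "9.18.13.1106" are 31106, which reads as 311.06. The sketch below applies that rule. It is a commonly used heuristic that matches the example in the question, not a documented API, so treat it as an assumption; nvidia_version_from_wmi is a hypothetical name.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: derive "311.06" from a WMI DriverVersion such as "9.18.13.1106".
// Heuristic only: keep the digits, take the last five, put a dot after the third.
std::string nvidia_version_from_wmi(const std::string& wmi) {
    std::string digits;
    for (char c : wmi) {
        if (std::isdigit(static_cast<unsigned char>(c)))
            digits += c;
    }
    if (digits.size() < 5)
        return std::string();  // too short to contain a driver version
    const std::string last5 = digits.substr(digits.size() - 5);
    return last5.substr(0, 3) + "." + last5.substr(3);
}
```

An empty result means the input was too short to contain a version, in which case falling back to NVML or NVAPI as shown above is the safer route.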