How to accelerate C++ writing speed to the speed tested by CrystalDiskMark? - c++

I am currently producing about 3.6 GB of data per second in memory, and I need to write it to my SSD continuously. I used CrystalDiskMark to test the write speed of my SSD; it is almost 6 GB per second, so I thought this should not be that hard.
[Screenshot: CrystalDiskMark test result for my SSD]
My computer runs Windows 10, and I am using Visual Studio 2017 Community.
I found this question and tried the highest-voted answer. Unfortunately, the writing speed was only about 1 s/GB for its option_2, far slower than what CrystalDiskMark measured. Then I tried memory mapping; this time writing was faster, about 630 ms/GB, but still much slower. Then I tried multi-threaded memory mapping: with 4 threads the speed was about 350 ms/GB, and adding more threads did not make writing any faster.
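For comparison with the CrystalDiskMark figure, these per-gigabyte timings can be converted to throughput. A minimal sketch (the helper name gb_per_second is made up for illustration, and "GB" here is whatever unit the timing was taken over):

```cpp
#include <cassert>

// Hypothetical helper: convert a cost in milliseconds per GB into GB per second.
inline double gb_per_second(double ms_per_gb) {
    assert(ms_per_gb > 0.0);  // a zero or negative timing makes no sense
    return 1000.0 / ms_per_gb;
}
```

By this measure, 1000 ms/GB is 1.0 GB/s, 630 ms/GB is about 1.59 GB/s, and 350 ms/GB is about 2.86 GB/s, all below the 3.6 GB/s at which the data arrives.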
Code for memory mapping:
#include <fstream>
#include <chrono>
#include <vector>
#include <cstdint>
#include <numeric>
#include <random>
#include <algorithm>
#include <iostream>
#include <cassert>
#include <thread>
#include <windows.h>
#include <sstream>
// Generate random data
std::vector<int> GenerateData(std::size_t bytes) {
    assert(bytes % sizeof(int) == 0);
    std::vector<int> data(bytes / sizeof(int));
    std::iota(data.begin(), data.end(), 0);
    std::shuffle(data.begin(), data.end(), std::mt19937{ std::random_device{}() });
    return data;
}
// Memory mapping
int map_write(int* data, int size, int id){
    char* name = (char*)malloc(100);
    sprintf_s(name, 100, "D:\\data_%d.bin",id);
    // Open the target file (the access flags here are an assumption)
    HANDLE hFile = CreateFile(name, GENERIC_READ | GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) {
        return -1;
    }
    DWORD dwFileSize = size;
    char* rname = (char*)malloc(100);
    sprintf_s(rname, 100, "data_%d.bin", id);
    HANDLE hFileMap = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, dwFileSize, rname);//create file
    if (hFileMap == NULL) {
        return -2;
    }
    PVOID pvFile = MapViewOfFile(hFileMap, FILE_MAP_WRITE, 0, 0, 0);//Acquire the address of file on disk
    if (pvFile == NULL) {
        return -3;
    }
    PSTR pchAnsi = (PSTR)pvFile;
    memcpy(pchAnsi, data, dwFileSize);//memory copy
    return 0;
}
// Multi-thread memory mapping
void Mem2SSD_write(int* data, int size){
    int part = size / sizeof(int) / 4;
    int index[4];
    index[0] = 0;
    index[1] = part;
    index[2] = part * 2;
    index[3] = part * 3;
    std::thread ta(map_write, data + index[0], size / 4, 10);
    std::thread tb(map_write, data + index[1], size / 4, 11);
    std::thread tc(map_write, data + index[2], size / 4, 12);
    std::thread td(map_write, data + index[3], size / 4, 13);
    // Wait for all four writers; destroying a joinable std::thread calls std::terminate
    ta.join();
    tb.join();
    tc.join();
    td.join();
}
int main() {
    const std::size_t kB = 1024;
    const std::size_t MB = 1024 * kB;
    const std::size_t GB = 1024 * MB;
    for (int i = 0; i < 10; ++i) {
        std::vector<int> data = GenerateData(1 * GB);
        auto startTime = std::chrono::high_resolution_clock::now();
        Mem2SSD_write(&data[0], 1 * GB);
        auto endTime = std::chrono::high_resolution_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(endTime - startTime).count();
        std::cout << "1G writing cost: " << duration << " ms" << std::endl;
    }
    return 0;
}
So I'd like to ask: is there a faster way to write huge files in C++? Or why can't I write as fast as CrystalDiskMark reports? How does CrystalDiskMark write?
Any help would be greatly appreciated. Thank you!

First of all, this is not really a C++ question but an OS-related one. To get maximum performance you need OS-specific low-level API calls, which do not exist in the general C++ libraries. Your code clearly uses the Windows API, so at minimum look for a Windows-specific solution.
From the CreateFileW function documentation:
When FILE_FLAG_NO_BUFFERING is combined with FILE_FLAG_OVERLAPPED, the flags give maximum asynchronous performance, because the I/O does not rely on the synchronous operations of the memory manager.
So we need to use the combination of these two flags in the call to CreateFileW, or FILE_NO_INTERMEDIATE_BUFFERING in the call to NtCreateFile.
Extending the file size and the valid data length also takes time, so if the final file size is known at the start, set it up front via NtSetInformationFile with FileEndOfFileInformation or via SetFileInformationByHandle with FileEndOfFileInfo. Then set the valid data length with SetFileValidData or via NtSetInformationFile with FileValidDataLengthInformation. Setting the valid data length requires the SE_MANAGE_VOLUME_NAME privilege to be enabled when the file is initially opened (but not when SetFileValidData is called).
Also check for file compression: if the file is compressed (it will be by default if it is created in a compressed folder), writing is very slow, so disable compression via FSCTL_SET_COMPRESSION.
When we use asynchronous I/O (the fastest way), we do not need to create several dedicated threads. Instead we need to decide how many I/O requests run concurrently. CrystalDiskMark actually runs CdmResource\diskspd\diskspd64.exe for its tests, and this number corresponds to its -o<count> parameter (run diskspd64.exe /? > h.txt to see the parameter list).
Using non-buffered I/O makes the task harder, because there are 3 additional requirements:
Any ByteOffset passed to WriteFile must be a multiple of the sector size.
The Length passed to WriteFile must be an integral multiple of the sector size.
Buffers must be aligned in accordance with the alignment requirement of the underlying device. To obtain this information, call NtQueryInformationFile with FileAlignmentInformation or GetFileInformationByHandleEx with FileAlignmentInfo.
In most situations, page-aligned memory will also be sector-aligned, because the case where the sector size is larger than the page size is rare.
So buffers allocated with the VirtualAlloc function, in sizes that are a multiple of the page size (4,096 bytes), are almost always fine. To keep the code smaller, the concrete test below relies on this assumption.
struct WriteTest
enum { opCompression, opWrite };
WriteTest* pTest;
ULONG opcode;
ULONG offset;
LONGLONG _TotalSize, _BytesLeft;
HANDLE _hFile;
ULONG64 _StartTime;
void* _pData;
REQUEST* _pRequests;
ULONG _BlockSize;
ULONG _ConcurrentRequestCount;
ULONG _dwThreadId;
LONG _dwRefCount;
WriteTest(ULONG BlockSize, ULONG ConcurrentRequestCount)
if (BlockSize & (BlockSize - 1))
_BlockSize = BlockSize, _ConcurrentRequestCount = ConcurrentRequestCount;
_dwRefCount = 1, _hFile = 0, _pRequests = 0, _pData = 0;
_dwThreadId = GetCurrentThreadId();
if (_pData)
VirtualFree(_pData, 0, MEM_RELEASE);
if (_pRequests)
delete [] _pRequests;
if (_hFile)
PostThreadMessageW(_dwThreadId, WM_QUIT, 0, 0);
void Release()
if (!InterlockedDecrement(&_dwRefCount))
delete this;
void AddRef()
void StartWrite()
fvdl.ValidDataLength.QuadPart = _TotalSize;
NTSTATUS status;
if (0 > (status = NtSetInformationFile(_hFile, &iosb, &_TotalSize, sizeof(_TotalSize), FileEndOfFileInformation)) ||
0 > (status = NtSetInformationFile(_hFile, &iosb, &fvdl, sizeof(fvdl), FileValidDataLengthInformation)))
DbgPrint("FileValidDataLength=%x\n", status);
ULONG offset = 0;
ULONG dwNumberOfBytesTransfered = _BlockSize;
_BytesLeft = _TotalSize + dwNumberOfBytesTransfered;
ULONG ConcurrentRequestCount = _ConcurrentRequestCount;
REQUEST* irp = _pRequests;
_StartTime = GetTickCount64();
irp->opcode = opWrite;
irp->pTest = this;
irp->offset = offset;
offset += dwNumberOfBytesTransfered;
} while (--ConcurrentRequestCount);
void FillBuffer(PULONGLONG pu, LONGLONG ByteOffset)
ULONG n = _BlockSize / sizeof(ULONGLONG);
*pu++ = ByteOffset, ByteOffset += sizeof(ULONGLONG);
} while (--n);
void DoWrite(REQUEST* irp)
LONG BlockSize = _BlockSize;
LONGLONG BytesLeft = InterlockedExchangeAddNoFence64(&_BytesLeft, -BlockSize) - BlockSize;
if (0 < BytesLeft)
ByteOffset.QuadPart = _TotalSize - BytesLeft;
PVOID Buffer = RtlOffsetToPointer(_pData, irp->offset);
FillBuffer((PULONGLONG)Buffer, ByteOffset.QuadPart);
NTSTATUS status = NtWriteFile(_hFile, 0, 0, irp, irp, Buffer, BlockSize, &ByteOffset, 0);
if (0 > status)
OnComplete(status, 0, irp);
else if (!BytesLeft)
// write end
ULONG64 time = GetTickCount64() - _StartTime;
WCHAR sz[64];
StrFormatByteSizeW((_TotalSize * 1000) / time, sz, RTL_NUMBER_OF(sz));
DbgPrint("end:%S\n", sz);
static VOID NTAPI _OnComplete(
_In_ NTSTATUS status,
_In_ ULONG_PTR dwNumberOfBytesTransfered,
_Inout_ PVOID Ctx
reinterpret_cast<REQUEST*>(Ctx)->pTest->OnComplete(status, dwNumberOfBytesTransfered, reinterpret_cast<REQUEST*>(Ctx));
VOID OnComplete(NTSTATUS status, ULONG_PTR dwNumberOfBytesTransfered, REQUEST* irp)
if (0 > status)
DbgPrint("OnComplete[%x]: %x\n", irp->opcode, status);
switch (irp->opcode)
case opCompression:
case opWrite:
if (dwNumberOfBytesTransfered == _BlockSize)
DbgPrint(":%I64x != %x\n", dwNumberOfBytesTransfered, _BlockSize);
if (!(_pRequests = new REQUEST[_ConcurrentRequestCount]) ||
!(_pData = VirtualAlloc(0, _BlockSize * _ConcurrentRequestCount, MEM_COMMIT, PAGE_READWRITE)))
ULONGLONG sws = _BlockSize - 1;
_TotalSize = as.QuadPart = (size + sws) & ~sws;
NTSTATUS status = NtCreateFile(&hFile,
poa, &iosb, &as, 0, 0, FILE_OVERWRITE_IF,
if (0 > status)
return status;
_hFile = hFile;
if (0 > (status = RtlSetIoCompletionCallback(hFile, _OnComplete, 0)))
return status;
REQUEST* irp = _pRequests;
irp->pTest = this;
irp->opcode = opCompression;
status = NtFsControlFile(hFile, 0, 0, irp, irp, FSCTL_SET_COMPRESSION, &cmp, sizeof(cmp), 0, 0);
if (0 > status)
OnComplete(status, 0, irp);
return status;
void WriteSpeed(POBJECT_ATTRIBUTES poa, ULONGLONG size, ULONG BlockSize, ULONG ConcurrentRequestCount)
if (0 <= status)
if (WriteTest * pTest = new WriteTest(BlockSize, ConcurrentRequestCount))
status = pTest->Create(poa, size);
if (0 <= status)
MessageBoxW(0, 0, L"Test...", MB_OK|MB_ICONINFORMATION);
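The alignment rules for non-buffered I/O listed earlier in this answer can be checked with a few lines of portable C++. This is only a sketch: the function names are hypothetical, and the sector size and buffer alignment are passed in as parameters on the assumption that both are powers of two (on Windows they would come from the alignment query mentioned above). round_up does the same rounding as the `(size + sws) & ~sws` expression in the test code.

```cpp
#include <cassert>
#include <cstdint>

// True when v is a non-zero power of two.
inline bool is_power_of_two(std::uint64_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// Round size up to the next multiple of a power-of-two alignment.
inline std::uint64_t round_up(std::uint64_t size, std::uint64_t alignment) {
    assert(is_power_of_two(alignment));
    return (size + alignment - 1) & ~(alignment - 1);
}

// Check the three requirements: offset and length are multiples of the
// sector size, and the buffer address satisfies the required alignment.
// The buffer is never dereferenced, only its address is inspected.
inline bool is_valid_unbuffered_request(const void* buffer,
                                        std::uint64_t offset,
                                        std::uint64_t length,
                                        std::uint64_t sector_size,
                                        std::uint64_t buffer_alignment) {
    assert(is_power_of_two(sector_size));
    assert(is_power_of_two(buffer_alignment));
    const std::uint64_t address = reinterpret_cast<std::uintptr_t>(buffer);
    return length != 0 &&
           (offset & (sector_size - 1)) == 0 &&
           (length & (sector_size - 1)) == 0 &&
           (address & (buffer_alignment - 1)) == 0;
}
```

Rounding the total file size up front, as the test does, means every block, including the last one, has a length that passes the check.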

These are the suggestions that come to my mind:
stop all running processes that are using the disk, in particular
disable Windows Defender realtime protection (or other anti virus/malware)
disable pagefile
use Windows Resource Monitor to find processes reading or writing to your disk
make sure you write continuous sectors on disk
don't take into account file opening and closing times
do not use multithreading (your disk is using DMA so the CPU won't matter)
write data that is in RAM (obviously)
be sure to disable all debugging features when building (build a release)
if using an M.2 PCIe disk (which seems to be your case), make sure other PCIe devices aren't taking PCIe lanes away from your disk (the CPU has a limited number of lanes, and so does the motherboard)
don't run the test from your IDE
disable Windows file indexing
Finally, you can find good hints on how to code fast writes in C/C++ in this question's thread: Writing a binary file in C++ very fast

One area that might give you an improvement is to keep your threads running constantly, each reading from a queue.
At the moment, every time you write you spawn 4 threads (which is slow), and they are destroyed at the end of the function. You'll see a speedup of at least the CPU time your function spends on that if you spawn the threads once at the start and have each one read from its own queue in an infinite loop.
Each thread simply checks, after a SMALL delay, whether there is anything in its queue; if there is, it writes all of it. Your only remaining issue is making sure the order of the data is maintained.
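A minimal sketch of such a long-lived worker is shown here. The class name Worker and its submit method are hypothetical, and the jobs are plain callables rather than real disk writes. Note one deliberate change from the description: instead of polling after a small delay, the thread sleeps on a condition variable and is woken when a job arrives.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical long-lived worker: one thread draining its own job queue.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        thread_.join();  // queued jobs are finished before the thread exits
    }
    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    // Queue a job; jobs run in submission order on the worker thread.
    void submit(std::function<void()> job) {
        assert(job);  // an empty std::function would throw when called
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // only reached when stopping
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so submit() is never blocked by a job
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the other members exist when it starts
};
```

With one queue per worker, order is preserved within each worker; ordering across several workers still has to be handled by how the data is split between them.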


Writing to read only address with kernel driver c++

I'm trying to write to the memory of a user-mode process from a kernel driver.
The address I'm trying to write to is read-only, and I want to write 4 bytes there.
If I change the page protection in the process with VirtualProtectEx, it works and the memory is written, but that happens at user-mode level. My intention is to change the protection from kernel mode: make the page READWRITE, then change it back to READ, all from kernel space.
What I tried gives me a BSOD (blue screen of death) with the error KMODE_EXCEPTION_NOT_HANDLED.
I can't work out what in my code triggers this BSOD. My PC has very limited specs, so I can't debug in a VM to find out.
Below is the code that works but relies on user mode, followed by the code that does not work for me in kernel space.
Here is the code that works:
void dispatch::handler(void* info_struct)
PINFO_STRUCT info = (PINFO_STRUCT)info_struct;
if (info->code == CODE_READ_MEMORY)
PEPROCESS target_process = NULL;
if (NT_SUCCESS(PsLookupProcessByProcessId((HANDLE)info->process_id, &target_process)))
memory::read_memory(target_process, (void*)info->address, &info->buffer, info->size);
DbgPrintEx(0, 0, "[TEST]: Read Memory\n");
else if (info->code == CODE_WRITE_MEMORY)
PEPROCESS target_process = NULL;
if (NT_SUCCESS(PsLookupProcessByProcessId((HANDLE)info->process_id, &target_process)))
memory::write_memory(target_process, &info->buffer, (void*)info->address, info->size);
DbgPrintEx(0, 0, "[TEST]: Write Memory\n");
NTSTATUS memory::write_memory(PEPROCESS target_process, void* source, void* target, size_t size)
if (!target_process) { return STATUS_INVALID_PARAMETER; }
size_t bytes = 0;
NTSTATUS status = MmCopyVirtualMemory(IoGetCurrentProcess(), source, target_process, target, size, KernelMode, &bytes);
if (!NT_SUCCESS(status) || !bytes)
return status;
int main()
{
    DWORD oldprt;
    ULONG writeTest1 = 3204497152;
    VirtualProtectEx(ProcManager::hProcess, (PVOID)(testAddr), 4, PAGE_READWRITE, &oldprt);
    driver_control::write_memory(process_id, testAddr, writeTest1);
    VirtualProtectEx(ProcManager::hProcess, (PVOID)(testAddr), 4, PAGE_READONLY, &oldprt);
    return 0;
}
Now I want to stop using VirtualProtectEx and change the page protection to READWRITE from kernel space, so I added this to the dispatch::handler function:
else if (info->code == CODE_WRITE_MEMORY)
PEPROCESS target_process = NULL;
if (NT_SUCCESS(PsLookupProcessByProcessId((HANDLE)info->process_id, &target_process)))
PMDL Mdl = IoAllocateMdl((void*)info->address, info->size, FALSE, FALSE, NULL);
if (!Mdl)
return false;
// Locking and mapping memory with RW-rights:
MmProbeAndLockPages(Mdl, KernelMode, IoReadAccess);
PVOID Mapping = MmMapLockedPagesSpecifyCache(Mdl, KernelMode, MmNonCached, NULL, FALSE, NormalPagePriority);
MmProtectMdlSystemAddress(Mdl, PAGE_READWRITE);
memory::write_memory(target_process, &info->buffer, (void*)info->address, info->size);
// Resources freeing:
MmUnmapLockedPages(Mapping, Mdl);
DbgPrintEx(0, 0, "[TEST]: Write Read Only Memory\n");
The code I added causes the BSOD, but I cannot understand why. What am I doing wrong here?
Here is the info struct, in case it helps:
#define CODE_READ_MEMORY 0x1
typedef struct _INFO_STRUCT
{
    ULONG code;
    ULONG process_id;
    ULONG address;
    ULONG buffer;
    ULONG size;
} INFO_STRUCT, *PINFO_STRUCT;
Any suggestions on solving this problem?

How to record continuous raw audio data into a circular buffer with C++ on Windows 10?

Since Windows Multimedia turned out to be utterly incapable of recording continuous audio, I got the hint to use Windows Core Audio. There is sort of a manual here, but I can't figure out how to write the loads of overhead code to get the recording working. Can anyone provide a complete, minimal implementation of continuous audio recording to a circular buffer?
So far I am stuck at the code below not getting past the line pEnumerator->GetDefaultAudioEndpoint(eRender, eConsole, &pDevice); because pEnumerator remains nullptr.
#include <Windows.h>
#include <Audioclient.h>
#include <Mmdeviceapi.h>
#define REFTIMES_PER_SEC 10000000
int main() {
UINT32 bufferFrameCount;
UINT32 numFramesAvailable;
IMMDeviceEnumerator* pEnumerator = NULL;
IMMDevice* pDevice = NULL;
IAudioClient* pAudioClient = NULL;
IAudioCaptureClient* pCaptureClient = NULL;
UINT32 packetLength = 0;
BYTE* pData;
DWORD flags;
CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL, __uuidof(IMMDeviceEnumerator), (void**)&pEnumerator);
pEnumerator->GetDefaultAudioEndpoint(eRender, eConsole, &pDevice);
pDevice->Activate(__uuidof(IAudioClient), CLSCTX_ALL, NULL, (void**)&pAudioClient);
pAudioClient->Initialize(AUDCLNT_SHAREMODE_SHARED, AUDCLNT_STREAMFLAGS_LOOPBACK, hnsRequestedDuration, 0, pwfx, NULL);
pAudioClient->GetBufferSize(&bufferFrameCount); // Get the size of the allocated buffer.
pAudioClient->GetService(__uuidof(IAudioCaptureClient), (void**)&pCaptureClient);
// Calculate the actual duration of the allocated buffer.
REFERENCE_TIME hnsActualDuration = (double)REFTIMES_PER_SEC* bufferFrameCount / pwfx->nSamplesPerSec;
pAudioClient->Start(); // Start recording.
// Each loop fills about half of the shared buffer.
while(true) {
// Sleep for half the buffer duration.
while(packetLength != 0) {
// Get the available data in the shared buffer.
pCaptureClient->GetBuffer(&pData, &numFramesAvailable, &flags, NULL, NULL);
pData = NULL; // Tell CopyData to write silence.
// Copy the available capture data to the audio sink.
//hr = pMySink->CopyData(pData, numFramesAvailable, &bDone);
return 0;
EDIT (24.07.2021):
Here is an update of my code for troubleshooting:
#include <Windows.h>
#include <Audioclient.h>
#include <Mmdeviceapi.h>
#include <chrono>
class Clock {
typedef chrono::high_resolution_clock clock;
chrono::time_point<clock> t;
Clock() { start(); }
void start() { t = clock::now(); }
double stop() const { return chrono::duration_cast<chrono::duration<double>>(clock::now()-t).count(); }
const uint base = 4096;
const uint sample_rate = 48000; // must be supported by microphone
const uint sample_size = 1*base; // must be a power of 2
const uint bandwidth = 5000; // must be <= sample_rate/2
float* wave = new float[sample_size]; // circular buffer
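void fill(float* const wave, const float* const buffer, int offset) {
    // Shift the existing samples back by offset (the last valid index is sample_size-1).
    for(int i=sample_size-1; i>=offset; i--) {
        wave[i] = wave[i-offset];
    }
    // Store the new frames newest-first, averaging the two channels.
    for(int i=0; i<offset; i++) {
        const uint p = offset-1-i;
        wave[i] = 0.5f*(buffer[2*p]+buffer[2*p+1]); // left and right channels
    }
}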
int main() {
for(uint i=0; i<sample_size; i++) wave[i] = 0.0f;
Clock clock;
#define REFTIMES_PER_SEC 10000000
UINT32 bufferFrameCount;
UINT32 numFramesAvailable;
IMMDeviceEnumerator* pEnumerator = NULL;
IMMDevice* pDevice = NULL;
IAudioClient* pAudioClient = NULL;
IAudioCaptureClient* pCaptureClient = NULL;
UINT32 packetLength = 0;
BYTE* pData;
DWORD flags;
CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL, __uuidof(IMMDeviceEnumerator), (void**)&pEnumerator);
pEnumerator->GetDefaultAudioEndpoint(eRender, eConsole, &pDevice);
pDevice->Activate(__uuidof(IAudioClient), CLSCTX_ALL, NULL, (void**)&pAudioClient);
println(pwfx->wFormatTag);// 65534
println(WAVE_FORMAT_PCM);// 1
println(pwfx->nChannels);// 2
println((uint)pwfx->nSamplesPerSec);// 48000
println(pwfx->wBitsPerSample);// 32
println(pwfx->nBlockAlign);// 8
println(pwfx->wBitsPerSample*pwfx->nChannels/8);// 8
println((uint)pwfx->nAvgBytesPerSec);// 384000
println((uint)(pwfx->nBlockAlign*pwfx->nSamplesPerSec*pwfx->nChannels));// 768000
println(pwfx->cbSize);// 22
pAudioClient->Initialize(AUDCLNT_SHAREMODE_SHARED, AUDCLNT_STREAMFLAGS_LOOPBACK, hnsRequestedDuration, 0, pwfx, NULL);
pAudioClient->GetBufferSize(&bufferFrameCount); // Get the size of the allocated buffer.
pAudioClient->GetService(__uuidof(IAudioCaptureClient), (void**)&pCaptureClient);
// Calculate the actual duration of the allocated buffer.
//REFERENCE_TIME hnsActualDuration = (double)REFTIMES_PER_SEC* bufferFrameCount / pwfx->nSamplesPerSec;
pAudioClient->Start(); // Start recording.
while(running) {
pCaptureClient->GetNextPacketSize(&packetLength); // packetLength and numFramesAvailable are either 0 or 480
pCaptureClient->GetBuffer(&pData, &numFramesAvailable, &flags, NULL, NULL);
const int offset = (uint)numFramesAvailable;
if(offset>0) {
fill(wave, (float*)pData, offset); // here I add pData to the circular buffer "wave"
while(packetLength != 0) {
pCaptureClient->GetBuffer(&pData, &numFramesAvailable, &flags, NULL, NULL); // Get the available data in the shared buffer.
pData = NULL; // Tell CopyData to write silence.
You're not calling CoInitializeEx, so all COM calls will fail.
You should also be testing all calls to see if they return an error.
To address the questions posed in the comments:
I believe that if you want to operate the endpoint in shared mode, you have to use the parameters returned by GetMixFormat. This means that:
you are limited to the one sample rate (unless you write code to perform a conversion, which is a non-trivial task)
if you want the samples as floats, you will have to convert them yourself
To write code that runs on all machines, you must cater for whatever the mix format throws at you. This might be:
16 bit integers
24 bit integers (nBlockAlign = 3)
24 bit integers in 32 bit containers (nBlockAlign = 4)
32 bit integers
32 bit floating point (rare)
64 bit floating point (unheard of, in my experience)
The samples will be in the native byte order of the machine your code is running on, and are interleaved.
So, case out on the various parameters in pwfx and write the relevant code for each sample format you want to support.
Assuming you want your floats to be normalised to -1 .. +1, and 2-channel input data, you might do this for 16 bit integers, for example:
const int16_t *inbuf = (const int16_t *) pData;
float *outbuf = ...;
for (int i = 0; i < numFramesAvailable * 2; ++i)
{
    int16_t sample = *inbuf++;
    *outbuf++ = (float) (sample * (1.0 / 32767));
}
Note that I avoid a (slow) floating point division by multiplying by the reciprocal (the compiler will pre-calculate 1.0 / 32767).
I'll leave the rest to you.
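As one example of "the rest", here is a sketch for packed 24-bit integers (3 bytes per sample). It assumes little-endian byte order, which is the native order on Windows PCs. The function name pcm24_to_float is made up, and it scales by 1 / 2^23, which is a choice rather than a requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Convert packed 24-bit little-endian PCM samples to floats in -1 .. +1.
// sample_count is the number of samples (frames times channels), not bytes.
inline void pcm24_to_float(const unsigned char* in, float* out, std::size_t sample_count) {
    assert(sample_count == 0 || (in != nullptr && out != nullptr));
    const float scale = 1.0f / 8388608.0f;  // 1 / 2^23
    for (std::size_t i = 0; i < sample_count; ++i) {
        const unsigned char* p = in + 3 * i;
        // Assemble the 24-bit value, least significant byte first.
        std::int32_t v = std::int32_t(p[0]) |
                         (std::int32_t(p[1]) << 8) |
                         (std::int32_t(p[2]) << 16);
        // Sign-extend: if bit 23 is set, the sample is negative.
        if (v & 0x800000) v -= 0x1000000;
        out[i] = v * scale;
    }
}
```

The samples stay interleaved, so for 2-channel data sample_count would be numFramesAvailable * 2, just as in the 16-bit loop.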
You could use this audio library instead. It's way easier to get up and running than trying to interface with the platform-specific SDKs:
Also, while removing the sleep might not help in your example, you should never call sleep, lock a mutex, or allocate memory during audio processing. The delay introduced by those is completely arbitrary compared to the short buffer times, so it will always create problems for you.

Bluetooth LE: setting characteristic to byte array sends wrong values

I am using Bluetoothleapis.h to communicate with a custom Bluetooth Low Energy device.
The device is setup the following way:
Custom GATT service
Characteristic#1 Read/Write (expects 3 bytes)
Characteristic#2 Read/Notify (returns 1 byte)
I am able get proper values from characteristic#2. However, when I try to send data to characteristic#1, the device receives weird data.
The characteristic is responsible for 3 parameters of a real-life object (imagine a light with intensity, color, etc). (0,0,0) should respond to the "light" being off, but if I send the (0,0,0), I can see that the device receives something else (I cannot tell what exactly, but it is not off). The state does not seem to change no matter what values I send.
I have tried alternating between write and write-no-response, both produce the same result.
GetCharacteristicValue interestingly returns a charValueDataSize of 8, even though the characteristic is known to accept only 3 bytes. Coincidentally, the size for the 1-byte read-only characteristic is 9, for some reason.
I have tried limiting the size of the WriteValue to only 3 bytes, but in this case I get an invalid argument error. Answers elsewhere on StackOverflow have indicated that I need to use the one I get from GetCharacteristicValue, and transfer my data into there.
Given the fact that the real object's state does not change no matter which values are sent, I suspect that the problem is somewhere with the way I set up the byte array to transfer the data.
Furthermore, calling GetCharacteristicValue even after setting it returns an empty array.
I am not sure what values are actually being sent, and I lack the hardware to track them via Wireshark.
DWORD WriteValueToCharacteristic(__in const HANDLE deviceHandle,
__in const CharacteristicData* pCharData,
__in const UCHAR* writeBuffer,
__in const USHORT bufferSize,
__in const BOOL noResponse )
PBTH_LE_GATT_CHARACTERISTIC pCharacteristic = pCharData->pCharacteristic;
USHORT charValueDataSize;
hr = BluetoothGATTGetCharacteristicValue
Log(L"BluetoothGATTSetCharacteristicValue returned error %d", hr);
return -1;
GetProcessHeap(), HEAP_ZERO_MEMORY, charValueDataSize + sizeof(BTH_LE_GATT_CHARACTERISTIC_VALUE)
if (pWriteValue == NULL)
Log(L"Out of memory.");
return -1;
hr = BluetoothGATTGetCharacteristicValue
memcpy(pWriteValue->Data, writeBuffer, bufferSize);
hr = BluetoothGATTSetCharacteristicValue
if (hr != S_OK)
Log(L"BluetoothGATTSetCharacteristicValue returned error %d", hr);
return -1;
HeapFree(GetProcessHeap(), 0, pWriteValue);
SetCharacteristicValue returns S_OK, producing no errors.
Both reading and writing to the characteristic work fine when using a BLE app on Android.
Update 1
@Shubham pointed out it might be an endianness issue, so I tried to substitute memcpy for the following:
int j = 0;
int i = charValueDataSize - 1;
while (j < bufferSize)
{
    pWriteValue->Data[i] = writeBuffer[j];
    i--;
    j++;
}
However, nothing changed.
Update 2
I have incorporated the changes as per emil's suggestion, and it worked! Posting the full code in case somebody else experiences the same issue.
Incidentally, even though the characteristic is marked as Writable: true, Writable-no-response: false, I need to set the flags to no-response in order for the values to get sent.
DWORD WriteValueToCharacteristic(__in const HANDLE deviceHandle, __in const CharacteristicData* pCharData, __in const UCHAR* writeBuffer, __in const USHORT bufferSize, __in const BOOL noResponse)
PBTH_LE_GATT_CHARACTERISTIC pCharacteristic = pCharData->pCharacteristic;
USHORT charValueDataSize = 512;
GetProcessHeap(), HEAP_ZERO_MEMORY, charValueDataSize + sizeof(BTH_LE_GATT_CHARACTERISTIC_VALUE)
if (pWriteValue == NULL)
Log(L"Out of memory.");
return -1;
hr = BluetoothGATTGetCharacteristicValue
if (bufferSize > pWriteValue->DataSize)
if(pWriteValue->DataSize == 0)
pWriteValue->DataSize = bufferSize;
// after the first write, DataSize stays as 3
//pWriteValue->DataSize here is 3, as expected
//buffer size is also 3
memcpy(pWriteValue->Data, writeBuffer, bufferSize);
hr = BluetoothGATTSetCharacteristicValue
if (hr != S_OK)
Log(L"BluetoothGATTSetCharacteristicValue returned error %d", hr);
HeapFree(GetProcessHeap(), 0, pWriteValue);
return -1;
HeapFree(GetProcessHeap(), 0, pWriteValue);
My suggestion is that you first set charValueDataSize to 512 + sizeof(BTH_LE_GATT_CHARACTERISTIC_VALUE) (maximum possible), and skip the initial read that would get the size. Then check pWriteValue->DataSize to get the actual size after a successful read. Also make sure you free your memory even in case of error.
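The buffer handling suggested here can be sketched without the Bluetooth API. Everything named below is hypothetical: GattValue is only a stand-in that mirrors the DataSize-plus-Data[1] layout of BTH_LE_GATT_CHARACTERISTIC_VALUE, and make_write_value is an illustrative helper, not a Windows function. The point is to allocate for the maximum size once, record the real payload size, and let a smart pointer free the memory on every path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical stand-in for a header followed by a variable-length payload.
struct GattValue {
    unsigned long DataSize;
    unsigned char Data[1];
};

// Releases the calloc'ed block, including on early-return error paths.
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};
using GattValuePtr = std::unique_ptr<GattValue, FreeDeleter>;

constexpr std::size_t kMaxAttributeSize = 512;  // maximum size used in the answer

// Allocate header + maximum payload, then copy in the bytes to write.
// Returns nullptr for an empty or oversized payload or on allocation failure.
inline GattValuePtr make_write_value(const unsigned char* payload, std::size_t size) {
    if (payload == nullptr || size == 0 || size > kMaxAttributeSize) return nullptr;
    void* raw = std::calloc(1, sizeof(GattValue) + kMaxAttributeSize);
    if (raw == nullptr) return nullptr;
    GattValuePtr value(static_cast<GattValue*>(raw));
    value->DataSize = static_cast<unsigned long>(size);
    assert(value->DataSize <= kMaxAttributeSize);
    // Write through the raw block so the copy is not limited by the Data[1] declaration.
    std::memcpy(static_cast<unsigned char*>(raw) + offsetof(GattValue, Data), payload, size);
    return value;
}
```

In the real code, the size reported by a successful read would be checked the same way before the write call is made.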

SHCreateStreamOnFileEx on files larger than 2**32 bytes

I'm getting an IStream for a file using SHCreateStreamOnFileEx, but its Read() method appears to misbehave on extremely large files when the new position of the seek pointer is 2 ** 32 bytes or further into the file.
ISequentialStream::Read's documentation says:
This method adjusts the seek pointer by the actual number of bytes read.
This is the same behaviour as read(2) and fread(3) on all platforms I'm aware of.
But with these streams, this isn't the actual behaviour I see in some cases:
Seek(2 ** 32 - 2, SEEK_SET, &pos), Read(buf, 1, &bytesRead), Seek(0, MOVE_CUR, &pos) → bytesRead == 1 and pos == 2 ** 32 - 1, as expected.
Seek(2 ** 32 - 1, SEEK_SET, &pos), Read(buf, 1, &bytesRead), Seek(0, MOVE_CUR, &pos) → bytesRead == 1, but pos == (2 ** 32 - 1) + 4096, which is incorrect. This means that any subsequent reads (without another Seek to fix the cursor position) read the wrong data, and my application doesn't work!
Am I “holding it wrong”? Is there some flag I need to set to make this class behave properly? Or is this a bug in Shlwapi.dll?
The code below reproduces this problem for me. (Set OFFSET = WORKS to see the successful case.)
#include "stdafx.h"
static const int64_t TWO_THIRTY_TWO = 4294967296LL;
static const int64_t WORKS = TWO_THIRTY_TWO - 2LL;
static const int64_t FAILS = TWO_THIRTY_TWO - 1LL;
static const int64_t OFFSET = FAILS;
static void checkPosition(CComPtr< IStream > fileStream, ULONGLONG expectedPosition)
move.QuadPart = 0;
HRESULT hr = fileStream->Seek(move, SEEK_CUR, &newPosition);
ULONGLONG error = newPosition.QuadPart - expectedPosition;
ASSERT(error == 0);
int main()
const wchar_t *path = /* path to a file larger than 2**32 bytes */ L"C:\\users\\wjt\\Desktop\\eos-eos3.1-amd64-amd64.170216-122002.base.img";
CComPtr< IStream > fileStream;
hr = SHCreateStreamOnFileEx(path, STGM_READ, FILE_ATTRIBUTE_NORMAL, FALSE, NULL, &fileStream);
// Advance
move.QuadPart = OFFSET;
hr = fileStream->Seek(move, SEEK_SET, &newPosition);
ASSERT(newPosition.QuadPart == OFFSET);
// Check position
checkPosition(fileStream, OFFSET);
// Read
char buf[1];
ULONG bytesRead = 0;
hr = fileStream->Read(buf, 1, &bytesRead);
ASSERT(bytesRead == 1);
// Check position: this assertion fails if the Read() call moves the cursor
// across the 2**32 byte boundary
checkPosition(fileStream, OFFSET + 1);
return 0;
This really is a Windows bug. I tested on several Windows versions, including the latest SHCore.DLL version 10.0.14393.0 x64. A simple way to reproduce it:
void BugDemo(PCWSTR path)
ULONG dwBytesRet;
// I don't want to actually take up disk space
if (DeviceIoControl(hFile, FSCTL_SET_SPARSE, NULL, 0, NULL, 0, &dwBytesRet, NULL))
static FILE_END_OF_FILE_INFO eof = { 0, 2 };// 8GB
if (SetFileInformationByHandle(hFile, FileEndOfFileInfo, &eof, sizeof(eof)))
IStream* pstm;
if (!SHCreateStreamOnFileEx(path, STGM_READ|STGM_SHARE_DENY_NONE, 0,FALSE, NULL, &pstm))
LARGE_INTEGER pos = { 0xffffffff };
if (!pstm->Seek(pos, STREAM_SEEK_SET, &newpos) && !pstm->Read(&newpos, 1, &dwBytesRet))
pos.QuadPart = 0;
if (!pstm->Seek(pos, STREAM_SEEK_CUR, &newpos))
DbgPrint("newpos={%I64x}\n", newpos.QuadPart);//newpos={100000fff}
// close and delete
void BugDemo()
if (ULONG len = GetTempPath(RTL_NUMBER_OF(path), path))
if (len + 16 < MAX_PATH)
swprintf(path + len, L"%08x%08x", ~ft.dwLowDateTime, ft.dwHighDateTime);
I traced virtual long CFileStream::Seek(LARGE_INTEGER, ULONG, ULARGE_INTEGER* ); under a debugger and can confirm that this function is not designed to work with files larger than 4GB.
To be more exact about why the offset is 0x100000FFF: CFileStream uses an internal read buffer of 0x1000 bytes. When you ask to read 1 byte from offset 0xFFFFFFFF, it actually reads 0x1000 bytes into the buffer, and the file offset becomes 0x100000FFF. When you then call Seek(0, STREAM_SEEK_CUR, &newpos), CFileStream calls SetFilePointer(hFile, 1-0x1000, 0/*lpDistanceToMoveHigh*/, FILE_CURRENT)
(1 is the internal position in the buffer, because we read 1 byte, minus the buffer size 0x1000). Ignoring overflow, the result would be (0x100000FFF + (1 - 0x1000)) == 0x100000000, but
read the SetFilePointer documentation:
If lpDistanceToMoveHigh is NULL and the new file position does not fit in a 32-bit value, the function fails and returns INVALID_SET_FILE_POINTER.
As a result SetFilePointer fails (returns INVALID_SET_FILE_POINTER), but CFileStream does not even check for this. It then calls SetFilePointerEx(hFile, 0, &newpos, FILE_CURRENT) and returns newpos to you, which is still 0x100000FFF.
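The arithmetic can be replayed with a toy model. This is not the actual CFileStream code: FakeFile, BufferedStream, and set_file_pointer_32 are hypothetical names that only imitate the behaviour described above, namely a 0x1000-byte read-ahead and a 32-bit relative seek whose failure is ignored.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kInvalidSetFilePointer = 0xFFFFFFFFu;  // stands in for INVALID_SET_FILE_POINTER
constexpr std::int32_t kBufferSize = 0x1000;                   // read-ahead size described above

// Hypothetical file that only tracks its 64-bit position.
struct FakeFile {
    std::uint64_t position = 0;
};

// Models a relative seek with no high part: it fails and leaves the
// position untouched when the new position does not fit in 32 bits.
inline std::uint32_t set_file_pointer_32(FakeFile& f, std::int32_t distance) {
    const std::int64_t target = static_cast<std::int64_t>(f.position) + distance;
    if (target < 0 || target > 0xFFFFFFFFLL) return kInvalidSetFilePointer;
    f.position = static_cast<std::uint64_t>(target);
    return static_cast<std::uint32_t>(target);
}

// Models the buffered stream: reading one byte pulls in a whole buffer,
// and asking for the current position tries to rewind the unread part.
struct BufferedStream {
    FakeFile file;
    std::int32_t ahead = 0;  // bytes read from the file but not yet consumed

    void seek_set(std::uint64_t pos) { file.position = pos; ahead = 0; }
    void read_one_byte() { file.position += kBufferSize; ahead = kBufferSize - 1; }
    std::uint64_t tell_unchecked() {
        assert(ahead >= 0);
        if (ahead != 0) {
            set_file_pointer_32(file, -ahead);  // result ignored, as in the bug
            ahead = 0;
        }
        return file.position;
    }
};
```

Starting at 0xFFFFFFFE the rewind lands on 0xFFFFFFFF and succeeds; starting at 0xFFFFFFFF the rewind target is 0x100000000, the 32-bit seek fails, and the reported position stays at 0x100000FFF.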

Terrible Serial Port / USB code (C++) - suggestions for fixes?

I don't have much experience with Serial I/O, but have recently been tasked with fixing some highly flawed serial code, because the original programmer has left the company.
The application is a Windows program that talks to a scientific instrument serially via a virtual COMM port running on USB. Virtual COMM port USB drivers are provided by FTDI, since they manufacture the USB chip we use on the instrument.
The serial code is in an unmanaged C++ DLL, which is shared by both our old C++ software, and our new C# / .Net (WinForms) software.
There are two main problems:
Fails on many XP systems
When the first command is sent to the instrument, there's no response. When you issue the next command, you get the response from the first one.
Here's a typical usage scenario (full source for methods called is included below):
char szBuf [256];
CloseConnection ();
if (OpenConnection ())
ClearBuffer ();
// try to get a firmware version number
WriteChar ((char) 'V');
BOOL versionReadStatus1 = ReadString (szBuf, 100);
On a failing system, the ReadString call will never receive any serial data, and times out. But if we issue another, different command, and call ReadString again, it will return the response from the first command, not the new one!
But this only happens on a large subset of Windows XP systems - and never on Windows 7. As luck would have it, our XP dev machines worked OK, so we did not see the problem until we started beta testing. But I can also reproduce the problem by running an XP VM (VirtualBox) on my XP dev machine. Also, the problem only occurs when using the DLL with the new C# version - works fine with the old C++ app.
This seemed to be resolved when I added a Sleep(21) to the low level BytesInQue method before calling ClearCommError, but this exacerbated the other problem - CPU usage. Sleeping for less than 21 ms would make the failure mode reappear.
High CPU usage
When doing serial I/O CPU use is excessive - often above 90%. This happens with both the new C# app and the old C++ app, but is much worse in the new app. Often makes the UI very non-responsive, but not always.
Here's the code for our Port.cpp class, in all its terrible glory. Sorry for the length, but this is what I'm working with. The most important methods are probably OpenConnection, ReadString, ReadChar, and BytesInQue.
// Port.cpp: Implements the CPort class, which is
// the class that controls the serial port.
// Copyright (C) 1997-1998 Microsoft Corporation
// All rights reserved.
// This source code is only intended as a supplement to the
// Broadcast Architecture Programmer's Reference.
// For detailed information regarding Broadcast
// Architecture, see the reference.
#include <windows.h>
#include <stdio.h>
#include <assert.h>
#include "port.h"
// Construction code to initialize the port handle to null.
m_hDevice = (HANDLE)0;
// default parameters
m_uPort = 1;
m_uBaud = 9600;
m_uDataBits = 8;
m_uParity = 0;
m_uStopBits = 0; // = 1 stop bit
m_chTerminator = '\n';
m_bCommportOpen = FALSE;
m_nTimeOut = 50;
m_nBlockSizeMax = 2048;
// Destruction code to close the connection if the port
// handle was valid.
if (m_hDevice)
// Open a serial communication port for writing short
// one-byte commands, that is, overlapped data transfer
// is not necessary.
BOOL CPort::OpenConnection()
char szPort[64];
m_bCommportOpen = FALSE;
// Build the COM port string as "COMx" where x is the port.
if (m_uPort > 9)
wsprintf(szPort, "\\\\.\\COM%d", m_uPort);
wsprintf(szPort, "COM%d", m_uPort);
// Open the serial port device.
m_hDevice = CreateFile(szPort,
NULL, // No security attributes
if (m_hDevice == INVALID_HANDLE_VALUE)
SaveLastError ();
m_hDevice = (HANDLE)0;
return FALSE;
return SetupConnection(); // After the port is open, set it up.
} // end of OpenConnection()
// Configure the serial port with the given settings.
// The given settings enable the port to communicate
// with the remote control.
BOOL CPort::SetupConnection(void)
DCB dcb; // The DCB structure differs between Win16 and Win32.
dcb.DCBlength = sizeof(DCB);
// Retrieve the DCB of the serial port.
BOOL bStatus = GetCommState(m_hDevice, (LPDCB)&dcb);
if (bStatus == 0)
SaveLastError ();
return FALSE;
// Assign the values that enable the port to communicate.
dcb.BaudRate = m_uBaud; // Baud rate
dcb.ByteSize = m_uDataBits; // Data bits per byte, 4-8
dcb.Parity = m_uParity; // Parity: 0-4 = no, odd, even, mark, space
dcb.StopBits = m_uStopBits; // 0,1,2 = 1, 1.5, 2
dcb.fBinary = TRUE; // Binary mode, no EOF check : Must use binary mode in NT
dcb.fParity = dcb.Parity == 0 ? FALSE : TRUE; // Enable parity checking
dcb.fOutX = FALSE; // No XON/XOFF flow control on output
dcb.fInX = FALSE; // No XON/XOFF flow control on input
dcb.fNull = FALSE; // Disable null stripping - want nulls
dcb.fOutxCtsFlow = FALSE;
dcb.fOutxDsrFlow = FALSE;
dcb.fDsrSensitivity = FALSE;
dcb.fDtrControl = DTR_CONTROL_ENABLE;
dcb.fRtsControl = RTS_CONTROL_DISABLE ;
// Configure the serial port with the assigned settings.
// Return TRUE if the SetCommState call was not equal to zero.
bStatus = SetCommState(m_hDevice, &dcb);
if (bStatus == 0)
SaveLastError ();
return FALSE;
DWORD dwSize;
COMMPROP *commprop;
DWORD dwError;
dwSize = sizeof(COMMPROP) + sizeof(MODEMDEVCAPS) ;
commprop = (COMMPROP *)malloc(dwSize);
memset(commprop, 0, dwSize);
if (!GetCommProperties(m_hDevice, commprop))
dwError = GetLastError();
m_bCommportOpen = TRUE;
return TRUE;
void CPort::SaveLastError ()
DWORD dwLastError = GetLastError ();
LPVOID lpMsgBuf;
(LPTSTR) &lpMsgBuf,
strcpy (m_szLastError,(LPTSTR)lpMsgBuf);
// Free the buffer.
LocalFree( lpMsgBuf );
void CPort::SetTimeOut (int nTimeOut)
m_nTimeOut = nTimeOut;
// Close the opened serial communication port.
void CPort::CloseConnection(void)
if (m_hDevice != NULL &&
CloseHandle(m_hDevice); ///that the port has been closed.
m_hDevice = (HANDLE)0;
// Set the device handle to NULL to confirm
m_bCommportOpen = FALSE;
int CPort::WriteChars(char * psz)
int nCharWritten = 0;
while (*psz)
nCharWritten +=WriteChar(*psz);
return nCharWritten;
// Write a one-byte value (char) to the serial port.
int CPort::WriteChar(char c)
DWORD dwBytesInOutQue = BytesInOutQue ();
if (dwBytesInOutQue > m_dwLargestBytesInOutQue)
m_dwLargestBytesInOutQue = dwBytesInOutQue;
static char szBuf[2];
szBuf[0] = c;
szBuf[1] = '\0';
DWORD dwBytesWritten;
DWORD dwTimeOut = m_nTimeOut; // 500 milli seconds
DWORD start, now;
start = GetTickCount();
now = GetTickCount();
if ((now - start) > dwTimeOut )
strcpy (m_szLastError, "Timed Out");
return 0;
WriteFile(m_hDevice, szBuf, 1, &dwBytesWritten, NULL);
while (dwBytesWritten == 0);
OutputDebugString(TEXT(strcat(szBuf, "\r\n")));
return dwBytesWritten;
int CPort::WriteChars(char * psz, int n)
DWORD dwBytesWritten;
WriteFile(m_hDevice, psz, n, &dwBytesWritten, NULL);
return dwBytesWritten;
// Return number of bytes in RX queue
DWORD CPort::BytesInQue ()
DWORD dwErrorFlags;
DWORD dwLength;
// check number of bytes in queue
ClearCommError(m_hDevice, &dwErrorFlags, &ComStat ) ;
dwLength = ComStat.cbInQue;
return dwLength;
DWORD CPort::BytesInOutQue ()
DWORD dwErrorFlags;
DWORD dwLength;
// check number of bytes in queue
ClearCommError(m_hDevice, &dwErrorFlags, &ComStat );
dwLength = ComStat.cbOutQue ;
return dwLength;
int CPort::ReadChars (char* szBuf, int nMaxChars)
if (BytesInQue () == 0)
return 0;
DWORD dwBytesRead;
ReadFile(m_hDevice, szBuf, nMaxChars, &dwBytesRead, NULL);
return (dwBytesRead);
// Read a one-byte value (char) from the serial port.
int CPort::ReadChar (char& c)
static char szBuf[2];
szBuf[0] = '\0';
szBuf[1] = '\0';
if (BytesInQue () == 0)
return 0;
DWORD dwBytesRead;
ReadFile(m_hDevice, szBuf, 1, &dwBytesRead, NULL);
c = *szBuf;
if (dwBytesRead == 0)
return 0;
return dwBytesRead;
BOOL CPort::ReadString (char *szStrBuf , int nMaxLength)
char str [256];
char str2 [256];
DWORD dwTimeOut = m_nTimeOut;
DWORD start, now;
int nBytesRead;
int nTotalBytesRead = 0;
char c = ' ';
static char szCharBuf [2];
szCharBuf [0]= '\0';
szCharBuf [1]= '\0';
szStrBuf [0] = '\0';
start = GetTickCount();
while (c != m_chTerminator)
nBytesRead = ReadChar (c);
nTotalBytesRead += nBytesRead;
if (nBytesRead == 1 && c != '\r' && c != '\n')
*szCharBuf = c;
strncat (szStrBuf,szCharBuf,1);
if (strlen (szStrBuf) == nMaxLength)
return TRUE;
// restart timer for next char
start = GetTickCount();
// check for time out
now = GetTickCount();
if ((now - start) > dwTimeOut )
strcpy (m_szLastError, "Timed Out");
return FALSE;
return TRUE;
int CPort::WaitForQueToFill (int nBytesToWaitFor)
DWORD start = GetTickCount();
if (BytesInQue () >= nBytesToWaitFor)
if (GetTickCount() - start > m_nTimeOut)
return 0;
} while (1);
return BytesInQue ();
int CPort::BlockRead (char * pcInputBuffer, int nBytesToRead)
int nBytesRead = 0;
int charactersRead;
while (nBytesToRead >= m_nBlockSizeMax)
if (WaitForQueToFill (m_nBlockSizeMax) < m_nBlockSizeMax)
return nBytesRead;
charactersRead = ReadChars (pcInputBuffer, m_nBlockSizeMax);
pcInputBuffer += charactersRead;
nBytesRead += charactersRead;
nBytesToRead -= charactersRead;
if (nBytesToRead > 0)
if (WaitForQueToFill (nBytesToRead) < nBytesToRead)
return nBytesRead;
charactersRead = ReadChars (pcInputBuffer, nBytesToRead);
nBytesRead += charactersRead;
nBytesToRead -= charactersRead;
return nBytesRead;
Based on my testing and reading, I see several suspicious things in this code:
COMMTIMEOUTS is never set. MS docs say "Unpredictable results can occur if you fail to set the time-out values". But I tried setting this, and it didn't help.
Many methods (e.g. ReadString) will go into a tight loop and hammer the port with repeated reads if they don't get data immediately. This seems to explain the high CPU usage.
Many methods have their own timeout handling, using GetTickCount(). Isn't that what COMMTIMEOUTS is for?
In the new C# (WinForms) program, all these serial routines are called directly from the main thread, from a MultiMediaTimer event. Maybe should be run in a different thread?
BytesInQue method seems to be a bottleneck. If I break to debugger when CPU usage is high, that's usually where the program stops. Also, adding a Sleep(21) to this method before calling ClearCommError seems to resolve the XP problem, but exacerbates the CPU usage problem.
Code just seems unnecessarily complicated.
My Questions
Can anyone explain why this only works with a C# program on a small number of XP systems?
Any suggestions on how to rewrite this? Pointers to good sample code would be most welcome.
There are some serious problems with that class and it makes things even worse that there is a Microsoft copyright on it.
There is nothing special about this class, and it makes me wonder why it even exists except as an adapter over Create/Read/WriteFile. You wouldn't even need this class if you used the SerialPort class in the .NET Framework.
Your CPU usage is high because the code spins in a busy loop while waiting for the device to have enough data available. The code might as well say while(1);. If you must stick with Win32 and C++, you can look into I/O completion ports and passing the FILE_FLAG_OVERLAPPED flag to CreateFile. This way you can wait for data in a separate worker thread.
You need to be careful when communicating with multiple COM ports. It has been a long time since I've done C++, but I believe the static buffer szBuf in the ReadChar and WriteChar methods is shared by ALL instances of that class. That means that if you invoke a read against two different COM ports "at the same time", you will get unexpected results.
As for the problems on some of the XP machines, you will most likely find the cause if you check GetLastError after each read/write and log the results. The code should be checking GetLastError anyway, since the value it reports is not always an "error" but sometimes a request from the subsystem to do something else in order to get the result you want.
You can get rid of the whole while loop for blocking if you set COMMTIMEOUTS correctly. If there is a specific timeout for a read operation, call SetCommTimeouts before you perform the read.
I set ReadIntervalTimeout to the max timeout to ensure that the read won't return sooner than m_nTimeOut. This value causes the read to return if that much time elapses between any two bytes. If it were set to 2 milliseconds and the first byte came in at t, the second at t+1, and the third at t+4, ReadFile would have returned only the first two bytes, since the allowed interval between bytes was exceeded. ReadTotalTimeoutConstant ensures that you will never wait longer than m_nTimeOut no matter what.
maxWait = BytesToRead * ReadTotalTimeoutMultiplier + ReadTotalTimeoutConstant. Thus (BytesToRead * 0) + m_nTimeOut = m_nTimeOut
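To make the shared-buffer point concrete, here is a minimal sketch in portable C++. FakePort and StoreChar are made-up names standing in for CPort and its ReadChar/WriteChar methods; the only thing taken from the original class is the function-local `static char szBuf[2]`.

```cpp
#include <cassert>

// Hypothetical stand-in for CPort. The function-local static buffer
// mirrors the `static char szBuf[2]` used in ReadChar and WriteChar.
class FakePort {
public:
    // Stores c in the buffer and returns the buffer's address.
    const char* StoreChar(char c) {
        static char szBuf[2];  // one buffer for the whole program, not one per object
        szBuf[0] = c;
        szBuf[1] = '\0';
        return szBuf;
    }
};

// Returns true if two different FakePort objects end up using the same storage.
bool InstancesShareBuffer() {
    FakePort a, b;
    const char* pa = a.StoreChar('A');
    const char* pb = b.StoreChar('B');
    assert(pa == pb);      // both calls handed back the same address
    return pa[0] == 'B';   // the second call overwrote the first call's data
}
```

Making the buffer a plain local variable (or a member) would give each call or each object its own storage, which is what you want once two ports or two threads are involved.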
BOOL CPort::SetupConnection(void)
{
    // Snip...
    COMMTIMEOUTS comTimeOut;
    comTimeOut.ReadIntervalTimeout = m_nTimeOut; // Ensures we wait the max timeout
    comTimeOut.ReadTotalTimeoutMultiplier = 0;
    comTimeOut.ReadTotalTimeoutConstant = m_nTimeOut;
    comTimeOut.WriteTotalTimeoutMultiplier = 0;
    comTimeOut.WriteTotalTimeoutConstant = m_nTimeOut;
    SetCommTimeouts(m_hDevice, &comTimeOut);
    // Snip...
}
// If the return value != nBytesToRead, check GetLastError().
// Most likely the read timed out.
int CPort::BlockRead (char * pcInputBuffer, int nBytesToRead)
{
    DWORD dwBytesRead = 0;
    if (FALSE == ReadFile(m_hDevice, pcInputBuffer, nBytesToRead, &dwBytesRead, NULL))
    {
        // Check GetLastError
        return dwBytesRead;
    }
    return dwBytesRead;
}
I have no idea if this is completely correct but it should give you an idea. Remove the ReadChar and ReadString methods and use this if your program relies on things being synchronous. Be careful about setting high time outs also. Communications are fast, in the milliseconds.
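The timeout arithmetic described above can be checked with a small, portable model. This is only a sketch and does not call the Win32 API: MaxReadWaitMs and BytesBeforeIntervalTimeout are hypothetical helper names, and treating a gap exactly equal to the interval as still in time is an assumption of the model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Upper bound on how long one read may block, in milliseconds, using the
// formula quoted above: bytes * multiplier + constant.
unsigned long MaxReadWaitMs(unsigned long bytesToRead,
                            unsigned long totalTimeoutMultiplier,
                            unsigned long totalTimeoutConstant) {
    return bytesToRead * totalTimeoutMultiplier + totalTimeoutConstant;
}

// Simplified model of the interval timeout: given byte arrival times in
// ascending order (ms), count how many bytes a read would return before the
// gap between two consecutive bytes exceeds intervalMs. The total timeout
// and the buffer size are deliberately ignored here.
std::size_t BytesBeforeIntervalTimeout(const std::vector<unsigned long>& arrivalMs,
                                       unsigned long intervalMs) {
    assert(intervalMs > 0);
    if (arrivalMs.empty()) return 0;
    std::size_t count = 1;  // the first byte starts the interval timer
    while (count < arrivalMs.size() &&
           arrivalMs[count] - arrivalMs[count - 1] <= intervalMs) {
        ++count;
    }
    return count;
}
```

With a multiplier of 0 and a constant of 50 (the class default for m_nTimeOut), MaxReadWaitMs(100, 0, 50) is 50 regardless of the byte count. For the arrival times t, t+1, t+4 with a 2 ms interval, BytesBeforeIntervalTimeout({0, 1, 4}, 2) is 2, matching the example in the text.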
Here's a terminal program I wrote years ago (probably at least 15 years ago, now that I think about it). I just did a quick check, and under Windows 7 x64, it still seems to work reasonably well -- it connects to my GPS, reads, and displays the data coming from it.
If you look at the code, you can see that I didn't spend much time selecting the comm timeout values. I set them all to 1, intending to experiment with longer timeouts until the CPU usage was tolerable. To make a long story short, it uses so little CPU time I've never bothered. For example, on the Task Manager's CPU usage graph, I can't see any difference between it running and not. I've left it running collecting data from the GPS for a few hours at a time, and the Task Manager still says its total CPU usage is 0:00:00.
Bottom line: I'm pretty sure it could be more efficient -- but sometimes good enough is good enough. Given how heavily I don't use it any more, and the chances of ever adding anything like file transfer protocols, making it more efficient probably won't ever get to the top of the pile of things to do.
#include <stdio.h>
#include <conio.h>
#include <string.h>
#define STRICT
#include <windows.h>
void system_error(char *name) {
// Retrieve, format, and print out a message from the last error. The
// `name' that's passed should be in the form of a present tense noun
// (phrase) such as "opening file".
char *ptr = NULL;
(char *)&ptr,
fprintf(stderr, "\nError %s: %s\n", name, ptr);
int main(int argc, char **argv) {
int ch = 0; // initialized so the loop condition never reads an indeterminate value
char buffer[64];
HANDLE file;
DWORD read, written;
DCB port;
HANDLE keyboard = GetStdHandle(STD_INPUT_HANDLE);
HANDLE screen = GetStdHandle(STD_OUTPUT_HANDLE);
DWORD mode;
char port_name[128] = "\\\\.\\COM3";
char init[] = "";
if ( argc > 1 ) // argv[1] holds the port number, so one argument is enough
sprintf(port_name, "\\\\.\\COM%s", argv[1]);
// open the comm port.
file = CreateFile(port_name,
if ( INVALID_HANDLE_VALUE == file) {
system_error("opening file");
return 1;
// get the current DCB, and adjust a few bits to our liking.
memset(&port, 0, sizeof(port));
port.DCBlength = sizeof(port);
if (!GetCommState(file, &port))
system_error("getting comm state");
if (!BuildCommDCB("baud=19200 parity=n data=8 stop=1", &port))
system_error("building comm DCB");
if (!SetCommState(file, &port))
system_error("adjusting port settings");
// set short timeouts on the comm port.
COMMTIMEOUTS timeouts;
timeouts.ReadIntervalTimeout = 1;
timeouts.ReadTotalTimeoutMultiplier = 1;
timeouts.ReadTotalTimeoutConstant = 1;
timeouts.WriteTotalTimeoutMultiplier = 1;
timeouts.WriteTotalTimeoutConstant = 1;
if (!SetCommTimeouts(file, &timeouts))
system_error("setting port time-outs.");
// set keyboard to raw reading.
if (!GetConsoleMode(keyboard, &mode))
system_error("getting keyboard mode");
if (!SetConsoleMode(keyboard, mode))
system_error("setting keyboard mode");
if (!EscapeCommFunction(file, CLRDTR))
system_error("clearing DTR");
if (!EscapeCommFunction(file, SETDTR))
system_error("setting DTR");
if (!WriteFile(file, init, sizeof(init), &written, NULL))
system_error("writing data to port");
if (written != sizeof(init))
system_error("not all data written to port");
// basic terminal loop:
do {
// check for data on port and display it on screen.
ReadFile(file, buffer, sizeof(buffer), &read, NULL);
if (read)
WriteFile(screen, buffer, read, &written, NULL);
// check for keypress, and write any out the port.
if ( kbhit() ) {
ch = getch();
WriteFile(file, &ch, 1, &written, NULL);
// until user hits ctrl-backspace.
} while ( ch != 127);
// close up and go home.
return 0;
I would add
to the while loop in CPort::WaitForQueToFill()
This will give the OS a chance to actually place some bytes in the queue.
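The call itself is not shown in the text above. As an assumption, a short sleep between polls is one common way to do this; the sketch below shows the idea in portable C++ rather than Win32. WaitForQueueToFill, bytesAvailable, and pause are hypothetical names, with bytesAvailable standing in for BytesInQue().

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Polls bytesAvailable() until it reports at least `wanted` bytes or until
// `timeout` has elapsed. Returns the byte count on success and 0 on timeout,
// mirroring the return convention of the original WaitForQueToFill.
int WaitForQueueToFill(const std::function<int()>& bytesAvailable,
                       int wanted,
                       std::chrono::milliseconds timeout,
                       std::chrono::milliseconds pause = std::chrono::milliseconds(1)) {
    assert(wanted >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (;;) {
        const int available = bytesAvailable();
        if (available >= wanted) return available;
        if (std::chrono::steady_clock::now() - start > timeout) return 0;
        std::this_thread::sleep_for(pause);  // give up the CPU between polls
    }
}
```

The pause trades a little latency for a loop that no longer pins a core. Setting COMMTIMEOUTS and letting ReadFile block, as suggested in the other answer, avoids the polling loop altogether.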