Writing to a memory mapped file shows read accesses in htop - c++

I would like to use a memory mapped file to write data. I am using the following test code on an Ubuntu machine. The code is compiled with g++ -std=c++14 -O3.
#include <sys/mman.h>
#include <unistd.h>
#include <fcntl.h>
#include <cstdlib>
#include <cstdio>
#include <cassert>
int main(){
    constexpr size_t GB1 = 1 << 30;
    size_t capacity = GB1 * 4;
    size_t numElements = capacity / sizeof(size_t);

    int fd = open("./mmapfile", O_RDWR);
    assert(fd >= 0);

    int error = ftruncate(fd, capacity);
    assert(error == 0);

    void* ptr = mmap(0, capacity, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    assert(ptr != MAP_FAILED);

    size_t* data = (size_t*)ptr;
    for(size_t i = 0; i < numElements; i++){
        data[i] = i;
    }

    munmap(ptr, capacity);
}
The data is correctly being written to the file. However, htop shows that half of the program's disk I/O bandwidth is used by read accesses. My concern is that the code will not perform well if only half the bandwidth can be used for writes.
Why are there read accesses in the code?
Can they be avoided or are they expected?

The read accesses occur because, as each page is touched for the first time, it has to be read in from disk. The OS is not clairvoyant and doesn't know that the data it reads in will simply be overwritten.
To avoid the issue, don't use mmap(). Build the blocks in a buffer and write them out the old-fashioned way.
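For example, a sketch of that buffered-write approach for the test program above (the 1 MiB chunk size is an arbitrary choice, and production code would loop on partial writes):
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <vector>

int main(){
    constexpr size_t GB1 = size_t(1) << 30;
    constexpr size_t capacity = GB1 * 4;
    constexpr size_t chunkElements = (1 << 20) / sizeof(size_t); // 1 MiB per write

    int fd = open("./mmapfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    assert(fd >= 0);

    std::vector<size_t> buf(chunkElements);
    size_t i = 0;
    for(size_t written = 0; written < capacity; written += chunkElements * sizeof(size_t)){
        // fill the next chunk in memory, then write it out in one call
        for(size_t j = 0; j < chunkElements; j++){
            buf[j] = i++;
        }
        ssize_t n = write(fd, buf.data(), chunkElements * sizeof(size_t));
        assert(n == (ssize_t)(chunkElements * sizeof(size_t)));
    }
    close(fd);
}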

Related

mmap() cannot allocate memory when repeatedly mapping and unmapping one single page

I have read many SO (and other) questions, but I couldn't find one that helped me. I want to mmap two files at once and copy their content byte-by-byte (I know this seems ridiculous, but this is my minimal reproducible example). Therefore I loop through every byte, copy it, and after one page's worth of bytes I munmap the current page of each file and mmap the next page. Imo there should only ever be one page (4096 bytes) of each file mapped at a time, so there shouldn't be any memory problem.
Also, if the output file is too small, the space is allocated via posix_fallocate, which runs fine. So a lack of free space on the hard drive can't be the problem either, imo.
But as soon as I go for somewhat larger files of ~140 MB, I get the cannot allocate memory error on the output file that I am writing into. Do you guys have any idea why this happens?
#include <sys/types.h>
#include <sys/mman.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <iostream>
#include <bitset>
#include <fcntl.h>
#include <sys/stat.h>
#include <math.h>
#include <errno.h>
using namespace std;
int main()
{
    char file_input[] = "medium_big_file";
    char file_output[] = "foo_output";
    int fd_input = -1;
    int fd_output = -1;
    unsigned char *map_page_input, *map_page_output;
    struct stat stat_input, stat_output;

    if ((fd_input = open(file_input, O_RDONLY)) == -1 ||
        (fd_output = open(file_output, O_RDWR|O_CREAT, 0644)) == -1) {
        cerr << "Error on open()" << endl;
        return EXIT_FAILURE;
    }

    // get file size via stat()
    stat(file_input, &stat_input);
    stat(file_output, &stat_output);
    const size_t size_input = stat_input.st_size;
    const size_t size_output = stat_output.st_size;
    const size_t pagesize = getpagesize();
    size_t page = 0;
    size_t pos = pagesize;

    if (size_output < size_input) {
        if (posix_fallocate(fd_output, 0, size_input) != 0) {
            cerr << "file space allocation didn't work" << endl;
            return EXIT_FAILURE;
        }
    }

    while(pos + (pagesize * (page-1)) < size_input) {
        // check if input needs the next page
        if (pos == pagesize) {
            munmap(&map_page_input, pagesize);
            map_page_input = (unsigned char*)mmap(NULL, pagesize, PROT_READ,
                MAP_FILE|MAP_PRIVATE, fd_input, page * pagesize);
            munmap(&map_page_output, pagesize);
            map_page_output = (unsigned char*)mmap(NULL, pagesize,
                PROT_READ|PROT_WRITE, MAP_SHARED, fd_output, page * pagesize);
            page += 1;
            pos = 0;
            if (map_page_output == MAP_FAILED) {
                cerr << "errno: " << strerror(errno) << endl;
                cerr << "mmap failed on page " << page << endl;
                return EXIT_FAILURE;
            }
        }
        memcpy(&map_page_output[pos], &map_page_input[pos], 1);
        pos += 1;
    }

    munmap(&map_page_input, pagesize);
    munmap(&map_page_output, pagesize);
    close(fd_input);
    close(fd_output);
    return EXIT_SUCCESS;
}
The very first iteration of the loop attempts to unmap something that was never mapped, and passes a completely uninitialized pointer to munmap. Not once, but twice.
Finally, munmap expects a pointer to the mmap-ed memory, and not a pointer to a pointer to the mmap-ed memory.
The shown code fails to check the return status from munmap. If it did, it would have discovered that every call to munmap fails (hopefully; if the first call happens to pass an aligned pointer, a chunk of the stack might end up being unmapped, with the ensuing hilarity). So the shown code just keeps allocating more and more pages, and eventually runs out of memory.
You must fix both bugs.
You do not check the return code of munmap. It fails. It fails because you are passing the address of the pointer instead of the pointer itself. Replace:
munmap(&map_page_input, pagesize);
with
munmap(map_page_input, pagesize);
Because munmap fails, you eventually exceed the maximum number of mappings per process.
munmap takes as its first argument the value returned by mmap. In your code munmap receives a pointer to the variable containing that value, so you are not actually unmapping the area. Just remove the "&" in the munmap calls.
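With both fixes applied, the remapping block could look roughly like this (a sketch; only the changed part of the loop is shown):
if (pos == pagesize) {
    // Only unmap after something has actually been mapped, and pass
    // the pointer itself to munmap, not its address.
    if (page > 0 && munmap(map_page_input, pagesize) == -1) {
        cerr << "munmap(input) failed: " << strerror(errno) << endl;
        return EXIT_FAILURE;
    }
    map_page_input = (unsigned char*)mmap(NULL, pagesize, PROT_READ,
        MAP_FILE|MAP_PRIVATE, fd_input, page * pagesize);
    if (page > 0 && munmap(map_page_output, pagesize) == -1) {
        cerr << "munmap(output) failed: " << strerror(errno) << endl;
        return EXIT_FAILURE;
    }
    map_page_output = (unsigned char*)mmap(NULL, pagesize,
        PROT_READ|PROT_WRITE, MAP_SHARED, fd_output, page * pagesize);
    page += 1;
    pos = 0;
    if (map_page_input == MAP_FAILED || map_page_output == MAP_FAILED) {
        cerr << "errno: " << strerror(errno) << endl;
        cerr << "mmap failed on page " << page << endl;
        return EXIT_FAILURE;
    }
}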

What are the fastest methods to read from a file in standard C++? [duplicate]

I am currently writing a program in C++ which involves reading lots of large text files. Each has ~400,000 lines, with in extreme cases 4000 or more characters per line. Just for testing, I read one of the files using ifstream and the implementation offered by cplusplus.com. It took around 60 seconds, which is way too long. Now I was wondering, is there a straightforward way to improve reading speed?
edit:
The code I am using is more or less this:
string tmpString;
ifstream txtFile(path);
if(txtFile.is_open())
{
    while(txtFile.good())
    {
        m_numLines++;
        getline(txtFile, tmpString);
    }
    txtFile.close();
}
edit 2: The file I read is only 82 MB big. I mainly said that it could reach 4000 because I thought it might be necessary to know in order to do buffering.
edit 3: Thank you all for your answers, but it seems like there is not much room for improvement given my problem. I have to use getline, since I want to count the number of lines. Instantiating the ifstream as binary didn't make reading any faster either. I will try to parallelize it as much as I can, that should work at least.
edit 4: So apparently there are some things I can do. Big thank you to sehe for putting so much time into this, I appreciate it a lot! =)
Updates: Be sure to check the (surprising) updates below the initial answer
Memory mapped files have served me well¹:
#include <boost/iostreams/device/mapped_file.hpp> // for mmap
#include <algorithm> // for std::find
#include <iostream> // for std::cout
#include <cstring>
int main()
{
    boost::iostreams::mapped_file mmap("input.txt", boost::iostreams::mapped_file::readonly);
    auto f = mmap.const_data();
    auto l = f + mmap.size();

    uintmax_t m_numLines = 0;
    while (f && f!=l)
        if ((f = static_cast<const char*>(memchr(f, '\n', l-f))))
            m_numLines++, f++;

    std::cout << "m_numLines = " << m_numLines << "\n";
}
This should be rather quick.
Update
In case it helps you test this approach, here's a version using mmap directly instead of using Boost: see it live on Coliru
#include <algorithm>
#include <iostream>
#include <cstring>
// for mmap:
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
const char* map_file(const char* fname, size_t& length);
int main()
{
    size_t length;
    auto f = map_file("test.cpp", length);
    auto l = f + length;

    uintmax_t m_numLines = 0;
    while (f && f!=l)
        if ((f = static_cast<const char*>(memchr(f, '\n', l-f))))
            m_numLines++, f++;

    std::cout << "m_numLines = " << m_numLines << "\n";
}

void handle_error(const char* msg) {
    perror(msg);
    exit(255);
}

const char* map_file(const char* fname, size_t& length)
{
    int fd = open(fname, O_RDONLY);
    if (fd == -1)
        handle_error("open");

    // obtain file size
    struct stat sb;
    if (fstat(fd, &sb) == -1)
        handle_error("fstat");

    length = sb.st_size;

    const char* addr = static_cast<const char*>(mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0u));
    if (addr == MAP_FAILED)
        handle_error("mmap");

    // TODO close fd at some point in time, call munmap(...)
    return addr;
}
Update
The last bit of performance I could squeeze out of this I found by looking at the source of GNU coreutils wc. To my surprise using the following (greatly simplified) code adapted from wc runs in about 84% of the time taken with the memory mapped file above:
static uintmax_t wc(char const *fname)
{
    static const auto BUFFER_SIZE = 16*1024;
    int fd = open(fname, O_RDONLY);
    if(fd == -1)
        handle_error("open");

    /* Advise the kernel of our access pattern. */
    posix_fadvise(fd, 0, 0, 1); // FDADVICE_SEQUENTIAL

    char buf[BUFFER_SIZE + 1];
    uintmax_t lines = 0;

    while(size_t bytes_read = read(fd, buf, BUFFER_SIZE))
    {
        if(bytes_read == (size_t)-1)
            handle_error("read failed");
        if (!bytes_read)
            break;

        for(char *p = buf; (p = (char*) memchr(p, '\n', (buf + bytes_read) - p)); ++p)
            ++lines;
    }

    return lines;
}
¹ see e.g. the benchmark here: How to parse space-separated floats in C++ quickly?
4000 * 400,000 = 1.6 GB. If your hard drive isn't an SSD you're likely getting ~100 MB/s sequential read. That's 16 seconds just in I/O.
Since you don't elaborate on the specific code you're using or how you need to parse these files (do you need to read them line by line? does the system have a lot of RAM, so that you could read the whole file into a large RAM buffer and then parse it?), there's little you can do to speed up the process.
Memory mapped files won't offer any performance improvement when reading a file sequentially. Perhaps manually parsing large chunks for newlines rather than using "getline" would offer an improvement.
EDIT: After doing some learning (thanks @sehe), here's the memory-mapped solution I would likely use.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <errno.h>
int main() {
    const char* fName = "big.txt";
    //
    struct stat sb;
    long cntr = 0;
    int fd, lineLen;
    char *data;
    char *line;

    // map the file
    fd = open(fName, O_RDONLY);
    fstat(fd, &sb);
    //// int pageSize;
    //// pageSize = getpagesize();
    //// data = mmap((caddr_t)0, pageSize, PROT_READ, MAP_PRIVATE, fd, pageSize);
    data = (char*) mmap((caddr_t)0, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    line = data;

    // get lines
    while(cntr < sb.st_size) {
        lineLen = 0;
        line = data;
        // find the next line
        while(*data != '\n' && cntr < sb.st_size) {
            data++;
            cntr++;
            lineLen++;
        }
        // skip the newline itself so the next iteration starts on the next line
        if(cntr < sb.st_size) {
            data++;
            cntr++;
        }
        /***** PROCESS LINE *****/
        // ... processLine(line, lineLen);
    }
    return 0;
}
Neil Kirk, unfortunately I cannot reply to your comment (not enough reputation), but I did a performance test on ifstream and stringstream, and the performance, reading a text file line by line, is exactly the same.
std::stringstream stream;
std::string line;
while(std::getline(stream, line)) {
}
This takes 1426ms on a 106MB file.
std::ifstream stream;
std::string line;
while(stream.good()) {
    getline(stream, line);
}
This takes 1433ms on the same file.
The following code is faster instead:
const int MAX_LENGTH = 524288;
char* line = new char[MAX_LENGTH];
while (iStream.getline(line, MAX_LENGTH) && strlen(line) > 0) {
}
This takes 884ms on the same file.
It is just a little tricky since you have to set the maximum size of your buffer (i.e. maximum length for each line in the input file).
As someone with a little background in competitive programming, I can tell you: At least for simple things like integer parsing the main cost in C is locking the file streams (which is by default done for multi-threading). Use the unlocked_stdio versions instead (fgetc_unlocked(), fread_unlocked()). For C++, the common lore is to use std::ios::sync_with_stdio(false) but I don't know if it's as fast as unlocked_stdio.
For reference here is my standard integer parsing code. It's a lot faster than scanf, as I said mainly due to not locking the stream. For me it was as fast as the best hand-coded mmap or custom buffered versions I'd used previously, without the insane maintenance debt.
int readint(void)
{
    int n, c;
    n = getchar_unlocked() - '0';
    while ((c = getchar_unlocked()) > ' ')
        n = 10*n + c-'0';
    return n;
}
(Note: This one only works if there is precisely one non-digit character between any two integers).
And of course avoid memory allocation if possible...
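For completeness, the C++-streams side with the sync_with_stdio(false) trick mentioned above might look like this (a sketch counting lines from stdin; I haven't benchmarked it against the unlocked getchar version):
#include <cstdint>
#include <iostream>
#include <string>

int main()
{
    std::ios::sync_with_stdio(false);  // detach C++ streams from C stdio
    std::cin.tie(nullptr);             // don't flush cout before every read

    uintmax_t lines = 0;
    std::string line;
    while (std::getline(std::cin, line))
        ++lines;

    std::cout << lines << '\n';
}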
Do you have to read all files at the same time? (at the start of your application for example)
If you do, consider parallelizing the operation.
Either way, consider using binary streams, or unbuffered reads of blocks of data.
Use random file access or use binary mode. For sequential reading this is a big win, but it still depends on what you are reading.
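A rough sketch of reading a file in binary mode in large blocks and counting newlines (the 1 MiB buffer size and the "input.txt" file name are arbitrary placeholders):
#include <algorithm>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <vector>

int main()
{
    std::ifstream in("input.txt", std::ios::binary);
    std::vector<char> buf(1 << 20);  // read in 1 MiB blocks

    uintmax_t lines = 0;
    while (in) {
        in.read(buf.data(), buf.size());
        std::streamsize got = in.gcount();  // bytes actually read, incl. final partial block
        lines += std::count(buf.data(), buf.data() + got, '\n');
    }

    std::cout << "lines = " << lines << '\n';
}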

Pointer gets wrong value in different thread

I am writing a piece of code to demonstrate multi-threaded shared-memory writing.
However, my code gets a strange 0xffffffff pointer and I can't make out why. I haven't been writing C++ code for a while, so please let me know if I got something wrong.
I compile with the command:
g++ --std=c++11 shared_mem_multi_write.cpp -lpthread -g
I get output like this:
function base_ptr: 0x5eebff, src_ptr: 0x7f21a9c4e010, size: 6220800
function base_ptr: 0xffffffffffffffff, src_ptr: 0x7f21a9c4e010, size: 6220800
function base_ptr: 0xbdd7ff, src_ptr: 0x7f21a9c4e010, size: 6220800
function base_ptr: 0x23987ff, src_ptr: 0x7f21a9c4e010, size: 6220800
function base_ptr: 0x11cc3ff, src_ptr: 0x7f21a9c4e010, size: 6220800
function base_ptr: 0x17bafff, src_ptr: 0x7f21a9c4e010, size: 6220800
function base_ptr: 0x1da9bff, src_ptr: 0x7f21a9c4e010, size: 6220800
Segmentation fault (core dumped)
My OS is CentOS Linux release 7.6.1810 (Core) with gcc version 4.8.5, and the code is posted below:
#include <chrono>
#include <cstdio>
#include <cstring>
#include <functional>
#include <iostream>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/stat.h>
#include <thread>
#include <vector>
#include <memory>
const size_t THREAD_CNT = 40;
const size_t FRAME_SIZE = 1920 * 1080 * 3;
const size_t SEG_SIZE = FRAME_SIZE * THREAD_CNT;
void func(char *base_ptr, char *src_ptr, size_t size)
{
    printf("function base_ptr: %p, src_ptr: %p, size: %u\n", base_ptr, src_ptr, size);
    while (1)
    {
        auto now = std::chrono::system_clock::now();
        memcpy(base_ptr, src_ptr, size);
        std::chrono::system_clock::time_point next_ts =
            now + std::chrono::milliseconds(42); // 24 frame per seconds => 42 ms per frame
        std::this_thread::sleep_until(next_ts);
    }
}

int main(int argc, char **argv)
{
    int shmkey = 666;
    int shmid;
    shmid = shmget(shmkey, SEG_SIZE, IPC_CREAT);
    char *src_ptr = new char[FRAME_SIZE];
    char *shmpointer = static_cast<char *>(shmat(shmid, nullptr, 0));

    std::vector<std::shared_ptr<std::thread>> t_vec;
    t_vec.reserve(THREAD_CNT);
    for (int i = 0; i < THREAD_CNT; ++i)
    {
        //t_vec[i] = std::thread(func, i * FRAME_SIZE + shmpointer, src_ptr, FRAME_SIZE);
        t_vec[i] = std::make_shared<std::thread>(func, i * FRAME_SIZE + shmpointer, src_ptr, FRAME_SIZE);
    }
    for (auto &&t : t_vec)
    {
        t->join();
    }
    return 0;
}
You forgot to specify access rights for the created SHM segment (http://man7.org/linux/man-pages/man2/shmget.2.html):
The value shmflg is composed of:
...
In addition to the above flags, the least significant 9 bits of shmflg specify the permissions granted to the owner, group, and others. These bits have the same format, and the same meaning, as the mode argument of open(2). Presently, execute permissions are not used by the system.
Change
shmid = shmget(shmkey, SEG_SIZE, IPC_CREAT);
into
shmid = shmget(shmkey, SEG_SIZE, IPC_CREAT | 0666);
It works for me now: https://wandbox.org/permlink/Am4r2GBvM7kSmpdO
Note that I use only a vector of threads (no shared pointers), as others suggested in the comments. You can possibly reserve its space as well.
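That part of main() could then look roughly like this (a sketch reusing the names from the question):
std::vector<std::thread> t_vec;
t_vec.reserve(THREAD_CNT);
for (size_t i = 0; i < THREAD_CNT; ++i)
{
    // construct each std::thread in place; no shared_ptr needed
    t_vec.emplace_back(func, i * FRAME_SIZE + shmpointer, src_ptr, FRAME_SIZE);
}
for (auto &t : t_vec)
{
    t.join();
}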
You forgot one very important thing: error handling!
Both the shmget and shmat functions can fail. If they fail they return the value -1.
Now if you look at the first base_ptr value, it's 0x5eebff. That just happens to be the same as FRAME_SIZE - 1 (FRAME_SIZE is 0x5eec00). That means shmat did return -1, i.e. it failed.
Since you keep on using this erroneous value, all bets are off.
You need to check for errors, and if one happens, print the value of errno to find out what has gone wrong:
void* ptr = shmat(shmid, nullptr, 0);
if (ptr == (void*) -1)
{
    std::cout << "Error getting shared memory: " << std::strerror(errno) << '\n';
    return EXIT_FAILURE;
}
Do something similar for shmget.
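For shmget, the corresponding check might look like this (a sketch; std::strerror and errno need <cstring> and <cerrno>, and EXIT_FAILURE needs <cstdlib>):
int shmid = shmget(shmkey, SEG_SIZE, IPC_CREAT | 0666);
if (shmid == -1)
{
    std::cout << "Error creating shared memory: " << std::strerror(errno) << '\n';
    return EXIT_FAILURE;
}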
Now it's also easy to understand the 0xffffffffffffffff value. It's the two's complement hexadecimal notation for -1, and it's passed to the first thread that is created.

Random mmaped memory access up to 16% slower than heap data access

Our software builds a data structure in memory that is about 80 gigabytes large. It can then either use this data structure directly to do its computation, or dump it to disk so it can be reused several times afterwards. A lot of random memory accesses happen in this data structure.
For larger input this data structure can grow even larger (our largest one was over 300 gigabytes) and our servers have enough memory to hold everything in RAM.
If the data structure is dumped to disk, it gets loaded back into the address space with mmap, forced into the OS page cache, and lastly mlocked (code at the end).
The problem is that there is about a 16% difference in performance between just using the computed data structure immediately on the heap (see the malloc version), or mmapping the dumped file (see the mmap version).
I don't have a good explanation why this is the case. Is there a way to find out why mmap is being so much slower? Can I close this performance gap somehow?
I did the measurements on a server running Scientific Linux 7.2 with a 3.10 kernel, it has 128GB RAM (enough to fit everything), and repeated them several times with similar results. Sometimes the gap is a bit smaller, but not by much.
New Update (2017/05/23):
I produced a minimal test case, where the effect can be seen. I tried the different flags (MAP_SHARED etc.) without success. The mmap version is still slower.
#include <random>
#include <iostream>
#include <sys/time.h>
#include <ctime>
#include <omp.h>
#include <sys/mman.h>
#include <unistd.h>
constexpr size_t ipow(int base, int exponent) {
    size_t res = 1;
    for (int i = 0; i < exponent; i++) {
        res = res * base;
    }
    return res;
}

size_t getTime() {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    size_t ret = tv.tv_usec;
    ret /= 1000;
    ret += (tv.tv_sec * 1000);
    return ret;
}

const size_t N = 1000000000;
const size_t tableSize = ipow(21, 6);

size_t* getOffset(std::mt19937 &generator) {
    std::uniform_int_distribution<size_t> distribution(0, N);
    std::cout << "Offset Array" << std::endl;
    size_t r1 = getTime();
    size_t *offset = (size_t*) malloc(sizeof(size_t) * tableSize);
    for (size_t i = 0; i < tableSize; ++i) {
        offset[i] = distribution(generator);
    }
    size_t r2 = getTime();
    std::cout << (r2 - r1) << std::endl;
    return offset;
}

char* getData(std::mt19937 &generator) {
    std::uniform_int_distribution<char> datadist(1, 10);
    std::cout << "Data Array" << std::endl;
    size_t o1 = getTime();
    char *data = (char*) malloc(sizeof(char) * N);
    for (size_t i = 0; i < N; ++i) {
        data[i] = datadist(generator);
    }
    size_t o2 = getTime();
    std::cout << (o2 - o1) << std::endl;
    return data;
}

template<typename T>
void dump(const char* filename, T* data, size_t count) {
    FILE *file = fopen(filename, "wb");
    fwrite(data, sizeof(T), count, file);
    fclose(file);
}

template<typename T>
T* read(const char* filename, size_t count) {
#ifdef MMAP
    FILE *file = fopen(filename, "rb");
    int fd = fileno(file);
    T *data = (T*) mmap(NULL, sizeof(T) * count, PROT_READ, MAP_SHARED | MAP_NORESERVE, fd, 0);
    size_t pageSize = sysconf(_SC_PAGE_SIZE);
    char bytes = 0;
    for(size_t i = 0; i < (sizeof(T) * count); i += pageSize){
        bytes ^= ((char*)data)[i];
    }
    mlock(((char*)data), sizeof(T) * count);
    std::cout << bytes;
#else
    T* data = (T*) malloc(sizeof(T) * count);
    FILE *file = fopen(filename, "rb");
    fread(data, sizeof(T), count, file);
    fclose(file);
#endif
    return data;
}

int main (int argc, char** argv) {
#ifdef DATAGEN
    std::mt19937 generator(42);
    size_t *offset = getOffset(generator);
    dump<size_t>("offset.bin", offset, tableSize);
    char* data = getData(generator);
    dump<char>("data.bin", data, N);
#else
    size_t *offset = read<size_t>("offset.bin", tableSize);
    char *data = read<char>("data.bin", N);
#ifdef MADV
    posix_madvise(offset, sizeof(size_t) * tableSize, POSIX_MADV_SEQUENTIAL);
    posix_madvise(data, sizeof(char) * N, POSIX_MADV_RANDOM);
#endif
#endif

    const size_t R = 10;
    std::cout << "Computing" << std::endl;
    size_t t1 = getTime();
    size_t result = 0;
#pragma omp parallel reduction(+:result)
    {
        size_t magic = 0;
        for (int r = 0; r < R; ++r) {
#pragma omp for schedule(dynamic, 1000)
            for (size_t i = 0; i < tableSize; ++i) {
                char val = data[offset[i]];
                magic += val;
            }
        }
        result += magic;
    }
    size_t t2 = getTime();

    std::cout << result << "\t" << (t2 - t1) << std::endl;
}
Please excuse the C++, its random class is easier to use. I compiled it like this:
# The version that writes down the .bin files and also computes on the heap
g++ bench.cpp -fopenmp -std=c++14 -O3 -march=native -mtune=native -DDATAGEN
# The mmap version
g++ bench.cpp -fopenmp -std=c++14 -O3 -march=native -mtune=native -DMMAP
# The fread/heap version
g++ bench.cpp -fopenmp -std=c++14 -O3 -march=native -mtune=native
# For madvise add -DMADV
On this server I get the following times (ran all of the commands a few times):
./mmap
2030ms
./fread
1350ms
./mmap+madv
2030ms
./fread+madv
1350ms
numactl --cpunodebind=0 ./mmap
2600 ms
numactl --cpunodebind=0 ./fread
1500 ms
The malloc() back end can make use of THP (Transparent Huge Pages), which is not possible when using mmap() backed by a file.
Using huge pages (even transparently) can reduce drastically the number of TLB misses while running your application.
An interesting test could be to disable transparent hugepages and run your malloc() test again.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
You could also measure TLB misses using perf:
perf stat -e dTLB-load-misses,iTLB-load-misses ./command
For more infos on THP please see:
https://www.kernel.org/doc/Documentation/vm/transhuge.txt
People have been waiting a long time for a page cache that is huge-page capable, allowing files to be mapped using huge pages (or a mix of huge pages and standard 4K pages).
There are a bunch of articles on LWN about the transparent huge page cache, but it has not reached the production kernel yet.
Transparent huge pages in the page cache (May 2016):
https://lwn.net/Articles/686690
There is also a presentation from January this year about the future of Linux page cache:
https://youtube.com/watch?v=xxWaa-lPR-8
Additionally, you can avoid all those calls to mlock on individual pages in your mmap() implementation by using the MAP_LOCKED flag.
If you are not privileged, this may require adjusting the memlock limit.
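A sketch of what that could look like in the question's read() function (reusing fd, T and count from that code; MAP_LOCKED makes the kernel fault in and lock the pages as part of the mapping itself, replacing the separate mlock() call):
T *data = (T*) mmap(NULL, sizeof(T) * count, PROT_READ,
                    MAP_SHARED | MAP_LOCKED, fd, 0);
if (data == MAP_FAILED) {
    // Typically fails with EAGAIN/ENOMEM if RLIMIT_MEMLOCK is too low.
    std::cerr << "mmap with MAP_LOCKED failed" << std::endl;
}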
I might be wrong, but...
It seems to me that the issue isn't with mmap, but with the fact that the code maps the memory to a file.
The Linux malloc falls back to mmap for large allocations, so both memory allocation flavors essentially use the same backend (mmap)... however, the only difference is that malloc uses mmap without mapping to a specific file on the hard drive.
The syncing of the memory information to the disk might be what's causing the "slower" performance. It's similar to saving the file almost constantly.
You might consider testing mmap without the file, by using the MAP_ANONYMOUS flag (and fd == -1 on some systems) to test for any difference.
On the other hand, I'm not sure if the "slower" memory access isn't actually faster in the long run - would you lock the whole thing up to save 300GB to the disk? How long would that take? ...
... the fact that you're doing it automatically in small increments might be a performance gain rather than a penalty.
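A minimal sketch of such an anonymous mapping for comparison (no file backing, so there is nothing to sync to disk; the 1 GiB size is an arbitrary test value):
#include <sys/mman.h>
#include <cstddef>
#include <cstdio>

int main() {
    const size_t size = size_t(1) << 30;  // 1 GiB, arbitrary test size

    // No file descriptor: the mapping is backed by anonymous memory only.
    void* p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    // ... fill the buffer and benchmark random accesses here ...

    munmap(p, size);
}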

Windows / C++ fread() stops reading in data from another process

So I've run into a quite frustrating problem... essentially I'm trying to transfer data between two different programs. The producer program sends data into stdout. The consumer program starts up the producer via _popen() and then uses fread() to read from that process... this works initially, but after maybe 15 loop iterations, the consumer starts reading in 0 bytes every time (even though the producer should still be outputting data).
I noticed if I tune the data-rates down a lot, I don't run into this problem... however that's not really an option, as in a real scenario, the data would be uncompressed video being piped in through ffmpeg.
Here is my consumer program:
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
int main() {
    FILE * in = _popen("Pipe.exe", "r");
    if(in == NULL){
        printf("ERROR\n");
        exit(0);
    }

    int stride = 163840;
    char * buffer = (char*)malloc(stride);
    int bytesRead;

    while(true){
        bytesRead = fread(buffer, 1, stride, in);
        printf("%i\n", bytesRead);
        Sleep(10);
    }
}
And my producer program:
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
int main() {
    int stride = 163840;
    char * buffer = (char*) malloc(stride);
    char value = 0;

    while(true){
        // This just changes the data for every loop iteration
        for(int i = 0; i < stride; i++){
            buffer[i] = value;
        }
        value = value == 255 ? 0 : value + 1;

        fwrite(buffer, 1, stride, stdout);
        Sleep(10);
    }
}