C++: Read large data, parse, then write data

I'm trying to read a large dataset, format it the way I need, and then write it to another file. I'm trying to use C++ over SAS or Stata for the speed advantage. The data files are usually around 10 gigabytes, and my current code takes over an hour to run (at which point I kill it, because I'm sure something in my code is very inefficient).
Is there a more efficient way to do this? Maybe read the file into memory and then analyze it using the switch statements? (I have 32 GB of RAM on 64-bit Linux.) Is it possible that reading and then writing within the loop slows it down, since it is constantly alternating between the two? I already tried reading from one drive and writing to another in an attempt to speed this up.
Are the switch cases slowing it down?
The process I have now reads the data using getline, parses it with a switch statement, writes it to my outfile, and repeats for 300 million lines. There are about 10 more cases in the switch statement, but I didn't copy them, for brevity's sake.
The code is probably very ugly all being in the main function, but I wanted to get it working before I worked on attractiveness.
I've tried using read() but without any success. Please let me know if I need to clarify anything.
Thank you for the help!
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <stdio.h>
//#include <cstring>
//#include <boost/algorithm/string.hpp>
#include <vector>
using namespace std;
//using namespace boost;
struct dataline
{
char type[2]; // 1 char + '\0' for extraction via >>
double second;
short mill;
char event[1];
char ticker[7]; // 6 chars + '\0'
char marketCategory[2]; // 1 char + '\0'
char financialStatus[2]; // 1 char + '\0'
int roundLotSize;
short roundLotOnly;
char tradingState[1];
char reserved[1];
char reason[4];
char mpid[4];
char primaryMarketMaker[1];
char primaryMarketMode[1];
char marketParticipantState[1];
unsigned long orderNumber;
char buySell[2]; // 1 char + '\0'
double shares;
float price;
int executedShares;
double matchNumber;
char printable[1];
double executionPrice;
int canceledShares;
double sharesBig;
double crossPrice;
char crossType[2]; // 1 char + '\0'
double pairedShares;
double imbalanceShares;
char imbalanceDirection[1];
double fairPrice;
double nearPrice;
double currentReferencePrice;
char priceVariationIndicator[1];
};
int main ()
{
string a;
string b;
string c;
string d;
string e;
string f;
string g;
string h;
string k;
string l;
string times;
string smalltimes;
short time; //counter to keep second filled
short smalltime; //counter to keep millisecond filled
double N;
double NN;
double NNN;
int length;
char M;
//vector<> fout;
string line;
ofstream fout ("/media/3tb/test.txt");
ifstream myfile;
myfile.open("S050508-v3.txt");
dataline oneline;
if (myfile.is_open())
{
while ( getline(myfile,line) ) // test the read itself; looping on good() processes the last line twice
{
// cout << line<<endl;;
a=line.substr(0,1);
stringstream ss(a);
char type;
ss>>type;
switch (type)
{
case 'T':
{
times=line.substr(1,5);
stringstream s(times);
s>>time;
break;
}
case 'M':
{
smalltimes=line.substr(1,3);
stringstream ss(smalltimes);
ss>>smalltime;
break;
}
case 'R':
{
oneline.second=time;
oneline.mill=smalltime;
a=line.substr(0,1);
stringstream ss(a);
ss>>oneline.type;
b=line.substr(1,6);
stringstream sss(b);
sss>>oneline.ticker;
c=line.substr(7,1);
stringstream ssss(c);
ssss>>oneline.marketCategory;
d=line.substr(8,1);
stringstream sssss(d);
sssss>>oneline.financialStatus;
e=line.substr(9,6);
stringstream ssssss(e);
ssssss>>oneline.roundLotSize;
f=line.substr(15,1);
stringstream sssssss(f);
sssssss>>oneline.roundLotOnly;
*oneline.tradingState=0;
*oneline.reserved=0;
*oneline.reason=0;
*oneline.mpid=0;
*oneline.primaryMarketMaker=0;
*oneline.primaryMarketMode=0;
*oneline.marketParticipantState=0;
oneline.orderNumber=0;
*oneline.buySell=0;
oneline.shares=0;
oneline.price=0;
oneline.executedShares=0;
oneline.matchNumber=0;
*oneline.printable=0;
oneline.executionPrice=0;
oneline.canceledShares=0;
oneline.sharesBig=0;
oneline.crossPrice=0;
*oneline.crossType=0;
oneline.pairedShares=0;
oneline.imbalanceShares=0;
*oneline.imbalanceDirection=0;
oneline.fairPrice=0;
oneline.nearPrice=0;
oneline.currentReferencePrice=0;
*oneline.priceVariationIndicator=0;
break;
}//End Case
}//End Switch
}//end While
myfile.close();
}//End If
else cout << "Unable to open file";
cout<<"Junk"<<endl;
return 0;
}
UPDATE: So I've been trying to use a memory map, but now I'm getting a segmentation fault. I've been following different examples to piece together something that would work for my case. Why would I be getting a segmentation fault? Here is the first part of my code, which looks like this:
int main (int argc, char** argv) // needs <stdio.h>, <fcntl.h>, <unistd.h>, <sys/mman.h>
{
long i;
int fd;
char *map;
const char *FILEPATH = "/home/brian/Desktop/S050508-v3.txt";
unsigned long FILESIZE;
FILE* fp = fopen(FILEPATH, "r"); // fopen takes (path, mode)
fseek(fp, 0, SEEK_END);
FILESIZE = ftell(fp);
fseek(fp, 0, SEEK_SET);
fclose(fp);
fd = open(FILEPATH, O_RDONLY);
map = (char *) mmap(0, FILESIZE, PROT_READ, MAP_SHARED, fd, 0);
if (map == MAP_FAILED) { perror("mmap"); return 1; } // an unchecked failed mapping faults on first access
char z;
stringstream ss;
for (long i = 0; i < FILESIZE; ++i) // indexing FILESIZE itself would read past the end of the mapping
{
z = map[i];
if (z != '\n')
{
ss << z;
}
else
{
// c style tokenizing
ss.str("");
}
}
if (munmap(map, FILESIZE) == -1) perror("Error un-mmapping the file");
close(fd);

The data files are usually around 10 gigabytes.
...
Are the switch cases slowing it down?
Almost certainly not; it smells like you're I/O bound. But you should consider measuring it. Modern CPUs have performance counters which are pretty easy to leverage with the right tools. Let's start by partitioning the problem into some major domains: I/O to devices, load/store to memory, and CPU. You can place markers in your code where you read a clock in order to understand how long each of those operations takes. On Linux you can use clock_gettime() or the rdtsc instruction to access a clock with higher precision than the OS tick.
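As an illustration, a minimal clock_gettime() harness might look like this (Linux/POSIX; the measured region and the output format are placeholders):
#include <ctime> // clock_gettime, CLOCK_MONOTONIC (POSIX)
#include <cstdio>
int main()
{
timespec t0, t1;
clock_gettime(CLOCK_MONOTONIC, &t0);
// ... the region to measure, e.g. only the getline calls,
// or only the parsing, or only the writes ...
clock_gettime(CLOCK_MONOTONIC, &t1);
// convert the two readings to elapsed seconds
double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
std::printf("region took %.6f s\n", secs);
return 0;
}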
Consider mmap/CreateFileMapping, either of which might provide better efficiency/throughput to the pages you're accessing.
Consider large/huge pages if streaming through large amounts of data which has already been paged in.
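A sketch of those hints (Linux-specific, and only an assumption that your kernel supports them; MADV_HUGEPAGE historically applies mainly to anonymous mappings):
#include <stdio.h>
#include <sys/mman.h>
/* Hint the kernel about a mapping we are about to stream through. */
static void advise_sequential(void *addr, size_t len)
{
/* aggressive read-ahead, and pages behind us may be dropped sooner */
if (madvise(addr, len, MADV_SEQUENTIAL) == -1)
perror("madvise(MADV_SEQUENTIAL)");
#ifdef MADV_HUGEPAGE
/* transparent huge pages can cut TLB pressure on large mappings */
if (madvise(addr, len, MADV_HUGEPAGE) == -1)
perror("madvise(MADV_HUGEPAGE)");
#endif
}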
From the manual for mmap():
Description
mmap() creates a new mapping in the virtual address space of the
calling process. The starting address for the new mapping is specified
in addr. The length argument specifies the length of the mapping.
Here's an mmap() example:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#define FILEPATH "/tmp/mmapped.bin"
#define NUMINTS (1000)
#define FILESIZE (NUMINTS * sizeof(int))
int main(int argc, char *argv[])
{
int i;
int fd;
int *map; /* mmapped array of int's */
fd = open(FILEPATH, O_RDONLY);
if (fd == -1) {
perror("Error opening file for reading");
exit(EXIT_FAILURE);
}
map = mmap(0, FILESIZE, PROT_READ, MAP_SHARED, fd, 0);
if (map == MAP_FAILED) {
close(fd);
perror("Error mmapping the file");
exit(EXIT_FAILURE);
}
/* Read the file int-by-int from the mmap
*/
for (i = 0; i < NUMINTS; ++i) { /* 0-based; indexing NUMINTS itself would read past the mapping */
printf("%d: %d\n", i, map[i]);
}
if (munmap(map, FILESIZE) == -1) {
perror("Error un-mmapping the file");
}
close(fd);
return 0;
}

Related

What are the fastest methods to read from a file in standard C++?

I am currently writing a program in C++ which includes reading lots of large text files. Each has ~400,000 lines, with, in extreme cases, 4000 or more characters per line. Just for testing, I read one of the files using ifstream and the implementation offered by cplusplus.com. It took around 60 seconds, which is far too long. Now I was wondering: is there a straightforward way to improve reading speed?
edit:
The code I am using is more or less this:
string tmpString;
ifstream txtFile(path);
if(txtFile.is_open())
{
while(getline(txtFile, tmpString))
{
m_numLines++;
}
txtFile.close();
}
edit 2: The file I read is only 82 MB. I mainly mentioned that lines could reach 4000 characters because I thought that might be necessary to know for buffering.
edit 3: Thank you all for your answers, but it seems like there is not much room for improvement given my problem. I have to use getline, since I want to count the number of lines. Opening the ifstream in binary mode didn't make reading any faster either. I will try to parallelize it as much as I can; that should work at least.
edit 4: So apparently there are some things I can do. A big thank-you to sehe for putting so much time into this; I appreciate it a lot! =)
Updates: Be sure to check the (surprising) updates below the initial answer
Memory mapped files have served me well [1]:
#include <boost/iostreams/device/mapped_file.hpp> // for mmap
#include <algorithm> // for std::find
#include <iostream> // for std::cout
#include <cstring>
int main()
{
boost::iostreams::mapped_file mmap("input.txt", boost::iostreams::mapped_file::readonly);
auto f = mmap.const_data();
auto l = f + mmap.size();
uintmax_t m_numLines = 0;
while (f && f!=l)
if ((f = static_cast<const char*>(memchr(f, '\n', l-f))))
m_numLines++, f++;
std::cout << "m_numLines = " << m_numLines << "\n";
}
This should be rather quick.
Update
In case it helps you test this approach, here's a version using mmap directly instead of using Boost: see it live on Coliru
#include <algorithm>
#include <iostream>
#include <cstring>
#include <cstdlib> // for exit() in handle_error
// for mmap:
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
const char* map_file(const char* fname, size_t& length);
int main()
{
size_t length;
auto f = map_file("test.cpp", length);
auto l = f + length;
uintmax_t m_numLines = 0;
while (f && f!=l)
if ((f = static_cast<const char*>(memchr(f, '\n', l-f))))
m_numLines++, f++;
std::cout << "m_numLines = " << m_numLines << "\n";
}
void handle_error(const char* msg) {
perror(msg);
exit(255);
}
const char* map_file(const char* fname, size_t& length)
{
int fd = open(fname, O_RDONLY);
if (fd == -1)
handle_error("open");
// obtain file size
struct stat sb;
if (fstat(fd, &sb) == -1)
handle_error("fstat");
length = sb.st_size;
const char* addr = static_cast<const char*>(mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0u));
if (addr == MAP_FAILED)
handle_error("mmap");
// TODO close fd at some point in time, call munmap(...)
return addr;
}
Update
The last bit of performance I could squeeze out of this I found by looking at the source of GNU coreutils' wc. To my surprise, the following (greatly simplified) code adapted from wc runs in about 84% of the time taken by the memory-mapped version above:
static uintmax_t wc(char const *fname)
{
static const auto BUFFER_SIZE = 16*1024;
int fd = open(fname, O_RDONLY);
if(fd == -1)
handle_error("open");
/* Advise the kernel of our access pattern. */
posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
char buf[BUFFER_SIZE + 1];
uintmax_t lines = 0;
while(size_t bytes_read = read(fd, buf, BUFFER_SIZE))
{
if(bytes_read == (size_t)-1)
handle_error("read failed");
if (!bytes_read)
break;
for(char *p = buf; (p = (char*) memchr(p, '\n', (buf + bytes_read) - p)); ++p)
++lines;
}
return lines;
}
[1] See e.g. the benchmark here: How to parse space-separated floats in C++ quickly?
4000 * 400,000 = 1.6 GB. If your hard drive isn't an SSD, you're likely getting ~100 MB/s sequential read. That's 16 seconds just in I/O.
Since you don't elaborate on the specific code you're using, or on how you need to parse these files (do you need to read them line by line? does the system have a lot of RAM, so you could read the whole file into a large buffer and then parse it?), there's little concrete advice to give for speeding up the process.
Memory mapped files won't offer any performance improvement when reading a file sequentially. Perhaps manually parsing large chunks for newlines rather than using getline would offer an improvement.
EDIT: After doing some learning (thanks @sehe), here's the memory-mapped solution I would likely use.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <errno.h>
int main() {
const char* fName = "big.txt";
//
struct stat sb;
long cntr = 0;
int fd, lineLen;
char *data;
char *line;
// map the file
fd = open(fName, O_RDONLY);
fstat(fd, &sb);
//// int pageSize;
//// pageSize = getpagesize();
//// data = mmap((caddr_t)0, pageSize, PROT_READ, MAP_PRIVATE, fd, pageSize);
data = mmap((caddr_t)0, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
line = data;
// get lines
while(cntr < sb.st_size) {
lineLen = 0;
line = data;
// find the next line
while(*data != '\n' && cntr < sb.st_size) {
data++;
cntr++;
lineLen++;
}
/***** PROCESS LINE *****/
// ... processLine(line, lineLen);
// step over the '\n' so the outer loop advances to the next line
if(cntr < sb.st_size) {
data++;
cntr++;
}
}
return 0;
}
@Neil Kirk: unfortunately I cannot reply to your comment (not enough reputation), but I did a performance test on ifstream and stringstream, and the performance reading a text file line by line is exactly the same.
std::stringstream stream;
std::string line;
while(std::getline(stream, line)) {
}
This takes 1426ms on a 106MB file.
std::ifstream stream;
std::string line;
while(stream.good()) {
getline(stream, line);
}
This takes 1433ms on the same file.
The following code, instead, is faster:
const int MAX_LENGTH = 524288;
char* line = new char[MAX_LENGTH];
while (iStream.getline(line, MAX_LENGTH) && strlen(line) > 0) {
}
This takes 884ms on the same file.
It is just a little tricky since you have to set the maximum size of your buffer (i.e. maximum length for each line in the input file).
As someone with a little background in competitive programming, I can tell you: at least for simple things like integer parsing, the main cost in C is locking the file streams (which is done by default for multi-threading). Use the unlocked_stdio versions instead (fgetc_unlocked(), fread_unlocked()). For C++, the common lore is to use std::ios::sync_with_stdio(false), but I don't know if it's as fast as unlocked_stdio.
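A minimal sketch of that incantation (the line count is just a stand-in workload):
#include <cstdint>
#include <iostream>
#include <string>
int main()
{
// decouple iostreams from C stdio and untie cin from cout;
// both are cheap and often help when iostream input is the bottleneck
std::ios::sync_with_stdio(false);
std::cin.tie(nullptr);
std::string line;
std::uintmax_t n = 0;
while (std::getline(std::cin, line))
++n;
std::cout << n << " lines\n";
return 0;
}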
For reference, here is my standard integer parsing code. It's a lot faster than scanf, as I said, mainly due to not locking the stream. For me it was as fast as the best hand-coded mmap or custom buffered versions I'd used previously, without the insane maintenance debt.
int readint(void)
{
int n, c;
n = getchar_unlocked() - '0';
while ((c = getchar_unlocked()) > ' ')
n = 10*n + c-'0';
return n;
}
(Note: This one only works if there is precisely one non-digit character between any two integers).
And of course avoid memory allocation if possible...
Do you have to read all files at the same time? (at the start of your application for example)
If you do, consider parallelizing the operation.
Either way, consider using binary streams, or unbuffered reads of blocks of data.
Use random file access or binary mode. For sequential access this matters a lot, but it still depends on what you are reading; a sketch of block-wise binary reading follows below.
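Here is a sketch of the block-wise idea (file name and chunk size are arbitrary): open in binary mode, pull large chunks, and scan them yourself:
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iostream>
#include <vector>
int main()
{
std::ifstream in("input.txt", std::ios::binary); // hypothetical input file
std::vector<char> buf(1 << 20); // 1 MiB per chunk
std::uintmax_t lines = 0;
while (in.read(buf.data(), buf.size()) || in.gcount() > 0)
{
// gcount() is how many bytes we actually got; the final
// chunk is usually short, so scan only that many bytes
const char* p = buf.data();
const char* end = p + in.gcount();
while ((p = static_cast<const char*>(memchr(p, '\n', end - p))))
++lines, ++p;
}
std::cout << lines << " lines\n";
return 0;
}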

Cant copy the whole text file to char array

I am trying to copy a whole text file into a char array using fstream, but even after increasing the size of the array it only reads the file up to the same limit. I am bound to save it in a char array, and it would be good if it is not a dynamic one. Any solution please?
// smallGrams.cpp : Defines the entry point for the console application.
//
//#include "stdafx.h"
#include<iostream>
using namespace std;
#include<string>
#include<fstream>
void readInput(const char* Path);
void removePunctucationMarks();
void removeSpacing();
void insertDots();
char * getText();
void generateUnigrams();
void generateBigrams();
void generateTrigrams();
double validateSentance(string str);
string sentenceCreation(int position);
int main()
{
const char *path="alice.txt";
readInput(path);
return 0;
}
void readInput(const char* Path)
{
ifstream infile;
infile.open(Path);
if(!infile.fail())
cout<<"File opened successfully"<<endl;
else
cout<<"File failed to open"<<endl;
int arrSize=100000000;
char *arr=new char[arrSize];
int i=0;
while(!infile.eof()&&i<arrSize)
{
infile.get(arr[i]);
i++;
}
arr[i-1]='\0';
for(short i=0;i<arrSize&&arr[i]!='\0';i++)
{
cout<<arr[i];
}
}
This is a C-style solution that works. It checks the file size, then allocates the necessary memory for the array and reads the whole content of the file in one call. The fread() call returns the number of elements read; if that differs from what you requested, an error occurred (check the fread() reference).
# include <cstring>
# include <cstdlib>
# include <cstdio>
int main(int argc, char *argv[]) {
char *data;
int data_len;
FILE *fd;
fd = fopen ("file.txt", "rb"); /* binary mode, so ftell matches what fread returns */
if (fd == NULL) {
// error
return -1;
}
fseek (fd , 0 , SEEK_END);
data_len = ftell (fd);
rewind (fd);
data = (char *) malloc ((data_len + 1) * sizeof (char));
memset (data, 0, data_len + 1); /* zero-fill so the buffer ends up NUL-terminated */
if (fread (data, sizeof (char), data_len, fd) != data_len) {
// error
return -1;
}
printf ("%s\n", data);
fclose (fd);
free (data);
return 0;
}
Here is a version with a simple doubling method:
#include<iostream>
#include<string>
#include<fstream>
#include <cstdint>
#include <cstring>
using namespace std;
void readInput(const char* Path)
{
ifstream infile;
infile.open(Path);
if(!infile.fail())
cout<<"File opened successfully"<<endl;
else{
cout<<"File failed to open"<<endl;
return;
}
int capacity=1000;
char *arr=new char[capacity];
char *temp;
int i=0;
while(infile.get(arr[i])) // get() keeps whitespace; operator>> would skip it
{
i++;
if ( i >= capacity ) {
temp = new char[capacity*2];
std::memcpy(temp , arr, capacity);
delete [] arr;
arr = temp;
capacity *=2;
}
}
}
int main()
{
const char *path="alice.txt";
readInput(path);
return 0;
}
The error is likely in how you read and display the array contents with the for loop, not in reading the data from the file. Use int instead of short in the for loop, as a short can only count up to 32,767.

getline() function error in c++ code

Can someone tell me what I am doing wrong here? I am getting an error saying getline() is not declared in this scope. Any help would be appreciated.
no matching function for call to getline(char**, size_t*, FILE*&)
#include<iostream>
#include<fstream>
#include<string>
using namespace std;
char *s;
int main(int argc, char *argv[])
{
FILE* fd = fopen("input.txt", "r");
if(fd == NULL)
{
fputs("Unable to open input.txt\n", stderr);
exit(EXIT_FAILURE);
}
size_t length = 0;
ssize_t read;
const char* backup;
while ((read = getline(&s, &length, fd) ) > 0)
{
backup = s;
if (A() && *s == '\n')
{
printf("%sis in the language\n", backup);
}
else
{
fprintf(stderr, "%sis not in the language\n", backup);
}
}
fclose(fd);
return 0;
}
You'll need to use C++-style code in order to use getline in a cross-platform way.
#include <fstream>
#include <string>
using namespace std;
std::string s;
bool A() { return true; }
int main(int argc, char *argv[])
{
ifstream myfile("input.txt");
if(!myfile.is_open())
{
fprintf(stderr, "Unable to open input.txt\n");
return 1;
}
std::string backup;
while (getline(myfile, s))
{
backup = s;
if (A() && s.empty()) // std::getline strips the '\n', so a blank line arrives as an empty string
{
printf("%s is in the language\n", backup.c_str());
}
else
{
fprintf(stderr, "%s is not in the language\n", backup.c_str());
}
}
return 0;
}
What are you trying to do with getline(&s, &length, fd)? Are you trying to use the C getline?
Assuming you have opened the file correctly, in C++ your getline should look something like this: getline(inputStream, variableToReadInto, optionalDelimiter).
You didn't include <stdio.h> but you did include <fstream>. Maybe use ifstream fd("input.txt");
What's A()?
If you ARE trying to use the C getline, the using namespace std may be interfering.
Why are you using printf and fprintf instead of cout << xxxxxx and fd << xxxxxx?
You seem to be a bit confused with various getline function signatures.
The standard C++ std::getline signature is
template< class CharT, class Traits, class Allocator >
std::basic_istream<CharT,Traits>& getline( std::basic_istream<CharT,Traits>& input,
std::basic_string<CharT,Traits,Allocator>& str,
CharT delim );
It takes an input stream object, a string and a character delimiter (there's an overload without the delimiter too).
The POSIX getline signature is
ssize_t getline(char **lineptr, size_t *n, FILE *stream);
and getdelim is the same with an extra explicit delimiter argument.
Now in your code you're passing arguments as if calling the POSIX version. If you want to use the standard one, you'll have to change the arguments (i.e. pass an istream object instead of a FILE*). I don't know if the POSIX one is even available to you, since POSIX is separate from the C++ standard.
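If the POSIX version is what you actually want (glibc exposes it with _POSIX_C_SOURCE >= 200809L), the usual pattern looks like this; a sketch from the manual, with a placeholder file name:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp = fopen("input.txt", "r");
if (fp == NULL)
return EXIT_FAILURE;
char *lineptr = NULL; /* getline allocates and grows this for us */
size_t n = 0;
ssize_t len;
while ((len = getline(&lineptr, &n, fp)) != -1)
printf("%zd bytes: %s", len, lineptr); /* the line keeps its '\n' */
free(lineptr); /* one free after the loop, not one per line */
fclose(fp);
return 0;
}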
Note that fputs, FILE*, and fprintf are C file-handling facilities, not the C++ ones.

Trouble with C++ file I/O

Noobie Alert.
Ugh. I'm having some real trouble getting some basic file I/O stuff done using <stdio.h> or <fstream>. They both seem so clunky and non-intuitive to use. I mean, why couldn't C++ just provide a way to get a char* pointer to the first char in the file? That's all I'd ever want.
I'm doing Project Euler Question 13 and need to play with 50-digit numbers. I have the one hundred numbers stored in the file 13.txt and I'm trying to create a 100x50 array so I can play with the digits of each number directly. But I'm having tons of trouble. I've tried using the C++ <fstream> library and recently straight <stdio.h> to get it done, but something must not be clicking for me. Here's what I have:
#include <iostream>
#include <stdio.h>
int main() {
const unsigned N = 100;
const unsigned D = 50;
unsigned short nums[N][D];
FILE* f = fopen("13.txt", "r");
//error-checking for NULL return
unsigned short *d_ptr = &nums[0][0];
int c = 0;
while ((c = fgetc(f)) != EOF) {
if (c == '\n' || c == '\t' || c == ' ') {
continue;
}
*d_ptr = (short)(c-0x30);
++d_ptr;
}
fclose(f);
//do stuff
return 0;
}
Can someone offer some advice? Perhaps a C++ guy on which I/O library they prefer?
Here's a nice efficient solution (but doesn't work with pipes):
std::vector<char> content;
FILE* f = fopen("13.txt", "r");
// error-checking goes here
fseek(f, 0, SEEK_END);
content.resize(ftell(f));
fseek(f, 0, SEEK_SET); /* rewind to the start (standard C has SEEK_SET, not SEEK_BEGIN) */
fread(&content[0], 1, content.size(), f);
fclose(f);
Here's another:
std::vector<char> content;
struct stat fileinfo;
stat("13.txt", &fileinfo);
// error-checking goes here
content.resize(fileinfo.st_size);
FILE* f = fopen("13.txt", "r");
// error-checking goes here
fread(&content[0], 1, content.size(), f);
// error-checking goes here
fclose(f);
I would use an fstream. The one problem you have is that you obviously can't fit the numbers in the file into any of C++'s native numeric types (double, long long, etc.)
Reading them into strings is pretty easy though:
std::fstream in("13.txt");
std::vector<std::string> numbers((std::istream_iterator<std::string>(in)),
std::istream_iterator<std::string>());
That will read in each number into a string, so the number that was on the first line will be in numbers[0], the second line in numbers[1], and so on.
If you really want to do the job in C, it can still be quite a lot easier than what you have above:
char *dupe(char const *in) {
char *ret;
if (NULL != (ret=malloc(strlen(in)+1)))
strcpy(ret, in);
return ret;
}
// read the data:
char buffer[256];
char *strings[256];
size_t pos = 0;
while (fgets(buffer, sizeof(buffer), stdin))
strings[pos++] = dupe(buffer);
Rather than reading the one hundred 50-digit numbers from a file, why not read them directly in from a character constant?
You could start your code out with:
static const char numbers[] =
"37107287533902102798797998220837590246510135740250"
"46376937677490009712648124896970078050417018260538"...
ending with a semicolon after the last line.
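Assuming the full constant holds one hundred 50-digit numbers back to back (adjacent string literals concatenate into one flat array), digit j of number i is then plain index arithmetic; a hypothetical accessor:
// numbers[] is the flat character constant from the snippet above
inline int digit(int i, int j)
{
return numbers[i * 50 + j] - '0'; // characters '0'..'9' map to 0..9
}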

How to write to a memory buffer with a FILE*?

Is there any way to create a memory buffer as a FILE*? TiXml can print the XML to a FILE*, but I can't seem to make it print to a memory buffer.
There is a POSIX way to use memory as a FILE*: fmemopen() or open_memstream(), depending on the semantics you want; see: Difference between fmemopen and open_memstream.
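For instance, a minimal fmemopen() sketch (POSIX.1-2008; buffer size and contents here are arbitrary):
#include <stdio.h>
int main(void)
{
char buf[256];
FILE *f = fmemopen(buf, sizeof buf, "w");
if (f == NULL) { perror("fmemopen"); return 1; }
fprintf(f, "<root><item/></root>");
fflush(f); /* on flush/close a NUL is appended after the data, space permitting */
printf("buffer now holds: %s\n", buf);
fclose(f);
return 0;
}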
I guess the proper answer is the one by Kevin. But here is a hack to do it with FILE*. Note that if the buffer size (here 100000) is too small, you lose data, because it is flushed out to /dev/null once the buffer fills. Likewise, if the program calls fflush(), you lose the data.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
FILE *f = fopen("/dev/null", "w");
int i;
int written = 0;
char *buf = malloc(100000);
setbuffer(f, buf, 100000);
for (i = 0; i < 1000; i++)
{
written += fprintf(f, "Number %d\n", i);
}
for (i = 0; i < written; i++) {
printf("%c", buf[i]);
}
}
fmemopen can create FILE from buffer, does it make any sense to you?
I wrote a simple example of how I would create an in-memory FILE:
#include <unistd.h>
#include <stdio.h>
int main(){
int p[2]; pipe(p); FILE *f = fdopen( p[1], "w" );
if( !fork() ){
fprintf( f, "working" );
return 0;
}
fclose(f); close(p[1]);
char buff[100]; int len;
while( (len=read(p[0], buff, 100))>0 )
printf(" from child: '%*s'", len, buff );
puts("");
}
C++ basic_streambuf inheritance
In C++, you should avoid FILE* if you can.
Using only the C++ stdlib, it is possible to make a single interface that transparently uses file or memory IO.
This uses techniques mentioned at: Setting the internal buffer used by a standard stream (pubsetbuf)
#include <cassert>
#include <cstring>
#include <fstream>
#include <iostream>
#include <ostream>
#include <sstream>
/* This can write either to files or memory. */
void write(std::ostream& os) {
os << "abc";
}
template <typename char_type>
struct ostreambuf : public std::basic_streambuf<char_type, std::char_traits<char_type> > {
ostreambuf(char_type* buffer, std::streamsize bufferLength) {
this->setp(buffer, buffer + bufferLength);
}
};
int main() {
/* To memory, in our own externally supplied buffer. */
{
char c[3];
ostreambuf<char> buf(c, sizeof(c));
std::ostream s(&buf);
write(s);
assert(memcmp(c, "abc", sizeof(c)) == 0);
}
/* To memory, but in a hidden buffer. */
{
std::stringstream s;
write(s);
assert(s.str() == "abc");
}
/* To file. */
{
std::ofstream s("a.tmp");
write(s);
s.close();
}
/* I think this is implementation defined.
* pubsetbuf calls basic_filebuf::setbuf(). */
{
char c[3];
std::ofstream s;
s.rdbuf()->pubsetbuf(c, sizeof c);
write(s);
s.close();
//assert(memcmp(c, "abc", sizeof(c)) == 0);
}
}
Unfortunately, it does not seem possible to interchange FILE* and fstream: Getting a FILE* from a std::fstream
You could use the CStr method of TiXmlPrinter, about which the documentation states:
The TiXmlPrinter is useful when you need to:
Print to memory (especially in non-STL mode)
Control formatting (line endings, etc.)
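A minimal sketch, assuming a loaded TiXmlDocument named doc (TiXmlPrinter ships with TinyXML; the header name may vary with your install):
#include "tinyxml.h"
#include <cstdio>
void xml_to_memory(TiXmlDocument& doc)
{
TiXmlPrinter printer;
doc.Accept(&printer); // walk the document into the printer
const char* xml = printer.CStr(); // NUL-terminated XML held in memory
std::printf("%s", xml);
}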
https://github.com/Snaipe/fmem is a wrapper for different platform/version-specific implementations of memory streams.
It tries the following implementations in sequence:
open_memstream.
fopencookie, with growing dynamic buffer.
funopen, with growing dynamic buffer.
WinAPI temporary memory-backed file.
When no other means is available, fmem falls back to tmpfile().