I just wrote the following C++ function to programmatically determine how much RAM a system has installed. It works, but it seems to me that there should be a simpler way to do this. Am I missing something?
std::string getRAM()
{
FILE* stream = popen("head -n1 /proc/meminfo", "r");
std::ostringstream output;
int bufsize = 128;
while( !feof(stream) && !ferror(stream))
{
char buf[bufsize];
int bytesRead = fread(buf, 1, bufsize, stream);
output.write(buf, bytesRead);
}
std::string result = output.str();
std::string label, ram;
std::istringstream iss(result);
iss >> label;
iss >> ram;
return ram;
}
First, I'm using popen("head -n1 /proc/meminfo") to get the first line of the meminfo file from the system. The output of that command looks like
MemTotal: 775280 kB
Once I've got that output in an istringstream, it's simple to tokenize it to get at the information I want. Is there a simpler way to read in the output of this command? Is there a standard C++ library call to read in the amount of system RAM?
On Linux, you can use the function sysinfo which sets values in the following struct:
#include <sys/sysinfo.h>
int sysinfo(struct sysinfo *info);
struct sysinfo {
long uptime; /* Seconds since boot */
unsigned long loads[3]; /* 1, 5, and 15 minute load averages */
unsigned long totalram; /* Total usable main memory size */
unsigned long freeram; /* Available memory size */
unsigned long sharedram; /* Amount of shared memory */
unsigned long bufferram; /* Memory used by buffers */
unsigned long totalswap; /* Total swap space size */
unsigned long freeswap; /* swap space still available */
unsigned short procs; /* Number of current processes */
unsigned long totalhigh; /* Total high memory size */
unsigned long freehigh; /* Available high memory size */
unsigned int mem_unit; /* Memory unit size in bytes */
char _f[20-2*sizeof(long)-sizeof(int)]; /* Padding for libc5 */
};
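As a minimal sketch of how the struct above might be used (the helper name total_ram_bytes is an assumption for illustration, not part of any library), the total is totalram multiplied by mem_unit:

```cpp
#include <sys/sysinfo.h>  // Linux-specific
#include <cstdint>

// Hypothetical helper: returns total usable RAM in bytes, or 0 if sysinfo() fails.
std::uint64_t total_ram_bytes() {
    struct sysinfo info;
    if (sysinfo(&info) != 0) {
        return 0;
    }
    // totalram is expressed in units of mem_unit bytes.
    return static_cast<std::uint64_t>(info.totalram) * info.mem_unit;
}
```

Dividing the result by 1024 gives a kB figure that should be close to the MemTotal line in /proc/meminfo.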
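If you would rather parse /proc/meminfo yourself (I would stick to sysinfo), I recommend taking a C++ approach using std::ifstream and std::string:
#include <fstream>
#include <limits>
#include <string>
// Returns MemTotal from /proc/meminfo in kB, or 0 if it cannot be read.
unsigned long get_mem_total() {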
std::string token;
std::ifstream file("/proc/meminfo");
while(file >> token) {
if(token == "MemTotal:") {
unsigned long mem;
if(file >> mem) {
return mem;
} else {
return 0;
}
}
// Ignore the rest of the line
file.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}
return 0; // Nothing found
}
There isn't any need to use popen(). You can just read the file yourself.
Also, if the first line isn't what you're looking for, you'll fail, since head -n1 only reads the first line and then exits. I'm not sure why you're mixing C and C++ I/O like that; it's perfectly OK, but you should probably opt to go all C or all C++. I'd probably do it something like this:
int GetRamInKB(void)
{
FILE *meminfo = fopen("/proc/meminfo", "r");
if(meminfo == NULL)
... // handle error
char line[256];
while(fgets(line, sizeof(line), meminfo))
{
int ram;
if(sscanf(line, "MemTotal: %d kB", &ram) == 1)
{
fclose(meminfo);
return ram;
}
}
// If we got here, then we couldn't find the proper line in the meminfo file:
// do something appropriate like return an error code, throw an exception, etc.
fclose(meminfo);
return -1;
}
Remember /proc/meminfo is just a file. Open the file, read the first line, and close the file. Voilà!
Even top (from procps) parses /proc/meminfo.
I have a program where I need to operate on different types of files.
I want the input and output files of the following program to be the same.
#include<iostream>
#include<string>
#include<fstream>
#include<sstream>
typedef unsigned char u8;
using namespace std;
char* readFileBytes(string name)
{
ifstream fl(name);
fl.seekg( 0, ios::end );
size_t len = fl.tellg();
char *ret = new char[len];
fl.seekg(0, ios::beg);
fl.read(ret, len);
fl.close();
return ret;
}
int main(int argc, char *argv[]){
string name = "file.pdf";
u8* file = (u8*) readFileBytes(name);
// cout<<str<<endl;
int len = 0;
while(file[len] != '\0')
len++;
cout<<"FILESIZE : "<<len<<endl;
string filename = "file2.pdf";
ofstream outfile(filename,ios::out | ios::binary);
outfile.write((char*) file,len);
outfile.close();
exit(0);
}
The difference between the output and input files is checked using diff
diff file.pdf file2.pdf
What should I do to make file2.pdf the same as file.pdf?
I have tried using xxd to change the binary into hexadecimal, but the disadvantage is that the overall size doubles, so I want to operate in binary only.
size_t len = fl.tellg();
char *ret = new char[len];
In this manner the shown code determines the number of characters in the file. This is fine. The only problem with it is that after this number of characters is read, this very important information is completely forgotten and thrown away. This function returns only this ret pointer, and the actual number of characters in it is now an unsolvable mystery.
But then, main() attempts to solve this mystery as follows:
int len = 0;
while(file[len] != '\0')
len++;
This attempts to reverse-engineer the number of characters by looking for the first 0 byte in the buffer.
Which has absolutely nothing to do with anything. The first character in the file may be a 0 byte, so this will calculate that the file is empty, and not ten gigabytes in size.
Or the file can contain just the string "Hello world" with no 0 byte at all, which this while loop will happily blow past, then start rooting around in some random memory after this buffer, resulting in undefined behavior.
That's the fatal logical flaw in the shown code: the actual size of the file is thrown away, and instead reverse-engineered in a flawed way.
You will need to rework the code so that the number of characters in the file, the original len, is also returned to main(), and it uses that, instead of attempting to guess what it originally was.
P.S. delete[]-ing the ret buffer, after you're done with it, would also be a good idea. An even better idea is to avoid using new and use std::vector instead, which will happily give you its size() any time you ask for it, and you won't have to worry about deleting the allocated memory.
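The vector suggestion could be sketched roughly as follows. The names read_file_bytes and write_file_bytes are hypothetical, and the sketch assumes the whole file fits in memory:

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical replacement for readFileBytes(): the vector carries its own size,
// so embedded 0 bytes are preserved and nothing has to be guessed later.
std::vector<char> read_file_bytes(const std::string& name) {
    std::ifstream in(name, std::ios::in | std::ios::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Writes the bytes back out unchanged; returns false if the write failed.
bool write_file_bytes(const std::string& name, const std::vector<char>& bytes) {
    std::ofstream out(name, std::ios::out | std::ios::binary);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

With these, main() would call read_file_bytes("file.pdf"), print size(), and pass the same vector to write_file_bytes("file2.pdf", ...).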
In order to correctly process binary data, the size must be stored and cannot be computed from a sentinel null byte, because null bytes can be legitimate bytes in a binary file. So you should return the read length in addition to the buffer, or even better copy each buffer to the new file until you have exhausted the input file:
int main(int argc, char *argv[]){
    constexpr size_t sz = 10240;  // size of buffer
    char buffer[sz];
    string name = "file.pdf";
    string filename = "file2.pdf";
    ifstream fl(name, ios::in | ios::binary);
    ofstream outfile(filename, ios::out | ios::binary);
    size_t len = 0;
    for (;;) {
        fl.read(buffer, sz);              // read() returns the stream, not a count
        streamsize buflen = fl.gcount();  // number of bytes actually read
        if (buflen == 0) break;           // reached EOF
        len += static_cast<size_t>(buflen);
        if (!outfile.write(buffer, buflen)) {
            // display an error message
            return 1;
        }
    }
    fl.close();
    outfile.close();
    cout<<"FILESIZE : "<<len<<endl;
    return 0;
}
I'm trying to read a large dataset, format it the way I need, and then write it to another file. I'm trying to use C++ over SAS or STATA for the speed advantage. The data files are usually around 10 gigabytes. My current code takes over an hour to run (and then I kill it because I'm sure that something is very inefficient with my code).
Is there a more efficient way to do this? Maybe read the file into memory and then analyze it using the switch statements? (I have 32gb ram linux 64bit). Is it possible that reading, and then writing within the loop slows it down since it is constantly reading, then writing? I tried to read it from one drive, and then write to another in an attempt to speed this up.
Are the switch cases slowing it down?
The process I have now reads the data using getline, uses the switch statement to parse it correctly, and then writes it to my outfile. And repeats for 300 million lines. There are about 10 more cases in the switch statement, but I didn't copy for brevity's sake.
The code is probably very ugly all being in the main function, but I wanted to get it working before I worked on attractiveness.
I've tried using read() but without any success. Please let me know if I need to clarify anything.
Thank you for the help!
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <stdio.h>
//#include <cstring>
//#include <boost/algorithm/string.hpp>
#include <vector>
using namespace std;
//using namespace boost;
struct dataline
{
char type[0];
double second;
short mill;
char event[1];
char ticker[6];
char marketCategory[1];
char financialStatus[1];
int roundLotSize;
short roundLotOnly;
char tradingState[1];
char reserved[1];
char reason[4];
char mpid[4];
char primaryMarketMaker[1];
char primaryMarketMode[1];
char marketParticipantState[1];
unsigned long orderNumber;
char buySell[0];
double shares;
float price;
int executedShares;
double matchNumber;
char printable[1];
double executionPrice;
int canceledShares;
double sharesBig;
double crossPrice;
char crossType[0];
double pairedShares;
double imbalanceShares;
char imbalanceDirection[1];
double fairPrice;
double nearPrice;
double currentReferencePrice;
char priceVariationIndicator[1];
};
int main ()
{
string a;
string b;
string c;
string d;
string e;
string f;
string g;
string h;
string k;
string l;
string times;
string smalltimes;
short time; //counter to keep second filled
short smalltime; //counter to keep millisecond filled
double N;
double NN;
double NNN;
int length;
char M;
//vector<> fout;
string line;
ofstream fout ("/media/3tb/test.txt");
ifstream myfile;
myfile.open("S050508-v3.txt");
dataline oneline;
if (myfile.is_open())
{
while ( myfile.good() )
{
getline (myfile,line);
// cout << line<<endl;;
a=line.substr(0,1);
stringstream ss(a);
char type;
ss>>type;
switch (type)
{
case 'T':
{
if (type == 'T')
{
times=line.substr(1,5);
stringstream s(times);
s>>time;
//oneline.second=time;
//oneline.second;
//cout<<time<<endl;
}
else
{
time=time;
}
break;
}
case 'M':
{
if (type == 'M')
{
smalltimes=line.substr(1,3);
stringstream ss(smalltimes);
ss>>smalltime; //oneline.mill;
// cout<<smalltime<<endl; //smalltime=oneline.mill;
}
else
{
smalltime=smalltime;
}
break;
}
case 'R':
{
oneline.second=time;
oneline.mill=smalltime;
a=line.substr(0,1);
stringstream ss(a);
ss>>oneline.type;
b=line.substr(1,6);
stringstream sss(b);
sss>>oneline.ticker;
c=line.substr(7,1);
stringstream ssss(c);
ssss>>oneline.marketCategory;
d=line.substr(8,1);
stringstream sssss(d);
sssss>>oneline.financialStatus;
e=line.substr(9,6);
stringstream ssssss(e);
ssssss>>oneline.roundLotSize;
f=line.substr(15,1);
stringstream sssssss(f);
sssssss>>oneline.roundLotOnly;
*oneline.tradingState=0;
*oneline.reserved=0;
*oneline.reason=0;
*oneline.mpid=0;
*oneline.primaryMarketMaker=0;
*oneline.primaryMarketMode=0;
*oneline.marketParticipantState=0;
oneline.orderNumber=0;
*oneline.buySell=0;
oneline.shares=0;
oneline.price=0;
oneline.executedShares=0;
oneline.matchNumber=0;
*oneline.printable=0;
oneline.executionPrice=0;
oneline.canceledShares=0;
oneline.sharesBig=0;
oneline.crossPrice=0;
*oneline.crossType=0;
oneline.pairedShares=0;
oneline.imbalanceShares=0;
*oneline.imbalanceDirection=0;
oneline.fairPrice=0;
oneline.nearPrice=0;
oneline.currentReferencePrice=0;
*oneline.priceVariationIndicator=0;
break;
}//End Case
}//End Switch
}//end While
myfile.close();
}//End If
else cout << "Unable to open file";
cout<<"Junk"<<endl;
return 0;
}
UPDATE So I've been trying to use memory map, but now I'm getting a segmentation fault.
I've been trying to follow different examples to piece together something that would work for mine. Why would I be getting a segmentation fault? I've taken the first part of my code, which looks like this:
int main (int argc, char** path)
{
long i;
int fd;
char *map;
char *FILEPATH = path;
unsigned long FILESIZE;
FILE* fp = fopen(FILEPATH, "/home/brian/Desktop/S050508-v3.txt");
fseek(fp, 0, SEEK_END);
FILESIZE = ftell(fp);
fseek(fp, 0, SEEK_SET);
fclose(fp);
fd = open(FILEPATH, O_RDONLY);
map = (char *) mmap(0, FILESIZE, PROT_READ, MAP_SHARED, fd, 0);
char z;
stringstream ss;
for (long i = 0; i <= FILESIZE; ++i)
{
z = map[i];
if (z != '\n')
{
ss << z;
}
else
{
// c style tokenizing
ss.str("");
}
}
if (munmap(map, FILESIZE) == -1) perror("Error un-mmapping the file");
close(fd);
The data files are usually around 10 gigabytes.
...
Are the switch cases slowing it down?
Almost certainly not; it smells like you're I/O bound. But you should consider measuring it. Modern CPUs have performance counters which are pretty easy to leverage with the right tools. But let's start by partitioning the problem into some major domains: I/O to devices, load/store to memory, CPU. You can place some markers in your code where you read a clock in order to understand how long each of the operations is taking. On Linux you can use clock_gettime() or the rdtsc instruction to access a clock with higher precision than the OS tick.
Consider mmap/CreateFileMapping, either of which might provide better efficiency/throughput to the pages you're accessing.
Consider large/huge pages if streaming through large amounts of data which has already been paged in.
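The clock markers mentioned above might look something like this sketch. It assumes a POSIX system, and now_seconds and PhaseTimer are hypothetical names introduced only for illustration:

```cpp
#include <time.h>  // clock_gettime, CLOCK_MONOTONIC (POSIX)

// Hypothetical helper: seconds on a monotonic clock.
double now_seconds() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<double>(ts.tv_sec) + ts.tv_nsec / 1e9;
}

// Accumulates the elapsed time of one phase (for example read, parse, or write).
struct PhaseTimer {
    double total = 0.0;
    double started = 0.0;
    void start() { started = now_seconds(); }
    void stop() { total += now_seconds() - started; }
};
```

One timer per phase, started before and stopped after each getline, parse, and write step, would show where the hour goes. With 300 million lines the clock calls add overhead of their own, so timing batches of lines rather than every single line is probably a better trade-off.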
From the manual for mmap():
Description
mmap() creates a new mapping in the virtual address space of the
calling process. The starting address for the new mapping is specified
in addr. The length argument specifies the length of the mapping.
Here's an mmap() example:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#define FILEPATH "/tmp/mmapped.bin"
#define NUMINTS (1000)
#define FILESIZE (NUMINTS * sizeof(int))
int main(int argc, char *argv[])
{
int i;
int fd;
int *map; /* mmapped array of int's */
fd = open(FILEPATH, O_RDONLY);
if (fd == -1) {
perror("Error opening file for reading");
exit(EXIT_FAILURE);
}
map = mmap(0, FILESIZE, PROT_READ, MAP_SHARED, fd, 0);
if (map == MAP_FAILED) {
close(fd);
perror("Error mmapping the file");
exit(EXIT_FAILURE);
}
/* Read the file int-by-int from the mmap
*/
for (i = 0; i < NUMINTS; ++i) { /* valid indices are 0..NUMINTS-1 */
printf("%d: %d\n", i, map[i]);
}
if (munmap(map, FILESIZE) == -1) {
perror("Error un-mmapping the file");
}
close(fd);
return 0;
}
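For a text file whose size is not known at compile time, the same idea could be sketched as below. This is an illustration under stated assumptions rather than a drop-in fix: count_lines_mmap is a hypothetical name, the size is taken from fstat(), and the loop only counts newlines where real parsing would go.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: counts '\n' bytes in a file through a read-only mapping.
// Returns -1 on any error.
long count_lines_mmap(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd == -1) return -1;
    struct stat st;
    if (fstat(fd, &st) == -1) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  // mmap rejects a zero length
    std::size_t size = static_cast<std::size_t>(st.st_size);
    void* addr = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED) return -1;
    const char* data = static_cast<const char*>(addr);
    long lines = 0;
    for (std::size_t i = 0; i < size; ++i) {  // strictly less than size
        if (data[i] == '\n') ++lines;
    }
    munmap(addr, size);
    return lines;
}
```

Checking every return value matters here: indexing into a mapping that failed, or reading at an index equal to the file size, is a typical source of segmentation faults.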
Noobie Alert.
Ugh. I'm having some real trouble getting some basic file I/O stuff done using <stdio.h> or <fstream>. They both seem so clunky and non-intuitive to use. I mean, why couldn't C++ just provide a way to get a char* pointer to the first char in the file? That's all I'd ever want.
I'm doing Project Euler Question 13 and need to play with 50-digit numbers. I have the 100 numbers stored in the file 13.txt and I'm trying to create a 100x50 array so I can play with the digits of each number directly. But I'm having tons of trouble. I've tried using the C++ <fstream> library and recently straight <stdio.h> to get it done, but something must not be clicking for me. Here's what I have:
#include <iostream>
#include <stdio.h>
int main() {
const unsigned N = 100;
const unsigned D = 50;
unsigned short nums[N][D];
FILE* f = fopen("13.txt", "r");
//error-checking for NULL return
unsigned short *d_ptr = &nums[0][0];
int c = 0;
while ((c = fgetc(f)) != EOF) {
if (c == '\n' || c == '\t' || c == ' ') {
continue;
}
*d_ptr = (short)(c-0x30);
++d_ptr;
}
fclose(f);
//do stuff
return 0;
}
Can someone offer some advice? Perhaps a C++ guy on which I/O library they prefer?
Here's a nice efficient solution (but doesn't work with pipes):
std::vector<char> content;
FILE* f = fopen("13.txt", "r");
// error-checking goes here
fseek(f, 0, SEEK_END);
content.resize(ftell(f));
fseek(f, 0, SEEK_SET);
fread(&content[0], 1, content.size(), f);
fclose(f);
Here's another:
std::vector<char> content;
struct stat fileinfo;
stat("13.txt", &fileinfo);
// error-checking goes here
content.resize(fileinfo.st_size);
FILE* f = fopen("13.txt", "r");
// error-checking goes here
fread(&content[0], 1, content.size(), f);
// error-checking goes here
fclose(f);
I would use an fstream. The one problem you have is that you obviously can't fit the numbers in the file into any of C++'s native numeric types (double, long long, etc.)
Reading them into strings is pretty easy though:
std::fstream in("13.txt");
std::vector<std::string> numbers((std::istream_iterator<std::string>(in)),
std::istream_iterator<std::string>());
That will read in each number into a string, so the number that was on the first line will be in numbers[0], the second line in numbers[1], and so on.
If you really want to do the job in C, it can still be quite a lot easier than what you have above:
char *dupe(char const *in) {
char *ret;
if (NULL != (ret=malloc(strlen(in)+1)))
strcpy(ret, in);
return ret;
}
// read the data:
char buffer[256];
char *strings[256];
size_t pos = 0;
while (fgets(buffer, sizeof(buffer), stdin))
strings[pos++] = dupe(buffer);
Rather than reading the one hundred 50 digit numbers from a file, why not read them directly in from a character constant?
You could start your code out with:
static const char numbers[] =
"37107287533902102798797998220837590246510135740250"
"46376937677490009712648124896970078050417018260538"...
With a semicolon at the last line.
Is there a function for FILE (fopen?) that allows me to just read one int from a binary file?
So far I'm trying this, but I'm getting some kind of error I can't see cause the program just crashes without telling me.
void opentest()
{
FILE *fp = fopen("dqmapt.mp", "r");
int i = 0;
int j = 0;
int k = 0;
int * buffer;
if (fp)
{
buffer = (int *) (sizeof(int));
i = (int) fread(buffer,1, (sizeof(int)), fp);
fscanf(fp, "%d", &j);
fclose(fp);
}
printf("%d\n", i);
printf("%d\n", j);
}
Now that you have changed your question, let me ask one. What is the format of the file you are trying to read?
For a binary file there are some changes required how you open the file:
/* C way */
FILE *fp = fopen("text.bin", "rb"); /* note the b; this is a compound mode */
/* C++ way */
std::ifstream ifs("text.bin", std::ios::in | std::ios::binary);
Reading in the contents is easy. But remember, your file has 2 integers at the beginning -- width, height which determine how many more to read i.e. another width * height number of integers. So, your best bet is to read the first two integers first. You will need to use two buffers -- one for the width and height and then depending on their value another one to read the rest of the file. So, let's read in the first two integers:
char buf[ 2 * sizeof(int) ]; /* will store width and height */
Read in the two integers:
/* C way */
fread(buf, sizeof(int), 2, fp); /* the syntax changes, FILE pointer is last */
/* C++ way*/
ifs.read(buf, sizeof buf);
Now, the tricky part. You have to convert the stuff to double. This again depends on your system endianness -- whether a simple assignment works or whether a byte swapping is necessary. As another poster has pointed out WriteInt() writes integers in big-endian format. Figure out what system you are on. And then you can proceed further.
FILE is a C datastructure. It is included in C++ for C compatibility. You can do this:
/* The C way */
#include <stdio.h>
#include <stdlib.h>
int main(void) {
FILE *fp = fopen("test.txt", "r");
int i = 0;
if (fp) {
fscanf(fp, "%d", &i);
fclose(fp);
}
printf("%d\n", i);
}
You can use std::ifstream to open a file for reading. You then read the contents with the stream's extraction operators or read() and extract the desired information yourself.
/* The C++ way */
#include <fstream>
#include <iostream>
int main() {
std::ifstream ifs("test.txt");
int i = 0;
if (ifs.good()) {
ifs >> i;
}
std::cout << i << std::endl;
}
Note you can use the C style functions in C++ as well, though this is the least recommended way:
/* The C way in C++ */
#include <cstdio>
#include <cstdlib>
int main() {
using namespace std;
FILE *fp = fopen("test.txt", "r");
int i = 0;
if (fp) {
fscanf(fp, "%d", &i);
fclose(fp);
}
printf("%d\n", i);
}
[Note: All of these examples assume you have a text file to read from]
Do you want to read a textual representation of an int? Then you can use fscanf, it's sort of the opposite of printf
int n;
if( fscanf(filePointer, "%d", &n) == 1 )
// do stuff with n
If you want to read some binary data and treat it as an int, well that's going to depend how it was written in the first place.
I am not a Java programmer, so this is just based on what I've read in the [docs](http://java.sun.com/j2se/1.4.2/docs/api/java/io/DataOutputStream.html#writeInt(int)).
That said, it says
Writes an int to the underlying output stream as four bytes, high byte first. If no exception is thrown, the counter written is incremented by 4.
So it's a big endian four byte integer. I don't know if it's two's complement or not, but that's probably a safe assumption (and can probably be found somewhere in the java docs/spec). Big endian is the same as network byte order, so you can use ntohl to convert it to the endianness of your C++ platform. Beyond that, you just need to read the four bytes, which can be done with fread.
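A rough sketch of that read follows. Instead of ntohl it assembles the value with shifts, which gives the same result on any host and avoids a platform-specific header. The names decode_be32 and read_be32 are hypothetical, and two's complement is assumed:

```cpp
#include <cstdint>
#include <cstdio>

// Assembles a big-endian (high byte first) 32-bit value from four bytes.
std::int32_t decode_be32(const unsigned char bytes[4]) {
    std::uint32_t v = (static_cast<std::uint32_t>(bytes[0]) << 24) |
                      (static_cast<std::uint32_t>(bytes[1]) << 16) |
                      (static_cast<std::uint32_t>(bytes[2]) << 8) |
                      static_cast<std::uint32_t>(bytes[3]);
    return static_cast<std::int32_t>(v);  // assumes two's complement
}

// Reads one int from a stream opened with "rb"; returns false on a short read.
bool read_be32(std::FILE* fp, std::int32_t& out) {
    unsigned char bytes[4];
    if (std::fread(bytes, 1, sizeof bytes, fp) != sizeof bytes) return false;
    out = decode_be32(bytes);
    return true;
}
```

Calling read_be32 twice on a file opened in binary mode would yield the width and height described earlier, provided the file really was written high byte first.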
Int represented as text or binary?
For text, use fscanf; for binary, use fread.