Reading a text file using a relative path - c++

I'm looking to read a text file using a relative path in C++. The directory structure is as follows: source -> Resources -> Settings -> Video.txt.
The contents of the file (note: these are used for testing, of course):
somes*** = 1
mores*** = 2
evenmores*** = 3
According to my research, this is possible, but I have yet to get it working. For example, when I step through with my debugger, my char *line variable, which receives the text file input line by line, always shows a constant value of 8. As I understand it, a char pointer can act as a dynamic array of characters that you may reassign.
Why can't I read my file? When I test if ( !videoSettings ), it evaluates to true, and I get my own error message.
Code
#ifdef WIN32
const char *filePath = "Resources\\Settings\\Video.txt";
#else
const char *filePath = "Resources/Settings/Video.txt";
#endif
std::ifstream videoSettings( filePath );
if ( !videoSettings )
{
cout << "ERROR: Failed opening file " << filePath << ". Switching to configure mode." << endl;
//return false;
}
int count = 0;
char *line;
while( !videoSettings.eof() )
{
videoSettings >> line;
cout << "LOADING: " << *line << "; ";
count = sizeof( line ) / sizeof( char );
cout << "VALUE: " << line[ count - 1 ];
/*
for ( int i = count; i > count; --i )
{
if ( i == count - 4 )
{
}
}
*/
}
delete line;

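Wow, OK. You cannot read a string of text into a bare char *; the pointer has to point at memory you allocated first. As written, line is uninitialized, so videoSettings >> line writes through a garbage pointer, and delete line is undefined behavior as well.
Second, the size of a char * pointer is constant (sizeof( line ) is 8 on a 64-bit build, which is presumably the constant 8 you are seeing), but the size of the data it points to is not.
I suggest using std::getline with a std::string and avoiding the dynamic memory allocation altogether.
So this would be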
std::ifstream in("file.txt");
std::string line;
while(getline(in, line))
{
std::cout << line << std::endl;
}
Lastly, relative paths are the least of your problems in your code example :-)
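To connect the getline loop to the "name = value" file from the question, here is a minimal sketch. The function name parseSettings is made up for illustration, and it assumes every setting is an integer with whitespace around the '='; lines that do not fit that shape are silently skipped.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: parses lines of the form "name = 123".
// Assumes whitespace around '=' and integer values.
std::map<std::string, int> parseSettings(std::istream& in)
{
    std::map<std::string, int> settings;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::string key;
        char equals = 0;
        int value = 0;
        // Extraction fails, and the line is skipped, if any field is missing.
        if (fields >> key >> equals >> value && equals == '=')
        {
            settings[key] = value;
        }
    }
    return settings;
}
```

It would be called as parseSettings(videoSettings) once if ( !videoSettings ) has been ruled out. If the open itself keeps failing, keep in mind that a relative path is resolved against the process's current working directory, which under a debugger is often not the source directory.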

Related

Winapi get filename, weird behavior when put into a function [duplicate]

note: the title may not be relevant to the issue, because I can't tell where the issue comes from.
The issue: I have a piece of code which gets a filename from a directory and appends it to a vector < char* >.
It works fine inline, but if I wrap it in a function, it behaves strangely: the resulting vector element prints a single random character to the console instead of the filename.
I have checked and re-checked everything but can't see why it is happening.
Below is complete runnable code, I compiled it on Windows with cl.exe
namely just cl.exe "a.cpp" /EHsc. x64 native environment and target.
I put together both pieces, for easier testing.
#include <windows.h>
#include <iostream>
#include <vector>
using namespace std;
void getimglist ( const char* mask, vector < char* > & flist )
{
WIN32_FIND_DATA data;
HANDLE hFind;
hFind = FindFirstFile ( mask, & data );
cout << "-fname: " << data.cFileName << "\n";
flist.push_back ( data.cFileName );
FindClose ( hFind );
}
int main (int argc, char* argv[])
{
vector < char* > L ;
vector < char* > L2 ;
const char* mask = ".\\input-cam0\\*.jpg";
const char* mask2 = ".\\input-cam0\\*.jpg";
// this code works
WIN32_FIND_DATA data;
HANDLE hFind;
hFind = FindFirstFile ( mask, & data );
cout << " first file: " << data.cFileName << "\n";
L.push_back ( data.cFileName );
FindClose (hFind);
// ***
cout << " size:" << L.size() << "\n";
cout << " first file L:" << L[0] << "\n";
// this works weird, output is different and wrong
cout << " ** function call **\n";
getimglist ( mask2, L2 );
cout << " size:" << L2.size() << "\n";
cout << " first file: " << L2[0] << "\n";
return 0;
} // end main
Output:
first file: 000-001.jpg
size:1
first file L:000-001.jpg
** function call **
-fname: 000-001.jpg
size:1
first file: R
See the last line ^: here it is R, and if I run the exe file multiple times, it prints a single random character. The code in the main block gives the correct result. Where is the problem?
WIN32_FIND_DATA data is allocated on the stack, and stack memory is released when the variable goes out of scope (in this case, at the end of the function). The pointer you stored in your vector therefore no longer points to a valid object, and when you print it the application happily prints whatever now occupies that memory. The version in main only appears to work because data is still in scope when L[0] is printed.
The simple fix is to copy the string. The preferred way is to change your vectors to vector<std::string>: you can still call flist.push_back ( data.cFileName ), and the compiler will copy the character array into a std::string for you.
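A portable sketch of that fix follows. FakeFindData and appendName are hypothetical stand-ins (for WIN32_FIND_DATA and getimglist) so that the example compiles without windows.h; the point is only that push_back copies the characters before the local struct disappears.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for WIN32_FIND_DATA: only the field used here.
struct FakeFindData
{
    char cFileName[260];
};

// The local struct dies when the function returns, but the vector now
// owns its own copy of the characters, so nothing is left dangling.
void appendName(const char* name, std::vector<std::string>& flist)
{
    FakeFindData data{};  // zero-initialized, so the array stays terminated
    std::strncpy(data.cFileName, name, sizeof data.cFileName - 1);
    flist.push_back(data.cFileName);  // char array -> std::string copy
}
```

With vector < char* > the same push_back would store the address of data.cFileName, which is exactly the dangling pointer described above.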

How to ignore new lines while reading blocks of data from file

I'm trying to read blocks of data from a file, but I don't know how to ignore the newline character when I use istream::read.
I'm aware that I can use a for loop to load the characters into a C string one by one, with a condition to skip newline characters, but I hope there is a cleverer way to solve this problem.
My intention is to avoid using strings or vectors.
#include <iostream>
#include <fstream>
#include <cstring>
void readIt(char* fileName) {
std::ifstream seqsFile;
seqsFile.open(fileName) ;
if (seqsFile.fail()) {
std::cout << "Failed in opening: " << fileName << std::endl;
std::exit(1);
}
seqsFile.seekg(84);
char *buffer;
buffer = new char [7];
seqsFile.read(buffer, 7);
buffer[7] = 0;
std::cout << buffer << std::endl;
}
int main(int argc, char** argv) {
readIt(argv[1]);
return 0;
}
file:
gsi|33112219|sp|O
GACATTCTGGTGGTGGACTCGGAGGCATGATAGCAGGTGCAGCTGGTGCAGCCGCAGCAGCTTATGGAGC
GCAGCAGCTTATGGAGC
current output:
GAGC
GC
desired output:
GAGCGCA
modified version:
void readIt(char* fileName) {
std::ifstream seqsFile;
seqsFile.open(fileName) ;
if (seqsFile.fail()) {
std::cout << "Failed in opening: " << fileName << std::endl;
std::exit(1);
}
seqsFile.seekg(84);
char *buffer;
buffer = new char [7];
char next ;
for ( int i = 0 ; i < 7; i++) {
seqsFile.get(next);
if (next=='\n') {
i--;
continue;
}
buffer[i] = next;
}
buffer[7]=0;
std::cout << buffer << std::endl;
}
Your program has undefined behavior since you are modifying buffer using an out of range index. You have:
buffer = new char [7]; // Allocating 7 chars.
seqsFile.read(buffer, 7); // Reading 7 chars. OK.
buffer[7] = 0; // 7 is an out of range index. Not OK.
Allocate memory for at least 8 chars.
buffer = new char [8];
Also, when you intend to read the contents of a file using istream::read, it is recommended that you open the file in binary mode.
seqsFile.open(fileName, std::ios_base::binary) ;
Well, you cannot tell read not to read newlines; they will end up in your buffer variable anyway, and you have to handle them yourself.
Also, you have to fix the buffer size, as R Sahu mentioned.
Regarding your question, I can suggest the following snippet:
while ((index = strlen(buffer)) < 7)
{
seqsFile >> &buffer[index];
}
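strlen returns the number of characters stored so far, up to the terminating '\0', so buffer must be zero-initialized before the loop. operator>> stops at any whitespace, which is why the newline never reaches the buffer.
You didn't say what to do with other whitespace, so spaces and tabs are skipped as well. Note that operator>> into a char * is not limited to the 7 characters you want; use std::setw to cap the extraction, or it can overrun the buffer.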

How to speed up counting the occurrences of a word in large files?

I need to count the occurrences of the string "<page>" in a 104 GB file to get the number of articles in a given Wikipedia dump. First, I tried this.
grep -F '<page>' enwiki-20141208-pages-meta-current.xml | uniq -c
However, grep crashes after a while. Therefore, I wrote the following program. However, it only processes 20 MB/s of the input file on my machine, which is about a 5% workload for my HDD. How can I speed up this code?
#include <iostream>
#include <fstream>
#include <string>
int main()
{
// Open up file
std::ifstream in("enwiki-20141208-pages-meta-current.xml");
if (!in.is_open()) {
std::cout << "Could not open file." << std::endl;
return 0;
}
// Statistics counters
size_t chars = 0, pages = 0;
// Token to look for
const std::string token = "<page>";
size_t token_length = token.length();
// Read one char at a time
size_t matching = 0;
while (in.good()) {
// Read one char at a time
char current;
in.read(&current, 1);
if (in.eof())
break;
chars++;
// Continue matching the token
if (current == token[matching]) {
matching++;
// Reached full token
if (matching == token_length) {
pages++;
matching = 0;
// Print progress
if (pages % 1000 == 0) {
std::cout << pages << " pages, ";
std::cout << (chars / 1024 / 1024) << " mb" << std::endl;
}
}
}
// Start over again
else {
matching = 0;
}
}
// Print result
std::cout << "Overall pages: " << pages << std::endl;
// Cleanup
in.close();
return 0;
}
Assuming there are no insanely large lines in the file, using something like
for (std::string line; std::getline(in, line); ) {
// find the number of "<page>" strings in line
}
is bound to be a lot faster! Reading each character as a string of one character is about the worst thing you can possibly do. It is really hard to get any slower. For each character, the stream will do something like this:
Check if there is a tie()ed stream which needs flushing (there isn't, i.e., that's pointless).
Check if the stream is in good shape (it is, except once the end has been reached, but this check can't be omitted entirely).
Call xsgetn() on the stream's stream buffer.
This function first checks if there is another character in the buffer (that's similar to the eof check but different; in any case, doing the eof check only after the buffer was empty removes a lot of the eof checks).
Transfer the character to the read buffer.
Have the stream check if it read all (1) requested characters and set stream flags as needed.
There is a lot of waste in there!
I can't really imagine why grep would fail except that some line blows massively over the expected maximum line length. Although std::getline() and std::string are likely to have a much larger upper bound, it is still not efficient to process huge lines. If the file may contain massive lines, it may be more reasonable to use something along the lines of this:
for (std::istreambuf_iterator<char> it(in), end;
(it = std::find(it, end, '<')) != end; ) {
// match "<page>" at the start of the sequence [it, end)
}
For a bad implementation of streams that's still doing too much. Good implementations will do the calls to std::find(...) very efficiently and will probably check multiple characters at once, adding a check and loop only for something like every 16th loop iteration. I'd expect the above code to turn your CPU-bound implementation into an I/O-bound implementation. Bad implementations may still be CPU-bound, but they should still be a lot better.
In any case, remember to enable optimizations!
I'm using this file to test with: http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-current1.xml-p000000010p000010000.bz2
It takes roughly 2.4 seconds versus 11.5 using your code. The total character count is slightly different due to not counting newlines, but I assume that's acceptable since it's only used to display progress.
void parseByLine()
{
// Open up file
std::ifstream in("enwiki-latest-pages-meta-current1.xml-p000000010p000010000");
if(!in)
{
std::cout << "Could not open file." << std::endl;
return;
}
size_t chars = 0;
size_t pages = 0;
const std::string token = "<page>";
std::string line;
while(std::getline(in, line))
{
chars += line.size();
size_t pos = 0;
for(;;)
{
pos = line.find(token, pos);
if(pos == std::string::npos)
{
break;
}
pos += token.size();
if(++pages % 1000 == 0)
{
std::cout << pages << " pages, ";
std::cout << (chars / 1024 / 1024) << " mb" << std::endl;
}
}
}
// Print result
std::cout << "Overall pages: " << pages << std::endl;
}
Here's an example that adds each line to a buffer and then processes the buffer when it reaches a threshold. It takes 2 seconds versus ~2.4 from the first version. I played with several different thresholds for the buffer size and also processing after a fixed number (16, 32, 64, 4096) of lines and it all seems about the same as long as there is some batching going on. Thanks to Dietmar for the idea.
int processBuffer(const std::string& buffer)
{
static const std::string token = "<page>";
int pages = 0;
size_t pos = 0;
for(;;)
{
pos = buffer.find(token, pos);
if(pos == std::string::npos)
{
break;
}
pos += token.size();
++pages;
}
return pages;
}
void parseByMB()
{
// Open up file
std::ifstream in("enwiki-latest-pages-meta-current1.xml-p000000010p000010000");
if(!in)
{
std::cout << "Could not open file." << std::endl;
return;
}
const size_t BUFFER_THRESHOLD = 16 * 1024 * 1024;
std::string buffer;
buffer.reserve(BUFFER_THRESHOLD);
size_t pages = 0;
size_t chars = 0;
size_t progressCount = 0;
std::string line;
while(std::getline(in, line))
{
buffer += line;
if(buffer.size() > BUFFER_THRESHOLD)
{
pages += processBuffer(buffer);
chars += buffer.size();
buffer.clear();
}
if((pages / 1000) > progressCount)
{
++progressCount;
std::cout << pages << " pages, ";
std::cout << (chars / 1024 / 1024) << " mb" << std::endl;
}
}
if(!buffer.empty())
{
pages += processBuffer(buffer);
chars += buffer.size();
std::cout << pages << " pages, ";
std::cout << (chars / 1024 / 1024) << " mb" << std::endl;
}
}

Issues with variable used in reading binary file

I'm writing some serial port code and need to read the contents of a file (in binary) to a variable.
Starting from the example for "Binary files" at http://www.cplusplus.com/doc/tutorial/files/ ,
I try opening a .jpg file:
#include <iostream>
#include <fstream>
using namespace std;
ifstream::pos_type size;
char * memblock;
int main () {
ifstream file ("example.jpg", ios::in|ios::binary|ios::ate);
if (file.is_open())
{
size = file.tellg();
memblock = new char [size];
file.seekg (0, ios::beg);
file.read (memblock, size);
file.close();
cout << memblock << endl;
delete[] memblock;
}
else cout << "Unable to open file";
return 0;
}
However, only the first 4 characters (32 bits) are printed in the console.
What's particularly odd though is that using ostream::write() with that supposedly faulty variable "memblock" works perfectly:
ofstream fileOut ("writtenFile.jpg",ios::out|ios::binary);
fileOut.write(memblock,size);
fileOut.close();
i.e., it creates a new .jpg file.
So my question is why the memblock variable seems to only contain the first 4 characters.
There is probably a 0 in your binary data. cout is a text stream so looks at memblock as a string. If it reaches a null character then it thinks the string has finished.
See this for some help with outputting binary data:
How to make cout behave as in binary mode?
Hmmm. A quick glance at the page you cite shows that the author doesn't
know much about IO in C++. Avoid it, since much of what it says is
wrong.
For the rest: .jpg is not a text format, and cannot be simply output
to cout. When you use <<, of course, the output stops at the first
'\0' character, but all sorts of binary data may cause weird effects:
data may be interpreted as an escape sequence repositioning the cursor,
locking the keyboard (actually happened to me once), etc. These
problems will occur even if you use std::cout.write() (which will not
stop at the '\0' character). If you want to visualize the data,
your best bet is some sort of binary dump. I use something like the
following for visualizing large blocks of data:
template <typename InputIterator>
void
process(
InputIterator begin,
InputIterator end,
std::ostream& output = std::cout )
{
IOSave saveAndRestore( output ) ;
output.setf( std::ios::hex, std::ios::basefield ) ;
output.setf( std::ios::uppercase ) ;
output.fill( '0' ) ;
static int const lineLength( 16 ) ;
while ( begin != end ) {
int inLineCount = 0;
unsigned char buffer[ lineLength ] ;
while ( inLineCount != lineLength && begin != end ) {
buffer[inLineCount] = *begin;
++ begin;
++ inLineCount;
}
for ( int i = 0 ; i < lineLength ; ++ i ) {
static char const *const
separTbl [] =
{
" ", " ", " ", " ",
" ", " ", " ", " ",
" ", " ", " ", " ",
" ", " ", " ", " ",
} ;
output << separTbl[ i ] ;
if ( i < inLineCount ) {
output << std::setw( 2 )
<< static_cast< unsigned int >( buffer[ i ] & 0xFF ) ;
} else {
output << " " ;
}
}
output << " |" ;
for ( int i = 0 ; i != inLineCount ; ++ i ) {
output << (isprint( buffer[ i ] )
? static_cast< char >( buffer[ i ] )
: ' ') ;
}
output << '|' << std::endl ;
}
}
(You'll also want to read into an std::vector<char>, so you don't have
to worry about freeing the memory.)
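A much reduced sketch of both suggestions, reading into a std::vector<char> and dumping it as hex, might look like this. It is not the code above: readAll and hexDump are hypothetical names, the ASCII column is left out, and the format state is saved by hand instead of with a helper class.

```cpp
#include <cstddef>
#include <iomanip>
#include <istream>
#include <iterator>
#include <ostream>
#include <vector>

// Hypothetical helper: reads everything left in `in`; the vector frees itself.
std::vector<char> readAll(std::istream& in)
{
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Hypothetical helper: writes 16 bytes per line as two-digit uppercase hex.
void hexDump(const std::vector<char>& data, std::ostream& out)
{
    const std::ios::fmtflags savedFlags = out.flags();
    const char savedFill = out.fill('0');
    out << std::hex << std::uppercase;
    for (std::size_t i = 0; i < data.size(); ++i)
    {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out << std::setw(2)
            << static_cast<unsigned int>(static_cast<unsigned char>(data[i]));
        out << (((i + 1) % 16 == 0 || i + 1 == data.size()) ? '\n' : ' ');
    }
    out.flags(savedFlags);  // restore the caller's formatting
    out.fill(savedFill);
}
```

For the question's file this would be used with an ifstream opened with ios::binary, for example hexDump(readAll(file), std::cout), and it prints zero bytes as 00 instead of stopping at them.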

how to get a value from a char[]

Thanks in advance to everyone who helps.
I want to get the VmRSS value from /proc/pid/status. Below is the code:
int main()
{
const int PROCESS_MEMORY_FILE_LEN = 500;
FILE *file;
std::string path("/proc/4378/status");
//path += boost::lexical_cast<std::string>( pid );
//path += "/status";
if(!(file = fopen(path.c_str(),"r")))
{
std::cout <<"open " << path<<"is failed " << std::endl;
return float(-1);
}
char fileBuffer[PROCESS_MEMORY_FILE_LEN];
memset(fileBuffer, 0, PROCESS_MEMORY_FILE_LEN);
if(fread(fileBuffer, 1, PROCESS_MEMORY_FILE_LEN - 1, file) != (PROCESS_MEMORY_FILE_LEN - 1))
{
std::cout <<"fread /proc/pid/status is failed"<< std::endl;
return float(-1);
}
fclose(file);
unsigned long long memoryUsage = 0;
int a = sscanf(fileBuffer,"VmRSS: %llu", &memoryUsage);
std::cout << a << std::endl;
std::cout << memoryUsage << std::endl;
}
Thanks.
Based on your comments: sscanf only matches its format against the start of the buffer, and VmRSS is not the first entry in the status file, so the call as written finds nothing. To find VmRSS within your char array, use the C++ algorithms in combination with the C++ string library. That gives you the position of VmRSS, and all that is left is to extract the value that follows it. With a little knowledge of the structure of these entries, this should be an easy task.
In addition, it might be better to use fstream to read directly into a string.
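One possible shape for the stream-based approach is sketched below. parseVmRssKb is a made-up name, and the code assumes the entry looks like "VmRSS:" followed by whitespace, a number and " kB" on a line of its own.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scans status text line by line for the VmRSS entry.
// Assumes the line has the form "VmRSS:<whitespace><number> kB".
bool parseVmRssKb(std::istream& status, unsigned long long& kilobytes)
{
    const std::string key = "VmRSS:";
    std::string line;
    while (std::getline(status, line))
    {
        // Only a line that starts with the key is of interest.
        if (line.compare(0, key.size(), key) == 0)
        {
            std::istringstream value(line.substr(key.size()));
            return static_cast<bool>(value >> kilobytes);  // skips tabs/spaces
        }
    }
    return false;  // entry not present
}
```

It would be called with a std::ifstream opened on the /proc/<pid>/status path. Reading line by line also sidesteps the fixed 500-byte buffer, which may simply end before the VmRSS line is reached.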