Using regex iterator with char * - c++

I'm trying to read a file into a buffer and then use regex iterator. I know I can use a C++ string iterator with the regex iterator (constructor is std::regex_iterator<std::string::iterator>), but I'd like to avoid copying my buffer into a string and keep using low level functions to read the file (right now I'm using open() and read()).
struct stat buff;
int file = open(argv[1], O_RDONLY);
if(!file)
cout << "Error opening file" << endl;
else if(fstat(file, &buff))
cout << "Error" << endl;
else
{
cout << (buff.st_size) << endl;
char fr[buff.st_size+1];
read(file, fr, buff.st_size); // using string::c_str() or string::data() didn't work
fr[buff.st_size] = '\0';
// then use regex iterator to iterate through matches
}
close(file);
I think that my options are to find a way to use read() with a C++ string instead of char * or a way to use the regex iterator on a char array. I could write one, but I'm also trying to keep my program as small as possible.
Is there a way I can do that? How can I use C++ string as C char * (for read())?

Just use std::regex_iterator<char*>. A pointer is a perfectly good bidirectional iterator on its own. Also, avoid allocating a large char array on the stack, as it might overflow (and variable-length arrays are not standard C++). Instead, use the heap:
std::unique_ptr<char[]> fr(new char[buff.st_size + 1]);
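To make the pointer-as-iterator idea concrete, here is a minimal sketch. The helper name find_all and its parameters are made up for illustration; std::cregex_iterator is the standard typedef for std::regex_iterator<const char*>.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect every match of `pattern` in [buf, buf + len).
// std::cregex_iterator is std::regex_iterator<const char*>, so the buffer is
// searched in place without first copying it into a std::string.
std::vector<std::string> find_all(const char* buf, std::size_t len,
                                  const std::regex& pattern)
{
    std::vector<std::string> matches;
    std::cregex_iterator it(buf, buf + len, pattern);
    const std::cregex_iterator end; // default-constructed = end of sequence
    for (; it != end; ++it)
        matches.push_back(it->str()); // copies only the matched text
    return matches;
}
```

With the buffer from the question, a call would look like find_all(fr, buff.st_size, std::regex("[0-9]+")). Passing the length explicitly means the trailing '\0' is not needed for the search itself.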

If you want to use a std::string you can simply pass the address of the first element of the string to the read() function like this:
struct stat buff;
int file = open(argv[1], O_RDONLY);
if(file == -1) // open() returns -1 on failure, not 0
cout << "Error opening file" << endl;
else if(fstat(file, &buff))
cout << "Error" << endl;
else
{
cout << (buff.st_size) << endl;
// char fr[buff.st_size+1];
std::string fr; // use a std::string
fr.resize(buff.st_size); // resize it to create internal buffer
read(file, &fr[0], fr.size()); // this should work
// read(file, fr, buff.st_size);
// fr[buff.st_size] = '\0';
// then use regex iterator to iterate through matches
}
close(file);

Related

c++ fstream::read only returning 1st char

Preface: I am an inexperienced coder, so it's probably an obvious error. Also, all of this code is stolen and slapped together, so I claim no ownership of it.
System: I am using windows 10 64 bit. I write my code in Notepad++ and compile with MinGW G++.
What I'm trying to do: I am trying to read an entire file (BMP format) into a variable and return a pointer to that variable as the return of a function.
What's happening: The variable is only storing the first char of the file.
char* raw_data(std::string filename){
//100% non-stolen
std::ifstream is (filename, std::ifstream::binary);
if (is) {
// get length of file:
is.seekg (0, is.end);
int length = is.tellg();
is.seekg (0, is.beg);
std::cout << is.tellg() << "\n";
char * buffer = new char [length];
std::cout << "Reading " << length << " characters... \n";
// read data as a block:
is.read (buffer,length);
std::cout << "\n\n" << *buffer << "\n\n";
if (is)
{std::cout << "all characters read successfully.";}
else
{std::cout << "error: only " << is.gcount() << " could be read";}
is.close();
// ...buffer contains the entire file...
//101% non-stolen
return {buffer};
}
return {};
}
The code calling the function is
char * image_data = new char [image_size];
image_data = raw_data("Bitmap.bmp");
This compiles fine and the EXE outputs
0
Reading 2665949 characters...
B
all characters read successfully.
The file Bitmap.bmp starts:
BM¶ƒ 6 ( € ‰ €ƒ Δ Δ ¨δό¨δό¨δό¨
As you can see, the variable buffer only stores the first char of Bitmap.bmp (if I change the 1st char it also changes)
Any help would be appreciated.
Thank you for your time.
std::cout << "\n\n" << *buffer << "\n\n";
buffer is a char*, so dereferencing it gives you a single char, which in your case is B. If you want to output all of the data you read, don't dereference the pointer: in C and C++, a char* gets special treatment when output with std::cout, printf and the like, and is printed as a string up to the first null byte.
std::cout << "\n\n" << buffer << "\n\n";
Keep in mind that, by convention, C strings held in a char* must be null-terminated. Yours is not, so the caller of your function has no way to find out how long the buffer is: that information is lost, because functions like strlen expect the string to be null-terminated too. A BMP file is also binary data that can contain null bytes anywhere, so printing it as a string would stop early even with a terminator. You should look at std::vector<char> or std::string for holding such data, as they keep track of the size and clean up after themselves.
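As a rough sketch of the std::vector<char> suggestion (the function name read_all_bytes is hypothetical, not something from the question):

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: read a whole file as raw bytes. The vector knows its
// own size, so no terminator is needed, and the memory is released
// automatically. Returns an empty vector if the file cannot be opened.
std::vector<char> read_all_bytes(const std::string& filename)
{
    std::ifstream is(filename, std::ios::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(is),
                             std::istreambuf_iterator<char>());
}
```

The caller can then use data.size() for the length and, if the bytes really need printing, std::cout.write(data.data(), data.size()), which does not stop at null bytes.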

c++ munmap_chunk(): invalid pointer:

I'm trying to read and write to a binary file. It mostly works; however, upon returning 0 in main I get a munmap_chunk(): invalid pointer error, along with a memory dump and a stack trace when the program closes.
https://imgur.com/a/CSBg8
Here is a screenshot of the memory dump and stack trace. I don't know how to read this.
#include <fstream>
#include <iostream>
using namespace std;
struct player{
string name;
};
bool WriteTest(player playerData){
// Create our objects.
fstream filestream;
//attempt to open file and then read first player
filestream.open ("file.bin", ios::binary | ios::out);
filestream.write(reinterpret_cast <char *> (&playerData),
sizeof(playerData));
if(filestream.fail()){
//create file if there is no file
cout << "write open failed" << endl;
filestream.close();
return false;
}
filestream.close();
cout << "write sucsess" << endl;
return true;
}
player ReadTest(){
player playerData;
// Create our objects.
fstream filestream;
//attempt to open file and then read first player
filestream.open ("file.bin", ios::binary | ios::in);
filestream.read(reinterpret_cast <char *> (&playerData),
sizeof(playerData));
if(filestream.fail()){
//create file if there is no file
cout << "read open failed" << endl;
filestream.close();
return playerData;
}
filestream.close();
cout << "read sucsess" << endl;
return playerData;
}
void displayPlayerData(player playerData){
cout << " Name :" << playerData.name << endl;
}
int main(){
player source;
source.name = "bap";
displayPlayerData(source);
WriteTest(source);
getchar();
player playerData = ReadTest();
displayPlayerData(playerData);
return 0;
}
Your player struct contains a std::string, thus the type is not C-layout compatible (it is not trivially copyable).
Thus using functions such as:
filestream.write(reinterpret_cast <char *> (&playerData), sizeof(playerData));
and
filestream.read(reinterpret_cast <char *> (&playerData), sizeof(playerData));
will not work correctly.
The std::string contains a pointer to a buffer of characters (leaving aside the short string buffer if the string class is implemented that way), and writing std::string directly to a file will totally miss those characters since you will only be writing a pointer value.
Additionally, reading into playerData will not initialize the std::string with the data. Instead you'll just be corrupting the std::string object with garbage you read from the file. This is more than likely where your program fails -- you are trying to use a corrupted std::string object.
But the tell-tale sign that this could never work is that sizeof(player) is a fixed, compile-time value, and it is the byte count you pass as the second argument to the read and write functions. When run here, sizeof(player) is 32 (the exact value depends on the standard library implementation). So you will always be reading / writing 32 bytes of data. What if the std::string name; member holds 1,000 characters? How will you be able to read/write 1,000 characters by specifying you only want to read/write 32 bytes? That could never work.
The correct way to handle this is either:
1) Change the std::string member to an array of char. Then the player class will be C-layout compatible and can be read and written using your techniques of binary file reading / writing
or
2) Properly serialize the string data to the file. You can overload operator >> and operator << to read/write the string data, or use a library such as Boost.Serialization.

Write and read records to .dat file C++

I am quite new to C++ and am trying to work out how to write a record in the format of this structure below to a text file:
struct user {
int id;
char username [20];
char password [20];
char name [20];
char email [30];
int telephone;
char address [70];
int level;
};
So far, I'm able to write to it fine but without an incremented id number as I don't know how to work out the number of records so the file looks something like this after I've written the data to the file.
1 Nick pass Nick email tele address 1
1 user pass name email tele address 1
1 test test test test test test 1
1 user pass Nick email tele addy 1
1 nbao pass Nick email tele 207 1
Using the following code:
ofstream outFile;
outFile.open("users.dat", ios::app);
// User input of data here
outFile << "\n" << 1 << " " << username << " " << password << " " << name << " "
<< email << " " << telephone << " " << address << " " << 1;
cout << "\nUser added successfully\n\n";
outFile.close();
So, how can I increment the value for each record on insertion and how then target a specific record in the file?
EDIT: I've got as far as being able to display each line:
if (inFile.is_open())
{
while(!inFile.eof())
{
cout<<endl;
getline(inFile,line);
cout<<line<<endl;
}
inFile.close();
}
What you have so far is not bad, except that it cannot handle cases where there is space in your strings (for example in address!)
What you are trying to do is write a very basic data base. You require three operations that need to be implemented separately (although intertwining them may give you better performance in certain cases, but I'm sure that's not your concern here).
Insert: You already have this implemented. The only thing you might want to change is the " " to "\n". This way, every field of the struct is on its own line and your problem with spaces is resolved. Later, when reading, you need to read line by line.
Search: To search, you need to open the file, read struct by struct (which itself consists of reading many lines corresponding to your struct fields) and identifying the entities of your interest. What to do with them is another issue, but simplest case would be to return the list of matching entities in an array (or vector).
Delete: This is similar to search, except you have to rewrite the file. What you do is basically, again read struct by struct, see which ones match your criteria of deletion. You ignore those that match, and write (like the insert part) the rest to another file. Afterwards, you can replace the original file with the new file.
Here is some pseudocode:
Write-entity(user &u, ofstream &fout)
    fout << u.id << endl
         << u.username << endl
         << u.password << endl
         << ...
Read-entity(user &u, ifstream &fin)
    fin >> u.id;
    fin.ignore(); /* skip the newline after the id */
    fin.getline(u.username, 20);
    fin.getline(u.password, 20);
    ...
    if end of file or read error
        return fail
    return success
Insert(user &u)
    ofstream fout("db.dat", ios::app); /* append instead of truncating */
    Write-entity(u, fout);
    fout.close();
Search(char *username) /* for example */
    ifstream fin("db.dat");
    user u;
    vector<user> results;
    while (Read-entity(u, fin))
        if (strcmp(username, u.username) == 0)
            results.push_back(u);
    fin.close();
    return results;
Delete(int level) /* for example */
    ifstream fin("db.dat");
    ofstream fout("db_temp.dat");
    user u;
    while (Read-entity(u, fin))
        if (level != u.level)
            Write-entity(u, fout);
    fin.close();
    fout.close();
    copy "db_temp.dat" to "db.dat"
Side note: It's a good idea to place the \n after data has been written (so that your text file would end in a new line)
Using typical methods, you will need fixed-size records if you want random access when reading the file. Say you allow 5 characters for the name; it will be stored as
bob\0\0
or with whatever else you use to pad. This way you can find a record at offset record number * record size.
To increment the index the way you are doing it, you will need to read the file to find the highest existing index and increment it. Or you can load the file into memory, append the new record, and write the file back:
std::vector<user> users=read_dat("file.dat");
user user_=get_from_input();
users.push_back(user_);
then write the file back
std::ofstream file("file.dat");
for(size_t i=0; i!=users.size(); ++i) {
file << users.at(i);
//you will need to implement the stream insertion operator (operator<<) for user to do this easily
}
I suggest wrapping the file handling in a class and then overloading operator >> and << for your struct; this way you control the input and output.
For instance
struct User{
...
};
typedef std::vector<User> UserConT;
struct MyDataFile
{
ofstream outFile;
UserConT User_container;
MyDataFile(std::string const&); //
MyDataFile& operator<< (User const& user); // Implement and/or process the record before to write
MyDataFile& operator>> (UserConT & user); // Implement the extraction/parse and insert into container
MyDataFile& operator<< (UserConT const & user); //Implement extraction/parse and insert into ofstream
};
MyDataFile& MyDataFile::operator<< (User const& user)
{
static unsigned myIdRecord=User_container.size();
myIdRecord++;
outFile << user.id+myIdRecord << ....;
return *this;
}
int main()
{
MyDataFile file("data.dat");
UserConT myUser;
User a;
//... you could manage a single record
a.name="pepe";
...
file<<a;
..//
}
A .dat file like yours is a plain text file that can be opened with Notepad. So you can simply read the last line of the file, extract the leading number, and convert it to an integer. Then increment the value and you are done.
Some sample code:
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char *argv[])
{
    ifstream in("test.txt");
    if(!in) {
        cout << "Cannot open input file.\n";
        return 1;
    }
    string line, last;
    while(getline(in, line)) {
        if(!line.empty()) last = line; // remember the last non-empty line
    }
    // Now last contains the last line
    int i = 0;
    if(!last.empty() && last[0] >= '0' && last[0] <= '9')
    {
        i = atoi(last.c_str()); // atoi stops at the first non-digit
        i++;
    }
    // i contains the next id, do your operation now
    in.close();
    return 0;
}
Assuming your file format does not need to be human-readable, you can write the struct out to the file directly, and later seek to a given record by its index:
outFile.open("users.dat", ios::app | ios::binary);
user someValue = {};
outFile.write( (char*)&someValue, sizeof(user) );
int nIndex = 0;
user fetchValue = {};
ifstream inputFile("users.dat", ios::binary);
inputFile.seekg (0, ios::end);
int itemCount = inputFile.tellg() / sizeof(user);
inputFile.seekg (0, ios::beg);
if( nIndex > -1 && nIndex < itemCount){
inputFile.seekg ( sizeof(user) * nIndex , ios::beg);
inputFile.read( (char*)&fetchValue, sizeof(user) );
}
The code that writes to the file is a member function of the user struct?
Otherwise I see no connection between the output and the struct.
Possible things to do:
write the id member instead of 1
use a counter for id and increment it at each write
don't write the id and when reading use the line number as id

C++ Fstream Only Prints One Word

This is a very strange issue. I'm trying to print a large text file, it's a Wikipedia entry. It happens to be the page on Velocity. So, when I tell it to print the file, it prints "In", when it should print "In physics, velocity is etc, etc etc".
Here's the code I'm using to print out:
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
ifstream wiki;
wiki.open("./wiki/velocity.txt");
char* wikiRead;
wiki >> wikiRead;
cout << wikiRead << endl;
wiki.close();
}
Please help.
wiki >> wikiRead;
The default delimiter for stream extraction is whitespace, so when the stream encounters a space, it stops reading. That is why it reads only one word.
If you want the stream to read all the words, then you have to use a loop:
char* wikiRead = new char[1024]; //must allocate some memory!
while(wiki >> wikiRead)
{
cout << wikiRead << endl;
}
wiki.close();
delete []wikiRead; //must deallocate the memory
This will print all the words in the file, each on a new line. Note that if any word in the file is 1024 characters or longer, this program invokes undefined behavior and might crash. In that case, you have to allocate a bigger chunk of memory.
But why use char* in the first place? In C++, you have a better choice: use std::string.
#include<string>
std::string word;
while(wiki >> word)
{
cout << word << endl;
}
wiki.close();
It's better now.
If you want to read line-by-line, instead of word-by-word, then use std::getline as:
std::string line;
while(std::getline(wiki, line))
{
cout << line << endl;
}
wiki.close();
This will read a complete line, even if it contains spaces between the words, and will print each line on a new line.
You ask the stream to read a word into the memory that wikiRead points to, but wikiRead is an uninitialized pointer, so the characters are written to an arbitrary address. That is undefined behavior.
I wonder why you ignored the compiler warning (most modern compilers warn about using uninitialized variables). How about this?
ifstream wiki;
wiki.open("./wiki/velocity.txt");
char wikiRead[255];
wiki >> wikiRead;
cout << wikiRead << endl;
wiki.close();
Alternatively, I'd suggest using a string object with getline to get a single line of text.
string str;
getline(wiki, str);
The >> operator applied to a char * reads only one word. Moreover, you're reading into an uninitialized pointer, which is not valid. Usually std::string, not char *, is used for string processing in C++.
If you only want to print the file's contents, you can hook the file's buffer directly to std::cout:
int main() {
std::ifstream wiki("./wiki/velocity.txt");
std::cout << wiki.rdbuf() << '\n';
}
If you want to put the contents into an automatically allocated string, use std::getline with a delimiter that does not occur in text files, such as the null character.
int main() {
std::ifstream wiki("./wiki/velocity.txt");
std::string wiki_contents;
getline( wiki, wiki_contents, '\0' /* do not stop at newline */ );
std::cout << wiki_contents << '\n'; // do something with the string
}
Since you want to read a large file, reading it block by block is a better way.
// needs <cstdlib> for realloc and free
ifstream wiki;
wiki.open("./wiki/velocity.txt");
const size_t buf_size = 1024;
char* wikiRead = 0;
size_t total = 0; // bytes read so far
do
{
    // grow by one block, plus one byte for the terminator
    wikiRead = (char*)realloc( wikiRead, total + buf_size + 1 );
    wiki.read( wikiRead + total, buf_size ); // appends to reallocated memory
    total += wiki.gcount();
}while( wiki );
wikiRead[total] = '\0'; // null termination
wiki.close();
cout << wikiRead;
free( wikiRead ); // memory from realloc must be released with free, not delete[]
The operator>> is designed to only read one word at a time. If you want to read lines, use getline.
#include <iostream>
#include <fstream>
#include<string>
using namespace std;
int main()
{
ifstream wiki;
wiki.open("./wiki/velocity.txt");
string wikiRead;
while (getline(wiki, wikiRead))
{
cout << wikiRead << endl;
}
wiki.close();
}

C++ - Convert FILE* to CHAR*

I found a C++ source file which calculates expressions from a command line argument (argv[1]), however I now want to change it to read a file.
double Utvardering(char* s) {
srcPos = s;
searchToken();
return PlusMinus();
}
int main(int argc, char* argv[]) {
if (argc > 1) {
FILE* fFile = fopen(argv[1], "r");
double Value = Utvardering(fopen(argv[1], "r"));
cout << Value << endl;
}else{
cout << "Usage: " << argv[0] << " FILE" << endl;
}
cin.get();
return 0;
}
However the Utvardering function requires a char* parameter. How can I convert the data read from a file, fopen to a char*?
The function fopen just opens a file. To get a string from there, you need to read the file. There are different ways to do this. If you know the maximum size of your string in advance, this would do:
const int MAX_SIZE = 1024;
char buf[MAX_SIZE];
if (!fgets(buf, MAX_SIZE, fFile)) {
cerr << "Read error";
exit(1);
}
double Value = Utvardering(buf);
Note: this method is typical for C, not for C++. If you want more idiomatic C++ code, you can use something like this (instead of FILE and fopen):
ifstream in;
in.open(argv[1]);
if (!in) { /* report an error */ }
string str;
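getline(in, str); // in >> str would stop at the first space
double Value = Utvardering(&str[0]); // &str[0] is a char* into the string's buffer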
Use the fread() function to read data from the FILE* into a buffer. Send that buffer into Utvardering().
I have no idea what "Utvardering" expects, or how it's using the information.
There are two possibilities -
1) Utvardering may be defined using char*, but expecting a FILE* (in effect, treating char* like void*). I've seen this before, even though it's pretty awful practice. In that case, just cast fFile to char* and pass it in.
2) Utvardering may be expecting a null terminated string (char*) as input. If you're using fopen like this, you can use fread to read the file contents into a buffer (char[]), and pass it to your function that takes a char*.
It looks like you need to write code to read the file into a character array and pass that to Utvardering.
Just passing the return value of fopen will cause the address of the opaque data structure pointed to by that pointer to be passed to Utvardering. Utvardering will happily treat those bytes as character data when they are not. Not good.
Good example of reading data from a file here:
http://www.cplusplus.com/reference/clibrary/cstdio/fread/
then pass the buffer to your function
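A minimal sketch of the fread approach, in case the file size is not known in advance. The helper name read_stream and the 4096-byte chunk size are arbitrary choices for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: read everything that remains in `f` into a string.
// fread returns the number of bytes actually read, which is 0 at end of
// file or on error, so the loop stops in both cases.
std::string read_stream(std::FILE* f)
{
    std::string contents;
    char chunk[4096];
    std::size_t n;
    while ((n = std::fread(chunk, 1, sizeof chunk, f)) > 0)
        contents.append(chunk, n);
    return contents;
}
```

In the question's main, this could be used as std::string text = read_stream(fFile); followed by Utvardering(&text[0]);, since std::string keeps its buffer null-terminated and the string stays alive for the duration of the call.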