I have a file which is similar to /etc/passwd (colon-separated values), and need to extract all three values per line into variables, then compare them to what has been given to the program. Here is my code:
typedef struct _UserModel UserModel;
struct _UserModel {
char username[50];
char email[55];
char pincode[30];
};
void get_user(char *username) {
ifstream io("test.txt");
string line;
while (io.good() && !io.eof()) {
getline(io, line);
if (line.length() > 0 && line.substr(0,line.find(":")).compare(username)==0) {
cout << "found user!\n";
UserModel tmp;
sscanf(line.c_str(), "%s:%s:%s", tmp.username, tmp.pincode, tmp.email);
assert(0==strcmp(tmp.username, username));
}
}
}
I can't strcmp the values as the trailing '\0' means the strings are different, so the assertion fails. I only really want to hold the memory for the values anyway, and not use up memory that I don't need for them. What do I need to change to get this to work?
sscanf is so C'ish.
struct UserModel {
string username;
string email;
string pincode;
};
void get_user(char *username) {
ifstream io("test.txt");
string line;
while (getline(io, line)) {
UserModel tmp;
istringstream str(line);
if (getline(str, tmp.username, ':') && getline(str, tmp.pincode, ':') && getline(str, tmp.email)) {
if (username == tmp.username)
cout << "found user!\n";
}
}
}
If you are using c++, I would try to use std::string, iostreams and all those things that come with C++, but then again...
I understand that your problem is that one of the C strings is null terminated, while the other is not, and then the strcmp is stepping to the '\0' on one string, but the other has another value... if that is the only thing you want to change, use strncpy with the length of the string that is known.
Here's a complete example that does what I think you asked about.
Things you didn't ask for but it does anyway:
It uses exceptions to report data file format errors so that GetModelForUser() can simply return an object (instead of a boolean or something like that).
It uses a template function for splitting the line into fields. This is really the heart of the original question, so it's a bit unfortunate that it is arguably over-complex. But the idea of making it a template function is that this separates the concern of splitting a string into fields from the choice of a data structure to represent the result.
/* Parses a file of user data.
* The data file is of this format:
* username:email-address:pincode
*
* The pincode field is actually one-way-encrypted with a secret salt
* in order to avoid catastrophic loss of customer data when the file
* or a backup tape is lost/leaked/compromised. However, this code
* simply treats it as an opaque value.
*
* Internationalisation: this code assumes that the data file is
* encoded in the execution character set, whatever that is. This
* means that updates to the file must first transcode the
* username/mail-address/pincode data into the execution character
* set.
*/
#include <string>
#include <vector>
#include <fstream>
#include <iostream>
#include <iterator>
#include <exception>
const char* MODEL_DATA_FILE_NAME = "test.txt";
// This stuff should really go in a header file.
class UserUnknown : public std::exception { };
class ModelDataIsMissing : public std::exception { };
class InvalidModelData : public std::exception { }; // base: don't throw this directly.
class ModelDataBlankLine : public InvalidModelData { };
class ModelDataEmptyUsername : public InvalidModelData { };
class ModelDataWrongNumberOfFields : public InvalidModelData { };
class UserModel {
std::string username_;
std::string email_address_;
std::string pincode_;
public:
UserModel(std::string username, std::string email_address, std::string pincode)
: username_(username), email_address_(email_address), pincode_(pincode) {
}
UserModel(const UserModel& other)
: username_(other.username_),
email_address_(other.email_address_),
pincode_(other.pincode_) {
}
std::string GetUsername() const { return username_; }
std::string GetEmailAddress() const { return email_address_; }
std::string GetPincode() const { return pincode_; }
};
UserModel GetUserModelForUser(const std::string& username)
throw (InvalidModelData, UserUnknown, ModelDataIsMissing);
// This stuff is the implementation.
namespace { // use empty namespace for modularity.
template <typename ForwardIterator> void SplitStringOnSeparator(
std::string input, char separator, ForwardIterator output)
{
std::string::const_iterator field_start, pos;
bool in_field = false;
for (pos = input.begin(); pos != input.end(); ++pos) {
if (!in_field) {
field_start = pos;
in_field = true;
}
if (*pos == separator) {
*output++ = std::string(field_start, pos);
in_field = false;
}
}
// Emit the last field, which is not followed by a separator.
if (in_field) {
*output++ = std::string(field_start, pos);
}
}
}
// Returns a UserModel instance for the specified user.
//
// Don't call this more than once per program invocation, because
// you'll end up with quadratic performance. Instead modify this code
// to return a map from username to model data.
UserModel GetUserModelForUser(const std::string& username)
throw (InvalidModelData, UserUnknown, ModelDataIsMissing)
{
std::string line;
std::ifstream in(MODEL_DATA_FILE_NAME);
if (!in) {
throw ModelDataIsMissing();
}
while (std::getline(in, line)) {
std::vector<std::string> fields;
SplitStringOnSeparator(line, ':', std::back_inserter(fields));
if (fields.size() == 0) {
throw ModelDataBlankLine();
} else if (fields.size() != 3) {
throw ModelDataWrongNumberOfFields();
} else if (fields[0].empty()) {
throw ModelDataEmptyUsername();
} else if (fields[0] == username) {
return UserModel(fields[0], fields[1], fields[2]);
}
// We don't diagnose duplicate usernames in the file.
}
throw UserUnknown();
}
namespace {
bool Example (const char *arg)
{
const std::string username(arg);
try
{
UserModel mod(GetUserModelForUser(username));
std::cout << "Model data for " << username << ": "
<< "username=" << mod.GetUsername()
<< ", email address=" << mod.GetEmailAddress()
<< ", encrypted pin code=" << mod.GetPincode()
<< std::endl;
return true;
}
catch (const UserUnknown&) {
std::cerr << "Unknown user " << username << std::endl;
return false;
}
}
}
int main (int argc, char *argv[])
{
int i, returnval=0;
for (i = 1; i < argc; ++i)
{
try
{
if (!Example(argv[i])) {
returnval = 1;
}
}
catch (const InvalidModelData&) {
std::cerr << "Data file " << MODEL_DATA_FILE_NAME << " is invalid." << std::endl;
return 1;
}
catch (const ModelDataIsMissing&) {
std::cerr << "Data file " << MODEL_DATA_FILE_NAME << " is missing." << std::endl;
return 1;
}
}
return returnval;
}
/* Local Variables: */
/* c-file-style: "stroustrup" */
/* End: */
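The comment above GetUserModelForUser() recommends building a map from username to model data when more than one lookup is needed. A minimal sketch of that idea follows. UserRecord and LoadUserRecords are hypothetical names, not part of the code above, and unlike the original this sketch silently skips malformed lines instead of throwing.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical record type for this sketch (not the UserModel class above).
struct UserRecord {
    std::string email_address;
    std::string pincode;
};

// Reads "username:email-address:pincode" lines and indexes them by username.
// Lines with fewer than three fields or an empty username are skipped.
std::map<std::string, UserRecord> LoadUserRecords(std::istream& in)
{
    std::map<std::string, UserRecord> records;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string username;
        UserRecord record;
        if (std::getline(fields, username, ':') &&
            std::getline(fields, record.email_address, ':') &&
            std::getline(fields, record.pincode) &&
            !username.empty()) {
            records[username] = record;  // a later duplicate overwrites an earlier one
        }
    }
    return records;
}
```

The file is read once, for example with a std::ifstream passed to LoadUserRecords(), and each later lookup is a records.find(username) call instead of another pass over the file.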
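I don't see a problem with strcmp, but you have one in your sscanf format. %s reads up to the first whitespace character, so the first %s swallows the ':' separators along with the rest of the line. You probably want "%49[^:]:%29[^:]:%54s" as the format string. The field widths prevent buffer overflow; each is one less than the size of the destination buffer (username[50], pincode[30], email[55], in the order the arguments are passed), because sscanf writes a terminating '\0' after the characters it stores.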
I have the following code, which prints each unique word and its count from a text file (containing >= 30k words). However, it separates words only by whitespace. How can I modify the code to specify the expected dividers?
template <class KTy, class Ty>
void PrintMap(map<KTy, Ty> map)
{
typedef typename std::map<KTy, Ty>::iterator iterator;
for (iterator p = map.begin(); p != map.end(); p++)
cout << p->first << ": " << p->second << endl;
}
void UniqueWords(string fileName) {
// Will store the word and count.
map<string, unsigned int> wordsCount;
// Begin reading from file:
ifstream fileStream(fileName);
// Check if we've opened the file (as we should have).
if (fileStream.is_open())
while (fileStream.good())
{
// Store the next word in the file in a local variable.
string word;
fileStream >> word;
//Look if it's already there.
if (wordsCount.find(word) == wordsCount.end()) // Then we've encountered the word for a first time.
wordsCount[word] = 1; // Initialize it to 1.
else // Then we've already seen it before..
wordsCount[word]++; // Just increment it.
}
else // We couldn't open the file. Report the error in the error stream.
{
cerr << "Couldn't open the file." << endl;
}
// Print the words map.
PrintMap(wordsCount);
}
You can use a stream with a std::ctype<char> facet imbue()ed which considers whatever characters you fancy as space. Doing so would look something like this:
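#include <iostream>
#include <locale>
#include <string>
struct myctype_table {
std::ctype_base::mask table[std::ctype<char>::table_size];
// value-initialize the table so that no character has a class by default
myctype_table(char const* spaces) : table() {
// mark every character in the list as whitespace
while (*spaces) {
table[static_cast<unsigned char>(*spaces++)] = std::ctype_base::space;
}
}
};
class myctype
: private myctype_table
, public std::ctype<char> {
public:
myctype(char const* spaces)
: myctype_table(spaces)
// qualified, because std::ctype<char> has a member function named table()
, std::ctype<char>(myctype_table::table) {
}
};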
int main() {
std::locale myloc(std::locale(), new myctype(" \t\n\r?:.,!"));
std::cin.imbue(myloc);
for (std::string word; std::cin >> word; ) {
// words are separated by the extended list of spaces
}
}
This code isn't tested right now - I'm typing on a mobile device. I probably misused some of the std::ctype<char> interfaces, but something along those lines should work after fixing the names, etc.
Since you expect the forbidden characters at the end of the word, you can remove them before inserting the word into wordsCount:
if(word[word.length()-1] == ';' || word[word.length()-1] == ',' || ....){
word.erase(word.length()-1);
}
After fileStream >> word;, you can call this function. Take a look and see if it's clear:
string adapt(string word) {
string forbidden = "!?,.[];";
string ret = "";
for(int i = 0; i < word.size(); i++) {
bool ok = true;
for(int j = 0; j < forbidden.size(); j++) {
if(word[i] == forbidden[j]) {
ok = false;
break;
}
}
if(ok)
ret.push_back(word[i]);
}
return ret;
}
Something like this:
fileStream >> word;
word = adapt(word);
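The same filtering can also be written with the standard erase-remove idiom. This is only a sketch of an alternative to adapt(); IsForbidden and StripForbidden are illustrative names, and the default character list is copied from the function above.

```cpp
#include <algorithm>
#include <string>

// Predicate object: true when c is one of the characters to strip.
struct IsForbidden {
    std::string forbidden;
    explicit IsForbidden(const std::string& chars) : forbidden(chars) {}
    bool operator()(char c) const { return forbidden.find(c) != std::string::npos; }
};

// Returns a copy of word without any of the forbidden characters.
std::string StripForbidden(std::string word, const std::string& forbidden = "!?,.[];")
{
    // remove_if moves the kept characters to the front; erase drops the tail.
    word.erase(std::remove_if(word.begin(), word.end(), IsForbidden(forbidden)), word.end());
    return word;
}
```

It is called the same way as adapt(), for example word = StripForbidden(word); right after fileStream >> word;.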
I have data in the following format in a text file. Filename - empdata.txt
Note that there are no blank space between the lines.
Sl|EmployeeID|Name|Department|Band|Location
1|327427|Brock Mcneil|Research and Development|U2|Pune
2|310456|Acton Golden|Advertising|P3|Hyderabad
3|305540|Hollee Camacho|Payroll|U3|Bangalore
4|218801|Simone Myers|Public Relations|U3|Pune
5|144051|Eaton Benson|Advertising|P1|Chennai
I have a class like this
class empdata
{
public:
int sl,empNO;
char name[20],department[20],band[3],location[20];
};
I created an array of objects of class empdata.
How to read the data from the file which has n lines of data in the above specified format and store them to the array of (class)objects created?
This is my code
int main () {
string line;
ifstream myfile ("empdata.txt");
for(int i=0;i<10;i++) //processing only first 10 lines of the file
{
getline (myfile,line);
//What should I do with this "line" so that I can extract data
//from this line and store it in the class object?
}
return 0;
}
So basically my question is how to extract data from a string which has data separated by '|' character and store each data to a separate variable
I prefer to use the String Toolkit. The String Toolkit will take care of converting the numbers as it parses.
Here is how I would solve it.
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <strtk.hpp> // http://www.partow.net/programming/strtk
using namespace std;
// using strings instead of character arrays
class Employee
{
public:
int index;
int employee_number;
std::string name;
std::string department;
std::string band;
std::string location;
};
std::string filename("empdata.txt");
// assuming the file is text
std::fstream fs;
fs.open(filename.c_str(), std::ios::in);
if(fs.fail()) return false;
const char *whitespace = " \t\r\n\f";
const char *delimiter = "|";
std::vector<Employee> employee_data;
std::string line;
// process each line in turn
while( std::getline(fs, line ) )
{
// removing leading and trailing whitespace
// can prevent parsing problems from different line endings.
strtk::remove_leading_trailing(whitespace, line);
// strtk::parse combines multiple delimiters in these cases
Employee e;
if( strtk::parse(line, delimiter, e.index, e.employee_number, e.name, e.department, e.band, e.location) )
{
std::cout << "succeed" << std::endl;
employee_data.push_back( e );
}
}
AFAIK, there is nothing that does it out of the box. But you have all the tools to build it yourself
The C way
You read the lines into a char * (with cin.getline()) and then use strtok, and strcpy
The getline way
The getline function accepts a third parameter to specify a delimiter. You can make use of that to split the line through an istringstream. Something like:
int main() {
std::string line, temp;
std::ifstream myfile("file.txt");
std::getline(myfile, line);
while (myfile.good()) {
empdata data;
std::getline(myfile, line);
if (myfile.eof()) {
break;
}
std::istringstream istr(line);
std::getline(istr, temp, '|');
data.sl = ::strtol(temp.c_str(), NULL, 10);
std::getline(istr, temp, '|');
data.empNO = ::strtol(temp.c_str(), NULL, 10);
istr.getline(data.name, sizeof(data.name), '|');
istr.getline(data.department, sizeof(data.department), '|');
istr.getline(data.band, sizeof(data.band), '|');
istr.getline(data.location, sizeof(data.location), '|');
}
return 0;
}
This is the C++ version of the previous one
The find way
You read the lines into a string (as you currently do) and use string::find(char sep, size_t pos) to find the next occurrence of the separator, then copy the data (from string::c_str()) between the start of the substring and the separator to your fields.
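A minimal sketch of the find way. It collects the fields into a std::vector<std::string> rather than copying them into the char arrays of empdata, which is an assumption made to keep the example short; SplitWithFind is an illustrative name.

```cpp
#include <string>
#include <vector>

// Splits line on sep using std::string::find; empty fields are kept.
std::vector<std::string> SplitWithFind(const std::string& line, char sep)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type end = line.find(sep, start);
        if (end == std::string::npos) {
            // No more separators: the rest of the line is the last field.
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, end - start));
        start = end + 1;  // continue just past the separator
    }
    return fields;
}
```

Each resulting string can then be converted with strtol or copied into a fixed-size member with an explicit length check.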
The manual way
You just iterate the string. If the character is a separator, you put a null character ('\0') at the end of the current field and move on to the next field. Otherwise, you just write the character at the current position of the current field.
Which to choose ?
If you are more used to one of them, stick to it.
Following is just my opinion.
The getline way will be the simplest to code and to maintain.
The find way is mid level. It is still at a rather high level and avoids the usage of istringstream.
The manual way will be really low level, so you should structure it to make it maintainable. For example, you could use an explicit description of the lines as an array of fields, each with a maximum size and a current position. And as you have both int and char[] fields, it will be tricky. But you can easily configure it the way you want. For example, your code only allows 20 characters for the department field, whereas Research and Development in line 2 is longer. Without special processing, the getline way will leave the istringstream in a bad state and will not read anything more. And even if you clear the state, you will be badly positioned. So you should first read into a std::string and then copy the beginning to the char array field.
Here is a working manual implementation :
class Field {
public:
// virtual destructor: the objects are deleted through Field pointers
virtual ~Field() {}
virtual void reset() = 0;
virtual void add(empdata& data, char c) = 0;
};
class IField: public Field {
private:
int (empdata::*data_field);
bool ok;
public:
IField(int (empdata::*field)): data_field(field) {
ok = true;
reset();
}
void reset() { ok = true; }
void add(empdata& data, char c);
};
void IField::add(empdata& data, char c) {
if (ok) {
if ((c >= '0') && (c <= '9')) {
data.*data_field = data.*data_field * 10 + (c - '0');
}
else {
ok = false;
}
}
}
class CField: public Field {
private:
char (empdata::*data_field);
size_t current_pos;
size_t size;
public:
CField(char (empdata::*field), size_t size): data_field(field), size(size) {
reset();
}
void reset() { current_pos = 0; }
void add(empdata& data, char c);
};
void CField::add(empdata& data, char c) {
if (current_pos < size) {
char *ix = &(data.*data_field);
ix[current_pos ++] = c;
if (current_pos == size) {
ix[size -1] = '\0';
current_pos +=1;
}
}
}
int main() {
std::string line, temp;
std::ifstream myfile("file.txt");
Field* fields[] = {
new IField(&empdata::sl),
new IField(&empdata::empNO),
new CField(reinterpret_cast<char empdata::*>(&empdata::name), 20),
new CField(reinterpret_cast<char empdata::*>(&empdata::department), 20),
new CField(reinterpret_cast<char empdata::*>(&empdata::band), 3),
new CField(reinterpret_cast<char empdata::*>(&empdata::location), 20),
NULL
};
std::getline(myfile, line);
while (myfile.good()) {
Field** f = fields;
empdata data = {0};
std::getline(myfile, line);
if (myfile.eof()) {
break;
}
for (std::string::const_iterator it = line.begin(); it != line.end(); it++) {
char c;
c = *it;
if (c == '|') {
f += 1;
if (*f == NULL) {
continue;
}
(*f)->reset();
}
else {
(*f)->add(data, c);
}
}
// do something with data ...
}
for(Field** f = fields; *f != NULL; f++) {
delete *f;
}
return 0;
}
It is directly robust, efficient and maintainable : adding a field is easy, and it is tolerant to errors in input file. But it is way loooonger than the other ones, and would need much more tests. So I would not advise to use it without special reasons (necessity to accept multiple separators, optional fields and dynamic order, ...)
Try this simple code segment. It reads the file line by line and prints each line; you can then process each line as you need.
Data: as provided by you, in a file named data.txt.
package com.demo;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
public class Demo {
public static void main(String a[]) {
try {
File file = new File("data.txt");
FileReader fileReader = new FileReader(file);
BufferedReader bufferReader = new BufferedReader(fileReader);
String data;
while ((data = bufferReader.readLine()) != null) {
// data = br.readLine( );
System.out.println(data);
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
In console you will get output like this :
Sl|EmployeeID|Name|Department|Band|Location
1|327427|Brock Mcneil|Research and Development|U2|Pune
2|310456|Acton Golden|Advertising|P3|Hyderabad
3|305540|Hollee Camacho|Payroll|U3|Bangalore
4|218801|Simone Myers|Public Relations|U3|Pune
5|144051|Eaton Benson|Advertising|P1|Chennai
This is a simple idea, you may do what you need.
In C++ you can change the locale to add an extra character to the separator list of the current locale:
#include <locale>
#include <iostream>
struct pipe_is_space : std::ctype<char> {
pipe_is_space() : std::ctype<char>(get_table()) {}
static mask const* get_table()
{
static mask rc[table_size];
rc['|'] = std::ctype_base::space;
rc['\n'] = std::ctype_base::space;
return &rc[0];
}
};
int main() {
using std::string;
using std::cin;
using std::locale;
cin.imbue(locale(cin.getloc(), new pipe_is_space));
string word;
while(cin >> word) {
std::cout << word << "\n";
}
}
I would like to know if it is possible to inherit from std::ostream, and to override flush() in such a way that some information (say, the line number) is added to the beginning of each line. I would then like to attach it to a std::ofstream (or cout) through rdbuf() so that I get something like this:
ofstream fout("file.txt");
myostream os;
os.rdbuf(fout.rdbuf());
os << "this is the first line.\n";
os << "this is the second line.\n";
would put this into file.txt
1 this is the first line.
2 this is the second line.
flush() wouldn't be the function to override in this context, though you're on the right track. You should redefine overflow() on the underlying std::streambuf interface. For example:
class linebuf : public std::streambuf
{
public:
linebuf() : m_sbuf() { m_sbuf.open("file.txt", std::ios_base::out); }
int_type overflow(int_type c) override
{
char_type ch = traits_type::to_char_type(c);
if (c != traits_type::eof() && new_line)
{
std::ostream os(&m_sbuf);
os << line_number++ << " ";
}
new_line = (ch == '\n');
return m_sbuf.sputc(ch);
}
int sync() override { return m_sbuf.pubsync(); }
private:
std::filebuf m_sbuf;
bool new_line = true;
int line_number = 1;
};
Now you can do:
linebuf buf;
std::ostream os(&buf);
os << "this is the first line.\n"; // "1 this is the first line."
os << "this is the second line.\n"; // "2 this is the second line."
James Kanze's classic article on Filtering Streambufs has a very similar example which puts a timestamp at the beginning of every line. You could adapt that code.
Or, you could use the Boost tools that grew out of the ideas in that article.
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/array.hpp>
#include <cstdio>
#include <iostream>
#include <limits>
// line_num_filter is a model of the Boost concept OutputFilter which
// inserts a sequential line number at the beginning of every line.
class line_num_filter
: public boost::iostreams::output_filter
{
public:
line_num_filter();
template<typename Sink>
bool put(Sink& snk, char c);
template<typename Device>
void close(Device&);
private:
bool m_start_of_line;
unsigned int m_line_num;
boost::array<char, std::numeric_limits<unsigned int>::digits10 + 4> m_buf;
const char* m_buf_pos;
const char* m_buf_end;
};
line_num_filter::line_num_filter() :
m_start_of_line(true),
m_line_num(1),
m_buf_pos(m_buf.data()),
m_buf_end(m_buf_pos)
{}
// put() must return true if c was written to dest, or false if not.
// After returning false, put() with the same c might be tried again later.
template<typename Sink>
bool line_num_filter::put(Sink& dest, char c)
{
// If at the start of a line, print the line number into a buffer.
if (m_start_of_line) {
m_buf_pos = m_buf.data();
m_buf_end = m_buf_pos +
std::snprintf(m_buf.data(), m_buf.size(), "%u ", m_line_num);
m_start_of_line = false;
}
// If there are buffer characters to be written, write them.
// This can be interrupted and resumed if the sink is not accepting
// input, which is why the buffer and pointers need to be members.
while (m_buf_pos != m_buf_end) {
if (!boost::iostreams::put(dest, *m_buf_pos))
return false;
++m_buf_pos;
}
// Copy the actual character of data.
if (!boost::iostreams::put(dest, c))
return false;
// If the character copied was a newline, get ready for the next line.
if (c == '\n') {
++m_line_num;
m_start_of_line = true;
}
return true;
}
// Reset the filter object.
template<typename Device>
void line_num_filter::close(Device&)
{
m_start_of_line = true;
m_line_num = 1;
m_buf_pos = m_buf_end = m_buf.data();
}
int main() {
using namespace boost::iostreams;
filtering_ostream myout;
myout.push(line_num_filter());
myout.push(std::cout);
myout << "this is the first line.\n";
myout << "this is the second line.\n";
}
I'm a C++ newbie who came from Java, so I need some guidance on some really basic issues I'm stumbling upon as I go.
I'm reading lines from a file, and each line consists of 6 strings/ints, which will be sent as parameters to a temporary variable.
Example:
Local1,Local2,ABC,200,300,asphalt
However, there are two subtypes of variable. One has a string as the last parameter (like 'asphalt' in the example above). The other one has an int instead. I have a method that reads each parameter and sends it to a variable, but how do I detect if the last bit of string is an integer or a string beforehand, so I know if I should send it to a Type1 variable or a Type2 one?
Many thanks!
Since you want to determine the type of the last column, then this ought to work:
#include <iostream>
#include <string>
#include <cstdlib>
#include <vector>
#include <sstream>
#include <cctype>
#include <algorithm>
enum Types {
NONE,
STRING,
INTEGER,
DOUBLE
};
struct Found {
std::string string_val;
int integer_val;
double double_val;
enum Types type;
};
//copied verbatim from:
//http://stackoverflow.com/a/2845275/866930
inline bool isInteger(const std::string &s) {
if(s.empty() || ((!std::isdigit(s[0])) && (s[0] != '-') && (s[0] != '+'))) return false;
char * p ;
std::strtol(s.c_str(), &p, 10);
return (*p == 0);
}
//modified slightly for decimals:
inline bool isDouble(const std::string &s) {
if(s.empty() || ((!std::isdigit(s[0])) && (s[0] != '-') && (s[0] != '+'))) return false ;
char * p ;
std::strtod(s.c_str(), &p) ;
return (*p == 0);
}
bool isNotAlpha(char c) {
return !(std::isalpha(c));
}
//note: this searches for strings containing only characters from the alphabet
//however, you can modify that behavior yourself.
bool isString (const std::string &s) {
std::string::const_iterator it = std::find_if(s.begin(), s.end(), isNotAlpha);
return (it == s.end()) ? true : false;
}
void determine_last_column (const std::string& str, Found& found) {
//reset found:
found.integer_val = 0;
found.double_val = 0;
found.string_val = "";
found.type = NONE;
std::string temp;
std::istringstream iss(str);
int column = 0;
char *p;
while(std::getline(iss, temp, ',')) {
if (column == 5) {
//now check to see if the column is an integer or not:
if (isInteger(temp)) {
found.integer_val = static_cast<int>(std::strtol(temp.c_str(), &p, 10));
found.type = INTEGER;
}
else if (isDouble(temp)) {
found.double_val = static_cast<double>(std::strtod(temp.c_str(), &p));
found.type = DOUBLE;
}
else if (isString(temp)) {
found.string_val = temp;
found.type = STRING;
}
}
++column;
}
if (found.type == INTEGER) {
std::cout << "An integer was found: " << found.integer_val << std::endl;
}
else if(found.type == DOUBLE) {
std::cout << "A double was found: " << found.double_val << std::endl;
}
else if(found.type == STRING) {
std::cout << "A string was found: " << found.string_val << std::endl;
}
else {
std::cout << "A valid type was not found! Something went wrong..." << std::endl;
}
}
int main() {
std::string line_t1 = "Local1,Local2,ABC,200,300,asphalt";
std::string line_t2 = "Local1,Local2,ABC,200,300,-7000.3";
Found found;
determine_last_column(line_t1, found);
determine_last_column(line_t2, found);
return 0;
}
This outputs and correctly assigns the appropriate value:
A string was found: asphalt
A double was found: -7000.3
This version works on int, double, string; does not require boost; and, is plain vanilla C++98.
UPDATE:
This version now supports both positive and negative numbers that are integers or doubles, in addition to strings.
First, create an array that can store both strings and integers:
std::vector<boost::variant<std::string, int>> items;
Second, split the input string on commas:
std::vector<std::string> strings;
boost::split(strings, input, boost::is_any_of(","));
Last, parse each token and insert it into the array:
for (auto&& string : strings) {
try {
items.push_back(boost::lexical_cast<int>(string));
} catch(boost::bad_lexical_cast const&) {
items.push_back(std::move(string));
}
}
I'm looking for a way to write floats/ints/strings to a file and read them as floats/ints/strings. (basically read/write as ios::binary).
I ended up writing it myself. Just wanted to share it with others.
It might not be optimized, but I had some difficulties finding C++ code that mimics C#'s BinaryReader & BinaryWriter classes. So I created one class that handles both read and write.
Quick things to note:
1) "BM" is just a prefix for my classes.
2) BMLogging is a helper class that simply does:
cout << "bla bla bla" << endl;
So you can ignore the calls to BMLogging, I kept them to highlight the cases where we could warn the user.
Here's the code:
#include <iostream>
#include <fstream>
using namespace std;
// Create the macro so we don't repeat the code over and over again.
#define BMBINARY_READ(reader,value) reader.read((char *)&value, sizeof(value))
enum BMBinaryIOMode
{
None = 0,
Read,
Write
};
class BMBinaryIO
{
// the output file stream to write onto a file
ofstream writer;
// the input file stream to read from a file
ifstream reader;
// the filepath of the file we're working with
string filePath;
// the current active mode.
BMBinaryIOMode currentMode;
public:
BMBinaryIO()
{
currentMode = BMBinaryIOMode::None;
}
// the destructor will be responsible for checking if we forgot to close
// the file
~BMBinaryIO()
{
if(writer.is_open())
{
BMLogging::error(BMLoggingClass::BinaryIO, "You forgot to call close() after finishing with the file! Closing it...");
writer.close();
}
if(reader.is_open())
{
BMLogging::error(BMLoggingClass::BinaryIO, "You forgot to call close() after finishing with the file! Closing it...");
reader.close();
}
}
// opens a file with either read or write mode. Returns whether
// the open operation was successful
bool open(string fileFullPath, BMBinaryIOMode mode)
{
filePath = fileFullPath;
BMLogging::info(BMLoggingClass::BinaryIO, "Opening file: " + filePath);
// Write mode
if(mode == BMBinaryIOMode::Write)
{
currentMode = mode;
// check if we had a previously opened file to close it
if(writer.is_open())
writer.close();
writer.open(filePath, ios::binary);
if(!writer.is_open())
{
BMLogging::error(BMLoggingClass::BinaryIO, "Could not open file for write: " + filePath);
currentMode = BMBinaryIOMode::None;
}
}
// Read mode
else if(mode == BMBinaryIOMode::Read)
{
currentMode = mode;
// check if we had a previously opened file to close it
if(reader.is_open())
reader.close();
reader.open(filePath, ios::binary);
if(!reader.is_open())
{
BMLogging::error(BMLoggingClass::BinaryIO, "Could not open file for read: " + filePath);
currentMode = BMBinaryIOMode::None;
}
}
// if the mode is still the NONE/initial one -> we failed
return currentMode == BMBinaryIOMode::None ? false : true;
}
// closes the file
void close()
{
if(currentMode == BMBinaryIOMode::Write)
{
writer.close();
}
else if(currentMode == BMBinaryIOMode::Read)
{
reader.close();
}
}
bool checkWritabilityStatus()
{
if(currentMode != BMBinaryIOMode::Write)
{
BMLogging::error(BMLoggingClass::BinaryIO, "Trying to write with a non Writable mode!");
return false;
}
return true;
}
// Generic write method that will write any value to a file (except a string,
// for strings use writeString instead).
void write(void *value, size_t size)
{
if(!checkWritabilityStatus())
return;
// write the value to the file.
writer.write((const char *)value, size);
}
// Writes a string to the file
void writeString(string str)
{
if(!checkWritabilityStatus())
return;
// first add a \0 at the end of the string so we can detect
// the end of string when reading it
str += '\0';
// create char pointer from string.
char* text = (char *)(str.c_str());
// find the length of the string.
unsigned long size = str.size();
// write the whole string including the null.
writer.write((const char *)text, size);
}
// helper to check if we're allowed to read
bool checkReadabilityStatus()
{
if(currentMode != BMBinaryIOMode::Read)
{
BMLogging::error(BMLoggingClass::BinaryIO, "Trying to read with a non Readable mode!");
return false;
}
// check if we hit the end of the file.
if(reader.eof())
{
BMLogging::error(BMLoggingClass::BinaryIO, "Trying to read but reached the end of file!");
reader.close();
currentMode = BMBinaryIOMode::None;
return false;
}
return true;
}
// reads a boolean value
bool readBoolean()
{
if(checkReadabilityStatus())
{
bool value = false;
BMBINARY_READ(reader, value);
return value;
}
return false;
}
// reads a character value
char readChar()
{
if(checkReadabilityStatus())
{
char value = 0;
BMBINARY_READ(reader, value);
return value;
}
return 0;
}
// read an integer value
int readInt()
{
if(checkReadabilityStatus())
{
int value = 0;
BMBINARY_READ(reader, value);
return value;
}
return 0;
}
// read a float value
float readFloat()
{
if(checkReadabilityStatus())
{
float value = 0;
BMBINARY_READ(reader, value);
return value;
}
return 0;
}
// read a double value
double readDouble()
{
if(checkReadabilityStatus())
{
double value = 0;
BMBINARY_READ(reader, value);
return value;
}
return 0;
}
// read a string value
string readString()
{
if(checkReadabilityStatus())
{
char c;
string result = "";
while((c = readChar()) != '\0')
{
result += c;
}
return result;
}
return "";
}
};
EDIT: I replaced all the read/write methods above with these: (updated the usage code as well)
// Generic write method that will write any value to a file (except a string,
// for strings use writeString instead)
template<typename T>
void write(T &value)
{
if(!checkWritabilityStatus())
return;
// write the value to the file.
writer.write((const char *)&value, sizeof(value));
}
// Writes a string to the file
void writeString(string str)
{
if(!checkWritabilityStatus())
return;
// first add a \0 at the end of the string so we can detect
// the end of string when reading it
str += '\0';
// create char pointer from string.
char* text = (char *)(str.c_str());
// find the length of the string.
unsigned long size = str.size();
// write the whole string including the null.
writer.write((const char *)text, size);
}
// reads any type of value except strings.
template<typename T>
T read()
{
// value-initialize so that a failed or refused read returns a defined value
T value = T();
if(checkReadabilityStatus())
{
reader.read((char *)&value, sizeof(value));
}
return value;
}
// reads any type of value except strings.
template<typename T>
void read(T &value)
{
if(checkReadabilityStatus())
{
reader.read((char *)&value, sizeof(value));
}
}
// read a string value
string readString()
{
if(checkReadabilityStatus())
{
char c;
string result = "";
while((c = read<char>()) != '\0')
{
result += c;
}
return result;
}
return "";
}
// read a string value
void readString(string &result)
{
if(checkReadabilityStatus())
{
char c;
result = "";
while((c = read<char>()) != '\0')
{
result += c;
}
}
}
This is how you would use it to WRITE:
string myPath = "somepath to the file";
BMBinaryIO binaryIO;
if(binaryIO.open(myPath, BMBinaryIOMode::Write))
{
float value = 165;
binaryIO.write(value);
char valueC = 'K';
binaryIO.write(valueC);
double valueD = 1231.99;
binaryIO.write(valueD);
string valueStr = "spawnAt(100,200)";
binaryIO.writeString(valueStr);
valueStr = "helpAt(32,3)";
binaryIO.writeString(valueStr);
binaryIO.close();
}
Here's how you would use it to READ:
string myPath = "some path to the same file";
if(binaryIO.open(myPath, BMBinaryIOMode::Read))
{
cout << binaryIO.read<float>() << endl;
cout << binaryIO.read<char>() << endl;
double valueD = 0;
binaryIO.read(valueD); // or you could use read<double>()
cout << valueD << endl;
cout << binaryIO.readString() << endl;
cout << binaryIO.readString() << endl;
binaryIO.close();
}
EDIT 2: You could even write/read a whole structure in 1 line:
struct Vertex {
float x, y;
};
Vertex vtx; vtx.x = 2.5f; vtx.y = 10.0f;
// to write it
binaryIO.write(vtx);
// to read it
Vertex vtxRead;
binaryIO.read(vtxRead); // option 1
vtxRead = binaryIO.read<Vertex>(); // option 2
Hope my code is clear enough.
I subclassed ifstream and ofstream: ibfstream and obfstream. I made a little helper class that would detect the endianness of the machine I was compiling/running on. Then I added a flag for ibfstream and obfstream that indicated whether bytes in primitive types should be flipped. These classes also had methods to read/write primitive types and arrays of such types flipping the byte order as necessary. Finally, I set ios::binary for these classes by default.
I was often working on a little-endian machine and wanting to write big-endian files or vice versa. This was used in a program that did a lot of I/O with 3D graphics files of various formats.