std::stringstream to read int and strings, from a string - c++

I am programming in C++ and I'm not sure how to achieve the following:
I am copying a file stream to memory (because I was asked to; I'd prefer reading from the stream), and then trying to access its values to store them into string and int variables.
This is to create an interpreter. The code I will try to interpret is (ie):
10 PRINT A
20 GOTO 10
This is just a quick example code. Now the values will be stored in a "map" structure at first and accessed later when everything will be "interpreted".
The values to be stored are:
int lnum // line number
string cmd // command (PRINT and GOTO)
string exp // expression (A and 10 in this case but could hold expressions like (a*b)-c )
The question is: given the following code, how do I access those values and store them in memory?
Also the exp string is of variable size (can be just a variable or an expression) so I am not sure how to read that and store it in the string.
code:
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
#include <cstring>
#include <map>
#include <sstream>
using namespace std;
#include "main.hh"
int main ()
{
int lenght;
char *buffer;
// get file directory
string dir;
cout << "Please drag and drop here the file to interpret: ";
getline (cin,dir);
cout << "Thank you.\n";
cout << "Please wait while your file is being interpreted.\n \n";
// Open File
ifstream p_prog;
p_prog.open (dir.c_str());
// Get file size
p_prog.seekg (0, ios::end);
lenght = p_prog.tellg();
p_prog.seekg(0, ios::beg);
// Create buffer and copy stream to it
buffer = new char[lenght];
p_prog.read (buffer,lenght);
p_prog.close();
// Define map<int, string>
map<int, string> program;
map<int, string>::iterator iter;
/***** Read File *****/
int lnum; // line number
string cmd; // store command (goto, let, etc...)
string exp; // to be subst with expr. type inst.
// this is what I had in mind but not sure how to use it properly
// std::stringstream buffer;
// buffer >> lnum >> cmd >> exp;
program [lnum] = cmd; // store values in map
// free memory from buffer, out of scope
delete[] buffer;
return 0;
}
I hope this is clear.
Thank you for your help.
Valerio

You can use a std::stringstream to pull tokens, assuming that you already know the type.
For an interpreter, I'd highly recommend using an actual parser rather than writing your own. Boost's XPressive library or ANTLR work quite well. You can build your interpreter primitives using semantic actions as you parse the grammar or simply build an AST.
Another option would be Flex & Bison. Basically, these are all tools for parsing pre-defined grammars. You can build your own, but prepare for frustration. Recursively balancing parentheses or enforcing order of operations (multiplication before addition, for example) isn't trivial.
The raw C++ parsing method follows:
#include <sstream>
#include <string>
// ... //
// buffer is not null-terminated, so pass its length explicitly
std::istringstream iss{std::string(buffer, lenght)};
int a, b;
std::string c, d;
iss >> a;
iss >> b;
iss >> c;
iss >> d;
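Applied to the question's "lnum cmd exp" lines, the same token-pulling idea could look like the sketch below. It is only an illustration under some assumptions: there is one statement per line, the Statement struct and the parseProgram name are made up for this example, and the expression is simply "everything after the command", read with std::getline so that it may contain spaces.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical record type for this sketch: one parsed program line.
struct Statement {
    std::string cmd;  // command, e.g. PRINT or GOTO
    std::string exp;  // everything after the command; may contain spaces
};

// Parse text of the form "<lnum> <cmd> <exp>" (one statement per line)
// into a map keyed by line number. Lines that do not start with a number
// followed by a command are skipped.
std::map<int, Statement> parseProgram(const std::string& text) {
    std::map<int, Statement> program;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ls(line);
        int lnum;
        Statement st;
        if (!(ls >> lnum >> st.cmd)) continue;  // malformed or empty line
        std::getline(ls >> std::ws, st.exp);    // rest of the line, leading blanks dropped
        program[lnum] = std::move(st);
    }
    return program;
}
```

Calling parseProgram on the two example lines would give program[10] with cmd "PRINT" and exp "A", and program[20] with cmd "GOTO" and exp "10". The expression is kept as raw text here; evaluating it is a separate step.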

The way something like this can be done (especially the arithmetic expression part that you alluded to) is:
Write some code that determines where a token ends and begins. For example 5 or + would be called a token. You might scan the text for these, or common separators such as whitespace.
Write up the grammar of the language you're parsing. For example you might write:
expression -> value
expression -> expression + expression
expression -> expression * expression
expression -> function ( expression )
expression -> ( expression )
Then based on this grammar you would write something that parses tokens of expressions into trees.
So you might have a tree that looks like this (pardon the ASCII art)
+
/ \
5 *
/ \
x 3
Where this represents the expression 5 + (x * 3). By having this in a tree structure it is really easy to evaluate expressions in your code: you can recursively descend the tree, performing the operations with the child nodes as arguments.
See the following Wikipedia articles:
Parser
Top-down parsing (your needs are probably simple enough to use this)
Recursive-descent parser (a simple way to convert a grammar into code)
Or consult your local computer science department. :-)
There are also tools that will generate these parsers for you based on a grammar. You can do a search for "parser generator".
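To make the grammar-to-code step a bit more concrete, here is a minimal recursive-descent sketch. It makes several simplifying assumptions that are not part of the answer above: only +, * and parentheses over non-negative integers are handled, the grammar is split into expression/term/factor levels so that * binds tighter than +, and the value is computed directly instead of building a tree first. The ExprParser name is made up for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Minimal recursive-descent evaluator for + and * over non-negative integers.
class ExprParser {
public:
    explicit ExprParser(const std::string& text) : s(text), pos(0) {}

    // Evaluate the whole input; throws std::runtime_error on a syntax error.
    int parse() {
        int v = expression();
        skipSpaces();
        if (pos != s.size()) throw std::runtime_error("unexpected trailing input");
        return v;
    }

private:
    std::string s;
    std::size_t pos;

    void skipSpaces() {
        while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
    }

    // Consume ch if it is the next non-space character.
    bool accept(char ch) {
        skipSpaces();
        if (pos < s.size() && s[pos] == ch) { ++pos; return true; }
        return false;
    }

    // expression -> term { '+' term }
    int expression() {
        int v = term();
        while (accept('+')) v += term();
        return v;
    }

    // term -> factor { '*' factor }
    int term() {
        int v = factor();
        while (accept('*')) v *= factor();
        return v;
    }

    // factor -> number | '(' expression ')'
    int factor() {
        if (accept('(')) {
            int v = expression();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return v;
        }
        skipSpaces();
        if (pos >= s.size() || !std::isdigit(static_cast<unsigned char>(s[pos])))
            throw std::runtime_error("expected a number");
        int v = 0;
        while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
            v = v * 10 + (s[pos++] - '0');
        return v;
    }
};
```

For example, ExprParser("5 + 2 * 3").parse() evaluates the multiplication first, while ExprParser("(5 + 2) * 3").parse() follows the parentheses. Each grammar rule maps to one member function, which is why this style is a convenient starting point; variables and function calls would need extra cases in factor().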

Don't allocate the buffer dynamically yourself; use a vector.
This makes memory management implicit.
// Create buffer and copy stream to it
std::vector<char> buffer(lenght);
p_prog.read (&buffer[0],lenght);
p_prog.close();
Personally I don't explicitly use close() (unless I want to catch an exception). Just open a file in a scope that will cause the destructor to close the file when it goes out of scope.
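Putting both points together, a whole-file read could be wrapped in a small function like the sketch below. Note that it swaps in a different technique from the seekg/read code in the question: it fills the vector from std::istreambuf_iterator, so no size computation is needed. The readFile name is made up for this example.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read a whole file into memory.
// Returns an empty vector if the file cannot be opened.
std::vector<char> readFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    // The range constructor copies every byte until end of file.
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}   // `in` is destroyed here, which closes the file; no explicit close()
```

The vector owns the memory, so there is no delete[] to forget, and the stream's destructor takes care of the file.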

This might be of help:
http://oopweb.com/CPP/Documents/CPPHOWTO/Volume/C++Programming-HOWTO-7.html
Especially section 7.3.
You might be better off just >>'ing the lines in rather than taking the seeking and char buffer route.

Related

allocate array from file data c++

Here is my code where primes.txt contains some primes: 7, 11, 13, 17, 23... :
#include <stdio.h>
#include <stdlib.h>
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
string file= "primes.txt";
ifstream fichier;
fichier.open(file);
int count = 0, prime;
int *buffer = (int*) malloc(8080*sizeof(int));
while ( fichier >> prime){
buffer[count] = prime;
count++;
}
fichier.close();
return 0;
}
I would like to know if there's a way to allocate an array from file's data without using loops? I saw that you can do it with binary files but I was wondering if we could do it for files with string or int too.
Here is a method that doesn't use an explicit loop:
fichier >> buffer[count++]; // Read and convert to internal format
fichier >> buffer[count++];
fichier >> buffer[count++];
// Repeat for each number in the file
fichier >> buffer[count++];
In the runtime library, a loop is used when reading characters to build the number. Also, the input stream may be buffered, which is another loop.
You can avoid writing the loop if you use iterators or ranges. Here is an equivalent example using ranges:
auto view = std::ranges::istream_view<int>(fichier);
auto copy_result = std::ranges::copy(view, buffer);
int count = std::distance(buffer, copy_result.out);
The loops are still there, inside the standard library. There's no way to avoid that with input streams.
I saw that you can do it with binary files but I was wondering if we could do it for files with string or int too.
You transform the character strings into integers. Memory mapping doesn't work when the data needs to be transformed. This could work if the integers were written in binary format rather than text. That does however have the caveat that the file would not be portable to other systems that represent integers differently in memory.
Furthermore, there is no standard way to memory map files in C++. You would need to rely on an API provided by the system.
Even then, there would be loops. But those loops would be inside the operating system kernel and not having system calls in loop iterations of user code may potentially improve performance depending on usage pattern.
If you are referring to the loop used to store the length in count, then no, you'll always need to iterate through the content of the file.
It's possible that there are some methods to do this without an explicit loop, but internally they will necessarily contain a loop.

How to read double digits and single digits in C++

I have an issue where I cannot get my C++ program to read double digit integers.
My idea is to read it as string and then somehow parse it into separate integers and insert them into an array, but I am stuck on getting the code to read digits properly.
Sample Output:
i: 0 codeColumn 0
i: 1 codeColumn 1
i: 2 codeColumn 0 0
i: 3 codeColumn 0
i: 4 codeColumn 31 0
i: 5 codeColumn 1
i: 6 codeColumn 43 0
i: 7 codeColumn 3
i: 8 codeColumn 9 0
So the file is basically a line of triplets delimited by a comma:
0,1,0 0,0,31 0,0,18 0,0,8 0,11,0
My question is how do you get the trailing zeroes (see above) to move to a new line? I tried using "char" and a bunch of if statements to concatenate the single digits into double digits, but I feel like that's not really efficient or ideal. Any ideas?
My code:
#include <iostream> // Basic I/O
#include <string> // string classes
#include <fstream> // file stream classes
#include <sstream>
#include <vector>
using namespace std;
int main()
{
ifstream fCode;
fCode.open("code.txt");
vector<string> codeColumn;
string codeLine; // holds each comma-delimited token
while (getline(fCode, codeLine, ',')) {
codeColumn.push_back(codeLine);
}
for (size_t i = 0; i < codeColumn.size(); ++i) {
cout << " i: " << i << " codeColumn " << codeColumn[i] << endl;
}
fCode.close();
}
getline(fCode, codeLine, ',')
is going to read between commas, so 0,1,0 0,0,31 will split up exactly as you have seen.
0,1,0 0,0,31
^ ^ ^ ^
The tokens collected are everything between the ^s
You have two delimiters you need to take into account: comma and space. The easiest way to handle the space is with dumb old >>.
std::string triplet;
while (fCode >> triplet)
{
// do stuff with triplet. Maybe something like
std::istringstream strm(triplet); // make a stream out of the triplet
int a;
int b;
int c;
char sep1;
char sep2;
if (strm >> a >> sep1 >> b >> sep2 >> c // read all the tokens we want from triplet
&& sep1 == ',' && sep2 == ',') // and both separators are commas. Triplet is valid
{
// do something with a, b, and c
}
}
Documentation for std::istringstream.
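A self-contained sketch of the same idea, collecting the numbers instead of just visiting them, might look like this. The parseTriplets name and the choice to silently skip malformed tokens are assumptions made for the example.

```cpp
#include <array>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read whitespace-separated "a,b,c" triplets from a stream.
// Tokens that are not three integers joined by commas are skipped.
std::vector<std::array<int, 3>> parseTriplets(std::istream& in) {
    std::vector<std::array<int, 3>> result;
    std::string token;
    while (in >> token) {                    // >> splits on whitespace
        std::istringstream strm(token);
        std::array<int, 3> t{};
        char sep1 = 0, sep2 = 0;
        if (strm >> t[0] >> sep1 >> t[1] >> sep2 >> t[2]
            && sep1 == ',' && sep2 == ',') {
            result.push_back(t);             // keep only well-formed triplets
        }
    }
    return result;
}
```

Because the values are read as int, double digits such as 31 or 18 come out as one number each, which is what the original getline-on-comma approach could not do across the space.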
So, I will show you 3 solutions: first easy to understand C-style code, then more modern C++ code using the standard library's iterators and regex facilities, and finally an object-oriented C++ solution.
I will also explain to you that std::getline can be, but should not be, used for splitting strings into tokens.
I saw from your question that you had difficulty understanding that. And I understand your concern.
But let's start with an easy solution. I show the code and then explain it to you:
#include <iostream>
#include <fstream>
#include <string>
int main() {
// Open the source text file, and check, if there was no failure
if (std::ifstream fCode{ "r:\\code.txt" }; fCode) {
size_t tripletCounter{ 0 };
// Now, read all triplets from the file in a simple for loop
for (std::string triplet{}; fCode >> triplet; ) {
// Prepare output
std::cout << "\ni:\t" << tripletCounter++ << "\tcodeColumn:\t";
// Go through the triplet, search for comma, then output the parts
for (size_t i{ 0U }, startpos{ 0U }; i <= triplet.size(); ++i) {
// So, if there is a comma or the end of the string
if ((triplet[i] == ',') || (i == (triplet.size()))) {
// Print substring
std::cout << (triplet.substr(startpos, i - startpos)) << ' ';
startpos = i + 1;
}
}
}
}
else {
std::cerr << "\n*** Error, Could not open source file\n";
}
return 0;
}
You see, we need just a few lines of easy to understand code that will fulfill your requirements and produce the desired output.
Some features that may be new to you:
The if statement with initializer. This is available since C++17. You can (in addition to the condition) define a variable and initialize it. So, in
if (std::ifstream fCode{ "r:\\code.txt" }; fCode) {
we first define a variable with name "fCode" of type std::ifstream. We use the uniform initializer "{}" to initialize it with the input file name.
This will call the constructor for the variable "fCode" and open the file. (This is what this constructor does.) After the closing "}" of the if-statement, the variable "fCode" will fall out of scope and the destructor of the std::ifstream will be called. This will close the file automatically.
This type of if-statement was introduced to help prevent namespace pollution. A variable should only be visible in the scope where it is used. Without it, you would have to define the std::ifstream outside (before) the if; it would then be visible in the outer context, and the file would be closed very late. So, please get acquainted with it.
Next we define a "tripletCounter". That is just necessary for output. There is no other usage.
Then comes a for loop that also has an initializer. We first define an empty std::string "triplet" and then use the extraction operator to read text until the next whitespace. This is how the "extractor" (>>) works. We use the whole expression as the loop condition, to check whether the extraction worked or whether we hit the end of file (or some other error). This works because the extraction operator returns the stream it was working on, so a reference to "fCode", and the stream has an explicit conversion to bool that reports whether the stream is still in a good state. Please see here.
You should always and for every IO-Operation check, if it worked or not.
So, next we split the triplet (e.g. "0,1,0") into its substrings with a very easy for loop. We go through all characters in the string and check whether the current character is a comma or the end of the string. In that case, we output the characters before the delimiter.
Very simple and easy to understand. std::getline is not needed here.
So, next solution, more advanced:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <iterator>
#include <regex>
std::regex re(",");
int main() {
// Open the source text file, and check, if there was no failure
if (std::ifstream fCode{ "r:\\code.txt" }; fCode) {
size_t tripletCounter{ 0 };
// Now, read all triplets from the file into a vector
std::vector triplets(std::istream_iterator<std::string>(fCode), {});
// Next, go through all triplets
for (const std::string &triplet : triplets) {
// Prepare output
std::cout << "\ni:\t" << tripletCounter++ << "\tcodeColumn:\t";
// Split triplet into code column. All codes are in vector codeColums
std::vector codeColumns(std::sregex_token_iterator(triplet.begin(), triplet.end(), re, -1), {});
//Show codes
for (const std::string& code : codeColumns) std::cout << code << ' ';
}
}
else {
std::cerr << "\n*** Error, Could not open source file\n";
}
return 0;
}
The beginning is the same. But then:
// Now, read all triplets from the file into a vector
std::vector triplets(std::istream_iterator<std::string>(fCode), {});
Uh-oh. What's that? Let's start with the std::istream_iterator. If you read its description, you will find that it basically calls the extraction operator >> for the specified type. And since it is an iterator, it calls it again and again each time the iterator is incremented. OK, understandable, but then:
We define the variable triplets as a std::vector and call its constructor with 2 arguments. That constructor is the so-called range constructor of std::vector. Please see the description for constructor 5. Aha, it gets a "begin()" iterator and an "end()" iterator. But what is this strange {} in place of the "end()" iterator? This is the default initializer (please see here and here). And if we look at the description of std::istream_iterator, we can see that the default-constructed iterator is the end iterator. OK, understood.
I assume that you know about the range-based for, which comes next. Good. But now we come to the most difficult point: splitting a string with delimiters. People are using std::getline. But why? Why are people doing such strange stuff?
What do people expect from the function, when they read
getline ?
Most people would say, Hm, I guess it will read a complete line from somewhere. And guess what, that was the basic intention for this function. Read a line from a stream and put it into a string.
As you can see here std::getline has some additional functionality.
And this led to a major misuse of this function for splitting up std::strings into tokens.
Splitting strings into tokens is a very old task. In very early C there was the function strtok, which still exists, even in C++. Please see std::strtok.
But because of the additional functionality of std::getline, it has been heavily misused for tokenizing strings. If you look at the top question/answer regarding how to parse a CSV file (please see here), then you will see what I mean.
People are using std::getline to read a text line, a string, from the original stream, then stuffing it into an std::istringstream again and use std::getline with delimiter again to parse the string into tokens.
Weird.
Because, for many years now, we have had a dedicated facility for tokenizing strings, explicitly designed for that purpose. It is the
std::sregex_token_iterator
And since we have such a dedicated function, we should simply use it.
This thing is an iterator, for iterating over a std::string, hence the name starts with an s. The first two arguments define the range of input to operate on (begin(), end()); then comes a std::regex describing what should be matched, or what should not be matched, in the input string. The matching strategy is given by the last parameter.
0 --> give me the stuff that I defined in the regex and
-1 --> give me that what is NOT matched based on the regex.
We can use this iterator for storing the tokens in a std::vector. The std::vector has a range constructor, which takes 2 iterators as parameter, and copies the data between the first iterator and 2nd iterator to the std::vector. The statement
std::vector tokens(std::sregex_token_iterator(s.begin(), s.end(), re, -1), {});
defines a variable “tokens” as a std::vector and uses again the range-constructor of the std::vector. Please note: I am using C++17 and can define the std::vector without template argument. The compiler can deduce the argument from the given function parameters. This feature is called CTAD ("class template argument deduction"). I also used that for the vector above.
Additionally, you can see that I do not use the "end()"-iterator explicitly.
This iterator will be constructed from the empty brace-enclosed default initializer with the correct type, because it will be deduced to be the same as the type of the first argument due to the std::vector constructor requiring that, as already described.
You can read any number of tokens in a line and put it into the std::vector
But you can do even more. You can validate your input. If you use 0 as last parameter, you define a std::regex that even validates your input. And you get only valid tokens.
Overall, the usage of a dedicated functionality is superior over the misused std::getline and people should simply use it.
Some people may complain about the function overhead, but how many of them are processing big data? And even then, the approach would probably be to use std::string::find and std::string::substr, or std::string_view, or whatever.
So, somehow advanced, but you will eventually learn it.
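The two matching strategies described above (-1 and 0) can be compared side by side in a small sketch. The splitOn and matchesOf names are made up for this example, and the end iterator is spelled out instead of using {} so that the code does not rely on C++17 deduction.

```cpp
#include <regex>
#include <string>
#include <vector>

// Submatch -1: return the text between matches of the delimiter regex.
std::vector<std::string> splitOn(const std::string& s, const std::regex& delim) {
    return std::vector<std::string>(
        std::sregex_token_iterator(s.begin(), s.end(), delim, -1),
        std::sregex_token_iterator());
}

// Submatch 0: return the matches themselves, which validates while splitting.
std::vector<std::string> matchesOf(const std::string& s, const std::regex& pattern) {
    return std::vector<std::string>(
        std::sregex_token_iterator(s.begin(), s.end(), pattern, 0),
        std::sregex_token_iterator());
}
```

For instance, splitOn("0,0,31", std::regex(",")) yields the three fields, while matchesOf("0,x,31", std::regex("[0-9]+")) yields only the two numeric fields and drops the invalid x.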
And now we will use an object oriented approach. As you know, C++ is an object oriented language.
We can put data, and the methods working on that data, in a class (struct). The functionality is encapsulated. Only the class should know how to operate on its data. So, we will define a class "Code". This contains a std::array consisting of 3 std::strings and associated functions. For the array we made a type alias for easier writing. The functions that we need are input and output. So, we will overload the extractor and the inserter operator.
In these operators, we use functions as described above.
And as a result of all this work, we get an elegant main function, where all the work is done in 3 lines of code.
Please see:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <iterator>
#include <regex>
#include <array>
#include <algorithm>
using Triplet = std::array<std::string, 3>;
std::regex re(",");
struct Code {
// Our Data
Triplet triplet{};
// Overload extractor operator for easier input
friend std::istream& operator >> (std::istream& is, Code& c) {
// Read a triplet with commas
if (std::string s{}; is >> s) {
// Copy the single columns of the triplet in to our internal Data structure
std::copy(std::sregex_token_iterator(s.begin(), s.end(), re, -1), {}, c.triplet.begin());
}
return is;
}
// Overload inserter for easier output
friend std::ostream& operator << (std::ostream& os, const Code& c) {
return os << c.triplet[0] << ' ' << c.triplet[1] << ' ' << c.triplet[2];
}
};
int main() {
// Open the source text file, and check, if there was no failure
if (std::ifstream fCode{ "r:\\code.txt" }; fCode) {
// Now, read all triplets from the file, split it and put the Codes into a vector
std::vector code(std::istream_iterator<Code>(fCode), {});
// Show output
for (size_t tripletCounter{ 0U }; tripletCounter < code.size(); tripletCounter++)
std::cout << "\ni:\t" << tripletCounter << "\tcodeColumn:\t" << code[tripletCounter];
}
else {
std::cerr << "\n*** Error, Could not open source file\n";
}
return 0;
}

Without looping through the string, how can I grab all integers from said string? String class methods?

Rule I must abide by
Do not use loops or character arrays to process strings for any of the questions below. Use member functions of the string class. You can use a loop to read the file and to count the number of processors.
Some Tips
Here are some functions that you might find useful:
File class: getline
String class: find, rfind, substr, length, c_str, constant npos
Misc. functions: atoi, atof
(may require the C standard library for C++, i.e., <cstdlib>)
istringstream
(Both of the above are ways to convert a string to a number.)
Here is an example string I would need to extract:
"46 bits physical, 48 bits virtual"
I can go through the same string twice. I'd want to grab 46 and store it and then do the same for 48.
I'm not sure the best way to go about this. Is it possible to do something like this:
string.find_first_of(integer);
string.find_last_not_of(integer);
Or possibly regex? I think I can use that as long as I don't need to use a 3rd party library or anything like that.
The following ended up working for me.
#include <sstream>
string myString = "hello 47";
string word;
int val;
istringstream iss (myString);
iss >> word >> val; // skip the leading word first; reading "hello" as an int would fail
cout << val << endl;
// The output of val will be 47.
Since you indicated in the comments that STL is allowed, you can use a generic programming approach relying on STL algorithms. For example,
#include <iostream>
#include <algorithm>
#include <cctype> // isspace, isdigit
#include <iterator>
#include <string>
int main()
{
using namespace std;
string haystack = "46 bits physical, 48 bits virtual";
string result;
remove_copy_if(begin(haystack), end(haystack),
back_inserter(result),
[](char c) { return !isspace(c) && !isdigit(c); } );
cout << result;
}
You basically treat the characters in the string as a stream of inputs, and from that filter out all non-digit characters while keeping whatever delimiter character you want to use. My example keeps whitespace as the delimiter.
The above gives the output (with runs of spaces left where the words were removed)
46 48
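For the rule as stated (string member functions only, no loops over the string), a sketch using find_first_of, find_first_not_of and substr could look like this. The nextInt helper is made up for this example, and it uses std::stoi (C++11) where the assignment's hints mention atoi.

```cpp
#include <string>

// Return the first run of digits found at or after `from`, converted to int.
// `next` receives the position just past that run, or npos if the run
// reaches the end of the string or no digits were found (0 is returned then).
// std::stoi throws std::out_of_range if the run does not fit in an int.
int nextInt(const std::string& s, std::string::size_type from,
            std::string::size_type& next) {
    const char* digits = "0123456789";
    std::string::size_type start = s.find_first_of(digits, from);
    if (start == std::string::npos) { next = std::string::npos; return 0; }
    std::string::size_type end = s.find_first_not_of(digits, start);
    next = end;
    std::string::size_type len =
        (end == std::string::npos) ? std::string::npos : end - start;
    return std::stoi(s.substr(start, len));
}
```

On "46 bits physical, 48 bits virtual", a first call with from = 0 returns 46 and sets next to 2; a second call starting from that position returns 48. That goes through the string twice without writing a loop.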

Obtaining a certain section from a line in a file (C++)

I've spent a lot of time looking online to find an answer for this, but nothing was helping, so I figured I'd post my specific scenario. I have a .txt file (see below), and I am trying to write a routine that just finds a certain chunk of a certain line (e.g. I want to get the 5 digit number from the second column of the first line). The file opens fine and I'm able to read in the entire thing, but I just don't know how to get certain chunks from a line specifically. Any suggestions? (NOTE: These names and numbers are fictional...)
//main cpp file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
ifstream fin;
fin.open("customers.txt");
return 0;
}
//customers.txt
100007 13153 09067.50 George F. Thompson
579489 21895 00565.48 Keith Y. Graham
711366 93468 04602.64 Isabel F. Anderson
Text parsing is not such a trivial thing to implement.
If your format won't change, you could try to parse it yourself: seek to the part of the file you need, use regular expressions to extract the part of the stream that you need, or read a certain number of chars.
If you go the regex way, you'll need C++11 or a third party library, like Boost or POCO.
If you can format the text file then you might also want to choose a standard to structure your data, like XML, and use the facilities of that format to extract the information you want. POCO might help you there.
Some simple hints in your code to help you, you will need to complete the code. But the missing pieces are easy to find at stackoverflow.
//main cpp file
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
void splitLine(const char* str, vector<string>& results){ // pass by reference so the caller sees the values
// splits str and stores each value in results vector
}
int main()
{
ifstream fin;
fin.open("customers.txt");
char buffer[256];
if(fin.good()){
// Loop on the read itself so that a failed read ends the loop
while(fin.getline(buffer, sizeof buffer)){
cout << buffer << endl;
vector<string> results;
splitLine(buffer, results);
// now results MUST contain 4 strings, for each
// column in a line
}
}
return 0;
}
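One possible way to fill in splitLine is sketched below. It deviates from the stub in a few labeled ways: it takes a std::string (a char buffer converts implicitly), it returns a bool to report malformed lines, and it assumes the layout of customers.txt shown in the question, that is, three whitespace-separated fields followed by a name that may itself contain spaces.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split one customer line into 4 columns: three whitespace-separated
// fields, then the remainder of the line (the name).
// Returns false if the line has fewer than three leading fields.
bool splitLine(const std::string& line, std::vector<std::string>& results) {
    results.clear();
    std::istringstream in(line);
    std::string a, b, c, name;
    if (!(in >> a >> b >> c)) return false;
    std::getline(in >> std::ws, name);  // rest of the line without leading blanks
    results = {a, b, c, name};
    return true;
}
```

After a successful call on the first line of the file, results[1] holds the 5 digit number from the second column as text; std::stoi can convert it if a number is needed.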
If the columns are separated by whitespace, then the second column of the first row is simply the second token extracted from the stream.
std::ifstream input{"customers.txt"}; // Open file input stream.
std::istream_iterator<int> it{input}; // Create iterator to first token.
int number = *std::next(it); // Advance to next token and dereference.

saving files in c++ at different full paths

I am writing a program in C++ in which I need to save some .txt files to different locations depending on a counter variable in the program. What should the code be? Please help.
I know how to save file using full path
ofstream f;
f.open("c:\\user\\Desktop\\data1\\example.txt");
f.close();
I want "c:\user\Desktop\data[CTR]\filedata.txt"
But here data1, data2, data3, ... and so on have to be accessed by me, and a text file created in each, so what is the code?
Counter variable "ctr" is already evaluated in my program.
You could snprintf to create a custom string. An example is this:
char filepath[100];
snprintf(filepath, 100, "c:\\user\\Desktop\\data%d\\example.txt", datanum);
Then whatever you want to do with it:
ofstream f;
f.open(filepath);
f.close();
Note: snprintf limits the maximum number of characters that can be written to your buffer (filepath). This is very useful when the arguments of *printf are strings (that is, using %s), to avoid buffer overflow. In this example, where the argument is a number (%d), the output length is already bounded: a 32-bit int prints as at most 11 characters, including a minus sign. So just making the filepath buffer big enough is sufficient; in this special case, sprintf could be used instead of snprintf.
You can use the standard string streams, such as:
#include <fstream>
#include <string>
#include <sstream>
using namespace std;
void f ( int data1 )
{
ostringstream path;
path << "c:\\user\\Desktop\\data" << data1 << "\\example.txt"; // e.g. ...\data1\example.txt
ofstream file(path.str().c_str());
if (!file.is_open()) {
// handle error.
}
// write contents...
}
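If C++11 is available, std::to_string is a third way to build the path. The sketch below uses the target layout from the question; the makeDataPath name is made up for this example.

```cpp
#include <string>

// Build "c:\user\Desktop\data<ctr>\filedata.txt" for a given counter value.
std::string makeDataPath(int ctr) {
    return "c:\\user\\Desktop\\data" + std::to_string(ctr) + "\\filedata.txt";
}
```

The result can be passed straight to an ofstream constructor or open(). Keep in mind that opening the file does not create missing directories, so each dataN folder has to exist already (or be created first, for example with std::filesystem::create_directories in C++17).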