Character by Character Input from a file, in C++ - c++

Is there any way to get input from a file one number at a time?
For example, I want to store the following integer in a vector of integers, since it is so long that it can't be held by even a long long int.
12345678901234567900
So how can I read this number from a file so that I can:
vector<int> numbers;
numbers.push_back(/*>>number goes here<<*/);
I know that the above code isn't really complete but I hope that it explains what I am trying to do.
Also, I've tried Google and so far it has proved ineffective, because only tutorials for C are coming up, which aren't really helping me much.
Thanks in advance,
Dan Chevalier

This could be done in a variety of ways, all of them boiling down to converting each char '0'..'9' to the corresponding integer 0..9. Here's how it can be done with a single function call:
#include <string>
#include <iostream>
#include <vector>
#include <iterator>
#include <functional>
#include <algorithm>
int main()
{
std::string s = "12345678901234567900";
std::vector<int> numbers;
std::transform(s.begin(), s.end(), std::back_inserter(numbers),
[](char c) { return c - '0'; }); // lambda instead of std::bind2nd, which was removed in C++17
// output
copy(numbers.begin(), numbers.end(),
std::ostream_iterator<int>(std::cout, " "));
std::cout << '\n';
}
When reading from a file, you could read the string and transform(), or even transform() directly from istream iterators, if there is nothing else in that file besides your number:
std::ifstream f("test.txt"); // needs #include <fstream>
std::vector<int> numbers;
std::transform(std::istream_iterator<char>(f),
std::istream_iterator<char>(),
std::back_inserter(numbers),
[](char c) { return c - '0'; }); // lambda instead of std::bind2nd, which was removed in C++17

Off the top of my head this should fill up a character array which you can then iterate through. I realize it's not exactly what you were after but it's my preferred method.
void readfile(char *string)
{
ifstream NumberFile;
NumberFile.open("./Number"); //For a unix file-system
NumberFile >> string; // caution: no bounds check; the caller's buffer must hold the whole number plus '\0'
NumberFile.close();
}
Also, to perform operations on the actual numbers you can use:
CharacterArray[ElementNumber] - '0'
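and to get the number when it is small enough to fit in a datatype, you add up each element (minus '0') multiplied by 10 to the power of its distance from the last digit, i.e. (length - 1 - index), because the first element of the array is the most significant digit.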

You can read a char at a time with char c; cin.get(c); and convert it to the numeral with c -= '0'. But perhaps you can just read it as a string or use something like BigNum.

Related

Without looping through the string, how can I grab all integers from said string? String class methods?

Rule I must abide by
Do not use loops or character arrays to process strings for any of the questions below. Use member functions of the string class. You can use a loop to read the file and to count the number of processors.
Some Tips
Here are some functions that you might find useful:
File class: getline
String class: find, rfind, substr, length, c_str, constant npos
Misc. functions: atoi, atof
(may require the C standard library for C++, i.e., <cstdlib>)
istringstream
(Both of the above are ways to convert a string to a number.)
Here is an example string I would need to extract:
"46 bits physical, 48 bits virtual"
I can go through the same string twice. I'd want to grab 46 and store it and then do the same for 48.
I'm not sure the best way to go about this. Is it possible to do something like this:
string.find_first_of(integer);
string.find_last_not_of(integer);
Or possibly regex? I think I can use that as long as I don't need to use a 3rd party library or anything like that.
The following ended up working for me.
#include <sstream>
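string myString = "hello 47";
string word;
int val;
istringstream iss (myString);
iss >> word >> val; // skip the leading word first; extracting an int directly from "hello" would fail
cout << val << endl;
// The output of val will be 47.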
Since you indicated in the comments that STL is allowed, you can use a generic programming approach relying on STL algorithms. For example,
#include <iostream>
#include <algorithm>
#include <iterator>
#include <string>
int main()
{
using namespace std;
string haystack = "46 bits physical, 48 bits virtual";
string result;
remove_copy_if(begin(haystack), end(haystack),
back_inserter(result),
[](char c) { return !isspace(c) && !isdigit(c); } );
cout << result;
}
You basically treat the characters in the string as a stream of inputs, then filter out all non-digit characters while keeping whatever delimiter character you want to use. My example keeps whitespace as the delimiter.
The above gives the output below (every space from the original string is kept, so several appear in a row where words were removed):
46   48

why does char[1] read entire word from my input file?

This is what I have done so far. I want to read words from a file in C++, and I am only allowed to use the cstring library. This is my code:
#include <cstring>
#include <fstream>
#include <stdio.h>
using namespace std;
int main(){
ifstream file;
char word[1];
file.open("p.txt");
while (!file.eof()){
file >> word;
cout << word << endl;
}
system("pause");
return 0;
}
It is working fine and reading one word at a time. But I don't understand how this is working fine.
How can a char array of any size, be it char word[1] or char word[50], read only one word at a time, ignoring spaces?
And further, I want to store these words in a dynamic array. How can I achieve this? Any guidance would be appreciated.
Your code has undefined behaviour. operator >> simply overwrites memory beyond the array.
Note that the header <stdio.h> you included is not used in the program. On the other hand, you need to include the header <cstdlib>, which declares the function system.
As for your second question, you should use a standard container, for example std::vector<std::string>.
For example
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cstdlib>
int main()
{
std::ifstream file("p.txt");
std::string s;
std::vector<std::string> v;
v.reserve( 100 );
while ( file >> s ) v.push_back( s );
std::system( "pause" );
return 0;
}
Or you can simply define the vector as
std::vector<std::string> v( ( std::istream_iterator<std::string>( file ) ),
std::istream_iterator<std::string>() );
provided that you include the header <iterator>.
For example
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <iterator>
#include <cstdlib>
int main()
{
std::ifstream file("p.txt");
std::vector<std::string> v( ( std::istream_iterator<std::string>( file ) ),
std::istream_iterator<std::string>() );
for ( const std::string &s : v ) std::cout << s << std::endl;
std::system( "pause" );
return 0;
}
Your code is invoking undefined behavior. That it doesn't crash is a roll of the dice, but its execution is not deterministic precisely because that is the nature of being undefined.
The easiest way (I've found) to load a file of words with whitespace separation is by:
std::ifstream inp("p.txt");
std::istream_iterator<std::string> inp_it(inp), inp_eof;
std::vector<std::string> strs(inp_it, inp_eof);
strs will contain every whitespace delimited char sequence as a linear vector of std::string. Use std::string for dynamic string content and don't feel the least bit guilty about exploiting the hell out of the hard work those that came before you gave us all: The Standard Library.
Your code is failing due to the overload of char * for operator>>.
An array of char, regardless of its size, will decay to the type char *, where the value is the address of the start of the array.
For compatibility with the C language, the overloaded operator>>(char *) has been implemented to read one or more characters until a terminating whitespace character is reached, or there is an error with the stream.
If you declare an array of 1 character and read from a file containing "California", the function will put 'C' into the first location of the array and keep writing the remaining characters to the next locations in memory (regardless of what data type they are). This is known as a buffer overflow.
A much safer method is to read into a std::string or if you only want one character, use a char variable. Look in your favorite C++ reference for the getline methods. There is an overload for reading until a given delimiter is reached.
You only need a couple changes:
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main(){
ifstream file;
string word;
file.open("p.txt");
while (file >> word){
cout << word << endl;
}
system("pause");
return 0;
}
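It works because you are lucky and you don't overwrite some critical memory. You need to allocate enough bytes for the char word array, say char word[64], and even then a longer word would overflow it unless you limit the read, e.g. with file >> setw(sizeof word) >> word (setw is declared in <iomanip>). Use while(file>>word) as your test for EOF. In the loop you can push_back the word into a std::vector<string> if you are allowed to use the C++ STL.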
If you want a simple C++11 STL-like solution, use this
#include <algorithm>
#include <iterator>
#include <vector>
#include <string>
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
ifstream fin("./in.txt"); // input file
vector<string> words; // store the words in a vector
copy(istream_iterator<string>(fin),{}, back_inserter(words)); // insert the words
for(auto &elem: words)
cout << elem << endl; // display them
}
Or, more compactly, construct the container directly from the stream iterator like
vector<string> words(istream_iterator<string>(fin),{});
and remove the copy statement.
If instead of a vector<string> you use a multiset<string> (#include <set>) and change
copy(istream_iterator<string>(fin),{}, back_inserter(words)); // insert the words
to
copy(istream_iterator<string>(fin),{}, inserter(words, words.begin())); // insert the words
you get the words ordered. So using STL is the cleanest approach in my opinion.
You're using C++, so you can avoid all that C stuff.
std::string word;
std::vector<std::string> words;
std::fstream stream("wordlist");
// this assumes one word (or phrase, with spaces, etc) per line...
while (std::getline(stream, word))
words.push_back(word);
or for multiple words (or phrases, with spaces, etc) per line separated by commas:
while (std::getline(stream, word, ','))
words.push_back(word);
or for multiple words per line separated by spaces:
while(stream >> word)
words.push_back(word);
No need to worry about buffer sizes or memory allocation or anything like that.
file >> char *
will work with any char *, and you are using
file >> word;
so it simply sees the word variable as a char *. Writing past the end of the array may cause a segmentation fault somewhere, and as your code grows you may see things stop working for no apparent reason. A debugger such as GDB will show you where the segfault occurs.

Length of a char array

I have the code like this:
#include <iostream.h>
#include <fstream.h>
void main()
{
char dir[25], output[10],temp[10];
cout<<"Enter file: ";
cin.getline(dir,25); //like C:\input.txt
ifstream input(dir,ios::in);
input.getline(output,'\eof');
int num = sizeof(output);
ofstream out("D:\\size.txt",ios::out);
out<<num;
}
I want to print the length of the output. But it always returns the number 10 (the given length) even if the input file has only 2 letters ( Like just "ab"). I've also used strlen(output) but nothing changed. How do I only get the used length of array?
I'm using VS C++ 6.0
sizeof operator on array gives you size allocated for the array, which is 10.
You need to use strlen() to know length occupied inside the array, but you need to make sure the array is null terminated.
With C++, a better alternative is to simply use std::string instead of the character array. Then you can use std::string::size() to get the size.
sizeof always prints the defined size of an object based on its type, not anything like the length of a string.
At least by current standards, your code has some pretty serious problems. It looks like it was written for a 1993 compiler running on MS-DOS, or something on that order. With a current compiler, the C++ headers shouldn't have .h on the end, among other things.
#include <iostream>
#include <fstream>
#include <string>
int main() {
std::string dir, output, temp;
std::cout<<"Enter file: ";
std::getline(std::cin, dir); //like C:\input.txt
std::ifstream input(dir.c_str());
std::getline(input, output);
std::ofstream out("D:\\size.txt");
out<<output.size();
}
The getline that you are using is an unformatted input function so you can retrieve the number of characters extracted with input.gcount().
Note that \e is not a standard escape sequence and the character constant \eof almost certainly doesn't do what you think it does. If you don't want to recognise any delimiter you should use read, not getline, passing the size of your buffer so that you don't overflow it.

How does this program work

This is my first C++ program. It prints the number of words in the input.
My first question: how does it go into the loop and add to the count? Is it every time I type the space character? If so, how does it know I'm trying to count words?
using namespace std;
int main() {
int count;
string s;
count = 0;
while (cin >> s)
count++;
cout << count << '\n';
return 0;
}
My second question: can someone explain to me what namespace std means for a beginner?
When you do cin >> s, you read one word and put it in the string. Yes, it reads char by char until it reaches a delimiter (whitespace).
std means Standard. The Standard C++ library is inside the std namespace. You can rewrite your code without using namespace std:
int main() {
int count;
std::string s;
count = 0;
while (std::cin >> s)
count++;
std::cout << count << '\n';
return 0;
}
I discourage novices from using the using namespace std statement because it makes it harder to understand what is going on.
cin will capture input until whitespace, yes. The specific style of loop you have will go until an End-Of-File (EOF) is found or until bad input is provided. That loop doesn't look like common C++ practice to me, but it's described here.
2. namespace std is how you tell the compiler where to look to find the objects you're referencing in your code. Because different objects are "inside" different namespaces, you either have to tell the compiler where they are specifically (aka std::cin) or tell it for convenience where an object you use will be in the future (with using namespace std).
In your code, cin >> s attempts to read a std::string from input stream. If the attempt succeeds, then the returned value of cin >> s implicitly converts into true and the while loop continues, incrementing the counter. Otherwise, the while loop exits when the attempt fails, as there is no more data to read from the input stream.
You can use std::distance to count the words, as shown below:
#include <iostream>
#include <algorithm>
#include <iterator>
#include <string>
int main() {
std::istream_iterator<std::string> begin(std::cin), end;
size_t count = std::distance(begin, end);
std::cout << count << std::endl;
return 0;
}
Demo : http://www.ideone.com/Hldz3
In this code, you create two iterators begin and end, passing both to std::distance function. The function calculates the distance between begin and end. The distance is nothing but the number of strings in the input stream, because the iterator begin iterates over strings coming from the input stream, and end defines the end of the iterator where begin stops iterating. The reason why begin iterates over strings is because the template argument to std::istream_iterator is std::string:
std::istream_iterator<std::string> begin(std::cin), end;
//^^^^^^^^^^^
If you change this to char, then begin will iterate over char, which means the following program will count the number of characters in the input stream:
#include <iostream>
#include <algorithm>
#include <iterator>
int main() {
std::istream_iterator<char> begin(std::cin), end;
size_t count = std::distance(begin, end);
std::cout << count << std::endl;
return 0;
}
Demo : http://www.ideone.com/NH52y
Similarly, you can do many cool things if you start using iterators from <iterator> header and generic functions from <algorithm> header.
For example, let's say we want to count the number of lines in the input stream. So what change would we make to the above program to get the job done? The way we changed std::string to char when we wanted to count characters immediately suggests that we now need to change it to line so that we can iterate over lines (instead of chars).
As no line class exists in the Standard library, we have to define one ourselves, but the interesting thing is that we can keep it empty as shown below, with full working code:
#include <iostream>
#include <algorithm>
#include <iterator>
#include <string>
struct line {}; //Interesting part!
std::istream& operator >>(std::istream & in, line &)
{
std::string s;
return std::getline(in, s);
}
int main() {
std::istream_iterator<line> begin(std::cin), end;
size_t count = std::distance(begin, end);
std::cout << count << std::endl;
return 0;
}
Yes, along with line, you have to define operator>> for line as well. It is used by the std::istream_iterator<line> class.
Demo : http://www.ideone.com/iKPA6

Equivalent of a python generator in C++ for buffered reads

Guido van Rossum demonstrates the simplicity of Python in this article and makes use of this function for buffered reads of a file of unknown length:
def intsfromfile(f):
    while True:
        a = array.array('i')
        a.fromstring(f.read(4000))
        if not a:
            break
        for x in a:
            yield x
I need to do the same thing in C++ for speed reasons! I have many files containing sorted lists of unsigned 64 bit integers that I need to merge. I have found this nice piece of code for merging vectors.
I am stuck on how to make an ifstream for a file of unknown length present itself as a vector which can be happily iterated over until the end of the file is reached. Any suggestions? Am I barking up the correct tree with an istreambuf_iterator?
In order to disguise an ifstream (or really, any input stream) in a form that acts like an iterator, you want to use the istream_iterator or the istreambuf_iterator template class. The former is useful for files where the formatting is of concern. For example, a file full of whitespace-delimited integers can be read into the vector's iterator range constructor as follows:
#include <fstream>
#include <vector>
#include <iterator> // needed for istream_iterator
using namespace std;
int main(int argc, char** argv)
{
ifstream infile("my-file.txt");
// It isn't customary to declare these as standalone variables,
// but see below for why it's necessary when working with
// initializing containers.
istream_iterator<int> infile_begin(infile);
istream_iterator<int> infile_end;
vector<int> my_ints(infile_begin, infile_end);
// You can also do stuff with the istream_iterator objects directly:
// Careful! If you run this program as is, this won't work because we
// used up the input stream already with the vector.
int total = 0;
while (infile_begin != infile_end) {
total += *infile_begin;
++infile_begin;
}
return 0;
}
istreambuf_iterator is used to read through files a single character at a time, disregarding the formatting of the input. That is, it will return you all characters, including spaces, newline characters, and so on. Depending on your application, that may be more appropriate.
Note: Scott Meyers explains in Effective STL why the separate variable declarations for istream_iterator are needed above. Normally, you would do something like this:
ifstream infile("my-file.txt");
vector<int> my_ints(istream_iterator<int>(infile), istream_iterator<int>());
However, C++ actually parses the second line in an incredibly bizarre way. It sees it as the declaration of a function named my_ints that takes in two parameters and returns a vector<int>. The first parameter is of type istream_iterator<int> and is named infile (the parentheses are ignored). The second parameter is a function pointer with no name that takes zero arguments (because of the parentheses) and returns an object of type istream_iterator<int>.
Pretty cool, but also pretty aggravating if you're not watching out for it.
EDIT
Here's an example using the istreambuf_iterator to read in a file of 64-bit numbers laid out end-to-end:
#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>
using namespace std;
int main(int argc, char** argv)
{
ifstream input("my-file.txt", ios::binary); // binary mode: the file holds raw bytes, not text
istreambuf_iterator<char> input_begin(input);
istreambuf_iterator<char> input_end;
// Fill a char vector with input file's contents:
vector<char> char_input(input_begin, input_end);
input.close();
// Convert it to an array of unsigned long with a cast
// (assumes unsigned long is 64 bits wide, which is not true on every platform):
unsigned long* converted = reinterpret_cast<unsigned long*>(&char_input[0]);
size_t num_long_elements = char_input.size() * sizeof(char) / sizeof(unsigned long);
// Put that information into a vector:
vector<unsigned long> long_input(converted, converted + num_long_elements);
return 0;
}
Now, I personally rather dislike this solution (using reinterpret_cast, exposing char_input's array), but I'm not familiar enough with istreambuf_iterator to comfortably use one templatized over 64-bit characters, which would make this much easier.