Counting the number of words in a file - c++

#include <iostream>
#include <string>
#include <fstream>
#include <cstring>
using namespace std;
int hmlines(ifstream &a){
int i=0;
string line;
while (getline(a,line)){
cout << line << endl;
i++;
}
return i;
}
int hmwords(ifstream &a){
int i=0;
char c;
while ((c=a.get()) && (c!=EOF)){
if(c==' '){
i++;
}
}
return i;
}
int main()
{
int l=0;
int w=0;
string filename;
ifstream matos;
start:
cout << "give me the name of the file i wish to count lines, words and chars: ";
cin >> filename;
matos.open(filename.c_str());
if (matos.fail()){
goto start;
}
l = hmlines(matos);
matos.seekg(0, ios::beg);
w = hmwords(matos);
/*c = hmchars(matos);*/
cout << "The # of lines are :" << l << ". The # of words are : " << w ;
matos.close();
}
The file that i am trying to open has the following contents.
Twinkle, twinkle, little bat!
How I wonder what you're at!
Up above the world you fly,
Like a teatray in the sky.
The output i get is:
give me the name of the file i wish to count lines, words and chars: ert.txt
Twinkle, twinkle, little bat!
How I wonder what you're at!
Up above the world you fly,
Like a teatray in the sky.
The # of lines are :4. The # of words are : 0

int hmwords(ifstream &a){
int i;
You've forgotten to initialize i. It can contain absolutely anything at that point.
Also note that operator>> on streams skips whitespace by default. Your word counting loop needs the noskipws modifier.
a >> noskipws >> c;
Another problem is that after you call hmlines, matos is at end of stream. You need to reset it if you want to read the file again. Try something like:
l = hmlines(matos);
matos.clear();
matos.seekg(0, ios::beg);
w = hmwords(matos);
(The clear() is necessary, otherwise seekg has no effect.)
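A minimal sketch of the rewind idea, assuming the helpers take a generic std::istream so they work on an ifstream or a string stream alike. The names count_lines, count_words and count_lines_and_words are illustrative and do not come from the question:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: counts lines by reading until getline fails.
int count_lines(std::istream& in) {
    int n = 0;
    std::string line;
    while (std::getline(in, line)) ++n;
    return n;
}

// Hypothetical helper: counts whitespace-separated tokens.
int count_words(std::istream& in) {
    int n = 0;
    std::string word;
    while (in >> word) ++n;
    return n;
}

// Reads the stream twice. clear() resets the eofbit/failbit left behind by
// the first pass; seekg() then moves the read position back to the start.
std::pair<int, int> count_lines_and_words(std::istream& in) {
    int lines = count_lines(in);
    in.clear();
    in.seekg(0, std::ios::beg);
    int words = count_words(in);
    return {lines, words};
}
```

With the file from the question opened as an ifstream, count_lines_and_words(matos) would return the line count in .first and the word count in .second.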

Formatted input eats whitespaces. You can just count tokens directly:
int i = 0;
std::string dummy;
// Count words from the standard input, aka "cat myfile | ./myprog"
while (cin >> dummy) ++i;
// Count words from an input stream "a", aka "./myprog myfile"
while (a >> dummy) ++i;

Related

Reading coordinate file with ifstream while ignoring headers and writing to array

There's a series of coordinates I'm trying to write to an array so I can perform calculations on, but I haven't been able to read the file correctly since I can't ignore the headers, and when I do remove the headers it also doesn't seem to correctly write the values to the array.
The coordinate file is a txt as below.
Coordinates of 4 points
x y z
-0.06325 0.0359793 0.0420873
-0.06275 0.0360343 0.0425949
-0.0645 0.0365101 0.0404362
-0.064 0.0366195 0.0414512
Any help with the code is much appreciated. I've tried using .ignore to skip the two header lines but they don't seem to work as expected.
#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
using namespace std;
int main() {
int i = 1;
int count = 1;
char separator;
const int MAX = 10000;
int x[MAX];
int y[MAX];
int z[MAX];
int dist[MAX];
char in_file[16]; // File name string of 16 characters long
char out_file[16];
ifstream in_stream;
ofstream out_stream;
out_stream << setiosflags(ios::left); // Use IO Manipulators to set output to align left
cout << "This program reads a series of values from a given file, saves them into an array and performs calculations." << endl << endl;
// User data input
cout << "Enter the input in_file name: \n";
cin >> in_file;
cout << endl;
in_stream.open(in_file, ios::_Nocreate);
cout << "Enter the output file name: \n";
cin >> out_file;
cout << endl;
out_stream.open(out_file);
// While loop in case in_file does not exist / cannot be opened
while (in_stream.fail()) {
cout << "Error opening '" << in_file << "'\n";
cout << "Enter the input in_file name: ";
cin >> in_file;
in_stream.clear();
in_stream.open(in_file, ios::_Nocreate);
}
while (in_stream.good) {
in_stream.ignore(256, '\n');
in_stream.ignore(256, '\n');
in_stream >> x[i] >> separator >>y[i] >> separator >> z[i];
i++;
count = count + 1;
}
cout << x[1] << y[1] << z[1];
in_stream.close();
out_stream.close();
return 0;
}
Within your reading of the file, you are using in_stream.ignore(256, '\n'); correctly, but you want to use it outside the while loop. When you have it inside the while loop, every time it runs, you will ignore the first two lines, then read the third. Your output would actually read in only a third of what you expect. To fix this, just move those 2 lines outside the while loop.
in_stream.ignore(256, '\n');
in_stream.ignore(256, '\n');
while (in_stream.good())
{
in_stream >> x[i] >> separator >>y[i] >> separator >> z[i];
i++;
count = count + 1;
}
This should fix your problem, but you should generally use a vector instead of an array. A vector manages its own memory and grows as needed, and its at() member checks bounds, so you don't have to do that by hand.
Also, good practice is to read values out of the stream as the while condition instead of in_stream.good:
while(stream >> var)
{
//Your code here
}
Here is a good resource on why that is.
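Putting these suggestions together, here is a rough sketch rather than a drop-in replacement. It assumes the values are separated only by whitespace, as in the sample file, so no separator character is read, and it stores them as double because the sample values are not integers. The Point type and the read_points name are made up for illustration:

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

struct Point { double x, y, z; };  // illustrative type, not from the question

// Skips `header_lines` lines, then reads whitespace-separated x y z triples.
std::vector<Point> read_points(std::istream& in, int header_lines = 2) {
    for (int i = 0; i < header_lines; ++i)
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

    std::vector<Point> pts;
    Point p;
    while (in >> p.x >> p.y >> p.z)  // loop ends on EOF or malformed input
        pts.push_back(p);
    return pts;
}
```

Passing the opened in_stream to read_points would give a vector whose size() replaces the manual count variable.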

What is a good way to read in and separate information from this text file?

Let's say I have a text file:
83 71 69 97Joines, William B.
100 85 88 85Henry, Jackson Q.
And I want to store each number in an array of ints, and each full-name into an array of strings (a full name would be Joines, William B for example).
What would be the best way, because I debated whether using while (inputFile >> line) or while (getline(inputFile, line)) would be better. I don't know if it would be easier to read them one word at a time or read them one line at a time. My main problem will be splitting the 97Joines, William B. to 97 and Joines, William B. which I don't understand how to do in C++.
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
int main() {
int counter = 0;
int scores[40];
string names[10];
string filename, line;
ifstream inputFile;
cout << "Please enter the location of the file:\n";
cin >> filename;
inputFile.open(filename);
while (inputFile >> line) {
// if line is numeric values only, do scores[counter] = line;
// if it is alphabet characters only, do names[counter] = line;
//if it is both, find a way to split it // <----- need help figuring out how to do this!
}
inputFile.close();
}
You need to #include <cstdlib> for strtol (and <cctype> for isdigit). I am sure there are better ways to do this, but this is the only way I know, and it only handles tokens like 97Joines, and 85Henry,
string word; // to get joines,
string str; // for 97
string numword;
inputFile >> numword;
for(int k = 0; k < numword.length(); k++)
{
if(isdigit(numword[k]))
{
str = str + numword[k];
}
else
{
word = word + numword[k];
}
}
int num = strtol(str.c_str(), NULL, 10); // base 10, so a leading 0 is not read as octal
You can, given the file structure you have shown, read it like this:
int a, b, c, d;
std::string name;
for (int i = 0; i < 2; ++i)
{
// read the numbers
inputFile >> a >> b >> c >> d;
// read the name
std::getline(inputFile, name);
// do stuff with the data... we just print it now
std::cout << a << " " << b << " " << c << " " << d << " " << name << std::endl;
}
Since the numbers are space separated it is easy to just use the stream operator. Furthermore, since the name is the last part we can just use std::getline which will read the rest of the line and store it in the variable name.
You can try it here, using std::cin.
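The same idea can be combined with line-by-line reading, so that one malformed line does not throw off the rest of the file. This is only a sketch, assuming every line starts with a fixed number of integers followed by the name. Record and parse_record are hypothetical names:

```cpp
#include <sstream>
#include <string>
#include <vector>

struct Record {                // illustrative type
    std::vector<int> scores;
    std::string name;
};

// Parses one line such as "83 71 69 97Joines, William B.".
// Returns false if the line does not start with `n_scores` integers.
bool parse_record(const std::string& line, Record& out, int n_scores = 4) {
    std::istringstream in(line);
    out.scores.clear();
    for (int i = 0; i < n_scores; ++i) {
        int s;
        if (!(in >> s)) return false;  // operator>> stops at the first non-digit
        out.scores.push_back(s);
    }
    std::getline(in, out.name);        // the rest of the line is the name
    return true;
}
```

Because operator>> for int stops at the first character that cannot be part of a number, "97Joines," splits into 97 and the remaining text without any manual character scanning. The caller would read each line with std::getline(inputFile, line) and pass it to parse_record.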

Counting word occurrence in textfile

Here's my code, which I based on http://www.thecrazyprogrammer.com/2015/02/c-program-count-occurrence-word-text-file.html. (I'm new to C++.)
#include <iostream>
#include <fstream>
#include<cstring>
using namespace std;
int main()
{
// std::cout << "Hello World!" << std::endl;
// return 0;
ifstream fin("my_data.txt"); //opening text file
int count=0;
char ch[20],c[20];
cout<<"Enter a word to count:";
gets(c);
while(fin)
{
fin>>ch;
if(strcmp(ch,c)==0)
count++;
}
cout<<"Occurrence="<<count<<"\n";
fin.close(); //closing file
return 0;
}
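my_data.txt contains the word "world" only 3 times, but when I run the program it reports a different count (the original post showed the output in a screenshot titled "Error in Pattern Counting", followed by a screenshot of the text file's contents).
What could be going wrong?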
A solution using std::string
int count = 0;
std::string word_to_find, word_inside_file;
std::ifstream fin("my_data.txt");
std::cout << "Enter a word to count:";
std::cin >> word_to_find;
while (fin >> word_inside_file) {
if (word_to_find == word_inside_file )
count++;
}
std::cout << "Occurrence=" << count << "\n";
If you want to find all occurrences inside other strings as well, as mentioned in the comments, you can do something like this:
...
while (fin >> word_inside_file) {
count += findAllOccurrences(word_to_find, word_inside_file);
}
...
Inside findAllOccurrences(std::string, std::string) you will implement a "find all string occurrences inside another string" algorithm.
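One possible implementation of that helper, as a sketch only. It is built on std::string::find and counts non-overlapping matches, which is an assumption, since the answer does not say how overlaps should be treated. It takes const references rather than the by-value parameters named above:

```cpp
#include <cstddef>
#include <string>

// Counts non-overlapping occurrences of `needle` inside `haystack`.
int findAllOccurrences(const std::string& needle, const std::string& haystack) {
    if (needle.empty()) return 0;  // an empty needle would never advance
    int count = 0;
    std::size_t pos = 0;
    while ((pos = haystack.find(needle, pos)) != std::string::npos) {
        ++count;
        pos += needle.size();      // use ++pos instead to count overlapping matches
    }
    return count;
}
```

For example, searching for "world" in the token "worldworld!" counts two matches, whereas the plain == comparison would count none.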
If you are new to C++, you shouldn't use gets at all. It cannot limit how much it reads, so it is a classic source of buffer overflow vulnerabilities (read up on the term), and it was removed from the language in C++14. Use std::cin with a std::string instead.

How to read in user entered comma separated integers?

I'm writing a program that prompts the user for:
Size of array
Values to be put into the array
First part is fine, I create a dynamically allocated array (required) and make it the size the user wants.
I'm stuck on the next part. The user is expected to enter in a series of ints separated by commas such as: 1,2,3,4,5
How do I take in those ints and put them into my dynamically allocated array? I read that by default cin takes in integers separated by whitespace, can I change this to commas?
Please explain in the simplest manner possible, I am a beginner to programming (sorry!)
EDIT: TY so much for all the answers. Problem is we haven't covered vectors...is there a method only using the dynamically allocated array I have?
so far my function looks like this. I made a default array in main. I plan to pass it to this function, make the new array, fill it, and update the pointer to point to the new array.
int *fill (int *&array, int *limit) {
cout << "What is the desired array size?: ";
while ( !(cin >> *limit) || *limit < 0 ) {
cout << " Invalid entry. Please enter a positive integer: ";
cin.clear();
cin.ignore (1000, 10);
}
int *newarr;
newarr = new int[*limit]
//I'm stuck here
}
All of the existing answers are excellent, but all are specific to your particular task. So I wrote a general piece of code that allows input of comma-separated values in a standard way:
template<class T, char sep=','>
struct comma_sep { //type used for temporary input
T t; //where data is temporarily read to
operator const T&() const {return t;} //acts like an int in most cases
};
template<class T, char sep>
std::istream& operator>>(std::istream& in, comma_sep<T,sep>& t)
{
if (!(in >> t.t)) //if we failed to read the int
return in; //return failure state
if (in.peek()==sep) //if next character is a comma
in.ignore(); //extract it from the stream and we're done
else //if the next character is anything else
in.clear(); //clear the EOF state, read was successful
return in; //return
}
Sample usage http://coliru.stacked-crooked.com/a/a345232cd5381bd2:
typedef std::istream_iterator<comma_sep<int>> istrit; //iterators from the stream
std::vector<int> vec{istrit(in), istrit()}; //construct the vector from two iterators
Since you're a beginner, this code might be too much for you now, but I figured I'd post this for completeness.
A priori, you should want to check that the comma is there, and
declare an error if it's not. For this reason, I'd handle the
first number separately:
std::vector<int> dest;
int value;
std::cin >> value;
if ( std::cin ) {
dest.push_back( value );
char separator;
while ( std::cin >> separator >> value && separator == ',' ) {
dest.push_back( value );
}
}
if ( !std::cin.eof() ) {
std::cerr << "format error in input" << std::endl;
}
Note that you don't have to ask for the size first. The array
(std::vector) will automatically extend itself as much as
needed, provided the memory is available.
Finally: in a real life example, you'd probably want to read
line by line, in order to output a line number in case of
a format error, and to recover from such an error and continue.
This is a bit more complicated, especially if you want to be
able to accept the separator before or after the newline
character.
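A rough sketch of that line-by-line variant. It assumes each line is a self-contained comma-separated list, so it does not handle a separator that sits before or after the newline. The function names are illustrative:

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Parses one line of comma-separated ints; returns false on a format error.
bool parse_int_line(const std::string& line, std::vector<int>& dest) {
    std::istringstream in(line);
    int value;
    char sep;
    if (!(in >> value)) return false;  // first number handled separately
    dest.push_back(value);
    while (in >> sep) {
        if (sep != ',' || !(in >> value)) return false;
        dest.push_back(value);
    }
    return true;
}

// Reads every line, reports malformed ones by line number, and continues.
std::vector<int> read_all(std::istream& in, std::ostream& err) {
    std::vector<int> dest;
    std::string line;
    for (int line_no = 1; std::getline(in, line); ++line_no) {
        std::vector<int> row;          // keep a bad line's partial data out of dest
        if (parse_int_line(line, row))
            dest.insert(dest.end(), row.begin(), row.end());
        else
            err << "format error on line " << line_no << '\n';
    }
    return dest;
}
```

Calling read_all(std::cin, std::cerr) would collect the values from all well-formed lines and name each rejected line.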
You can use getline() method as below:
#include <iostream>
#include <vector>
#include <string>
#include <sstream>
int main()
{
std::string input_str;
std::vector<int> vect;
std::getline( std::cin, input_str );
std::stringstream ss(input_str);
int i;
while (ss >> i)
{
vect.push_back(i);
if (ss.peek() == ',')
ss.ignore();
}
}
The code is adapted from this answer.
Victor's answer works but does more than is necessary. You can just directly call ignore() on cin to skip the commas in the input stream.
What this code does is read in an integer for the size of the input array, reserve space in a vector of ints for that number of elements, then loop up to the number of elements specified alternately reading an integer from standard input and skipping separating commas (the call to cin.ignore()). Once it has read the requested number of elements, it prints them out and exits.
#include <iostream>
#include <iterator>
#include <limits>
#include <vector>
using namespace std;
int main() {
vector<int> vals;
size_t n = 0;
int i;
cin >> n;
vals.reserve(n);
for (size_t j = 0; j != n; ++j) { // loop over the requested count; capacity() may exceed it
cin >> i;
vals.push_back(i);
cin.ignore(numeric_limits<streamsize>::max(), ',');
}
copy(begin(vals), end(vals), ostream_iterator<int>(cout, ", "));
cout << endl;
}
#include <iostream>
using namespace std;
int main() {
int x,i=0;
char y; //to store commas
int arr[50];
while(i<50 && cin>>x){ //stop on EOF, bad input, or a full array
arr[i]=x;
i++;
cin>>y; //consume the comma, if there is one
}
for(int j=0;j<i;j++)
cout<<arr[j]; //array contains only the integer part
return 0;
}
The code can be simplified a bit with the std::stoi function introduced in C++11. It skips leading whitespace when converting and throws an exception only when a token starts with a non-numeric character. This code will thus accept the input
" 12de, 32, 34 45, 45 , 23xp,"
easily but reject
" de12, 32, 34 45, 45 , 23xp,"
One problem remains: in the first case it will display " 12, 32, 34, 45, 23, " at the end, having truncated "34 45" to 34. A special case could be added to treat this as an error or to ignore whitespace in the middle of a token.
wchar_t in;
std::wstring seq;
std::vector<int> input;
std::wcout << L"Enter values : ";
while (std::wcin >> std::noskipws >> in)
{
if (L'\n' == in || (L',' == in))
{
if (!seq.empty()){
try{
input.push_back(std::stoi(seq));
}catch (const std::exception& e){
std::wcout << L"Bad input" << std::endl;
}
seq.clear();
}
if (L'\n' == in) break;
else continue;
}
seq.push_back(in);
}
std::wcout << L"Values entered : ";
std::copy(begin(input), end(input), std::ostream_iterator<int, wchar_t>(std::wcout, L", "));
std::wcout << std::endl;
#include<bits/stdc++.h>
using namespace std;
int a[1000];
int main(){
string s;
cin>>s;
int i=0;
istringstream d(s);
string b;
while(getline(d,b,',')){
a[i]= stoi(b);
i++;
}
for(int j=0;j<i;j++){
cout<<a[j]<<" ";
}
}
This code works from C++11 onwards. It is simple and uses a string stream together with the getline and stoi functions.
You can use scanf instead of cin and put a comma after the format specifier:
#include<bits/stdc++.h>
using namespace std;
int main()
{
int a[10],sum=0;
cout<<"enter five numbers";
for(int i=0;i<5;i++){
scanf("%d,",&a[i]);
sum=sum+a[i];
}
cout<<sum;
}
First, take the input as a string. Then parse the string and store the values in a vector to get your integers.
vector<int> v;
string str;
cin >> str;
stringstream ss(str);
for(int i;ss>>i;){
v.push_back(i);
if(ss.peek() == ','){
ss.ignore();
}
}
for(auto &i:v){
cout << i << " ";
}
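The question's edit asks for a version that uses only a dynamically allocated array. The following is a sketch under that constraint, not a definitive solution: read_csv_ints is a made-up name, and the caller is assumed to have already read the size, as in the question's fill function:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>

// Reads up to `limit` comma-separated ints from `in` into a new[]-allocated
// array and stores the number actually read in `count`.
// The caller owns the result and must release it with delete[].
int* read_csv_ints(std::istream& in, std::size_t limit, std::size_t& count) {
    int* arr = new int[limit];
    count = 0;
    int value;
    while (count < limit && in >> value) {
        arr[count++] = value;
        if (in.peek() == ',') in.ignore();  // skip one separating comma
    }
    return arr;
}
```

Inside fill, this would be called as newarr = read_csv_ints(cin, *limit, n) with a std::size_t n declared beforehand. If the user enters fewer values than promised, n says how many slots are valid.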

Detect newline byte from filestream

I'm trying to collect information from a text file which contains names of organisations (without spaces) and floating-point numbers. I want to store this information in an array of structures.
The problem I'm having so far is collecting the information. Here is a sample of the textfile:
CBA 12.3 4.5 7.5 2.9 4.1
TLS 3.9 1 8.6 12.8 4.9
I can have up to 128 different numbers for each organisation, and up to 200 organisations in the textfile.
This is what my structure looks like so far:
struct callCentre
{
char name[256];
float data[20];
};
My main:
int main()
{
callCentre aCentre[10];
getdata(aCentre);
calcdata(aCentre);
printdata(aCentre);
return 0;
}
And the getdata function:
void getdata(callCentre aCentre[])
{
ifstream ins;
char dataset[20];
cout << "Enter the name of the data file: ";
cin >> dataset;
ins.open(dataset);
if(ins.good())
{
while(ins.good())
{
ins >> aCentre[c].name;
for(int i = 0; i < MAX; i++)
{
ins >> aCentre[c].data[i];
if(ins == '\n')
break;
}
c++;
}
}
else
{
cout << "Data files couldnt be found." << endl;
}
ins.close();
}
What I'm trying to achieve in my getdata function is this: store the organisation name first into the structure, then read each float into the data array until the program detects a newline byte. However, so far my check for the newline byte isn't working.
Assume that variables c and MAX are already defined.
How should I go about this properly?
The >> operator treats whitespace as a delimiter, and that includes newlines, so it just eats those and you never see them.
You need to read lines and then chop the lines up. The following bit of hackery illustrates the basic idea:
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
int main() {
string line;
while( getline( cin, line ) ) {
istringstream is( line );
string cs;
is >> cs;
double vals[10];
int i = 0;
while( i < 10 && is >> vals[i] ) { // do not write past the end of vals
i++;
}
cout << "CS: " << cs;
for ( int j = 0; j < i; j++ ) {
cout << " " << vals[j];
}
cout << endl;
}
}
char byte = ins.peek();
Or
if(ins.peek() == '\n') break;
(Edit): You'll also want to check for EOF after your peek(), because some files may not have a trailing newline.
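A sketch of how the peek() check could be wired into a loop. It assumes '\n' line endings, and it skips spaces and tabs by hand first, because after a formatted read the next character is usually the space that follows the number and not yet the newline. read_floats_to_eol is an illustrative name:

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Reads floats until the end of the current line and consumes the newline.
std::vector<float> read_floats_to_eol(std::istream& ins) {
    std::vector<float> values;
    for (;;) {
        // Skip spaces and tabs, but not the newline itself.
        while (ins.peek() == ' ' || ins.peek() == '\t') ins.ignore();
        int next = ins.peek();
        if (next == '\n') { ins.ignore(); break; }            // end of line
        if (next == std::istream::traits_type::eof()) break;  // no trailing newline
        float v;
        if (!(ins >> v)) break;                               // malformed token
        values.push_back(v);
    }
    return values;
}
```

In getdata, the organisation name would be read with ins >> aCentre[c].name first, and this function would then collect the numbers on the rest of that line.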
I'd like to point out that you might want to consider using a vector<callCentre> instead of a static array. If your input file length exceeds the capacity of the array, you'll walk all over the stack.
I would read the file, one line after another and parse each line individually for the values:
std::string line;
while (std::getline(ins, line)) {
std::istringstream sline(line);
sline >> aCentre[c].name;
int i = 0;
while (i < MAX && sline >> aCentre[c].data[i]) // MAX as assumed in the question
i++;
c++;
}
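If vectors are an option, the fixed-size arrays and their limits can go away entirely. This is a sketch only: CallCentre here is a hypothetical variant of the question's struct that uses std::string and std::vector<float>, and read_centres is a made-up name:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct CallCentre {            // illustrative variant of the question's struct
    std::string name;
    std::vector<float> data;
};

// Reads one organisation per line: a name followed by any number of floats.
std::vector<CallCentre> read_centres(std::istream& ins) {
    std::vector<CallCentre> centres;
    std::string line;
    while (std::getline(ins, line)) {
        std::istringstream sline(line);
        CallCentre c;
        if (!(sline >> c.name)) continue;  // skip blank lines
        float value;
        while (sline >> value) c.data.push_back(value);
        centres.push_back(c);
    }
    return centres;
}
```

The number of organisations is then centres.size(), and each entry's data.size() gives the number of values on its line, so neither count has to be tracked separately.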