How to number input in ascending order? - c++

I have input coming in from a file input.txt as two columns of strings such as:
string1 string2
string3 string4
etc.
I am trying to number the strings in ascending order starting from 0, but in such a way that repeated strings don't get assigned new values but keep the ones already assigned to them.
I decided to use a set::find operation to do this, but I am having a hard time making it work. Here's what I have so far:
int main(int argc, char* argv[]) {
std::ifstream myfile ("input.txt");
std::string line;
int num = 0; // num is the total number of input strings
if (myfile.is_open()) {
while(std::getline(myfile, line)) {
++num;
}
}
std::string str1, str2; // strings from input
int str1Num, str2Num; // numbers assigned to strings
int i = 0; // used to assign values to strings
StringInt si;
std::vector<StringInt> saveStringInts(num);
std::set<std::string> alreadyCounted(num, 0);
std::set<std::string>::iterator sit;
std::ifstream myfile2 ("input.txt");
if (myfile2.is_open()) {
while(myfile2.good()) {
// read in input, put it in vars below
myfile2 >> str1 >> str2;
// if strings are not already assigned numbers, assign them
if ((*(sit = alreadyCounted.find(str1)).compare(str1) != 0) { // doesn't work
str1Num = i++;
alreadyCounted.insert(str1);
saveStringInts.push_back(StringInt(str1Num));
}
else {
str1Num = si->getNum(str1);
}
if ((*(sit = alreadyCounted.find(str2)).compare(str2) != 0) {
str2Num = i++;
alreadyCounted.insert(str2);
saveStringInts.push_back(StringInt(str2Num));
}
else {
str2Num = si->getNum(str2);
}
// use str1 and str2 in the functions below before the next iteration
}
}
Unfortunately, I have tried other approaches too and am now completely stuck. If you know how to fix my code or can suggest a better way to accomplish my task, I would greatly appreciate your help.

You need to compare the std::set<std::string>::iterator returned by find() against the end() iterator of your set, rather than dereferencing the iterator and comparing its value against something! In fact, dereferencing the end() iterator is undefined behavior:
if ((*(sit = alreadyCounted.find(str1)).compare(str1) != 0) // WRONG: don't do that!
should really be
if (alreadyCounted.find(str1) == alreadyCounted.end()) // not found: str1 has no number yet
... and likewise for the other string. Personally, I would use a different technique, though: when insert()ing into a std::set<T>, you get back a pair of an iterator and an indicator whether the object was inserted. The latter together with the current set's size give the next value, e.g.:
bool result = alreadyCounted.insert(str1).second;
str1Num = result ? alreadyCounted.size() - 1 : si->getNum(str1);
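The same idea can be sketched with a std::map, which stores the assigned number next to each string so that no separate lookup structure is needed. This is only a minimal sketch: the helper name idFor and the choice of std::map<std::string, int> are assumptions for illustration, not part of the original code.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: returns the number already assigned to `name`,
// or assigns the next unused number (0, 1, 2, ...) if `name` is new.
int idFor(std::map<std::string, int>& ids, const std::string& name) {
    // The map's current size is the next free number.
    const int next = static_cast<int>(ids.size());
    // insert() leaves the map unchanged if the key already exists;
    // in both cases `.first` points at the entry for `name`.
    return ids.insert(std::make_pair(name, next)).first->second;
}
```

Calling idFor(ids, str1) and idFor(ids, str2) inside the read loop would then yield str1Num and str2Num directly, with repeated strings keeping their first number.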

Related

std::string returning inappropriate value

I wrote a program which performs string compression using counts of repeated characters. The program in C++ is:
#include<iostream>
#include<cstring>
std::string compressBad(std::string str)
{
std::string mystr = "";
int count = 1;
char last = str[0];
for (int i = 0; i < str.length();++i)
{
if(str[i] == last)
count++;
else
{
std::string lastS = last+"";
std::string countS = std::to_string(count);
mystr.append(lastS);
mystr.append(countS);
//mystr = mystr + last + count;
count = 1;
last = str[i];
}
}
std::string lastS = last+"";
std::string countS = std::to_string(count);
mystr.append(lastS);
mystr.append(countS);
return mystr;
//return mystr+last+count;
}
int main()
{
std::string str;
std::getline(std::cin, str);
std::string str2 = compressBad(str);
std::cout<<str2;
/*if (str.length() < str2.length())
std::cout<<str;
else
std::cout<<str2;*/
std::cout<<std::endl;
return 0;
}
A few examples of running this are:
Input : sssaaddddd
Output : ùÿÿ*425
Output it should print : s3a2d5
Second example:
Input : sssaaddd
Output: ùÿÿ*423
Output it should print : s3a2d3
I also implemented the same concept in Java and there it is working fine. The Java implementation is here.
Why does this problem happen with the above code?
There may be other issues in your code, but I think that this line might be to blame:
std::string lastS = last+"";
Here, you're trying to convert the character last to a string by concatenating the empty string to the end. Unfortunately, in C++ this is interpreted to mean "take the numeric value of the character last, then add that to a pointer that points to the empty string, producing a new pointer to a character." This pointer points into random memory, hence the garbage you're seeing. (Notice that this is quite different from how Java works!)
Try changing this line to read
std::string lastS(1, last);
This will initialize lastS to be a string consisting of just the character stored in last.
Another option would be to use an ostringstream:
std::ostringstream myStr;
myStr << last << count;
// ...
return myStr.str();
This eliminates all the calls to .append() and std::to_string and is probably a lot easier to read.
last + "" doesn't do what you think: it performs pointer arithmetic on the string literal instead of building a string.
Just do
mystr.append(1, last);
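Putting these suggestions together, a corrected version might look like the sketch below. The function name compress and the empty-input check are assumptions added for illustration. The loop also starts at index 1 rather than 0, because the original appears to count the first character twice (the sample output shows 4 where 3 was expected).

```cpp
#include <cstddef>
#include <string>

// Sketch of a run-length compressor: "sssaaddd" becomes "s3a2d3".
std::string compress(const std::string& str) {
    if (str.empty()) return "";        // str[0] below needs at least one char
    std::string out;
    char last = str[0];
    int count = 1;                     // str[0] is already counted here
    for (std::size_t i = 1; i < str.size(); ++i) {
        if (str[i] == last) {
            ++count;
        } else {
            out.append(1, last);       // append the character itself
            out += std::to_string(count);
            last = str[i];
            count = 1;
        }
    }
    out.append(1, last);               // flush the final run
    out += std::to_string(count);
    return out;
}
```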

How to initialize std::set<std::string> correctly?

Please help me, I have been trying to do this for the past two or three hours, all with no luck. I have a number of strings coming in from input.txt in the format
string1 string2
string3 string4
etc.
that I want to put into a std::set which is initially empty. I want to number the strings as they come in and put them into the set to keep track of the duplicates so I don't number them again. I am trying to initialize std::set<std::string> inGraph but can't make it work. I tried std::set<std::string> inGraph(0, tot_lines); where 0 to tot_lines is the range of the total number of strings I expect to get from the input. Then I tried to initialize all elements with an empty string, like std::set<std::string> inGraph(tot_lines, "");, and that failed too. Here's what I have now:
struct StringInt {
std::string name; // associate name and number for each input string
int number;
};
int main(int argc, char* argv[]) {
int tot_lines = 100;
int icv1, icv2;
std::string vert1, vert2;
std::set<std::string> inGraph(); // this is the set I want to initialize
std::set<std::string>::iterator sit;
std::vector<StringInt> stringInts(tot_lines*2);
StringInt* si;
std::ifstream myfile2 ("input.txt");
if (myfile2.is_open()) {
while(myfile2 >> vert1 >> vert2) {
// read in input, put it in vars below
myfile2 >> vert1 >> vert2;
if (inGraph.find(vert1) != inGraph.end()) {
icv1 = i++;
si->name = vert1;
si->number = icv1;
inGraph.insert(vert1);
stringInts.push_back(*si);
}
else {
icv1 = si->number;
}
if (inGraph.find(vert2) != inGraph.end()) {
icv2 = i++;
si->name = vert1;
si->number = icv2;
inGraph.insert(vert2);
stringInts.push_back(*si);
}
else {
icv2 = si->number;
}
}
The error I get is: left of '.find' must have class/struct/union. Can you please help me figure out how to initialize the std::set<std::string> inGraph so I can number the strings?
The error message appears because you are a victim of the Most Vexing Parse.
std::set<std::string> inGraph();
This is a declaration of a function named inGraph that takes no arguments and returns std::set<std::string>. Just remove the () after inGraph to make it an object declaration.
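As a small illustration of the difference, the sketch below declares the set without parentheses and uses it to track duplicates. The function countDistinct and its vector parameter are hypothetical, introduced only so the example stands alone.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: counts how many distinct strings `names` contains.
std::size_t countDistinct(const std::vector<std::string>& names) {
    std::set<std::string> inGraph;       // an object: default-constructed, empty
    // std::set<std::string> inGraph();  // would declare a function instead
    for (const std::string& name : names) {
        inGraph.insert(name);            // duplicates are silently ignored
    }
    return inGraph.size();
}
```

A default-constructed std::set is already empty, so no size or fill value needs to be passed to the constructor.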

Incrementing pointers for *char in a while loop

Here is what I have:
char* input = new char [input_max];
char* inputPtr = input;
I want to use the inputPtr to traverse the input array. However I am not sure what will correctly check whether or not I have reached the end of the string:
while (*inputPtr++)
{
// Some code
}
or
while (*inputPtr != '\0')
{
inputPtr++;
// Some code
}
or a more elegant option?
Assuming input string is null-terminated:
for(char *inputPtr = input; *inputPtr; ++inputPtr)
{
// some code
}
Keep in mind that the example you posted may not give the results you want. In your while loop condition, you're always performing a post-increment. When you're inside the loop, you've already passed the first character. Take this example:
#include <iostream>
using namespace std;
int main()
{
const char *str = "apple\0";
const char *it = str;
while(*it++)
{
cout << *it << '_';
}
}
This outputs:
p_p_l_e__
Notice the missing first character and the extra _ underscore at the end. Check out this related question if you're confused about pre-increment and post-increment operators.
I would do:
inputPtr = input; // init inputPtr always at the last moment.
while (*inputPtr != '\0') { // Assumes the string ends with \0
// some code
inputPtr++; // After "some code" (instead of what you wrote).
}
Which is equivalent to the for-loop suggested by greatwolf. It's a personal choice.
Be careful, with both of your examples, you are testing the current position and then you increment. Therefore, you are using the next character!
Assuming input isn't null terminated:
char* input = new char [input_max];
for (char* inputPtr = input; inputPtr < input + input_max;
inputPtr++) {
inputPtr[0]++;
}
for the null terminated case:
for (char* inputPtr = input; inputPtr[0]; inputPtr++) {
inputPtr[0]++;
}
but generally this is as good as you can get. Using std::vector, or std::string may enable cleaner and more elegant options though.
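To make the comparison concrete, here is a minimal sketch of the same traversal written once with a raw pointer and once with std::string. Both function names are hypothetical, and the pointer version assumes the input is null-terminated.

```cpp
#include <cstddef>
#include <string>

// Pointer version: walks a null-terminated string and counts `target`.
std::size_t countCharPtr(const char* input, char target) {
    std::size_t n = 0;
    // Test the current character first, use it, and only then advance.
    for (const char* p = input; *p != '\0'; ++p) {
        if (*p == target) ++n;
    }
    return n;
}

// std::string version: the range-based for loop handles the bounds.
std::size_t countCharStr(const std::string& input, char target) {
    std::size_t n = 0;
    for (char c : input) {
        if (c == target) ++n;
    }
    return n;
}
```

Because the increment happens in the for header, the loop body always sees the character that was just tested, which avoids the skipped first character shown earlier.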

C++ Counting words in a file between two words

I am currently trying to count the number of words in a file. After this, I plan to make it count the words between two words in the file. For example, my file may contain "Hello my name is James". I want to count the words, so 5. Then I would like to count the number of words between "Hello" and "James", so the answer would be 3. I am having trouble accomplishing both tasks, mainly because I am not sure how to structure my code.
Any help on here would be greatly appreciated. The code I am currently using is using spaces to count the words.
Here is my code:
readwords.cpp
string ReadWords::getNextWord()
{
bool pWord = false;
char c;
while((c = wordfile.get()) !=EOF)
{
if (!(isspace(c)))
{
nextword.append(1, c);
}
return nextword;
}
}
bool ReadWords::isNextWord()
{
if(!wordfile.eof())
{
return true;
}
else
{
return false;
}
}
main.cpp
main()
{
int count = 0;
ReadWords rw("hamlet.txt");
while(rw.isNextWord()){
rw.getNextWord();
count++;
}
cout << count;
rw.close();
}
What it does at the moment is count the number of characters. I'm sure it's just a simple fix and something silly that I'm missing, but I've been trying for long enough that I'm now looking for some help.
Any help is greatly appreciated. :)
Rather than parse the file character-by-character, you can simply use istream::operator>>() to read whitespace-separated words. >> returns the stream, which evaluates to true as a bool when the read succeeded.
vector<string> words;
string word;
while (wordfile >> word)
words.push_back(word);
There is a common formulation of this using the <iterator> and <algorithm> utilities, which is more verbose, but can be composed with other iterator algorithms:
istream_iterator<string> input(wordfile), end;
copy(input, end, back_inserter(words));
Then you have the number of words and can do with them whatever you like:
words.size()
If you want to find "Hello" and "James", use find() from the <algorithm> header to get iterators to their positions:
// Find "Hello" anywhere in 'words'.
const auto hello = find(words.begin(), words.end(), "Hello");
// Find "James" anywhere after 'hello' in 'words'.
const auto james = find(hello, words.end(), "James");
If they’re not in the vector, find() will return words.end(); ignoring error checking for the purpose of illustration, you can count the number of words between them by taking their difference, adjusting for the inclusion of "Hello" in the range:
const auto count = james - (hello + 1);
You can use operator-() here because std::vector::iterator is a “random-access iterator”. More generally, you could use std::distance() from <iterator>:
const auto count = distance(hello, james) - 1;
Which has the advantage of being more descriptive of what you’re actually doing. Also, for future reference, this kind of code:
bool f() {
if (x) {
return true;
} else {
return false;
}
}
Can be simplified to just:
bool f() {
return x;
}
Since x is already being converted to bool for the if.
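For completeness, the find()/distance() approach can be sketched with the error checking that was skipped above. The names readWords and wordsBetween are hypothetical, and returning -1 for a missing word is just one possible convention. Unlike the snippet above, the second search here starts one past the first match, so that passing the same word twice does not match the same position.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Reads all whitespace-separated words from a stream.
std::vector<std::string> readWords(std::istream& in) {
    std::vector<std::string> words;
    for (std::string w; in >> w; ) words.push_back(w);
    return words;
}

// Number of words strictly between `first` and the next `last`,
// or -1 if either word is missing.
long wordsBetween(const std::vector<std::string>& words,
                  const std::string& first, const std::string& last) {
    const auto begin = std::find(words.begin(), words.end(), first);
    if (begin == words.end()) return -1;
    const auto end = std::find(begin + 1, words.end(), last);
    if (end == words.end()) return -1;
    return static_cast<long>(std::distance(begin, end)) - 1;
}
```

readWords accepts any std::istream, so it works with the std::ifstream from the question as well as with a std::istringstream for quick experiments.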
To count:
std::ifstream infile("hamlet.txt");
std::size_t count = 0;
for (std::string word; infile >> word; ++count) { }
To count only between start and stop:
std::ifstream infile("hamlet.txt");
std::size_t count = 0;
bool active = false;
for (std::string word; infile >> word; )
{
if (!active && word == "Hello") { active = true; }
if (!active) continue;
if (word == "James") break;
++count;
}
I think "return nextword;" should instead be "else return nextword;" or else you are returning from the function getNextWord every time, no matter what the char is.
string ReadWords::getNextWord()
{
bool pWord = false;
char c;
while((c = wordfile.get()) !=EOF)
{
if (!(isspace(c)))
{
nextword.append(1, c);
}
else return nextword;//only returns on a space
}
}
To count all words:
std::ifstream f("hamlet.txt");
std::cout << std::distance (std::istream_iterator<std::string>(f),
std::istream_iterator<std::string>()) << '\n';
To count between two words:
std::ifstream f("hamlet.txt");
std::istream_iterator<std::string> it(f), end;
int count = 0;
while (std::find(it, end, "Hello") != end)
while (++it != end && *it != "James")
++count;
std::cout << count;
Try this:
below the line
nextword.append(1, c);
add
continue;

creating a string split function in C++ [duplicate]

Possible Duplicate:
Splitting a string in C++
I'm trying to create a function that mimics the behavior of the getline() function, with the option to use a delimiter to split the string into tokens.
The function accepts 2 strings (the second is passed by reference) and a char for the delimiter. It loops through each character of the first string, copying it to the second string, and stops looping when it reaches the delimiter. It returns true if the first string has more characters after the delimiter and false otherwise. The position of the last character is saved in a static variable.
For some reason the program is going into an infinite loop and is not executing anything:
const int LINE_SIZE = 160;
bool strSplit(string sFirst, string & sLast, char cDelim) {
static int iCount = 0;
for(int i = iCount; i < LINE_SIZE; i++) {
if(sFirst[i] != cDelim)
sLast[i-iCount] = sFirst[i];
else {
iCount = i+1;
return true;
}
}
return false;
}
The function is used in the following way:
while(strSplit(sLine, sToken, '|')) {
cout << sToken << endl;
}
Why is it going into an infinite loop, and why is it not working?
I should add that I'm interested in a solution without using istringstream, if that's possible.
It is not exactly what you asked for, but have you considered std::istringstream and std::getline?
// UNTESTED
std::istringstream iss(sLine);
while(std::getline(iss, sToken, '|')) {
std::cout << sToken << "\n";
}
EDIT:
Why is it going into an infinite loop, and why is it not working?
We can't know, you didn't provide enough information. Try to create an SSCCE and post that.
I can tell you that the following line is very suspicious:
sLast[i-iCount] = sFirst[i];
This line will result in undefined behavior (including, perhaps, what you have seen) in any of the following conditions:
i >= sFirst.size()
i-iCount >= sLast.size()
i-iCount < 0
It appears to me likely that all of those conditions are true. If the passed-in string is, for example, shorter than 160 characters, or if iCount ever grows to be bigger than the offset of the first delimiter, then you'll get undefined behavior.
LINE_SIZE is probably larger than the number of characters in the string object, so the code runs off the end of the string's storage, and pretty much anything can happen.
Instead of rolling your own, string::find does what you need.
std::string::size_type pos = 0;
std::string::size_type new_pos = sFirst.find('|', pos);
The call to find finds the first occurrence of '|' that's at or after the position 'pos'. If it succeeds, it returns the index of the '|' that it found. If it fails, it returns std::string::npos. Use it in a loop, and after each match, copy the text from [pos, new_pos) into the target string, and update pos to new_pos + 1.
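The loop described above might be written as in the following sketch. The function name split and the decision to return a std::vector of tokens (rather than filling one token per call, as the original strSplit does) are assumptions made to keep the example free of static state.

```cpp
#include <string>
#include <vector>

// Splits `line` at every occurrence of `delim` using std::string::find.
std::vector<std::string> split(const std::string& line, char delim) {
    std::vector<std::string> tokens;
    std::string::size_type pos = 0;
    for (;;) {
        const std::string::size_type new_pos = line.find(delim, pos);
        if (new_pos == std::string::npos) {
            tokens.push_back(line.substr(pos));   // last token: rest of the line
            break;
        }
        tokens.push_back(line.substr(pos, new_pos - pos));  // text in [pos, new_pos)
        pos = new_pos + 1;                        // continue after the delimiter
    }
    return tokens;
}
```

All indexing goes through find() and substr(), so the code never reads past the end of the string, whatever its length.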
Are you sure it's the strSplit() function that doesn't return, or is it the calling while loop that's infinite?
Shouldn't your caller loop be something like:
while(strSplit(sLine, sToken, '|')) {
cout << sToken << endl;
cin >> sLine;
}
-- edit --
If the value of sLine is such that strSplit() returns true, the while loop becomes infinite, so do something to change the value of sLine on each iteration of the loop, e.g. read a new one from cin.
Check this out
std::vector<std::string> spliString(const std::string &str,
const std::string &separator)
{
std::vector<std::string> ret;
std::string::size_type strLen = str.length();
char *buff;
char *pch;
buff = new char[strLen + 1];
buff[strLen] = '\0';
std::copy(str.begin(), str.end(), buff);
pch = strtok(buff, separator.c_str());
while(pch != NULL)
{
ret.push_back(string(pch));
pch = strtok(NULL, separator.c_str());
}
delete[] buff;
return ret;
}