C++: finding a string using part of the string

Let's say that we have:
string list[] = {"12.34.56.78", "55.34.5", "23.44.5"};
I want the user to enter part of a string, which is also a string.
For example, given the input 55, the program should loop through the list, find the whole string that contains it, and print "55.34.5".
What I was doing is this (str is the input string and list is the whole list of strings):
for (int i=0; i<n; i++){
for (int j=0; j<(list[i].length()); j++){
for (int k=0; k<(str.length()); k++){
if (list[i][j] == str[k])
cout<<list[i]<<endl;
else
break;
}
}
}
However, there is a problem with this, and it doesn't work properly.
Update:
I have updated my code to:
for (int i=0; i<n; i++)
if (strncmp(list[i].c_str(), str.c_str(), str.length()) == 0){
cout<<list[i]<<endl;
}
However, this doesn't output any of the strings.

For any function fanatics:
std::string findInList(const std::vector<std::string> &searchFrom, const std::string &lookFor) {
for (const std::string &s : searchFrom) {
if (s.find(lookFor) != std::string::npos)
return s;
}
return "";
}
I used a vector instead of an array because a vector knows its own size, so there is no extra work to pass the length around. If C++11 isn't being used, a normal for loop works perfectly fine.
This also assumes you want only the first match to be returned. A probably better option is to return a vector of strings: empty if nothing is found, which makes that case explicit, or holding every match otherwise. Instead of returning the found string, just add it to the vector and continue on, returning the vector when you're done.
If you want to model the standard algorithms, you can also have it take a beginning iterator and an ending iterator instead of the actual container. This will allow you to call it on any type of container, including arrays, with any range in that container to look through.
Taking both points into consideration, you can evolve it into this:
template <typename Iterator>
std::vector<std::string> findInList(Iterator start, const Iterator end, const std::string &lookFor) {
std::vector<std::string> ret;
for (; start != end; ++start)
if (start->find(lookFor) != std::string::npos)
ret.emplace_back(*start);
return ret;
}
Again, if not using C++11, emplace_back can be swapped out for push_back.
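As a usage sketch for the iterator version, the snippet below calls it on the plain array from the question through std::begin and std::end (both C++11). The helper name searchSampleList is hypothetical and only wraps the call for illustration.

```cpp
#include <iterator>
#include <string>
#include <vector>

// Iterator-based search, same shape as the version in this answer.
template <typename Iterator>
std::vector<std::string> findInList(Iterator start, const Iterator end, const std::string &lookFor) {
    std::vector<std::string> ret;
    for (; start != end; ++start)
        if (start->find(lookFor) != std::string::npos)
            ret.push_back(*start);
    return ret;
}

// Hypothetical helper: searches the sample array from the question.
std::vector<std::string> searchSampleList(const std::string &lookFor) {
    const std::string list[] = {"12.34.56.78", "55.34.5", "23.44.5"};
    return findInList(std::begin(list), std::end(list), lookFor);
}
```

Searching for "55" gives back only "55.34.5", while "34" gives back the first two entries, since both contain it.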

That just compares the first character in list[i] with the first char in your string. If the corresponding first chars match, it prints the entire ith string and then advances k, the offset into your str, without changing the offset into the string against which you're comparing. I think you can dispense with the inner two loops, and use a fixed length string comparison, i.e.,
for (int i=0; i < n; i++) {
if (strncmp(list[i].c_str(), str.c_str(), str.length()) == 0) {
// match
}
}
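One caveat worth spelling out: strncmp over str.length() characters only checks whether each entry starts with str, so it finds "55" in "55.34.5" but not "34". A minimal sketch contrasting the two checks (the names startsWith and containsText are made up for this illustration):

```cpp
#include <cstring>
#include <string>

// Prefix test: true only when text begins with part (what the strncmp call checks).
bool startsWith(const std::string &text, const std::string &part) {
    return std::strncmp(text.c_str(), part.c_str(), part.length()) == 0;
}

// Substring test: true when part occurs anywhere inside text.
bool containsText(const std::string &text, const std::string &part) {
    return text.find(part) != std::string::npos;
}
```

If the user may type a piece from the middle of an entry, the substring test is the one that matches the question.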

Here's an answer that combines both of the previous answers. It uses the find member function of the std::string class
for (int i=0; i < n; i++) {
if (list[i].find(str) != std::string::npos) {
std::cout << list[i] << std::endl;
}
}

Related

C++ string parser issues

Ok, so I'm working on a homework project in C++ and am running into an issue, and can't seem to find a way around it. The function is supposed to break an input string at user-defined delimiters and store the substrings in a vector to be accessed later. I think I got the basic parser figured out, but it doesn't want to split the last part of the input.
int main() {
string input = "comma-delim-delim&delim-delim";
vector<string> result;
vector<char> delims;
delims.push_back('-');
delims.push_back('&');
int begin = 0;
for (int i = begin; i < input.length(); i++ ){
for(int j = 0; j < delims.size(); j++){
if(input.at(i) == delims.at(j)){
//Compares chars in delim vector to current char in string, and
//creates a substring from the beginning to the current position
//minus 1, to account for the current char being a delimiter.
string subString = input.substr(begin, (i - begin));
result.push_back(subString);
begin = i + 1;
}
The above code works fine for splitting the input string up until the last dash. Anything after that never runs into another delimiter, so it isn't saved as a substring and pushed into the result vector. In an attempt to rectify the matter, I put together the following:
else if(input.at(i) == input.at(input.length())){
string subString = input.substr(begin, (input.length() - begin));
result.push_back(subString);
}
However, I keep getting out of bounds errors with the above portion. It seems to be having an issue with the boundaries for splitting the substring, and I can't figure out how to get around it. Any help?
In your code you have to remember that .size() (or .length()) is always 1 more than your last valid index, because indexing starts at 0: a string of size 1 is indexed only at [0]. So input.at(input.length()) is always one past the end and at() throws std::out_of_range; input.at(input.length()-1) is the last element. Here is an example that is working for me. After your loops, just grab the last piece of the string.
if(begin != input.length()){
string subString = input.substr(begin,(input.length()-begin));
result.push_back(subString);
}
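Putting that tail handling together with the loop from the question gives roughly the following. This is a sketch, not the asker's final code: the function name splitOnAny is hypothetical, and it swaps the inner delimiter loop for std::string::find_first_of.

```cpp
#include <string>
#include <vector>

// Splits input at any character found in delims. Empty pieces between
// adjacent delimiters are kept; an empty piece after a trailing delimiter is not.
std::vector<std::string> splitOnAny(const std::string &input, const std::string &delims) {
    std::vector<std::string> result;
    std::string::size_type begin = 0;
    std::string::size_type pos;
    while ((pos = input.find_first_of(delims, begin)) != std::string::npos) {
        result.push_back(input.substr(begin, pos - begin));
        begin = pos + 1;  // skip over the delimiter itself
    }
    // Grab the last piece, mirroring the check in the answer above.
    if (begin != input.length())
        result.push_back(input.substr(begin));
    return result;
}
```

Called as splitOnAny("comma-delim-delim&delim-delim", "-&"), it yields five pieces, including the final "delim" that the original loop dropped.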
Working from the code in the question I've substituted iterators so that we can check for the end() of the input:
int main() {
string input = "comma-delim-delim&delim-delim";
vector<string> result;
vector<char> delims;
delims.push_back('-');
delims.push_back('&');
auto begin = input.begin(); // use iterator
for(auto ii = input.begin(); ii <= input.end(); ii++){
for(auto j : delims) {
if(ii == input.end() || *ii == j){
string subString(begin,ii); // can construct string from iterators, also when ii is at end
result.push_back(subString);
if(ii != input.end())
begin = ii + 1;
else
goto done;
}
}
}
done:
return 0;
}
This program uses std::find_first_of to parse the multiple delimiters:
int main() {
string input = "comma-delim-delim&delim-delim";
vector<string> result;
vector<char> delims;
delims.push_back('-');
delims.push_back('&');
auto begin = input.begin(); // use iterator
for(;;) {
auto next = find_first_of(begin, input.end(), delims.begin(), delims.end());
string subString(begin, next); // can construct string from iterators
result.push_back(subString);
if(next == input.end())
break;
begin = next + 1;
}
}

read string into array

I want to read a string of integers separated by whitespace into an array. For example, I have a string that looks like 1 2 3 4 5, and I want to convert it into an integer array arr[5]={1, 2, 3, 4, 5}. How should I do that?
I tried deleting the whitespace, but that just assigns the whole 12345 to every array element. If I don't, every element is assigned 1.
for (int i = 0; i < str.length(); i++){
if (str[i] == ' ')
str.erase(i, 1);
}
for (int j = 0; j < size; j++){ // size is given
arr[j] = atoi(str.c_str());
}
A couple of notes:
Use a std::vector. You will most likely never know the size of an input at compile time. If you do, use a std::array.
If you have C++11 available to you, consider std::stoi or std::stol, as they throw on a failed conversion.
You could accomplish your task with a std::stringstream, which lets you treat a std::string as a std::istream like std::cin. I recommend this way.
Alternatively, you could go the hard route and tokenize your std::string with ' ' as the delimiter, which is what it appears you are trying to do.
Finally, why reinvent the wheel if you go the tokenization route? Use Boost's split function.
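To illustrate the note about std::stoi throwing: atoi returns 0 for unparsable input, which is indistinguishable from a real 0, while std::stoi reports the failure. A minimal sketch, where parseIntOr is a hypothetical helper name:

```cpp
#include <stdexcept>
#include <string>

// Returns the parsed value, or fallback when text does not start with a number
// or the number does not fit in an int. (Helper name is illustrative only.)
int parseIntOr(const std::string &text, int fallback) {
    try {
        return std::stoi(text);
    } catch (const std::invalid_argument &) {
        return fallback;  // no digits could be parsed
    } catch (const std::out_of_range &) {
        return fallback;  // value too large or too small for int
    }
}
```

Note that std::stoi still accepts trailing junk: " 7 apples" parses as 7. Its optional second argument reports how many characters were consumed if you need to detect that.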
Stringstream approach
std::vector<int> ReadInputFromStream(const std::string& _input, int _num_vals)
{
std::vector<int> toReturn;
toReturn.reserve(_num_vals);
std::istringstream fin(_input);
for(int i=0, nextInt=0; i < _num_vals && fin >> nextInt; ++i)
{
toReturn.emplace_back(nextInt);
}
// assert (toReturn.size() == _num_vals, "Error, stream did not contain enough input")
return toReturn;
}
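If the number of values is not known up front, the same stream idea works without a count. A sketch under that assumption (readAllInts is a hypothetical name, not part of the function above):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Reads whitespace-separated integers until extraction fails
// (end of input or the first token that is not an integer).
std::vector<int> readAllInts(const std::string &input) {
    std::vector<int> values;
    std::istringstream in(input);
    int next = 0;
    while (in >> next)
        values.push_back(next);
    return values;
}
```

For the string from the question, readAllInts("1 2 3 4 5") returns the five integers in order.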
Tokenization approach
std::vector<int> ReadInputFromTokenizedString(const std::string& _input, int _num_vals)
{
std::vector<int> toReturn;
toReturn.reserve(_num_vals);
char tok = ' '; // whitespace delimiter
size_t beg = 0;
size_t end = 0;
for(beg = _input.find_first_not_of(tok, end); toReturn.size() < static_cast<size_t>(_num_vals) &&
beg != std::string::npos; beg = _input.find_first_not_of(tok, end))
{
end = beg+1;
// advance end to the next delimiter or the end of the string (bounds check first)
while(end < _input.size() && _input[end] != tok)
++end;
toReturn.push_back(std::stoi(_input.substr(beg, end-beg)));
}
// assert (toReturn.size() == _num_vals, "Error, string did not contain enough input")
return toReturn;
}
Your code arr[j] = atoi(str.c_str()); is at fault. It always parses from the start of the string, so every element gets the same value. atoi takes a const char*, so you can pass a pointer to the position where each number starts, e.g. atoi(&str[pos]), but pos has to be moved to the start of the next number on every iteration; plain arr[j] = atoi(&str[j]) does not do that. By the way, if you have C++11, you could use std::stoi to convert a string to an int. I hope this helps.
You modify/parse the string in one loop but copy into the integer array in another, without marking where each embedded integer in the string starts and ends. So both actions have to happen in a single loop.
This code is not perfect, but it should give you some idea; it follows the same process you used, but with a vector.
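string str = "12 13 14";
vector<int> integers;
size_t start = 0, i = 0;
for (; i < str.length(); i++){
if (str[i] == ' ')
{
// substr takes (position, length), so pass the token length, not the end index
integers.push_back(atoi(str.substr(start, i - start).c_str()));
start = i + 1; // the next token begins after the space
}
}
integers.push_back(atoi(str.substr(start, i - start).c_str())); // last token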

c++ vector erase function not working for specific words?

I am using a very simple function in c++, vector.erase(), here's what I have (I'm trying to erase all instances of these three keywords from a .txt file):
First I use it in two separate for loops to erase all instances of <event> and </event>, this works perfectly and outputs the edited text file with no more instances of those words.
for (int j = 0; j< N-counter; j++) {
if(myvec[j] == "<event>") {
myvec.erase(myvec.begin()+j);
}
}
for (int j = 0; j< N-counter; j++) {
if(myvec[j] == "</event>") {
myvec.erase(myvec.begin()+j);
}
}
However, when I add a third for loop to do the EXACT same thing, literally just copy and paste with a new keyword as follows:
for (int j = 0; j< N-counter; j++) {
if(myvec[j] == "</LesHouchesEvents>") {
myvec.erase(myvec.begin()+j);
}
}
It compiles and executes, however it completely destroys the .txt file, making it completely un-openable, and when i cat it, I just get a bunch of crazy symbols.
I have tried switching the order of these for loops, even getting rid of the first two for loops entirely, everything I can think of, alas it just will not work for the keyword </LesHouchesEvents> for some strange reason.
Your loops are not taking into account that when you erase() an element from a vector, the indexes of the remaining elements will decrement accordingly. So your loops will eventually exceed the bounds of the vector once you have erased at least 1 element. You need to take that into account:
std::string word = ...;
size_t count = N-counter;
for (size_t j = 0; j < count;) {
if(myvec[j] == word) {
myvec.erase(myvec.begin()+j);
--count;
}
else {
++j;
}
}
With that said, it would be safer to use iterators instead of indexes. erase() returns an iterator to the element that immediately follows the removed element. You can use std::find() for the actual searching:
#include <algorithm>
std::vector<std::string>::iterator iter = std::find(myvec.begin(), myvec.end(), word);
while (iter != myvec.end())
{
iter = myvec.erase(iter);
iter = std::find(iter, myvec.end(), word);
}
Or, you could just use std::remove() instead:
#include <algorithm>
myvec.erase(std::remove(myvec.begin(), myvec.end(), word), myvec.end());
I don't know if this is your specific problem or not, but this loop is almost surely not what you want.
Note the documentation for erase - it "shifts" left the remaining elements. Unfortunately, your code still increments j, meaning you're skipping the next element:
for (int j = 0; j< N-counter; j++) { // <- Don't increment j here
...
myvec.erase(myvec.begin()+j); // <- increment it only if this didn't happen.
}
You'll also need to adjust your loop's halting condition.
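To make the skipping concrete, here is a small sketch comparing a forward index loop that always increments with one that only advances when nothing was erased. Both function names are invented for this example, and the loops use the vector's own size() as the bound rather than N-counter from the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Always increments j, so the element that slides into slot j after an erase is never checked.
std::vector<std::string> eraseAlwaysIncrement(std::vector<std::string> v, const std::string &word) {
    for (std::size_t j = 0; j < v.size(); j++) {
        if (v[j] == word)
            v.erase(v.begin() + j);
    }
    return v;
}

// Advances j only when nothing was erased at that position.
std::vector<std::string> eraseCheckingShifted(std::vector<std::string> v, const std::string &word) {
    for (std::size_t j = 0; j < v.size();) {
        if (v[j] == word)
            v.erase(v.begin() + j);
        else
            ++j;
    }
    return v;
}
```

With two adjacent "<event>" entries, the first version leaves one of them behind; the second removes both.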
Even assuming you got it working, this is nearly the worst possible way to remove items from a vector.
You almost certainly want the remove/erase idiom here, and you probably want to do all the comparisons in a single pass, so it's something like this:
std::vector<std::string> bad = {
"<event>",
"</event>",
"</LesHouchesEvents>"
};
myvec.erase(std::remove_if(myvec.begin(), myvec.end(),
[&](std::string const &s) {
return std::find(bad.begin(), bad.end(), s) != bad.end();
}),
myvec.end());

C++ Dynamic Array Inputs

I am using two dynamic arrays to read from a file. They keep track of each word and the number of times it appears. If a word has already appeared, I must update its count in one array and not add it to the other array again, since it already exists. However, I am getting blank spaces in my array whenever I meet a duplicate. I think it's because my pointer continues to advance when really it shouldn't, and I do not know how to combat this. The only workaround I have is a continue when I print out the results if the array content is empty: if (*(words + i) == "") continue;. This basically ignores those blanks in the array, but I think that is messy. I just want to figure out how to move the pointer back in this method. words and frequency are my dynamic arrays.
I would like guidance in what my problem is, rather than solutions.
I have now changed my outer loop to be a while loop, and only increment when I have found the word. Thank you WhozCraig and poljpocket.
Now this occurs.
Instead of incrementing your loop variable [i] every loop, you need to only increment it when a NEW word is found [i.e. not one already in the words array].
Also, you're wasting time in your inner loop by looping through your entire words array, since words will only exist up to index i.
int idx = 0;
bool isFound = false; // set when hold is already in words
while (file >> hold && idx < count) {
if (!valid_word(hold)) {
continue;
}
// You don't need to check past idx because you
// only have <idx> words so far.
for (int i = 0; i < idx; i++) {
if (toLower(words[i]) == toLower(hold)) {
frequency[i]++;
isFound = true;
break;
}
}
if (!isFound) {
words[idx] = hold;
frequency[idx] = 1;
idx++;
}
isFound = false;
}
First, to address your code, this is what it should probably look like. Note how we only increment i as we add words, and we only ever scan the words we've already added for duplicates. Note also how the first pass will skip the j-loop entirely and simply insert the first word with a frequency of 1.
void addWords(const std::string& fname, int count, string *words, int *frequency)
{
std::ifstream file(fname);
std::string hold;
int i = 0;
while (i < count && (file >> hold))
{
int j = 0;
for (; j<i; ++j)
{
if (toLower(words[j]) == toLower(hold))
{
// found a duplicate at j
++frequency[j];
break;
}
}
if (j == i)
{
// didn't find a duplicate
words[i] = hold;
frequency[i] = 1;
++i;
}
}
}
Second, to really address your code, this is what it should actually look like:
#include <iostream>
#include <fstream>
#include <map>
#include <string>
//
// Your implementation of toLower() goes here.
//
typedef std::map<std::string, unsigned int> WordMap;
WordMap addWords(const std::string& fname)
{
WordMap words;
std::ifstream inf(fname);
std::string word;
while (inf >> word)
++words[toLower(word)];
return words;
}
If it isn't obvious by now how a std::map<> makes this task easier, it never will be.
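The toLower() placeholder above is left to the reader. One possible implementation, shown here purely as an assumption about what it might look like, uses std::transform with std::tolower. The countWords helper is also hypothetical; it takes any std::istream so it can be tried without a file.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Possible toLower(): returns a lowercase copy of s.
// Taking the character as unsigned char avoids undefined behavior for negative char values.
std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Counts case-insensitive word frequencies from any input stream.
std::map<std::string, unsigned int> countWords(std::istream &in) {
    std::map<std::string, unsigned int> words;
    std::string word;
    while (in >> word)
        ++words[toLower(word)];
    return words;
}
```

For example, feeding it a std::istringstream holding "The the THE cat" produces a count of 3 for "the" and 1 for "cat".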
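Check out SEEK_CUR if you want to set the cursor back. It is a constant passed to fseek, not a function; with C++ streams the equivalent is seekg with std::ios::cur.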
The problem is a logical one. Consider the two situations:
1. Your algorithm does not find the current word. It is inserted at position i of your arrays.
2. Your algorithm does find the word. The frequency of the word is incremented along with i, which leaves you with blank entries in your arrays whenever there's a word which is already present.
To conclude, 1 works as expected but 2 doesn't.
My advice is that you don't rely on for loops to traverse the file but use a "get-next-until-end" approach with a while loop. With this, you can track your next insertion point and thus get rid of the blank entries.
int currentCount = 0;
while (file >> hold) // get the next word until the end of the file
{
// your inner for loop, which sets found
if (!found)
{
*(words + currentCount) = hold;
*(frequency + currentCount) = 1;
currentCount++;
}
}
Why not use a std::map?
void collect( std::string name, std::map<std::string,int> & freq ){
std::ifstream file;
file.open(name.c_str(), std::ifstream::in );
std::string word;
while( file >> word ){ // add toLower; stops at end of file or on a read error
freq[word]++;
}
file.close();
}
The problem with your solution is the use of count in the inner loop where you look for duplicates. You'll need another variable, say nocc, initially 0, used as limit in the inner loop and incremented whenever you add another word that hasn't been seen yet.

Creating own input masks

Basically, I want to validate a string against a mask that is stored in the DB. To validate against it, I need to assign a rule to each part of the mask, e.g. [D] = 0<=10. So what I have done is retrieve the mask, strip the [] from around the letters, and store the pieces in two different vectors. My question is: can you assign a rule to individual cells of the vector?
i.e.
a[0] = 0<=10
a[1] = "H"
Something along those lines. Below is my code. Bear in mind that the string at the top is not from the DB; it is just a string I created, assuming it is from the DB, because the process will be the same.
string s("[sh][a][mar][i]");
vector< vector<char> > Vect;
vector<char> vect;
int i = 0;
while(i < s.size()) {
if(s[i]=='[') {
i++;
vect.push_back(s[i]);
i++;
}
else if(s[i] == ']') {
i++;
Vect.push_back(vect);
vect.clear();
}
else {
vect.push_back(s[i]);
i++;
}
}
vector< vector<char> >::iterator it;
vector<char>::iterator itera;
vector<std::string> vectString;
for (it = Vect.begin() ; it != Vect.end() ; ++it ) {
string a;
for (itera = it->begin() ; itera != it->end() ; ++itera) {
cout << *itera;
a += *itera;
}
vectString.push_back(a);
}
I don't really understand your question, but here is what I can suggest:
Instead of using a huge if-else-if chain, try using std::string::find and std::string::substr to extract the elements.
I don't really see the reason to transform the std::string into chars and then convert it back. Using std::string::find and std::string::substr to get vectString in one step may be a better idea.
std::string offers powerful functions; if you are not familiar with them, you may want to take a look at: http://www.cplusplus.com/reference/string/string/
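As a sketch of the find/substr idea (the function name extractBracketed is hypothetical, and it assumes the mask has no nested brackets):

```cpp
#include <string>
#include <vector>

// Collects the text between each '[' and the next ']'.
std::vector<std::string> extractBracketed(const std::string &mask) {
    std::vector<std::string> parts;
    std::string::size_type open = mask.find('[');
    while (open != std::string::npos) {
        std::string::size_type close = mask.find(']', open + 1);
        if (close == std::string::npos)
            break;  // unmatched '[': stop rather than guess
        parts.push_back(mask.substr(open + 1, close - open - 1));
        open = mask.find('[', close + 1);
    }
    return parts;
}
```

For the sample string "[sh][a][mar][i]" this produces the four strings sh, a, mar and i directly, with no intermediate vector of chars.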
Do you want to verify whether a given string is accepted by some regular expressions?
If so, why not use regular expression objects that represent your rules and then check whether the match result equals your original string?
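A minimal sketch of that suggestion with std::regex (C++11). The rule shown, a number from 0 to 10 followed by the letter H, is only a guess at what a mask like the one in the question might mean, and both matchesRule and sampleRule are hypothetical names.

```cpp
#include <regex>
#include <string>

// True when the whole input satisfies the rule; std::regex_match
// only succeeds if the entire string matches the pattern.
bool matchesRule(const std::string &input, const std::regex &rule) {
    return std::regex_match(input, rule);
}

// Hypothetical rule: a number from 0 to 10, then the letter H.
std::regex sampleRule() {
    return std::regex("(10|[0-9])H");
}
```

Because regex_match requires the full string to match, there is no need to compare the matched text against the original afterwards.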