Comparing lines inside a vector<string> - c++

How do you read lines from a vector and compare their lengths? I have pushed strings inside a vector and now would like to find the longest line and use that as my output. I have this as my code all the way up to comparing the strings:
ifstream code_File ("example.txt");
size_t find_Stop1, find_Stop2, find_Stop3, find_Start;
string line;
vector<string> code_Assign, code_Stop;
if (code_File.is_open()) {
while ( getline(code_File,line)) {
find_Start = line.find("AUG"); // Finding all possible start codes
if (find_Start != string::npos) {
line = line.substr(find_Start);
code_Assign.push_back(line); //adding line to Code_Assign
find_Stop2 = line.find("UGA"); // Try and find stop code.
if (find_Stop2 != string::npos) {
line = line.substr(line.find("AUG"), find_Stop2);
code_Stop.push_back(line); // Adding it to code_Stop vector
}
find_Stop1 = line.find("UAA"); // finding all possible stop codes.
if (find_Stop1 != string::npos) {
line = line.substr(line.find("AUG"), find_Stop1); // Assign string code_1 from start code to UGA
code_Stop.push_back(line); //Adding to code_Stop vector
}
find_Stop3 = line.find("UAG"); // finding all possible stop codes.
if (find_Stop3 != string::npos) {
line = line.substr(line.find("AUG"), find_Stop3);
code_Stop.push_back(line); //Adding to code_Stop vector
}
}
}
cout << '\n' << "Codes to use: " << endl;
for (size_t i = 0; i < code_Assign.size(); i++)
cout << code_Assign[i] << endl;
cout << '\n' << "Possible Reading Frames: " << endl;
for (size_t i = 0; i < code_Stop.size(); i++)
cout << code_Stop[i] << endl;
cout << endl;
std::vector<std::string>::iterator longest = std::max_element(code_Stop.begin(), code_Stop.end(), compare_length);
std::string longest_line = *longest; // retrieve return value
code_File.close();
}
else cout << "Cannot open File.";
To clarify, my current output is the following, all stored in the code_Stop vector:
Possible Reading Frames:
AUG GGC CUC GAG ACC CGG GUU UAA AGU AGG
AUG GGC CUC GAG ACC CGG GUU
AUG AAA UUU GGG CCC AGA GCU CCG GGU AGC GCG UUA CAU
and I would just like to get the longest line.
Note, I am just learning about vectors, so please be kind... I have been getting a lot of help from this board and it is very much appreciated.
Ed. I changed the code to show where I have put it and it is giving me "Program received signal: 'EXC_BAD_ACCESS'". What have I done?

This should work:
#include <string>
#include <vector>
#include <algorithm>
bool compare_length(std::string const& lhs, std::string const& rhs) {
return lhs.size() < rhs.size();
}
int main() {
std::vector<std::string> lines; // fill with data
std::vector<std::string>::iterator longest = std::max_element(
lines.begin(), lines.end(),
compare_length);
if (longest != lines.end()) { // an empty vector yields end(), which must not be dereferenced
std::string longest_line = *longest; // retrieve return value
}
}
compare_length is a function that compares the length of two given strings. It returns true if the first string is shorter than the second one, and false otherwise.
std::max_element is a standard algorithm that finds the largest element in a sequence using the specified comparison function. lines.begin() and lines.end() return iterators to the beginning and the end of the sequence lines, thus specifying the range the algorithm should scan.
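As a rough sketch of the same idea wrapped in a function: longest_line is a name made up for this example, and the empty-vector guard reflects an assumption that the EXC_BAD_ACCESS in the question may come from dereferencing end() when code_Stop has no elements.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the longest string in `lines`,
// or an empty string when `lines` is empty.
// On ties, std::max_element returns the first of the longest elements.
std::string longest_line(const std::vector<std::string>& lines) {
    if (lines.empty()) {
        return std::string();  // nothing to dereference
    }
    auto longest = std::max_element(
        lines.begin(), lines.end(),
        [](const std::string& lhs, const std::string& rhs) {
            return lhs.size() < rhs.size();  // same rule as compare_length
        });
    return *longest;
}
```

Calling longest_line(code_Stop) after the reading loop would then give the longest reading frame, or an empty string if no frame was found.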

Related

Find a hidden permutation of a string C++

I have two strings, and wanted to check if the second is a permutation of the first (and vice versa of course).
So I found out on cplusplus reference that the is_permutation function of the library algorithm could help me. Indeed, I have the following code:
int main () {
string s1 = "bear";
string s2 = "reab";
if ( is_permutation (s1.begin(), s1.end(), s2.begin()) )
cout << "Found permutation.\n";
else cout << "No permutations found.\n";
return 0;
}
And this works. But now, for example, let's say I still have the string "bear", and a second random string that inside has a permutation of bear, so something like this:
s1 = "bear";
s2 = "AsdVYTcKIyqbNQreabJUoBn";
As you can see there's still the permutation "reab". How can I check whether there's a hidden permutation? And then save it in a separate string "s3"?
Hope you can help me.
You can use a combination of std::string::substr and is_permutation to achieve this.
// Example program
#include <iostream>
#include <string>
#include <algorithm>
using std::string;
using std::cout;
int main () {
string s1 = "bear";
string s2 = "AsdVYTcKIyqbNQJUoBnreab";
size_t i;
for( i = 0; i <= s2.size() - s1.size(); i++)
{
string s3 = s2.substr(i, s1.size());
if ( is_permutation (s1.begin(), s1.end(), s3.begin()))
{
cout << "Found permutation.\n";
break;
}
else
{
continue;
}
}
if(i > s2.size() - s1.size())
cout << "No permutations found.\n";
return 0;
}
As kingW3 already pointed out in the comments, here is how one might do it:
string s1 = "bear";
string s2 = "AsdVYTcKIyqbNQreabJUoBn";
string key = "";
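bool found = false;
for (size_t i = 0; i < s2.length()+1 - s1.length(); i++)
{
key = s2.substr(i, s1.length());
if (is_permutation(s1.begin(), s1.end(), key.begin()))
{
found = true; // key now holds the hidden permutation
break;
}
}
// report once, after the loop, instead of once per window
if (found) cout << "Found permutation: " << key << "\n";
else cout << "No permutations found.\n";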
return 0;
Edit: Please note that the for loop condition has to be written with either +1 or -1 in order to reach the last character of your second string.
i < s2.length()+1 - s1.length()
or
i < s2.length() - (s1.length()-1)
hope it helps.
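Both answers above call is_permutation on every window. A different technique, not taken from either answer, is a sliding window of character counts, which looks at each character of s2 a constant number of times. A minimal sketch, assuming plain 8-bit characters; find_hidden_permutation is a name invented for this example:

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the first substring of `text` that is a
// permutation of `pattern`, or an empty string if there is none
// (or if `pattern` is empty).
std::string find_hidden_permutation(const std::string& pattern,
                                    const std::string& text) {
    const std::size_t m = pattern.size();
    if (m == 0 || text.size() < m) {
        return std::string();
    }
    std::array<int, 256> need{};    // character counts of the pattern
    std::array<int, 256> window{};  // character counts of the current window
    for (std::size_t i = 0; i < m; ++i) {
        ++need[static_cast<unsigned char>(pattern[i])];
        ++window[static_cast<unsigned char>(text[i])];
    }
    for (std::size_t start = 0;; ++start) {
        if (need == window) {
            return text.substr(start, m);  // this is the "s3" from the question
        }
        if (start + m >= text.size()) {
            break;  // no further window fits
        }
        // Slide the window one character to the right.
        --window[static_cast<unsigned char>(text[start])];
        ++window[static_cast<unsigned char>(text[start + m])];
    }
    return std::string();
}
```

With the strings from the question, the returned value would be the window "reab"; an empty result means no permutation was found.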

Writing sorted numbers from function out to text file?

I have two text files, each with an unknown number of integers sorted from lowest to highest... for example:
input file 1: 1 3 5 7 9 11...
input file 2: 2 4 6 8 10 ....
I want to take these numbers from both files, sort from low to high, and then output the full list of sorted numbers from both input files to a single output file. What I have so far...
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include "iosort.h"
int main()
{
const char* filename1 = "numberlist1.txt";
const char* filename2 = "numberlist2.txt";
std::ofstream ofs("output.txt");
std::ifstream ifs1, ifs2;
std::string input1, input2;
ifs1.open(filename1);
std::getline(ifs1, input1);
std::cout << "Contents of file 1: " << input1 << std::endl;
ifs2.open(filename2);
std::getline(ifs2, input2);
std::cout << "Contents of file 2: " << input2 << std::endl;
ioSort(ifs1, ifs2, ofs);
return 0;
}
and my function...
#include <fstream>
#include <sstream>
#include <vector>
#include "iosort.h"
void ioSort(std::ifstream& in1, std::ifstream& in2, std::ofstream& out)
{
int a, b;
std::vector<int> f1, f2, f3; //create one vector for each input stream
while (in1 >> a)
{
f1.push_back(a);
}
while (in2 >> b)
{
f2.push_back(b);
}
//now f1 and f2 are vectors that have the numbers from the input files
//we know that in these input files numbers are sorted from low to high
if (f1.size() > f2.size()) //input stream 1 was larger
{
for (int i = 0; i < f2.size(); i++)
{
if (f1[i] > f2[i]) //number at input vector 2 less that respective pos
{ //in input vector 1
f3.push_back(f2[i]);
}
else if(f1[i] == f2[i]) //numbers are equal
{
f3.push_back(f1[i]);
f3.push_back(f2[i]);
}
else //number in 1 is less than that in vector 2
{
f3.push_back(f1[i]);
}
}
for (int i = f2.size(); i < f1.size(); i++)
{
f3.push_back(f1[i]); //push remaining numbers from stream 1 into vector
}
}
else //input stream 2 was larger
{
for (int i = 0; i < f1.size(); i++)
{
if (f1[i] > f2[i]) //number at input vector 2 less that respective pos
{ //in input vector 1
f3.push_back(f2[i]);
}
else if(f1[i] == f2[i]) //numbers are equal
{
f3.push_back(f1[i]);
f3.push_back(f2[i]);
}
else //number in 1 is less than that in vector 2
{
f3.push_back(f1[i]);
}
}
for (int i = f1.size(); i < f2.size(); i++)
{
f3.push_back(f1[i]); //push remaining numbers from stream 2 into vector
}
}
//send vector contents to output file
for (int i = 0; i < f3.size(); i++)
{
out << f3[i] << " ";
}
}
Every time I compile and run, the file output.txt is created, but it is empty. Can anybody point out what I am doing wrong? If, in main, I do something like:
out << 8 << " " << 9 << std::endl;
then it will show up in the output file.
AHA! Found your error. You're opening the file, then reading it directly to stdout (where you list the contents of your file), and then passing the same stream into your function. You cannot do this. Whenever you read from a file, the stream moves further through the file. By the time you're in your sorting function, you're at the end of the file, and so no numbers are read!
You need to remove the lines
std::getline(ifs1, input1);
std::cout << "Contents of file 1: " << input1 << std::endl;
and
std::getline(ifs2, input2);
std::cout << "Contents of file 2: " << input2 << std::endl;
Instead, print them out after you've stored them in the vector.
I'll leave the rest of my reply down below, since you, or posterity, might need it.
I'm not sure what's going on with your output file problem. Go through the whole chain and see where it's failing:
After you've read your file in, print out f1 and f2 with cout. Are they there and what you expect? If they are, we can move on.
After your algorithm has run, is your f3 there, and what you expect? If so, keep going!
This lets you diagnose the exact line where your code is failing (i.e. not doing what you expect it to do), and you can rule out everything you've already checked.
Of course, instead of using cout you can run this under a debugger and see what happens step by step, but if you don't know how, learning that will take longer than diagnosing the problem with cout the first time.
You do have other problems though: your merge function has errors. You end up skipping certain elements because you're only using one index for both arrays. Think about it: you only push one number into your output array in the f1[i] > f2[i] and f1[i] < f2[i] cases, but you discard both by incrementing i.
You can take your merge loop and simplify it by a lot, while also fixing your mistake :).
auto it = f1.cbegin();
auto jt = f2.cbegin();
while (it != f1.cend() && jt != f2.cend()) {
if (*it < *jt) f3.push_back(*it++); //f1 was smaller, push f1>f3 and increment f1 index
else if (*it > *jt) f3.push_back(*jt++); //f2 was smaller, push f2>f3 and increment f2 index
else { //implicit equals, only option left
f3.push_back(*jt++);
f3.push_back(*it++);
}
}
while (it != f1.cend()) f3.push_back(*it++);
while (jt != f2.cend()) f3.push_back(*jt++);
So now f3 contains your sorted array, sorted in O(m+n) time. If you're doing this for the sake of learning, I'd try to remedy your error using your way first before switching over to this.
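For reference, the two-iterator merge can be sketched as a standalone function that is easy to test without any files. merge_sorted is a name chosen for this example, and both inputs are assumed to be sorted in ascending order already:

```cpp
#include <vector>

// Hypothetical helper: merges two ascending vectors into one ascending
// vector, keeping duplicates.
std::vector<int> merge_sorted(const std::vector<int>& f1,
                              const std::vector<int>& f2) {
    std::vector<int> f3;
    f3.reserve(f1.size() + f2.size());
    auto it = f1.cbegin();
    auto jt = f2.cbegin();
    // Always take the smaller front element and advance only that iterator.
    while (it != f1.cend() && jt != f2.cend()) {
        if (*jt < *it) f3.push_back(*jt++);
        else f3.push_back(*it++);  // also covers equal values
    }
    // At most one of the inputs still has elements left; append them.
    f3.insert(f3.end(), it, f1.cend());
    f3.insert(f3.end(), jt, f2.cend());
    return f3;
}
```

Equal values need no special branch here: one copy is pushed now and the other on the next pass through the loop.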
If you want to write less code and speed isn't a problem, you can use <algorithm> to do this, too, but it's a terrible O((n+m)lg(n+m)).
auto it = f1.cbegin();
auto jt = f2.cbegin();
while (it != f1.cend()) f3.push_back(*it++);
while (jt != f2.cend()) f3.push_back(*jt++);
std::sort(f3.begin(), f3.end());
Since you're reading the file with std::getline() before calling ioSort(), there's nothing for the sorting function to read.
You can rewind back to the beginning of the file with seekg().
ifs1.clear();
ifs1.seekg(0, ifs1.beg);
ifs2.clear();
ifs2.seekg(0, ifs2.beg);
ioSort(ifs1, ifs2, ofs);
See How to read same file twice in a row
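To see the rewind in isolation, here is a small sketch that works on any std::istream, so it can be tried with a std::istringstream instead of a real file. The helper names read_ints and rewind_stream are made up for this example:

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads integers until extraction fails.
std::vector<int> read_ints(std::istream& in) {
    std::vector<int> values;
    int v;
    while (in >> v) {
        values.push_back(v);
    }
    return values;
}

// Hypothetical helper: clears eof/fail flags and moves the read position
// back to the start of the stream.
void rewind_stream(std::istream& in) {
    in.clear();
    in.seekg(0, std::ios::beg);
}
```

After a std::getline has consumed the only line, read_ints returns an empty vector; calling rewind_stream first makes the same numbers readable again.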
To keep it short:
#include <fstream>
#include <algorithm>
#include <iterator>
int main()
{
std::ifstream infile1("infile1.txt");
std::ifstream infile2("infile2.txt");
std::ofstream outfile("outfile.txt");
std::merge(
std::istream_iterator<int>{infile1}, std::istream_iterator<int>{},
std::istream_iterator<int>{infile2}, std::istream_iterator<int>{},
std::ostream_iterator<int>{outfile, " "}
);
}
std::merge is an STL algorithm that merges two sorted ranges into one sorted range. In this case the ranges are the files: each input file is viewed as a range through std::istream_iterator<int>, and the output file is written as a range through std::ostream_iterator<int>.
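The same call can be tried without touching the file system by swapping the file streams for string streams. This is only a sketch; merge_sorted_text is a hypothetical name, and the inputs are assumed to be whitespace-separated integers in ascending order:

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: merges two ascending integer lists given as text
// and returns the merged list as text, each number followed by a space.
std::string merge_sorted_text(const std::string& a, const std::string& b) {
    std::istringstream in1(a);
    std::istringstream in2(b);
    std::ostringstream out;
    std::merge(std::istream_iterator<int>(in1), std::istream_iterator<int>(),
               std::istream_iterator<int>(in2), std::istream_iterator<int>(),
               std::ostream_iterator<int>(out, " "));
    return out.str();
}
```

Because std::ostream_iterator writes the delimiter after every element, the result ends with a trailing space.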

How to compare two arrays and return non matching values in C++

I would like to parse through two vectors of strings and find the strings that match each other and the ones that do not.
Example of what I want get:
input vector 1 would look like: [string1, string2, string3]
input vector 2 would look like: [string2, string3, string4]
Ideal output:
string1: No Match
string2: Match
string3: Match
string4: No Match
At the moment I use this code:
vector<string> function(vector<string> sequences, vector<string> second_sequences){
for(vector<string>::size_type i = 0; i != sequences.size(); i++) {
for(vector<string>::size_type j = 0; j != second_sequences.size(); j++){
if (sequences[i] == second_sequences[j]){
cout << "Match: " << sequences[i];
}else{
cout << "No Match: " << sequences[i];
cout << "No Match: " << second_sequences[j];
}
}
}
}
It works great for the ones that match, but iterates over everything so many times,
and the ones that do not match get printed a large number of times.
How can I improve this?
The best code is the code that you did not have to write.
If you use a standard-library map container, it takes care of sorting and remembering the different strings you encounter.
So let the container work for us.
Here is a small, quickly written example. This syntax requires at least C++11 (for example -std=c++11 on gcc). The pre-C++11 syntax is much more verbose (but worth knowing for educational purposes).
The loops are not nested: each vector is traversed only once.
This is only a hint (my code does not account for string4 possibly appearing more than once in the second vector; I leave it to you to adapt it to your exact needs).
#include <iostream>
#include <vector>
#include <string>
#include <map>
using namespace std;
vector<string> v1 { "string1","string2","string3"};
vector<string> v2 { "string2","string3","string4"};
//ordered map will take care of "alphabetical" ordering
//The key are the strings
//the value is a counter ( or could be any object of your own
//containing more information )
map<string,int> my_map;
int main()
{
cout << "Hello world!" << endl;
//The first vector feeds the map before comparison with
//The second vector
for ( const auto & cstr_ref:v1)
my_map[cstr_ref] = 0;
//We will look into the second vector ( it could also be the third,
//the fourth... )
for ( const auto & cstr_ref:v2)
{
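//find returns end() when the key is absent; equal_range().first would
//instead point at the next larger key and count a false match
auto iter = my_map.find(cstr_ref);
if ( my_map.end() != iter )
{
//if the element already exist we increment the counter
iter->second += 1;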
}
else
{
//otherwise we put the string inside the map
my_map[cstr_ref] = 0;
}
}
for ( const auto & map_iter: my_map)
{
if ( 0 < map_iter.second )
{
cout << "Match :";
}
else
{
cout << "No Match :" ;
}
cout << map_iter.first << endl;
}
return 0;
}
Output:
No Match :string1
Match :string2
Match :string3
No Match :string4
std::sort(std::begin(v1), std::end(v1));
std::sort(std::begin(v2), std::end(v2));
std::vector<std::string> common_elements;
std::set_intersection(std::begin(v1), std::end(v1)
, std::begin(v2), std::end(v2)
, std::back_inserter(common_elements));
for(auto const& s : common_elements)
{
std::cout<<s<<std::endl;
}
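The snippet above only lists the common elements. For the "No Match" half of the question, one option in the same family is std::set_symmetric_difference, which collects the strings present in exactly one of the two sorted ranges. A sketch under that assumption; unmatched is a name invented here:

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: strings that occur in exactly one of the two vectors.
// The vectors are taken by value because they must be sorted first.
std::vector<std::string> unmatched(std::vector<std::string> v1,
                                   std::vector<std::string> v2) {
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());
    std::vector<std::string> result;
    std::set_symmetric_difference(v1.begin(), v1.end(),
                                  v2.begin(), v2.end(),
                                  std::back_inserter(result));
    return result;
}
```

Printing the intersection as "Match" and this result as "No Match" reports every string once.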

How to make String::Find(is) omit this

If I have a list, which contains the 4 nodes ("this"; "test example"; "is something of"; "a small") and I want to find every string that has "is" (only 1 positive with this list). This topic has been posted a large number of times, which I have used to help get me this far. However, I can't see anywhere how I omit "this" from a positive result. I could probably use string::c_str, then find it myself, after I've reduced my much larger list. Or is there a way I could use string::find_first_of? It would seem there's a better way. Thanks.
EDIT: I know that I can omit a particular string, but I'm looking for bigger picture b/c my list is quite large (ex: poem).
for(it = phrases.begin(); it != phrases.end(); ++it)
{
found = it->find(look);
if(found != string::npos)
cout << i++ << ". " << *it << endl;
else
{
i++;
insert++;
}
}
Just to clarify: what are you struggling with?
What you want to do is check if what you have found is the start of a word (or the phrase) and is also the end of a word (or the phrase)
ie. check if:
found is equal to phrases.begin OR the element preceding found is a space
AND two elements after found is a space OR phrases.end
EDIT: You can access the position of the match through found (replace X with the length of the string you are looking for, look.length()).
found = it->find(look);
if(found!=string::npos)
{
if((found==0 || it->at(found-1)==' ')
&& (found==it->length()-X || it->at(found+X)==' '))
{
// Actually found it
}
} else {
// Do whatever
}
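The boundary check above looks only at the first occurrence, so a phrase like "this is it" would be rejected because the first "is" sits inside "this". A minimal sketch that keeps searching until a whole-word occurrence turns up; contains_word is a hypothetical name, and only the space character is treated as a separator:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `word` occurs in `phrase` delimited by
// spaces or by the start/end of the phrase.
bool contains_word(const std::string& phrase, const std::string& word) {
    if (word.empty()) {
        return false;
    }
    std::size_t found = phrase.find(word);
    while (found != std::string::npos) {
        const std::size_t after = found + word.size();
        const bool starts_word = (found == 0) || (phrase[found - 1] == ' ');
        const bool ends_word = (after == phrase.size()) || (phrase[after] == ' ');
        if (starts_word && ends_word) {
            return true;
        }
        found = phrase.find(word, found + 1);  // try the next occurrence
    }
    return false;
}
```

For the list in the question, only "is something of" passes; punctuation handling would need a wider notion of separator.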
We can use boost regex to search with regular expressions. Below is some example code. Regular expressions make it possible to build complex search patterns.
#include <boost/regex.hpp>
#include <string>
#include <iostream>
#include <boost/tokenizer.hpp>
using namespace boost;
using namespace std;
int main()
{
std::string list[4] = {"this","hi how r u ","is this fun is","no"};
regex ex("^is");
for(int x =0;x<4;++x)
{
string::const_iterator start, end;
boost::char_separator<char> sep(" ");
boost::tokenizer<boost::char_separator<char> > token(list[x],sep);
cout << "Search string: " << list[x] <<"\n"<< endl;
int count = 0; // matches in this string; a separate name so it does not clash with the loop index x
for(boost::tokenizer<boost::char_separator<char> >::iterator itr = token.begin();
itr!=token.end();++itr)
{
start = (*itr).begin();
end = (*itr).end();
boost::match_results<std::string::const_iterator> what;
boost::match_flag_type flags = boost::match_default;
if(boost::regex_search(start, end, what, ex, flags))
{
++count;
cout << "Found--> " << what.str() << endl;
}
}
cout<<"found pattern "<<count <<" times."<<endl<<endl;
}
return 0;
}
Output:
Search string: this
found pattern 0 times.
Search string: hi how r u
found pattern 0 times.
Search string: is this fun is
Found--> is
Found--> is
found pattern 2 times.
Search string: no
found pattern 0 times.
I didn't realize you only wanted to match "is". You can do this by using an std::istringstream to tokenize it for you:
std::string term("is");
for(std::list<std::string>::const_iterator it = phrases.begin();
it != phrases.end(); ++it)
{
std::istringstream ss(*it);
std::string token;
while(ss >> token)
{
if(token == term)
std::cout << "Found " << token << "\n";
}
}

How do I find all the positions of a substring in a string?

I want to search a large string for all the locations of a string.
The two other answers are correct but they are very slow and have O(N^2) complexity. But there is the Knuth-Morris-Pratt algorithm, which finds all substrings in O(N) complexity.
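Knuth-Morris-Pratt is only named above, so here is a minimal sketch of it that reports every start position, overlapping ones included. kmp_find_all is a name chosen for this example, not a standard function:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: Knuth-Morris-Pratt search returning every start
// index of `pattern` in `text`. An empty pattern yields no matches.
std::vector<std::size_t> kmp_find_all(const std::string& text,
                                      const std::string& pattern) {
    std::vector<std::size_t> positions;
    const std::size_t m = pattern.size();
    if (m == 0) {
        return positions;
    }
    // fail[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    std::vector<std::size_t> fail(m, 0);
    for (std::size_t i = 1, k = 0; i < m; ++i) {
        while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
        if (pattern[i] == pattern[k]) ++k;
        fail[i] = k;
    }
    // Scan the text once; k counts the pattern characters matched so far.
    for (std::size_t i = 0, k = 0; i < text.size(); ++i) {
        while (k > 0 && text[i] != pattern[k]) k = fail[k - 1];
        if (text[i] == pattern[k]) ++k;
        if (k == m) {
            positions.push_back(i + 1 - m);
            k = fail[k - 1];  // continue, allowing overlapping matches
        }
    }
    return positions;
}
```

The text index never moves backwards, which is where the linear running time comes from.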
Edit:
Also, there is another algorithm: the so-called "Z-function" with O(N) complexity, but I couldn't find an English source for this algorithm (maybe because there is also another more famous one with the same name, the zeta function of Riemann), so I will just put its code here and explain what it does.
void calc_z (string &s, vector<int> & z)
{
int len = s.size();
z.resize (len);
int l = 0, r = 0;
for (int i=1; i<len; ++i)
if (z[i-l]+i <= r)
z[i] = z[i-l];
else
{
l = i;
if (i > r) r = i;
for (z[i] = r-i; r<len; ++r, ++z[i])
if (s[r] != s[z[i]])
break;
--r;
}
}
int main()
{
string main_string = "some string where we want to find substring or sub of string or just sub";
string substring = "sub";
string working_string = substring + main_string;
vector<int> z;
calc_z(working_string, z);
//after this z[i] is maximal length of prefix of working_string
//which is equal to string which starting from i-th position of
//working_string. So the positions where z[i] >= substring.size()
//are positions of substrings.
for(int i = substring.size(); i < working_string.size(); ++i)
if(z[i] >=substring.size())
cout << i - substring.size() << endl; //to get position in main_string
}
Using std::string::find. You can do something like:
std::string::size_type start_pos = 0;
while( std::string::npos !=
( start_pos = mystring.find( my_sub_string, start_pos ) ) )
{
// do something with start_pos or store it in a container
++start_pos;
}
EDIT: Doh! Thanks for the remark, Nawaz! Better?
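The loop above can be wrapped into a function that stores the positions in a container. A sketch, with find_all as a made-up name; stepping by one character after each hit means overlapping occurrences are reported too:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects every index at which `sub` starts in `text`.
std::vector<std::size_t> find_all(const std::string& text,
                                  const std::string& sub) {
    std::vector<std::size_t> positions;
    if (sub.empty()) {
        return positions;  // an empty needle would match at every index
    }
    std::size_t pos = text.find(sub);
    while (pos != std::string::npos) {
        positions.push_back(pos);
        pos = text.find(sub, pos + 1);  // resume just past the last hit
    }
    return positions;
}
```

To skip overlapping matches instead, resume at pos + sub.size() rather than pos + 1.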
For completeness, there is another approach using std::search. It works like std::string::find, except that you work with iterators, something like:
std::string::iterator it(str.begin()), end(str.end());
std::string::iterator s_it(search_str.begin()), s_end(search_str.end());
it = std::search(it, end, s_it, s_end);
while(it != end)
{
// do something with this position..
// a tiny optimisation could be to buffer the result of the std::distance - heyho..
// std::advance returns void, so step past the match first, then search again
std::advance(it, std::distance(s_it, s_end));
it = std::search(it, end, s_it, s_end);
}
I find that this sometimes outperforms std::string::find, esp. if you represent your string as a vector<char>.
Simply use std::string::find() which returns the position at which the substring was found, or std::string::npos if none was found.
Here is the example taken from the documentation:
// string::find
#include <iostream>
#include <string>
using namespace std;
int main ()
{
string str ("There are two needles in this haystack with needles.");
string str2 ("needle");
size_t found;
// different member versions of find in the same order as above:
found=str.find(str2);
if (found!=string::npos)
cout << "first 'needle' found at: " << int(found) << endl;
found=str.find("needles are small",found+1,6);
if (found!=string::npos)
cout << "second 'needle' found at: " << int(found) << endl;
found=str.find("haystack");
if (found!=string::npos)
cout << "'haystack' also found at: " << int(found) << endl;
found=str.find('.');
if (found!=string::npos)
cout << "Period found at: " << int(found) << endl;
// let's replace the first needle:
str.replace(str.find(str2),str2.length(),"preposition");
cout << str << endl;
return 0;
}