Encoding two vectors in a function - c++

Hey I'm writing a function that takes two std::vector<std::string> and returns a third std::vector<std::string>.
The function is going to encode the two vectors together and create the 3rd vector.
I'm currently debugging this to find out why it's not working, and I keep getting: vector subscript out of range. As far as I can tell it's crashing at this line:
if (file2[i].size() < file1[i].size())
Can I use size() to get the size of the element at i?
std::vector<std::string> Encode(std::vector<std::string> &file1,
std::vector<std::string> &file2)
{
std::vector<std::string> file3;
std::string temp;
for (unsigned int i = 0; i < file1.size(); i++) {
for (unsigned int x = 0; x < file1[i].size(); x++) {
if (file2[i].size() < file1[i].size()) {
for (unsigned int t = 0; t < file2[i].size(); t++) {
file3[i][x] = (int)file1[i][x] + (int)file2[i][t];
}
} else if (file2[i].size() > file1[i].size()) {
file3[i][x] = (int)file1[i][x] + (int)file2[i][x];
}
if (file3[i][x] > 126) {
file3[i][x] = file3[i][x] % 127;
} else {
file3[i][x] = file3[i][x] + 32;
}
}
}
return file3;
}
Any idea what's going on here?

I'd be very much inclined to simplify by factoring. At the lowest layer is a combine function to combine two chars into one:
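char combine(char a, char b)
{
    // Sum in an int: a char may not be able to hold the sum of two chars.
    int result = a + b;
    if (result > 126)
        return static_cast<char>(result % 127);
    return static_cast<char>(result + 32);
}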
The next level up would be to iterate through each of the letters in two strings of possibly different sizes. The algorithm works for differing length strings by "recycling" through the shorter string.
std::string mix(const std::string &first, const std::string &second)
{
unsigned len1 = first.length();
unsigned len2 = second.length();
if (len1 < len2)
return mix(second, first);
std::string result;
// if the strings are of different lengths, first is now the longer
unsigned j=0;
for (unsigned i=0; i < len1; ++i, ++j) {
if (j >= len2)
j = 0;
result.push_back(combine(first[i], second[j]));
}
return result;
}
Finally, the combination of the vector of string is much simpler:
std::vector<std::string> Encode(const std::vector<std::string> &file1,
const std::vector<std::string> &file2)
{
std::vector<std::string> file3;
assert(file1.size() == file2.size());
for (unsigned int i = 0; i < file1.size(); i++) {
file3.push_back(mix(file1[i], file2[i]));
}
return file3;
}
Note that the code currently uses an assert to ensure that the two vectors are the same length, but this is probably an artificial constraint. Real code should either ensure that they are the same length or do something else to handle that case. Since it's not clear what your function is intended to do, I've left it to you to decide how to handle it, with the assert as a placeholder to remind you that it does need to be addressed.
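One way the assert could be replaced, sketched under the assumption that unmatched trailing entries should simply be dropped (the question does not say what should happen to them), is to iterate only over the part both vectors share. The names encode_common_prefix and join_pair are hypothetical; join_pair is only a stand-in for mix so that the sketch compiles on its own.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for mix(): any function combining two strings fits here.
std::string join_pair(const std::string &a, const std::string &b)
{
    return a + "+" + b;
}

// Combine only the entries both vectors have; extra entries in the
// longer vector are ignored (an assumption, not a requirement from the question).
std::vector<std::string> encode_common_prefix(const std::vector<std::string> &file1,
                                              const std::vector<std::string> &file2)
{
    const std::size_t n = std::min(file1.size(), file2.size());
    std::vector<std::string> file3;
    file3.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        file3.push_back(join_pair(file1[i], file2[i]));
    }
    return file3;
}
```

If dropping entries is not acceptable, throwing an exception or padding the shorter vector with empty strings would be alternatives; which one fits depends on what the encoding is for.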
Finally, some driver code using C++11:
int main()
{
std::vector<std::string> english{"one", "two", "three", "four"};
std::vector<std::string> spanish{"uno", "dos", "tres", "cuatro"};
auto result = Encode(english, spanish);
std::copy(result.begin(), result.end(),
std::ostream_iterator<std::string>(std::cout, " "));
}
Note, too, that I've used push_back to append to the end of the strings and const declarations for the passed strings.

Try these three sets of inputs:
1. file1 is bigger than file2
2. file2 is bigger than file1
3. file1 is equal to file2 in size.
Let us know the cases when the error was reproduced and when it was not.
I think by this stage you will solve the problem by yourself.
If not,
write contents of the (smallest possible) file1 and file2 that reproduced the error.

Some problems enumerated:
-You are assuming that file1 and file2 have the same size, or at least that file1.size() <= file2.size(); otherwise the line if (file2[i].size() < file1[i].size()) { accesses invalid memory. The function does not check for this. At least add an assert or an explicit check.
-You initialize file3 as an empty vector and then index into it later in the function.
-Another problem is what happens when file1[i] and file2[i] have the same length: the if-else covers < and > but not ==.
-You are accessing invalid memory with statements like file3[i][x], because the strings in file3 start out empty and therefore contain no characters.
This is the closest I can get without knowing the exact steps of the encoding algorithm:
#include <cassert>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
std::vector<std::string> Encode(std::vector<std::string> &file1,
std::vector<std::string> &file2) {
assert(file1.size() <= file2.size());
std::vector<std::string> file3(file1.size());
std::string temp;
for (unsigned int i = 0; i < file1.size(); i++) {
for (unsigned int x = 0; x < file1[i].size(); x++) {
int enc = 0;
if (file2[i].size() <= file1[i].size()) {
for (unsigned int t = 0; t < file2[i].size(); t++) {
enc = (int)file1[i][x] + (int)file2[i][t];
}
}
else if (file2[i].size() > file1[i].size()) {
enc = (int)file1[i][x] + (int)file2[i][x];
}
if (enc > 126) {
file3[i] += (enc % 127);
}
else {
file3[i] += (enc + 32);
}
}
}
return file3;
}
int main(int argc, char *argv[]) {
std::vector<std::string> a{ "1", "2", "3" };
std::vector<std::string> b{ "6", "7", "8" };
for (const auto& s : a)
std::cout << s << std::endl;
for (const auto& s : b)
std::cout << s << std::endl;
auto c = Encode(a, b);
for (const auto& s : c)
std::cout << s << std::endl;
return 0;
}
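The root cause of the "vector subscript out of range" message can be seen in isolation: operator[] never creates elements, so any index into an empty vector is invalid. The sketch below uses at(), which checks bounds and throws std::out_of_range, to make that visible; index_is_valid is a hypothetical helper name used only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Returns true if v.at(i) is a valid access, false if it throws std::out_of_range.
bool index_is_valid(const std::vector<std::string> &v, std::size_t i)
{
    try {
        (void)v.at(i);  // at() checks bounds; operator[] does not
        return true;
    } catch (const std::out_of_range &) {
        return false;
    }
}
```

With this helper, index_is_valid(std::vector<std::string>(), 0) is false, while a vector constructed with a size, such as std::vector<std::string>(3), accepts indices 0 through 2.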

Related

Storing indices in a vector<set<int>> vs a vector<vector<int>>

I'm doing a problem on Leetcode called "Number of Matching Subsequences". You are given a string S and a vector of smaller strings, and you have to find out how many of the smaller strings are subsequences of S (that is, substrings that are not necessarily contiguous).
I wrote my code in a certain way, and while it works fine, it was slow enough that the Leetcode judge timed out. Someone else wrote their code almost the same as mine, but it didn't time out. I'm wondering what makes his faster. Here's mine:
class Solution {
public:
int numMatchingSubseq(string S, vector<string>& words) {
int count = 0;
vector<set<int>> Svec (26); // keep track of the indices where characters were seen in S
for (int i = 0; i < S.length(); ++i) Svec[S[i] - 'a'].insert(i);
for (auto & w : words) { // loop over words and characters within words, finding the soonest the next character appears in S
bool succeeded = true;
int current_index = -1;
for (auto & c : w) {
set<int> & c_set = Svec[c - 'a'];
auto it = upper_bound(begin(c_set), end(c_set), current_index);
if (it == end(c_set)) {
succeeded = false;
break;
}
current_index = *it;
} // loop over chars
if (succeeded) count++;
} //loop over words
return count;
}
};
int main() {
string S = "cbaebabacd";
vector<string> words {"abc", "abbd", "bbbbd"};
Solution sol;
cout << sol.numMatchingSubseq(S, words) << endl;
return 0;
}
Outputs
2
Program ended with exit code: 0
His solution stores the indices not in a vector<set<int>>, but in a vector<vector<int>>. I don't see why that would be a big difference.
int numMatchingSubseq (string S, vector<string>& words) {
vector<vector<int>> alpha (26);
for (int i = 0; i < S.size (); ++i) alpha[S[i] - 'a'].push_back (i);
int res = 0;
for (const auto& word : words) {
int x = -1;
bool found = true;
for (char c : word) {
auto it = upper_bound (alpha[c - 'a'].begin (), alpha[c - 'a'].end (), x);
if (it == alpha[c - 'a'].end ()) found = false;
else x = *it;
}
if (found) res++;
}
return res;
}
This is inefficient:
upper_bound(begin(c_set), end(c_set), current_index)
See this note in these std::upper_bound() docs:
for non-LegacyRandomAccessIterators, the number of iterator increments is linear.
You should instead use:
c_set.upper_bound(current_index)
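As a minimal sketch of that suggestion: the member function walks the set's tree directly, so the lookup is logarithmic, whereas std::upper_bound over a set's bidirectional iterators has to step through the elements one increment at a time. The helper name next_index_after and the -1 "not found" value are assumptions made for this example.

```cpp
#include <set>

// Smallest stored index strictly greater than current_index, or -1 if there is none.
int next_index_after(const std::set<int> &indices, int current_index)
{
    std::set<int>::const_iterator it = indices.upper_bound(current_index);  // O(log n) tree search
    return it == indices.end() ? -1 : *it;
}
```

With a sorted vector<int>, as in the faster solution, std::upper_bound is already logarithmic because vector iterators are random access.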

How to display duplicate characters in a string in C++?

I am working on some code for a class that requires me to output duplicates in a string. This string can have any ascii character but the output needs to show only the repeated character and the total number of times it repeats.
Here are some sample inputs and outputs
mom, m:2
taco, No duplicates
good job, o:3
tacocat, t:2 c:2 a:2
My code works for all but the last test case, where t:2 and a:2 appear twice. I have come to the conclusion that I need to store the duplicated characters somewhere and check that list to see whether a duplicate has already been printed, so I tried using a vector.
My method is to push each character into the vector as its duplicate count is printed, and to skip printing a character if it is already in the vector. But I have not been able to find a way to do this. I tried to use find() from #include<algorithm> but got a syntax error that I am unable to fix. Is there a function that I can apply for this? Or am I going about this in a bad way?
I found the implementation of find() here & I looked here but they don't match and it breaks my code completely when I try to apply it.
#include<iostream>
#include<string>
#include<vector>
#include<algorithm>
using namespace std;
vector <char> alreadyprintedcharacters;
void findrepeats(string const&);
int main()
{
string input;
cout << "Enter the input : ";
getline(cin, input);
findrepeats(input);
return 0;
}
void findrepeats(string const &in)
{
int trackerOfDuplicates = 0;
int asciiArray[256];
char ch;
int charconv;
for (int i = 0; i < 256; i++) // creates my refference array for the comparison and sets all the values equal to zero
asciiArray[i] = 0;
for (unsigned int i = 0; i < in.length(); i++)
{
ch = in[i];
charconv = static_cast<int>(ch);
if (asciiArray[charconv] == 0)
{
asciiArray[charconv] = 1;
}
else if (asciiArray[charconv] > 0)
{
asciiArray[charconv] = asciiArray[charconv]++;
}
}
bool trip = false;
for (unsigned int i = 0; i < in.length(); i++)
{
char static alreadyprinted;
char ch = in[i];
if ((asciiArray[ch] > 1) && (ch != alreadyprinted) && (find(alreadyprintedcharacters.begin(), alreadyprintedcharacters.end(), ch)!= alreadyprintedcharacters.end()))// change reflected HERE
{
cout << in[i] << " : " << asciiArray[ch] << endl;//???? maybe a nested loop
trip = true;
alreadyprinted = ch;
alreadyprintedcharacters.push_back(alreadyprinted);
}
}
if (trip == false)
cout << "No repeated characters were found.\n";
}
Your code works fine for me (gives the correct output for tacocat) if you fix the error related to std::find:
std::find doesn't return a bool, it returns an iterator (in your case, a std::vector<char>::iterator). If you want to check if std::find found something, you should compare it to alreadyprintedcharacters.end(), because that's what std::find returns if it didn't find something.
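A small sketch of that comparison, wrapped in a helper so the condition reads as a yes/no question. The name already_printed is hypothetical and not part of the original program.

```cpp
#include <algorithm>
#include <vector>

// True if ch has already been recorded in printed.
bool already_printed(const std::vector<char> &printed, char ch)
{
    // std::find returns printed.end() when ch is absent, so "found" means "not end()".
    return std::find(printed.begin(), printed.end(), ch) != printed.end();
}
```

In the question's loop, the character should be printed only when this returns false, that is, when std::find returned end().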
You can create an integer array of 256 and initialize it to 0 at first. Then loop over characters in the string and increment each index that corresponds to that letter. In the end, you can print out letters that have values greater than 1. Just change your findrepeats function to the following:
void findrepeats(string const &in)
{
    int asciiArray[256] = {0}; // one counter per possible character value
    bool foundAny = false;
    for (unsigned int i = 0; i < in.length(); i++)
    {
        // Cast through unsigned char so characters above 127 cannot yield a negative index.
        asciiArray[static_cast<unsigned char>(in[i])]++;
    }
    for (unsigned int i = 0; i < 256; i++)
    {
        if (asciiArray[i] > 1)
        {
            foundAny = true;
            cout << static_cast<char>(i) << " : " << asciiArray[i] << endl;
        }
    }
    if (!foundAny)
        cout << "No repeated characters were found.\n";
}
You have to make the following changes to your code.
Change the body of the loop that updates the reference array, which currently looks like this:
//your code
else if (asciiArray[charconv] > 0)
{
asciiArray[charconv] = asciiArray[charconv]++;
}
In the above code the value of asciiArray[charconv] doesn't change, because asciiArray[charconv]++ is a post-increment: the expression yields the old value, which is then assigned back (and before C++17 the statement is undefined behaviour). Either change it to a pre-increment, ++asciiArray[charconv];, or write asciiArray[charconv] = asciiArray[charconv]+1;
Here is a link explaining why it doesn't increment.
You can also simplify the loop like this:
for (unsigned int i = 0; i < in.length(); i++)
{
ch = in[i];
charconv = static_cast<int>(ch);
asciiArray[charconv]++;
}
Change the type of found to std::vector<char>::iterator, because find returns an iterator to the first element in the range that compares equal to val; if no elements match, it returns last.
std::vector<char>::iterator found = find(alreadyprintedcharacters.begin(), alreadyprintedcharacters.end(), ch);
Then your condition should look like this:
if((asciiArray[ch] > 1) && (ch!=alreadyprinted) && (found == alreadyprintedcharacters.end()))
I don't quite get why you need all of that code (given you stated you can't use std::map).
You declared an array of 256 and set each item to 0, which is OK:
for (int i = 0; i < 256; i++)
asciiArray[i] = 0;
Now the next step should be simple -- just go through the string, one character at a time, and increment the associated value in your array. You seem to start out this way, then go off on a tangent doing other things:
for (unsigned int i = 0; i < in.length(); i++)
{
ch = in[i]; // ok
asciiArray[ch]++;
We can set a boolean to true if we discover that the character count we just incremented is > 1:
bool dup = false;
for (unsigned int i = 0; i < in.length(); i++)
{
ch = in[i]; // ok
asciiArray[ch]++;
if ( asciiArray[ch] > 1 )
dup = true;
}
That is the entire loop to preprocess the string. Then you need a loop after this to print out the results.
As to printing, just go through your array only if there are duplicates, and you know this by just inspecting the dup value. If the array's value at character i is > 1, you print the information for that character, if not, skip to the next one.
I won't show the code for the last step, since this is homework.
I just ran into a similar question last week. Here is what I did; it may not be the best solution, but it worked well.
string str("aer08%&#&%$$gfdslh6FAKSFH");
vector<char> check;
vector<int> counter;
//subscript is the bridge between check and counter: counter[subscript] stores the number of times check[subscript] appeared
int subscript = 0;
bool charisincheck = false;
for (const auto cstr : str) //read every char in string
{
subscript = 0;
charisincheck = false;
for (const auto ccheck : check) // read every element in charcheck
{
if (cstr == ccheck)//check if the char get from the string had already existed in charcheck
{
charisincheck = true; //if exist, break the for loop
break;
}
subscript++;
}
if (charisincheck == true) //if the char in string, then the count +1
{
counter[subscript] += 1;
}
else //if not, add the new char to check, and also add a counter for this new char
{
check.push_back(cstr);
counter.push_back(1);
}
}
for (decltype(counter.size()) i = 0; i != counter.size(); i++)
{
cout << check[i] << ":" << counter[i] << endl;
}
import java.util.*;
class dublicate{
public static void main(String arg[]){
Scanner sc =new Scanner(System.in);
String str=sc.nextLine();
int d[]=new int[256];
int count=0;
for(int i=0;i<256;i++){
d[i]=0;
}
for(int i=0;i<str.length();i++){
if(d[str.charAt(i)]==0)
for(int j=i+1;j<str.length();j++){
if(str.charAt(i)==str.charAt(j)){
d[str.charAt(i)]++;
}
}
}
for(char i=0;i<256;i++){
if(d[i]>0)
System.out.println(i+" :="+(d[i]+1));
}
}
}
//here simple code for duplicate characters in a string in C++
#include <iostream>
#include <cstring>
using namespace std;
int main(){
    char str[100];
    cin.width(sizeof str); // never read more than fits in str
    cin>>str;
    int d[256];
    for(int k=0;k<256;k++){
        d[k]=0;
    }
    size_t len=strlen(str);
    for(size_t i=0;i<len;i++){
        unsigned char ch=str[i]; // unsigned so it is a valid array index
        if(d[ch]==0)
            for(size_t j=i+1;j<len;j++){
                if(str[i]==str[j]){
                    d[ch]++;
                }
            }
    }
    for(int c=0;c<256;c++){
        if(d[c]>0)
            cout<<(char)c<<" :="<<(d[c]+1)<<"\n";
    }
    return 0;
}

segmentation fault while using custom sorting function to sort strings in a vector<string>

I am trying to sort a vector that contains strings. I have the following code for the main function
int size;
cin >> size;
vector<string> s;
cout << "READ" << endl;
for (int i = 0; i<size; i++)
{
string str;
cin >> str;
s.push_back(str);
}
cout << "READ" << endl;
cout << "Passed" << endl;
sort(s.begin(), s.end(), comp);
for (int i = 0; i<size; i++)
{
string st = s.at(i);
cout << st << endl;
}
return 0;
And the comp function code is
bool comp(string s1, string s2)
{
if (s1.empty() or s2.empty())
return false;
int l = s1.length();
for (int i = 0; i<l; i++)
{
if (s1[i] > s2[i])
return true;
}
return false;
}
But for this input I am getting a segmentation fault
46
lnpxeemwlqlzpxrmrmwbseqfnpkzaafdnukixaopcfvhqw
dhfhhoyhhzleldljmirjbqagcleivzomlpanqzsmqnrzij
zcsrvgqlmrgknqhwtcqzyldjanlczysnspvusziqtazjlu
idiknfqdygrwhvdzperlvgueqhuezsrwzztlodqgipnqzb
zjfyxbghvdecpzhvoxzojcpciaspyoeaetimmoccjqxtmv
mxwnhdyjutecwbrxdjmrbdjvbzprgnekvnvhxnuvekoflo
jjbjxzuaafatzdwlnzcorkiagrwzvrmjqqbdlmgyewzsea
bmvyqojhnbfrypiiwvtgifmqqdcuilohbfvkqjhlcwsfyo
zrbjhsrxnllmsdfqurkjfomwsvgfepwttohojxmrhexpmy
hcdxtucpeptgqhckpdxdcgpvhkiuucvwbuhtmbskqdlasw
rtocxkyrsrbluwvpfkekqkdwncvozfgmcrswpksiqmfnnl
xawlpinqjstxvrqvsugbvszhibbcmbdwktgwjlezakyqrr
cfghwolkahdafrcuufklziipmtkhuxdrxqlavcrxavxuas
plcsutiemkgfunhpyeiuvxwjppzsopglcyhgidsyhjnutp
vyyrbmfyfwpcowlpytmkvsyrzgiausrulsxtwysjgpgtqi
bsoknggdytplubxzjczatotnpovriwibeamjfnyxibvama
imkshtavbjpnkafuxwbzpiqlnnotrxmjepzeuwtuewtqab
ttjzqrcdcofkljaevmauexsxlkrxuanxgrsmsrxckixpoz
aocndkatjggduuyiksgmovthyoomrfsaxlnjouszxxoqtc
ahmkgizkvsbrqyricbtnpvpnibvgvnnrnqphkstvcjsbli
biasqbcofwdgabnipodjkriiyqlhaddpegkmydutcyoksk
avyaodwtgbdsnhheoearlinfcadeteiiudobbvqdqizcry
mhdekyvubghealrenyshjcjuhxxzimsgvukcdfdbjramzq
ayrzjanrebdgowsngullkgyvlgqzjexebleigxvgwjnbyf
vcpnclkhoawabjlhfnrncxfswjjmpxqcwoeqpyaitwdrjf
ghngenuvshwuaubahlzazwmgnsmtzyqfvvoxnhiufhxpac
ljrwslmgjilvdommuvpebcznjalxuazyujtzpewdbxjnwj
jqirnjnheowbioheyleyhkrcyfxuweyipumfojetmvomuz
vnnlsozyplofqkxfwcmlyntfrhspvbscocodlejqrymdeu
lgjcimksyragrhhagkmlnaysfxzswxfkhqzrhjlgkemmhp
weoxhopddcyiiikwblqvvcxcuxkebhywdacpmjrlkosxmw
bwcxxsytqdpybjxyqgmggitkgpkiytnwprsnxrygryxigo
qtwyleqxqflmaudekmdmgscfvjfwkchacxmokxrcfgwnhl
dgcmvhgnzigmrxougsbhwdhugyvloaqlliybbzkttmolln
jqmrfoyhwxbiyvzntvxozfswwjbeybahggfjrrzzhbapyi
oxbjadgrttqnfbevqolflhdpmgwgudhwfeebauqhhygvnt
kwmqirrljycddqcvjanibiarpcjjqiuvkdbdyzogbcixah
yyykebcfsnixcjdbkxtqvqynafmtuvoepeayiaqinvmjen
lsyxwgpfxlfkxckzsjzonxkhullkatmnwwfuicgjzbnvzf
vihglfapunknuitwtcxzdwjyfwqurvsydacylgcyohrbou
olmojrovoqseuqausssdupqzhbmyblomlbbqzwgbtgyiwq
tcshhbdgxsrtxywgqahqfimbnckwdhtbzlpwevuqjyqrbd
vjmcknagopzpwrmrianbgyhyginqduwdfjgmdqttcqroof
srmfsjigydlqlgsmvgqddpqmqkjzptzwdfpjmpnvgaezlx
yphbhtrmqcnrfklqmkblvginnhxxtlnnwcfuwujdqwkvaq
jahvrihhicrqvttmdzwbemjjqnstvtudvifdvrbjxalirj
For many other inputs it gives the correct result, but for the above input I get this output:
READ
READ
Passed
I have searched a lot but cannot find the error in my comp function.
As the others have noted, your comparison function is/was broken.
Here's an example that uses std::greater<std::string>() as a comparison function:
#include <algorithm>
#include <functional>
#include <iostream>
#include <string>
#include <vector>
int main() {
int size;
if (!(std::cin >> size)) return -1;
std::vector<std::string> v;
for (int i = 0; i < size; i++) {
std::string str;
if (!(std::cin >> str)) return -1;
v.push_back(str);
}
std::sort(v.begin(), v.end(), std::greater<std::string>());
for (int i = 0; i < v.size(); i++) {
std::cout << v[i] << std::endl;
}
return 0;
}
The output is sorted in descending order (test.txt containing above input):
$ g++ test.cc && ./a.out < test.txt
zrbjhsrxnllmsdfqurkjfomwsvgfepwttohojxmrhexpmy
zjfyxbghvdecpzhvoxzojcpciaspyoeaetimmoccjqxtmv
zcsrvgqlmrgknqhwtcqzyldjanlczysnspvusziqtazjlu
yyykebcfsnixcjdbkxtqvqynafmtuvoepeayiaqinvmjen
yphbhtrmqcnrfklqmkblvginnhxxtlnnwcfuwujdqwkvaq
xawlpinqjstxvrqvsugbvszhibbcmbdwktgwjlezakyqrr
weoxhopddcyiiikwblqvvcxcuxkebhywdacpmjrlkosxmw
vyyrbmfyfwpcowlpytmkvsyrzgiausrulsxtwysjgpgtqi
vnnlsozyplofqkxfwcmlyntfrhspvbscocodlejqrymdeu
vjmcknagopzpwrmrianbgyhyginqduwdfjgmdqttcqroof
vihglfapunknuitwtcxzdwjyfwqurvsydacylgcyohrbou
vcpnclkhoawabjlhfnrncxfswjjmpxqcwoeqpyaitwdrjf
ttjzqrcdcofkljaevmauexsxlkrxuanxgrsmsrxckixpoz
tcshhbdgxsrtxywgqahqfimbnckwdhtbzlpwevuqjyqrbd
srmfsjigydlqlgsmvgqddpqmqkjzptzwdfpjmpnvgaezlx
rtocxkyrsrbluwvpfkekqkdwncvozfgmcrswpksiqmfnnl
qtwyleqxqflmaudekmdmgscfvjfwkchacxmokxrcfgwnhl
plcsutiemkgfunhpyeiuvxwjppzsopglcyhgidsyhjnutp
oxbjadgrttqnfbevqolflhdpmgwgudhwfeebauqhhygvnt
olmojrovoqseuqausssdupqzhbmyblomlbbqzwgbtgyiwq
mxwnhdyjutecwbrxdjmrbdjvbzprgnekvnvhxnuvekoflo
mhdekyvubghealrenyshjcjuhxxzimsgvukcdfdbjramzq
lsyxwgpfxlfkxckzsjzonxkhullkatmnwwfuicgjzbnvzf
lnpxeemwlqlzpxrmrmwbseqfnpkzaafdnukixaopcfvhqw
ljrwslmgjilvdommuvpebcznjalxuazyujtzpewdbxjnwj
lgjcimksyragrhhagkmlnaysfxzswxfkhqzrhjlgkemmhp
kwmqirrljycddqcvjanibiarpcjjqiuvkdbdyzogbcixah
jqmrfoyhwxbiyvzntvxozfswwjbeybahggfjrrzzhbapyi
jqirnjnheowbioheyleyhkrcyfxuweyipumfojetmvomuz
jjbjxzuaafatzdwlnzcorkiagrwzvrmjqqbdlmgyewzsea
jahvrihhicrqvttmdzwbemjjqnstvtudvifdvrbjxalirj
imkshtavbjpnkafuxwbzpiqlnnotrxmjepzeuwtuewtqab
idiknfqdygrwhvdzperlvgueqhuezsrwzztlodqgipnqzb
hcdxtucpeptgqhckpdxdcgpvhkiuucvwbuhtmbskqdlasw
ghngenuvshwuaubahlzazwmgnsmtzyqfvvoxnhiufhxpac
dhfhhoyhhzleldljmirjbqagcleivzomlpanqzsmqnrzij
dgcmvhgnzigmrxougsbhwdhugyvloaqlliybbzkttmolln
cfghwolkahdafrcuufklziipmtkhuxdrxqlavcrxavxuas
bwcxxsytqdpybjxyqgmggitkgpkiytnwprsnxrygryxigo
bsoknggdytplubxzjczatotnpovriwibeamjfnyxibvama
bmvyqojhnbfrypiiwvtgifmqqdcuilohbfvkqjhlcwsfyo
biasqbcofwdgabnipodjkriiyqlhaddpegkmydutcyoksk
ayrzjanrebdgowsngullkgyvlgqzjexebleigxvgwjnbyf
avyaodwtgbdsnhheoearlinfcadeteiiudobbvqdqizcry
aocndkatjggduuyiksgmovthyoomrfsaxlnjouszxxoqtc
ahmkgizkvsbrqyricbtnpvpnibvgvnnrnqphkstvcjsbli
For reference:
http://en.cppreference.com/w/cpp/utility/functional/greater
http://en.cppreference.com/w/cpp/utility/functional/less
Your comp method is broken.
For the two strings "ac" and "ca", comp("ac", "ca") is true, and so is comp("ca", "ac"). That is because you move on to the next position when s1[i] < s2[i], when you should immediately return false.
And as you were told in the comments, you should cope with strings of different lengths by using the length of the shorter one, to avoid accessing past the end of a string.
So your comp function should be:
bool comp(string s1,string s2)
{
// empty string comes last
if(s1.empty()) return false;
if(s2.empty()) return true;
// limit to shorter string length
unsigned int l = s1.length();
if (s2.length() < l) l = s2.length();
for(unsigned int i=0;i<l;i++)
{
// if chars are different at position i return immediately
if(s1[i] > s2[i])
return true;
if(s1[i] < s2[i])
return false;
}
// shorter string comes last; strict > keeps comp(s, s) false for equal strings
return s1.length() > s2.length();
}
This condition:
if (s1.empty() or s2.empty())
return false;
does not satisfy the strict weak ordering requirement on the comp function (from https://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings):
If x < y, then for all z, either x < z or z < y or both.
In your case, if z is an empty string and x and y are not, neither x < z nor z < y holds.
A possible implementation is:
if( s2.empty() ) return !s1.empty();
if( s1.empty() ) return false;
But I do not see why you cannot use std::greater, std::string::compare or std::string::operator>() instead of a manual implementation.
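To make the requirement concrete, here is a rough sanity check that can be run against any candidate comparator. It tests only irreflexivity and asymmetry on a handful of samples, which are necessary but not sufficient conditions for a strict weak ordering. All three function names are hypothetical, and question_comp is a bounds-checked variant of the comparator from the question, not the original code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Descending order using the built-in lexicographic comparison.
bool descending(const std::string &a, const std::string &b)
{
    return a > b;
}

// Bounds-checked variant of the question's comparator (keeps its logic error).
bool question_comp(const std::string &s1, const std::string &s2)
{
    if (s1.empty() || s2.empty()) return false;
    for (std::size_t i = 0; i < s1.size() && i < s2.size(); ++i) {
        if (s1[i] > s2[i]) return true;
    }
    return false;
}

// Necessary (not sufficient) checks for a strict weak ordering on the given samples.
template <typename Compare>
bool looks_like_strict_order(const std::vector<std::string> &samples, Compare comp)
{
    for (std::size_t i = 0; i < samples.size(); ++i) {
        if (comp(samples[i], samples[i])) return false;  // irreflexivity violated
        for (std::size_t j = 0; j < samples.size(); ++j) {
            // asymmetry violated: a "before" b and b "before" a at the same time
            if (comp(samples[i], samples[j]) && comp(samples[j], samples[i])) return false;
        }
    }
    return true;
}
```

On samples containing "ac" and "ca", descending passes the check while question_comp fails it, because it reports each of the two strings as ordered before the other.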

issue to populate an array of strings or array of char*

Very basic C++ question. Looks like I'm really rusty here...
All I want to do is read an array of X strings from a file and create an array of X vertical strings from the horizontal ones.
i.e.:
file contains:
azert
qsdfg
wxcvb
poiuy
mlkjh
I want to create a string array containing:
aqwpm
zsxol
edcol
rfvuj
tgbyh
Here is what I tried so far:
[bad code]
const int SIZE = 37;
std::string table_h[SIZE];
std::string table_v[SIZE];
int i = 0;
while (source >> table_h[i]) //,sizeof table_h[i]
{
for (int j = 0; j< SIZE; j++)
{
table_v[j][i] = table_h[i][j];
}
i++;
}
-> works fine for the first line, breaks when i=1. I don't understand why.
I noticed that table_v[0][0] = 'f'; works fine,
although both table_v[0][36] = 'f'; and table_h[0].at(36); break.
With char * (which was my first idea),
char * table_h[SIZE];
char * table_v[SIZE];
something like
table_v[0][0] = 'f';
immediately breaks.
I suppose I need to allocate memory or initialize something first??
Thx in advance.
You should set the size of the strings before using operator [] to access them. Resizing table_h is optional, but you definitely have to resize table_v.
const int SIZE = 37;
std::string table_h[SIZE];
std::string table_v[SIZE];
for (size_t i = 0; i < SIZE; ++i)
{
table_h[i].resize(SIZE);
table_v[i].resize(SIZE);
}
int i = 0;
while (source >> table_h[i])
{
for (int j = 0; j < SIZE; j++)
{
table_v[j][i] = table_h[i][j];
}
i++;
}
See the working example.
In my opinion, if you know the size of the strings, resizing is better than appending. It can save some memory reallocations, and IMHO it is simply a nicer solution.
Indeed the table_v[j] is an empty string.
The string needs to allocate space for the characters. This is not done by the index operators, i.e.
table_v[j][9] = 'a';
assumes enough space is allocated for table_v[j].
You can append to the initially empty string instead. append has no overload that takes just a single char, so instead of indexing with table_h[i][j] you can use substr (push_back, or append(1, c), would also work):
std::string to_append = table_h[i].substr(j, 1);
table_v[j].append(to_append);
This also relieves you of the i counter.
Here is a demonstrative program that shows how it can be done
#include <iostream>
#include <vector>
#include <string>
#include <numeric>
int main()
{
std::vector<std::string> v1 =
{
"azert", "qsdfg", "wxcvb", "poiuy", "mlkjh"
};
for ( const std::string &s : v1 ) std::cout << s << ' ';
std::cout << std::endl;
auto max_size = std::accumulate( v1.begin(), v1.end(),
size_t( 0 ),
[]( size_t acc, const std::string &s )
{
return acc < s.size() ? s.size() : acc;
} );
std::vector<std::string> v2( max_size );
for ( const std::string &s : v1 )
{
for ( std::string::size_type i = 0; i < s.size(); i++ )
{
v2[i].push_back( s[i] );
}
}
for ( const std::string &s : v2 ) std::cout << s << ' ';
std::cout << std::endl;
return 0;
}
The program output is
azert qsdfg wxcvb poiuy mlkjh
aqwpm zsxol edcik rfvuj tgbyh
As for your code, these statements
std::string table_h[SIZE];
std::string table_v[SIZE];
define two arrays of empty strings, so you may not use the subscript operator to write characters into them. You could use, for example, the member function push_back:
for (int j = 0; j< SIZE; j++)
{
table_v[j].push_back( table_h[i][j] );
}
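If you would rather keep the table_v[j][i] = ... style of the original code, the strings have to exist at full length before they are indexed. A minimal sketch follows, assuming every row is as long as the first one (shorter rows would leave padding spaces in the output). The function name transpose is hypothetical, and std::vector is used in place of the fixed-size arrays.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Transpose equally long rows so that result[j][i] == rows[i][j].
std::vector<std::string> transpose(const std::vector<std::string> &rows)
{
    if (rows.empty()) return std::vector<std::string>();
    const std::size_t width = rows[0].size();
    // Each output string is created with rows.size() characters up front,
    // so result[j][i] below always refers to existing storage.
    std::vector<std::string> result(width, std::string(rows.size(), ' '));
    for (std::size_t i = 0; i < rows.size(); ++i) {
        for (std::size_t j = 0; j < width && j < rows[i].size(); ++j) {
            result[j][i] = rows[i][j];
        }
    }
    return result;
}
```

The same idea answers the char * variant in the question: an uninitialized char * points at no storage at all, so memory has to be allocated before any table_v[0][0] = 'f'; can work.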

Number of characters matching between two strings in C++

I am building a small project for spelling correction, this is not homework.
Given two strings str1 and str2. One has to find out the number of characters matching between two strings.
For example if str1 = "assign" and str2 = "assingn", then the output should be 6.
In str2, characters, "a", "s", "s", "i", "g", "n" are there in str1, "assign". Thus output should be 6.
If str1 = "sisdirturn" and str2 = "disturb", then output should be 6.
In the str2, characters, "d", "i", "s", "t", "u", "r" are there in string str1, "sisdirturn". Thus output should be 6.
I've tried many attempts, however I am unable to get the answer. Kindly help to sort this out and if there is any idea to improve upon this, do tell.
Here is my attempt so far:
int char_match (string str1, string str2)
{
//Take two strings, split them into vector of characters and sort them.
int i, j, value = 0;
vector <char> size1, size2;
char* cstr1 = new char[str1.length() + 1];
strcpy(cstr1, str1.c_str());
char* cstr2 = new char[str2.length() + 1];
strcpy(cstr2, str2.c_str());
for(i = 0, j = 0 ; i < strlen(cstr1), j < strlen(cstr2); i++, j++)
{
size1.push_back( cstr1[i] );
size2.push_back( cstr2[j] );
}
sort (size1.begin(), size1.end() );
sort (size2.begin(), size2.end() );
//Start from beginning of two vectors. If characters are matched, pop them and reset the counters.
i = 0;
j = 0;
while ( !size1.empty() )
{
out :
while ( !size2.empty() )
{
if (size1[i] == size2[j])
{
value++;
pop_front(size1);
pop_front(size2);
i = 0;
j = 0;
goto out;
}
j++;
}
i++;
}
return value;
}
#include <iostream>
#include <string>
#include <iterator> // back_inserter
#include <algorithm> // sort, set_intersection
std::string::size_type matching_characters(std::string s1, std::string s2) {
sort(begin(s1), end(s1));
sort(begin(s2), end(s2));
std::string intersection;
std::set_intersection(begin(s1), end(s1), begin(s2), end(s2),
back_inserter(intersection));
return intersection.size();
}
int main() {
std::cout << matching_characters("assign", "assingn") << '\n'; // 6
std::cout << matching_characters("sisdirturn", "disturb") << '\n'; // 6
}
The above uses sort, so it has O(N*log N) performance, if that matters. If all your inputs are small, this may still be faster than the second solution.
Sora's solution has better complexity, and can also be implemented concisely using standard <algorithm>s:
#include <iostream>
#include <string>
#include <iterator> // begin, end
#include <functional> // plus
#include <algorithm> // for_each, min
#include <numeric> // inner_product
int matching_characters(std::string const &s1, std::string const &s2) {
int s1_char_frequencies[256] = {};
int s2_char_frequencies[256] = {};
for_each(begin(s1), end(s1),
[&](unsigned char c) { ++s1_char_frequencies[c]; });
for_each(begin(s2), end(s2),
[&](unsigned char c) { ++s2_char_frequencies[c]; });
return std::inner_product(std::begin(s1_char_frequencies),
std::end(s1_char_frequencies),
std::begin(s2_char_frequencies), 0, std::plus<>(),
[](auto l, auto r) { return std::min(l, r); });
}
int main() {
std::cout << matching_characters("assign", "assingn") << '\n'; // 6
std::cout << matching_characters("sisdirturn", "disturb") << '\n'; // 6
}
I'm using C++14 features, such as generic lambdas, for convenience. You may have to make some modifications if your compiler doesn't support C++14.
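For a compiler limited to C++11, one possible adaptation is sketched below. It keeps the same frequency-counting idea but replaces the generic lambda and std::plus<> with plain loops. The name matching_characters_cxx11 is made up for this example.

```cpp
#include <algorithm>
#include <string>

// Counts the characters common to both strings, respecting multiplicity.
int matching_characters_cxx11(const std::string &s1, const std::string &s2)
{
    int freq1[256] = {};
    int freq2[256] = {};
    // Iterating as unsigned char keeps every index inside 0..255.
    for (unsigned char c : s1) ++freq1[c];
    for (unsigned char c : s2) ++freq2[c];
    int total = 0;
    for (int i = 0; i < 256; ++i) {
        // A character can be matched only as often as it occurs in both strings.
        total += std::min(freq1[i], freq2[i]);
    }
    return total;
}
```

For the two examples from the question, "assign" against "assingn" and "sisdirturn" against "disturb", this returns 6 in both cases.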
For me the solution using sort and set_intersection takes about 1/4th the time as the other solution for these inputs. That's because sorting and iterating over arrays of 6 or 7 elements can be faster than having to walk over arrays of 256 elements.
sort/set_intersection (3667ns) vs. for_each/inner_product (16,363ns)
Once the input is large enough the speed advantage will tip the other way. Furthermore, at the point where the input is too large to take advantage of the small-string optimization then the sort/set_intersection method will start doing expensive memory allocations.
Of course this performance result is highly implementation dependent, so if the performance of this routine matters you'll have to test it yourself on your target implementation with real input. If it doesn't matter then the O(N) solution is the better choice.
I am not 100% sure what you are actually trying to achieve, but if you want to see how many characters match at the same position in both words, it is a simple case of running a loop over them and adding 1 every time you find a match, like this:
int char_match (string str1, string str2)
{
//Compare the strings position by position, up to the length of the shorter one.
unsigned int matches = 0;
unsigned int stringLength = (str1.length() > str2.length()) ? str2.length() : str1.length();
for(unsigned int i = 0; i < stringLength; ++i)
{
if(str1[i] == str2[i])
{
++matches;
}
}
return matches;
}
But from your code it looks like you want to find out how many characters the two strings have in common, ignoring the actual position of each character. That is a rather different process, something along the lines of this:
int char_match (string str1, string str2)
{
    unsigned int str1CharCount[256] = {0};
    unsigned int str2CharCount[256] = {0};
    unsigned int matches = 0;
    // Count how often each character value occurs in each string.
    for(unsigned int i = 0; i < str1.length(); ++i)
    {
        ++str1CharCount[static_cast<unsigned char>(str1[i])];
    }
    for(unsigned int i = 0; i < str2.length(); ++i)
    {
        ++str2CharCount[static_cast<unsigned char>(str2[i])];
    }
    // A character matches as often as it occurs in both strings, i.e. the smaller count.
    for(unsigned int i = 0; i < 256; ++i)
    {
        matches += (str1CharCount[i] < str2CharCount[i]) ? str1CharCount[i] : str2CharCount[i];
    }
    return matches;
}
Please note that there are probably more efficient ways of writing this second function, but it should work all the same.
EDIT:
This code should do what you wanted. The main difference is that it checks the ASCII value to make sure the character is a letter, and counts lower-case and upper-case letters together:
int char_match (string str1, string str2)
{
    unsigned int str1CharCount[256] = {0};
    unsigned int str2CharCount[256] = {0};
    unsigned int matches = 0;
    // Count letters only, folding lower case to upper case ('a' - 32 == 'A' in ASCII).
    for(unsigned int i = 0; i < str1.length(); ++i)
    {
        unsigned char aValue = static_cast<unsigned char>(str1[i]);
        if(aValue >= 'a' && aValue <= 'z')
        {
            ++str1CharCount[aValue - 32];
        }
        else if(aValue >= 'A' && aValue <= 'Z')
        {
            ++str1CharCount[aValue];
        }
    }
    // Apply the same filtering and folding to the second string.
    for(unsigned int i = 0; i < str2.length(); ++i)
    {
        unsigned char aValue = static_cast<unsigned char>(str2[i]);
        if(aValue >= 'a' && aValue <= 'z')
        {
            ++str2CharCount[aValue - 32];
        }
        else if(aValue >= 'A' && aValue <= 'Z')
        {
            ++str2CharCount[aValue];
        }
    }
    // After folding, only the upper-case slots can be non-zero.
    for(unsigned int i = 'A'; i <= 'Z'; ++i)
    {
        matches += (str1CharCount[i] < str2CharCount[i]) ? str1CharCount[i] : str2CharCount[i];
    }
    return matches;
}