Binary search on C++ string does not work

Binary search on C++ string does not work - c++

What is wrong with the following code? How come it does not found the letter using my implementation of a binary search?
#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>
#include <cwctype>
using namespace std;
bool contains(string s, char a){
int m = 0;
int n = s.length()-1;
while (m != n) {
int k = (m + n) / 2;
if (s[k] == a)
return true;
if (s[k] < a) {
n = k - 1;
} else {
m=k + 1;
}
}
return false;
}
int main() {
string s = "miyvarxarmaiko";
char a = 'm';
if (contains(s,a) == true) {
cout << "s contains character a" << endl;
} else {
cout << "does not contain" << endl;
}
return 0;
}

A prerequisite for binary search is that the array should be sorted.
To sort the string s you can do sort(s.begin(),s.end());
There are a few more bugs in your implementation:
if (s[k] < a) {
n = k - 1; // Incorrect..should be m = k + 1
} else {
m= k + 1; // Incorrect..should be n = k - 1
}
Why?
When the key is greater than the middle element you need to narrow the search to the right half of the middle element and you do that changing the low(in your case m) to mid+1 (in your case k+1). Similarly the other case needs to be changed as well.
and
while (m!=n)
should be
while (m<=n)
Why?
Consider the case of searching char 'a' in the string "a". Both your m and n will be 0 and so will be k. In your case the while loop is not entered at all. So in general when the search narrows down to one element and that happens to be your key, your existing code will fail.
Tip:
Your variable name selection is not good. It better to use names such as low, high and mid.

Related

Displaying sets and subsets of a string

I am trying to find the subsets and output them with binary representation.
EXAMPLE:
000:EMPTY
001:C
010:B
011:B C
100:A
101:A C
110:A B
111:A B C
I have the following code that finds all subsets but not sure about the binary?
#include <iostream>
using namespace std;
void recsub(string sofar, string rest){
if(rest=="") cout<<sofar<<endl;
else{
recsub(sofar+rest[0], rest.substr(1)); //including first letter
recsub(sofar, rest.substr(1)); //recursion without including first letter.
}
}
void listsub(string str){
recsub("",str);
}
int main(){
listsub("abc");
return 0;
}

You can do it other way around, generate mask first and then generate matching string.
#include <iostream>
using namespace std;
void listsub(const string &str) {
auto n = str.size();
// iterate over every binary representation
for (size_t k = 0; k < (1u << n); ++k) {
string binary;
string subset;
// iterate over every bit from right to left
for (size_t bit = n ; bit > 0; --bit) {
bool has_bit = (k & (1 << (bit - 1)));
binary.push_back(has_bit ? '1' : '0');
if (has_bit) {
subset.push_back(str[n - bit]);
}
}
cout << binary << ':' << subset << endl;
}
}
int main() {
listsub("abc");
return 0;
}

Combination of unique elements without repetition

Elements : a b c
all combinations in such a way:abcabacbcabc
Formula to get total number of combinations of unique elements without repetition = 2^n - 1 (where n is the number of unique elements)
In our case top: 2^3 - 1 = 7
Another formula to get the combinations with specific length = n!/(r! * (n - r)!) (where n= nb of unique items and r=length)
Example for our the above case with r=2 : 3!/(2! * 1!) = 3 which is ab ac bc
Is there any algorithm or function that gets all of the 7 combinations?
I searched a lot but all i can find is one that gets the combinations with specific length only.
UPDATE:
This is what I have so far but it only gets combination with specific length:
void recur(string arr[], string out, int i, int n, int k, bool &flag)
{
flag = 1;
// invalid input
if (k > n)
return;
// base case: combination size is k
if (k == 0) {
flag = 0;
cout << out << endl;
return;
}
// start from next index till last index
for (int j = i; j < n; j++)
{
recur(arr, out + " " + arr[j], j + 1, n, k - 1,flag);
}
}

The best algorithm I've ever find to resolve this problem is to use bitwise operator. You simply need to start counting in binary. 1's in binary number means that you have to show number.
e.g.
in case of string "abc"
number , binary , string
1 , 001 , c
2 , 010 , b
3 , 011 , bc
4 , 100 , a
5 , 101 , ac
6 , 110 , ab
7 , 111 , abc
This is the best solution I've ever find. you can do it simply with loop. there will not be any memory issue.
here is the code
#include <iostream>
#include <string>
#include <math.h>
#include<stdio.h>
#include <cmath>
using namespace std;
int main()
{
string s("abcd");
int condition = pow(2, s.size());
for( int i = 1 ; i < condition ; i++){
int temp = i;
for(int j = 0 ; j < s.size() ; j++){
if (temp & 1){ // this condition will always give you the most right bit of temp.
cout << s[j];
}
temp = temp >> 1; //this statement shifts temp to the right by 1 bit.
}
cout<<endl;
}
return 0;
}

Do a simple exhaustive search.
#include <iostream>
#include <string>
using namespace std;
void exhaustiveSearch(const string& s, int i, string t = "")
{
if (i == s.size())
cout << t << endl;
else
{
exhaustiveSearch(s, i + 1, t);
exhaustiveSearch(s, i + 1, t + s[i]);
}
}
int main()
{
string s("abc");
exhaustiveSearch(s, 0);
}
Complexity: O(2^n)

Here's an answer using recursion, which will take any number of elements as strings:
#include <vector>
#include <string>
#include <iostream>
void make_combos(const std::string& start,
const std::vector<std::string>& input,
std::vector<std::string>& output)
{
for(size_t i = 0; i < input.size(); ++i)
{
auto new_string = start + input[i];
output.push_back(new_string);
if (i + 1 == input.size()) break;
std::vector<std::string> new_input(input.begin() + 1 + i, input.end());
make_combos(new_string, new_input, output);
}
}
Now you can do:
int main()
{
std::string s {};
std::vector<std::string> output {};
std::vector<std::string> input {"a", "b", "c"};
make_combos(s, input, output);
for(auto i : output) std::cout << i << std::endl;
std::cout << "There are " << output.size()
<< " unique combinations for this input." << std::endl;
return 0;
}
This outputs:
a
ab
abc
ac
b
bc
c
There are 7 unique combinations for this input.

What's wrong with my dynamic programming algorithm with memoization?

*Sorry about my poor English. If there is anything that you don't understand, please tell me so that I can give you more information that 'make sence'.
**This is first time asking question in Stackoverflow. I've searched some rules for asking questions correctly here, but there should be something I missed. I welcome all feedback.
I'm currently solving algorithm problems to improve my skill, and I'm struggling with one question for three days. This question is from https://algospot.com/judge/problem/read/RESTORE , but since this page is in KOREAN, I tried to translate it in English.
Question
If there are 'k' pieces of partial strings given, calculate shortest string that includes all partial strings.
All strings consist only lowercase alphabets.
If there are more than 1 result strings that satisfy all conditions with same length, choose any string.
Input
In the first line of input, number of test case 'C'(C<=50) is given.
For each test case, number of partial string 'k'(1<=k<=15) is given in the first line, and in next k lines partial strings are given.
Length of partial string is between 1 to 40.
Output
For each testcase, print shortest string that includes all partial strings.
Sample Input
3
3
geo
oji
jing
2
world
hello
3
abrac
cadabra
dabr
Sample Output
geojing
helloworld
cadabrac
And here is my code. My code seems to work perfect with Sample Inputs, and when I made test inputs for my own and tested, everything worked fine. But when I submit this code, they say my code is 'wrong'.
Please tell me what is wrong with my code. You don't need to tell me whole fixed code, I just need sample inputs that causes error with my code. Added code description to make my code easier to understand.
Code Description
Saved all input partial strings in vector 'stringParts'.
Saved current shortest string result in global variable 'answer'.
Used 'cache' array for memoization - to skip repeated function call.
Algorithm I designed to solve this problem is divided into two function -
restore() & eraseOverlapped().
restore() function calculates shortest string that includes all partial strings in 'stringParts'.
Result of resotre() is saved in 'answer'.
For restore(), there are three parameters - 'curString', 'selected' and 'last'.
'curString' stands for currently selected and overlapped string result.
'selected' stands for currently selected elements of 'stringParts'. Used bitmask to make my algorithm concise.
'last' stands for last selected element of 'stringParts' for making 'curString'.
eraseOverlapped() function does preprocessing - it deletes elements of 'stringParts' that can be completly included to other elements before executing restore().
#include <algorithm>
#include <iostream>
#include <vector>
#include <cstring>
#include <string>
#define MAX 15
using namespace std;
int k;
string answer; // save shortest result string
vector<string> stringParts;
bool cache[MAX + 1][(1 << MAX) + 1]; //[last selected string][set of selected strings in Bitmask]
void restore(string curString, int selected=0, int last=0) {
//base case 1
if (selected == (1 << k) - 1) {
if (answer.empty() || curString.length() < answer.length())
answer = curString;
return;
}
//base case 2 - memoization
bool& ret = cache[last][selected];
if (ret != false) return;
for (int next = 0; next < k; next++) {
string checkStr = stringParts[next];
if (selected & (1 << next)) continue;
if (curString.empty())
restore(checkStr, selected + (1 << next), next + 1);
else {
int check = false;
//count max overlapping area of two strings and overlap two strings.
for (int i = (checkStr.length() > curString.length() ? curString.length() : checkStr.length())
; i > 0; i--) {
if (curString.substr(curString.size()-i, i) == checkStr.substr(0, i)) {
restore(curString + checkStr.substr(i, checkStr.length()-i), selected + (1 << next), next + 1);
check = true;
break;
}
}
if (!check) { // if there aren't any overlapping area
restore(curString + checkStr, selected + (1 << next), next + 1);
}
}
}
ret = true;
}
//check if there are strings that can be completely included by other strings, and delete that string.
void eraseOverlapped() {
//arranging string vector in ascending order of string length
int vectorLen = stringParts.size();
for (int i = 0; i < vectorLen - 1; i++) {
for (int j = i + 1; j < vectorLen; j++) {
if (stringParts[i].length() < stringParts[j].length()) {
string temp = stringParts[i];
stringParts[i] = stringParts[j];
stringParts[j] = temp;
}
}
}
//deleting included strings
vector<string>::iterator iter;
for (int i = 0; i < vectorLen-1; i++) {
for (int j = i + 1; j < vectorLen; j++) {
if (stringParts[i].find(stringParts[j]) != string::npos) {
iter = stringParts.begin() + j;
stringParts.erase(iter);
j--;
vectorLen--;
}
}
}
}
int main(void) {
int C;
cin >> C; // testcase
for (int testCase = 0; testCase < C; testCase++) {
cin >> k; // number of partial strings
memset(cache, false, sizeof(cache)); // initializing cache to false
string inputStr;
for (int i = 0; i < k; i++) {
cin >> inputStr;
stringParts.push_back(inputStr);
}
eraseOverlapped();
k = stringParts.size();
restore("");
cout << answer << endl;
answer.clear();
stringParts.clear();
}
}

After determining which string-parts can be removed from the list since they are contained in other string-parts, one way to model this problem might be as the "taxicab ripoff problem" problem (or Max TSP), where each potential length reduction by overlap is given a positive weight. Considering that the input size in the question is very small, it seems likely that they expect a near brute-force solution, with possibly some heuristic and backtracking or other form of memoization.

Thanks Everyone who tried to help me solve this problem. I actually solved this problem with few changes on my previous algorithm. These are main changes.
In my previous algorithm I saved result of restore() in global variable 'answer' since restore() didn't return anything, but in new algorithm since restore() returns mid-process answer string I no longer need to use 'answer'.
Used string type cache instead of bool type cache. I found out using bool cache for memoization in this algorithm was useless.
Deleted 'curString' parameter from restore(). Since what we only need during recursive call is one previously selected partial string, 'last' can replace role of 'curString'.
CODE
#include <algorithm>
#include <iostream>
#include <vector>
#include <cstring>
#include <string>
#define MAX 15
using namespace std;
int k;
vector<string> stringParts;
string cache[MAX + 1][(1 << MAX) + 1];
string restore(int selected = 0, int last = -1) {
if (selected == (1 << k) - 1) {
return stringParts[last];
}
if (last == -1) {
string ret = "";
for (int next = 0; next < k; next++) {
string resultStr = restore(selected + (1 << next), next);
if (ret.empty() || ret.length() > resultStr.length())
ret = resultStr;
}
return ret;
}
string& ret = cache[last][selected];
if (!ret.empty()) {
cout << "cache used in [" << last << "][" << selected << "]" << endl;
return ret;
}
string curString = stringParts[last];
for (int next = 0; next < k; next++) {
if (selected & (1 << next)) continue;
string checkStr = restore(selected + (1 << next), next);
int check = false;
string resultStr;
for (int i = (checkStr.length() > curString.length() ? curString.length() : checkStr.length())
; i > 0; i--) {
if (curString.substr(curString.size() - i, i) == checkStr.substr(0, i)) {
resultStr = curString + checkStr.substr(i, checkStr.length() - i);
check = true;
break;
}
}
if (!check)
resultStr = curString + checkStr;
if (ret.empty() || ret.length() > resultStr.length())
ret = resultStr;
}
return ret;
}
void EraseOverlapped() {
int vectorLen = stringParts.size();
for (int i = 0; i < vectorLen - 1; i++) {
for (int j = i + 1; j < vectorLen; j++) {
if (stringParts[i].length() < stringParts[j].length()) {
string temp = stringParts[i];
stringParts[i] = stringParts[j];
stringParts[j] = temp;
}
}
}
vector<string>::iterator iter;
for (int i = 0; i < vectorLen - 1; i++) {
for (int j = i + 1; j < vectorLen; j++) {
if (stringParts[i].find(stringParts[j]) != string::npos) {
iter = stringParts.begin() + j;
stringParts.erase(iter);
j--;
vectorLen--;
}
}
}
}
int main(void) {
int C;
cin >> C;
for (int testCase = 0; testCase < C; testCase++) {
cin >> k;
for (int i = 0; i < MAX + 1; i++) {
for (int j = 0; j < (1 << MAX) + 1; j++)
cache[i][j] = "";
}
string inputStr;
for (int i = 0; i < k; i++) {
cin >> inputStr;
stringParts.push_back(inputStr);
}
EraseOverlapped();
k = stringParts.size();
string resultStr = restore();
cout << resultStr << endl;
stringParts.clear();
}
}
This algorithm is much slower than the 'ideal' algorithm that the book I'm studying suggests, but it was fast enough to pass this question's time limit.

Using vectors to solve the anagrams in C++

I have a function that takes in two vectors of strings and compares each element to see if they are anagrams of one another.
Vector #1: "bat", "add", "zyz", "aaa"
Vector #2: "tab", "dad", "xyx", "bbb"
Restrictions and other things to clarify: The function is supposed to loop through both vectors and compare the strings. I am only supposed to compare based on the index of each vector; meaning I only compare the strings which are in the first index, then the strings which are in the second index, and so on. It's safe to assume that the vectors passed in as parameters will always be the same size.
If the compared strings are anagrams, "Match" is printed on the screen. If they aren't, "No Match" is printed.
Output: Match Match No Match No Match
I'm getting ridiculously stuck on this problem, I know how to reverse strings but when it gets to this I'm getting a bit clueless.
I understand that I would need to iterate through each vector, and then compare. But how would I be able to compare each letter within the string? Also, I'm not allowed to include anything else like algorithm, sort, or set. I've tried digging through a lot of questions but most answers utilized this.
If there are any tips on how to solve this, that would be great. I'll be posting what I find shortly.
Here's what I got so far:
#include <iostream>
#include <vector>
#include <string>
using namespace std;
void anagrams(const vector<string>& vOne, const vector<string>& vTwo){
for(int i=0; i< vOne.size(); i++){
for(int j=0; j< vTwo.size(); j++){
if(vOne[i].size() != vTwo[j].size()){
cout << 0 << endl;
}
else {
cout << 1 << endl;
}
}
}
}
void quicksort(vector<int>& a, int low, int high){
if(low < high)
{
int mid = (low + high)/2;
int pivot = a[mid];
swap(a[high], a[mid]);
int i, j;
for(i=low, j=high-1; ;){
while(a[i]<pivot) ++i;
while(j>i && pivot < a[j]) --j;
if (i < j)
swap(a[i++], a[j--]);
else
break;
}
swap(a[i], a[high]);
}
quicksort(a, low, i - 1);
quicksort(a, i + 1, high);
}
Thanks in advance!

Though you are not able to use sort, you should still sort the the words you are checking against, to see if they are anagrams. You will just have to sort the char[] manually, which is unfortunate, yet a good exercise. I would make a predicate, a function that compares the 2 strings and return true or false, and use that to check if they are anagrams. Also, it seems as though you don't need to print out both words that actually match, if that is true, then you can sort the words in the vectors when you first read them in, then just run them through your predicate function.
// Predicate
bool isMatch(const string &lhs, const string &rhs)
{
...sort and return lhs == rhs;
}
If you write the function, as I have above, you are passing in the parameters by const reference, which then you can copy (not using strcpy() due to vulnerabilities) the parameters into char[] and sort the words. I would recommend writing your sort as its own function.
Another hint, remember that things are much faster, and stl uses smart ptrs to do sorting. Anyway, I hope this helps even a little bit, I didn't want to give you the answer.

A solution that is fairly quick as long as the strings only contain characters between a-z and A-Z would be
bool is_anagram( const string& s1, const string& s2 ) {
if( s1.size() != s2.size() ) {
return false;
}
size_t count[ 26 * 2 ] = { 0 };
for( size_t i = 0; i < s1.size(); i++ ) {
char c1 = s1[ i ];
char c2 = s2[ i ];
if( c1 >= 'a' ) {
count[ c1 - 'a' ]++;
}
else {
count[ c1 - 'A' + 26 ]++;
}
if( c2 >= 'a' ) {
count[ c2 - 'a' ]--;
}
else {
count[ c2 - 'A' + 26 ]--;
}
}
for( size_t i = 0; i < 26 * 2; i++ ) {
if( count[ i ] != 0 ) {
return false;
}
}
return true;
}

If you're willing to use C++11, here is some rather inefficient code for seeing if two strings are anagrams. I'll leave it up to you to loop through the list of words.
#include <iostream>
#include <vector>
using namespace std;
int count_occurrences(string& word, char search) {
int count = 0;
for (char s : word) {
if (s == search) {
count++;
}
}
return count;
}
bool compare_strings(string word1, string v2) {
if (word1.size() != v2.size())
{
return false;
}
for (char s: word1) //In case v1 contains letters that are not in v2
{
if (count_occurrences(word1, s) != count_occurrences(v2, s))
{
return false;
}
}
return true;
}
int main() {
string s1 = "bat";
string s2 = "atb";
bool result = compare_strings(s1, s2);
if (result)
{
cout << "Match" << endl;
}
else
{
cout << "No match" << endl;
}
}
This works by simply counting the number of times a given letter occurs in a string. A better way to do this would be to sort the characters in the string alphabetically, and then compare the sorted strings to see if they are equal. I'll leave it up to you to improve this.
Best wishes.

Another solution, since I'm sufficiently bored:
#include <iostream>
#include <vector>
#include <string>
int equiv_class(char c) {
if ((c>='A')&&(c<='Z')) return c-'A';
if ((c>='a')&&(c<='z')) return c-'a';
return 27;
}
bool is_anagram(const std::string& a, const std::string& b)
{
if (a.size()!=b.size()) return false;
int hist[26]={};
int nz=0; // Non-zero histogram sum tally
for (int i=0, e=a.size() ; i!=e ; ++i)
{
int aclass = equiv_class(a[i]);
int bclass = equiv_class(b[i]);
if (aclass<27) {
switch (++hist[aclass]) {
case 1: ++nz; break; // We were 0, now we're not--add
case 0: --nz; break; // We were't, now we are--subtract
// otherwise no change in nonzero count
}
}
if (bclass<27) {
switch (--hist[bclass]) {
case -1: ++nz; break; // We were 0, now we're not--add
case 0: --nz; break; // We weren't, now we are--subtract
// otherwise no change in nonzero count
}
}
}
return 0==nz;
}
int main()
{
std::vector<std::string> v1{"elvis","coagulate","intoxicate","a frontal lobotomy"};
std::vector<std::string> v2{"lives","catalogue","excitation","bottlein frontofme"};
for (int i=0, e=(v1.size()==v2.size()?v1.size():0); i!=e; ++i) {
if (is_anagram(v1[i],v2[i])) {
std::cout << " Match";
} else {
std::cout << " No Match";
}
}
}

Check whether two strings are anagrams using C++

The program below I came up with for checking whether two strings are anagrams. Its working fine for small string but for larger strings ( i tried : listened , enlisted ) Its giving me a 'no !'
Help !
#include<iostream.h>
#include<string.h>
#include<stdio.h>
int main()
{
char str1[100], str2[100];
gets(str1);
gets(str2);
int i,j;
int n1=strlen(str1);
int n2=strlen(str2);
int c=0;
if(n1!=n2)
{
cout<<"\nThey are not anagrams ! ";
return 0;
}
else
{
for(i=0;i<n1;i++)
for(j=0;j<n2;j++)
if(str1[i]==str2[j])
++c;
}
if(c==n1)
cout<<"yes ! anagram !! ";
else
cout<<"no ! ";
system("pause");
return 0;
}

I am lazy, so I would use standard library functionality to sort both strings and then compare them:
#include <string>
#include <algorithm>
bool is_anagram(std::string s1, std::string s2)
{
std::sort(s1.begin(), s1.end());
std::sort(s2.begin(), s2.end());
return s1 == s2;
}
A small optimization could be to check that the sizes of the strings are the same before sorting.
But if this algorithm proved to be a bottle-neck, I would temporarily shed some of my laziness and compare it against a simple counting solution:
Compare string lengths
Instantiate a count map, std::unordered_map<char, unsigned int> m
Loop over s1, incrementing the count for each char.
Loop over s2, decrementing the count for each char, then check that the count is 0

The algorithm also fails when asked to find if aa and aa are anagrams. Try tracing the steps of the algorithm mentally or in a debugger to find why; you'll learn more that way.
By the way.. The usual method for finding anagrams is counting how many times each letter appears in the strings. The counts should be equal for each letter. This approach has O(n) time complexity as opposed to O(n²).

bool areAnagram(char *str1, char *str2)
{
// Create two count arrays and initialize all values as 0
int count1[NO_OF_CHARS] = {0};
int count2[NO_OF_CHARS] = {0};
int i;
// For each character in input strings, increment count in
// the corresponding count array
for (i = 0; str1[i] && str2[i]; i++)
{
count1[str1[i]]++;
count2[str2[i]]++;
}
// If both strings are of different length. Removing this condition
// will make the program fail for strings like "aaca" and "aca"
if (str1[i] || str2[i])
return false;
// Compare count arrays
for (i = 0; i < NO_OF_CHARS; i++)
if (count1[i] != count2[i])
return false;
return true;
}

I see 2 main approaches below:
Sort then compare
Count the occurrences of each letter
It's interesting to see that Suraj's nice solution got one point (by me, at the time of writing) but a sort one got 22. The explanation is that performance wasn't in people's mind - and that's fine for short strings.
The sort implementation is only 3 lines long, but the counting one beats it square for long strings. It is much faster (O(N) versus O(NlogN)).
Got the following results with 500 MBytes long strings.
Sort - 162.8 secs
Count - 2.864 secs
Multi threaded Count - 3.321 secs
The multi threaded attempt was a naive one that tried to double the speed by counting in separate threads, one for each string. Memory access is the bottleneck and this is an example where multi threading makes things a bit worse.
I would be happy to see some idea that would speed up the count solution (think by someone good with memory latency issues, caches).

#include<stdio.h>
#include<string.h>
int is_anagram(char* str1, char* str2){
if(strlen(str1)==strspn(str1,str2) && strlen(str1)==strspn(str2,str1) &&
strlen(str1)==strlen(str2))
return 1;
return 0;
}
int main(){
char* str1 = "stream";
char* str2 = "master";
if(is_anagram(str1,str2))
printf("%s and %s are anagram to each other",str1,str2);
else
printf("%s and %s are not anagram to each other",str1,str2);
return 0;
}

#include<iostream>
#include<unordered_map>
using namespace std;
int checkAnagram (string &str1, string &str2)
{
unordered_map<char,int> count1, count2;
unordered_map<char,int>::iterator it1, it2;
int isAnagram = 0;
if (str1.size() != str2.size()) {
return -1;
}
for (unsigned int i = 0; i < str1.size(); i++) {
if (count1.find(str1[i]) != count1.end()){
count1[str1[i]]++;
} else {
count1.insert(pair<char,int>(str1[i], 1));
}
}
for (unsigned int i = 0; i < str2.size(); i++) {
if (count2.find(str2[i]) != count2.end()) {
count2[str2[i]]++;
} else {
count2.insert(pair<char,int>(str2[i], 1));
}
}
for (unordered_map<char, int>::iterator itUm1 = count1.begin(); itUm1 != count1.end(); itUm1++) {
unordered_map<char, int>::iterator itUm2 = count2.find(itUm1->first);
if (itUm2 != count2.end()) {
if (itUm1->second != itUm2->second){
isAnagram = -1;
break;
}
}
}
return isAnagram;
}
int main(void)
{
string str1("WillIamShakespeare");
string str2("IamaWeakishSpeller");
cout << "checkAnagram() for " << str1 << "," << str2 << " : " << checkAnagram(str1, str2) << endl;
return 0;
}

It's funny how sometimes the best questions are the simplest.
The problem here is how to deduce whether two words are anagrams - a word being essentially an unsorted multiset of chars.
We know we have to sort, but ideally we'd want to avoid the time-complexity of sort.
It turns out that in many cases we can eliminate many words that are dissimilar in linear time by running through them both and XOR-ing the character values into an accumulator. The total XOR of all characters in both strings must be zero if both strings are anagrams, regardless of ordering. This is because anything xored with itself becomes zero.
Of course the inverse is not true. Just because the accumulator is zero does not mean we have an anagram match.
Using this information, we can eliminate many non-anagrams without a sort, short-circuiting at least the non-anagram case.
#include <iostream>
#include <string>
#include <algorithm>
//
// return a sorted copy of a string
//
std::string sorted(std::string in)
{
std::sort(in.begin(), in.end());
return in;
}
//
// check whether xor-ing the values in two ranges results in zero.
// #pre first2 addresses a range that is at least as big as (last1-first1)
//
bool xor_is_zero(std::string::const_iterator first1,
std::string::const_iterator last1,
std::string::const_iterator first2)
{
char x = 0;
while (first1 != last1) {
x ^= *first1++;
x ^= *first2++;
}
return x == 0;
}
//
// deduce whether two strings are the same length
//
bool same_size(const std::string& l, const std::string& r)
{
return l.size() == r.size();
}
//
// deduce whether two words are anagrams of each other
// I have passed by const ref because we may not need a copy
//
bool is_anagram(const std::string& l, const std::string& r)
{
return same_size(l, r)
&& xor_is_zero(l.begin(), l.end(), r.begin())
&& sorted(l) == sorted(r);
}
// test
int main() {
using namespace std;
auto s1 = "apple"s;
auto s2 = "eppla"s;
cout << is_anagram(s1, s2) << '\n';
s2 = "pppla"s;
cout << is_anagram(s1, s2) << '\n';
return 0;
}
expected:
1
0

Try this:
// Anagram. Two words are said to be anagrams of each other if the letters from one word can be rearranged to form the other word.
// From the above definition it is clear that two strings are anagrams if all characters in both strings occur same number of times.
// For example "xyz" and "zxy" are anagram strings, here every character 'x', 'y' and 'z' occur only one time in both strings.
#include <map>
#include <string>
#include <cctype>
#include <iostream>
#include <algorithm>
#include <unordered_map>
using namespace std;
bool IsAnagram_1( string w1, string w2 )
{
// Compare string lengths
if ( w1.length() != w2.length() )
return false;
sort( w1.begin(), w1.end() );
sort( w2.begin(), w2.end() );
return w1 == w2;
}
map<char, size_t> key_word( const string & w )
{
// Declare a map which is an associative container that will store a key value and a mapped value pairs
// The key value is a letter in a word and the maped value is the number of times this letter appears in the word
map<char, size_t> m;
// Step over the characters of string w and use each character as a key value in the map
for ( auto & c : w )
{
// Access the mapped value directly by its corresponding key using the bracket operator
++m[toupper( c )];
}
return ( m );
}
bool IsAnagram_2( const string & w1, const string & w2 )
{
// Compare string lengths
if ( w1.length() != w2.length() )
return false;
return ( key_word( w1 ) == key_word( w2 ) );
}
bool IsAnagram_3( const string & w1, const string & w2 )
{
// Compare string lengths
if ( w1.length() != w2.length() )
return false;
// Instantiate a count map, std::unordered_map<char, unsigned int> m
unordered_map<char, size_t> m;
// Loop over the characters of string w1 incrementing the count for each character
for ( auto & c : w1 )
{
// Access the mapped value directly by its corresponding key using the bracket operator
++m[toupper(c)];
}
// Loop over the characters of string w2 decrementing the count for each character
for ( auto & c : w2 )
{
// Access the mapped value directly by its corresponding key using the bracket operator
--m[toupper(c)];
}
// Check to see if the mapped values are all zeros
for ( auto & c : w2 )
{
if ( m[toupper(c)] != 0 )
return false;
}
return true;
}
int main( )
{
string word1, word2;
cout << "Enter first word: ";
cin >> word1;
cout << "Enter second word: ";
cin >> word2;
if ( IsAnagram_1( word1, word2 ) )
cout << "\nAnagram" << endl;
else
cout << "\nNot Anagram" << endl;
if ( IsAnagram_2( word1, word2 ) )
cout << "\nAnagram" << endl;
else
cout << "\nNot Anagram" << endl;
if ( IsAnagram_3( word1, word2 ) )
cout << "\nAnagram" << endl;
else
cout << "\nNot Anagram" << endl;
system("pause");
return 0;
}

In this approach I took care of empty strings and repeated characters as well. Enjoy it and comment any limitation.
#include <iostream>
#include <map>
#include <string>
using namespace std;
bool is_anagram( const string a, const string b ){
std::map<char, int> m;
int count = 0;
for (int i = 0; i < a.length(); i++) {
map<char, int>::iterator it = m.find(a[i]);
if (it == m.end()) {
m.insert(m.begin(), pair<char, int>(a[i], 1));
} else {
m[a[i]]++;
}
}
for (int i = 0; i < b.length(); i++) {
map<char, int>::iterator it = m.find(b[i]);
if (it == m.end()) {
m.insert(m.begin(), pair<char, int>(b[i], 1));
} else {
m[b[i]]--;
}
}
if (a.length() <= b.length()) {
for (int i = 0; i < a.length(); i++) {
if (m[a[i]] >= 0) {
count++;
} else
return false;
}
if (count == a.length() && a.length() > 0)
return true;
else
return false;
} else {
for (int i = 0; i < b.length(); i++) {
if (m[b[i]] >= 0) {
count++;
} else {
return false;
}
}
if (count == b.length() && b.length() > 0)
return true;
else
return false;
}
return true;
}

Check if the two strings have identical counts for each unique char.
bool is_Anagram_String(char* str1,char* str2){
int first_len=(int)strlen(str1);
int sec_len=(int)strlen(str2);
if (first_len!=sec_len)
return false;
int letters[256] = {0};
int num_unique_chars = 0;
int num_completed_t = 0;
for(int i=0;i<first_len;++i){
int char_letter=(int)str1[i];
if(letters[char_letter]==0)
++num_unique_chars;
++letters[char_letter];
}
for (int i = 0; i < sec_len; ++i) {
int c = (int) str2[i];
if (letters[c] == 0) { // Found more of char c in t than in s.
return false;
}
--letters[c];
if (letters[c] == 0) {
++num_completed_t;
if (num_completed_t == num_unique_chars) {
// it’s a match if t has been processed completely
return i == sec_len - 1;
}
}
}
return false;}

#include <iostream>
#include <string.h>
using namespace std;
const int MAX = 100;
char cadA[MAX];
char cadB[MAX];
bool chrLocate;
int i,m,n,j, contaChr;
void buscaChr(char [], char []);
int main() {
cout << "Ingresa CadA: ";
cin.getline(cadA, sizeof(cadA));
cout << "Ingresa CadB: ";
cin.getline(cadB, sizeof(cadA));
if ( strlen(cadA) == strlen(cadB) ) {
buscaChr(cadA,cadB);
} else {
cout << "No son Anagramas..." << endl;
}
return 0;
}
void buscaChr(char a[], char b[]) {
j = 0;
contaChr = 0;
for ( i = 0; ( (i < strlen(a)) && contaChr < 2 ); i++ ) {
for ( m = 0; m < strlen(b); m++ ) {
if ( a[i] == b[m]) {
j++;
contaChr++;
a[i] = '-';
b[m] = '+';
} else { contaChr = 0; }
}
}
if ( j == strlen(a)) {
cout << "SI son Anagramas..." << endl;
} else {
cout << "No son Anagramas..." << endl;
}
}

Your algorithm is incorrect. You're checking each character in the first word to see how many times that character appears in the second word. If the two words were 'aaaa', and 'aaaa', then that would give you a count of 16. A small alteration to your code would allow it to work, but give a complexity of N^2 as you have a double loop.
for(i=0;i<n1;i++)
for(j=0;j<n2;j++)
if(str1[i]==str2[j])
++c, str2[j] = 0; // 'cross off' letters as they are found.
I done some tests with anagram comparisons. Comparing two strings of 72 characters each (the strings are always true anagrams to get maximum number of comparisons), performing 256 same-tests with a few different STL containers...
template<typename STORAGE>
bool isAnagram(const string& s1, const string& s2, STORAGE& asciiCount)
{
for(auto& v : s1)
{
asciiCount[v]++;
}
for(auto& v : s2)
{
if(--asciiCount[static_cast<unsigned char>(v)] == -1)
{
return false;
}
}
return true;
}
Where STORAGE asciiCount =
map<char, int> storage; // 738us
unordered_map<char, int> storage; // 260us
vector<int> storage(256); // 43us
// g++ -std=c++17 -O3 -Wall -pedantic
This is the fastest I can get.
These are crude tests using coliru online compiler + and std::chrono::steady_clock::time_point for measurements, however they give a general idea of performance gains.
vector has the same performance, uses only 256 bytes, although strings are limited to 255 characters in length (also change to: --asciiCount[static_cast(v)] == 255 for unsigned char counting).
Assuming vector is the fastest. An improvement would be to just allocate a C style array unsigned char asciiCount[256]; on the stack (since STL containers allocate their memory dynamically on the heap)
You could probably reduce this storage to 128 bytes, 64 or even 32 bytes (ascii chars are typically in range 0..127, while A-Z+a-z 64.127, and just upper or lower case 64..95 or 96...127) although not sure what gains would be found from fitting this inside a cache line or half.
Any better ways to do this? For Speed, Memory, Code Elegance?

1. Simple and fast way with deleting matched characters
bool checkAnagram(string s1, string s2) {
for (char i : s1) {
unsigned int pos = s2.find(i,0);
if (pos != string::npos) {
s2.erase(pos,1);
} else {
return false;
}
}
return s2.empty();
}
2. Conversion to prime numbers. Beautiful but very expensive, requires special Big Integer type for long strings.
// https://en.wikipedia.org/wiki/List_of_prime_numbers
int primes[255] = {2, 3, 5, 7, 11, 13, 17, 19, ... , 1613};
bool checkAnagramPrimes(string s1, string s2) {
long c1 = 1;
for (char i : s1) {
c1 = c1 * primes[i];
}
long c2 = 1;
for (char i : s2) {
c2 = c2 * primes[i];
if (c2 > c1) {
return false;
}
}
return c1 == c2;
}

string key="listen";
string key1="silent";
string temp=key1;
int len=0;
//assuming both strings are of equal length
for (int i=0;i<key.length();i++){
for (int j=0;j<key.length();j++){
if(key[i]==temp[j]){
len++;
temp[j] = ' ';//to deal with the duplicates
break;
}
}
}
cout << (len==key.length()); //if true: means the words are anagrams

Instead of using dot h header which is deprecated in modern c++.
Try this solution.
#include <iostream>
#include <string>
#include <map>
int main(){
std::string word_1 {};
std::cout << "Enter first word: ";
std::cin >> word_1;
std::string word_2 {};
std::cout << "Enter second word: ";
std::cin >> word_2;
if(word_1.length() == word_2.length()){
std::map<char, int> word_1_map{};
std::map<char, int> word_2_map{};
for(auto& c: word_1)
word_1_map[std::tolower(c)]++;
for(auto& c: word_2)
word_2_map[std::tolower(c)]++;
if(word_1_map == word_2_map){
std::cout << "Anagrams" << std::endl;
}
else{
std::cout << "Not Anagrams" << std::endl;
}
}else{
std::cout << "Length Mismatch" << std::endl;
}
}

#include <bits/stdc++.h>
using namespace std;
#define NO_OF_CHARS 256
int main()
{ bool ans = true;
string word1 = "rest";
string word2 = "tesr";
unordered_map<char,int>maps;
for(int i = 0 ; i <5 ; i++)
{
maps[word1[i]] +=1;
}
for(int i = 0 ; i <5 ; i++)
{
maps[word2[i]]-=1 ;
}
for(auto i : maps)
{
if(i.second!=0)
{
ans = false;
}
}
cout<<ans;
}

Well if you don't want to sort than this code will give you perfect output.
#include <iostream>
using namespace std;
int main(){
string a="gf da";
string b="da gf";
int al,bl;
int counter =0;
al =a.length();
bl =b.length();
for(int i=0 ;i<al;i++){
for(int j=0;j<bl;j++){
if(a[i]==b[j]){
if(j!=bl){
b[j]=b[b.length()-counter-1];
bl--;
counter++;
break;
}else{
bl--;
counter++;
}
}
}
}
if(counter==al){
cout<<"true";
}
else{
cout<<"false";
}
return 0;
}

Here is the simplest and fastest way to check for anagrams
bool anagram(string a, string b) {
int a_sum = 0, b_sum = 0, i = 0;
while (a[i] != '\0') {
a_sum += (int)a[i]; // (int) cast not necessary
b_sum += (int)b[i];
i++;
}
return a_sum == b_sum;
}
Simply adds the ASCII values and checks if the sums are equal.
For example:
string a = "nap" and string b = "pan"
a_sum = 110 + 97 + 112 = 319
b_sum = 112 + 97 + 110 = 319

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Binary search on C++ string does not work - c++

Related

Displaying sets and subsets of a string

Combination of unique elements without repetition

What's wrong with my dynamic programming algorithm with memoization?

Using vectors to solve the anagrams in C++

Check whether two strings are anagrams using C++

Categories

Resources