Finding the maximum subsequence of a string in C++

I am new to programming and have been struggling with this program for a long time.
Question:
What logic error is there in my code?
Problem:
I need to find the maximum subsequence of a string. Both the required subsequence length and the string itself are read from input.
For example, the maximum subsequence of length 3 of the string "abcde" is "cde".
The subsequence must keep the characters in the same order as in the original string.
EDIT: A subsequence is a subset of the characters of the input string I, arranged in their original order.
The maximum subsequence in this question is the largest one (in alphabetical order) among the subsequences of length K.
For instance, why is the answer for AbCd687fs 4 not "bdfs" but "d8fs"? The reason is that "d8fs" is larger than "bdfs" in alphabetical order.
For 1265432 2, you can get subsequences of length 2 such as 12, 16, 15, 14, 13, 26, 25, 24, 23, 22, 65, 64, 63, 62, 54, ... . In alphabetical order, the subsequence "65" is the maximum.
For AbCd687fs 4, you can get subsequences of length 4 such as AbCd, AbC6, bCd6, bC8s, d687, d87f, d8fs, d7fs, ... . In alphabetical order, the subsequence "d8fs" is the maximum.
My approach:
initialize string buffer with same length of I, filled with '*' : string buffer(I.length(),'*')
inner for loop to find the largest character in string I
replace the character into the string buffer with the same index position of the character in string I
remove the current largest character in string I. Then go through the for loop again to find the next largest character in string I.
while loop with number of iterations same as value of K to run the for loop K times
When the while loop ends, remove all the '*' from string buffer. The remaining content should be the maximum subsequence only.
#include <iostream>
#include <string>
#include <algorithm>
using namespace std;
int main()
{
string I;
cout << "Please input a string:" << endl;
cin >> I;
int K;
cout << "Please input the length of subsequence:" << endl;
cin >> K;
string buffer(I.length(), '*');
int pos;
while (K>0)
{
char vMax = I[0];
for (int i = 0; i < I.length(); i++)
{
if (I[i] > vMax)
{
vMax = I[i];
}
}
pos = I.find(vMax);
//buffer.insert(pos, I, pos, 1);
buffer.replace(pos, 1, I, pos, 1);
I.erase(remove(I.begin(), I.end(), vMax), I.end());
K--;
}
buffer.erase(remove(buffer.begin(), buffer.end(), '*'), buffer.end());
cout << "The maximum subsequence is: ";
for (int i = 0; i < buffer.length(); i++)
{
cout << buffer[i];
}
}

Your current code has a big problem since you erase characters from I and therefore can't calculate the pos in the original string properly.
I suggest that you use std::max_element to get an iterator to the char with the largest value instead of doing the search manually.
Here's how to fix it using AbCd687fs 4 as an example:
Search for the char with the largest value in the range AbCd68 (7fs not included because we must be able to find 4 characters in total). d is found.
Search the range 687 (AbCd and fs are not included), 8 is found.
Search the range 7f (AbCd68 and s are not included). f is found.
Search the range s (AbCd687f are not included). s is found.
Implemented:
#include <algorithm> // std::max_element
#include <iterator> // std::prev
#include <string> // std::string
std::string get_substr(const std::string& I, size_t K) {
std::string buffer;
buffer.reserve(K);
for(auto maxit = I.begin(); K --> 0; ++maxit) {
maxit = std::max_element(maxit, std::prev(I.end(), K));
buffer += *maxit;
}
return buffer;
}
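As a quick way to check the greedy search described above against the examples from the question, here is a minimal standalone sketch. The name max_subsequence and the fallback for a K larger than the string are assumptions made for illustration; they are not part of the answer's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: largest subsequence of length k, built greedily.
std::string max_subsequence(const std::string& s, std::size_t k) {
    if (k > s.size()) return s;  // assumption: fall back to the whole string
    std::string out;
    out.reserve(k);
    auto first = s.begin();
    for (std::size_t remaining = k; remaining > 0; --remaining) {
        // Keep at least remaining-1 characters available after the pick.
        auto last = std::prev(s.end(), remaining - 1);
        first = std::max_element(first, last);  // earliest largest character
        out += *first;
        ++first;  // the next search starts right after the picked character
    }
    return out;
}
```

Called as max_subsequence("AbCd687fs", 4) this yields "d8fs", and max_subsequence("1265432", 2) yields "65", matching the examples in the question.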

I made modifications to your code and this should most probably help you:
#include <iostream>
#include <string>
#include <algorithm>
using namespace std;
int main()
{
string I;
cout << "Please input a string:" << endl;
cin >> I;
int K;
cout << "Please input the length of subsequence:" << endl;
cin >> K;
string buffer(I.length(), '*');
int pos;
while (K>0)
{
char vMax = I[0];
for (int i = 0; i < I.length(); i++)
{
if (I[i] > vMax)
{
vMax = I[i];
}
}
pos = I.find(vMax);
//buffer.insert(pos, I, pos, 1);
buffer.at(pos) = vMax;
I.at(pos) = ' '; //Space has the least ascii value in printable characters
//buffer.replace(pos, 1, I, pos, 1);
//I.erase(remove(I.begin(), I.end(), vMax), I.end());
K--;
}
buffer.erase(remove(buffer.begin(), buffer.end(), '*'), buffer.end());
cout << "The maximum subsequence is: ";
for (int i = 0; i < buffer.length(); i++)
{
cout << buffer[i];
}
}

Reading code that sits in one single function is hard, so my advice for the future is to split code into smaller pieces.
Instead of reading your code, I wrote my own solution:
#include <array>
#include <string>
using Histogram = std::array<size_t, 128>;
auto makeFullHistogram(const std::string& s)
{
Histogram r{};
for (auto ch : s) ++r[ch];
return r;
}
void clipLeadingValuesOfHistogram(Histogram& hist, size_t len)
{
for (auto it = hist.rbegin(); it != hist.rend(); ++it) {
if (len) {
if (len < *it) {
*it = len;
}
len -= *it;
} else {
*it = 0;
}
}
}
auto findLeadingValuesHistogram(const std::string& s, size_t len)
{
auto hist = makeFullHistogram(s);
clipLeadingValuesOfHistogram(hist, len);
return hist;
}
std::string bestSubstring(const std::string& s, size_t len)
{
std::string r;
r.reserve(len);
auto hist = findLeadingValuesHistogram(s, len);
for (auto ch : s) {
if (hist[ch]) {
--hist[ch];
r += ch;
if (r.size() == len) break;
}
}
return r;
}
Learning to write tests is also helpful:
https://godbolt.org/z/93aY4aafs

Related

How to decrease memory usage and make code executed faster than 2s?

My task is:
Implement a binary search on an array of numbers sorted in non-decreasing order.
It is forbidden to use ready-made binary search functions from standard libraries.
The first line contains an integer n — the number of numbers in the array 1 <= n <= 3*10^5. The second line contains n numbers of the array separated by a space. All numbers are integers and belong to the interval from -2^31 to 2^31 inclusive. The numbers in the array are sorted in non-decreasing order. The third line contains an integer k — the number of requests 1 <= k <= 3*10^5. The fourth line contains k space-separated integers-requests from -2^31 to 2^31 - 1 inclusive.
For each query number x on a separate line print numbers b, l and r separated by a space, where:
b is equal to 1 if x is present in the array, or 0 otherwise;
l is the index of the first element greater than or equal to x;
r is the index of the first element greater than x.
Array elements are numbered with indices from 0 to n-1. If there are no suitable elements in the array, we will agree that the returned value will be equal to n.
Input example:
1
1
3
0 1 2
Output for the input above must be:
0 0 0
1 0 1
0 1 1
Here is my code for the task above:
#include <algorithm>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using string = std::string;
using stringstream = std::stringstream;
template <typename T> using vector = std::vector<T>;
int string_to_int(stringstream& stream, const string &value) {
int result;
stream << value;
stream >> result;
stream.clear();
return result;
}
template <typename T> int binary_search(const vector<T> &source, int item) {
int low = 0;
int high = source.size() - 1;
while (low <= high) {
int middle_index = low + (high - low) / 2;
int middle = source[middle_index];
if (middle == item)
return middle_index;
if (item < middle)
high = middle_index - 1;
else
low = middle_index + 1;
}
return source.size();
}
void split_string_to_words(const string &source, vector<string> &words) {
string temp;
words.reserve(100);
for (int i = 0; i < source.length(); ++i) {
if (source[i] == ' ') {
words.push_back(std::move(temp));
temp.clear();
} else
temp.push_back(source[i]);
}
words.push_back(temp);
}
int main() {
stringstream stream;
string line;
getline(std::cin, line);
int item_count = string_to_int(stream, line);
getline(std::cin, line);
vector<string> string_items;
split_string_to_words(line, string_items);
vector<int> items(item_count);
std::transform(string_items.begin(), string_items.end(), items.begin(),
[&](string number) { return string_to_int(stream, number); });
getline(std::cin, line);
int search_item_count = string_to_int(stream, line);
getline(std::cin, line);
vector<string> search_string_items;
split_string_to_words(line, search_string_items);
vector<int> search_items(search_item_count);
std::transform(search_string_items.begin(), search_string_items.end(),
search_items.begin(),
[&](string number) { return string_to_int(stream, number); });
for (auto item : search_items) {
int index = binary_search(items, item);
std::cout << (1 - (index == items.size())) << " ";
int l = 0;
while (l < items.size() && items[l] < item)
l++;
int r = l;
while (r < items.size() && items[r] <= item)
r++;
std::cout << l << " " << r << std::endl;
}
}
I don't know how to speed up my code. It exceeds 2s on some test cases (but the input data is not shown in the iRunner2).
Note that stoi doesn't work in iRunner2.
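The loops that compute l and r above scan the array from the start for every query, which is linear work per request. A possible direction, sketched here under the assumption that values are stored as long long, is to find both indices with hand-written binary searches. The function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Index of the first element >= x, or v.size() if there is none.
std::size_t lower_bound_index(const std::vector<long long>& v, long long x) {
    std::size_t lo = 0, hi = v.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (v[mid] < x) lo = mid + 1;  // answer lies strictly to the right
        else hi = mid;                 // v[mid] is a candidate
    }
    return lo;
}

// Index of the first element > x, or v.size() if there is none.
std::size_t upper_bound_index(const std::vector<long long>& v, long long x) {
    std::size_t lo = 0, hi = v.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (v[mid] <= x) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}
```

With l = lower_bound_index(items, x) and r = upper_bound_index(items, x), the flag b is 1 exactly when l != r. Printing '\n' instead of std::endl also avoids flushing the output after every query.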

find the maximum number of words in a sentence from a paragraph with C++

I am trying to find the maximum number of words in a sentence (sentences are separated by a dot) from a paragraph, and I am completely stuck on how to sort and output the result to stdout.
Eg:
Given a string S: {"Program to split strings. By using custom split function. In C++"};
The expected output should be : 5
#define max 8 // define the max string
string strings[max]; // define max string
string words[max];
int count = 0;
void split (string str, char seperator) // custom split() function
{
int currIndex = 0, i = 0;
int startIndex = 0, endIndex = 0;
while (i <= str.size())
{
if (str[i] == seperator || i == str.size())
{
endIndex = i;
string subStr = "";
subStr.append(str, startIndex, endIndex - startIndex);
strings[currIndex] = subStr;
currIndex += 1;
startIndex = endIndex + 1;
}
i++;
}
}
void countWords(string str) // Count The words
{
int count = 0, i;
for (i = 0; str[i] != '\0';i++)
{
if (str[i] == ' ')
count++;
}
cout << "\n- Number of words in the string are: " << count +1 <<" -";
}
//Sort the array in descending order by the number of words
void sortByWordNumber(int num[30])
{
/* CODE str::sort? std::*/
}
int main()
{
string str = "Program to split strings. By using custom split function. In C++";
char seperator = '.'; // dot
int numberOfWords;
split(str, seperator);
cout <<" The split string is: ";
for (int i = 0; i < max; i++)
{
cout << "\n initial array index: " << i << " " << strings[i];
countWords(strings[i]);
}
return 0;
}
count + 1 in countWords() gives the correct number only for the first sentence; for the following sentences the leading " " whitespace is added to the word count.
Please answer with the easiest-to-understand solution first (std::sort, a new function, a lambda).
Your code does not make sense. For example, the meaning of this declaration
string strings[max];
is unclear.
And to find the maximum number of words in sentences of a paragraph there is no need to sort the sentences themselves by the number of words.
If I have understood correctly what you need is something like the following.
#include <iostream>
#include <sstream>
#include <iterator>
#include <string>
int main()
{
std::string s;
std::cout << "Enter a paragraph of sentences: ";
std::getline( std::cin, s );
size_t max_words = 0;
std::istringstream is( s );
std::string sentence;
while ( std::getline( is, sentence, '.' ) )
{
std::istringstream iss( sentence );
auto n = std::distance( std::istream_iterator<std::string>( iss ),
std::istream_iterator<std::string>() );
if ( max_words < n ) max_words = n;
}
std::cout << "The maximum number of words in sentences is "
<< max_words << '\n';
return 0;
}
If you enter the paragraph
Here is a paragraph. It contains several sentences. For example, how to use string streams.
then the output will be
The maximum number of words in sentences is 7
If you are not yet familiar with string streams then you could use member functions find, find_first_of, find_first_not_of with objects of the type std::string to split a string into sentences and to count words in a sentence.
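A minimal sketch of that alternative, using only std::string member functions, might look as follows. The helper names count_words and max_words_per_sentence are hypothetical, and words are assumed to be separated by spaces, tabs, or newlines.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Counts whitespace-separated words in one sentence.
std::size_t count_words(const std::string& sentence) {
    const char* spaces = " \t\n";
    std::size_t count = 0;
    std::size_t pos = sentence.find_first_not_of(spaces);
    while (pos != std::string::npos) {
        ++count;
        pos = sentence.find_first_of(spaces, pos);      // end of this word
        pos = sentence.find_first_not_of(spaces, pos);  // start of the next word
    }
    return count;
}

// Splits the text at '.' and returns the largest word count of any sentence.
std::size_t max_words_per_sentence(const std::string& text) {
    std::size_t best = 0;
    std::size_t start = 0;
    while (start <= text.size()) {
        std::size_t dot = text.find('.', start);
        if (dot == std::string::npos) dot = text.size();  // last sentence
        best = std::max(best, count_words(text.substr(start, dot - start)));
        start = dot + 1;
    }
    return best;
}
```

For the string from the question, "Program to split strings. By using custom split function. In C++", max_words_per_sentence returns 5.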
Your use case sounds like a reduction. Essentially you can have a state machine (parser) that goes through the string and updates some state (e.g. counters) when it encounters the word and sentence delimiters. Special care should be given to corner cases, e.g. multiple consecutive whitespace characters or more than one consecutive full stop (.). A reduction handling these cases is shown below:
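#include <numeric> // std::accumulate
#include <string>
#include <utility> // std::pair, std::make_pair
int max_words_in(std::string const& str)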
{
// p is the current and max word count.
auto parser = [in_space = false] (std::pair<int, int> p, char c) mutable {
switch (c) {
case '.': // Sentence ends.
if (!in_space && p.second <= p.first) p.second = p.first + 1;
p.first = 0;
in_space = true;
break;
case ' ': // Word ends.
if (!in_space) ++p.first;
in_space = true;
break;
default: // Other character encountered.
in_space = false;
}
return p; // Return the updated accumulation value.
};
return std::accumulate(
str.begin(), str.end(), std::make_pair(0, 0), parser).second;
}
The tricky part is deciding how to handle degenerate cases, e.g. what should the output be for "This is a , ,tricky .. .. string to count" where different types of delimiters alternate in arbitrary ways. Having a state machine implementation of the parsing logic allows you to easily adjust your solution (e.g. you can pass an "ignore list" to the parser and update the default case to not reset the in_space variable when c belongs to that list).
vector<string> split(const string& str, char seperator) // custom split() function
{
    vector<string> sentences;
    size_t start = 0; // index where the current sentence begins
    for (size_t i = 0; i < str.size(); i++)
    {
        if (str[i] == seperator)
        {
            // Store the sentence without its trailing separator.
            sentences.push_back(str.substr(start, i - start));
            start = i + 1;
        }
    }
    if (start < str.size())
    {
        // Text after the last separator forms the final sentence.
        sentences.push_back(str.substr(start));
    }
    return sentences;
}

Checking whether a String is a Lapindrome or not [duplicate]

The question is to check whether a given string is a lapindrome or not (CodeChef). According to the question, a lapindrome is defined as a string which, when split in the middle, gives two halves having the same characters with the same frequency of each character.
I have tried solving the problem using C++ with the code below
#include <iostream>
#include<cstring>
using namespace std;
bool lapindrome(char s[],int len){
int firstHalf=0,secondHalf=0;
char c;
for(int i=0,j=len-1;i<j;i++,j--){
firstHalf += int(s[i]);
secondHalf += int(s[j]);
}
if(firstHalf == secondHalf){
return true;
}
else
return false;
}
int main() {
// your code goes here
int t,len;
bool result;
char s[1000];
cin>>t;
while(t){
cin>>s;
len = strlen(s);
result = lapindrome(s,len);
if(result == true)
cout<<"YES"<<endl;
else
cout<<"NO"<<endl;
--t;
}
return 0;
}
I have taken two count variables which will store the sum of ascii code of characters from first half and second half. Then those two variables are compared to check whether both the halves are equal or not.
I have tried the code on a couple of custom inputs and it works fine. But after I submit the code, the solution seems to be wrong.
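Replace the lapindrome function with this one:
#include <string>
const int MAX = 26; // assumption: the input consists of lowercase letters 'a'..'z' only
bool isLapindrome(std::string str)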
{
int val1[MAX] = {0};
int val2[MAX] = {0};
int n = str.length();
if (n == 1)
return true;
for (int i = 0, j = n - 1; i < j; i++, j--)
{
val1[str[i] - 'a']++;
val2[str[j] - 'a']++;
}
for (int i = 0; i < MAX; i++)
if (val1[i] != val2[i])
return false;
return true;
}
Example Output
Input a string here: asdfsasd
The string is NOT a lapindrome.
---
Input a string here: asdfsdaf
The string is a lapindrome.
Enjoy!
You're not counting the frequencies of the characters, only their sum. You could simply split the string into halves and create a map of character frequencies for each side, e.g. a std::map containing the count for each character. Then you can compare both maps with something like std::equal to check whether the halves are the same in terms of character frequency.
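A minimal sketch of the two-map idea, assuming the middle character of an odd-length string is ignored as in the problem statement. The function name is_lapindrome_maps is hypothetical, and the maps are compared with operator== rather than std::equal, which has the same effect here.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: compares per-character counts of both halves.
bool is_lapindrome_maps(const std::string& s) {
    const std::size_t half = s.size() / 2;
    std::map<char, int> left, right;
    for (std::size_t i = 0; i < half; ++i) {
        ++left[s[i]];                   // i-th character from the front
        ++right[s[s.size() - 1 - i]];   // i-th character from the back
    }
    // Two maps are equal when they hold the same keys with the same counts.
    return left == right;
}
```

For example, "gaga" and "rotor" are accepted, while "abbaab" is rejected because its halves "abb" and "aab" have different counts.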
Instead of counting the frequency of characters (in the two halfs of input string) in two arrays or maps, it's actually sufficient to count them in one as well.
For this, negative counts have to be allowed.
Sample code:
#include <iostream>
#include <string>
#include <unordered_map>
bool isLapindrome(const std::string &text)
{
std::unordered_map<unsigned char, int> freq;
// iterate until index (growing from begin) and
// 2nd index (shrinking from end) cross over
for (size_t i = 0, j = text.size(); i < j--; ++i) {
++freq[(unsigned char)text[i]]; // count characters of 1st half positive
--freq[(unsigned char)text[j]]; // count characters of 2nd half negative
}
// check whether positive and negative counts didn't result in 0
// for at least one counted char
for (const auto &entry : freq) { // auto avoids copying each pair<const unsigned char, int>
if (entry.second != 0) return false;
}
// Otherwise, the frequencies were balanced.
return true;
}
int main()
{
auto check = [](const std::string &text) {
std::cout << '\'' << text << "': "
<< (isLapindrome(text) ? "yes" : "no")
<< '\n';
};
check("");
check("abaaab");
check("gaga");
check("abccab");
check("rotor");
check("xyzxy");
check("abbaab");
}
Output:
'': yes
'abaaab': yes
'gaga': yes
'abccab': yes
'rotor': yes
'xyzxy': yes
'abbaab': no
Note:
I was a bit uncertain about the empty input string. If it should not count as a lapindrome, then an additional check is needed in isLapindrome(). This can be achieved by changing the final
return true;
to
return !text.empty(); // Empty input is considered as false.
The problem with your code is that you only compare the sums of the characters. Frequency means that you have to count the occurrences of each character. Instead of counting frequencies in maps as in the other solutions here, you can simply sort the two halves and compare them.
#include <iostream>
#include <string>
#include <algorithm>
bool lapindrome(const std::string& s) {
// true if size = 1, false if size = 0
if(s.size() <= 1) return (s.size());
std::string first_half = s.substr(0, s.size() / 2);
std::sort(first_half.begin(), first_half.end());
std::string second_half = s.substr(s.size() / 2 + s.size() % 2);
std::sort(second_half.begin(), second_half.end());
return first_half == second_half;
}
// here's a shorter hacky alternative:
bool lapindrome_short(std::string s) {
if (s.size() <= 1) return (s.size());
int half = s.size() / 2;
std::sort(s.begin(), s.begin() + half);
std::sort(s.rbegin(), s.rbegin() + half); // reverse half
return std::equal(s.begin(), s.begin() + half, s.rbegin());
}
int main() {
int count;
std::string input;
std::cin >> count;
while(count--) {
std::cin >> input;
std::cout << input << ": "
<< (lapindrome(input) ? "YES" : "NO") << std::endl;
}
return 0;
}

Compute the length of a longest common substring of two given 0 - 1 strings

Full title:
"Compute the length of a longest common sub-string of two given 0 - 1 strings. Input format has at least two test cases, each consisting of two non-empty 0-1 strings of lengths at most 100. The input terminates on EOF"
This is one of my homework assignments. I found a way to compute the length of a longest common sub-string of two given 0-1 strings, but I don't know how to read many test cases at once.
Please help me if you have any solution for this problem.
This is my code :
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
string A,B;
int lcs(int i, int j, int count)
{
if (i == 0 || j == 0)
return count;
if (A[i-1] == B[j-1])
{
count = lcs(i - 1, j - 1, count + 1);
}
count = max(count, max(lcs( i, j - 1, 0), lcs( i - 1, j, 0)));
return count;
}
int main()
{
int n,m;
cout << "Input String A and B \n";
cin >> A; cin >> B;
n=A.size();
m=B.size();
cout<< "Longest common substring "<< lcs(n,m,0) << endl;
return 0;
}
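#include <iostream>
#include <string>
#include <vector>
using namespace std;
int solveYourProblemFunc(const string& str1, const string& str2) {
    /* your code */
    return 0; // placeholder result
}
int main() {
    vector<int> results;
    string str1, str2;
    // The input terminates on EOF, so read pairs of strings until extraction fails.
    while (cin >> str1 >> str2) {
        int res = solveYourProblemFunc(str1, str2);
        results.push_back(res);
    }
    /* output results */
    for (int res : results) cout << res << '\n';
}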
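The /* your code */ placeholder above could be filled in many ways. One possibility, shown here only as a sketch and not as the asker's recursive method, is a bottom-up table that tracks the length of the common run ending at each pair of positions. The function name longest_common_substring is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Length of the longest contiguous block shared by a and b.
std::size_t longest_common_substring(const std::string& a, const std::string& b) {
    // prev[j] / cur[j]: length of the common run ending at a[i-1] and b[j-1].
    std::vector<std::size_t> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    std::size_t best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1 : 0;
            best = std::max(best, cur[j]);
        }
        std::swap(prev, cur);  // the current row becomes the previous row
    }
    return best;
}
```

With strings of length at most 100 this does at most 10,000 comparisons per test case, so it can be called once for every pair of strings read from the input.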

Check whether two strings are anagrams using C++

I came up with the program below for checking whether two strings are anagrams. It works fine for small strings, but for larger strings (I tried: listened, enlisted) it gives me a 'no !'
Help !
#include<iostream.h>
#include<string.h>
#include<stdio.h>
int main()
{
char str1[100], str2[100];
gets(str1);
gets(str2);
int i,j;
int n1=strlen(str1);
int n2=strlen(str2);
int c=0;
if(n1!=n2)
{
cout<<"\nThey are not anagrams ! ";
return 0;
}
else
{
for(i=0;i<n1;i++)
for(j=0;j<n2;j++)
if(str1[i]==str2[j])
++c;
}
if(c==n1)
cout<<"yes ! anagram !! ";
else
cout<<"no ! ";
system("pause");
return 0;
}
I am lazy, so I would use standard library functionality to sort both strings and then compare them:
#include <string>
#include <algorithm>
bool is_anagram(std::string s1, std::string s2)
{
std::sort(s1.begin(), s1.end());
std::sort(s2.begin(), s2.end());
return s1 == s2;
}
A small optimization could be to check that the sizes of the strings are the same before sorting.
But if this algorithm proved to be a bottle-neck, I would temporarily shed some of my laziness and compare it against a simple counting solution:
Compare string lengths
Instantiate a count map, std::unordered_map<char, unsigned int> m
Loop over s1, incrementing the count for each char.
Loop over s2, decrementing the count for each char, then check that the count is 0
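The counting steps listed above could look roughly like this. It is a sketch with two assumptions: the counter type is int rather than unsigned int, so that a count dropping below zero can be detected directly, and the final "count is 0" check is replaced by that early exit (with equal lengths, no negative count implies all counts are zero). The name is_anagram_count is hypothetical.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper implementing the counting approach.
bool is_anagram_count(const std::string& s1, const std::string& s2) {
    if (s1.size() != s2.size()) return false;  // step 1: compare lengths
    std::unordered_map<char, int> m;           // step 2: the count map
    for (char c : s1) ++m[c];                  // step 3: count s1
    for (char c : s2) {
        // step 4: a negative count means s2 has more of c than s1
        if (--m[c] < 0) return false;
    }
    return true;
}
```

For the words from the question, is_anagram_count("listened", "enlisted") returns true.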
The algorithm also fails when asked to find if aa and aa are anagrams. Try tracing the steps of the algorithm mentally or in a debugger to find why; you'll learn more that way.
By the way.. The usual method for finding anagrams is counting how many times each letter appears in the strings. The counts should be equal for each letter. This approach has O(n) time complexity as opposed to O(n²).
#define NO_OF_CHARS 256 // assumed value: one counter per possible unsigned char
bool areAnagram(char *str1, char *str2)
{
// Create two count arrays and initialize all values as 0
int count1[NO_OF_CHARS] = {0};
int count2[NO_OF_CHARS] = {0};
int i;
// For each character in input strings, increment count in
// the corresponding count array
for (i = 0; str1[i] && str2[i]; i++)
{
count1[(unsigned char)str1[i]]++; // cast avoids a negative index when char is signed
count2[(unsigned char)str2[i]]++;
}
// If both strings are of different length. Removing this condition
// will make the program fail for strings like "aaca" and "aca"
if (str1[i] || str2[i])
return false;
// Compare count arrays
for (i = 0; i < NO_OF_CHARS; i++)
if (count1[i] != count2[i])
return false;
return true;
}
I see 2 main approaches below:
Sort then compare
Count the occurrences of each letter
It's interesting to see that Suraj's nice solution got one point (by me, at the time of writing) but a sort one got 22. The explanation is that performance wasn't in people's mind - and that's fine for short strings.
The sort implementation is only 3 lines long, but the counting one beats it square for long strings. It is much faster (O(N) versus O(NlogN)).
Got the following results with 500 MBytes long strings.
Sort - 162.8 secs
Count - 2.864 secs
Multi threaded Count - 3.321 secs
The multi threaded attempt was a naive one that tried to double the speed by counting in separate threads, one for each string. Memory access is the bottleneck and this is an example where multi threading makes things a bit worse.
I would be happy to see ideas for speeding up the count solution (perhaps from someone experienced with memory latency and cache issues).
#include<stdio.h>
#include<string.h>
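// Caveat: strspn only checks that each string is made of characters found in the
// other, so repeated letters are not counted ("aab" vs "abb" is accepted).
int is_anagram(char* str1, char* str2){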
if(strlen(str1)==strspn(str1,str2) && strlen(str1)==strspn(str2,str1) &&
strlen(str1)==strlen(str2))
return 1;
return 0;
}
int main(){
char* str1 = "stream";
char* str2 = "master";
if(is_anagram(str1,str2))
printf("%s and %s are anagram to each other",str1,str2);
else
printf("%s and %s are not anagram to each other",str1,str2);
return 0;
}
#include<iostream>
#include<unordered_map>
using namespace std;
int checkAnagram (string &str1, string &str2)
{
unordered_map<char,int> count1, count2;
unordered_map<char,int>::iterator it1, it2;
int isAnagram = 0;
if (str1.size() != str2.size()) {
return -1;
}
for (unsigned int i = 0; i < str1.size(); i++) {
if (count1.find(str1[i]) != count1.end()){
count1[str1[i]]++;
} else {
count1.insert(pair<char,int>(str1[i], 1));
}
}
for (unsigned int i = 0; i < str2.size(); i++) {
if (count2.find(str2[i]) != count2.end()) {
count2[str2[i]]++;
} else {
count2.insert(pair<char,int>(str2[i], 1));
}
}
for (unordered_map<char, int>::iterator itUm1 = count1.begin(); itUm1 != count1.end(); itUm1++) {
unordered_map<char, int>::iterator itUm2 = count2.find(itUm1->first);
if (itUm2 == count2.end() || itUm1->second != itUm2->second) {
// A character missing from str2, or a different count, rules out an anagram.
isAnagram = -1;
break;
}
}
return isAnagram;
}
int main(void)
{
string str1("WillIamShakespeare");
string str2("IamaWeakishSpeller");
cout << "checkAnagram() for " << str1 << "," << str2 << " : " << checkAnagram(str1, str2) << endl;
return 0;
}
It's funny how sometimes the best questions are the simplest.
The problem here is how to deduce whether two words are anagrams - a word being essentially an unsorted multiset of chars.
We know we have to sort, but ideally we'd want to avoid the time-complexity of sort.
It turns out that in many cases we can eliminate many words that are dissimilar in linear time by running through them both and XOR-ing the character values into an accumulator. The total XOR of all characters in both strings must be zero if both strings are anagrams, regardless of ordering. This is because anything xored with itself becomes zero.
Of course the inverse is not true. Just because the accumulator is zero does not mean we have an anagram match.
Using this information, we can eliminate many non-anagrams without a sort, short-circuiting at least the non-anagram case.
#include <iostream>
#include <string>
#include <algorithm>
//
// return a sorted copy of a string
//
std::string sorted(std::string in)
{
std::sort(in.begin(), in.end());
return in;
}
//
// check whether xor-ing the values in two ranges results in zero.
// #pre first2 addresses a range that is at least as big as (last1-first1)
//
bool xor_is_zero(std::string::const_iterator first1,
std::string::const_iterator last1,
std::string::const_iterator first2)
{
char x = 0;
while (first1 != last1) {
x ^= *first1++;
x ^= *first2++;
}
return x == 0;
}
//
// deduce whether two strings are the same length
//
bool same_size(const std::string& l, const std::string& r)
{
return l.size() == r.size();
}
//
// deduce whether two words are anagrams of each other
// I have passed by const ref because we may not need a copy
//
bool is_anagram(const std::string& l, const std::string& r)
{
return same_size(l, r)
&& xor_is_zero(l.begin(), l.end(), r.begin())
&& sorted(l) == sorted(r);
}
// test
int main() {
using namespace std;
auto s1 = "apple"s;
auto s2 = "eppla"s;
cout << is_anagram(s1, s2) << '\n';
s2 = "pppla"s;
cout << is_anagram(s1, s2) << '\n';
return 0;
}
expected:
1
0
Try this:
// Anagram. Two words are said to be anagrams of each other if the letters from one word can be rearranged to form the other word.
// From the above definition it is clear that two strings are anagrams if all characters in both strings occur same number of times.
// For example "xyz" and "zxy" are anagram strings, here every character 'x', 'y' and 'z' occur only one time in both strings.
#include <map>
#include <string>
#include <cctype>
#include <iostream>
#include <algorithm>
#include <unordered_map>
using namespace std;
bool IsAnagram_1( string w1, string w2 )
{
// Compare string lengths
if ( w1.length() != w2.length() )
return false;
sort( w1.begin(), w1.end() );
sort( w2.begin(), w2.end() );
return w1 == w2;
}
map<char, size_t> key_word( const string & w )
{
// Declare a map which is an associative container that will store a key value and a mapped value pairs
// The key value is a letter in a word and the maped value is the number of times this letter appears in the word
map<char, size_t> m;
// Step over the characters of string w and use each character as a key value in the map
for ( auto & c : w )
{
// Access the mapped value directly by its corresponding key using the bracket operator
++m[toupper( c )];
}
return ( m );
}
bool IsAnagram_2( const string & w1, const string & w2 )
{
// Compare string lengths
if ( w1.length() != w2.length() )
return false;
return ( key_word( w1 ) == key_word( w2 ) );
}
bool IsAnagram_3( const string & w1, const string & w2 )
{
// Compare string lengths
if ( w1.length() != w2.length() )
return false;
// Instantiate a count map, std::unordered_map<char, unsigned int> m
unordered_map<char, size_t> m;
// Loop over the characters of string w1 incrementing the count for each character
for ( auto & c : w1 )
{
// Access the mapped value directly by its corresponding key using the bracket operator
++m[toupper(c)];
}
// Loop over the characters of string w2 decrementing the count for each character
for ( auto & c : w2 )
{
// Access the mapped value directly by its corresponding key using the bracket operator
--m[toupper(c)];
}
// Check to see if the mapped values are all zeros
for ( auto & c : w2 )
{
if ( m[toupper(c)] != 0 )
return false;
}
return true;
}
int main( )
{
string word1, word2;
cout << "Enter first word: ";
cin >> word1;
cout << "Enter second word: ";
cin >> word2;
if ( IsAnagram_1( word1, word2 ) )
cout << "\nAnagram" << endl;
else
cout << "\nNot Anagram" << endl;
if ( IsAnagram_2( word1, word2 ) )
cout << "\nAnagram" << endl;
else
cout << "\nNot Anagram" << endl;
if ( IsAnagram_3( word1, word2 ) )
cout << "\nAnagram" << endl;
else
cout << "\nNot Anagram" << endl;
system("pause");
return 0;
}
In this approach I took care of empty strings and repeated characters as well. Enjoy it and comment on any limitations.
#include <iostream>
#include <map>
#include <string>
using namespace std;
bool is_anagram( const string& a, const string& b ){
    // Empty strings are not treated as anagrams; different lengths never match.
    if (a.empty() || a.length() != b.length())
        return false;
    std::map<char, int> m;
    // Count every character of a.
    for (char c : a)
        m[c]++;
    // Remove every character of b; a negative count means b has a surplus.
    for (char c : b) {
        if (--m[c] < 0)
            return false;
    }
    // Equal lengths and no surplus in b mean all counts are back to zero.
    return true;
}
Check if the two strings have identical counts for each unique char.
bool is_Anagram_String(char* str1,char* str2){
int first_len=(int)strlen(str1);
int sec_len=(int)strlen(str2);
if (first_len!=sec_len)
return false;
int letters[256] = {0};
int num_unique_chars = 0;
int num_completed_t = 0;
for(int i=0;i<first_len;++i){
int char_letter=(int)str1[i];
if(letters[char_letter]==0)
++num_unique_chars;
++letters[char_letter];
}
for (int i = 0; i < sec_len; ++i) {
int c = (int) str2[i];
if (letters[c] == 0) { // Found more of char c in t than in s.
return false;
}
--letters[c];
if (letters[c] == 0) {
++num_completed_t;
if (num_completed_t == num_unique_chars) {
// it’s a match if t has been processed completely
return i == sec_len - 1;
}
}
}
return false;}
#include <iostream>
#include <string.h>
using namespace std;
const int MAX = 100;
char cadA[MAX];
char cadB[MAX];
bool chrLocate;
int i,m,n,j, contaChr;
void buscaChr(char [], char []);
int main() {
cout << "Enter CadA: ";
cin.getline(cadA, sizeof(cadA));
cout << "Enter CadB: ";
cin.getline(cadB, sizeof(cadB));
if ( strlen(cadA) == strlen(cadB) ) {
buscaChr(cadA,cadB);
} else {
cout << "They are not anagrams..." << endl;
}
return 0;
}
void buscaChr(char a[], char b[]) {
j = 0;
contaChr = 0;
for ( i = 0; ( (i < strlen(a)) && contaChr < 2 ); i++ ) {
for ( m = 0; m < strlen(b); m++ ) {
if ( a[i] == b[m]) {
j++;
contaChr++;
a[i] = '-';
b[m] = '+';
} else { contaChr = 0; }
}
}
if ( j == strlen(a)) {
cout << "They ARE anagrams..." << endl;
} else {
cout << "They are not anagrams..." << endl;
}
}
Your algorithm is incorrect. You're checking each character in the first word to see how many times that character appears in the second word. If the two words were 'aaaa', and 'aaaa', then that would give you a count of 16. A small alteration to your code would allow it to work, but give a complexity of N^2 as you have a double loop.
for(i=0;i<n1;i++)
for(j=0;j<n2;j++)
if(str1[i]==str2[j])
++c, str2[j] = 0; // 'cross off' letters as they are found.
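As a rough, self-contained sketch of the "cross off" idea above: the function name isAnagramCrossOff and the used flag vector are my own assumptions, not part of the original code. Instead of overwriting matched letters with 0, this version marks them in a separate vector, which leaves the input untouched. It is still O(N^2).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not from the original post.
// For each character of s1, find one not-yet-used matching character in s2
// and "cross it off". Quadratic in the string length.
bool isAnagramCrossOff(const std::string& s1, const std::string& s2)
{
    if (s1.size() != s2.size())
        return false;

    std::vector<bool> used(s2.size(), false); // which positions of s2 are crossed off

    for (char ch : s1) {
        bool found = false;
        for (std::size_t j = 0; j < s2.size(); ++j) {
            if (!used[j] && s2[j] == ch) {
                used[j] = true; // cross off this letter
                found = true;
                break;          // stop at the first match, so "aaaa" vs "aaaa" gives 4 matches, not 16
            }
        }
        if (!found)
            return false;       // ch has no remaining partner in s2
    }
    return true;
}
```

With this sketch, isAnagramCrossOff("aaaa", "aaaa") and isAnagramCrossOff("listen", "silent") should both return true, while strings of different lengths are rejected up front.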
I did some tests with anagram comparisons, comparing two strings of 72 characters each (the strings are always true anagrams, to force the maximum number of comparisons) and running the same test 256 times with a few different STL containers...
template<typename STORAGE>
bool isAnagram(const string& s1, const string& s2, STORAGE& asciiCount)
{
for(auto& v : s1)
{
asciiCount[static_cast<unsigned char>(v)]++; // same cast as below, so a negative char never indexes the vector
}
for(auto& v : s2)
{
if(--asciiCount[static_cast<unsigned char>(v)] == -1)
{
return false;
}
}
return true;
}
Where STORAGE asciiCount =
map<char, int> storage; // 738us
unordered_map<char, int> storage; // 260us
vector<int> storage(256); // 43us
// g++ -std=c++17 -O3 -Wall -pedantic
This is the fastest I can get.
These are crude tests using the coliru online compiler and std::chrono::steady_clock::time_point for measurements; however, they give a general idea of the performance gains.
vector<unsigned char> has the same performance and uses only 256 bytes, although strings are then limited to 255 characters in length (also change the check to: --asciiCount[static_cast<unsigned char>(v)] == 255 for unsigned char counting).
Assuming vector is the fastest, an improvement would be to allocate a C-style array unsigned char asciiCount[256]; on the stack (since STL containers allocate their memory dynamically on the heap).
You could probably reduce this storage to 128 bytes, 64 or even 32 bytes (ASCII chars are typically in the range 0..127, A-Z plus a-z in 64..127, and just upper or lower case in 64..95 or 96..127), although I am not sure what gains would come from fitting this inside a cache line or half of one.
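A minimal sketch of the stack-allocated variant described above, assuming 8-bit chars. The name isAnagramArray is hypothetical, and I have not timed this version. It uses std::array<int, 256> rather than unsigned char counters, so it takes more than 256 bytes but has no 255-character limit.

```cpp
#include <array>
#include <string>

// Hypothetical helper, not from the original post.
// Counts characters in a fixed-size array that lives on the stack,
// so no heap allocation is needed for the counters.
bool isAnagramArray(const std::string& s1, const std::string& s2)
{
    if (s1.size() != s2.size())
        return false;

    std::array<int, 256> counts{}; // value-initialized: all counters start at 0

    for (char c : s1)
        ++counts[static_cast<unsigned char>(c)];

    for (char c : s2) {
        // A counter dropping below zero means s2 has more of c than s1.
        if (--counts[static_cast<unsigned char>(c)] < 0)
            return false;
    }
    return true; // equal lengths and no negative counter: all counters are zero
}
```

Because the lengths are compared first, no final pass over the counters is needed: if none goes negative, they must all end at zero.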
Any better ways to do this? For Speed, Memory, Code Elegance?
1. Simple and fast way with deleting matched characters
bool checkAnagram(string s1, string s2) {
for (char i : s1) {
string::size_type pos = s2.find(i,0); // must be size_type: an unsigned int can never equal string::npos on 64-bit builds
if (pos != string::npos) {
s2.erase(pos,1);
} else {
return false;
}
}
return s2.empty();
}
2. Conversion to prime numbers. Beautiful but very expensive, requires special Big Integer type for long strings.
// https://en.wikipedia.org/wiki/List_of_prime_numbers
int primes[255] = {2, 3, 5, 7, 11, 13, 17, 19, ... , 1613};
bool checkAnagramPrimes(string s1, string s2) {
long c1 = 1;
for (char i : s1) {
c1 = c1 * primes[i];
}
long c2 = 1;
for (char i : s2) {
c2 = c2 * primes[i];
if (c2 > c1) {
return false;
}
}
return c1 == c2;
}
string key="listen";
string key1="silent";
string temp=key1;
int len=0;
//assuming both strings are of equal length
for (int i=0;i<key.length();i++){
for (int j=0;j<key.length();j++){
if(key[i]==temp[j]){
len++;
temp[j] = ' ';//to deal with the duplicates
break;
}
}
}
cout << (len==key.length()); //if true: means the words are anagrams
Instead of using the .h headers, which are deprecated in modern C++, try this solution.
#include <iostream>
#include <string>
#include <map>
#include <cctype> // for std::tolower
int main(){
std::string word_1 {};
std::cout << "Enter first word: ";
std::cin >> word_1;
std::string word_2 {};
std::cout << "Enter second word: ";
std::cin >> word_2;
if(word_1.length() == word_2.length()){
std::map<char, int> word_1_map{};
std::map<char, int> word_2_map{};
for(auto& c: word_1)
word_1_map[std::tolower(c)]++;
for(auto& c: word_2)
word_2_map[std::tolower(c)]++;
if(word_1_map == word_2_map){
std::cout << "Anagrams" << std::endl;
}
else{
std::cout << "Not Anagrams" << std::endl;
}
}else{
std::cout << "Length Mismatch" << std::endl;
}
}
#include <bits/stdc++.h>
using namespace std;
#define NO_OF_CHARS 256
int main()
{ bool ans = true;
string word1 = "rest";
string word2 = "tesr";
unordered_map<char,int>maps;
for(size_t i = 0 ; i < word1.size() ; i++) // loop over the actual length instead of a hard-coded 5
{
maps[word1[i]] +=1;
}
for(size_t i = 0 ; i < word2.size() ; i++)
{
maps[word2[i]]-=1 ;
}
for(auto i : maps)
{
if(i.second!=0)
{
ans = false;
}
}
cout<<ans;
}
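If you don't want to sort, then this code will give you the right output for strings of equal length. Note that it does not compare the lengths first, so "ab" and "abc" would be reported as anagrams; add that check if the lengths can differ.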
#include <iostream>
using namespace std;
int main(){
string a="gf da";
string b="da gf";
int al,bl;
int counter =0;
al =a.length();
bl =b.length();
for(int i=0 ;i<al;i++){
for(int j=0;j<bl;j++){
if(a[i]==b[j]){
if(j!=bl){
b[j]=b[b.length()-counter-1];
bl--;
counter++;
break;
}else{
bl--;
counter++;
}
}
}
}
if(counter==al){
cout<<"true";
}
else{
cout<<"false";
}
return 0;
}
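Here is a simple and fast pre-check for anagrams. Note that equal sums are necessary but not sufficient: "ad" and "bc" both sum to 197 yet are not anagrams, so this check can only rule anagrams out, not confirm them.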
bool anagram(string a, string b) {
if (a.length() != b.length()) return false; // also keeps b[i] in range below
int a_sum = 0, b_sum = 0, i = 0;
while (a[i] != '\0') {
a_sum += (int)a[i]; // (int) cast not necessary
b_sum += (int)b[i];
i++;
}
return a_sum == b_sum;
}
Simply adds the ASCII values and checks if the sums are equal.
For example:
string a = "nap" and string b = "pan"
a_sum = 110 + 97 + 112 = 319
b_sum = 112 + 97 + 110 = 319
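One caveat is worth checking with this approach: different letters can add up to the same total, so equal sums alone do not prove two strings are anagrams. The sketch below assumes an ASCII character set, and the helper name asciiSum is hypothetical, not from the answer above. It reproduces the "nap" example and shows a collision.

```cpp
#include <string>

// Hypothetical helper, not from the original post.
// Returns the sum of the character codes of s.
int asciiSum(const std::string& s)
{
    int sum = 0;
    for (char c : s)
        sum += static_cast<unsigned char>(c);
    return sum;
}

// Sum-only comparison, as in the answer above, with a length check added.
bool sumsMatch(const std::string& a, const std::string& b)
{
    return a.size() == b.size() && asciiSum(a) == asciiSum(b);
}
```

In ASCII, asciiSum("nap") is 110 + 97 + 112 = 319, matching the example. But asciiSum("ad") is 97 + 100 = 197 and asciiSum("bc") is 98 + 99 = 197, so sumsMatch("ad", "bc") returns true even though the two are not anagrams. A per-character count is needed to confirm a match.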