Find All Palindrome Substrings in a String by Rearranging Characters - c++

For fun and practice, I have tried to solve the following problem (using C++): Given a string, return all the palindromes that can be obtained by rearranging its characters.
I've come up with an algorithm that doesn't work completely. Sometimes, it finds all the palindromes, but other times it finds some but not all.
It works by swapping each adjacent pair of characters N times, where N is the length of the input string. Here is the code:
std::vector<std::string> palindromeGen(std::string charactersSet) {
std::vector<std::string> pals;
for (const auto &c : charactersSet) {
for (auto i = 0, j = 1; i < charactersSet.length() - 1; ++i, ++j) {
std::swap(charactersSet[i], charactersSet[j]);
if (isPalandrome(charactersSet)) {
if (std::find(pals.begin(), pals.end(), charactersSet) == pals.end()) {
// if palindrome is unique
pals.push_back(charactersSet);
}
}
}
}
return pals;
}
What's the fault in this algorithm? I'm mostly concerned about the functionality of the algorithm, rather than the efficiency. Although I'll appreciate tips about efficiency as well. Thanks.

This probably fits a bit better in Code Review but here goes:
Logic Error
You change charactersSet while iterating over it, meaning that your iterator breaks. You need to make a copy of charactersSet, and iterate over that.
Things to Change
Since pals holds only unique values, it should be a std::set instead of a std::vector. This will simplify some things. Also, your isPalandrome method spells palindrome wrong!
Alternative Approach
Since palindromes can only take a certain form, consider sorting the input string first, so that you can build a list of characters with an even number of occurrences and a list of characters with an odd number. A palindrome can contain at most one character with an odd number of occurrences (and that is only possible for an odd-length input). This should let you discard a bunch of possibilities. Then you can work through the different possible arrangements of one half of the palindrome (since you can build the other half by mirroring it).
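The half-building idea can be sketched as follows. This is only one possible reading of the approach, not code from the answer; the function name palindromesFromHalves is made up for illustration, and the sketch assumes single-byte characters.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <vector>

// Sketch: build every distinct palindrome from the characters of `input`
// by permuting only one half. `palindromesFromHalves` is a hypothetical name.
std::vector<std::string> palindromesFromHalves(const std::string& input) {
    std::array<int, 256> counts{};
    for (unsigned char c : input) ++counts[c];

    std::string half;    // one character from every pair
    std::string middle;  // the single odd-count character, if any
    for (int c = 0; c < 256; ++c) {
        if (counts[c] % 2 != 0) {
            if (!middle.empty()) return {};  // two odd counts: no palindrome exists
            middle.push_back(static_cast<char>(c));
        }
        half.append(counts[c] / 2, static_cast<char>(c));
    }

    // next_permutation must start from the sorted arrangement.
    std::sort(half.begin(), half.end());

    std::vector<std::string> result;
    do {
        std::string pal = half + middle;
        pal.append(half.rbegin(), half.rend());  // mirror the first half
        result.push_back(pal);
    } while (std::next_permutation(half.begin(), half.end()));
    return result;
}
```

For the input "aabb" this yields "abba" and "baab". Only half of the characters are permuted, so far fewer candidates are generated than when permuting the whole string.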

Here is another implementation that leverages std::next_permutation:
#include <string>
#include <algorithm>
#include <set>
std::set<std::string> palindromeGen(std::string charactersSet)
{
std::set<std::string> pals;
std::sort(charactersSet.begin(), charactersSet.end());
do
{
// check if the string is the same backwards as forwards
if ( isPalindrome(charactersSet))
pals.insert(charactersSet);
} while (std::next_permutation(charactersSet.begin(), charactersSet.end()));
return pals;
}
We first sort the original string; std::next_permutation has to start from the sorted (lexicographically smallest) arrangement in order to visit every permutation. The loop calls isPalindrome on each permutation and, if the string is a palindrome, stores it in the set. Each subsequent call to std::next_permutation rearranges the string into the next permutation.
Here is a Live Example. Granted, it implements isPalindrome by comparing the string against a reversed copy of itself (probably not efficient), but you should get the idea.
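The isPalindrome helper itself is not shown in the answer. A minimal sketch that avoids the reversed copy, assuming a plain byte-wise comparison is what is wanted, could look like this:

```cpp
#include <algorithm>
#include <string>

// Returns true when `s` reads the same forwards and backwards.
// Compares the first half against the reversed second half, so no copy is made.
bool isPalindrome(const std::string& s) {
    return std::equal(s.begin(), s.begin() + s.size() / 2, s.rbegin());
}
```

Only half of the string needs to be walked, because the other half is visited through the reverse iterator.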

What is an efficient way to test many substrings of a string against a case in C++?

I am working on some problems on Hackerrank, where I need to determine if a string matches a pattern after 1 or 0 elements are removed. If, after removing 1 or 0 elements, every character in the string has the same frequency, then I want to print "YES". Otherwise I print "NO". The input is a string between 1 and 10^5 characters. What I have works for trivial cases, but times out on some of the test cases. I should reach some return value, but I think the test cases are just really large inputs that my code is too inefficient for. In particular, where I use erase, I am copying the string n times, where n is the length of the string. Can I work on the string in place, simply skipping over each element?
#include <iostream>
#include <string>
using namespace std;
long countFreq(char someChar,string someStr){
long count = 0;
for(int i=0;i<someStr.size();i++){
if(someStr[i] == someChar)count += 1;
}
return count;}
bool allSameFreq(string someStr){
long freq = countFreq(someStr[0],someStr);
for(char someChr:someStr){
if(countFreq(someChr,someStr)!=freq)return false;
}
return true;}
int main(){
string pattern;
cin>>pattern;
if(allSameFreq(pattern)==true){cout<<"YES";return 0;}
else {
for(int i=0;i<pattern.size();i++){
string copy = pattern;
copy.erase(i,1);
if(allSameFreq(copy)==true){cout<<"YES";return 0;}
}
cout<<"NO";
}
return 0;
}
Edit: It has been pointed out that this can be solved in another way that is less memory intensive. Still, I am curious about the original question: What is an efficient way to iterate over a string (or whatever) and test the "string minus each value" against a condition WITHOUT making a copy of the string each time?
I found a method that efficiently operates on a sublist without making a copy. You have to be able to access pointers to elements of the string, treating the string as a linked list. So depending on how you want to alter the string, you move a pointer or two, saving the values of the pointers you modified.
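For this particular condition, another way to avoid the copies is to count the character frequencies once and then simulate each removal on the counts rather than on the string. The sketch below is not the linked-list method described above; the names allNonZeroEqual and validAfterOneRemoval are hypothetical, and single-byte characters are assumed.

```cpp
#include <array>
#include <string>

// True when every nonzero entry of `counts` has the same value.
bool allNonZeroEqual(const std::array<long, 256>& counts) {
    long expected = 0;
    for (long c : counts) {
        if (c == 0) continue;
        if (expected == 0) expected = c;
        else if (c != expected) return false;
    }
    return true;
}

// Sketch: decides whether removing at most one character
// leaves every remaining character with the same frequency.
bool validAfterOneRemoval(const std::string& s) {
    std::array<long, 256> counts{};
    for (unsigned char c : s) ++counts[c];
    if (allNonZeroEqual(counts)) return true;

    // Removing any occurrence of a character has the same effect on the
    // counts, so it is enough to try each distinct character once.
    for (int c = 0; c < 256; ++c) {
        if (counts[c] == 0) continue;
        --counts[c];                        // "remove" one occurrence
        bool ok = allNonZeroEqual(counts);
        ++counts[c];                        // restore the table
        if (ok) return true;
    }
    return false;
}
```

The string is read once, and each trial removal touches only the fixed-size count table, so the running time is linear in the input length.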

Efficient way to remove a list of strings from a big vector

I am using Visual Studio 2012 (Windows) and I am trying to write an efficient C++ function to remove some words from a big vector of strings.
I am using STL algorithms. I am a C++ beginner, so I am not sure that this is the best way to proceed. This is what I have done:
#include <algorithm>
#include <unordered_set>
using std::vector;
vector<std::string> stripWords(vector<std::string>& input,
std::tr1::unordered_set<std::string>& toRemove){
input.erase(
remove_if(input.begin(), input.end(),
[&toRemove](std::string x) -> bool {
return toRemove.find(x) != toRemove.end();
}));
return input;
}
But this doesn't work: it doesn't process the whole input vector.
This is how I test my code:
vector<std::string> in_tokens;
in_tokens.push_back("removeme");
in_tokens.push_back("keep");
in_tokens.push_back("removeme1");
in_tokens.push_back("removeme1");
std::tr1::unordered_set<std::string> words;
words.insert("removeme");
words.insert("removeme1");
stripWords(in_tokens,words);
You need the two-argument form of erase. Don't outsmart yourself; write it on separate lines:
auto it = std::remove_if(input.begin(), input.end(),
[&toRemove](std::string x) -> bool
{ return toRemove.find(x) != toRemove.end(); });
input.erase(it, input.end()); // erases an entire range
Your approach using std::remove_if() is nearly correct, but it erases just one element. You need to use the two-argument version of erase():
input.erase(
remove_if(input.begin(), input.end(),
[&toRemove](std::string x) -> bool {
return toRemove.find(x) != toRemove.end();
}), input.end());
std::remove_if() reorders the elements such that the kept elements are in the front of the sequence. It returns an iterator it to the first position which is to be considered the new end of the sequence, i.e., you need to erase the range [it, input.end()).
You've already gotten a couple of answers about how to do this correctly.
Now, the question is whether you can make it substantially more efficient. The answer to that will depend on another question: do you care about the order of the strings in the vector?
If you can rearrange the strings in the vector without causing a problem, then you can make the removal substantially more efficient.
Instead of removing strings from the middle of the vector (which requires moving all the other strings over to fill in the hole) you can swap all the unwanted strings to the end of the vector, then remove them.
Especially if you're only removing a few strings from near the beginning of a large vector, this can improve efficiency a lot. Just for example, let's assume a string you want to remove is followed by 1000 other strings. With this, you end up swapping only two strings, then erasing the last one (which is fast). With your current method, you end up moving 1000 strings just to remove one.
Better still, even with fairly old compilers, you can expect swapping strings to be quite fast as a rule, typically faster than copying them into place would be (unless your compiler is new enough to support move assignment).
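One way to express the "swap the unwanted strings to the end, then erase" idea is std::partition. This is a sketch rather than the answerer's code: the name stripWordsUnordered is hypothetical, and it assumes a compiler that provides std::unordered_set instead of the std::tr1 version used in the question.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Sketch: removes every string found in `toRemove`.
// The relative order of the remaining strings is NOT preserved.
void stripWordsUnordered(std::vector<std::string>& input,
                         const std::unordered_set<std::string>& toRemove) {
    // Move the strings we keep to the front and the unwanted ones to the back.
    auto firstUnwanted = std::partition(
        input.begin(), input.end(),
        [&toRemove](const std::string& x) { return toRemove.count(x) == 0; });
    // Erase the unwanted tail in one go.
    input.erase(firstUnwanted, input.end());
}
```

Note that the lambda takes its argument by const reference, which avoids copying each string just to look it up.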

Replace swap implementation for built-in types in C++

I want to change the behavior of std::swap for the char type. According to what I have learned, the only way to do this is to add a template specialization for std::swap, isn't it?
Since char is a built-in type, we have no chance to use ADL.
Please give your advice for such cases.
Edit: Here is the original problem I needed to solve. Random shuffle a string except that non-alpha characters should keep their positions unchanged.
The first thing I want to do is to leverage the std::random_shuffle.
First: Don't do that. You may inadvertently break a different part of code that was previously working.
What you could try is to create your own class, make it hold only a single char element, and then add any fancy functionality to it that you like. This way you would have your own swap behavior without breaking somebody else's code.
However, if you still want to do that, try the following (running) example:
#include <algorithm>
#include <iostream>
namespace std {
template <>
void swap<char>(char& a, char& b) {
std::cerr << "Swapped " << a << " with " << b << "\n";
char t=a;
a=b;
b=t;
}
}
int main() {
char arr[] = {'a', 'z', 'b', 'y'};
std::reverse(arr, arr+4);
return 0;
}
Do note that some STL algorithms may be specialized for basic types and not use std::swap at all.
Regarding the edited question:
A fair shuffling algorithm is fairly simple:
for (i = 0 .. n-2) {
j = random (i .. n-1); //and NOT random (0 .. n-1)
swap(array[i], array[j]);
}
However, if you modify swap to prevent the operation when either of the arguments is not alphanumeric (I presume that's what you wanted to change swap into?), the remaining permutation is not going to be fair. As the number of non-alphanumeric characters increases, the chance that a given character won't move increases too. In the worst case, imagine a long string with only two alphanumeric characters: the chance of them getting swapped will be near 0.
If you want a fair permutation of only the alphanumeric characters, you can do one of the following:
a) Pretty straightforward way - extract the alphanumeric characters to separate array, shuffle, and then put them back.
Simple, no performance hit, but needs more memory.
b) If the number of nonalphanumeric characters is relatively low, you can repeat the dice roll:
for (i = 0 .. n-2) {
    if (!alphanumeric(array[i])) continue;
    do {
        j = random (i .. n-1);
    } while (!alphanumeric(array[j]));
    swap(array[i], array[j]);
}
This shuffling will still be fair, but it will take a lot of time when you have a lot of non-alphanumeric characters.
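Approach (a) can be sketched as follows. The name shuffleAlnum is hypothetical, and the sketch uses std::shuffle with a caller-supplied random engine instead of std::random_shuffle, which is an assumption about what is acceptable here.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <random>
#include <string>

// Sketch of approach (a): shuffle the alphanumeric characters of `s`
// while every other character keeps its position.
template <typename Rng>
void shuffleAlnum(std::string& s, Rng& rng) {
    // Extract the alphanumeric characters into a separate buffer.
    std::string letters;
    for (unsigned char c : s)
        if (std::isalnum(c)) letters.push_back(static_cast<char>(c));

    std::shuffle(letters.begin(), letters.end(), rng);

    // Put the shuffled characters back into the alphanumeric slots.
    std::size_t next = 0;
    for (char& c : s)
        if (std::isalnum(static_cast<unsigned char>(c))) c = letters[next++];
}
```

A call might look like `std::mt19937 rng(std::random_device{}()); shuffleAlnum(text, rng);`. The extra buffer costs memory proportional to the number of alphanumeric characters, which matches the trade-off described in (a).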

Word Frequency Statistics

In a pre-interview, I was faced with a question like this:
Given a string consisting of words separated by a single white space, print out the words in descending order sorted by the number of times they appear in the string.
For example an input string of “a b b” would generate the following output:
b : 2
a : 1
Firstly, I'd say it is not so clear that whether the input string is made up of single-letter words or multiple-letter words. If the former is the case, it could be simple.
Here is my thought:
int c[26] = {0};
char *pIn = strIn;
while (*pIn != 0)
{
    /* count only lowercase letters; spaces separate the words */
    if (*pIn >= 'a' && *pIn <= 'z')
        ++c[*pIn - 'a'];
    ++pIn;
}
/* how to sort the array c[26] and remember the original index? */
I can get the frequency of every single-letter word in the input string, and I can get it sorted (using QuickSort or whatever). But after the count array is sorted, how do I get the single-letter word associated with each count so that I can print them out in pairs later?
If the input string is made up of multiple-letter words, I plan to use a map<const char *, int> to track the frequency. But again, how do I sort the map's key-value pairs?
The question is in C or C++, and any suggestion is welcome.
Thanks!
I would use a std::map<std::string, int> to store the words and their counts. Then I would use something like this to get the words:
while(std::cin >> word) {
// increment map's count for that word
}
Finally, you just need to figure out how to print them in order of frequency; I'll leave that as an exercise for you.
You're definitely wrong in assuming that you need only 26 options, 'cause your employer will want to allow multiple-character words as well (and maybe even numbers?).
This means you're going to need an array with a variable length. I strongly recommend using a vector or, even better, a map.
To find the character sequences in the string, find your current position (start at 0) and the position of the next space. Then that's the word. Set the current position to the space and do it again. Keep repeating this until you're at the end.
By using the map you'll already have the word/count available.
If the job you're applying for requires university skills, I strongly recommend optimizing the map by adding some kind of hashing function. However, judging by the difficulty of the question I assume that that is not the case.
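The "find the next space, take the word, repeat" loop described above, combined with a map for the counts, might look like the sketch below. The function name countWords is hypothetical, and the input is assumed to use plain spaces as separators.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Sketch: counts the words in `text`, where words are separated by spaces.
std::map<std::string, int> countWords(const std::string& text) {
    std::map<std::string, int> counts;
    std::size_t start = 0;
    while (start <= text.size()) {
        // The word runs from `start` up to the next space (or the end).
        std::size_t space = text.find(' ', start);
        if (space == std::string::npos) space = text.size();
        if (space > start) ++counts[text.substr(start, space - start)];
        start = space + 1;  // continue just past the space
    }
    return counts;
}
```

Switching std::map to std::unordered_map gives the hashed variant mentioned above without changing the loop.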
Taking the C-language case:
I like brute-force, straightforward algos so I would do it in this way:
1. Tokenize the input string to give an unsorted array of words. I'll have to actually, physically move each word (because each is of variable length); and I think I'll need an array of char*, which I'll use as the arg to qsort( ).
2. qsort( ) (descending) that array of words. (In the COMPAR function of qsort(), pretend that bigger words are smaller words so that the array acquires descending sort order.)
3.a. Go through the now-sorted array, looking for subarrays of identical words. The end of a subarray, and the beginning of the next, is signalled by the first non-identical word I see.
3.b. When I get to the end of a subarray (or to the end of the sorted array), I know (1) the word and (2) the number of identical words in the subarray.
4. (EDIT: new step) Save, in another array (call it array2), a char* to a word in the subarray and the count of identical words in the subarray.
5. When there are no more words in the sorted array, I'm done with counting; it's time to print.
6. qsort( ) array2 by word frequency.
7. Go through array2, printing each word and its frequency.
I'M DONE! Let's go to lunch.
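The same sort-then-count idea can be sketched in C++ with std::sort standing in for qsort( ). This is an illustration under that substitution, not the answerer's C code, and the name wordFrequencies is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Sketch: returns (word, count) pairs, most frequent first.
std::vector<std::pair<std::string, int>> wordFrequencies(const std::string& text) {
    // Tokenize the input into an unsorted list of words.
    std::vector<std::string> words;
    std::istringstream in(text);
    for (std::string w; in >> w;) words.push_back(w);

    // Sort so that identical words become adjacent.
    std::sort(words.begin(), words.end());

    // Record each run of identical words together with its length.
    std::vector<std::pair<std::string, int>> runs;
    for (std::size_t i = 0; i < words.size();) {
        std::size_t j = i;
        while (j < words.size() && words[j] == words[i]) ++j;
        runs.emplace_back(words[i], static_cast<int>(j - i));
        i = j;
    }

    // Sort the runs by descending count; ties keep alphabetical order.
    std::stable_sort(runs.begin(), runs.end(),
                     [](const auto& a, const auto& b) { return a.second > b.second; });
    return runs;
}
```

For "a b b" the result is ("b", 2) followed by ("a", 1).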
All the answers prior to mine did not give really an answer.
Let us think on a potential solution.
There is a more or less standard approach for counting something in a container.
We can use an associative container like a std::map or a std::unordered_map. Here we associate a "key", in this case the word, with a value, in this case the count of that specific word.
And luckily the maps have a very nice index operator[]. This will look for the given key and, if found, return a reference to the value. If not found, then it will create a new entry with the key and return a reference to the new entry. So, in both cases, we will get a reference to the value used for counting. And then we can simply write:
std::unordered_map<std::string, int> counter{};
counter[word]++;
And that looks really intuitive.
After this operation, you already have the frequency table, either sorted by the key (the word) when using a std::map, or unsorted but with faster access when using a std::unordered_map.
Now you want to sort according to the frequency/count. Unfortunately this is not possible with maps.
Therefore we need to use a second container, like a std::vector, which we can then sort using std::sort with any given predicate, or we can copy the values into a container like a std::multiset, which implicitly orders its elements.
For getting the words out of a std::string, we simply use a std::istringstream and the standard extraction operator >>. No big deal at all.
And because writing out all these long names for the std containers is tedious, we create alias names with the using keyword.
After all this, we now write ultra compact code and fulfill the task with just a few lines of code:
#include <iostream>
#include <string>
#include <sstream>
#include <utility>
#include <set>
#include <unordered_map>
#include <type_traits>
#include <iomanip>
// ------------------------------------------------------------
// Create aliases. Save typing work and make code more readable
using Pair = std::pair<std::string, unsigned int>;
// Standard approach for counter
using Counter = std::unordered_map<Pair::first_type, Pair::second_type>;
// Sorted values will be stored in a multiset
struct Comp { bool operator ()(const Pair& p1, const Pair& p2) const { return (p1.second == p2.second) ? p1.first<p2.first : p1.second>p2.second; } };
using Rank = std::multiset<Pair, Comp>;
// ------------------------------------------------------------
std::istringstream text{ " 4444 55555 1 22 4444 333 55555 333 333 4444 4444 55555 55555 55555 22 "};
int main() {
Counter counter;
// Count
for (std::string word{}; text >> word; counter[word]++);
// Sort
Rank rank(counter.begin(), counter.end());
// Output
for (const auto& [word, count] : rank) std::cout << std::setw(15) << word << " : " << count << '\n';
}
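The program above takes the std::multiset route. The std::vector plus std::sort alternative mentioned earlier could be sketched like this; the function name rankByCount is hypothetical, and the tie-breaking rule mirrors the Comp struct above.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Sketch: copies a frequency table into a vector and sorts it by
// descending count, breaking ties alphabetically.
std::vector<std::pair<std::string, unsigned int>>
rankByCount(const std::unordered_map<std::string, unsigned int>& counter) {
    std::vector<std::pair<std::string, unsigned int>> ranked(counter.begin(), counter.end());
    std::sort(ranked.begin(), ranked.end(), [](const auto& a, const auto& b) {
        return a.second != b.second ? a.second > b.second : a.first < b.first;
    });
    return ranked;
}
```

A vector sorted once is usually a reasonable choice when the table is built first and only read afterwards.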

C++ string sort like a human being?

I would like to sort alphanumeric strings the way a human being would sort them. I.e., "A2" comes before "A10", and "a" certainly comes before "Z"! Is there any way to do this without writing a mini-parser? Ideally it would also put "A1B1" before "A1B10". I see the question "Natural (human alpha-numeric) sort in Microsoft SQL 2005" with a possible answer, but it uses various library functions, as does "Sorting Strings for Humans with IComparer".
Below is a test case that currently fails:
#include <set>
#include <iterator>
#include <iostream>
#include <sstream>
#include <string>
#include <cctype>
#include <vector>
#include <cassert>
template <typename T>
struct LexicographicSort {
inline bool operator() (const T& lhs, const T& rhs) const{
std::ostringstream s1,s2;
s1 << toLower(lhs); s2 << toLower(rhs);
bool less = s1.str() < s2.str();
//Answer: bool less = doj::alphanum_less<std::string>()(s1.str(), s2.str());
std::cout<<s1.str()<<" "<<s2.str()<<" "<<less<<"\n";
return less;
}
inline std::string toLower(const std::string& str) const {
std::string newString("");
for (std::string::const_iterator charIt = str.begin();
charIt!=str.end();++charIt) {
newString.push_back(std::tolower(*charIt));
}
return newString;
}
};
int main(void) {
const std::string reference[5] = {"ab","B","c1","c2","c10"};
std::vector<std::string> referenceStrings(&(reference[0]), &(reference[5]));
//Insert in reverse order so we know they get sorted
std::set<std::string,LexicographicSort<std::string> > strings(referenceStrings.rbegin(), referenceStrings.rend());
std::cout<<"Items:\n";
std::copy(strings.begin(), strings.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
std::vector<std::string> sortedStrings(strings.begin(), strings.end());
assert(sortedStrings == referenceStrings);
}
Is there any way to do this without writing a mini-parser?
Let someone else do that?
I'm using this implementation: http://www.davekoelle.com/alphanum.html, I've modified it to support wchar_t, too.
It really depends what you mean by "parser." If you want to avoid writing a parser, I would think you should avail yourself of library functions.
Treat the string as a sequence of subsequences which are uniformly alphabetic, numeric, or "other."
Get the next alphanumeric sequence of each string using isalnum and backtrack-checking for + or - if it is a number. Use strtold in-place to find the end of a numeric subsequence.
If one is numeric and one is alphabetic, the string with the numeric subsequence comes first.
If one string has run out of characters, it comes first.
Use strcoll to compare alphabetic subsequences within the current locale.
Use strtold to compare numeric subsequences within the current locale.
Repeat until finished with one or both strings.
Break ties with strcmp.
This algorithm has something of a weakness in comparing numeric strings which exceed the precision of long double.
Is there any way to do it without writing a mini parser? I would think the answer is no. But writing a parser isn't that tough. I had to do this a while ago to sort our company's stock numbers. Basically just scan the number and turn it into an array. Check the "type" of every character: alpha, number, maybe you have others you need to deal with special. Like I had to treat hyphens special because we wanted A-B-C to sort before AB-A. Then start peeling off characters. As long as they are the same type as the first character, they go into the same bucket. Once the type changes, you start putting them in a different bucket. Then you also need a compare function that compares bucket-by-bucket. When both buckets are alpha, you just do a normal alpha compare. When both are digits, convert both to integer and do an integer compare, or pad the shorter to the length of the longer or something equivalent. When they're different types, you'll need a rule for how those compare, like does A-A come before or after A-1 ?
It's not a trivial job and you have to come up with rules for all the odd cases that may arise, but I would think you could get it together in a few hours of work.
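A small comparator along these lines might look like the sketch below. It is a simplified illustration, not the stock-number code described above: the name naturalLess is hypothetical, there is no special handling of hyphens or signs, and digit runs are compared as text (by length, then digit by digit) so that long numbers cannot overflow.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Sketch of a "natural" less-than: digit runs compare by numeric value,
// everything else compares case-insensitively, character by character.
bool naturalLess(const std::string& a, const std::string& b) {
    auto isDigit = [](char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; };
    auto lower = [](char c) { return std::tolower(static_cast<unsigned char>(c)); };

    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (isDigit(a[i]) && isDigit(b[j])) {
            // Skip leading zeros, then find the end of each digit run.
            while (i < a.size() && a[i] == '0') ++i;
            while (j < b.size() && b[j] == '0') ++j;
            std::size_t ei = i, ej = j;
            while (ei < a.size() && isDigit(a[ei])) ++ei;
            while (ej < b.size() && isDigit(b[ej])) ++ej;
            // More significant digits means a larger number.
            if (ei - i != ej - j) return ei - i < ej - j;
            // Same length: lexicographic order equals numeric order.
            int cmp = a.compare(i, ei - i, b, j, ej - j);
            if (cmp != 0) return cmp < 0;
            i = ei;
            j = ej;
        } else {
            int ca = lower(a[i]), cb = lower(b[j]);
            if (ca != cb) return ca < cb;
            ++i;
            ++j;
        }
    }
    // The string that ran out of characters first sorts first.
    return a.size() - i < b.size() - j;
}
```

It can be passed to std::sort directly. Strings that differ only in letter case or in leading zeros compare as equivalent here, so a std::set using this comparator would treat them as duplicates.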
Without any parsing, there's no way to compare human written numbers (high values first with leading zeroes stripped) and normal characters as part of the same string.
The parsing doesn't need to be terribly complex though. Use a simple hash table to deal with things like case sensitivity and stripping special characters ('A'='a'=1, 'B'='b'=2, ... or 'A'=1, 'a'=2, 'B'=3, ..., '-'=0 (strip)), remap your string to an array of the hashed values, then collapse runs of digits (if a number is encountered and the last character was a number, multiply the last number by ten and add the current value to it).
From there, sort as normal.