I have two words, and I want to obtain every way of interleaving them. The relative order of the characters from each string has to be preserved.
Look at this example:
Input= "abc", "mn"
Output= "abcmn", "abmnc", "amnbc", "mnabc", "mabcn", "manbc", "mabnc", "ambnc", "ambcn", "abmcn"
I searched stackoverflow.com and arrived at the following code, but it doesn't work!
void print_towstring(const std::vector<int>& v, const std::string& s1, const std::string& s2)
{
std::size_t i1 = 0;
std::size_t i2 = 0;
for (int i : v) {
std::cout << ((i == 0) ? s1[i1++] : s2[i2++]);
}
std::cout << std::endl;
}
void towstring(const std::string& s1, const std::string& s2)
{
std::vector<int> v(s1.size(), 0);
v.insert(v.end(), s2.size(), 1);
do
{
print_towstring(v, s1, s2);
} while (std::next_permutation(v.begin(), v.end()));
}
int main(int argc, char *argv[])
{
towstring("abc", "mn");
return 0;
}
How can I write an algorithm for this kind of permutation/combination in C++?
I think you can do it recursively. Basically, at each step you create two branches: one where you append the first letter of the right-hand string to your current string, and another where you append the first letter of the left-hand string:
void AddNext(
std::string const& left,
std::string const& right,
std::string const& current,
std::vector< std::string >& results)
{
if (left.empty())
{
// current is a const reference, so build the finished string instead of appending in place
results.push_back(current + right);
return;
}
else if (right.empty())
{
results.push_back(current + left);
return;
}
else
{
AddNext(left, right.substr(1, right.size() -1), current + std::string(1, right[0]), results);
AddNext(left.substr(1, left.size() -1), right, current + std::string(1, left[0]), results);
}
}
It seems that you can represent a "permutation combination" by a sequence of 0's and 1's, where each number tells from which string to take next character, like this:
00101 - means abmcn
So, you now have to produce all such strings that have a given number of 0's and a given number of 1's (3 and 2 in your example). To do it, I guess, the easiest would be to iterate over all combinations of 0's and 1's, and throw away those that don't have the needed number of 1's.
I like to represent a string of bits by a number, starting from the least significant bit, e.g. 00101 corresponds to
0 * 2^0 +
0 * 2^1 +
1 * 2^2 +
0 * 2^3 +
1 * 2^4 = 20
(warning: this will only work for a limited string size - up to 32 - or whatever number of bits int has. For longer strings, the implementation could be adapted to 64 bits, but it's not worth it, because it would be too slow anyway)
To get a given bit from such a number:
int GetBit(int n, int b)
{
return (n >> b) & 1;
}
To convert such a number to a vector:
void ConvertNumberToVector(int n, std::vector<int>& v)
{
for (int b = 0; b < v.size(); ++b)
v[b] = GetBit(n, b);
}
Then you can use this vector with your print_towstring function.
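Putting the pieces above together could look roughly like the sketch below. The function name interleave_by_bits is my own (not from the question), and I use unsigned long long instead of int to get a bit more headroom; it still assumes the combined length is well below 64, and it is exponential in that length, so treat it as an illustration only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (the name is an assumption): builds every interleaving
// of s1 and s2 by scanning all bit patterns of length s1.size() + s2.size().
// A 0 bit takes the next character from s1, a 1 bit takes it from s2.
// Assumes the combined length is well below the bit width of unsigned long long.
std::vector<std::string> interleave_by_bits(const std::string& s1, const std::string& s2)
{
    const std::size_t total = s1.size() + s2.size();
    std::vector<std::string> results;
    for (unsigned long long n = 0; n < (1ULL << total); ++n) {
        // Count the 1 bits; keep only patterns with exactly s2.size() of them.
        std::size_t ones = 0;
        for (std::size_t b = 0; b < total; ++b)
            ones += (n >> b) & 1ULL;
        if (ones != s2.size())
            continue;

        // Read the pattern from the least significant bit upwards.
        std::string out;
        std::size_t i1 = 0, i2 = 0;
        for (std::size_t b = 0; b < total; ++b)
            out += ((n >> b) & 1ULL) ? s2[i2++] : s1[i1++];
        results.push_back(out);
    }
    return results;
}
```

Calling interleave_by_bits("abc", "mn") should give the 10 strings from the question, in increasing order of the underlying number.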
The (my) code works: http://ideone.com/IYYVZY
It just uses C++11.
For C++03 see http://ideone.com/ZHXSkt
You have to change the range-based for loop:
for (int e : v) -> for (std::size_t i = 0, size = v.size(); i != size; ++i) { int e = v[i]; ..
I am going to build upon the approach from anatolyg to provide a solution for n input strings; let's say, for the sake of simplicity, n <= 10 (see where I am going?). Notice how the relative ordering within one string never changes? This is the basis of the algorithm.
step 1:
You put your strings in an array or vector or whatever. To each string you assign a symbol of size one, like the character 0 for the first one, 1 for the second, up to 9.
step 2: you have a function which converts the inputs to a single string (or better, a vector), where each character is replaced by the symbol of the string it came from. In your case, the function is:
f("abc", "mn") => "00011"
step 3: you enumerate the permutations of the resulting string, in this case "00011". You were already on the right track with std::next_permutation()
step 4: you iterate over each of the resulting strings and use each symbol as a mask, something like
void mergewithmask(std::string& target, std::string& input, char mask )
{
int i = 0;//target index
int j = 0;//input index
for(i = 0; i < target.size(); i++)
{
if(target[i] == mask){
target[i] = input[j];
j++;
}
}
}
so
mergewithmask("01001","abc", '0') => "a1bc1"
mergewithmask("a1bc1","mn", '1') => "ambcn"
For this approach to work you need to use symbols which don't collide with your initial inputs. Using a vector of negative numbers, for instance, will guarantee no collision with a char array and allow an unlimited number of input strings...
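The four steps could be sketched as follows. This is only an illustration under my own assumptions: the name interleave_all is made up, and instead of overwriting mask characters in place I keep the labels in a separate vector of ints, which sidesteps the collision problem mentioned above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (the name is an assumption): all interleavings of any
// number of input strings, each keeping its own character order.
std::vector<std::string> interleave_all(const std::vector<std::string>& inputs)
{
    // Steps 1 and 2: one label per character; label k means "comes from inputs[k]".
    // The labels are generated in ascending order, as std::next_permutation expects.
    std::vector<int> labels;
    for (std::size_t k = 0; k < inputs.size(); ++k)
        labels.insert(labels.end(), inputs[k].size(), static_cast<int>(k));

    std::vector<std::string> results;
    do {
        // Step 4: replace each label by the next unused character of its string.
        std::vector<std::size_t> cursor(inputs.size(), 0);
        std::string out;
        for (int label : labels)
            out += inputs[label][cursor[label]++];
        results.push_back(out);
    } while (std::next_permutation(labels.begin(), labels.end())); // step 3
    return results;
}
```

For the inputs "abc" and "mn" this yields 10 strings, starting with "abcmn"; for three one-letter strings it yields all 6 orderings.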
I have to get every possible combination of a given string.
The string I get can be of various sizes but always contains only 1 and 0.
For example, these are the combinations I want to get with "101" as the input:
"000" "001" "010" "100" "110" "101" "011" "111".
I tried using std::next_permutation (C++20). I'm getting close, but this is not exactly what I want.
The final goal is to store every combination inside a string vector.
Below is what I tried with next_permutation
// I'm not using *using namespace std* no need to mention it
std::vector<std::string> generate_all_combinations(std::string base)
{
std::vector<std::string> combinations;
do {
combinations.push_back(base);
} while (std::next_permutation(base.begin(), base.end()));
return combinations;
}
When I print the vector's content I have :
"011" "101" "110".
The base strings "000" and "111" are not a problem I can generate those pretty easily. But I'm still lacking other combinations like "001" "010" "100".
Maybe this is not the answer you expect, but if you want every binary combination for a specified count of bits, this is one possible solution:
void getCombinations(std::vector<std::string>& str_list, int len)
{
uint64_t comb_count = 1ull << len;
std::string str;
for(uint64_t i = 0; i < comb_count; ++i) {
str.clear();
for(int j = 0; j < len; ++j)
str += ((i >> j) & 0x1) ? "1" : "0";
str_list.push_back(str);
}
}
std::next_permutation won't work here, you will have to craft your own function here.
For example, for a symbol list "ab" where the permutations have a length of 3, you can get the 4th permutation by converting the number 4 to base 2 (length of symbol list) and use that as an index table for your permutation.
so 4 becomes 100 in base 2, so the indices are {1, 0, 0} and for the symbol list that is 'b', 'a', 'a' therefore the string "baa".
Here is a possible implementation of this. It takes a symbol list, the permutation length, and the number of the current permutation. You can manually convert to any base by dividing by base^position and taking the result modulo the base. The base is simply the length of the symbol list:
template<std::integral T>
constexpr T int_pow(T b, T e) {
return (e == 0) ? T{ 1 } : b * int_pow(b, e - 1);
}
std::string get_permutation(const std::string& symbols, std::size_t permutation_size, std::size_t position) {
std::string permutation;
permutation.resize(permutation_size);
auto base = symbols.length();
for (std::size_t i = 0u; i < permutation_size; ++i) {
auto index = (position / int_pow(base, i)) % base;
permutation[permutation_size - i - 1] = symbols[index];
}
return permutation;
}
so calling this:
std::cout << get_permutation("ab", 3u, 4u) << '\n';
prints out
baa
with this function you can make a list_permutations function, that adds all permutations to a vector:
std::vector<std::string> list_permutations(const std::string& symbols, std::size_t permutation_size) {
auto result_size = std::size_t(std::pow(symbols.length(), permutation_size));
std::vector<std::string> result(result_size);
for (std::size_t i = 0u; i < result.size(); ++i) {
result[i] = get_permutation(symbols, permutation_size, i);
}
return result;
}
int main() {
auto list = list_permutations("01", 3u);
for (auto& i : list) {
std::cout << i << '\n';
}
}
output:
000
001
010
011
100
101
110
111
The problem says that you have a string consisting of digits and other characters. All you have to do is remove all the non-digit characters and divide the digits into blocks separated by '-'. A block holds 3 digits or 2 digits, but never 1 digit alone. For example:
input: "aasnd1df2d3dfg4gfd56f7gaad8ew9ds2sa1"
After the removal of the chars, it should be like "12345678921" and then it should be divided into blocks, so the final output would be like "123-456-789-21".
I made the char removal part, but I can't make the blocks division. Any ideas?
string removeNumbers(string str)
{
int current = 0;
string dig;
int len=0;
int ctr=0;
for(int i = 0; i < str.length(); i++){
if (isdigit(str[i])){
str[current] = str[i];
current++;
}
}
dig= str.substr(0,current);
return dig;
}
Your first function works. However, it is a little too complicated and should be refactored. Also, the name is wrong: it removes the NON-digits.
I will show a similar, but easier solution to you later in the code example.
Now the tricky part. You want to distribute the digits as evenly as possible over groups of at most a given size. I will explain the general approach for arbitrary group sizes.
Let us assume that you want to split an array of a given size into groups of a fixed maximum size: here, the array of digits (your string) into groups of size groupSize, in your example 3.
Then we can calculate the number of groups by a simple integer division: basically, the number of elements in the array (the number of digits in your string) divided by the group size. Since a plain integer division rounds down, and would for instance yield 0 for fewer than groupSize digits, we subtract one before dividing and add one afterwards, and get the simple formula:
const size_t numberOfResultingGroups = ((numberOfDigits-1) / groupSize) + 1 ;
This should be understandable. And how many elements will there be in one group? We can calculate this again with an integer division: the number of elements (digits) divided by the number of groups. Also very straightforward.
const size_t basicGoupSize = numberOfDigits / numberOfResultingGroups;
Also very easy to understand. But there may be a remainder as a result of the integer division. These remaining elements need also to be distributed.
We could calculate the remainder with a modulo division, but also in a classical way:
int digitsToDistribute = numberOfDigits - (numberOfResultingGroups * basicGoupSize);
Also this is simple to understand.
And note, the remainder (the digitsToDistribute) can of course never be bigger than the number of groups.
We can then in a loop use the "basicGoupSize" and add 1 to it, as long as there are remaining "digitsToDistribute". So, in each loop run, we check whether there are still digits left to distribute and, if so, add 1 to the "basicGoupSize" and then decrement "digitsToDistribute".
And so all digits will be evenly distributed.
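To make the arithmetic concrete, here is a small sketch that only computes the group lengths. The helper name groupSizes is my own invention and it assumes groupSize > 0; the formulas are the ones given above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (the name is an assumption): lengths of the groups,
// in output order. Assumes groupSize > 0.
std::vector<std::size_t> groupSizes(std::size_t numberOfDigits, std::size_t groupSize)
{
    std::vector<std::size_t> sizes;
    if (numberOfDigits == 0)
        return sizes;  // nothing to split

    // Number of groups: integer division, corrected by the offset of one
    const std::size_t groups = ((numberOfDigits - 1) / groupSize) + 1;
    // Basic length of every group
    const std::size_t basic = numberOfDigits / groups;
    // Remainder that still has to be spread over the first groups
    std::size_t toDistribute = numberOfDigits - groups * basic;

    for (std::size_t i = 0; i < groups; ++i) {
        std::size_t length = basic;
        if (toDistribute > 0) {  // hand out one extra digit while some remain
            ++length;
            --toDistribute;
        }
        sizes.push_back(length);
    }
    return sizes;
}
```

For the 11 digits of the example and a group size of 3 this gives 4 groups with a basic size of 2 and 3 digits left to distribute, so the lengths are 3, 3, 3, 2, which matches "123-456-789-21".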
Next is the handling of the dash. Many people think of it as: print something, then a dash, then something, then a dash, and so on.
But then the handling of the last dash may be difficult. And so, we turn the sequence around and print a dash, followed by something, followed by a dash, followed by something, and so on. And we simply suppress the first dash.
Everything together will lead us to the solution (This is one of many possible solutions):
#include <cctype>
#include <iostream>
#include <string>
#include <utility>
// Remove all non-digits from a string
std::string removeNonDigits(const std::string &str)
{
std::string dig;
for (unsigned int i = 0; i < str.length(); i++) {
if (::isdigit(str[i])) {
dig += str[i];
}
}
return dig;
}
int main() {
// Our test string
std::string test{ "aasnd1df2d3dfg4gfd56f7gaad8ew9ds2sa1sdf34sdff56sdff78sdf" };
// Remove all non digits
std::string digits{ removeNonDigits(test) };
// Our group length will be 3 (can be anything else)
constexpr size_t groupSize{ 3u };
// How many digits do we have in our test string
const size_t numberOfDigits = digits.length();
// Trivial case: Number of digits is smaller than the desired group size. Simply print it
if (numberOfDigits < groupSize) {
std::cout << digits << '\n';
}
else {
// Ok, now we want to build groups and distribute remainders
// How many groups will we have? Integer division by group size (corrected for a 1 offset)
const size_t numberOfResultingGroups{ ((numberOfDigits-1) / groupSize) + 1 };
// A basic group will have this size. Integer division of number Of digits by number of groups
const size_t basicGoupSize{ numberOfDigits / numberOfResultingGroups };
// Maybe not all digits are used up. This is the remainder.
// We will add 1 to the basic group size as long as there are remaining digits
int digitsToDistribute = numberOfDigits - (numberOfResultingGroups * basicGoupSize);
// Handling of dashes. We will not print a dash in front of the FIRST group
bool printDashInFrontOfGroup = false;
// Start position of substring
size_t stringStartPos{};
// Now print all groups
for (size_t i{}; i < numberOfResultingGroups; ++i) {
// Length of group. So base length + 1, as long as there are remaining digits to distribute
const size_t length = basicGoupSize + (((digitsToDistribute--) > 0) ? 1 : 0);
// Print dash and sub string
std::cout << (std::exchange(printDashInFrontOfGroup, true)?"-":"") << digits.substr(stringStartPos, length);
// And, next start position
stringStartPos += length;
}
}
return 0;
}
Edit
Some additional note:
The whole algorithm above is deterministic. It will give the mathematically correct, predictable solution for different group sizes.
As a basic rule: do your design, the mathematics, and the algorithm before you start coding. Then implement everything.
Example: Removing non digits from a string in C++ is a really very short one-liner:
// Remove all-non-digits
test = std::regex_replace(test, std::regex(R"([^0-9])"), "");
That is basically very simple and actually a no brainer. Anyway. People may do it differently, or more complicated. For whatever reason.
And as said, if you do your design before coding then you can come up with really compact solutions. One of many possible examples:
#include <iostream>
#include <string>
#include <regex>
#include <algorithm>
#include <vector>
#include <experimental/iterator>
auto splitToEqualParts(const std::string s, const size_t groupSize) {
std::vector<std::string> result{}; // Here we will store all the split string parts
int toDistribute = (int)s.size() - (((((int)s.size() - 1) / (int)groupSize) + 1) * (int)(s.size() / ((((int)s.size() - 1) / (int)groupSize) + 1)));
// Add all sub-string parts to result vector
for (size_t i{}, pos{}, length{}; i < (size_t)((((int)s.size() - 1) / (int)groupSize) + 1); ++i, pos += length) {
length = (s.size() / ((((int)s.size() - 1) / (int)groupSize) + 1)) + (toDistribute-- > 0) * 1;
result.push_back(s.substr(pos, length));
}
return result;
}
int main() {
// Some test string
std::string test{ "aasnd1df2d3dfg4gfd56f7gaad8ew9ds2sa1sdf34sdff56sdff78sdf" };
// Remove all-non-digits
test = std::regex_replace(test, std::regex(R"([^0-9])"), "");
// Split into equal parts
std::vector part{ splitToEqualParts(test, 3) };
std::copy(part.begin(), part.end(), std::experimental::make_ostream_joiner(std::cout, "-"));
return 0;
}
Compiled and tested with GCC and Clang. Using C++17.
Of course, the common subexpressions may be put in const variables, but there is basically no need: an optimizing compiler will typically do it for you.
The above will work for all group sizes (>0) and any number of digits in the string . . .
I suggest
#include <string>
#include <iostream>
#include <algorithm>
using namespace std;
string remove_chars(string & s)
{
string inter_result;
for_each(s.begin(), s.end(), [&inter_result](char c) { if (isdigit(c)) inter_result += c; });
size_t length = inter_result.length();
if (length <= 3)
{
return inter_result;
}
string result;
size_t pos = 0, step = 3;
while (pos < length)
{
if (length - pos - 3 == 1) step = 2;
if (result.length() > 0)
{
result += "-";
}
result += inter_result.substr(pos, step);
pos += step;
}
return result;
}
int main()
{
string s = "aasnd1df2d3dfg4gfd56f7gaad8ew9ds2sa1";
cout << remove_chars(s) << endl;
return 0;
}
I have code that should return the smallest number of deletions required to make two strings anagrams of each other:
#include <bits/stdc++.h>
using namespace std;
int makeAnagram(string a, string b) {
int count = 0;
for(auto it=a.begin(); it!=a.end(); it++){
if(find(b.begin(), b.end(), *it) == b.end()){
a.erase(it);
count++;
}
}
for(auto it = b.begin(); it != b.end(); it++){
if(find(a.begin(), a.end(), *it) == a.end()){
b.erase(it);
count++;
}
}
return count;
}
And it doesn't work at all, and I don't understand why. The main test is:
int main()
{
string a={'a','b','c'};
string b={'c','d','e'};
int res = makeAnagram(a, b);
cout << res << "\n";
return 0;
}
The console is supposed to print 4, but it prints 2 instead, and the strings a and b have 2 elements at the end of the program, when they should have size 1.
The problem with your approach is that you delete elements during the iteration without accounting for the effect on the iterator: erase invalidates it, so you should continue from the iterator that erase returns instead of incrementing the old one. Here is a simple approach:
int makeAnagram(string a, string b) {
int A = a.size();
int B = b.size();
int count = 0;
if (A > B)
{
for (auto i = b.begin(); i != b.end(); i++)
{
size_t t = a.find(*i);
if (t == std::string::npos)
{
count++;
}
else
{
a.erase(a.begin() + t);
}
}
count = count + A - (b.size() - count);
}
else
{for (auto i = a.begin(); i != a.end(); i++)
{
size_t t = b.find(*i);
if (t == std::string::npos)
{
count++;
}
else
{
b.erase(b.begin() + t);
}
}
count = count + B - (a.size() - count);
}
return count;
}
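As a side note, the iterator problem in the question's loops can also be fixed in isolation with the usual erase idiom. The sketch below uses a helper name of my own (eraseMissing) and only repairs the iteration; it keeps the question's logic, which does not account for a character occurring a different number of times in the two strings.

```cpp
#include <string>

// Hypothetical helper (the name is an assumption): removes from `s` every
// character that does not occur in `other` and returns how many were removed.
int eraseMissing(std::string& s, const std::string& other)
{
    int removed = 0;
    for (auto it = s.begin(); it != s.end(); ) {
        if (other.find(*it) == std::string::npos) {
            it = s.erase(it);  // erase returns the iterator following the removed character
            ++removed;
        } else {
            ++it;              // only advance when nothing was erased
        }
    }
    return removed;
}
```

With a = "abc" and b = "cde", calling eraseMissing(a, b) and then eraseMissing(b, a) removes 2 characters each time and leaves "c" in both strings, so the total is the expected 4.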
Hm, I thought that I had answered this question already somewhere else. But anyway, let's try again. The algorithm is what matters. And I rather doubt that there is a faster answer than mine below. But we never know . . .
And, as always, the most important thing is to find a good algorithm. And then, we maybe can do some good coding to get a fast solution. But most important is the algorithm.
Let's start to think about it. Let's start with 2 simple strings
abbccc
abbccc
They are identical, so there is nothing to erase. The result is 0. But how can we come to this conclusion? We could think of sorting, searching, or comparing character by character, but the correct approach is counting the occurrences of characters. That is nearly always done when dealing with anagrams. So, here we have for each string 1 a, 2 b, 3 c.
And if we compare the counts for each character in the strings, then they are the same.
If we remember our long-ago school days, or C code, or even better microcontroller assembler code, then we know that comparing can be done by subtracting. Let us look at some examples: 6-4=2 or 3-4= -1 or 7-7=0. So, that approach can be used.
Next example for 2 strings:
bbcccddd
abbccc
We already see by looking at it that we need to delete 3*"d" from the first string and one "a" from the second string. Overall 4 deletions. Let's look at the counts:
String a: b->2, c->3 d->3, String b: a->1, b->2, c->3
And, now let's compare, so subtract: a->0-1= -1, b->2-2=0, c->3-3=0, d->3-0=3.
And if we add up the absolute values of the deltas, then we have the result. 3+abs(-1)=4
OK, now, we can start to code this algorithm.
Read 2 source strings a and b from std::cin. For this we will use std::getline
Next we define a "counter" as an array. We assume that a char is 8 bits wide, so the maximum number of distinct characters is 256
We positively count all character occurrences of the first string
Now we do the comparison and counting in one step, by decrementing the counter for each occurrence of a character in the 2nd string
Then we accumulate all counters (for all occurrences of characters). We use the absolute value, because numbers could be negative.
Then we have the result.
Please note, you would need an array of only 26 counters, because the requirements state an input range of 'a'-'z' for the characters of the strings. But then we would need to map the character values 'a'-'z' to indices 0-25, by always subtracting 'a' from a character. With a little waste of space (230 unused counters), we can omit the subtraction.
Please see:
#include <cstdlib>
#include <iostream>
#include <string>
int main() {
// Here we will store the input, 2 strings to check
std::string a{}, b{};
// Read the strings from std::cin
std::getline(std::cin, a);
std::getline(std::cin, b);
// Here we will count the occurrence of characters.
// We assume a char type with a width of 8 bit
int counter[256]{};
// Count occurrence of characters in string a
// And count occurrence of characters in string b negatively
// Iterate as unsigned char so that the array index can never be negative
for (const unsigned char c : a) ++counter[c];
for (const unsigned char c : b) --counter[c];
// Calculate result
int charactersToDeleteForAnagram{};
for (int c : counter) charactersToDeleteForAnagram += std::abs(c);
std::cout << charactersToDeleteForAnagram << '\n';
return 0;
}
We can also write this in more idiomatic C++, where we use input checking, a std::unordered_map for counting and std::accumulate for summing up. Then the internal representation of the char type doesn't matter either. And the principle is the same.
I do not know, if this is that much slower . . .
Please see:
#include <cstdlib>
#include <iostream>
#include <string>
#include <unordered_map>
#include <numeric>
int main() {
// Here we will store the input, 2 strings to check
std::string aString{}, bString{};
// Read the strings from std::cin
if (std::getline(std::cin, aString) && std::getline(std::cin, bString)) {
// Here we will count the occurrence of characters.
// The map works regardless of the width of the char type
std::unordered_map<char, int> counter{};
// Count occurrence of characters in string a
// And count occurrence of characters in string b negatively
for (const char character : aString) counter[character]++;
for (const char character : bString) counter[character]--;
// Calculate result and show to user
std::cout << std::accumulate(counter.begin(), counter.end(), 0U,
[](size_t sum, const auto& counter) { return sum + std::abs(counter.second); }) << '\n';
}
else std::cerr << "\nError: Problem with input\n";
return 0;
}
If you should have any question, then please ask.
Language: C++ 17
Compiled and tested with MS Visual Studio 2019 Community Edition
Suppose I have a string "abcdpqrs".
Now "dcb" can be counted as a substring of the above string, as its characters appear together (in any order).
Also "pdq" is a part of the above string. But "bcpq" is not. I hope you get what I want.
Is there any efficient way to do this?
All I can think of is using a hash to do this. But it takes a long time even in an O(n) program, as backtracking is required in many cases. Any help will be appreciated.
Here is an O(n * alphabet size) solution:
Let's maintain an array count[a] = how many times the character a occurs in the current window [pos; pos + length of substring - 1]. It can be recomputed in O(1) time when the window is moved by 1 to the right (count[s[pos]]--, count[s[pos + substring length]]++, pos++). Now all we need is to check for each pos that the count array is the same as the count array for the substring (which needs to be computed only once).
It can actually be improved to O(n + alphabet size):
Instead of comparing count arrays in a naive way, we can maintain the number diff = number of characters whose count in the current window differs from their count in the substring. The key observation is that diff changes in an obvious way when we apply count[c]-- or count[c]++ (it gets incremented, decremented, or stays the same, depending only on the value of count[c]). The two count arrays are the same if and only if diff is zero for the current pos.
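A possible sketch of the improved version follows. The function name containsPermutation is my own, I assume 8-bit chars (256 counters), and as a small variation I keep one array of differences (window count minus substring count) instead of two separate count arrays; diff is then the number of non-zero entries.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper (the name is an assumption): true if some window of
// `text` is a permutation of `pattern`. Assumes 8-bit chars.
bool containsPermutation(const std::string& text, const std::string& pattern)
{
    const std::size_t m = pattern.size();
    if (m == 0) return true;
    if (m > text.size()) return false;

    // balance[c] = (count of c in the window) - (count of c in the pattern)
    std::array<int, 256> balance{};
    std::size_t diff = 0;  // number of characters whose balance is not zero

    // Applies count[c]++ (delta = +1) or count[c]-- (delta = -1) and keeps diff in sync.
    auto add = [&](char ch, int delta) {
        int& b = balance[static_cast<unsigned char>(ch)];
        if (b == 0) ++diff;  // was balanced, becomes unbalanced
        b += delta;
        if (b == 0) --diff;  // has just become balanced
    };

    for (char ch : pattern) add(ch, -1);
    for (std::size_t i = 0; i < m; ++i) add(text[i], +1);
    if (diff == 0) return true;

    // Slide the window one position to the right at a time.
    for (std::size_t pos = 0; pos + m < text.size(); ++pos) {
        add(text[pos], -1);      // character leaving the window
        add(text[pos + m], +1);  // character entering the window
        if (diff == 0) return true;
    }
    return false;
}
```

With the question's string "abcdpqrs" this reports true for "dcb" and "pdq" and false for "bcpq".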
Let's say you have the string "axcdlef" and want to search for "opde":
bool compare (string s1, string s2)
{
// sort both copies (needs <algorithm>); permutations of each other sort to the same string
std::sort(s1.begin(), s1.end());
std::sort(s2.begin(), s2.end());
return s1 == s2;
}
You would need to call this function, for this example, with the following substrings of size 4 (the same length as "opde"):
"axcd"
"xcdl"
"cdle"
"dlef"
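bool exist = false;
// text is the string searched in, search is the string searched for (names chosen for illustration)
for (std::size_t i = 0; i + search.size() <= text.size(); ++i)
exist = exist || compare(text.substr(i, search.size()), search);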
You can use a regex (e.g. boost or Qt) for this. Alternatively, you can use this simple approach. You know the length k of the string s to be searched for in string str. So take each run of k consecutive characters from str and check whether every character of s is present in it (counting repetitions).
Starting point (a naive implementation, open to further optimization):
#include <iostream>
#include <string>
/* pos position where to extract probable string from str
* s string set with possible repetitions being searched in str
* str original string
*/
bool find_in_string( int pos, std::string s, std::string str)
{
std::string str_s = str.substr( pos, s.length());
int s_pos = 0;
while( !s.empty())
{
std::size_t found = str_s.find( s[0]);
if ( found!=std::string::npos)
{
s.erase( 0, 1);
str_s.erase( found, 1);
} else return 0;
}
return 1;
}
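bool find_in_string( std::string s, std::string str)
{
// guard: the unsigned subtraction below would wrap around if s were longer than str
if ( s.length() > str.length()) return false;
bool found = false;
int pos = 0;
while( !found && pos < str.length() - s.length() + 1)
{
found = find_in_string( pos++, s, str);
}
return found;
}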
Usage:
int main() {
std::string s1 = "abcdpqrs";
std::string s2 = "adcbpqrs";
std::string searched = "dcb";
std::string searched2 = "pdq";
std::string searched3 = "bcpq";
std::cout << find_in_string( searched, s1);
std::cout << find_in_string( searched, s2);
std::cout << find_in_string( searched2, s1);
std::cout << find_in_string( searched3, s1);
return 0;
}
prints: 1110
http://ideone.com/WrSMeV
To use an array for this you are going to need some extra code to map where each character goes in there... Unless you know you are only using 'a' - 'z' or something similar that you can simply subtract from 'a' to get the position.
bool compare(string s1, string s2)
{
// maps each distinct character of s1 to a slot index (needs <map> and <vector>)
map<char, int> mymap;
int count = 0;
for (char letter : s1)
if (mymap.find(letter) == mymap.end())
mymap[letter] = count++;
// a character of s2 that is missing from the map cannot be matched,
// so the strings differ and we can return early
for (char letter : s2)
if (mymap.find(letter) == mymap.end())
return false;
// count now holds the number of distinct chars,
// so only 'count' positions are needed in the vectors
vector<int> v1(count, 0);
vector<int> v2(count, 0);
for (char letter : s1)
v1[mymap[letter]]++;
for (char letter : s2)
v2[mymap[letter]]++;
for (int i = 0; i < count; i++)
if (v1[i] != v2[i])
return false;
return true;
}
Here is an O(m) best case, O(m!) worst case solution, m being the length of your search string:
Use a suffix trie, e.g. an Ukkonen trie (there are some floating around, but I have no link at hand at the moment), and search for any permutation of the substring. Note that any lookup needs just O(1) for each character of the string to search, regardless of the size of n.
However, while the size of n does not matter, this becomes impractical for large m.
If, however, n is small enough and one is willing to sacrifice index size for lookup performance, the suffix trie can store a string that contains all permutations of the original string.
Then the lookup will always be O(m).
I'd suggest going with the accepted answer for the general case. However, here you have a suggestion that can perform (much) better for small substrings and a large string.
I'm posting this on behalf of a friend since I believe this is pretty interesting:
Take the string "abb". By leaving out any number of letters less than the length of the string we end up with 7 strings.
a b b ab ab bb abb
Out of these 4 are palindromes.
Similarly for the string
"hihellolookhavealookatthispalindromexxqwertyuiopasdfghjklzxcvbnmmnbvcxzlkjhgfdsapoiuytrewqxxsoundsfamiliardoesit"
(a length 112 string) 2^112 - 1 strings can be formed.
Out of these how many are palindromes??
Below is his implementation (in C++, though C is fine too). It's pretty slow with very long words; he wants to know what the fastest possible algorithm for this is (and I'm curious too :D).
#include <iostream>
#include <cstring>
using namespace std;
void find_palindrome(const char* str, const char* max, long& count)
{
for(const char* begin = str; begin < max; begin++) {
count++;
const char* end = strchr(begin + 1, *begin);
while(end != NULL) {
count++;
find_palindrome(begin + 1, end, count);
end = strchr(end + 1, *begin);
}
}
}
int main(int argc, char *argv[])
{
const char* s = "hihellolookhavealookatthis";
long count = 0;
find_palindrome(s, strlen(s) + s, count);
cout << count << endl;
}
First of all, your friend's solution seems to have a bug since strchr can search past max. Even if you fix this, the solution is exponential in time.
For a faster solution, you can use dynamic programming to solve this in O(n^3) time. This will require O(n^2) additional memory. Note that for long strings, even 64-bit ints as I have used here will not be enough to hold the solution.
#include <cstring>
#define MAX_SIZE 1000 // the string length must be smaller than this
long long numFound[MAX_SIZE][MAX_SIZE]; //intermediate results, indexed by [startPosition][endPosition]
long long countPalindromes(const char *str) {
int len = strlen(str);
for (int startPos=0; startPos<=len; startPos++)
for (int endPos=0; endPos<=len; endPos++)
numFound[startPos][endPos] = 0;
for (int spanSize=1; spanSize<=len; spanSize++) {
for (int startPos=0; startPos<=len-spanSize; startPos++) {
int endPos = startPos + spanSize;
long long count = numFound[startPos+1][endPos]; //if str[startPos] is not in the palindrome, this will be the count
char ch = str[startPos];
//if str[startPos] is in the palindrome, choose a matching character for the palindrome end
for (int searchPos=startPos; searchPos<endPos; searchPos++) {
if (str[searchPos] == ch)
count += 1 + numFound[startPos+1][searchPos];
}
numFound[startPos][endPos] = count;
}
}
return numFound[0][len];
}
Explanation:
The array numFound[startPos][endPos] will hold the number of palindromes contained in the substring with indexes startPos to endPos.
We go over all pairs of indexes (startPos, endPos), starting from short spans and moving to longer ones. For each such pair, there are two options:
The character at str[startPos] is not in the palindrome. In that case, there are numFound[startPos+1][endPos] possible palindromes - a number that we have calculated already.
The character at str[startPos] is in the palindrome (at its beginning). We scan through the string to find a matching character to put at the end of the palindrome. For each such character, we use the already-calculated results in numFound to find the number of possibilities for the inner palindrome.
EDIT:
Clarification: when I say "number of palindromes contained in a string", this includes non-contiguous substrings. For example, the palindrome "aba" is contained in "abca".
It's possible to reduce memory usage to O(n) by taking advantage of the fact that calculation of numFound[startPos][x] only requires knowledge of numFound[startPos+1][y] for all y. I won't do this here since it complicates the code a bit.
Pregenerating lists of indices containing each letter can make the inner loop faster, but it will still be O(n^3) overall.
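For what it's worth, the memory reduction mentioned above could look roughly like this. It is only a sketch: the name countPalindromesTwoRows is made up, and the running sum "matches" is an extra step that is not in the code above. As far as I can tell it works because the sum over matching searchPos for endPos differs from the one for endPos - 1 by a single term, so the inner scan disappears as well. Like the original, it silently overflows for long strings.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch (the name is an assumption): same recurrence as
// countPalindromes, but only two rows of the table are kept in memory.
unsigned long long countPalindromesTwoRows(const std::string& str)
{
    const std::size_t len = str.size();
    // next[endPos] plays the role of numFound[startPos + 1][endPos]
    std::vector<unsigned long long> next(len + 1, 0), cur(len + 1, 0);

    for (std::size_t startPos = len; startPos-- > 0; ) {
        const char ch = str[startPos];
        // Entries with endPos <= startPos describe empty spans
        for (std::size_t endPos = 0; endPos <= startPos; ++endPos)
            cur[endPos] = 0;

        // matches = sum of (1 + next[searchPos]) over all searchPos in
        // [startPos, endPos) with str[searchPos] == ch, updated incrementally
        unsigned long long matches = 0;
        for (std::size_t endPos = startPos + 1; endPos <= len; ++endPos) {
            const std::size_t searchPos = endPos - 1;
            if (str[searchPos] == ch)
                matches += 1 + next[searchPos];
            // Either str[startPos] is not used, or it is paired with a match
            cur[endPos] = next[endPos] + matches;
        }
        std::swap(cur, next);  // the finished row becomes the "startPos + 1" row
    }
    return next[len];
}
```

For "abb" this returns 4, in line with the example in the question.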
I have a way to do it in O(N^2) time and O(1) space; however, I think there must be other, better ways.
The basic idea is that a long palindrome must contain small palindromes, so we only search for the minimal matches, which come in two kinds: "aa" and "aba". If we find either, we expand outwards to see whether it is part of a longer palindrome.
int count_palindromic_slices(const string &S) {
    // Signed length, so that position - offset can be compared against zero safely.
    const int n = static_cast<int>(S.length());
    int count = 0;
    for (int position = 0; position < n; position++) {
        // Check the "aa" situation: the centre lies between position and position + 1
        int offset = 0;
        while (position - offset >= 0 && position + offset + 1 < n
               && S.at(position - offset) == S.at(position + offset + 1)) {
            count++;
            offset++;
        }
        // Check the "aba" situation: the centre is S[position] itself
        offset = 1; // reset it for the odd length checking
        while (position - offset >= 0 && position + offset < n
               && S.at(position - offset) == S.at(position + offset)) {
            count++;
            offset++;
        }
    }
    return count; // single characters are not counted
}
June 14th, 2012
After some investigation, I believe this is the best way to do it.
It is faster than the accepted answer.
Is there any mileage in making an initial traversal and building an index of all occurrences of each character?
h = { 0, 3, 27 }
i = { 1, 30 }
etc.
Now, working from the left with h, the only possible palindromes starting at position 0 end at 3 and 27. Does char[0 + 1] == char[3 - 1], and so on inwards? If so, we've got a palindrome. Does char[0 + 1] == char[27 - 1]? No, so no further analysis of char[0] is needed.
Move on to char[1]: we only need to examine char[30 - 1] and inwards.
Then we can probably get smart: once you've identified a palindrome running from position x->y, all inner subsets are known palindromes, so we've already dealt with some items and can eliminate those cases from later examination.
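A rough sketch of the index idea, in case it helps. It assumes we are counting contiguous palindromes of length two or more, and the function name is made up for illustration. It builds the per-character position lists, then for each pair of positions holding the same character it walks inwards to check the span between them. It does not include the "get smart" pruning step.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: counts contiguous palindromic spans of length >= 2
// by pairing up positions that hold the same character.
std::size_t countSpansViaIndex(const std::string& s) {
    // Initial traversal: index of all occurrences of each character.
    std::map<char, std::vector<std::size_t>> positions;
    for (std::size_t i = 0; i < s.size(); ++i) {
        positions[s[i]].push_back(i);
    }

    std::size_t count = 0;
    for (const auto& entry : positions) {
        const std::vector<std::size_t>& where = entry.second;
        for (std::size_t a = 0; a < where.size(); ++a) {
            for (std::size_t b = a + 1; b < where.size(); ++b) {
                // The ends match by construction; compare inwards from both sides.
                std::size_t lo = where[a] + 1;
                std::size_t hi = where[b] - 1;  // where[b] > where[a], so no underflow
                bool isPalindrome = true;
                while (lo < hi) {
                    if (s[lo] != s[hi]) {
                        isPalindrome = false;
                        break;
                    }
                    ++lo;
                    --hi;
                }
                if (isPalindrome) ++count;
            }
        }
    }
    return count;
}
```

In the worst case (a string of one repeated character) this is still cubic, so the index mainly helps when characters are spread out.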
My solution using O(n) memory and O(n^2) time, where n is the string length:
palindrome.c:
#include <stdio.h>
#include <string.h>
typedef unsigned long long ull;
/* Expands the span [begin, end) outwards for as long as the characters on both sides match. */
ull countPalindromesHelper (const char* str, const size_t len, const size_t begin, const size_t end, const ull count) {
    if (begin == 0 || end >= len) {
        return count;
    }
    const char pred = str [begin - 1];
    const char succ = str [end];
    if (pred == succ) {
        /* each successful expansion yields exactly one more contiguous palindrome */
        const ull newCount = count + 1;
        return countPalindromesHelper (str, len, begin - 1, end + 1, newCount);
    }
    return count;
}
ull countPalindromes (const char* str) {
    ull count = 0;
    size_t len = strlen (str);
    size_t i;
    for (i = 0; i < len; ++i) {
        count += countPalindromesHelper (str, len, i, i, 0); // even length palindromes
        count += countPalindromesHelper (str, len, i, i + 1, 1); // odd length palindromes
    }
    return count;
}
int main (int argc, char* argv[]) {
    if (argc < 2) {
        return 0;
    }
    const char* str = argv [1];
    ull count = countPalindromes (str);
    printf ("%llu\n", count);
    return 0;
}
Usage:
$ gcc palindrome.c -o palindrome
$ ./palindrome myteststring
EDIT: I misread the problem as the contiguous substring version of the problem. Now given that one wants to find the palindrome count for the non-contiguous version, I strongly suspect that one could just use a math equation to solve it given the number of distinct characters and their respective character counts.
Hmmmmm, I think I would count up like this:
Each character is a palindrome on its own (minus repeated characters).
Each pair of the same character.
Each pair of the same character, with all palindromes sandwiched in the middle that can be made from the string between repeats.
Apply recursively.
Which seems to be what you're doing, although I'm not sure you don't double-count the edge cases with repeated characters.
So, basically, I can't think of a better way.
EDIT:
Thinking some more,
It can be improved with caching, because you sometimes count the palindromes in the same sub-string more than once. So, I suppose this demonstrates that there is definitely a better way.
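To make the caching point concrete, here is a small sketch under a few assumptions: it counts non-contiguous palindromes by position (so it does not subtract repeated characters), the function name is hypothetical, and it uses the usual inclusion-exclusion recurrence for this kind of count rather than the exact scheme described above. Each (i, j) sub-string is solved once and then looked up, which gives O(n^2) time and memory.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: memoised count of non-empty palindromic subsequences.
unsigned long long countWithCache(const std::string& s) {
    const int n = static_cast<int>(s.size());
    // cache[i][j] holds the count for s[i..j]; known[i][j] marks filled entries.
    std::vector<std::vector<unsigned long long>> cache(
        n, std::vector<unsigned long long>(n, 0));
    std::vector<std::vector<bool>> known(n, std::vector<bool>(n, false));

    std::function<unsigned long long(int, int)> count =
        [&](int i, int j) -> unsigned long long {
            if (i > j) return 0;   // empty range
            if (i == j) return 1;  // a single character
            if (known[i][j]) return cache[i][j];

            // Palindromes that skip s[i], plus those that skip s[j],
            // minus the ones counted twice because they skip both.
            unsigned long long result =
                count(i + 1, j) + count(i, j - 1) - count(i + 1, j - 1);

            // A matching pair wraps every inner palindrome, and also stands alone.
            if (s[i] == s[j]) result += count(i + 1, j - 1) + 1;

            known[i][j] = true;
            cache[i][j] = result;
            return result;
        };
    return count(0, n - 1);
}
```

The recursion depth grows linearly with the string length, so a bottom-up table would be the safer choice for very long inputs.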
Here are two programs, one in C++ and one in Java. The C++ program checks whether an input string is a palindrome; the Java method collects every distinct palindromic substring of length two or more.
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string palindrome;
    cout << "Enter a String to check if it is a Palindrome: ";
    cin >> palindrome;
    int length = palindrome.length();
    cout << "the length of the string is " << length << endl;
    int end = length - 1;
    int start = 0;
    int check = 1;
    // Compare characters from both ends, moving towards the middle
    while (end >= start) {
        if (palindrome[start] != palindrome[end]) {
            cout << "The string is not a palindrome" << endl;
            check = 0;
            break;
        }
        else
        {
            start++;
            end--;
        }
    }
    if (check)
        cout << "The string is a Palindrome" << endl;
    return 0;
}
import java.util.HashSet;
import java.util.Set;
// Collects every distinct palindromic substring of length two or more.
public String[] findPalindromes(String source) {
    Set<String> palindromes = new HashSet<String>();
    for (int i = 0; i < source.length() - 1; i++) {
        for (int j = i + 1; j < source.length(); j++) {
            String palindromeCandidate = source.substring(i, j + 1);
            if (isPalindrome(palindromeCandidate)) {
                palindromes.add(palindromeCandidate);
            }
        }
    }
    return palindromes.toArray(new String[palindromes.size()]);
}
// Compares characters from both ends, moving towards the middle.
private boolean isPalindrome(String source) {
    int k = source.length() - 1;
    for (int i = 0; i < source.length() / 2; i++) {
        if (source.charAt(i) != source.charAt(k)) {
            return false;
        }
        k--;
    }
    return true;
}
I am not sure, but you might try it with Fourier transforms. This problem reminded me of this one: O(nlogn) Algorithm - Find three evenly spaced ones within binary string
Just my 2 cents.