Finding smallest substring not present in string - c++

I have a string consisting only of digits 0-9. The string can be between 1 and 1,000,000 characters in length. I need to find the smallest number that isn't present in the string, in linear time. Here are some examples:
1023456789 //Smallest number not in string is 11
1023479 //Smallest number not in string is 5
112131405678910 //Smallest number not in string is 15
With size of 1,000,000, I figured that the smallest number not present in the string has to be at most 6 digits.
My approach was to generate all numbers 0 through 999,999 and store them, in order, as strings. Then I make a map that marks which strings have already been seen. I iterate through the input, and for each position I take every substring starting there, of length 1 to 6, and mark it as true in the map. At the end, I check the generated numbers one by one, and print the first one that has a false value in the map.
Here are some code snippets:
string tmp="0";
string numbers[999999];
void increase(int pos)
{
if(pos==-1)tmp.insert(0,"1");
else if(tmp.at(pos)!='9')tmp.at(pos)++;
else
{
tmp.at(pos)='0';
increase(pos-1);
}
}
//And later inside main
for(int j=0;j<999999;j++)
{
numbers[j]=tmp;
increase(tmp.size()-1);
}
for(int j=0;j<input.size();j++)
{
for(int k=0;k<6;k++)
{
string temp="";
if(j+k<input.size())
{
temp+=input.at(j+k);
appeared[temp]=true;
}
}
}
int counter=0;
while(appeared[numbers[counter]])counter++;
cout<<numbers[counter]<<endl;
A note about the first part of the algorithm. I generate the vector once, then I use it for 100 strings. I need to parse all 100 strings in less than 4 seconds.
This algorithm is too slow for me as it is currently. Could I optimize some of the code, or should I consider a different approach?

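The idea is to build a tree (a trie) of the numbers encountered in the string:
#include <algorithm>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>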
class Node {
public:
Node() : count( 0 ) {}
// create a tree from substring [from, to[ interval
void build( const std::string &str, size_t from, size_t to )
{
Node *node = this;
while( from != to )
node = node->insert( str[from++] );
}
std::string smallestNumber( bool root = true, int limit = 0 ) const;
private:
Node *insert( char c )
{
int idx = c - '0';
if( !children[idx] ) {
++count;
children[idx].reset( new Node );
}
return children[idx].get();
}
int count;
std::unique_ptr<Node> children[10];
};
std::string Node::smallestNumber( bool root, int limit ) const
{
std::string rez;
if( count < 10 ) { // for this node string is one symbol length
for( int i = 0; i < 10; ++i )
if( !children[i] ) return std::string( 1, '0' + i );
throw std::runtime_error( "should not happen!" );
}
if( limit ) {
if( --limit == 1 ) return rez; // we cannot make string length 1
}
char digit = '0';
for( int i = 0; i < 10; ++i ) {
if( root && i == 0 ) continue;
std::string tmp = children[i]->smallestNumber( false, limit );
if( !tmp.empty() ) {
rez = tmp;
digit = '0' + i;
limit = rez.length();
if( limit == 1 ) break;
}
}
return digit + rez;
}
void calculate( const std::string &str )
{
Node root;
for( size_t i = 0; i < str.length(); ++i ) {
root.build( str, i, i + std::min<size_t>( 6, str.length() - i ) ); // at most 6 digits per path
}
std::cout << "smallest number is:" << root.smallestNumber() << std::endl;
}
int main()
{
calculate( "1023456789" );
calculate( "1023479" );
calculate( "112131405678910" );
return 0;
}
EDIT: after some thought I realized that the inner loop is completely unnecessary; one loop is enough. The substring length is limited to 6, relying on the OP's estimate of the biggest possible number.
Output:
smallest number is:11
smallest number is:5
smallest number is:15

Here's how I would approach the problem. The idea is to generate sets of unique substrings of particular length, starting from the shortest and then testing those before generating longer substrings. This allows the code to not make assumptions about the upper bound of the result and also should be much faster for long input strings that have small results. Still, it's not necessarily better in the worst case of big results.
int find_shortest_subnumber(std::string str) {
static int starts[10] = {
0, 10, 100, 1000, 10000,
100000, 1000000, 10000000, 100000000, 1000000000
};
// can't find substrings longer than 9 (won't fit in int)
int limit = std::min((int)str.size(), 9);
for(int length = 1; length <= limit; length++) {
std::set<std::string> uniques; // unique substrings of current length
for(int i = 0; i <= (int)str.size() - length; i++) {
auto start = str.begin() + i;
uniques.emplace(start, start + length);
}
for(int i = starts[length - 1]; i < starts[length]; i++) {
if(uniques.find(std::to_string(i)) == uniques.end())
return i;
}
}
return -1; // not found (empty string or too big result)
}
I haven't done proper complexity analysis. I crudely tested the function with a particular test string that was 1 028 880 characters long and had the result of 190 000. It took about 2s to execute on my machine (which includes generation of the test string which should be negligible).

You can construct a suffix tree for the string in linear time (and space). Once you have the suffix tree, you simply need to breadth-first walk it scanning the children of each node in lexicographical order, and checking all 10 digits at each node. The first missing one is the last digit in the smallest missing number.
Since a 1,000,000-digit string has only 999,995 six-digit substrings, while there are 1,000,000 possible six-digit sequences, at least five six-digit sequences must be absent. The breadth-first search therefore terminates no later than the sixth level; consequently, it is also linear time.

Since you only need to know whether a number has been seen yet or not, it's probably easiest to use a std::vector<bool> to store that indication. As you walk through the input number, you mark numbers as true in the array. When you're done, you walk through the array, and print out the index of the first item that's still false.
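A minimal sketch of that idea, assuming the six-digit bound estimated in the question. The function name and the `max_len` parameter are made up for illustration, and a leading zero is taken to spell only the number 0 itself:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the smallest non-negative integer whose decimal form does not occur
// in `digits`, or -1 if every number with at most `max_len` digits occurs.
// `max_len` = 6 follows the question's estimate; it is an assumption here.
long smallest_missing_number(const std::string& digits, int max_len = 6) {
    std::size_t limit = 1;
    for (int i = 0; i < max_len; ++i) limit *= 10;  // 10^max_len candidates

    std::vector<bool> seen(limit, false);
    for (std::size_t start = 0; start < digits.size(); ++start) {
        std::size_t value = 0;
        for (int k = 0; k < max_len && start + k < digits.size(); ++k) {
            value = value * 10 + static_cast<std::size_t>(digits[start + k] - '0');
            seen[value] = true;
            // A substring starting with '0' only spells the number 0, so stop.
            if (digits[start] == '0') break;
        }
    }

    // The first index that was never marked is the answer.
    for (std::size_t n = 0; n < limit; ++n)
        if (!seen[n]) return static_cast<long>(n);
    return -1;
}
```

For the three strings in the question this returns 11, 5 and 15. When many strings are processed, it is probably worth allocating `seen` once and resetting it between inputs.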

Inconsistent Quick Sort Function

So I have written this quick sort function, and it works for SOME input.
For example it works for the following inputs : "5 4 3 2 1", "3 4 5 6 7", etc.
However when I input something like : "0 3 5 4 -5 100 7777 2014" it will always mix up the multi digit numbers.
I was hoping someone could help point me to where my code is failing at this test case.
Sort.cpp
std::vector<int> QuickSort::sortFunc(std::vector<int> vec, int left, int right) {
int i = left, j = right;
int tmp;
int pivot = vec.at( (left + right) / 2 );
/* partition */
while (i <= j) {
while (vec.at(i) < pivot)
i++;
while (vec.at(j) > pivot)
j--;
if (i <= j) {
tmp = vec.at(i);
vec.at(i) = vec.at(j);
vec.at(j) = tmp;
i++;
j--;
}
}
/* recursion */
if (left < j)
return sortFunc( vec, left, j );
if (i < right)
return sortFunc( vec, i, right );
else
{
return vec;
}
}
main.cpp
int main()
{
// The user inputs a string of numbers (e.g. "6 4 -2 88 ..etc") and those integers are then put into a vector named 'vec'.
std::vector<int> vec;
// Converts string from input into integer values, and then pushes said values into vector.
std::string line;
if ( getline(std::cin, line) )
{
std::istringstream str(line);
int value;
str >> value;
vec.push_back( value );
while ( str >> value )
{
vec.push_back( value );
}
}
// Creating QuickSort object.
QuickSort qSort;
QuickSort *ptrQSort = &qSort;
// Creating new vector that has been 'Quick Sorted'.
int vecSize = vec.size();
std::vector<int> qSortedVec;
qSortedVec = ptrQSort->sortFunc( vec, 0, vecSize-1 );
// Middle, start, and end positions on the vector.
int mid = ( 0 + (vec.size()-1) ) / 2;
int start = 0, end = vec.size() - 1;
// Creating RecursiveBinarySearch object.
RecursiveBinarySearch bSearch;
RecursiveBinarySearch *ptrBSearch = &bSearch;
//bool bS = ptrBSearch->binarySearch( qSortedVec, mid, start, end );
bool bS = ptrBSearch->binarySearch( bSortedVec, mid, start, end );
/*--------------------------------------OUTPUT-----------------------------------------------------------------------*/
// Print out inputted integers and the binary search result.
// Depending on the binary search, print either 'true' or 'false'.
if ( bS == 1 )
{
std::cout << "true ";
}
if ( bS == 0 )
{
std::cout << "false ";
}
// Prints the result of the 'quick sorted' array.
int sortedSize = qSortedVec.size();
for ( int i = 0; i < sortedSize; i++ )
{
std::cout << qSortedVec[i] << " ";
}
std::cout << "\n";
return 0;
}
Thanks for any and all help you can give me guys.
I'm not sure if this solves it completely, but after sorting the left part you still need to sort the right part; your code returns right after the left recursion instead.
Also, passing the vector by value and returning it is overhead and not needed, because in the end there should only be one version of the vector, so passing by reference is preferred. Passing by value and returning is sometimes needed when doing recursion, especially when backtracking (looking for different paths), but not in this case where left and right provide the needed state.
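Both points could be combined roughly as follows. This is only a sketch that keeps the question's partition scheme; the free function `quickSortInPlace` is a name chosen here, not part of the original `QuickSort` class:

```cpp
#include <utility>
#include <vector>

// Sorts vec[left..right] (inclusive bounds) in place.
void quickSortInPlace(std::vector<int>& vec, int left, int right) {
    if (left >= right) return;  // zero or one element: nothing to do

    int i = left, j = right;
    const int pivot = vec[left + (right - left) / 2];

    // Partition: move elements smaller than the pivot left, larger ones right.
    while (i <= j) {
        while (vec[i] < pivot) ++i;
        while (vec[j] > pivot) --j;
        if (i <= j) {
            std::swap(vec[i], vec[j]);
            ++i;
            --j;
        }
    }

    // Recurse into BOTH halves; neither call returns early.
    if (left < j) quickSortInPlace(vec, left, j);
    if (i < right) quickSortInPlace(vec, i, right);
}
```

Called as `quickSortInPlace(vec, 0, (int)vec.size() - 1)`, it sorts the failing input "0 3 5 4 -5 100 7777 2014" into "-5 0 3 4 5 100 2014 7777".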

Find the minimum number of moves to get a "Good" string

A string is called good if and only if all of its distinct characters occur the same number of times.
Given a string of length n, what is the minimum number of changes needed to make the string good?
Note: only lowercase English letters are allowed, and any letter can be changed to any other letter.
Example : Let String is yyxzzxxx
Then here answer is 2.
Explanation : One possible solution yyxyyxxx. We have changed 2 'z' to 2 'y'. Now both 'x' and 'y' are repeated 4 times.
My Approach :
Make a hash of occurrence of all 26 lowercase letters.
Also find number of distinct alphabets in string.
Sort this hash array and check whether the length of the string is divisible by the number of distinct characters. If yes, then we have the answer.
Otherwise, reduce the number of distinct characters by 1.
But it gives wrong answers for some inputs, as there may be cases where removing a character that does not occur the fewest times yields a good string in fewer moves.
So how should I approach this problem? Please help.
Constraints : Length of string is up to 2000.
My Approach :
string s;
cin>>s;
int hash[26]={0};
int total=s.length();
for(int i=0;i<26;i++){
hash[s[i]-'a']++;
}
sort(hash,hash+total);
int ans=0;
for(int i=26;i>=1;i--){
int moves=0;
if(total%i==0){
int eachshouldhave=total/i;
int position=26;
for(int j=1;j<26;j++){
if(hash[j]>eachshouldhave && hash[j-1]<eachshouldhave){
position=j;
break;
}
}
int extrasymbols=0;
//THE ONES THAT ARE BELOW OBVIOUSLY NEED TO BE CHANGED TO SOME OTHER SYMBOL
for(int j=position;j<26;j++){
extrasymbols+=hash[j]-eachshouldhave;
}
//THE ONES ABOVE THIS POSITION NEED TO GET SOME SYMBOLS FROM OTHERS
for(int j=0;j<position;j++){
moves+=(eachshouldhave-hash[j]);
}
if(moves<ans)
ans=moves;
}
else
continue;
}
The following should fix your implementation:
std::size_t compute_change_needed(const std::string& s)
{
int count[26] = { 0 };
for(char c : s) {
// Assuming only valid char : a-z
count[c - 'a']++;
}
std::sort(std::begin(count), std::end(count), std::greater<int>{});
std::size_t ans = s.length();
for(std::size_t i = 1; i != 27; ++i) {
if(s.length() % i != 0) {
continue;
}
const int expected_count = s.length() / i;
std::size_t moves = 0;
for(std::size_t j = 0; j != i; j++) {
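// Count deficits only: each change moves one character from a surplus letter to a deficient one.
moves += std::max(0, expected_count - count[j]);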
}
ans = std::min(ans, moves);
}
return ans;
}

Optimizing "It's the Great Pumpkin Patch." ACM 1999 Practice

So, this is a weekly project for school, and I have got it completely working, but I feel as if the way I did it probably isn't one of the best ways to do it (or even a good way). I was hoping you guys could help optimize it or give a better solution. I have already submitted this version, but would like to know a more optimal solution to the problem.
So to start, here's the problem...
It's almost Halloween and Linus is setting out to the garden to wait for the Great Pumpkin. Unfortunately, due to diversification, there are lots of other gourds in the garden this year, so he needs you to write a program to tell him how many patches of pumpkins there are and how big they are.
The input to this program will be a number of different gardens. The first line of the input for each garden will be the dimensions of the garden, r, the number of rows in the garden, and c, the number of columns, where 0 ≤ r ≤ 40 and 0 ≤ c ≤ 40. Following the dimensions will be r lines with c characters on each line. Each of these characters will be a lower case letter representing the type of gourd grown in the square. A lower case 'p' will represent pumpkins. A garden with 0 for the number of rows and/or columns indicates the end of input and should not be processed.
Example input:
10 10
pzzzzzzzzp
pyypzzzzzy
ppppssssyy
ssspssssyy
ssssppssyy
ssssppsspy
zzzzzzsspy
zzzzzzsspy
yyyypzsspy
yyyypppppy
3 4
pppp
pppp
pppp
1 7
zzzzzzz
0 0
For each garden, output the number of the garden (with the first input set being garden 1), the number of pumpkin patches in the garden, and the size of the pumpkin patches in order from smallest to largest. If there is more than one patch of a given size, print the size as many times as it occurs. Use the following format:
Example Output:
Garden # 1: 4 patches, sizes: 1 4 8 10
Garden # 2: 1 patches, sizes: 12
Garden # 3: 0 patches, sizes:
Note: Even though the problem says to input from a file, our professor told us to input through the keyboard.
My approach to this was to put the garden into a 2d array with a border of x's around it. I would then use a function to find a pumpkin patch ( and return its coordinates ). I would then use another function that recursively found if a pumpkin was attached to that one through above, below, left, and right, and returned the size of the pumpkin patch. This function also 'deletes' each pumpkin when it finds it by replacing it by an 'x'. This allowed me to not have to worry about finding pumpkins multiple times.
So here's my code, pretty well commented so that you guys could hopefully understand what I was trying to do.
#include <iostream>
#include <fstream>
using namespace std;
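const int MAX_ROW = 42; // up to 40 rows plus a one-cell border on each side
const int MAX_COL = 42; // up to 40 columns plus a one-cell border on each side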
char input ();
int checkForPatchSize ( char arr[][MAX_COL], int numOne, int numTwo );
bool findPatch ( char arr[][MAX_COL], int &row, int&col );
void sort ( int arr[], int size);
int main ()
{
int inputNumOne = -1; // Defaulted to -1, First number for Patch Size
int inputNumTwo = -1; // Defaulted to -1, Second number for Patch Size
int i, j; // i, j are indexes for loops, number
int numArr[MAX_ROW][MAX_COL]; // Array for storing Data
int indexGarden = 0;
int index = 1;
while ( inputNumOne != 0 )
{
cin >> inputNumOne; // User inputs Dimension
cin >> inputNumTwo; // Of Garden...
if ( inputNumOne != 0 and inputNumTwo != 0 ) // End case is that both are 0.
{
char pumpkinPatch[MAX_ROW][MAX_COL]; // Array for storing pumpkin patch info.
for ( i = 0; i < inputNumOne+2; i++ )
{
for ( j = 0; j < inputNumTwo+2; j++ )
{
// This if statement surrounds the garden in X's so that I have a border (allows me to not have to worry about test cases.
if ( i == 0 or j == 0 or i == inputNumOne + 1 or j == inputNumTwo + 1 )
{
pumpkinPatch[i][j] = 'x';
}
else // This is the input of the garden into a 2d array.
{
pumpkinPatch[i][j] = input();
}
}
}
int row, col, size, numberOfPatches = 0;
bool foundPatch = true;
index = 1;
while ( foundPatch == true )
{
row = inputNumOne+2; // Because I added a border to the garden
col = inputNumTwo+2; // the size is +2 of what the user input.
foundPatch = findPatch ( pumpkinPatch, row, col ); // Finds the start of a pumpkin patch, and returns the coordinates ( as row and col ).
if ( foundPatch == true ) // If a patch is found....
{
numberOfPatches++; // Increase number of patches
size = checkForPatchSize ( pumpkinPatch, row, col); // find size of particular patch
numArr[indexGarden][index] = size; // put size into data arr (to be printed to screen later).
index++;
}
}
numArr[indexGarden][0] = numberOfPatches; // Put number of patches as first item in each column of data arr.
indexGarden++;
}
}
for ( index = 0; index < indexGarden; index++ ) // Print out Garden Info
{
cout << "Garden # " << index + 1 <<": " << numArr[index][0] << " patches, sizes: ";
int temp = numArr[index][0]; // temp is the number of patches in particular garden.
int tempArr[temp]; // temp array to be used for sorting
int indexTwo;
for ( indexTwo = 0; indexTwo < temp; indexTwo++ )
{
tempArr[indexTwo] = numArr[index][indexTwo+1]; // Transfer sizes into a temp array so that they can be sorted.
}
sort (tempArr, temp); // Sort ( Sorts arr from smalles to larges )
for ( indexTwo = 0; indexTwo < temp; indexTwo++ ) // Output sorted array to screen.
{
cout << tempArr[indexTwo] << " ";
}
cout << endl;
}
}
char input()
{
char letter;
cin >> letter;
return letter;
}
/////////////// findPatch /////////////////////////////////////////////////
// Requirements: a 2D array of garden, and the size of it (row and col). //
// Returns a bool, true if patch is found, false if no patches found. //
// row and col are returned by reference to be the coordinates of one //
// of the pumpkins in the patch. //
///////////////////////////////////////////////////////////////////////////
bool findPatch ( char arr[][MAX_COL], int &row, int&col )
{
int rowIndex = 0;
int colIndex = 0;
while ( arr[rowIndex][colIndex] != 'p' and rowIndex < row )
{
colIndex = 0;
while ( arr[rowIndex][colIndex] != 'p' and colIndex < col )
{
colIndex++;
}
if ( arr[rowIndex][colIndex] != 'p' )
{
rowIndex++;
}
}
if ( arr[rowIndex][colIndex] != 'p' )
{
return false;
}
row = rowIndex;
col = colIndex;
return true;
}
/////////////// checkForPatchSize /////////////////////////////////////////
// Requirements: a 2D array of garden, and the coordinates of the start //
// of a patch. (Gotten from findPatch) //
// Returns an int, which is the size of the patch found //
// All p's or pumpkins are changed to x's so that they are not used //
// multiple times. //
///////////////////////////////////////////////////////////////////////////
int checkForPatchSize ( char arr[][MAX_COL], int numOne, int numTwo )
{
int index = 0;
if ( arr[numOne][numTwo] == 'p' )
{
index++;
arr[numOne][numTwo] = '0';
// Check Above
index += checkForPatchSize ( arr, numOne - 1, numTwo );
// Check to Left
index += checkForPatchSize ( arr, numOne, numTwo - 1 );
// Check Below
index += checkForPatchSize ( arr, numOne + 1, numTwo );
// Check to Right
index += checkForPatchSize ( arr, numOne, numTwo + 1 );
return index;
}
return 0;
}
/////////////// sort //////////////////////////////////////////////////////
// Requirements: an integer array, and the size of it (amount of //
// numbers in it). //
// //
// Sorts an array from smalles to largest numbers //
///////////////////////////////////////////////////////////////////////////
void sort ( int arr[], int size )
{
int index = 0;
bool changeMade = true;
while ( changeMade == true )
{
changeMade = false;
for ( index = 0; index < size - 1; index++ )
{
if ( arr[index] > arr[index+1] )
{
int temp = arr[index];
arr[index] = arr[index+1];
arr[index+1] = temp;
changeMade = true;
}
}
}
}
Alright, after reading through your code, I see your approach. Generally, I would approach this from a visual perspective. As it is, your code should work just fine, and is quite an elegant solution. The single weakness of your algorithm is that it re-checks cells it has just visited: for example, after moving upwards, it checks downwards, which is the cell it came from. Avoiding redundancy is the surest sign of an optimal algorithm, but at the small scale at which you are deploying the algorithm, it need not be optimal.
In a way, the recursive nature of your code makes it quite beautiful because it traverses the pumpkin patch like little sparks that die out, which I really do like. Recursion is something which I don't often bother myself with, mostly because I don't think recursively, but after spending a moment wrapping my head around your algorithm, I really do see its value in cases like this. I would love to see the algorithm at work with dynamic visuals.
As for the accuracy of your algorithm, I cannot imagine that it would fail in any manner to count the pumpkins correctly, because it functions by making a small wave around the picked pumpkin in which the algorithm repeats itself, effectively propagating through the patch until it is all counted. As I said, the only shortcoming of your algorithm is that it would fall into an infinite loop if you did not somehow mark pumpkins as found (it checks the position it was called from). Beyond that, I can only say that you've proposed an excellent solution and that your doubts are almost entirely misplaced. Using a recursive algorithm was an excellent choice in this regard because it doesn't require a long list of cases in order to 'count'; it simply jumps into adjacent positions, returning to itself with the full count.
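If you still want a variant to compare against, here is one possible sketch of the same flood fill using an explicit stack, bounds checks instead of a border, and std::sort instead of the hand-written sort. The function name `patchSizes` and the use of std::vector<std::string> for the garden are choices made for this sketch, and it assumes every row has the same length:

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Returns the sizes of all 4-connected regions of 'p' in `garden`, ascending.
// The garden is taken by value so counted cells can be overwritten.
std::vector<int> patchSizes(std::vector<std::string> garden) {
    const int rows = static_cast<int>(garden.size());
    const int cols = rows ? static_cast<int>(garden[0].size()) : 0;
    const int dr[] = {-1, 1, 0, 0};
    const int dc[] = {0, 0, -1, 1};

    std::vector<int> sizes;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (garden[r][c] != 'p') continue;

            int size = 0;
            std::vector<std::pair<int, int>> stack;
            stack.push_back(std::make_pair(r, c));
            garden[r][c] = 'x';  // mark a cell as soon as it is pushed

            while (!stack.empty()) {
                const std::pair<int, int> cell = stack.back();
                stack.pop_back();
                ++size;
                for (int d = 0; d < 4; ++d) {
                    const int nr = cell.first + dr[d];
                    const int nc = cell.second + dc[d];
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                    if (garden[nr][nc] != 'p') continue;
                    garden[nr][nc] = 'x';
                    stack.push_back(std::make_pair(nr, nc));
                }
            }
            sizes.push_back(size);
        }
    }
    std::sort(sizes.begin(), sizes.end());
    return sizes;
}
```

For the first example garden this yields the sizes 1 4 8 10, and the number of patches is simply the size of the returned vector.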

To find the longest substring with equal sum in left and right in C++

I was solving a question, with which I am having some problems:
Complete the function getEqualSumSubstring, which takes a single argument. The single argument is a string s, which contains only non-zero digits.
This function should print the length of longest contiguous substring of s, such that the length of the substring is 2*N digits and the sum of the leftmost N digits is equal to the sum of the rightmost N digits. If there is no such string, your function should print 0.
int getEqualSumSubstring(string s) {
int i=0,j=i,foundLength=0;
for(i=0;i<s.length();i++)
{
for(j=i;j<s.length();j++)
{
int temp = j-i;
if(temp%2==0)
{
int leftSum=0,rightSum=0;
string tempString=s.substr(i,temp);
for(int k=0;k<temp/2;k++)
{
leftSum=leftSum+tempString[k]-'0';
rightSum=rightSum+tempString[k+(temp/2)]-'0';
}
if((leftSum==rightSum)&&(leftSum!=0))
if(s.length()>foundLength)
foundLength=s.length();
}
}
}
return(foundLength);
}
The problem is that this code is working for some samples and not for the others. Since this is an exam type question I don't have the test cases either.
This code works:
int getEqualSumSubstring(string s) {
int i=0,j=i,foundLength=0;
for(i=0;i<s.length();i++)
{
for(j=i;j<s.length();j++)
{
int temp = j-i+1;
if(temp%2==0)
{
int leftSum=0,rightSum=0;
string tempString=s.substr(i,temp);
// printf("%d ",tempString.length());
for(int k=0;k<temp/2;k++)
{
leftSum=leftSum+tempString[k]-48;
rightSum=rightSum+tempString[k+(temp/2)]-48;
}
if((leftSum==rightSum)&&(leftSum!=0))
if(tempString.length()>foundLength)
foundLength=tempString.length();
}
}
}
return(foundLength);
}
The temp variable must be j-i+1. Otherwise the case where the whole string is the answer will not be covered. Also, we need to make the change suggested by Scott.
Here's my solution that I can confirm works. The ones above didn't really work for me - they gave me compile errors somehow. I got the same question on InterviewStreet, came up with a bad, incomplete solution that worked for 9/15 of the test cases, so I had to spend some more time coding afterwards.
The idea is that instead of caring about getting the left and right sums (which is what I initially did as well), I will get all the possible substrings out of each half (left and right half) of the given input, sort and append them to two separate lists, and then see if there are any matches.
Why?
Say the strings "423" and "234" have the same sum; if I sorted them, they would both be "234" and thus match. Since these numbers have to be consecutive and equal length, I no longer need to worry about having to add them up as numbers and check.
So, for example, if I'm given 12345678, then on the left side, the for-loop will give me:
[1,12,123,1234,2,23,234,3,34]
And on the right:
[5,56,567,5678,...]
And so forth.
However, I'm only taking substrings of a length of at least 2 into account.
I append each of these substrings, sorted by converting into a character array then converting back into a string, into ArrayLists.
So now that all this is done, the next step is to see if there are identical strings of the same numbers in these two ArrayLists. I simply check each of temp_b's strings against temp_a's first string, then against temp_a's second string, and so forth.
If I get a match (say, "234" and "234"), I'll set the length of those matching substrings as my tempCount (tempCount = 3). I also have another variable called 'count' to keep track of the greatest length of these matching substrings (if this was the first occurrence of a match, then count = 0 is overwritten by tempCount = 3, so count = 3).
As for the odd/even string length and the variable int end: in the expression s.length()/2+j, if the length of the input happened to be 11, then:
s.length() = 11
s.length()/2 = 11/2 = 5.5, truncated to 5
So in the for-loop, s.length()/2 + j, where j maxes out at s.length()/2, would become:
5 + 5 = 10
Which falls short of the s.length() that I need to reach for to get the string's last index.
This is because the substring function requires an end index of one greater than what you'd put for something like charAt(i).
Just to demonstrate, an input of "47582139875" will generate the following output:
[47, 457, 4578, 24578, 57, 578, 2578, 58, 258, 28] <-- substrings from left half
[139, 1389, 13789, 135789, 389, 3789, 35789, 789, 5789, 578] <-- substrings from right half
578 <-- the longest one that matched
6 <-- the length of '578' x 2
public static int getEqualSumSubtring(String s){
// run through all possible length combinations of the number string on left and right half
// append sorted versions of these into new ArrayList
ArrayList<String> temp_a = new ArrayList<String>();
ArrayList<String> temp_b = new ArrayList<String>();
int end; // s.length()/2 is an integer that rounds down if length is odd, account for this later
for( int i=0; i<=s.length()/2; i++ ){
for( int j=i; j<=s.length()/2; j++ ){
// only account for substrings with a length of 2 or greater
if( j-i > 1 ){
char[] tempArr1 = s.substring(i,j).toCharArray();
Arrays.sort(tempArr1);
String sorted1 = new String(tempArr1);
temp_a.add(sorted1);
//System.out.println(sorted1);
if( s.length() % 2 == 0 )
end = s.length()/2+j;
else // odd length so we need the extra +1 at the end
end = s.length()/2+j+1;
char[] tempArr2 = s.substring(i+s.length()/2, end).toCharArray();
Arrays.sort(tempArr2);
String sorted2 = new String(tempArr2);
temp_b.add(sorted2);
//System.out.println(sorted2);
}
}
}
// For reference
System.out.println(temp_a);
System.out.println(temp_b);
// If the substrings match, it means they have the same sum
// Keep track of longest substring
int tempCount = 0 ;
int count = 0;
String longestSubstring = "";
for( int i=0; i<temp_a.size(); i++){
for( int j=0; j<temp_b.size(); j++ ){
if( temp_a.get(i).equals(temp_b.get(j)) ){
tempCount = temp_a.get(i).length();
if( tempCount > count ){
count = tempCount;
longestSubstring = temp_a.get(i);
}
}
}
}
System.out.println(longestSubstring);
return count*2;
}
Here's my solution to this question, including tests. I've added an extra function just because I feel it makes the solution way easier to read than the solutions above.
#include <string>
#include <iostream>
using namespace std;
int getMaxLenSumSubstring( string s )
{
int N = 0; // The optimal so far...
int leftSum = 0, rightSum=0, strLen=s.size();
int left, right;
for(int i=0;i<strLen/2+1;i++) {
left=(s[i]-int('0')); right=(s[strLen-i-1]-int('0'));
leftSum+=left; rightSum+=right;
if(leftSum==rightSum) N=i+1;
}
return N*2;
}
int getEqualSumSubstring( string s ) {
int maxLen = 0, substrLen, j=1;
for( int i=0;i<s.length();i++ ) {
for( int j=1; j<s.length()-i; j++ ) {
//cout<<"Substring = "<<s.substr(i,j);
substrLen = getMaxLenSumSubstring(s.substr(i,j));
//cout<<", Len ="<<substrLen;
if(substrLen>maxLen) maxLen=substrLen;
}
}
return maxLen;
}
Here are a few tests I ran. Based upon the examples above they seem right.
int main() {
cout<<endl<<"Test 1 :"<<getEqualSumSubstring(string("123231"))<<endl;
cout<<endl<<"Test 2 :"<<getEqualSumSubstring(string("986561517416921217551395112859219257312"))<<endl;
cout<<endl<<"Test 3:"<<getEqualSumSubstring(string("47582139875"))<<endl;
}
Shouldn't the following code use tempString.length() instead of s.length()?
if((leftSum==rightSum)&&(leftSum!=0))
if(s.length()>foundLength)
foundLength=s.length();
Below is my code for the question... Thanks !!
public class IntCompl {
public String getEqualSumSubstring_com(String s)
{
int j;
int num=0;
int sum = 0;
int m=s.length();
//calculate String array Length
for (int i=m;i>1;i--)
{
sum = sum + m;
m=m-1;
}
String [] d = new String[sum];
int k=0;
String ans = "NULL";
//Extract strings
for (int i=0;i<s.length()-1;i++)
{
for (j=s.length();j>=i+1;k++,j--)
{
num = k;
d[k] = s.substring(i,j);
}
k=num+1;
}
//Sort strings in such a way that the longest strings precede...
for (int i=0; i<d.length-1; i++)
{
for (int h=1;h<d.length;h++)
{
if (d[i].length() > d[h].length())
{
String temp;
temp=d[i];
d[i]=d[h];
d[h]=temp;
}
}
}
// Look for the Strings with array size 2*N (length in even number) and such that the
//the sum of left N numbers is = to the sum of right N numbers.
//As the strings are already in descending order, the longest string is searched first and the for loop breaks once the string is found.
for (int x=0;x<d.length;x++)
{
int sum1=0,sum2=0;
if (d[x].length()%2==0 && d[x].length()<49)
{
int n;
n = d[x].length()/2;
for (int y=0;y<n;y++)
{
sum1 = sum1 + d[x].charAt(y)-'0';
}
for (int y=n;y<d[x].length();y++)
{
sum2 = sum2 + d[x].charAt(y)-'0';
}
if (sum1==sum2)
{
ans = d[x];
break;
}
}
}
return ans;
}
}
Here is the complete Java Program for this question.
Complexity is O(n^3)
This can, however, be solved in O(n^2). For an O(n^2) solution, refer to this link
import java.util.Scanner;
import static java.lang.System.out;
public class SubStringProblem{
public static void main(String args[]){
Scanner sc = new Scanner(System.in);
out.println("Enter the Digit String:");
String s = sc.nextLine();
int n = (new SubStringProblem()).getEqualSumSubString(s);
out.println("The longest Sum SubString is "+n);
}
public int getEqualSumSubString(String s){
int N;
if(s.length()%2==0)
{
//String is even
N = s.length();
}
else{
//String is odd
N=s.length()-1;
}
boolean flag =false;
int sum1,sum2;
do{
for(int k=0;k<=s.length()-N;k++){
sum1=0;
sum2=0;
for(int i =k,j=k+N-1;i<j;i++,j--)
{
sum1=sum1 + Integer.parseInt(s.substring(i,i+1));
sum2+=Integer.parseInt(s.substring(j,j+1));
}
if(sum1==sum2){
return N;
}
}
N-=2;
flag =true;
}while(N>1);
return -1;
}
}
What is your rationale for the number 48 on these two lines?
for(int k=0;k<temp/2;k++)
{
leftSum=leftSum+tempString[k]-48;
rightSum=rightSum+tempString[k+(temp/2)]-48;
}
I am just overly curious and would like to hear the reasoning behind it, because I have a similar solution, but without the 48 and it still works. However, I added the 48 and still got the correct answer.
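On the 48: in ASCII the character '0' has code 48, so subtracting 48 and subtracting '0' do the same thing, and the latter states the intent more clearly. A likely reason a version without the subtraction still compares correctly is that both halves contain the same number of digits, so both sums are shifted by the same amount. The two helper names below are invented for this sketch:

```cpp
#include <cstddef>
#include <string>

// Sum of the digit VALUES in s[from, from + count).
int digitSum(const std::string& s, std::size_t from, std::size_t count) {
    int sum = 0;
    for (std::size_t k = 0; k < count; ++k) sum += s[from + k] - '0';
    return sum;
}

// Sum of the raw character CODES in s[from, from + count).
// Each term is larger than the digit value by exactly '0' (48 in ASCII).
int rawSum(const std::string& s, std::size_t from, std::size_t count) {
    int sum = 0;
    for (std::size_t k = 0; k < count; ++k) sum += s[from + k];
    return sum;
}
```

For two halves of equal length N, `rawSum` exceeds `digitSum` by N * '0' on both sides, so the equality test gives the same result either way. A check such as leftSum != 0 does change meaning, though, because a raw sum of digit characters is never zero.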
Simple solution. O(n*n). s - input string.
var longest = 0;
for (var i = 0; i < s.length-1; i++) {
var leftSum = 0, rightSum = 0; // declare both locally; "var a = b = 0" would leak b as a global
for (var j = i, k = i+1, l = 2; j >=0 && k < s.length; j--, k++, l+=2) {
leftSum += parseInt(s[j]);
rightSum += parseInt(s[k]);
if (leftSum == rightSum && l > longest) {
longest = l;
}
}
}
console.log(longest);
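Since the question is about C++, the same expand-around-the-middle idea might look roughly like this there. It is a sketch only, and `longestEqualSumLength` is a name picked for illustration:

```cpp
#include <string>

// Length of the longest even-length substring whose left-half digit sum
// equals its right-half digit sum; 0 if there is none. Runs in O(n^2).
int longestEqualSumLength(const std::string& s) {
    const int n = static_cast<int>(s.size());
    int longest = 0;
    // `mid` is the last index of the left half; the right half starts at mid + 1.
    for (int mid = 0; mid + 1 < n; ++mid) {
        int leftSum = 0, rightSum = 0;
        // Grow both halves by one digit per step, keeping the sums running.
        for (int l = mid, r = mid + 1; l >= 0 && r < n; --l, ++r) {
            leftSum += s[l] - '0';
            rightSum += s[r] - '0';
            const int len = r - l + 1;
            if (leftSum == rightSum && len > longest) longest = len;
        }
    }
    return longest;
}
```

For "123231" this returns 6, since 1+2+3 equals 2+3+1. The running sums are what avoid re-adding digits for every candidate substring.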

How many palindromes can be formed by selections of characters from a string?

I'm posting this on behalf of a friend since I believe this is pretty interesting:
Take the string "abb". By leaving out any number of letters less than the length of the string we end up with 7 strings.
a b b ab ab bb abb
Out of these 4 are palindromes.
Similarly for the string
"hihellolookhavealookatthispalindromexxqwertyuiopasdfghjklzxcvbnmmnbvcxzlkjhgfdsapoiuytrewqxxsoundsfamiliardoesit"
(a length 112 string) 2^112 - 1 strings can be formed.
Out of these how many are palindromes?
Below there is his implementation (in C++, C is fine too though). It's pretty slow with very long words; he wants to know what's the fastest algorithm possible for this (and I'm curious too :D).
#include <iostream>
#include <cstring>
using namespace std;
void find_palindrome(const char* str, const char* max, long& count)
{
for(const char* begin = str; begin < max; begin++) {
count++;
const char* end = strchr(begin + 1, *begin);
while(end != NULL) {
count++;
find_palindrome(begin + 1, end, count);
end = strchr(end + 1, *begin);
}
}
}
int main(int argc, char *argv[])
{
const char* s = "hihellolookhavealookatthis";
long count = 0;
find_palindrome(s, strlen(s) + s, count);
cout << count << endl;
}
First of all, your friend's solution seems to have a bug since strchr can search past max. Even if you fix this, the solution is exponential in time.
For a faster solution, you can use dynamic programming to solve this in O(n^3) time. This will require O(n^2) additional memory. Note that for long strings, even 64-bit ints as I have used here will not be enough to hold the solution.
#include <cstring> // strlen
#define MAX_SIZE 1000 // the input string must be shorter than MAX_SIZE characters
long long numFound[MAX_SIZE][MAX_SIZE]; //intermediate results, indexed by [startPosition][endPosition]
long long countPalindromes(const char *str) {
int len = strlen(str);
for (int startPos=0; startPos<=len; startPos++)
for (int endPos=0; endPos<=len; endPos++)
numFound[startPos][endPos] = 0;
for (int spanSize=1; spanSize<=len; spanSize++) {
for (int startPos=0; startPos<=len-spanSize; startPos++) {
int endPos = startPos + spanSize;
long long count = numFound[startPos+1][endPos]; //if str[startPos] is not in the palindrome, this will be the count
char ch = str[startPos];
//if str[startPos] is in the palindrome, choose a matching character for the palindrome end
for (int searchPos=startPos; searchPos<endPos; searchPos++) {
if (str[searchPos] == ch)
count += 1 + numFound[startPos+1][searchPos];
}
numFound[startPos][endPos] = count;
}
}
return numFound[0][len];
}
Explanation:
The array numFound[startPos][endPos] will hold the number of palindromes contained in the substring from index startPos up to, but not including, endPos.
We go over all pairs of indexes (startPos, endPos), starting from short spans and moving to longer ones. For each such pair, there are two options:
The character at str[startPos] is not in the palindrome. In that case, there are numFound[startPos+1][endPos] possible palindromes - a number that we have calculated already.
The character at str[startPos] is in the palindrome (at its beginning). We scan through the string to find a matching character to put at the end of the palindrome. For each such character, we use the already-calculated results in numFound to find the number of possibilities for the inner palindrome.
EDIT:
Clarification: when I say "number of palindromes contained in a string", this includes non-contiguous substrings. For example, the palindrome "aba" is contained in "abca".
It's possible to reduce memory usage to O(n) by taking advantage of the fact that calculation of numFound[startPos][x] only requires knowledge of numFound[startPos+1][y] for all y. I won't do this here since it complicates the code a bit.
Pregenerating lists of indices containing each letter can make the inner loop faster, but it will still be O(n^3) overall.
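The row-by-row idea mentioned above can be sketched as follows. This is only a sketch under stated assumptions, not the answer's own code: countPalindromesRolling is a hypothetical name, only two rows (for startPos+1 and startPos) are kept, and unsigned 64-bit counters silently wrap for long strings. Processing startPos from right to left also lets a running total stand in for the inner scan, because the terms added for endPos are those for endPos-1 plus at most one more.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Counts palindromic subsequences (chosen by index) using two rows of the table.
// prev[e] plays the role of numFound[startPos+1][e], cur[e] of numFound[startPos][e].
unsigned long long countPalindromesRolling(const std::string& s) {
    const std::size_t n = s.size();
    std::vector<unsigned long long> prev(n + 1, 0), cur(n + 1, 0);
    for (std::size_t start = n; start-- > 0;) {
        std::fill(cur.begin(), cur.end(), 0ULL);
        unsigned long long running = 0; // palindromes that begin at s[start]
        for (std::size_t end = start + 1; end <= n; ++end) {
            const std::size_t pos = end - 1; // newest candidate for the closing character
            if (s[pos] == s[start]) {
                running += 1 + prev[pos]; // empty middle, or any palindrome strictly inside
            }
            cur[end] = prev[end] + running; // s[start] left out, or s[start] used
        }
        std::swap(prev, cur);
    }
    return prev[n];
}
```

For "abb" this returns 4, matching the count in the question, and for "aaa" it returns 7, since every non-empty selection is a palindrome.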
I have a way to do it in O(N^2) time and O(1) space, though I think there must be better ways.
The basic idea is that a long palindrome must contain smaller palindromes, so we only search for the minimal matches, which come in two kinds: "aa" and "aba". If we find either, we expand outward to see whether it is part of a longer palindrome.
int count_palindromic_slices(const string &S) {
int count = 0;
for (int position=0; position<S.length(); position++) {
int offset = 0;
// Check the "aa" situation
while((position-offset>=0) && (position+offset+1)<S.length() && (S.at(position-offset))==(S.at(position+offset+1))) {
count ++;
offset ++;
}
offset = 1; // reset it for the odd length checking
// Check the string for "aba" situation
while((position-offset>=0) && position+offset<S.length() && (S.at(position-offset))==(S.at(position+offset))) {
count ++;
offset ++;
}
}
return count;
}
June 14th, 2012
After some investigation, I believe this is the best way to do it; it is faster than the accepted answer.
Is there any mileage in making an initial traversal and building an index of all occurrences of each character?
h = { 0, 2, 27}
i = { 1, 30 }
etc.
Now working from the left with h, the only possible palindromes are at 3 and 17: does char[0 + 1] == char[3 - 1], etc.? If so, we've got a palindrome. Does char[0 + 1] == char[27 - 1]? No, so no further analysis of char[0] is needed.
Move on to char[1]: we only need to examine char[30 - 1] and inwards.
Then we can probably get smart: once you've identified a palindrome running from position x to y, all inner subsets are known palindromes, so we've already dealt with some items and can eliminate those cases from later examination.
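The initial traversal could look something like this. It is a minimal sketch of the index-building step only, and buildOccurrenceIndex is a made-up name. The pruning logic described above is not implemented here.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Maps each character to the ascending list of positions where it occurs.
std::map<char, std::vector<std::size_t>> buildOccurrenceIndex(const std::string& s) {
    std::map<char, std::vector<std::size_t>> index;
    for (std::size_t i = 0; i < s.size(); ++i) {
        index[s[i]].push_back(i); // positions are appended in increasing order
    }
    return index;
}
```

With the index in hand, the candidate closing positions for the character at position i are exactly the entries of index[s[i]] that are greater than i, so no scan of the string is needed to find them.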
My solution using O(n) memory and O(n^2) time, where n is the string length:
palindrome.c:
#include <stdio.h>
#include <string.h>
typedef unsigned long long ull;
ull countPalindromesHelper (const char* str, const size_t len, const size_t begin, const size_t end, const ull count) {
if (begin <= 0 || end >= len) {
return count;
}
const char pred = str [begin - 1];
const char succ = str [end];
if (pred == succ) {
const ull newCount = count == 0 ? 1 : count * 2;
return countPalindromesHelper (str, len, begin - 1, end + 1, newCount);
}
return count;
}
ull countPalindromes (const char* str) {
ull count = 0;
size_t len = strlen (str);
size_t i;
for (i = 0; i < len; ++i) {
count += countPalindromesHelper (str, len, i, i, 0); // even length palindromes
count += countPalindromesHelper (str, len, i, i + 1, 1); // odd length palindromes
}
return count;
}
int main (int argc, char* argv[]) {
if (argc < 2) {
return 0;
}
const char* str = argv [1];
ull count = countPalindromes (str);
printf ("%llu\n", count);
return 0;
}
Usage:
$ gcc palindrome.c -o palindrome
$ ./palindrome myteststring
EDIT: I misread the problem as the contiguous substring version of the problem. Now given that one wants to find the palindrome count for the non-contiguous version, I strongly suspect that one could just use a math equation to solve it given the number of distinct characters and their respective character counts.
Hmmmmm, I think I would count up like this:
Each character is a palindrome on its own (minus repeated characters).
Each pair of the same character.
Each pair of the same character, with all palindromes sandwiched in the middle that can be made from the string between repeats.
Apply recursively.
Which seems to be what you're doing, although I'm not sure you don't double-count the edge cases with repeated characters.
So, basically, I can't think of a better way.
EDIT:
Thinking some more,
It can be improved with caching, because you sometimes count the palindromes in the same sub-string more than once. So, I suppose this demonstrates that there is definitely a better way.
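One way the caching could look is sketched below. It does not use the pair-and-sandwich recursion described above but swaps in a standard inclusion-exclusion recurrence with memoisation, so treat it as an illustration of caching sub-string counts rather than this answer's exact method. The names countRange and countPalindromesMemo are hypothetical, selections are counted by index (so "abb" gives 4, as in the question), and the counters wrap silently on overflow.

```cpp
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::vector<unsigned long long> > Table;

// Palindromic subsequences (chosen by index) within s[i..j], both ends inclusive.
static unsigned long long countRange(const std::string& s, int i, int j,
                                     Table& memo, std::vector<std::vector<char> >& known) {
    if (i > j) return 0;  // empty range
    if (i == j) return 1; // a single character
    if (known[i][j]) return memo[i][j]; // cached: this sub-string was already counted
    unsigned long long result;
    if (s[i] == s[j]) {
        // The doubly counted middle cancels against the palindromes wrapped by s[i]..s[j];
        // the extra 1 is the two-character palindrome s[i]s[j] itself.
        result = countRange(s, i + 1, j, memo, known)
               + countRange(s, i, j - 1, memo, known) + 1;
    } else {
        // Remove the palindromes counted in both sub-ranges.
        result = countRange(s, i + 1, j, memo, known)
               + countRange(s, i, j - 1, memo, known)
               - countRange(s, i + 1, j - 1, memo, known);
    }
    known[i][j] = 1;
    memo[i][j] = result;
    return result;
}

unsigned long long countPalindromesMemo(const std::string& s) {
    const int n = static_cast<int>(s.size());
    if (n == 0) return 0;
    Table memo(n, std::vector<unsigned long long>(n, 0));
    std::vector<std::vector<char> > known(n, std::vector<char>(n, 0));
    return countRange(s, 0, n - 1, memo, known);
}
```

Each (i, j) pair is computed once, which gives O(n^2) time and memory. The recursion depth grows with the string length, so a bottom-up table would be safer for very long inputs.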
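Here are two programs: a C++ one that checks whether a string is a palindrome, and a Java one that finds all the palindromic substrings in a string.
#include <iostream>
#include <string>
using namespace std;
int main()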
{
string palindrome;
cout << "Enter a String to check if it is a Palindrome";
cin >> palindrome;
int length = palindrome.length();
cout << "the length of the string is " << length << endl;
int end = length - 1;
int start = 0;
int check=1;
while (end >= start) {
if (palindrome[start] != palindrome[end]) {
cout << "The string is not a palindrome";
check=0;
break;
}
else
{
start++;
end--;
}
}
if(check)
cout << "The string is a Palindrome" << endl;
}
// requires: import java.util.HashSet; import java.util.Set;
public String[] findPalindromes(String source) {
Set<String> palindromes = new HashSet<String>();
for(int i=0; i<source.length()-1; i++) {
for(int j= i+1; j<source.length(); j++) {
String palindromeCandidate = source.substring(i, j+1);
if(isPalindrome(palindromeCandidate)) {
palindromes.add(palindromeCandidate);
}
}
}
return palindromes.toArray(new String[palindromes.size()]);
}
private boolean isPalindrome(String source) {
int i =0;
int k = source.length()-1;
for(i=0; i<source.length()/2; i++) {
if(source.charAt(i) != source.charAt(k)) {
return false;
}
k--;
}
return true;
}
I am not sure, but you might try with Fourier. This problem reminded me of this: O(nlogn) Algorithm - Find three evenly spaced ones within binary string
Just my 2cents