Permutation on a strange string - c++

There is a strange string of 10 characters, each either '0' or '1'. I have n filter strings, each also consisting of 10 characters that are either '0' or '1'. A '1' at the i-th position of a filter string means that applying this filter inverts the i-th character of the strange string: it becomes '1' if it was '0', and vice versa. A '0' at the i-th position of a filter string leaves that character unchanged. I can pick any number of filters and apply them to the strange string. I want to find how many different subsets of the filters transform the strange string so that it contains only 1's. I am not able to generalise the problem to any number of filter strings. Can anybody help?
Here are some test cases:
Enter strange string :1111111111
Total filter strings : 2
Enter filter strings :
0000000000
0000000000
Output is: 4
Explanation : The strange string already consists only of 1's, and I have two distinct identity filters. I can either apply the empty subset of filters, the first filter only, the second filter only, or both.
Enter strange string :0101010101
Total filter strings : 3
Enter filter strings :
1010101010
1010000000
0000101010
Output is: 2
Explanation : I can either apply the first filter (and invert all 0's) or apply the second and third filters in any order.

Brute force algorithm:
std::uint16_t apply_filters(std::uint16_t init,
const std::vector<std::uint16_t>& filters,
const std::vector<bool>& mask)
{
auto res = init;
for (std::size_t i = 0; i != filters.size(); ++i) {
if (mask[i]) {
res ^= filters[i];
}
}
return res;
}
bool increase(std::vector<bool>& bs)
{
for (std::size_t i = 0; i != bs.size(); ++i) {
bs[i] = !bs[i];
if (bs[i] == true) {
return true;
}
}
return false; // overflow
}
std::size_t count_filters_combination(std::uint16_t init,
const std::vector<std::uint16_t>& filters)
{
std::vector<bool> bs(filters.size());
std::size_t count = 0;
const std::uint16_t expected = 0b1111111111;
do
{
if (expected == apply_filters(init, filters, bs)) {
++count;
}
} while (increase(bs));
return count;
}

I wrote my solution in python. It should be pretty easy to understand. I'll leave it to you to translate it to C++.
def filterCombinations(original, filters):
    combinations = 1 if original == 0b1111111111 else 0
    for i in range(len(filters)):
        newvalue = original ^ filters[i]
        newfilters = filters[i+1:]
        combinations += filterCombinations(newvalue, newfilters)
    return combinations
Using 3 filters, the first level of recursion looks like this:
filterCombinations(S, [F1, F2, F3])
--> X +
filterCombinations(S^F1, [F2, F3]) +
filterCombinations(S^F2, [F3]) +
filterCombinations(S^F3, [])
Where X is 1 if S == 1111111111 and 0 otherwise.
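Since the answer leaves the C++ translation to the reader, here is one possible sketch of it. It assumes the strings have already been converted to 10-bit integers, and it replaces the Python list slicing with a hypothetical start index `first` so that no copies of the filter list are made. Like the brute force version, it still visits every subset, so it is only practical for a small number of filters.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts the subsets of filters[first..] that turn `value` into ten 1 bits.
// `first` is an illustrative parameter standing in for the Python slice filters[i+1:].
std::size_t filterCombinations(std::uint16_t value,
                               const std::vector<std::uint16_t>& filters,
                               std::size_t first = 0)
{
    const std::uint16_t allOnes = 0x3FF; // binary 1111111111

    // The empty subset counts if the value is already all 1's.
    std::size_t combinations = (value == allOnes) ? 1 : 0;

    // Choose filters[i] as the next filter to apply, then recurse on the rest.
    for (std::size_t i = first; i < filters.size(); ++i) {
        const auto next = static_cast<std::uint16_t>(value ^ filters[i]);
        combinations += filterCombinations(next, filters, i + 1);
    }
    return combinations;
}
```

With the two test cases from the question, this should give 4 for two identity filters applied to 1111111111, and 2 for the three filters applied to 0101010101.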


How to sort non-numeric strings by converting them to integers? Is there a way to convert strings to unique integers while being ordered?

I am trying to convert strings to integers and sort them based on the integer value. These values should be unique to the string, no other string should be able to produce the same value. And if a string1 is bigger than string2, its integer value should be greater. Ex: since "orange" > "apple", "orange" should have a greater integer value. How can I do this?
I know there are an infinite number of possibilities between just 'a' and 'b', but I am not trying to fit every single possibility into a number. I am just trying to sort, let's say, 1 million values, not an infinite amount.
I was able to get the values to be unique using the following:
long int order = 0;
for (auto letter : word)
order = order * 26 + letter - 'a' + 1;
return order;
but this obviously does not work since the value for "apple" will be greater than the value for "z".
This is not a homework assignment or a puzzle, this is something I thought of myself. Your help is appreciated, thank you!
You are almost there; just a few minor tweaks are needed:
Multiply by 27 instead of 26
You have the letters (a..z) plus the empty space (no letter), so there are 27 possible values per position.
Add zero padding
To make the first letter the most significant digit, you should zero-pad/align the strings to a common length. If you are using 32-bit integers, the maximum string length is:
floor(log27(2^32)) = 6
floor(32/log2(27)) = 6
Here is a small example:
int lexhash(char *s)
{
int i,h;
for (h=0,i=0;i<6;i++) // process string
{
if (s[i]==0) break;
h*=27;
h+=s[i]-'a'+1;
}
for (;i<6;i++) h*=27; // zeropad missing letters
return h;
}
returning these:
14348907 a
28697814 b
43046721 c
373071582 z
15470838 abc
358171551 xyz
23175774 apple
224829626 orange
ordered by hash:
14348907 a
15470838 abc
23175774 apple
28697814 b
43046721 c
224829626 orange
358171551 xyz
373071582 z
This will handle all lowercase a..z strings up to 6 characters length which is:
26^6 + 26^5 +26^4 + 26^3 + 26^2 + 26^1 = 321272406 possibilities
For longer strings, just use a wider integer type for the hash. Do not forget to use an unsigned type if you also use its highest bit (not the case for 6 letters in 32 bits).
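As a rough illustration of the wider-hash suggestion, the sketch below uses a 64-bit unsigned integer, which has room for floor(64/log2(27)) = 13 letters. The function name `lexhash64` and the `width` parameter are made up for this example. It assumes the input contains only lowercase a..z, and strings longer than `width` are truncated, just as in the 6-letter version.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Order-preserving base-27 encoding of a lowercase string.
// Each position holds 0 (no letter) or 1..26 (a..z); missing letters are zero-padded.
// `width` must be at most 13 so that 27^width fits into 64 bits.
std::uint64_t lexhash64(const std::string& s, std::size_t width = 13)
{
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < width; ++i) {
        h *= 27;
        if (i < s.size()) {
            h += static_cast<std::uint64_t>(s[i] - 'a' + 1);
        }
    }
    return h;
}
```

Calling it with `width` set to 6 should reproduce the values listed above, for example 14348907 for "a". Sorting the strings by this value then gives the same order as sorting the strings themselves, as long as none exceeds `width` characters.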
You can use position of char:
std::string s("apple");
int result = 0;
for (size_t i = 0; i < s.size(); ++i)
result += (s[i] - 'a') * static_cast<int>(i + 1);
return result;
By the way, you are trying to get something very similar to hash function.

Separating every second digit in an integer C++

I am currently finishing up an assignment I have to complete for my OOP class and I am struggling with one part in particular. Keep in mind I am still a beginner. The question is as follows:
If the string contains 13 characters, all of characters are digits and the check digit is modulo 10, this function returns true; false otherwise.
This is in regard to an EAN. I basically have to separate every second digit from the rest of the digits. For example, for 9780003194876 I need to do calculations with 7,0,0,1,4,7. I have no clue how to do this.
Any help would be greatly appreciated!
bool isValid(const char* str){
if (atoi(str) == 13){
}
return false;
}
You can start with a for loop which increments itself by 2 for each execution:
for (int i = 1, len = strlen(str); i < len; i += 2)
{
int digit = str[i] - '0';
// do something with digit
}
The above is just an example though...
Since the question was tagged as C++ (not C, so I suggest other answerers not solve this using C libraries, please; let us build the OP's C++ knowledge the right way from the beginning), and this is for an OOP class, I'm going to solve this the C++ way: use the std::string class:
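bool is_valid( const std::string& str )
{
    if( str.size() != 13 )
        return false;
    // Every second digit sits at indices 1, 3, 5, ...
    for( std::size_t i = 1 ; i < 13 ; i += 2 )
    {
        int digit = str[i] - '0';
        // Do what you want with the digit
    }
    return true; // placeholder: return the result of the check-digit test here
}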
First, if it's EAN, you have to process every digit, not just
every other one. In fact, all you need to do is a weighted sum
of the digits; for EAN-13, the weights alternate between 1 and
3, starting with 1 for the leftmost digit. The simplest solution
is probably to put them in a table (i.e. int weight[] = { 1, 3, 1, 3... };),
and iterate over the string (in this case, using an index rather
than iterators, since you want to be able to index into
weight as well), converting each digit into a numerical value
(str[i] - '0', if isdigit(static_cast<unsigned char>(str[i]))
is true; if it's false, you haven't got a digit), then
multiplying it by its weight and adding the product to the
running total. When you're finished, if
the total, modulo 10, is 0, it's correct. Otherwise, it isn't.
You certainly don't want to use atoi, since you don't want the
numerical value of the string; you want to treat each digit
separately.
Just for the record, professionally, I'd write something like:
bool
isValidEAN13( std::string const& value )
{
return value.size() == 13
&& std::find_if(
value.begin(),
value.end(),
[]( unsigned char ch ){ return !isdigit( ch ); } )
== value.end()
&& calculateEAN13( value ) == value.back() - '0';
}
where calculateEAN13 does the actual calculations (and can be
used for both generation and checking). I suspect that this
goes beyond the goal of the assignment, however, and that all
your teacher is looking for is the calculateEAN13 function,
with the last check (which is why I'm not giving it in full).

Given a word and a text, return the count of the occurrences of anagrams of the word in the text [duplicate]

For example, if the word is "for" and the text is "forxxorfxdofr", the anagrams of "for" are "ofr", "orf", "fro", etc., so the answer would be 3 for this particular example.
Here is what I came up with.
#include<iostream>
#include<cstring>
using namespace std;
int countAnagram (char *pattern, char *text)
{
int patternLength = strlen(pattern);
int textLength = strlen(text);
int dp1[256] = {0}, dp2[256] = {0}, i, j;
for (i = 0; i < patternLength; i++)
{
dp1[pattern[i]]++;
dp2[text[i]]++;
}
int found = 0, temp = 0;
for (i = 0; i < 256; i++)
{
if (dp1[i]!=dp2[i])
{
temp = 1;
break;
}
}
if (temp == 0)
found++;
for (i = 0; i < textLength - patternLength; i++)
{
temp = 0;
dp2[text[i]]--;
dp2[text[i+patternLength]]++;
for (j = 0; j < 256; j++)
{
if (dp1[j]!=dp2[j])
{
temp = 1;
break;
}
}
if (temp == 0)
found++;
}
return found;
}
int main()
{
char pattern[] = "for";
char text[] = "ofrghofrof";
cout << countAnagram(pattern, text);
}
Does there exist a faster algorithm for the said problem?
Most of the time will be spent searching, so to make the algorithm more time efficient, the objective is to reduce the number of searches or to optimize the search.
Method 1: A table of search starting positions.
Create a vector of lists, one vector slot for each letter of the alphabet. This can be space-optimized later.
Each slot will contain a list of indices into the text.
Example text: forxxorfxdofr
Slot List
'f' 0 --> 7 --> 11
'o' 1 --> 5 --> 10
'r' 2 --> 6 --> 12
For each word, look up the letter in the vector to get a list of indexes into the text. For each index in the list, compare the text string position from the list item to the word.
So with the above table and the word "ofr", the first compare occurs at index 1, second compare at index 5 and last compare at index 10.
You could eliminate near-end of text indices where (index + word length > text length).
You can use the commutativity of multiplication, along with the uniqueness of prime factorization. This builds on a previous answer of mine.
Create a mapping from each character to a prime number (as small as possible), e.g. a-->2, b-->3, c-->5, etc. This can be kept in a simple array.
Now, convert the given word into the product of the primes matching each of its characters. This result will be equal to the corresponding product for any anagram of that word.
Now sweep over the text, and at any given step maintain the product of the primes matching the last L characters (where L is the length of your word). So every time you advance you do
mul = mul * char2prime(text[i]) / char2prime(text[i-L])
Whenever this product equals that of your word, increment the overall counter, and you're done.
Note that this method works well on short words, but the product of primes can overflow a 64-bit variable pretty fast (by ~9-10 letters), so you'll have to use a big-number library to support longer words.
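The prime-product idea could be sketched roughly as follows. This assumes the word and the text contain only lowercase a..z and that the word has at most 9 letters, so the product always fits into 64 bits; the function names are illustrative only. The division is done before the multiplication so the running product never holds more than L primes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative helper: maps 'a'..'z' to the first 26 primes.
std::uint64_t char2prime(char c)
{
    static const std::uint64_t primes[26] = {
        2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
        43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101};
    return primes[c - 'a'];
}

// Counts the windows of `text` that are anagrams of `word`.
std::size_t countAnagramsByPrimes(const std::string& word, const std::string& text)
{
    const std::size_t L = word.size();
    if (L == 0 || L > 9 || text.size() < L) {
        return 0; // longer words would need big-number arithmetic
    }
    std::uint64_t target = 1;
    std::uint64_t mul = 1;
    for (std::size_t i = 0; i < L; ++i) {
        target *= char2prime(word[i]);
        mul *= char2prime(text[i]); // product for the first window
    }
    std::size_t count = (mul == target) ? 1 : 0;
    for (std::size_t i = L; i < text.size(); ++i) {
        // Slide the window: drop text[i-L], add text[i].
        mul = mul / char2prime(text[i - L]) * char2prime(text[i]);
        if (mul == target) {
            ++count;
        }
    }
    return count;
}
```

For the word "for" and the text "forxxorfxdofr" this should return 3, matching the example in the question.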
This algorithm is reasonably efficient if the pattern to be anagrammed is so short that the best way to search it is to simply scan it. To allow longer patterns, the scans represented here by the 'for jj' and 'for mm' loops could be replaced by more sophisticated search techniques.
// sLine -- string to be searched
// sWord -- pattern to be anagrammed
// (in this pseudo-language, the index of the first character in a string is 0)
// iAnagrams -- count of anagrams found
iLineLim = length(sLine)-1
iWordLim = length(sWord)-1
// we need a 'deleted' marker char that will never appear in the input strings
chNil = chr(0)
iAnagrams = 0 // well we haven't found any yet have we
// examine every posn in sLine where an anagram could possibly start
for ii from 0 to iLineLim-iWordLim do {
chK = sLine[ii]
// does the char at this position in sLine also appear in sWord
for jj from 0 to iWordLim do {
if sWord[jj]=chK then {
// yes -- we have a candidate starting posn in sLine
// is there an anagram of sWord at this position in sLine
sCopy = sWord // make a temp copy that we will delete one char at a time
sCopy[jj] = chNil // delete the char we already found in sLine
// the rest of the anagram would have to be in the next iWordLim positions
cc = true // stays true if there is nothing left to check (one-character word)
for kk from ii+1 to ii+iWordLim do {
chK = sLine[kk]
cc = false
for mm from 0 to iWordLim do { // look for anagram char
if sCopy[mm]=chK then { // found one
cc = true
sCopy[mm] = chNil // delete it from copy
break // out of 'for mm'
}
}
if not cc then break // out of 'for kk' -- no anagram char here
}
if cc then { iAnagrams = iAnagrams+1 }
break // out of 'for jj'
}
}
}
-Al.

Compare part of the string

Okay so here is what I'm trying to accomplish.
First of all, the table below is just an example I created; in my assignment I'm not supposed to know any of these values, which means I don't know what will be passed in or how long each string is.
The task I'm trying to accomplish is to compare part of a string:
//In Array `phrase` // in array `word`
"Backdoor", 0 "mark" 3 (matches "Market")
"DVD", 1 "of" 2 (matches "Get off")
"Get off", 2 "" -1 (no match)
"Market", 3 "VD" 1 (matches "DVD")
As you can see above, the left-hand side is the array that I store in my class; it holds up to 10 phrases.
Here is the class definition.
class data
{
char phrase[10][40];
public:
int match(const char word[ ]);
};
so I'm using member function to access this private data.
int data::match(const char word[ ])
{
int n;
const int wordLength = strlen(word);
for (n=0 ; n <= 10; n++)
{
if (strncmp (phrase[n],word,wordLength) == 0)
{
return n;
}
}
return -1;
}
The code above should return the index n if it finds a match, and should always return -1 if no match is found.
What happens now is that it always returns 10.
You're almost there, but your code is incomplete, so I'm shooting in the dark on a few things.
You may have one too many variables representing an index. Unless n and i are different you should only use one. Also try to use more descriptive names, pos seems to represent the length of the text you are searching.
for (n=0 ; n <= searchLength ; n++)
Since the length of word never changes you don't need to call strlen every time. Create a variable to store the length in before the for loop.
const int wordLength = strlen(word);
I'm assuming the text you are searching is stored in a char array. This means you'll need to pass a pointer to the first element stored at n.
if (strncmp (&phrase[n],word,wordLength) == 0)
In the end you have something that looks like the following:
char word[256] = "there";
char phrase[256] = "hello there hippie!";
const int wordLength = strlen(word);
const int searchLength = strlen(phrase);
for (int n = 0; n <= searchLength; n++)
{
// or phrase + n
if (strncmp(&phrase[n], word, wordLength) == 0)
{
return n;
}
}
return -1;
Note: The final example is now complete to the point of returning a match.
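To connect this back to the table in the question, here is one way the whole lookup might be written with std::string. It is only a sketch under assumptions read off that table: the match is a case-insensitive substring search ("mark" matches "Market"), an empty word never matches, and the first matching phrase wins. The name `matchPhrase` and the use of a vector in place of the class's char array are illustrative choices.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Returns the index of the first phrase containing `word` (ignoring case), or -1.
int matchPhrase(const std::vector<std::string>& phrases, const std::string& word)
{
    if (word.empty()) {
        return -1; // the table treats "" as "no match"
    }
    // Compare two characters without regard to case.
    auto sameChar = [](unsigned char a, unsigned char b) {
        return std::tolower(a) == std::tolower(b);
    };
    for (std::size_t n = 0; n < phrases.size(); ++n) {
        const std::string& phrase = phrases[n];
        auto it = std::search(phrase.begin(), phrase.end(),
                              word.begin(), word.end(), sameChar);
        if (it != phrase.end()) {
            return static_cast<int>(n);
        }
    }
    return -1;
}
```

With the phrases "Backdoor", "DVD", "Get off" and "Market", the words "mark", "of", "" and "VD" should give 3, 2, -1 and 1, as in the table.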
I'm puzzled about your problem. Some cases are unclear. For example, with abcdefg --- abcde, is the match "abcde"? How many words match? Other examples: with abcdefg --- dcb, is the match "c"? And with abcdefg --- aoodeoofoo, is the match "a" or "adef"? If you want to find the first matched word, that's OK and very simple. But if you are to find the longest, non-contiguous match, it is a much bigger question. I think you should do some research on the LCS problem (Longest Common Subsequence).

comparing bits (one position at a time)

Initially I have user input decimal numbers (0 - 15), and I will turn that into binary numbers.
Say these numbers are written into a text file, as shown in the picture. These numbers are arranged by the numbers of 1's. The dash - is used to separate different groups of 1.
I have to read this file, and compare strings of one group with the all the strings in the group below, i.e., Group 1 with all the strings in group 2, and group 2 - group 3.
The deal is that, only one column of 0 / 1 difference is allowed, and that column is replaced by letter t. If more than one column of difference is encountered, write none.
So say group 2, 0001 with group 3, 0011, only the second column is different. however, 0010 and 0101 are two columns of difference.
The result will be written into another file.....
At the moment, when I am reading these strings, I am using a vector of strings. I came across bitset. What is important is that I have to access the characters one at a time, meaning I have to break the vector of strings into a vector of chars. But it seems like there could be an easier way to do it.
I even thought about a hash table - linked-list. Having group 1 assigned to H[0]. Each comparison is done as H[current-group] with H[current_group+1]. But beyond the first comparison (comparing 1's and 0's), the comparison beyond that will not work under this hash-linked way. So I gave up on that.
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <algorithm>
#include <iterator>
using namespace std;
int main() {
ifstream inFile("a.txt");
vector<string> svec;
copy(istream_iterator<string>(inFile), istream_iterator<string>(), back_inserter(svec));
copy(svec.begin(), svec.end(), ostream_iterator<string>(cout,"\n"));
for(int i = 0; i < svec.size(); i++)
{
cout << svec[i] << " ";
}
inFile.close();
return 0;
}
This is the sample code of writing it into a file....but like I said, the whole deal of vector seems impractical in my case....
Any help is appreciated. thanks
I don't understand your code snippet -- it looks like all it does is read in the input file into a vector of strings, which will then contain each whitespace-delimited word in a separate string, then write it back out in 2 different ways (once with words separated by \n, once with them separated by spaces).
It seems the main problem you're having is with reading and interpreting the file itself, as opposed to doing the necessary calculations -- right? That's what I hope this answer will help you with.
I think the line structure of the file is important -- right? In that case you would be better off using the global getline() function in the <string> header, which reads an entire line (rather than a whitespace-delimited word) into a string. (Admittedly that function is pretty well-hidden!) Also you don't actually need to read all the lines into a vector, and then process them -- it's more efficient and actually easier to distill them down to numbers or bitsets as you go:
vector<unsigned> last, curr; // An unsigned can comfortably hold 0-15
ifstream inf("a.txt");
while (true) {
string line;
getline(inf, line); // This is the group header: ignore it
while (getline(inf, line)) {
if (line == "-") {
break;
}
// This line contains a binary string: turn it into a number
// We ignore all characters that are not binary digits
unsigned val = 0;
for (int i = 0; i < line.size(); ++i) {
if (line[i] == '0' || line[i] == '1') {
val = (val << 1) + line[i] - '0';
}
}
curr.push_back(val);
}
// Either we reached EOF, or we saw a "-". Either way, compare
// the last 2 groups.
compare_them_somehow(curr, last); // Not doing everything for you ;)
last = curr; // Using swap() would be more efficient, but who cares
curr.clear();
if (!inf) {
break; // Either the disk exploded, or we reached EOF, so we're done.
}
}
Perhaps I've misunderstood your goal, but strings are amenable to array member comparison:
string first = "001111";
string next = "110111";
int sizeFromTesting = 5;
int columnsOfDifference = 0;
for ( int UU = sizeFromTesting; UU >=0; UU-- )
{
if ( first[ UU ] != next[ UU ] )
columnsOfDifference++;
}
cout << columnsOfDifference;
cin.ignore( 99, '\n' );
return 0;
Substitute file streams and bound protection where appropriate.
Not directly applicable here, but to compare variables literally bit by bit, AND each of them with a mask for the digit in question (000010 for the second digit).
If the OR of the two masked values is 0, they match: both bits are 0. If the OR is nonzero and the AND is also nonzero, that digit is 1 in both. Otherwise they differ. Repeat for all the bits and all the numbers in the group.
in vb.net
'group_0 with group_1
If (group_0_count > 0 AndAlso group_1_count > 0) Then
Dim result = ""
Dim index As Integer = 0
Dim g As Integer = 0
Dim h As Integer = 0
Dim i As Integer = 0
For g = 0 To group_0_count - 1
For h = 0 To group_1_count - 1
result = ""
index = 0
For i = 0 To 3
If group_1_0.Items(g).ToString.Chars(i) <> group_1_1.Items(h).ToString.Chars(i) Then
result &= "-"
index = index + 1
Else
result &= group_1_0.Items(g).ToString.Chars(i)
End If
Next
Next
Next
End If
Read it in as an integer, then all you should need is comparisons with bitshifts and bit masks.
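A small sketch of that integer approach, assuming the values have already been parsed into unsigned integers and are 4 bits wide as in the question. XOR marks the columns that differ, and the test `diff & (diff - 1)` checks that exactly one bit is set. The function name `combine` and the choice to return "none" for identical values are assumptions made for this example.

```cpp
#include <string>

// Compares two `width`-bit values column by column.
// Returns the common pattern with 't' in the single differing column,
// or "none" if the values differ in zero columns or in more than one.
std::string combine(unsigned a, unsigned b, int width = 4)
{
    const unsigned diff = a ^ b; // 1 bits mark the columns that differ
    if (diff == 0 || (diff & (diff - 1)) != 0) {
        return "none"; // not exactly one differing column
    }
    std::string out;
    for (int bit = width - 1; bit >= 0; --bit) {
        const unsigned mask = 1u << bit;
        if (diff & mask) {
            out += 't';
        } else {
            out += (a & mask) ? '1' : '0';
        }
    }
    return out;
}
```

For 0001 and 0011 this should produce "00t1", while 0010 and 0101 differ in three columns and should produce "none".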