comparing bits (one position at a time) - c++

Initially I have user-input decimal numbers (0 - 15), which I turn into binary numbers.
Say these numbers are written into a text file, as shown in the picture. The numbers are grouped by how many 1's they contain, and a dash - separates the groups.
I have to read this file and compare each string of one group with all the strings in the group below it, i.e., group 1 with all the strings in group 2, then group 2 with group 3.
The catch is that only one column of 0 / 1 difference is allowed, and that column is replaced by the letter t. If more than one column differs, write none.
So for 0001 in group 2 and 0011 in group 3, only the second column (counting from the right) is different. However, 0010 and 0101 differ in more than one column.
The result is written to another file.
At the moment I read these strings into a vector<string>. I also came across bitset. What matters is that I have to access the characters one at a time, meaning I have to break the vector<string> into a vector<char>. But it seems like there could be an easier way to do it.
I even thought about a hash table of linked lists, with group 1 assigned to H[0]. Each comparison is done as H[current_group] with H[current_group+1]. But beyond the first round of comparisons (comparing 1's and 0's), later comparisons will not work with this hash/linked-list layout. So I gave up on that.
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <algorithm>
#include <iterator>
using namespace std;
int main() {
ifstream inFile("a.txt");
vector<string> svec;
copy(istream_iterator<string>(inFile), istream_iterator<string>(), back_inserter(svec));
copy(svec.begin(), svec.end(), ostream_iterator<string>(cout,"\n"));
for(int i = 0; i < svec.size(); i++)
{
cout << svec[i] << " ";
}
inFile.close();
return 0;
}
This is the sample code for reading the file, but like I said, the whole vector approach seems impractical in my case.
Any help is appreciated. Thanks.

I don't understand your code snippet -- it looks like all it does is read in the input file into a vector of strings, which will then contain each whitespace-delimited word in a separate string, then write it back out in 2 different ways (once with words separated by \n, once with them separated by spaces).
It seems the main problem you're having is with reading and interpreting the file itself, as opposed to doing the necessary calculations -- right? That's what I hope this answer will help you with.
I think the line structure of the file is important -- right? In that case you would be better off using the global getline() function in the <string> header, which reads an entire line (rather than a whitespace-delimited word) into a string. (Admittedly that function is pretty well-hidden!) Also you don't actually need to read all the lines into a vector, and then process them -- it's more efficient and actually easier to distill them down to numbers or bitsets as you go:
vector<unsigned> last, curr; // An unsigned can comfortably hold 0-15
ifstream inf("a.txt");
while (true) {
string line;
getline(inf, line); // This is the group header: ignore it
while (getline(inf, line)) {
if (line == "-") {
break;
}
// This line contains a binary string: turn it into a number
// We ignore all characters that are not binary digits
unsigned val = 0;
for (int i = 0; i < line.size(); ++i) {
if (line[i] == '0' || line[i] == '1') {
val = (val << 1) + line[i] - '0';
}
}
curr.push_back(val);
}
// Either we reached EOF, or we saw a "-". Either way, compare
// the last 2 groups.
compare_them_somehow(curr, last); // Not doing everything for you ;)
last = curr; // Using swap() would be more efficient, but who cares
curr.clear();
if (!inf) {
break; // Either the disk exploded, or we reached EOF, so we're done.
}
}

Perhaps I've misunderstood your goal, but strings are amenable to array member comparison:
string first = "001111";
string next = "110111";
int sizeFromTesting = 5;
int columnsOfDifference = 0;
for ( int UU = sizeFromTesting; UU >=0; UU-- )
{
if ( first[ UU ] != next[ UU ] )
columnsOfDifference++;
}
cout << columnsOfDifference;
cin.ignore( 99, '\n' );
return 0;
Substitute file streams and bounds checking where appropriate.
Not applicable here, but to compare variables literally bit by bit, AND each of them with a mask for the digit in question (000010 for the second digit).
If the OR of the two masked values is 0, the bits match: both are 0. If both the OR and the AND are nonzero, that bit is 1 in both. Otherwise they differ. Repeat for all the bits and all the numbers in the group.

in vb.net
'group_0 with group_1
If (group_0_count > 0 AndAlso group_1_count > 0) Then
Dim result = ""
Dim index As Integer = 0
Dim g As Integer = 0
Dim h As Integer = 0
Dim i As Integer = 0
For g = 0 To group_0_count - 1
For h = 0 To group_1_count - 1
result = ""
index = 0
For i = 0 To 3
If group_1_0.Items(g).ToString.Chars(i) <> group_1_1.Items(h).ToString.Chars(i) Then
result &= "-"
index = index + 1
Else
result &= group_1_0.Items(g).ToString.Chars(i)
End If
Next
Next
Next
End If

Read it in as an integer, then all you should need is comparisons with bitshifts and bit masks.
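As a minimal sketch of that idea, assuming each 4-bit string has already been converted to an unsigned value: XOR marks the differing columns, and a value with exactly one bit set means exactly one column differs. The helper name combine and the choice to return "none" for identical inputs are assumptions made for illustration, not part of the question.

```cpp
#include <bitset>
#include <string>

// Hypothetical helper: merges two 4-bit values that differ in exactly one
// column, replacing that column with 't'. Returns "none" otherwise.
std::string combine(unsigned a, unsigned b)
{
    const unsigned diff = a ^ b;  // set bits mark the columns that differ

    // Exactly one differing column means diff is a nonzero power of two.
    if (diff == 0 || (diff & (diff - 1)) != 0) {
        return "none";
    }

    std::string out = std::bitset<4>(a).to_string();  // e.g. 1 -> "0001"
    for (int col = 0; col < 4; ++col) {
        // Column 0 is the leftmost character, i.e. bit 3.
        if (diff & (1u << (3 - col))) {
            out[col] = 't';
        }
    }
    return out;
}
```

With this sketch, combine(1, 3) (0001 and 0011) gives 00t1, while combine(2, 5) (0010 and 0101) gives none.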

Related

How to sort non-numeric strings by converting them to integers? Is there a way to convert strings to unique integers while being ordered?

I am trying to convert strings to integers and sort them based on the integer value. These values should be unique to the string, no other string should be able to produce the same value. And if a string1 is bigger than string2, its integer value should be greater. Ex: since "orange" > "apple", "orange" should have a greater integer value. How can I do this?
I know there are an infinite number of possibilities between just 'a' and 'b', but I am not trying to fit every single possibility into a number. I am just trying to sort, let's say, 1 million values, not an infinite amount.
I was able to get the values to be unique using the following:
long int order = 0;
for (auto letter : word)
order = order * 26 + letter - 'a' + 1;
return order;
but this obviously does not work since the value for "apple" will be greater than the value for "z".
This is not a homework assignment or a puzzle, this is something I thought of myself. Your help is appreciated, thank you!
You are almost there; just a few minor tweaks are needed:
Multiply by 27 instead of 26
You have the letters (a..z) plus the empty space (no letter), so the base has to be 27.
Add zero padding
To make the first letter the most significant digit, you should zero-pad/align the strings to a common length. If you are using 32-bit integers, then the maximum string length is:
floor(log27(2^32)) = 6
floor(32/log2(27)) = 6
Here is a small example:
int lexhash(char *s)
{
int i,h;
for (h=0,i=0;i<6;i++) // process string
{
if (s[i]==0) break;
h*=27;
h+=s[i]-'a'+1;
}
for (;i<6;i++) h*=27; // zeropad missing letters
return h;
}
returning these:
14348907 a
28697814 b
43046721 c
373071582 z
15470838 abc
358171551 xyz
23175774 apple
224829626 orange
ordered by hash:
14348907 a
15470838 abc
23175774 apple
28697814 b
43046721 c
224829626 orange
358171551 xyz
373071582 z
This handles all lowercase a..z strings of up to 6 characters, which is:
26^6 + 26^5 +26^4 + 26^3 + 26^2 + 26^1 = 321272406 possibilities
For longer strings, just use a wider integer type for the hash. Do not forget to use an unsigned type if you also need its highest bit (not the case for 32 bits).
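As a sketch of the wider-integer variant, assuming lowercase a..z input only: with a 64-bit unsigned value the same base-27 scheme covers up to 13 characters, since 27^13 is below 2^64 and 27^14 is not. The name lexhash64 and the assert-based input checks are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical 64-bit variant of the base-27 scheme: a..z map to 1..26,
// and missing trailing letters count as 0 (zero padding).
std::uint64_t lexhash64(const std::string& s)
{
    const std::size_t maxLen = 13;  // largest length for which 27^len < 2^64
    assert(s.size() <= maxLen);

    std::uint64_t h = 0;
    for (std::size_t i = 0; i < maxLen; ++i) {
        h *= 27;  // shift one base-27 digit to the left
        if (i < s.size()) {
            assert(s[i] >= 'a' && s[i] <= 'z');
            h += static_cast<std::uint64_t>(s[i] - 'a' + 1);
        }
    }
    return h;
}
```

Comparing two such values then gives the same result as comparing the original strings lexicographically, for example lexhash64("apple") < lexhash64("orange").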
You can use position of char:
std::string s("apple");
int result = 0;
for (size_t i = 0; i < s.size(); ++i)
result += (s[i] - 'a') * static_cast<int>(i + 1);
return result;
By the way, what you are trying to build is very similar to a hash function. Note that this weighted sum is neither unique nor order-preserving: "ab" and "c" both give 2.

my run-length encoding doesn't work with big numbers

I have an assignment where I need to encode and decode txt files. For example, hello how are you? has to be encoded as hel2o how are you? and aaaaaaaaaajkle as a10jkle.
while ( ! invoer.eof ( ) ) {
if (kar >= '0' && kar <= '9') {
counter = kar-48;
while (counter > 1){
uitvoer.put(vorigeKar);
counter--;
}
}else if (kar == '/'){
kar = invoer.get();
uitvoer.put(kar);
}else{
uitvoer.put(kar);
}
vorigeKar = kar;
kar = invoer.get ( );
}
The problem is that if I need to decode a12bhr, the answer should be aaaaaaaaaaaabhr, but I can't seem to read the 12 as a number without problems. I also can't use any strings or arrays.
I believe that you are making the following mistake: imagine you are given a32. You read the character a and save it as vorigeKar (previous character; I am Flemish, so I understand Dutch :-) ).
Then you read 3, you understand that it is a number and you repeat vorigeKar three times, which leads to aaa. Then you read 2 and repeat vorigeKar two times, leading to aaaaa (five times, five equals 3 + 2).
You need to learn how to keep on reading numeric characters, and translate them into complete numbers (like 32, or 12 in your case).
As @Dominique said in his answer, you're doing it wrong.
Here is my logic; you can try it.
Pseudocode + logic:
Store the word as a char array or string, so that it's easy to print at the end
Loop{
Read - a //check whether it's a number by subtracting '0'
Read - 1 //number = true. Store it in int res[]: res = res*10 + 1
//Also store the previous index in an index array, i.e. the index of char 'a', the first time you encounter a number.
Read - 2 //number = true. Store it in res = res*10 + 2
Read - b, h and so on until the "space" character
If you encounter another number, store its previous character's index in the index array and then store the number in the res[] array.
Now, using the index array, you can get the index of each repeating character and print it the corresponding number of times stored in the res[] array.
This goes for the second, third, etc. numbers in your word until the end of the word
}
First, even though you say you can't use strings, you still need to know the basic principle behind how to turn a stream of digit characters into an integer.
Assuming the number is positive, here is a simple function that turns a series of digits into a number:
#include <iostream>
#include <cctype>
int runningTotal(char ch, int lastNum)
{
return lastNum * 10 + (ch -'0');
}
int main()
{
// As a test
char s[] = "a123b23cd1/";
int totalNumber = 0;
for (size_t i = 0; s[i] != '/'; ++i)
{
char digit = s[i]; // This is the character "read from the file"
if ( isdigit( digit) )
totalNumber = runningTotal(digit, totalNumber);
else
{
if ( totalNumber > 0 )
std::cout << totalNumber << "\n";
totalNumber = 0;
}
}
std::cout << totalNumber;
}
Output:
123
23
1
So what was done? The character array is the "file". I then loop for each character, building up the number. The runningTotal is a function that builds the integer from each digit character encountered. When a non-digit is found, we output that number and start the total from 0 again.
The code does not save the letter to "multiply" -- I leave that to you as homework. But the code above illustrates how to take digits and create the number from them. For using a file, you would simply replace the for loop with the reading of each character from the file.
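For illustration only, here is one way the pieces could fit together: a sketch of a decoder that reads one character at a time, remembers the last literal character, and builds a multi-digit count with the same lastNum * 10 + digit step. It assumes the format from the question (a count follows the character it repeats, and '/' makes the next character literal). The names decode and decodeString are hypothetical, and decodeString uses std::string only as a convenience for trying it out; the core loop needs no strings or arrays.

```cpp
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical decoder: "a12bhr" -> twelve 'a's followed by "bhr".
void decode(std::istream& in, std::ostream& out)
{
    char prev = 0;          // last literal character written
    bool havePrev = false;  // false until the first literal has been seen
    bool inNumber = false;  // true while digits of a count are being read
    int count = 0;          // count built so far
    char ch;

    while (in.get(ch)) {
        if (havePrev && std::isdigit(static_cast<unsigned char>(ch))) {
            count = count * 10 + (ch - '0');  // extend the running number
            inNumber = true;
            continue;
        }
        if (inNumber) {
            // prev was already written once, so write count - 1 more copies.
            for (int i = 1; i < count; ++i) out.put(prev);
            count = 0;
            inNumber = false;
        }
        if (ch == '/' && !in.get(ch)) break;  // '/' escapes the next character
        out.put(ch);
        prev = ch;
        havePrev = true;
    }
    // The input may end while a count is still pending.
    for (int i = 1; inNumber && i < count; ++i) out.put(prev);
}

// Convenience wrapper for experiments; not needed when decoding files.
std::string decodeString(const std::string& encoded)
{
    std::istringstream in(encoded);
    std::ostringstream out;
    decode(in, out);
    return out.str();
}
```

For files, decode can be called directly with an ifstream and an ofstream in place of the string streams.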

Print out each character randomly

I am creating a small game where the user will have hints(Characters of a string) to guess the word of a string. I have the code to see each individual character of the string, but is it possible that I can see those characters printed out randomly?
string str("TEST");
for (int i = 0; i < str.size(); i++){
cout <<" "<< str[i];
}
output:T E S T
desired sample output: E T S T
Use random_shuffle on the string:
random_shuffle(str.begin(), str.end());
Edit:
From C++11 onwards, use the following (random_shuffle was deprecated in C++14 and removed in C++17):
auto engine = std::default_random_engine{};
shuffle ( begin(str), end(str), engine );
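One detail worth noting: a default-constructed engine starts from a fixed seed, so the snippet above produces the same order on every run. A minimal sketch that seeds the engine from std::random_device instead (the helper name shuffled is an assumption for this example):

```cpp
#include <algorithm>
#include <random>
#include <string>

// Hypothetical helper: returns a randomly permuted copy of s.
std::string shuffled(std::string s)
{
    std::random_device rd;      // source of a (usually) non-deterministic seed
    std::mt19937 engine(rd());  // Mersenne Twister seeded from rd
    std::shuffle(s.begin(), s.end(), engine);
    return s;
}
```

Because the function takes its argument by value, the original string is left untouched and can still be used to check the player's guess.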
Use the following code to generate the letters randomly.
const int stl = str.size();
int stl2 = stl;
while (stl2 > 0)
{
int r = rand() % stl;
if (str[r] != '0')
{
cout<<" "<<str[r];
str[r] = '0';
stl2--;
}
}
This code basically generates the random number based on the size of the String and then prints the character placed at that particular position of the string.
To avoid the reprinting of already printed character, I have converted the character printed to "0", so next time same position number is generated, it will check if the character is "0" or not.
If you need to preserve the original string, then you may copy the string to another variable and use it in the code.
Note: It is assumed that string will contain only alphabetic characters and so to prevent repetition, "0" is used. If your string may contain numbers, you may use a different character for comparison purpose

String not modified by loop

I'm solving the following problem:
The assignment is to create and return a string object that consists of digits in an int that is sent in through the function's parameter; so the expected output of the function call string pattern(int n) would be "1\n22\n..n\n".
In case you're interested, here is the URL (You need to be signed in to view) to the full assignment, a CodeWars Kata
This is one of the tests (with my return included):
Test-case input: pattern(2)
Expected:
1
22
Actual: "OUTPUT"
//string header file and namespace are already included for you
string pattern(int n){
string out = "OUTPUT";
for (int i = 1; i <= n; ++i){
string temp = "";
temp.insert(0, i, i);
out += temp;
}
return out;
}
The code is self-explanatory and I'm sure there are multiple ways of making it run quicker and more efficiently.
My question is two-fold. Why doesn't my loop start (even though my expression should hold true (1 <= 2) for above case)?
And how does my code hold in the grand scheme of things? Am I breaking any best-practices?
The overload of std::string::insert() that you are using takes three arguments:
index
count
character
You are using i as both count and character. However, the function expects the character to be of char type. In your case, i is interpreted as a character with code 1 or 2, which are unprintable control characters (they may show up as blanks or as nothing at all). So your output really is OUTPUT followed by three invisible characters.
If you look at the ASCII table, you will notice that the digits 0123...9 have codes from 48 to 57, so to get the character for a particular digit, you can do i + 48, or i + '0' (where '0' is the code of 0, which is 48). Note that this only works for single digits (i up to 9). Finally, you can do it all in the constructor:
string temp(i, i + '0');
The loop works - but does nothing visible. You insert the character-code 1 - not the character '1'; use:
temp.insert(0, i, '0'+i);
the insert method is not called right:
temp.insert(0, i, i); --->
temp.insert(0, i, i+'0');
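Putting the fixes together, a sketch of the whole function might look like the following. It also appends the '\n' after each row that the expected output requires and starts from an empty string instead of "OUTPUT". Using std::to_string so that rows for i >= 10 repeat the full decimal number is an assumption about what the kata expects.

```cpp
#include <string>

// Sketch: pattern(2) yields "1\n22\n".
std::string pattern(int n)
{
    std::string out;
    for (int i = 1; i <= n; ++i) {
        const std::string digits = std::to_string(i);  // "1", "2", ..., "10", ...
        for (int k = 0; k < i; ++k) {
            out += digits;  // repeat the number i times
        }
        out += '\n';        // each row ends with a newline
    }
    return out;
}
```

For n smaller than 1 the loop never runs and the function returns an empty string.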

Given a word and a text, return the count of the occurrences of anagrams of the word in the text

For eg. word is for and the text is forxxorfxdofr, anagrams of for will be ofr, orf, fro, etc. So the answer would be 3 for this particular example.
Here is what I came up with.
#include<iostream>
#include<cstring>
using namespace std;
int countAnagram (char *pattern, char *text)
{
int patternLength = strlen(pattern);
int textLength = strlen(text);
int dp1[256] = {0}, dp2[256] = {0}, i, j;
for (i = 0; i < patternLength; i++)
{
dp1[pattern[i]]++;
dp2[text[i]]++;
}
int found = 0, temp = 0;
for (i = 0; i < 256; i++)
{
if (dp1[i]!=dp2[i])
{
temp = 1;
break;
}
}
if (temp == 0)
found++;
for (i = 0; i < textLength - patternLength; i++)
{
temp = 0;
dp2[text[i]]--;
dp2[text[i+patternLength]]++;
for (j = 0; j < 256; j++)
{
if (dp1[j]!=dp2[j])
{
temp = 1;
break;
}
}
if (temp == 0)
found++;
}
return found;
}
int main()
{
char pattern[] = "for";
char text[] = "ofrghofrof";
cout << countAnagram(pattern, text);
}
Does there exist a faster algorithm for the said problem?
Most of the time will be spent searching, so to make the algorithm more time-efficient, the objective is to reduce the number of searches or to optimize the search itself.
Method 1: A table of search starting positions.
Create a vector of lists, one vector slot for each letter of the alphabet. This can be space-optimized later.
Each slot will contain a list of indices into the text.
Example text: forxxorfxdofr
Slot List
'f' 0 --> 7 --> 11
'o' 1 --> 5 --> 10
'r' 2 --> 6 --> 12
For each word, look up the letter in the vector to get a list of indexes into the text. For each index in the list, compare the text string position from the list item to the word.
So with the above table and the word "ofr", the first compare occurs at index 1, second compare at index 5 and last compare at index 10.
You could eliminate near-end of text indices where (index + word length > text length).
You can use the commutativity of multiplication, along with the uniqueness of prime factorization. This relies on my previous answer here
Create a mapping from each character to a prime number (as small as possible), e.g. a-->2, b-->3, c-->5, etc. This can be kept in a simple array.
Now, convert the given word into the product of the primes matching each of its characters. This result will be equal to the same product computed for any anagram of that word.
Now sweep over the text, and at any given step, maintain the product of the primes matching the last L characters (where L is the length of your word). So every time you advance you do
mul = mul * char2prime(text[i]) / char2prime(text[i-L])
Whenever this multiplication equals that of your word - increment the overall counter, and you're done
Note that this method would work well on short words, but the primes multiplication can overflow a 64b var pretty fast (by ~9-10 letters), so you'll have to use a large number math library to support longer words.
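A minimal sketch of this sliding product, under the assumptions that both strings contain only lowercase a..z and that the word has at most 9 letters (101^9 still fits in 64 bits). The names kPrimes, primeOf and countAnagramsByPrimes are made up for the example. The update divides before multiplying so the intermediate value never holds more than L prime factors.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// The first 26 primes, one per letter: a->2, b->3, ..., z->101.
const std::uint64_t kPrimes[26] = {
    2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
    43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101};

std::uint64_t primeOf(char c)
{
    assert(c >= 'a' && c <= 'z');  // assumption: lowercase input only
    return kPrimes[c - 'a'];
}

int countAnagramsByPrimes(const std::string& word, const std::string& text)
{
    const std::size_t L = word.size();
    assert(L >= 1 && L <= 9);  // keeps every product below 2^64
    if (text.size() < L) return 0;

    std::uint64_t target = 1;  // product for the word
    std::uint64_t window = 1;  // product for the current L characters of text
    for (std::size_t i = 0; i < L; ++i) {
        target *= primeOf(word[i]);
        window *= primeOf(text[i]);
    }

    int found = (window == target) ? 1 : 0;
    for (std::size_t i = L; i < text.size(); ++i) {
        // Drop the character leaving the window, then add the new one.
        // The division is exact because that prime is a factor of window.
        window = window / primeOf(text[i - L]) * primeOf(text[i]);
        if (window == target) ++found;
    }
    return found;
}
```

For the example in the question, countAnagramsByPrimes("for", "forxxorfxdofr") returns 3.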
This algorithm is reasonably efficient if the pattern to be anagrammed is so short that the best way to search it is to simply scan it. To allow longer patterns, the scans represented here by the 'for jj' and 'for mm' loops could be replaced by more sophisticated search techniques.
// sLine -- string to be searched
// sWord -- pattern to be anagrammed
// (in this pseudo-language, the index of the first character in a string is 0)
// iAnagrams -- count of anagrams found
iLineLim = length(sLine)-1
iWordLim = length(sWord)-1
// we need a 'deleted' marker char that will never appear in the input strings
chNil = chr(0)
iAnagrams = 0 // well we haven't found any yet have we
// examine every posn in sLine where an anagram could possibly start
for ii from 0 to iLineLim-iWordLim do {
chK = sLine[ii]
// does the char at this position in sLine also appear in sWord
for jj from 0 to iWordLim do {
if sWord[jj]=chK then {
// yes -- we have a candidate starting posn in sLine
// is there an anagram of sWord at this position in sLine
sCopy = sWord // make a temp copy that we will delete one char at a time
sCopy[jj] = chNil // delete the char we already found in sLine
// the rest of the anagram would have to be in the next iWordLim positions
cc = true // if sWord has only one char, there is nothing left to match
for kk from ii+1 to ii+iWordLim do {
chK = sLine[kk]
cc = false
for mm from 0 to iWordLim do { // look for anagram char
if sCopy[mm]=chK then { // found one
cc = true
sCopy[mm] = chNil // delete it from copy
break // out of 'for mm'
}
}
if not cc then break // out of 'for kk' -- no anagram char here
}
if cc then { iAnagrams = iAnagrams+1 }
break // out of 'for jj'
}
}
}
-Al.