As the post title suggests, I'm working to strengthen my grasp on C++ and character manipulation, this time through creating a Vigenere Cipher. For those unfamiliar with it, it's a fairly simple way to encrypt a text file.
The basic way it works is that there exists a string "key", and each character (in my case at least) is a lowercase alphabetical character. These are stored into an array and are used to "shift" the value of the file being encoded. A character of 'a' will shift the target by 0, while 'z' will shift it by 25. The "shift" is cyclical, meaning that if 'z' is shifted by 'b' (1) it should result in an 'a'.
My current method is found below:
//Assume cipher[] contains "[a][b][c][x ][y ][z ]" Cipher is a <string> object
//Assume ptr[] contains "[0][1][2][23][24][25]
#A whole bunch of includes
char c;
ifstream is;
ofstream os;
is.open(argv[3]) //"myinput.txt"
os.open(argv[4]) //"myoutput.txt"
int i = 0;
while( is.good() ) {
c = is.get();
if( is.good() ) { //did we just hit the EoF?
c = tolower( c - 0 ); //just make sure it's lowercase
c = c + ptr[ i % cipher.size() ] % 26;
if( c> 122 )
c = ( c % 123 ) + 97;
i++;
os.put( c );
}
}
My problem lies in my modulo operations, I believe. Maybe it's because I've spent so much time hashing this out, but I spent hours last night writing this, and then another hour lying in bed trying to wrap my mind around how to effectively create what I want:
grab char.
check char. //let char = 'z'
check the cipher. //let the cipher = 'y'
eval cipher shift //'y' shift value = 24
shift z 24 places (cyclically) //'z'==25, 25+24=49, 49%26=23. 23='x'
HERE IS THE ISSUE: How to do this with ACSII? ('a'=97, z='121')
Imagine that you want to "shuffle" the "ones" digits 0-9 between 20 and 29 by two steps, such that 20 becomes 22, and 29 becomes 21,. How would you do that?
Well, I would subtract 20 [our base number], and then shuffle the remaining digit, and then add 20 back in again.
newnum = num - 20;
newnum %= 10;
newnum += 20;
The same principle would apply for ascii - just that of course the base isn't 20.
Related
I am trying to convert strings to integers and sort them based on the integer value. These values should be unique to the string, no other string should be able to produce the same value. And if a string1 is bigger than string2, its integer value should be greater. Ex: since "orange" > "apple", "orange" should have a greater integer value. How can I do this?
I know there are an infinite number of possibilities between just 'a' and 'b' but I am not trying to fit every single possibility into a number. I am just trying to possibly sort, let say 1 million values, not an infinite amount.
I was able to get the values to be unique using the following:
long int order = 0;
for (auto letter : word)
order = order * 26 + letter - 'a' + 1;
return order;
but this obviously does not work since the value for "apple" will be greater than the value for "z".
This is not a homework assignment or a puzzle, this is something I thought of myself. Your help is appreciated, thank you!
You are almost there ... just a minor tweaks are needed:
you are multiplying by 26
however you have letters (a..z) and empty space so you should multiply by 27 instead !!!
Add zeropading
in order to make starting letter the most significant digit you should zeropad/align the strings to common length... if you are using 32bit integers then max size of string is:
floor(log27(2^32)) = 6
floor(32/log2(27)) = 6
Here small example:
int lexhash(char *s)
{
int i,h;
for (h=0,i=0;i<6;i++) // process string
{
if (s[i]==0) break;
h*=27;
h+=s[i]-'a'+1;
}
for (;i<6;i++) h*=27; // zeropad missing letters
return h;
}
returning these:
14348907 a
28697814 b
43046721 c
373071582 z
15470838 abc
358171551 xyz
23175774 apple
224829626 orange
ordered by hash:
14348907 a
15470838 abc
23175774 apple
28697814 b
43046721 c
224829626 orange
358171551 xyz
373071582 z
This will handle all lowercase a..z strings up to 6 characters length which is:
26^6 + 26^5 +26^4 + 26^3 + 26^2 + 26^1 = 321272406 possibilities
For more just use bigger bitwidth for the hash. Do not forget to use unsigned type if you use the highest bit of it too (not the case for 32bit)
You can use position of char:
std::string s("apple");
int result = 0;
for (size_t i = 0; i < s.size(); ++i)
result += (s[i] - 'a') * static_cast<int>(i + 1);
return result;
By the way, you are trying to get something very similar to hash function.
I have a assingment were I need to code and decode txt files, for example: hello how are you? has to be coded as hel2o how are you? and aaaaaaaaaajkle as a10jkle.
while ( ! invoer.eof ( ) ) {
if (kar >= '0' && kar <= '9') {
counter = kar-48;
while (counter > 1){
uitvoer.put(vorigeKar);
counter--;
}
}else if (kar == '/'){
kar = invoer.get();
uitvoer.put(kar);
}else{
uitvoer.put(kar);
}
vorigeKar = kar;
kar = invoer.get ( );
}
but the problem I have is if need to decode a12bhr, the answer is aaaaaaaaaaaabhr but I can't seem to get the 12 as number without problems, I also can't use any strings or array's.
c++
I believe that you are making following mistake: imagine you give a32, then you read the character a and save it as vorigeKar (previous character, I am , Flemish so I understand Dutch :-) ).
Then you read 3, you understand that it is a number and you repeat vorigeKar three times, which leads to aaa. Then you read 2 and repeat vorigeKar two times, leading to aaaaa (five times, five equals 3 + 2).
You need to learn how to keep on reading numeric characters, and translate them into complete numbers (like 32, or 12 in your case).
Like #Dominique said in his answers, You're doing it wrong.
Let me tell you my logic, you can try it.
Pesudo Code + Logic:
Store word as a char array or string, so that it'll be easy to print at last
Loop{
Read - a //check if it's number by subtracting from '0'
Read - 1 //check if number = true. Store it in int res[] = res*10 + 1
//Also store the previous index in an index array(ie) index of char 'a' if you encounter a number first time.
Read - 2 //check if number = true. Store it in res = res*10 + 2
Read - b , h and so on till "space" character
If you encounter another number, then store it's previous character's index in index array and then store the number in a res[] array.
Now using index array you can get the index of your repeating character to be printed and print it for it's corresponding times which we have stored in the result array.
This goes for the second, third...etc:- numbers in your word till the end of the word
}
First, even though you say you can't use strings, you still need to know the basic principle behind how to turn a stream of digit characters into an integer.
Assuming the number is positive, here is a simple function that turns a series of digits into a number:
#include <iostream>
#include <cctype>
int runningTotal(char ch, int lastNum)
{
return lastNum * 10 + (ch -'0');
}
int main()
{
// As a test
char s[] = "a123b23cd1/";
int totalNumber = 0;
for (size_t i = 0; s[i] != '/'; ++i)
{
char digit = s[i]; // This is the character "read from the file"
if ( isdigit( digit) )
totalNumber = runningTotal(digit, totalNumber);
else
{
if ( totalNumber > 0 )
std::cout << totalNumber << "\n";
totalNumber = 0;
}
}
std::cout << totalNumber;
}
Output:
123
23
1
So what was done? The character array is the "file". I then loop for each character, building up the number. The runningTotal is a function that builds the integer from each digit character encountered. When a non-digit is found, we output that number and start the total from 0 again.
The code does not save the letter to "multiply" -- I leave that to you as homework. But the code above illustrates how to take digits and create the number from them. For using a file, you would simply replace the for loop with the reading of each character from the file.
I recently learnt that using getchar_unlocked() is a faster way of reading input.
I searched on the internet and found the code snippet below:
But I am unable to understand it.
void fast_scanf(int &number)
{
register int ch = getchar_unlocked();
number= 0;
while (ch > 47 && ch < 58) {
number = number * 10 + ch - 48;
ch = getchar_unlocked();
}
}
int main(void)
{
int test_cases;fast_scanf(test_cases);
while (test_cases--)
{
int n;
fast_scanf(n);
int array[n];
for (int i = 0; i < n; i++)
fast_scanf(array[i]);
}
return 0;
}
So, this code takes input for an integer array of size n for a given number of test_cases . I didn't understand anything in the function fast_scanf, like why this line:
while (ch > 47 && ch < 58)
{ number = number * 10 + ch - 48;
why the register is used while declaring ch?
why the getchar_unlocked() is used twice in the function? and so on..
It would be great help if someone elaborates this for me. Thanks in advance!
Okay, since what you are asking needs to be explained clearly, I am writing it here... so I don't jumble it all up inside the comments...
The function: (Edited it a bit to make it look like more C++-standard)
void fast_scanf(int &number)
{
auto ch = getchar_unlocked();
number= 0;
while (ch >= '0' && ch <= '9')
{
number = number * 10 + ch - '0';
ch = getchar_unlocked();
}
}
Here, take up consideration by looking at the ASCII Table first, since you won't understand how the results are coming up if you don't...
1) Here, you have a character ch takes up the input character from the user using getchar_unlocked() (The auto keyword does that automatically for you and is only usable in C++, not C)...
2) You assign the variable number to zero so that the variable can be re-used, note that the variable is a reference so it changes inside your program as well...
3) while (ch >= '0' && ch <= '9')... As pointed out, checks whether the characters is within the numerical ASCII limit, similar to saying that the character has to be greater than or equal to 48 but less than or equal to 57...
4) Here, things are a little bit tricky, Variable number is multiplied with the product of itself and 10 and the real integer value of the character you stored)...
5) In the next line, ch is reassigned so that you don't have to stay in the loop forever, since ch will remain that number forever if the user doesn't type anything... remember a loop goes back to where it was declared after reaching the end, checks if the condition is true, continues if it is true else breaks)...
For example: 456764
Here, ch will first take 4 then the others follow so we go with 4 first...
1) Number will be assigned to zero. While loop checks if the given character is a number or not, if it is continues the loop else breaks it...
2) Multiplication of 0 with 10 will be zero... and adding it with difference 52 (that is '4') with 48 (that is '0') gives you 4 (the real numerical value, not the char '4')...
So the variable number now is 4...
And the same continues with the others as well... See...
number = number * 10 + '5' - '0'
number = 4 * 10 + 53 - 48
number = 40 + 5
number = 45... etc, etc. for other numbers...
Well currently I am re creating my own version of enigma as a little project but if you understand how the enigma machine works it has rotors which connect a character to a completely different character for example A might be connected to F or U may be connected to C and this is done three times. Currently I am getting the char for the rotor by using this function:
char getRotorOne(char i) {
if(i == 'a') {
return 'g';
}if(i == 'b') {
return 'A';
}if(i == 'c') {
return 'o';
}
The main problem with this is it takes a long time to write and it seems inefficient and I think there must be a better way. The other problem with this is on the original enigma machine there were only the 26 letters of the alphabet on this there are the 94 tapeable characters of ascii (32-126) is there any other simpler way that this can be done? If you think this question is unclear or you don't understand please tell me instead of just flagging my post, then you can help me improve my question.
Use tables! Conveniently, C string literals are arrays of characters. So you can do this:
// abc
const char* lower_mapping = "gAo";
// ABC
const char* upper_mapping = "xyz";
char getRotorOne(char i) {
if (i >= 'a' && i <= 'z') return lower_mapping[i - 'a'];
if (i >= 'A' && i <= 'Z') return upper_mapping[i - 'A'];
assert(false && "Unknown character cannot be mapped!");
}
Since chars are really just small integers, and ASCII guarantees contiguous ranges for a-z and A-Z (and 0-9) you can subtract from a given character the first one in its range (so, 'a' or 'A') to get an index into that range. That index can then be used to look up the corresponding character via a table, which in this case is just a simple hardcoded string literal.
This is an improvement on Cameron's answer. You should use a simple char array for each rotor, but as you said you want to process ASCII characters in the range 32-126, you should build each mapping as an array of 95 characters:
char rotor1[95] ="aXc;-0..."; // the 95 non control ascii characters in arbitrary order
Then you write your rotor function that way:
char getRotorOne(char i) {
if ((i < 32) || (i > 126)) return i; // do not change non processed characters
return rotor1[i - 32]; // i - 32 is in range 0 - 94: rotor1[i - 32] is defined
}
To start off, I'm four weeks into a C++ course and I don't even know loops yet, so please speak baby talk?
Okay, so I'm supposed to read a twelve character string (plus NULL makes thirteen) from a file, and then shift the letters backwards three, and then print my results to screen and file. I'm okay with everything except the shifting letters. I don't want to write miles of code to take each character individually, subtract three, and re-assemble the string, but I'm not sure how to work with the whole string at once. Can someone recommend a really simple method of doing this?
If you are dealing with simple letters (A to Z or a to z), then you can assume that the internals codes are linear.
Letters are coded as numbers, between 0 and 127. A is coded as 65, B as 66, C as 67, Z as 90.
In order to shift letters, you just have to change the internal letter code as if it were a number, so basically just substracting 3 from the character. Beware of edge cases though, because substracting 3 to 'A' will give you '>' (code 62) and not 'X' (code 88). You can deal with them using "if" statements or the modulo operator ("%").
Here is an ASCII characters table to help you
Once you've loaded your string in, you can use the modulous operator to rotate while keeping within the confines of A-Z space.
I'd keep track of whether the letter was a capital to start with:
bool isCaps = ( letter >= 'A' ) && ( letter <= 'Z' );
if( isCaps )
letter -= 'A'-'a';
and then just do the cipher shift like this:
int shift = -3;
letter -= 'a'; // to make it a number from 0-25
letter = ( letter + shift + 26 ) % 26;
// add 26 in case the shift is negative
letter += 'a'; // back to ascii code
finally finish off with
if( isCaps )
letter += 'A'-'a';
so, putting all this together we get:
char *mystring; // ciphertext
int shift = -3; // ciphershift
for( char *letter = mystring; letter; ++letter )
{
bool isCaps = ( *letter >= 'A' ) && ( *letter <= 'Z' );
if( isCaps )
*letter -= 'A'-'a';
letter -= 'a';
letter = ( letter + shift + 26 ) % 26;
letter += 'a';
if( isCaps )
letter += 'A'-'a';
}
You're going to have to learn loops. They will allow you to repeat some code over the characters of a string, which is exactly what you need here. You'll keep an integer variable that will be your index into the string, and inside the loop do your letter-shifting on the character at that index and increment the index variable by one until you reach NULL.
Edit: If you're not expected to know about loops yet in your course, maybe they want you to do this:
string[0] -= 3; // this is short for "string[0] = string[0] - 3;"
string[1] -= 3;
string[2] -= 3;
...
It will only result in 12 lines of code rather than miles. You don't have to "reassemble" the string this way, you can just edit each character in-place. Then I bet after making you do that, they'll show you the fast way of doing it using loops.
Iterate over the characters with a for loop. And do what you want with the char*. Then put the new char back.
for(int i=0; i<12; i++){
string[i] = string[i] - 3;
}
Where string is your character array (string). There is a bit more involved if you want to make it periodic (I.E. have A wrap round to Z, but the above code should help you get started)
I'm a little unclear what you mean by "shift the letters backwards 3"?
Does that mean D ==> A?
If so, here's a simple loop.
(I didn't do reading from the file, or writing to the file... Thats your part)
#include <string.h>
int main(void)
{
char input[13] = "ABCDEFGHIJKL";
int i;
int len = strlen(input);
for(i=0; i<len; ++i)
{
input[i] = input[i]-3;
}
printf("%s", input); // OUTPUT is: ">?#ABCDEFGHI"
}