How to set my decode function to take in the encoded string? - c++

I'm working on my decode function and I've hit a wall. I don't know if I should pass in the encode function or create a class. My encode function compresses a string; I need the decode function to take that encoded string and expand it.
I've been told that it is essentially the same as writing the encode function. I'm not sure where to go from here.
#include <iostream>
#include <string>
using namespace std;

string encode(string str)
{
    string encoding = "";
    int count;
    for (int i = 0; str[i]; i++)
    {
        count = 1;
        while (str[i] == str[i+1])
        {
            count++, i++;
        }
        encoding += to_string(count) + str[i];
    }
    return encoding;
}
// I'm trying to decode the encoded string:
// take in a string, count how many of the same characters there are, and print.
// e.g. a3b4c1... would be decoded as aaabbbbc
string decode(string in)
{
    string decoding = "";
    char s;
    int count;
    for (int i = 0; i<in; i++)
    {
        count = 1;
        if (in[i] == 'A')
            count++, i++;
    }
}

int main()
{
    string str = "ABBCC";
    cout << encode(str);
    //cout << decode(str);
}
// My encode function executes as needed: 1A2B2C

Your encoding is not valid, because the encoding of "1a" produces "111a", which is also the encoding of 111 consecutive 'a'. You need to add a separator between the count and the character.
In your decode function you only handle the special case of 'A', and you never extract the count that the encoder wrote.
Note also that in
for (int i = 0; i<in; i++)
{
    count = 1;
    if (in[i] == 'A')
        count++, i++;
}
you always reset count to 1.
You need to first extract the count (subject to the ambiguity I pointed out at the beginning of this answer), then duplicate the letter count times.
It is useless to write string encoding = ""; because the default constructor of std::string already makes it empty; string encoding; is enough.
You need to decode an encoded string; that is not what you do in your main, where you try to decode the original (unencoded) string.
A corrected version can be:
#include <iostream>
#include <string>
#include <sstream>
using namespace std;

string encode(string str)
{
    stringstream encoding;
    int count;
    for (int i = 0; str[i]; i++)
    {
        count = 1;
        while (str[i] == str[i+1])
        {
            count++, i++;
        }
        encoding << count << ' ' << str[i];
    }
    return encoding.str();
}

string decode(string in)
{
    stringstream is(in);
    string decoding;
    int n;
    char c;
    while (is >> n >> c)
    {
        while (n--)
            decoding += c;
    }
    return decoding;
}

int main()
{
    cout << encode("ABBCC2a") << endl;
    cout << decode(encode("ABBCC2a")) << endl;
    return 0;
}
Compilation and execution:
pi@raspberrypi:/tmp $ g++ -pedantic -Wall -Wextra e.cc
pi@raspberrypi:/tmp $ ./a.out
1 A2 B2 C1 21 a
ABBCC2a

Run-length-encoding – but in a very strange way!
encoding += to_string(count) + str[i];
Let's encode string "sssssssssss"; it will result in a string with array representation of
{ '1', '1', 's', 0 } // string "11s"
(I chose this representation deliberately, you'll see later...)
The problem is that you wouldn't be able to encode strings containing digits: "1s" will result in
{ '1', '1', '1', 's', 0 } // string "111s"
but how would you want to distinguish if we need to decode back to "1s" or into a string solely containing 111 s characters?
Try it differently: a character actually is nothing more than a number as well, e.g. the letter s is represented by the numerical value 115 (in ASCII and compatible encodings, at least), and the digit 7 (as a character!) by the numerical value 55. So you can simply append the count value as a raw character:
encoding += static_cast<char>(count); // the count, stored as a raw byte
encoding += str[i];                   // followed by the character itself
There are some corner cases: an unsigned char cannot hold numbers greater than 255, so a string with more consecutive equal characters would have to be encoded e.g. as
{ 255, 's', 7, 's', 0 } // 262 times letter s
Note the representation; 255 and 7 aren't even printable characters! Now let's assume we encoded a string with 115 times the letter s:
{ 115, 's', 0 } // guess, as string, this would be "ss"...
To catch these cases, you simply check your counter explicitly for reaching the maximum value.
Now decoding gets much simpler:
size_t i = 0;
while(i < encoded.size())
{
    unsigned char n = encoded[i];
    ++i;
    while(n--)
        decoded += encoded[i];
    ++i;
}
Totally simple: first byte always as number, second one as character...
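For completeness, a matching encoder for this byte-count format could look like the following sketch (the cap at 255 handles the corner case mentioned above; the function name is just illustrative):
#include <string>

std::string encode_runs(const std::string& str)
{
    std::string encoded;
    std::size_t i = 0;
    while (i < str.size())
    {
        unsigned char count = 1;
        // extend the run, but never beyond 255 so the count still fits in one byte
        while (i + count < str.size() && str[i + count] == str[i] && count < 255)
            ++count;
        encoded += static_cast<char>(count); // first byte: the count
        encoded += str[i];                   // second byte: the character
        i += count;
    }
    return encoded;
}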
If you insist on numbers being encoded as string (and encode only strings not containing digits), you could use a std::istringstream:
std::istringstream s(encoded);
unsigned int n;
char c;
while(s >> n >> c)
{
    while(n--)
        decoded += c;
}
OK, it is not symmetric to your encoding function. You could adapt the latter to be so, though:
std::ostringstream s;
for(;;) // ...
{
    unsigned int count = 1;
    // ...
    s << count << str[i];
}

Related

Run-length decompression using C++

I have a text file with a string which I encoded.
Let's say it is: aaahhhhiii kkkjjhh ikl wwwwwweeeett
Here is the code for encoding, which works perfectly fine:
void Encode(std::string &inputstring, std::string &outputstring)
{
    for (int i = 0; i < inputstring.length(); i++) {
        int count = 1;
        while (inputstring[i] == inputstring[i+1]) {
            count++;
            i++;
        }
        if (count <= 1) {
            outputstring += inputstring[i];
        } else {
            outputstring += std::to_string(count);
            outputstring += inputstring[i];
        }
    }
}
Output is as expected: 3a4h3i 3k2j2h ikl 6w4e2t
Now, I'd like to decompress the output - back to the original.
And I have been struggling with this for a couple of days now.
My idea so far:
void Decompress(std::string &compressed, std::string &original)
{
    char currentChar = 0;
    auto n = compressed.length();
    for (int i = 0; i < n; i++) {
        currentChar = compressed[i++];
        if (compressed[i] <= 1) {
            original += compressed[i];
        } else if (isalpha(currentChar)) {
            //
        } else {
            //
            int number = isnumber(currentChar).....
            original += number;
        }
    }
}
I know my Decompress function seems a bit messy, but I am pretty lost with this one.
Sorry for that.
Maybe there is someone out there on Stack Overflow who would like to help a lost beginner's soul.
Thanks for any help, I appreciate it.
Assuming input strings cannot contain digits (this cannot be covered by your encoding, as e.g. both the strings "3a" and "aaa" would result in the encoded string "3a" – how would you ever want to decompress that again?), you can decompress as follows:
unsigned int num = 0;
for (auto c : compressed)
{
    if (std::isdigit(static_cast<unsigned char>(c)))
    {
        num = num * 10 + c - '0';
    }
    else
    {
        num += num == 0; // assume you haven't read a digit yet!
        while (num--)
        {
            original += c;
        }
        num = 0; // reset: the unsigned counter wrapped around when the loop above finished
    }
}
Untested code, though...
Characters in a string actually are only numerical values, though. You can consider char (or signed char, unsigned char) as ordinary 8-bit integers as well. And you can store a numerical value in such a byte, too. Usually, you do run length encoding exactly that way: Count up to 255 equal characters, store the count in a single byte and the character in another byte. One single "a" would then be encoded as 0x01 0x61 (the latter being the ASCII value of a), "aa" would get 0x02 0x61, and so on. If you have to store more than 255 equal characters you store two pairs: 0xff 0x61, 0x07 0x61 for a string containing 262 times the character a... Decoding then gets trivial: you read characters pairwise, first byte you interpret as number, second one as character – rest being trivial. And you nicely cover digits that way as well.
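To illustrate that byte-pair scheme, here is a minimal, hedged sketch (the function names are mine, not from your code) that encodes each run as one raw count byte followed by the character and decodes by reading the bytes pairwise:
#include <iostream>
#include <string>

// Encode: one raw count byte (1..255) followed by the repeated character.
std::string rle_encode(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); )
    {
        unsigned char count = 1;
        while (i + count < in.size() && in[i + count] == in[i] && count < 255)
            ++count;                         // runs longer than 255 become several pairs
        out += static_cast<char>(count);
        out += in[i];
        i += count;
    }
    return out;
}

// Decode: read the bytes pairwise, the first one as the count, the second one as the character.
std::string rle_decode(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2)
        out.append(static_cast<unsigned char>(in[i]), in[i + 1]);
    return out;
}

int main()
{
    std::string text = "aaahhhhiii kkkjjhh ikl wwwwwweeeett";
    std::cout << (rle_decode(rle_encode(text)) == text) << '\n'; // prints 1 on a successful round trip
}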
#include "string"
#include "iostream"
void Encode(std::string& inputstring, std::string& outputstring)
{
for (unsigned int i = 0; i < inputstring.length(); i++) {
int count = 1;
while (inputstring[i] == inputstring[i + 1]) {
count++;
i++;
}
if (count <= 1) {
outputstring += inputstring[i];
}
else {
outputstring += std::to_string(count);
outputstring += inputstring[i];
}
}
}
bool alpha_or_space(const char c)
{
return isalpha(c) || c == ' ';
}
void Decompress(std::string& compressed, std::string& original)
{
size_t i = 0;
size_t repeat;
while (i < compressed.length())
{
// normal alpha charachers
while (alpha_or_space(compressed[i]))
original.push_back(compressed[i++]);
// repeat number
repeat = 0;
while (isdigit(compressed[i]))
repeat = 10 * repeat + (compressed[i++] - '0');
// unroll releat charachters
auto char_to_unroll = compressed[i++];
while (repeat--)
original.push_back(char_to_unroll);
}
}
int main()
{
std::string deco, outp, inp = "aaahhhhiii kkkjjhh ikl wwwwwweeeett";
Encode(inp, outp);
Decompress(outp, deco);
std::cout << inp << std::endl << outp << std::endl<< deco;
return 0;
}
The decompression can't possibly work in an unambiguous way because you didn't define a sentinel character; i.e. given the compressed stream it's impossible to determine whether a number is an original single number or it represents the repeat RLE command. I would suggest using '0' as the sentinel char. While encoding, if you see '0' you just output 010. Any other char X will translate to 0NX where N is the repeat byte counter. If you go over 255, just output a new RLE repeat command
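One possible reading of that sentinel scheme, as a hedged sketch (the exact escaping rules are my assumption, not taken from the original code): '0' starts an RLE command consisting of the sentinel, a raw count byte, and the character; everything else is copied literally, and a literal '0' is escaped as the command '0', 0x01, '0'.
#include <iostream>
#include <string>

std::string sentinel_encode(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); )
    {
        unsigned char run = 1;
        while (i + run < in.size() && in[i + run] == in[i] && run < 255)
            ++run;
        if (run > 3 || in[i] == '0')       // long run, or a '0' that must be escaped
        {
            out += '0';                    // sentinel
            out += static_cast<char>(run); // raw byte count, not an ASCII digit
            out += in[i];
        }
        else
        {
            out.append(run, in[i]);        // short runs stay literal
        }
        i += run;
    }
    return out;
}

std::string sentinel_decode(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); )
    {
        if (in[i] == '0' && i + 2 < in.size())
        {
            out.append(static_cast<unsigned char>(in[i + 1]), in[i + 2]);
            i += 3;
        }
        else
        {
            out += in[i++];
        }
    }
    return out;
}

int main()
{
    std::string text = "3a4h3i 000 6w";
    std::cout << (sentinel_decode(sentinel_encode(text)) == text) << '\n'; // prints 1
}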

How to convert integers represented as string to ASCII symbols?

I have an input string that can look like this: "126022034056098012". It is the result of concatenating the ASCII codes of symbols that I've read from some file. The codes are 126 22 34 56 98 12, for example. The problem is how to decode this string back into characters. Note: the string must not contain any delimiters other than digits (\, |, and so on). What should I do next?
I figured out a way that uses a map of ASCII symbols: key -> string with the numeric representation of the ASCII symbol, value -> the ASCII symbol. In the loop, I accumulate the incoming digits in a string until the string matches some key in the map. When matched, I convert the resulting code into a character. I continue until I run out of input data. This method works fine with strings and text files but doesn't work with binary files.
The function that makes a string of characters from a string of ASCII codes:
string Utils::from_number_to_ascii(string number, int size) {
    Utils ut;
    while (number.size() % 3) {
        number = "0" + number;
    }
    string out;
    for (int i = 0; i < size;) {
        string st;
        auto it = ut.triple_dict.end();
        while (it == ut.triple_dict.end() && i < size) {
            st += number[i++];
            it = ut.triple_dict.find(st);
        }
        out += it->second;
        st = "";
    }
    return out;
}
filling the map:
Utils::Utils() {
    for (int i = 0; i <= 255; i++) {
        string s = to_string(static_cast<int>(i));
        if (s.size() == 1) {
            s = "00" + s;
        }
        if (s.size() == 2) {
            s = "0" + s;
        }
        triple_dict.insert(make_pair(s, static_cast<unsigned char>(i)));
    }
}
It's not hard to see that I fill the container with three-digit keys: if the ASCII code of a symbol is a two-digit number I prepend "0", and if it is a one-digit number I prepend "00", so that every code is three digits long. I do this so that the symbols can be decoded unambiguously.
If each ascii code is represented by exactly 3 digits, we can do this pretty easily with a loop:
std::string toAscii(char const* digits, size_t size) {
    std::string output(size / 3, '\0');
    for (char& c : output) {
        char d0 = *digits++; // Get 3 digits
        char d1 = *digits++;
        char d2 = *digits++;
        int ascii_value = (d0 - '0') * 100 + (d1 - '0') * 10 + (d2 - '0');
        c = (char)ascii_value;
    }
    return output;
}
Example usage
I have a c-string with the example input, as well as a string with the expected output. This program verifies that they're equal.
int main() {
    auto&& input = "126022034056098012";
    std::string expected_output = {char(126), char(22), char(34), char(56), char(98), char(12)};
    std::cout << (toAscii(input, sizeof(input)) == expected_output); // Prints 1 (true)
}
Does fstream.write() add '\0' to the end of the file?
No. If your string contains the 0 character, it'll add it, but not otherwise. We can test this for ourselves with some pretty short example code.
#include <fstream>
#include <iostream>
#include <string>

int main()
{
    {
        std::ofstream file("test.txt");
        std::string message = "Hello!";
        file.write(message.data(), message.length());
        // file gets closed automatically
    }
    {
        std::ifstream file("test.txt");
        while (file)
        {
            std::cout << file.get() << '\n';
        }
        // file gets closed automatically
    }
}
When I compile and run this code, it outputs the following. Each value corresponds to the value of the corresponding character in "Hello!", except for the last one. The -1 indicates that you've reached the end of the file, but if you were using a method like file.read it wouldn't show up. The \0 doesn't appear anywhere in the file.
72
101
108
108
111
33
-1

String subscript out of range (Visual Studio 2013)

I understand that questions with this title/problem have been asked numerous times before (here, here and many others). Here is my code, followed by everything I have done to try to remove the error:
CaesarCipher.h
#ifndef CAESARCIPHER_H
#define CAESARCIPHER_H
#include <ctime>
#include <string>
using namespace std;

// Write your class CaesarCipher here.
class CaesarCipher
{
public:
    CaesarCipher();
    string Encode(string plainString);
    string Decode(string encryptedString);
private:
    int key1, key2;
    char Encode(char normalChar) const;
    char Decode(char encodedChar) const;
};
#endif
CaesarCipher.cpp
#include "stdafx.h"
#include "CaesarCipher.h"
using namespace std;
// Implement the member functions of class CaesarCipher.
CaesarCipher::CaesarCipher()
{
//Random initialization of integer key1
//srand(time(0));
srand((unsigned int)time(0));
int value1 = rand() % 10;
int sign1 = rand() % 2;
sign1 = sign1 == 0 ? -1 : 1;
int key1 = value1 * sign1;
//Random initialization of integer key2
//srand(time(0));
srand((unsigned int)time(0));
int value2 = rand() % 10;
int sign2 = rand() % 2;
sign2 = sign2 == 0 ? -1 : 1;
int key2 = value2 * sign2;
}
char CaesarCipher::Encode(char normalChar) const
{
int result=0;
int charValue = normalChar; //get the ASCII decimal value of character
if (charValue == 32) // if characeter is a space, we leave it
{
result = 32;
}
else
{
if (key1 > 0)
{
result = char(int(charValue + key1 - 97) % 26 + 97); // find the integer value of char after rotating it with key1(positive)
}
if (key1 < 0)
{
result = char(int(charValue -key1 - 97) % 26 + 97); // find the integer value of char after rotating it with key1(negative)
}
if (key2 > 0)
{
result += char(int(charValue + key2 - 97) % 26 + 97); // find the updated integer value of char after rotating it with key2(positive)
}
if (key2 < 0)
{
result += char(int(charValue - key2 - 97) % 26 + 97); // find the updated integer value of char after rotating it with key2(negative)
}
}
return result; // returning the integer value which will be typecasted into a char(encoded char)
}
char CaesarCipher::Decode(char encodedChar) const
{
int result = 0;
int charValue = encodedChar; //get the ASCII decimal value of encoded character
if (charValue == 32) // if characeter is a space, we leave it unchanged
{
result = 32;
}
else
{
if (key1 > 0)
{
result = char(int(charValue - key1 - 97) % 26 + 97); // find the integer value of encoded char after rotating it with key1(positive) in opposite direction
}
if (key1 < 0)
{
result = char(int(charValue + key1 - 97) % 26 + 97); // find the integer value of encoded char after rotating it with key1(negative) in opposite direction
}
if (key2 > 0)
{
result += char(int(charValue - key2 - 97) % 26 + 97); // find the updated integer value of encoded char after rotating it with key2(positive) in opposite direction
}
if (key2 < 0)
{
result += char(int(charValue + key2 - 97) % 26 + 97); // find the updated integer value of encoded char after rotating it with key2(negative) in opposite direction
}
}
return result; // returning the integer value which will be typecasted into a char(decrypted char)
}
string CaesarCipher::Encode(string plainString)
{
int length = plainString.length(); //gets the length of the
input string
string encodedString; // variable to hold the final encrypted string
for (int i = 0; i < length; i++)
{
encodedString[i] = Encode(plainString[i]); // encrypting the string one character at a time
}
return encodedString; // return the final encoded string
}
string CaesarCipher::Decode(string encryptedString)
{
int length = encryptedString.length(); //gets the length of the input encrypted string
string decodedString; // variable to hold the final decrypted string
for (int i = 0; i < length; i++)
{
decodedString[i] = Decode(encryptedString[i]); // decrypting the string one character at a time
}
return decodedString; // return the final decoded string
}
I am using two keys to cipher the text (key1 followed by key2), if it helps in any way.
Main.cpp
#include "stdafx.h"
#include "CaesarCipher.h"
#include <fstream>
#include <iostream>
int main() {
// File streams
ifstream fin("input.txt");
ofstream fout("output.txt");
if (!fin.good()) {
cout << "Error: file \"input.txt\" does not exist!" << endl;
return -1;
}
string original[20], encrypted[20], decrypted[20];
int i = 0; // will store the number of lines in the input file
CaesarCipher cipher; // an object of CaesarCipher class
// Read the sentences from the input file and save to original[20].
// Hint: use getline() function.
while (!fin.eof())
{
getline(fin, original[i]); // Reading a line from input.txt file
encrypted[i] = cipher.Encode(original[i]); // Encrypt the sentences and save to encrypted[20]
decrypted[i] = cipher.Decode(encrypted[i]); // Decrypt the sentences and save to decrypted[20]
i++;
}
//first output all the encrypted lines
for (int j = 0; j < i; j++)
{
fout << "Encrypted sentences:\n";
fout << encrypted[j]<<"\n";
}
//now output all the decrypted lines
for (int j = 0; j < i; j++)
{
fout << "Decrypted sentences:\n";
fout << decrypted[j] << "\n";
}
// Close the files and end the program.
fin.close();
fout.close();
cout << "done!";
return 0;
}
The error which I am getting is Expression: string subscript out of range. Now I understand that I am trying to iterate beyond the limits of the string (probably somewhere in CaesarCipher.cpp, in the Encode or Decode function).
I have tried to change the limits on i without any effect.
I have tried to use size() instead of length() (out of desperation, despite knowing they do the same thing).
I would really appreciate it if you could pinpoint anything in particular which might be causing this error, and I will try to change it myself and see the results.
And if you can also tell me how to avoid such errors in the future, that will also be of great value to me.
CaesarCipher::Encode() is not allocating any memory for the character data of encodedString, so the loop has nothing valid to access with encodedString[i]. To fix that, do one of the following:
Use string encodedString = plainString; to make a copy of the input string, then the loop can manipulate the copied data:
string CaesarCipher::Encode(string plainString) {
    int length = plainString.length(); //gets the length of the input string
    string encodedString = plainString; // variable to hold the final encrypted string
    for (int i = 0; i < length; i++) {
        encodedString[i] = Encode(encodedString[i]); // encrypting the string one character at a time
    }
    return encodedString; // return the final encoded string
}
Use encodedString.resize(length) to pre-allocate the output string before entering the loop:
string CaesarCipher::Encode(string plainString) {
    int length = plainString.length(); //gets the length of the input string
    string encodedString; // variable to hold the final encrypted string
    encodedString.resize(length); // allocate memory for the final encoded string
    for (int i = 0; i < length; i++) {
        encodedString[i] = Encode(plainString[i]); // encrypting the string one character at a time
    }
    return encodedString; // return the final encoded string
}
Use encodedString += plainString[i]; to append characters to the output string and let it grow as needed:
string CaesarCipher::Encode(string plainString) {
    int length = plainString.length(); //gets the length of the input string
    string encodedString; // variable to hold the final encrypted string
    for (int i = 0; i < length; i++) {
        encodedString += Encode(plainString[i]); // encrypting the string one character at a time
    }
    return encodedString; // return the final encoded string
}
The same problem exists in CaesarCipher::Decode() with the decodedString variable.
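For completeness, a sketch of the same fix applied to Decode(), using the append approach from the third option:
string CaesarCipher::Decode(string encryptedString)
{
    int length = encryptedString.length(); //gets the length of the input encrypted string
    string decodedString; // variable to hold the final decrypted string
    for (int i = 0; i < length; i++)
    {
        decodedString += Decode(encryptedString[i]); // decrypting the string one character at a time
    }
    return decodedString; // return the final decoded string
}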
Also, main() has a buffer overflow if input.txt has more than 20 lines in it. Consider changing the code to use std::vector instead of fixed arrays.
And while (!fin.eof()) is wrong to use. Use while (getline(...)) instead:
// Read the sentences from the input file and save to original[20].
// Hint: use getline() function.
string line;
while (getline(fin, line)) { // Reading a line from input.txt file
    original[i] = line;
    encrypted[i] = cipher.Encode(original[i]); // Encrypt the sentences and save to encrypted[20]
    decrypted[i] = cipher.Decode(encrypted[i]); // Decrypt the sentences and save to decrypted[20]
    i++;
}
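Putting both of those suggestions together, a sketch of main() using std::vector instead of the fixed-size arrays could look like this (assuming the CaesarCipher class from the question):
#include "CaesarCipher.h"
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::ifstream fin("input.txt");
    std::ofstream fout("output.txt");
    if (!fin.good())
    {
        std::cout << "Error: file \"input.txt\" does not exist!" << std::endl;
        return -1;
    }

    CaesarCipher cipher;
    std::vector<std::string> encrypted, decrypted;

    std::string line;
    while (std::getline(fin, line)) // read until end of file, no fixed 20-line limit
    {
        encrypted.push_back(cipher.Encode(line));
        decrypted.push_back(cipher.Decode(encrypted.back()));
    }

    for (const std::string& e : encrypted)
        fout << "Encrypted sentences:\n" << e << "\n";
    for (const std::string& d : decrypted)
        fout << "Decrypted sentences:\n" << d << "\n";

    std::cout << "done!";
    return 0;
}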

Loop not processing the last character of a string

Basically, the (Vigenère) decryption works perfectly except for not including the final letter in the decryption. For instance, the decryption for m_text yields 48 letters instead of 49. I even tried to manipulate the loop, but it doesn't work out well since I get an out-of-range exception with .at(). Any help would be appreciated!
#include <string>
#include <iostream>
using namespace std;

int main()
{
    string ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    string m_text = "ZOWDLTRTNENMGONMPAPXVUXADRIXUBJMWEWYDSYXUSYKRNLXU";
    int length = m_text.length();
    string key = "DA";
    string plainText = "";
    int shift = 0;
    int shift2 = 0;
    //Loop that decrypts
    for (int k = 0; k < length-1; k+=2)
    {
        //Key 1 shift
        shift = m_text.at(k) - key.at(0);
        //Key 2 shift
        shift2 = m_text.at(k+1) - key.at(1);
        if (shift >= 0)
        {
            plainText += ALPHABET.at(shift);
        }
        else
        {
            shift += 91;
            plainText += (char)shift;
        }
        if (shift2 >= 0)
        {
            plainText += ALPHABET.at(shift2);
        }
        else
        {
            shift2 += 91;
            plainText += (char)shift2;
        }
    }
    cout << plainText << endl;
}
By the looks of things, you are decoding two characters at a time. So when you have 49 characters in your string, there is one left over (which doesn't get processed). If you make m_text 48 characters long, you will notice you get the correct result.
It might be easier to replicate your key to match the length of the message, then do character-by-character decoding.
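That could look something like the following sketch (a minimal example of the key-replication idea, assuming uppercase-only text as in your question):
#include <iostream>
#include <string>

int main()
{
    std::string m_text = "ZOWDLTRTNENMGONMPAPXVUXADRIXUBJMWEWYDSYXUSYKRNLXU";
    std::string key = "DA";

    // Replicate the key until it is as long as the message.
    std::string fullKey;
    for (std::size_t i = 0; i < m_text.length(); ++i)
        fullKey += key[i % key.length()];

    // Decode character by character: shift each letter back by its key letter.
    std::string plainText;
    for (std::size_t i = 0; i < m_text.length(); ++i)
    {
        int shift = (m_text[i] - fullKey[i] + 26) % 26;
        plainText += static_cast<char>('A' + shift);
    }

    std::cout << plainText << std::endl; // processes all 49 characters, including the last one
}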

A cleaner way to convert a string to int after checking for hex prefix?

This little exercise is meant to get a string from the user that could be decimal, hexadecimal, or octal. First, I need to identify which kind of number the string is. Second, I need to convert that number to an int and display the number in its proper format, e.g.:
cout <<(dec,hex,oct, etc)<< number;
Here's what I came up with. I'd like a simpler, cleaner way to write this.
string number = "";
cin >> number;
string prefix = "dec";
char zero = '0';
char hex_prefix = 'x';
string temp = "";
int value = 0;
for(int i =0; i<number.size();++i)
{
if(number[0] == zero)//must be octal or hex
{
if (number[0] == zero && number[1] == hex_prefix ) //is hex
{
prefix = "hex";
for(int i = 0; i < (number.size() - 2); ++i)
{
temp[i] = number[i+2];
}
value = atoi(temp.c_str());
}
//... code continues to deal with octal and decimal
You are checking number[0] twice; that's the first and most obvious problem.
The inner if already checks both number[0] and number[1]; I don't see the point of the outer one.
The outermost loop is also hard to understand: do you expect non-hex data before the number, or what? Your question could be clearer about how the expected input string looks.
I think the cleanest approach would be to avoid hand-rolling this and to push it into existing (library) code that can parse integers in any base. In C I would recommend strtoul(); you can of course use that in C++ too.
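For example, a hedged sketch using strtoul() with base 0, which detects the base from the prefix ("0x"/"0X" for hex, a leading "0" for octal, otherwise decimal):
#include <cstdlib>
#include <iostream>
#include <string>

int main()
{
    std::string number;
    std::cin >> number;

    char* end = nullptr;
    unsigned long value = std::strtoul(number.c_str(), &end, 0); // base 0 = detect from prefix

    if (end == number.c_str() || *end != '\0')
        std::cout << "not a valid number\n";
    else
        std::cout << value << '\n';
}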
You have two nested loops using the same loop variable i; this could cause a conflict in your code. I suggest you look at the isdigit and islower functions in the C++ library and take advantage of them to accomplish your task: isdigit & islower.
Good luck!
This prints the number after deleting the hex prefix, and otherwise prints 0:
#include <iostream>
#include <string>
#include <cstdlib>
using namespace std;

int main() {
    string number = "";
    cin >> number;
    string prefix = "dec";
    char zero = '0';
    char hex_prefix = 'x';
    string temp = "";
    int value = 0;
    if (number.size() >= 2 && number[0] == zero && number[1] == hex_prefix) // is hex
    {
        prefix = "hex";
        for (size_t i = 0; i < (number.size() - 2); ++i)
        {
            temp += number[i+2]; // append, so temp actually grows (temp[i] on an empty string is out of range)
        }
        value = atoi(temp.c_str());
    }
    cout << value;
    return 0;
}
This partial solution that I found is as clean as possible, but it doesn't report the format of the integer:
int string_to_int(std::string str)
{
    std::istringstream stream(str); // the stream must actually be constructed from the string
    stream.unsetf(std::ios_base::dec);
    int result;
    if (stream >> result)
        return result;
    else
        throw std::runtime_error("blah");
}
...
cout << string_to_int("55") << '\n'; // prints 55
cout << string_to_int("0x37") << '\n'; // prints 55
The point here is stream.unsetf(std::ios_base::dec) - it unsets the "decimal" flag that is set by default. This format flag tells iostreams to expect a decimal integer. If it is not set, iostreams expect the integer in any base.