Huffman decoding is going nuts - C++

The problem is that I can decode a data stream the first time, but then the program goes into an infinite loop and displays the same value over and over again. I'm using Borland C++. Decoding works by first saving the codes for a to z in an array, then reading the input data stream, cutting pieces out of it with strncpy, and comparing each piece with the contents of the first array; if a match is found, the corresponding ASCII character is printed.
Code:
#include<conio.h>
#include<iostream.h>
#include<string.h>
#include <stdio.h>
#include <stdlib.h>
using namespace std;
int codes[512];
char cut_stream[100];
char input_stream[100];
char decoded_stream[256];
int test,count,cut_start,cut_end,length_of_stream,final,temp_loop_s,temp_loop_e,temp_num,comp_num,send;
void select_num(int select)
{
int output;
output = select + 64;
cout << char(output);
}
void decode()
{
cout<<"\nEnter the data stream ";//<<endl;
cin >> input_stream;
length_of_stream = strlen(input_stream);
//cout<< length_of_stream ; //only for debugging
/********************************************
the code starts here...
********************************************/
count = length_of_stream;
//count -= count;
for(int ini =0; ini<=count ; ini++)
{
for( final=1;final<=count;final++)
{
strncpy(cut_stream, &input_stream[ini], final);
//cut_start = cut_start + 1;
/*********************************************
compare
*********************************************/
temp_num = atoi(cut_stream);
for(int z= 1;z<=26;z++)
{
comp_num = codes[z];
if(comp_num == temp_num)
{
send =z;
select_num(send);
test =1;
comp_num =0;
break;
}
}
if( test ==1)
{
test =0;
ini = final-1; // the increment will be given on the next for loop so it is reduced here
//final =0;
//cout<< "im in test";
break;
}
}
}
cout<< "end";
while(1);
}
//cout<< decoded_stream;
//while(1);
void main()
{
cut_start =0;
cut_end = 1;
cout << "Huffman decoder" << endl;
cout << "Enter the codes for a-z" << endl;
for(int i =1;i<=3;i++)
{
cin >> codes[i];
}
decode();
}

There are at least two major bugs in the code:
The codes[] array is mostly never filled in: you only read in three numbers, even though you later compare against indexes up to 26. (As a global it is zero-initialized, so the remaining entries are all 0.)
The call to strncpy() is broken in the sense that strncpy() does NOT null-terminate a string when it copies the maximum number of characters; that is, when you call strncpy() with final set to 1, strncpy() copies one character and does NOT append the terminating NUL character. atoi() will then also read whatever characters are left over in cut_stream from earlier, longer copies.
Also, if you are using "0" and "1" characters in your Huffman coding, this won't work anyway because the strings "01" and "1" will both be interpreted by atoi() as 1 (one) even though they are different codes. If this is really Huffman coding you shouldn't be using atoi() and integers at all, but just binary or character strings.
Huffman decoding is better done by using a tree data structure. Look up any standard book on algorithms for reference.

Related

can anyone help me to understand this c++ code?

Let me explain how this program works.
Step 1: enter the number of times you want to run the loop.
Step 2: enter two strings s1 and s2.
Output: it gives you a string s3 that does not contain any character from string s2.
Problem: I am unable to understand how the for loops work and why the size of hash is 257.
The code is given below.
#include <iostream>
using namespace std;
#include<string.h>
int main()
{
int t;
cout<<"enter any no. to run the loop"<<endl;
cin>>t;
while(t--)
{
string s1,s2,s3;
int i,j,l1,l2;
cout<<"enter two strings s1 and s2"<<endl;
cin>>s1>>s2;
l1=s1.length( );
l2=s2.length( );
int hash[257];
for(i=0;i<257;i++)
{
hash[i]=0;
}
for(i=0;i<l2;i++)
{
hash[s2[i]]++;
}
for(i=0;i<l1;i++)
{
if(hash[s1[i]]==0)
s3=s3+s1[i];
}
cout<<s3<<endl;
}
return 0;
}
This program figures out which characters in the first string are not contained in the second string.
Example input for the program:
1
abcdefghijklmnopqrstuvwxyz
helloworld
Example output (thanks to @mch for the correction):
abcfgijkmnpqstuvxyz
Edit: Note that this is of course case sensitive as characters a and A produce different integer values.
Here is some commentary on the program:
#include <iostream>
using namespace std;
#include <string>
int main() {
// Do the whole program as many times as the user says
int t;
cout << "enter any no. to run the loop" << endl;
cin >> t;
while (t--) {
string s1, s2, s3;
int i, j, l1, l2;
// read strings and get their respective lengths
cout << "enter two strings s1 and s2" << endl;
cin >> s1 >> s2;
l1 = s1.length();
l2 = s2.length();
// Array with 257 elements
int hash[257];
// Initialize all elements of array with 0
for (i = 0; i < 257; i++) {
hash[i] = 0;
}
// Count occurrences of characters in second string
// s2[i] is the character at position i in s2
// Increase the value of hash for this character by 1
for (i = 0; i < l2; i++) {
hash[s2[i]]++;
}
// Iterate over s1 characters
// If hash[s1[i]] == 0: the character s1[i] is not contained in s2
// s3 => string of letters in s1 that are not contained in s2
for (i = 0; i < l1; i++) {
if (hash[s1[i]] == 0)
s3 = s3 + s1[i];
}
// output s3
cout << s3 << endl;
}
return 0;
}
The code computes a histogram of occurrences of the letters in s2 and copies the letters of s1 that have zero occurrences.
It can crash for any char type not restricted to the range [0,256] (!), for example when char is signed and a character has a negative value.
There's a comment above explaining the for-loops.
int hash[257] could actually be int hash[256]. There are 256 different values that can fit in a char (8 bits).

Extra character being added to string

Why is there an extra character at the end of my string?
#include <iostream>
using namespace std;
int main() {
int num;
cin >> num; // Reading input from STDIN
cout << "Input number is " << num << endl; // Writing output to STDOUT
char c1, s2[10];
for (int i=0; i<num; i++)
{
cin >> c1;
if(c1==0){
break;
}
s2[i] = c1;
}
cout <<"output= "<< s2;
}
output example
4
Input number is 4
a l e x
output= alex#
Why is the "#" being added to the end of the string? At first I thought it was a random garbage value, but every time I run the program it's always the same symbol.
cout, when printing a C-string, expects it to be zero-terminated, and you haven't terminated the array s2.
You can zero initialize the entire array:
char s2[10] = {};
Or just zero-terminate after the last character written:
int i = 0;
for (i=0; i<num; i++)
{
cin >> c1;
if(c1==0) {
break;
}
s2[i] = c1;
}
s2[i] = '\0';
In any case, you need to be wary of potential buffer overflow (e.g. if num is too large).
Alternatively, you can consider using std::string instead of a fixed length array.
You're reading from a memory location that hasn't been initialized. By using an array of ten characters and only initializing the first four (or whatever the number is), all other characters stay uninitialized. What is actually read from an uninitialized location is undefined, so it is pretty much down to your compiler and platform that the equivalent of "#" happens to sit at that location. You can fix that by using a memory block of the appropriate size. For this, you just replace the line
char c1, s2[10];
with
char c1;
char* s2 = new char[num + 1](); // num + 1 leaves room for the string terminator (see the other answers); () zero-fills the block
This way, you dynamically allocate exactly the size you need.
Don't forget to delete[] s2; afterwards.
You are using a character sequence, which is well explained here.
By convention, the end of strings represented in character sequences
is signaled by a special character: the null character, whose literal
value can be written as '\0' (backslash, zero).
In this case, the array of 20 elements of type char called foo can be
represented storing the character sequences "Hello" and "Merry Christmas" as:
Notice how after the content of the string itself, a null character
('\0') has been added in order to indicate the end of the sequence.
The panels in gray color represent char elements with undetermined
values.
I offer a C++17 solution that sizes the buffer from the input and zero-initializes it, although I would usually prefer std::string over a raw char buffer.
I also added a simple check of the integer input, which should always be done.
It also avoids pulling in the whole namespace std, mostly to avoid unnecessary name clashes.
#include <iostream>
#include <limits> //numeric_limits
#include <vector>
using std::cout, std::endl, std::cin; //<- explicitly declared (C++17)
int main() {
    int num;
    while (!(cin >> num) || num < 0) { // check that the input is a non-negative integer
        cin.clear();
        cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        cout << "Invalid input. Try again: ";
    }
    cout << "Input number is " << num << endl;
    char c1;
    std::vector<char> s2(num + 1, '\0'); // zero-filled, so the string is always terminated
    for (int i = 0; i < num; i++)
    {
        cin >> c1;
        if (c1 == 0) {
            break;
        }
        s2[i] = c1;
    }
    cout << "output= " << s2.data() << endl;
    return 0;
}
This happens because s2 is not actually a string: it does not contain the \0 character that marks the end of a string. Therefore, cout prints your characters and then continues further through memory, byte by byte, interpreting each byte as a character to output until it encounters a \0. To fix this, you can initialize s2 with an empty string, so the array is initially completely filled with \0.
#include <iostream>
using namespace std;
int main() {
int num;
cin >> num;
cout << "Input number is " << num << endl;
char c1, s2[10] = ""; // Initialize with an empty string
for (int i = 0; i < num; i++)
{
cin >> c1;
if (c1 == 0) {
break;
}
s2[i] = c1;
}
cout << "output= " << s2;
}

String's character delete & Frequency without using library

I am working on an assignment that needs to input two strings, S and str, and then deliver the following results (without using any string library):
1. Determine the number and positions of the given str in S
2. Return S with any possible occurrence of str removed
e.g. if S=aababacd cdefgh ,
str=aba
The Frequency is 2, position is <0,2>
The Character Delete would be cd cdefgh
Attached is the code I have so far. I can output the frequency and the position, but I have a few unsolved questions and no idea how to implement them.
1. Once I input a string with a space in it, e.g. abcd efg, the input is split immediately: it is not treated as one string but as S=abcd and str=efg. Because of this I can't input a string with a blank space to test.
2. How can I output the positions in this form: <0,2>? Because I am using a loop to output the result, it can't come out like that. I was thinking I could create an array to store i and then cout it, but I failed.
3. About the character delete problem: I found a similar problem which said that if I know how to write strcpy without using the library then I would know how to do it, but I learned that and I still don't know how to handle this question. I know I can compare the two strings, but I don't know how to output S without the str part. I was thinking of changing the matched part of S into '\0' inside the loop and then outputting it, but that was totally wrong.
I would really appreciate it if anyone could give me some advice, thank you!
#include <iostream>
#include <algorithm>
using namespace std;
void CharacterDelete(){
char S[100], str[100];
bool match =true;
cout << "Enter string S :";
cin >> S;
cout << "Enter string str :";
cin>>str;
for(int i=0; S[i]!='\0'; i++)
{
for(int j=0; str[j]!='\0';j++){
if(S[i+j]!=str[j]){
match=false;
break;
}
if(match){
S[i+j]='\0';
}
}
}
cout<<S;//Apparently thats a wrong solution
}
void Frequency(){
string S,str;
cout<<"Please input string S"<<endl;
cin>>S;
cout<<"Please input string str"<<endl;
cin>>str;
int sum=0;
for (int i=0; i<S.size(); i++)
{
if (i + str.size() > S.size()) break;
bool match=true;
for (int j=0; j<str.size(); j++)
if (S[i+j] != str[j])
{
match=false;
break;
//Once we print blank space and it would implement it immediately?
}
if(match)
{
sum++;
cout<<"Start from"<<i<<endl;
//What if we use an array to store it and then output it?but how to write it?
}
}
cout<<"The Frequency is "<<sum<<endl;
if(sum==0){
cout<<"There is no starting point"<<endl;
}
}
int main() {
Frequency();
CharacterDelete();
return 0;
}
You are using local variables S and str inside each function. Declare S and str in main instead, then pass these variables from main into Frequency() and CharacterDelete().
To delete characters: create a new variable, then copy into it the characters that are not being deleted.
Output:
cout << "<" << num_word << ", " << number << ">\n";

Super basic code: Why is my loop not breaking?

for(int i=0;i<50;i++,size++)
{
cin >> inputnum[i];
cout << size;
if(inputnum[i] == '.')
{
break;
}
}
The break stops the input, but size keeps being output.
The output of size is 012345678910111213...474849.
I tried putting size++ inside the loop, but it made no difference. And size afterwards is equal to 50, which means the loop ran all the way through.
I forgot to explain that I added the cout << size within the loop to debug/check why it outputted to 50 after the loop even if I only inputted 3 numbers.
I suspect that inputnum is an array of int (or some other numeric type). When you try to input '.', nothing actually goes into inputnum[i] - the cin >> inputnum[i] expression actually fails and puts cin into a failed state.
So, inputnum[i] is not changed when inputting a '.' character, and the break never gets executed.
Here's a slightly modified version of your code in a small, complete program that demonstrates using !cin.good() to break out of the input loop:
#include <iostream>
#include <ostream>
using namespace std;
int main()
{
int inputnum[50];
int size = 0;
for(int i=0;i<50;i++,size++)
{
cin >> inputnum[i];
if (!cin.good()) {
break;
}
}
cout << "size is " << size << endl;
cout << "And the results are:" << endl;
for (int i = 0; i < size; ++i) {
cout << "inputnum[" << i << "] == " << inputnum[i] << endl;
}
return 0;
}
This program will collect input into the inputnum[] array until it hits EOF or an invalid input.
What is inputnum? Make sure it's a char[]! With clang++ this compiles and works perfectly:
#include <iostream>
int main() {
int size = 0;
char inputnum[60];
for(int i=0;i<50;i++,size++) {
std::cin >> inputnum[i];
std::cout << size;
if(inputnum[i] == '.') {
break;
}
}
return 0;
}
(in my case with the following output:)
a
0a
1s
2d
3f
4g
5.
6Argento:Desktop marinos$
Your code seems OK as long as you're testing char against char in your loop and not something else. Could it be that inputnum is some integral type? If so, then your test clause will always evaluate to false unless inputnum[i] matches the numerical value that '.' is implicitly converted to.
EDIT
Apparently you are indeed trying to put char in a int[]. Try the following:
#include <iostream>
int main() {
using namespace std;
int size = 0;
int inputnum[50];
char inputchar[50];
for(int i=0;i<50;i++,size++) {
cin >> inputchar[i];
inputnum[i] = static_cast<int>(inputchar[i]); // or inputnum[i] = (int)inputchar[i];
cout << size << endl; // add a new line in the end
if(inputchar[i] == '.') break;
}
return 0;
}
Then again, this is probably a lab assignment; in a real program I'd never code like this. That would depend on the requirements, but I'd rather use STL containers and algorithms or stringstreams. And if forced to work in a lower-level C style, I'd try to figure out what number '.' translates to (simply by int a = '.'; cout << a;) and put that number directly in the test clause. Such code might be simple, but it is also BAD in my opinion: it's unsafe, implementation-specific and not really C++.

Store a word into a dynamically created array when first encountered

Here is the assignment:
Write a program that reads in a text file one word at a time. Store a word into a dynamically created array when it is first encountered. Create a parallel integer array to hold a count of the number of times that each particular word appears in the text file. If the word appears in the text file multiple times, do not add it into your dynamic array, but make sure to increment the corresponding word frequency counter in the parallel integer array. Remove any trailing punctuation from all words before doing any comparisons.
Create and use the following text file containing a quote from Bill Cosby to test your program.
I don't know the key to success, but the key to failure is trying to please everybody.
At the end of your program, generate a report that prints the contents of your two arrays in a format similar to the following:
Word Frequency Analysis
I 1
don't 1
know 1
the 2
key 2
...
Here is my code:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int readInFile (string tempArray [], string file, int arraySize);
int main()
{
ifstream inputFile;
string *readInArray = 0,
*compareArray = 0,
filename,
word;
int wordCount = 0;
int encountered = 0;
int j = 0,
*wordFrequency = 0;
cout << "Enter the filename you wish to read in: ";
getline(cin, filename);
inputFile.open(filename.c_str());
if (inputFile)
{
while (inputFile >> word)
{
wordCount++;
}
inputFile.close();
readInArray = new string[wordCount];
readInFile(readInArray, filename, wordCount);
}
else
{
cout << "Could not open file, ending program";
return 0;
}
compareArray = new string[wordCount];
wordFrequency = new int[wordCount];
for (int count = 0; count < wordCount; count++)
wordFrequency[count] = 0;
for(int i = 0; i < wordCount; ++i)
{
j = 0;
encountered = 0;
do
{
if (readInArray[i] == compareArray[j])
encountered = 1;
++j;
} while (j < wordCount);
if (encountered == 0)
{
compareArray[i]=readInArray[i];
wordFrequency[i] += 1;
}
}
for(int k=0; k < wordCount; ++k)
{
cout << "\n" << compareArray[k] << " ";
}
for(int l=0; l < wordCount; ++l)
{
cout << "\n" << wordFrequency[l] << " ";
}
return 0;
}
int readInFile (string tempArray [], string file, int arraySize)
{
ifstream inputFile;
inputFile.open(file.c_str());
if (inputFile)
{
cout << "\nHere is the text file:\n\n";
for(int i=0; i < arraySize; ++i)
{
inputFile >> tempArray[i];
cout << tempArray[i] << " ";
}
inputFile.close();
}
}
Here is my question:
How do you store a word into a dynamically created array when it is first encountered? As you can see from my code, I made a string array with some of the elements empty. I believe it is supposed to be done using pointers.
Also how do I get rid of the punctuation in the string array? Should it be converted to a c-string first? But then how would I compare the words without converting back to a string array?
Here is a link to a java program that does something similar:
http://math.hws.edu/eck/cs124/javanotes3/c10/ex-10-1-answer.html
Thank you for any help you can offer!!
As to the first part of your question: your arrays are allocated with new[], but their size is fixed when they are created, so they cannot grow as new words are encountered. C++ provides implementations of growable dynamic arrays, like the vector class http://www.cplusplus.com/reference/vector/vector/
As to the second part of your question, I see no reason to convert it to a C string. The string class in C++ provides functionality for removing and searching for characters. http://www.cplusplus.com/reference/string/string/
The string::erase function can be used to erase punctuation characters found with string::find.
Note: There are other ways of doing this assignment that may be easier (like having an array of structs containing a string and an int, or using a map) but that may defeat the purpose of the assignment.