I need a little help debugging. The code compiles cleanly. However, it crashes when given a small fragment of a document to compress, and when it decompresses it gives an error about bounds checking. I'm a little afraid of running it as well. It's not dangerous, but this is my masterpiece as of now. It is right in the sweet spot of compression techniques. This is one I made up. It uses a calculus derivation algorithm to get millions of unique keys to use. These are all predictable. And because they're unique, I can't screw it up by using a key more than once in the hashing. The aim of this code is to generate a hash which is perfectly regenerative and gives no loss in the compression. Thank you.
#include <iostream>
#include <bitset>
#include <vector>
#include <cmath>
#include <fstream>
#include <algorithm>
using namespace std;
class S_Rend {
private:
const bitset<8> beta=0xad;
protected:
bitset<8> alpha, lambda, gamma, omega;
bitset<8> delta, eta, theta, ghost, spec;
vector<long> cred;
public:
unsigned int integral;
S_Rend() { delta=0x00; eta=0x00; theta=0x00; lambda=0x00; alpha=0x00; delta=0x00; };
~S_Rend() { };
int s_render(ifstream&,ofstream&);
int render(ifstream&,ofstream&);
long s_nop(long t, int set);
} n;
/*+**- Project::Sailwinds -**+*/
long S_Rend::s_nop(long t,int set) {
if (set) {
integral=0;
t=(long&)beta;
}
integral++;
if (abs(round((t*1.618)*t-(integral+0.618))-1) <= 4294967296)
return (abs(round((t*1.618)*t-(integral+0.618))-1));
else
return (abs(round(sqrt(t))+(round(sqrt(t))*round(sqrt(integral))+1)));
}
int S_Rend::render(ifstream& in, ofstream& out) {
long bn;
long size=0;
long t;
if (!(in.is_open()))
{ return -1; }
else {
t=(long&)beta;
for_each (std::istreambuf_iterator<char>(in), \
std::istreambuf_iterator<char>(), \
[&] (int x) {
t=s_nop((long&)t,0);
cred.push_back(t);
alpha = (long&)cred[size];
delta = (long&)x;
lambda ^= (alpha ^ delta);
lambda ^= beta;
lambda = (int&)lambda + 1;
size++;
});
printf("*");
}
if (out.is_open())
{ out << lambda << endl;
out << size << endl;
out << delta << endl;
out << cred[size-1] << endl; }
else { return -1; }
in.close();
out.close();
return 0;
}
int S_Rend::s_render(ifstream& in, ofstream& out) {
long i, n;
long size;
long t;
long chk;
in >> lambda;
in >> size;
in >> delta;
in >> chk;
t=(long&)beta;
long bn=0;
while (size-1>=bn) {
t=s_nop((long&)t,0);
cred.push_back(t);
bn++;
}
if (cred[bn-1]==chk)
cout << "\nValidity Pass... Success!" << endl;
else {
printf("\nValidity Pass...Fail! %u != %u",cred[cred.size()-1],chk);
return 1;
}
cout << "\nWriting to Buffer..." << endl;
vector<long> btrace;
vector<long> ltr;
bn=1;
while (size-1>=bn) {
ltr.push_back(1);
btrace.push_back(1);
ltr[0]=(long&)lambda;
for (i=1;i<=btrace.size()-1;i++) {
alpha = (long&)cred[size-bn];
ghost = (long&)btrace[i-1];
spec = (long&)ltr[bn] - 1;
spec ^= (int&)beta;
eta = spec | alpha;
theta = spec & alpha;
omega = spec | eta;
gamma = spec & eta;
if ((eta ^ gamma) == (theta ^ omega)) {
printf(".");
ghost = (eta ^ gamma);
btrace[i-1] = (long&)ghost;
}
}
bn++;
}
cout << "One more second..\n";
bn=0;
while (bn<=btrace.size()-1) {
bn++;
delta = (long&)btrace[bn];
out << (const char)(long&)delta;
}
cout << "\nBuffer Written... Exiting..\n";
in.close();
out.close();
printf("*");
return 0;
}
int main() {
string outfile = "";
string infile = "";
string DC = "1";
printf("Enter <C> or <D> to compress or decompress ");
cin >> DC;
printf("\nInput File: ");
cin >> infile;
ifstream in;
in.open(infile.c_str(), std::ios::in | std::ios::binary);
if (in.fail())
return -1;
printf("\nOutput File: ");
cin >> outfile;
ofstream out;
out.open(outfile.c_str(), std::ios::out);
if (out.fail())
return -1;
if ((DC=="c") || (DC=="C"))
bool f=n.render(in, out);
if ((DC=="d") || (DC=="D"))
bool f=n.s_render(in, out);
printf("\nProgram Execution Done.");
n.~S_Rend();
return 0;
}
This last while-loop is accessing index 1 to (and including!) btrace.size():
bn=0;
while (bn<=btrace.size()-1) {
bn++;
delta = (long&)btrace[bn];
out << (const char)(long&)delta;
}
Move bn++; to the end of the loop, like you did in all your other loops.
And I have to agree with user4581301: using <= size-1 instead of just < size looks weird.
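A minimal sketch of the corrected output loop, pulled out into a standalone helper. The names write_bytes and bytes_to_string are made up for illustration, and btrace stands in for the vector from the question; writing only the low byte of each element is an assumption about what the original cast was meant to do.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: writes the low byte of each element to `out`.
// The index runs from 0 to size() - 1 and is incremented only after the
// element has been used, so btrace[btrace.size()] is never touched.
void write_bytes(const std::vector<long>& btrace, std::ostream& out) {
    for (std::size_t bn = 0; bn < btrace.size(); ++bn) {
        out << static_cast<char>(btrace[bn] & 0xFF);
    }
}

// Convenience wrapper so the result can be checked in memory.
std::string bytes_to_string(const std::vector<long>& btrace) {
    std::ostringstream out;
    write_bytes(btrace, out);
    return out.str();
}
```

Comparing with < size() also behaves correctly for an empty vector, where size() - 1 would wrap around because size() is unsigned.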
(int &)beta is a mistake. This is a reinterpret_cast which violates the strict aliasing rule. Probably it also accesses out of bounds; e.g. bitset<8> may only be 1 byte big, but (int &)beta will read 4 bytes out of that memory location.
Instead you should use beta.to_ulong(). You make the same mistake in dozens of places.
Remove all the casts from your code. Using a cast (especially a C-style cast) tells the compiler "Don't warn me if this is a mistake, I know what I am doing". But in fact you don't know what you are doing. Mostly, C++ can be written without using casts.
(There may be other mistakes too, just this one stood out to me on first reading. Fix all of these and try again).
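As a rough illustration of the cast-free style, here is one mixing step from render() rewritten with bitset operations and to_ulong() only. The function name mix_step and its parameter list are hypothetical, and reading the original lambda = (int&)lambda + 1; as "add one modulo 256" is an assumption about the intent.

```cpp
#include <bitset>

// Hypothetical rewrite of one mixing step from the question, using only
// bitset operations and to_ulong(); no reference casts are involved.
std::bitset<8> mix_step(std::bitset<8> lambda, unsigned long key,
                        unsigned char byte, std::bitset<8> beta) {
    std::bitset<8> alpha(key);   // the constructor keeps the low 8 bits
    std::bitset<8> delta(byte);
    lambda ^= (alpha ^ delta);
    lambda ^= beta;
    // Add 1 modulo 256: convert to an integer, add, convert back.
    return std::bitset<8>(lambda.to_ulong() + 1);
}
```

Because every conversion goes through the bitset constructor or to_ulong(), the compiler can still warn about genuine type mismatches.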
We need to create a universal Turing machine.
We have a file with the provided information: tape count, starting input, starting position and the rules.
Reading from the file isn't that big of a problem. What I'm doing right now is creating a vector of structs and reading all the rules into it.
What I can't figure out is how to make the machine itself work. We know that the starting state is always zero, so I'm starting from there. I look for the rules that start with this state, but what I cannot figure out is what to do if I have to jump back to the first rules. What kind of counting mechanism for the vector can I implement? I'm really new to vectors. I'll add that for the algorithm itself.
No more than two loops can be used, not counting printing out text or reading from the file.
The code I have right now is:
#include <iostream>
#include <vector>
#include <fstream>
#include <string>
#include <Windows.h>
struct rule {
std::string qstate; // current state
char csymbol; // current symbol
char nsymbol; // new symbol
char direction; // direction in which the head will move
std::string nstate; // new state
};
void printingText(std::vector<char> input, int position, long long steps);
void searchingForASymbolOrState(std::vector<rule> rules, std::vector<char> input, std::string state, int position, int& cursorPos);
int main()
{
int tapeCount, position;
long long steps = 0;
std::string tape;
std::ifstream file("1.txt");
file >> tapeCount >> tape >> position;
std::vector <char> input(tape.begin(), tape.end());
std::vector <rule> rules;
rule temp;
while (file >> temp.qstate) {
file >> temp.csymbol;
file >> temp.nsymbol;
file >> temp.direction;
file >> temp.nstate;
rules.push_back(temp);
}
file.close();
position--; // the array is indexed from zero, so decrement the starting position, which is counted from one
int cursorPos = 0; // stores the index where the rules for the required state begin
std::string state = "0"; // stores the state, so that we know which state we are currently in
// Turing machine "algorithm"
while (true) {
printingText(input, position, steps);
if (state == rules[cursorPos].qstate) {
if (input[position] == rules[cursorPos].csymbol) {
if (input[position] != rules[cursorPos].nsymbol) {
input[position] = rules[cursorPos].nsymbol;
if (rules[cursorPos].direction == 'L') {
position--;
steps++;
}
else if (rules[cursorPos].direction == 'R') {
position++;
steps++;
}
if (rules[cursorPos].nstate != state) {
state = rules[cursorPos].nstate;
}
}
else if (input[position] == rules[cursorPos].nsymbol) {
if (rules[cursorPos].direction == 'L') {
position--;
steps++;
}
else if (rules[cursorPos].direction == 'R') {
position++;
steps++;
}
if (rules[cursorPos].nstate != state) {
state = rules[cursorPos].nstate;
}
}
}
else if (input[position] != rules[cursorPos].csymbol) {
searchingForASymbolOrState(rules, input, state, position, cursorPos);
}
}
else if (state != rules[cursorPos].qstate) {
searchingForASymbolOrState(rules, input, state, position, cursorPos);
} // Count the steps
// std::cout << cursorPos << " " << position << " " << state << " " << rules[cursorPos].qstate; // Line used for debugging
Sleep(100);
system("cls");
}
// "End of the algorithm"
}
void printingText(std::vector<char> input, int position, long long steps) {
std::cout << "Head position can be seen with '' symbols\n\n";
for (int i = 0; i < input.size(); i++) {
if (i == position) {
std::cout << "'" << input[i] << "'";
}
else {
std::cout << input[i];
}
}
std::cout << "\n\nSteps: " << steps;
}
void searchingForASymbolOrState(std::vector<rule> rules, std::vector<char> input, std::string state, int position, int& cursorPos) {
for (int i = 0; i < rules.size(); i++) {
if (rules[i].qstate == state) {
if (rules[i].csymbol == input[position]) {
cursorPos = i;
}
}
if (rules[cursorPos].qstate != state) {
if (rules[i].qstate == state) {
cursorPos = i;
}
}
}
}
I know that .eof() and system("cls") aren't good functions to use, but for this project I think they'll work fine. Correct me if I'm wrong.
EDIT: I tried to do something. Not sure if there's a more effective way. Obviously it's not finished: there is no halting, no error checking, and so on. But if you have any comments, they would be really appreciated.
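One possible way around the "jump back to the first rules" problem is to avoid scanning a vector at all and look each rule up by its (state, symbol) pair. The sketch below is only an illustration under that assumption: Rule, RuleTable, run and max_steps are hypothetical names, a missing rule is treated as a halt, and every rule application counts as one step, which differs from the step counting in the code above.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Rule {
    char nsymbol;        // symbol to write
    char direction;      // 'L' or 'R'; anything else leaves the head in place
    std::string nstate;  // state to switch to
};

// Rules keyed by (current state, current symbol).
using RuleTable = std::map<std::pair<std::string, char>, Rule>;

// Runs the machine until no rule matches, the head leaves the tape, or
// max_steps is reached. Returns the number of rule applications.
long long run(const RuleTable& rules, std::vector<char>& tape,
              int position, std::string state, long long max_steps) {
    long long steps = 0;
    while (steps < max_steps && position >= 0 &&
           position < static_cast<int>(tape.size())) {
        auto it = rules.find({state, tape[position]});
        if (it == rules.end()) break;  // no matching rule: treat as halt
        const Rule& r = it->second;
        tape[position] = r.nsymbol;
        if (r.direction == 'L') --position;
        else if (r.direction == 'R') ++position;
        state = r.nstate;
        ++steps;
    }
    return steps;
}
```

The whole machine is a single loop, so it stays within the two-loop limit, and the order of the rules in the file no longer matters.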
Hi, I am working with existing C++ code. I normally use VB.NET, and much of what I am seeing is confusing and contradictory to me.
The existing code loads neural network weights from a file that is encoded as follows:
2
model.0.conv.conv.weight 5 3e17c000 3e9be000 3e844000 bc2f8000 3d676000
model.0.conv.bn.weight 7 4006a000 3f664000 3fc98000 3fa6a000 3ff2e000 3f5dc000 3fc94000
The first line gives the number of subsequent lines. Each of these lines has a description, a number representing how many values follow, then the weight values in hex. In the real file there are hundreds of rows and each row might have hundreds of thousands of weights. The weight file is 400MB in size. The values are converted to floats for use in the NN.
It takes over 3 minutes to decode this file. I am hoping to improve performance by eliminating the conversion from hex encoding to binary and just storing the values natively as floats. The problem is I can't understand what the code is doing, nor how I should be storing the values in binary. The relevant section that decodes the rows is here:
while (count--)
{
Weights wt{ DataType::kFLOAT, nullptr, 0 };
uint32_t size;
// Read name and type of blob
std::string name;
input >> name >> std::dec >> size;
wt.type = DataType::kFLOAT;
// Load blob
uint32_t* val = reinterpret_cast<uint32_t*>(malloc(sizeof(val) * size));
for (uint32_t x = 0, y = size; x < y; ++x)
{
input >> std::hex >> val[x];
}
wt.values = val;
wt.count = size;
weightMap[name] = wt;
}
The Weights class is described here. DataType::kFLOAT is a 32bit float.
I was hoping to add one or more lines in the inner loop below input >> std::hex >> val[x]; so that I could write the float values to a binary file as the values are converted from hex, but I don't understand what is going on. It looks like memory is being allocated to hold the values, but sizeof(val) is 8 bytes and a uint32_t is 4 bytes. Furthermore, it looks like the values are being stored in wt.values from val, but val contains integers, not floats. I really don't see what the intent is here.
Could I please get some advice on how to store and load binary values to eliminate the hex conversion. Any advice would be appreciated. A lot.
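On the two confusing points: sizeof(val) measures the pointer val (8 bytes on a 64-bit build), not a uint32_t, so the malloc simply allocates twice as much as needed. And each hex token appears to be the raw bit pattern of a 32-bit float, which is why the integers can be handed over as kFLOAT data unchanged. A small sketch of that reinterpretation, assuming IEEE 754 floats; bits_to_float and float_to_bits are hypothetical helper names, not part of the original code.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Reinterprets the 32 bits parsed from the hex text as a float.
float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// The inverse: recovers the bit pattern that the text file stores in hex.
std::uint32_t float_to_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}
```

Under that reading, the first weight in the sample, 3e17c000, is about 0.1482, and writing the val buffer to a binary file byte for byte already stores the floats natively.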
Here's an example program that will convert the text format shown into a binary format and back again. I took the data from the question and converted to binary and back successfully. My feeling is it's better to cook the data with a separate program before consuming it with the actual application so the app reading code is single purpose.
There's also an example of how to read the binary file into the Weights class at the end. I don't use TensorRT so I copied the two classes used from the documentation so the example compiles. Make sure you don't add those to your actual code.
If you have any questions let me know. Hope this helps and makes loading faster.
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
void usage()
{
std::cerr << "Usage: convert <operation> <input file> <output file>\n";
std::cerr << "\tconvert b in.txt out.bin - Convert text to binary\n";
std::cerr << "\tconvert t in.bin out.txt - Convert binary to text\n";
}
bool text_to_binary(const char *infilename, const char *outfilename)
{
std::ifstream in(infilename);
if (!in)
{
std::cerr << "Error: Could not open input file '" << infilename << "'\n";
return false;
}
std::ofstream out(outfilename, std::ios::binary);
if (!out)
{
std::cerr << "Error: Could not open output file '" << outfilename << "'\n";
return false;
}
uint32_t line_count;
if (!(in >> line_count))
{
return false;
}
if (!out.write(reinterpret_cast<const char *>(&line_count), sizeof(line_count)))
{
return false;
}
for (uint32_t l = 0; l < line_count; ++l)
{
std::string name;
uint32_t num_values;
if (!(in >> name >> std::dec >> num_values))
{
return false;
}
std::vector<uint32_t> values(num_values);
for (uint32_t i = 0; i < num_values; ++i)
{
if (!(in >> std::hex >> values[i]))
{
return false;
}
}
uint32_t name_size = static_cast<uint32_t>(name.size());
bool result = out.write(reinterpret_cast<const char *>(&name_size), sizeof(name_size)) &&
out.write(name.data(), name.size()) &&
out.write(reinterpret_cast<const char *>(&num_values), sizeof(num_values)) &&
out.write(reinterpret_cast<const char *>(values.data()), values.size() * sizeof(values[0]));
if (!result)
{
return false;
}
}
return true;
}
bool binary_to_text(const char *infilename, const char *outfilename)
{
std::ifstream in(infilename, std::ios::binary);
if (!in)
{
std::cerr << "Error: Could not open input file '" << infilename << "'\n";
return false;
}
std::ofstream out(outfilename);
if (!out)
{
std::cerr << "Error: Could not open output file '" << outfilename << "'\n";
return false;
}
uint32_t line_count;
if (!in.read(reinterpret_cast<char *>(&line_count), sizeof(line_count)))
{
return false;
}
if (!(out << line_count << "\n"))
{
return false;
}
for (uint32_t l = 0; l < line_count; ++l)
{
uint32_t name_size;
if (!in.read(reinterpret_cast<char *>(&name_size), sizeof(name_size)))
{
return false;
}
std::string name(name_size, 0);
if (!in.read(name.data(), name_size))
{
return false;
}
uint32_t num_values;
if (!in.read(reinterpret_cast<char *>(&num_values), sizeof(num_values)))
{
return false;
}
std::vector<float> values(num_values);
if (!in.read(reinterpret_cast<char *>(values.data()), num_values * sizeof(values[0])))
{
return false;
}
if (!(out << name << " " << std::dec << num_values))
{
return false;
}
for (float &f : values)
{
uint32_t i;
memcpy(&i, &f, sizeof(i));
if (!(out << " " << std::hex << i))
{
return false;
}
}
if (!(out << "\n"))
{
return false;
}
}
return true;
}
int main(int argc, const char *argv[])
{
if (argc != 4)
{
usage();
return EXIT_FAILURE;
}
char op = argv[1][0];
bool result = false;
switch (op)
{
case 'b':
case 'B':
result = text_to_binary(argv[2], argv[3]);
break;
case 't':
case 'T':
result = binary_to_text(argv[2], argv[3]);
break;
default:
usage();
break;
}
return result ? EXIT_SUCCESS : EXIT_FAILURE;
}
// Possible implementation of the code snippet in the original question to read the weights
// START Copied from TensorRT documentation - Do not include in your code
enum class DataType : int32_t
{
kFLOAT = 0,
kHALF = 1,
kINT8 = 2,
kINT32 = 3,
kBOOL = 4
};
class Weights
{
public:
DataType type;
const void *values;
int64_t count;
};
// END Copied from TensorRT documentation - Do not include in your code
bool read_weights(const char *infilename)
{
std::unordered_map<std::string, Weights> weightMap;
std::ifstream in(infilename, std::ios::binary);
if (!in)
{
std::cerr << "Error: Could not open input file '" << infilename << "'\n";
return false;
}
uint32_t line_count;
if (!in.read(reinterpret_cast<char *>(&line_count), sizeof(line_count)))
{
return false;
}
for (uint32_t l = 0; l < line_count; ++l)
{
uint32_t name_size;
if (!in.read(reinterpret_cast<char *>(&name_size), sizeof(name_size)))
{
return false;
}
std::string name(name_size, 0);
if (!in.read(name.data(), name_size))
{
return false;
}
uint32_t num_values;
if (!in.read(reinterpret_cast<char *>(&num_values), sizeof(num_values)))
{
return false;
}
// Normally I would use float* values = new float[num_values]; here which
// requires delete [] ptr; to free the memory later.
// I used malloc to match the original example since I don't know who is
// responsible to clean things up later, and TensorRT might use free(ptr)
// Makes no real difference as long as new/delete or malloc/free are matched up.
float *values = reinterpret_cast<float *>(malloc(num_values * sizeof(*values)));
if (!in.read(reinterpret_cast<char *>(values), num_values * sizeof(*values)))
{
return false;
}
weightMap[name] = Weights { DataType::kFLOAT, values, num_values };
}
return true;
}
I have a function in my code which decodes a file compressed using the LZ77 algorithm. But on a 15 MB input file, decompression takes about 3 minutes (too slow). What is the reason for the poor performance? On every step of the loop I read two or three bytes and get the length, offset and next character. If the offset is not zero, I also have to move "offset" bytes back in the output stream and read "length" bytes. Then I append them to the end of the same stream before writing the next character there.
void uncompressData(long block_size, unsigned char* data, fstream &file_out)
{
unsigned char* append;
append = new unsigned char[buf_length];
link myLink;
long cur_position = 0;
file_out.seekg(0, ios::beg);
cout << file_out.tellg() << endl;
int i=0;
myLink.length=-1;
while(i<(block_size-1))
{
if(myLink.length!=-1) file_out << myLink.next;
myLink.length = (short)(data[i] >> 4);
//cout << myLink.length << endl;
if(myLink.length!=0)
{
myLink.offset = (short)(data[i] & 0xF);
myLink.offset = myLink.offset << 8;
myLink.offset = myLink.offset | (short)data[i+1];
myLink.next = (unsigned char)data[i+2];
cur_position=file_out.tellg();
file_out.seekg(-myLink.offset,ios_base::cur);
if(myLink.length<=myLink.offset)
{
file_out.read((char*)append, myLink.length);
}
else
{
file_out.read((char*)append, myLink.offset);
int k=myLink.offset,j=0;
while(k<myLink.length)
{
append[k]=append[j];
j++;
if(j==myLink.offset) j=0;
k++;
}
}
file_out.seekg(cur_position);
file_out.write((char*)append, myLink.length);
i++;
}
else {
myLink.offset = 0;
myLink.next = (unsigned char)data[i+1];
}
i=i+2;
}
unsigned char hasOddSymbol = data[block_size-1];
if(hasOddSymbol==0x0) { file_out << myLink.next; }
delete[] append;
}
You could try doing it on a std::stringstream in memory instead:
#include <sstream>
void uncompressData(long block_size, unsigned char* data, fstream& out)
{
std::stringstream file_out; // first line in the function
// the rest of your function goes here
out << file_out.rdbuf(); // last line in the function
}
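If the whole output fits in memory, a related option is to decode into a std::vector<unsigned char> and write it to the file once at the end, which avoids the seek/read/seek/write round trip for every match. A minimal sketch of just the back-reference copy, under that assumption; copy_match and as_string are hypothetical names, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends `length` bytes that start `offset` bytes before the current end
// of `out`. Copying one byte at a time lets the source overlap the bytes
// being appended, which covers the length > offset case.
void copy_match(std::vector<unsigned char>& out, std::size_t offset,
                std::size_t length) {
    assert(offset >= 1 && offset <= out.size());
    std::size_t src = out.size() - offset;
    for (std::size_t k = 0; k < length; ++k) {
        unsigned char b = out[src + k];  // copy first, then grow the vector
        out.push_back(b);
    }
}

// Small helper so the result is easy to inspect.
std::string as_string(const std::vector<unsigned char>& v) {
    return std::string(v.begin(), v.end());
}
```

The byte-wise copy replaces the separate append buffer and the wrap-around index j from the question.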
I have to write a program that takes as input a string containing some '$' characters and digits. The output of the program is the set of all possible strings where each '$' in the string is replaced by the other digits in the string.
I have written the following code for it.
#include<bits/stdc++.h>
using namespace std;
int numberOf(string in)
{
int count = 0;
for(int i = 0; i <= in.size()-1; i++)
if(in[i] == '$')
count++;
return count;
}
void solve(string in, string in1, vector <string> &s,
int index)
{
if(numberOf(in) == 0)
{
s.push_back(in);
return;
}
if(index == in.size())
{
return;
}
if(in1.empty())
{
return;
}
else
{
if(in[index] == '$')
{
string in2 = in;
in2[index] = in1[0];
string in3 = in1;
in3.erase(in3.begin());
solve(in2, in1, s, index+1);
solve(in, in3, s, index);
return;
}
else
{
solve(in, in1, s, index+1);
return;
}
}
}
void replaceDollar(string in)
{
string in1 = in;
int count = 0;
for(int i = 0; i <= in.size()- 1; i++)
{
if(in[i] != '$')
{
in1.push_back(in[i]);
count++;
}
}
count = in.size() - count;
cout << "Number is " << count << "\n";
vector <string> s;
solve(in, in1, s, 0);
for(auto i = s.begin(); i != s.end(); i++)
cout << *i << " ";
cout << "\n";
}
int main()
{
int t;
cin >> t;
while(t--)
{
string in;
cin >> in;
replaceDollar(in);
}
return 0;
}
For following input
1
$45
The expected output should be
445 545
But it returns
445 545 445 545
Can anyone please explain why it is outputting repeated strings?
Also, can anyone suggest a better approach to this question?
Thanks in advance!
Assuming that this is homework:
1. Start over. Your code is way too complex for this problem.
2. I would treat everything as type char.
3. Loop/iterate over said string, using std::string::replace() to replace each instance of $ with each digit.
   -- If your teacher doesn't want you using std libraries, then add another loop and compare yourself.
3a. Of course, add a check, so that you don't replace $ with $.
4. Create a new copy of the string on each iteration.
5. Print each to stdout as you create them.
See this post:
How to replace all occurrences of a character in string?
p.s. Pro tip: don't use using namespace. Use the full namespace in your calls; e.g.:
Bad
using namespace std;
string s = "hello world";
Good
std::string s = "hello world";
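For what it's worth, the duplicates in the question appear to come from string in1 = in; in replaceDollar(), which seeds the candidate list with the whole input before the digits are appended, so every digit ends up in it twice. Below is a minimal sketch of a simpler approach, assuming each '$' is filled independently with every distinct digit that occurs in the input; distinct_digits, fill_from and expand are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collects the distinct non-'$' characters of `in`, in order of first use.
std::string distinct_digits(const std::string& in) {
    std::string digits;
    for (char c : in) {
        if (c != '$' && digits.find(c) == std::string::npos) digits.push_back(c);
    }
    return digits;
}

// Fills the '$' positions from left to right. Each '$' is visited once per
// branch, so every complete string is produced exactly once.
void fill_from(std::string& cur, std::size_t index, const std::string& digits,
               std::vector<std::string>& out) {
    while (index < cur.size() && cur[index] != '$') ++index;
    if (index == cur.size()) { out.push_back(cur); return; }
    for (char d : digits) {
        cur[index] = d;
        fill_from(cur, index + 1, digits, out);
    }
    cur[index] = '$';  // restore the placeholder for the caller
}

std::vector<std::string> expand(std::string in) {
    std::vector<std::string> out;
    fill_from(in, 0, distinct_digits(in), out);
    return out;
}
```

Calling expand("$45") gives the two strings from the expected output, 445 and 545, without repeats.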
#include <iostream> // std::cout
#include <cstdlib>
#include <climits>
#include <algorithm>
#include <cmath>
#include <fstream>
using namespace std;
struct student{
int ID; // ID
string firstname; // first name
string lastname; // last name
int date; // YYMMDD
static bool sort_date(student a, student b){
long data1;
long data2;
data1 = a.date;
data2 = b.date;
if(data1 < 150000){
data1 += 20000000;
}
else{
data2 += 19000000;
}
if(data2 < 150000){
data2 += 20000000;
}
else{
data1 += 19000000;
}
return data1 < data2;
}
};
bool is_num(const string &s);
void input_year(student &students);
int length_of_int(int x);
int main(){
student students[5];
students[0].date = 000101;
students[1].date = 951230;
students[2].date = 570509;
students[3].date = 120915;
students[4].date = 020324;
stable_sort(students, students + 5, student::sort_date);
ofstream file;
file.open("sort_date.txt");
for(int i = 0; i < 5; i++){
file << students[i].date << endl;
}
return 0;
}
void input_year(student &students){
while(true){
string input;
cin >> input;
if(is_num(input)){
students.date = atoi(input.c_str());
if(length_of_int(students.date) != 6){
cout << "Error, try again." << endl;
}
else{
//
break;
}
}
else{
cout << "Error, try again." << endl;
}
}
}
bool is_num(const string &s){
string::const_iterator it = s.begin();
while(it != s.end() && isdigit(*it)){
++it;
}
return !s.empty() && it == s.end();
}
int length_of_int(int input){
int length = 0;
while(input > 0){
length++;
input /= 10;
}
return length;
}
This is my code above, and I'm not sure what else to do to sort the dates. I've been working on this for a while and can't get it right. I need help, preferably code that solves my problem.
Basically, the date format is "YYMMDD", so in the sort_date function I convert those integers to the "YYYYMMDD" format, sort them, and then they become YYMMDD again. However, the sorting is somehow wrong. I tried several times, and when writing a date like "010101" to the file, it removes the first "0". So I am looking for help with those two problems. Any help is appreciated.
If a leading 0 is significant, then you don't have an int, but a string. You can check whether strings are sorted using < just as well as you can ints.
Also look at your if-else for adding 19 or 20; you check data1 but then modify data2 (and vice versa)....
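A minimal sketch of a comparator that expands each date on its own before comparing. The 150000 cutoff is taken from the question; full_date, earlier and sorted_dates are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

// Expands YYMMDD to YYYYMMDD. The cutoff is taken from the question:
// values below 150000 (years 00 to 14) are read as 20YY, the rest as 19YY.
long full_date(int yymmdd) {
    return yymmdd < 150000 ? yymmdd + 20000000L : yymmdd + 19000000L;
}

// Each argument is expanded independently before the comparison.
bool earlier(int a, int b) {
    return full_date(a) < full_date(b);
}

std::vector<int> sorted_dates(std::vector<int> dates) {
    std::stable_sort(dates.begin(), dates.end(), earlier);
    return dates;
}
```

Because the expansion of one date no longer depends on the other, 951230 and 570509 sort before every 20YY date, whichever argument they arrive in.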
In C and C++, an integer literal that starts with 0 is interpreted as octal (base 8 rather than 10): the literal 020324 will be interpreted as the decimal number 8404.
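The octal reading can be checked at compile time, and if the dates do stay as ints, padding on output is one way to get the leading zero back in the file. A small sketch; format_date is a hypothetical helper name.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// A leading 0 makes an integer literal octal, so these hold at compile time.
static_assert(020324 == 8404, "020324 is an octal literal");
static_assert(000101 == 65, "000101 is an octal literal");

// Formats a YYMMDD value stored as an int back to six digits.
std::string format_date(int yymmdd) {
    std::ostringstream out;
    out << std::setw(6) << std::setfill('0') << yymmdd;
    return out.str();
}
```

In main() the dates would then be written without the leading zeros, for example 101 and 20324, and formatted only when they are written out.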
First convert the dates to time_t or tm, then use the C date and time library (http://en.wikipedia.org/wiki/C_date_and_time_functions).