C++ getline output changes when it is copied to a string


I have a file in which each line holds one row of a matrix, with the double values in a row separated by spaces. The file contains many such matrices, separated by an empty line.
I now have two versions of the code:
1- A single thread reads the file with getline(file, readLine) and processes readLine directly: it splits the line, converts each token with stod, and fills the matrix.
#include <fstream>
#include <string>
using namespace std;
void decomposeSerial(double *A, long n)
{
long i, j, k;
for (k = 0; k < n; k++) {
for (j = k + 1; j < n; j++)
A[k*n + j] = A[k*n + j] / A[k*n + k];
for (i = k + 1; i < n; i++)
for (j = k + 1; j < n; j++)
A[i*n + j] = A[i*n + j] - A[i*n + k] * A[k*n + j];
}
}
void main() {
const string inFilePath = ".\\data_in\\file.txt";
const string outFilePath = ".\\data_out\\file.txt";
ifstream inFile(inFilePath);
ofstream outFile(outFilePath);
int n;
int matrixLine = 0;
double * matrix = NULL;
string readLine;
while (getline(inFile, readLine)) {
if (!readLine.empty()) {
if (matrixLine == 0) {
n = 0;
string temp = readLine;
size_t pos = 0;
while ((pos = temp.find(" ")) != string::npos) {
temp.erase(0, pos + 1);
n++;
}
matrix = (double *)malloc(sizeof(double) * n * n);
}
size_t pos = 0;
string token;
int i = 0;
while ((pos = readLine.find(" ")) != string::npos) {
token = readLine.substr(0, pos);
matrix[matrixLine * n + i] = stod(token);
readLine.erase(0, pos + 1);
i++;
}
matrixLine++;
if (matrixLine == n) {
decomposeSerial(matrix, n);
double det = 1;
for (long o = 0; o < n; o++) {
det *= matrix[o * n + o];
}
outFile << det << "\n";
}
}
else {
matrixLine = 0;
}
}
inFile.close();
outFile.close();
}
http://codeshare.io/5enk9x
2- A single thread reads the file with getline(file, readLine) and appends readLine to the element of a string array dedicated to that matrix. Afterwards, in parallel, each thread takes one of these elements and goes through the same process to build the matrix.
#include <fstream>
#include <string>
#include <omp.h>
using namespace std;
double det[1000];
string input[1000];
int ns[1000];
void computation(double* src, int n, int l)
{
long i, j, k;
for (k = 0; k < n; k++) {
for (j = k + 1; j < n; j++)
src[k*n + j] = src[k*n + j] / src[k*n + k];
for (i = k + 1; i < n; i++)
for (j = k + 1; j < n; j++)
src[i*n + j] = src[i*n + j] - src[i*n + k] * src[k*n + j];
}
double res = 1;
for (int j = 0; j < n; j++) {
res *= src[j*n + j];
}
det[l] = res;
}
void main() {
const string inFilePath = ".\\data_in\\file.txt";
const string outFilePath = ".\\data_out\\file.txt";
ifstream inFile(inFilePath);
int matrixCount = 0;
bool inMatrix = false;
string readLine;
int dim = 0;
while (getline(inFile, readLine)) {
dim++;
if (readLine.empty()) {
ns[matrixCount] = dim - 1;
dim = 0;
inMatrix = false;
matrixCount++;
}
else {
if (inMatrix == false) {
inMatrix = true;
input[matrixCount] = readLine;
}
else {
input[matrixCount] += readLine;
}
}
}
ns[matrixCount] = dim;
matrixCount++;
inFile.close();
#pragma omp parallel
{
#pragma omp for schedule(dynamic)
for (int i = 0; i < matrixCount; i++) {
string matrixStr = input[i];
int n = ns[i];
double * matrix = (double *)malloc(sizeof(double) * n * n);
size_t pos = 0;
string token;
int k = 0;
while ((pos = matrixStr.find(" ")) != string::npos) {
token = matrixStr.substr(0, pos);
matrix[k] = stod(token);
matrixStr.erase(0, pos + 1);
k++;
}
computation(matrix, n, i);
free(matrix);
}
}
ofstream outFile(outFilePath);
for (int i = 0; i < matrixCount; i++) {
outFile << det[i] << "\n";
}
outFile.close();
}
http://codeshare.io/ad83Yy
Surprisingly, the second version is much slower at building the matrices.
When I print readLine as returned by getline with printf("%s", readLine), it prints weird characters. I noticed that when I append readLine to the string array element, these weird characters change on the console, and I guess that is why performance drops: functions like str.find(" ") or stod(str) seem to work better on the first weird characters than on the second ones.
If you agree, could you suggest a way to prevent the characters from changing when appending?

Performance issues like this can't be settled by reasoning alone. You need to use a profiler and measure how much time each part of the code takes.
To start with, I would make the two versions more similar. There are a lot of differences that could confuse the issue (for one, version 1 has a giant memory leak).
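As a starting point for such a measurement, the sketch below times two tokenizers on the same input. It is not taken from either version verbatim: parseByErase mirrors the find/erase loop used in the question, while parseByScan and elapsedMs are hypothetical helpers introduced here. One suspect worth measuring is that erasing from the front of a std::string shifts all remaining characters, so on one long concatenated string the total work grows much faster than on many short lines. Whether that is the actual bottleneck is an assumption until the numbers confirm it.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Tokenizer in the style of the question: erase the consumed prefix each time.
std::vector<double> parseByErase(std::string s) {
    std::vector<double> out;
    std::size_t pos;
    while ((pos = s.find(' ')) != std::string::npos) {
        out.push_back(std::stod(s.substr(0, pos)));
        s.erase(0, pos + 1);  // shifts every remaining character to the front
    }
    return out;
}

// Same tokens, but only a start index moves; the string itself is not modified.
std::vector<double> parseByScan(const std::string& s) {
    std::vector<double> out;
    std::size_t start = 0;
    std::size_t pos;
    while ((pos = s.find(' ', start)) != std::string::npos) {
        out.push_back(std::stod(s.substr(start, pos - start)));
        start = pos + 1;
    }
    return out;
}

// Hypothetical helper: wall-clock time of one call to work(), in milliseconds.
template <typename F>
double elapsedMs(F&& work) {
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling elapsedMs([&] { parseByErase(input[i]); }) and the same with parseByScan on one of the concatenated matrix strings would show whether tokenizing dominates. Like the original loops, both functions only pick up tokens that are followed by a space.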

Related

Rabin-Karp algorithm in c++

I am trying to understand this implementation of the Rabin-Karp algorithm. d is the number of characters in the input alphabet, but if I use 0 or any other value instead of 20, nothing changes. Why is that?
// Rabin-Karp algorithm in C++
#include <string.h>
#include <iostream>
using namespace std;
#define d 20
void rabinKarp(char pattern[], char text[], int q) {
int m = strlen(pattern);
int n = strlen(text);
int i, j;
int p = 0;
int t = 0;
int h = 1;
for (i = 0; i < m - 1; i++)
h = (h * d) % q;
// Calculate hash value for pattern and text
for (i = 0; i < m; i++) {
p = (d * p + pattern[i]) % q;
t = (d * t + text[i]) % q;
}
// Find the match
for (i = 0; i <= n - m; i++) {
if (p == t) {
for (j = 0; j < m; j++) {
if (text[i + j] != pattern[j])
break;
}
if (j == m)
cout << "Pattern is found at position: " << i + 1 << endl;
}
if (i < n - m) {
t = (d * (t - text[i] * h) + text[i + m]) % q;
if (t < 0)
t = (t + q);
}
}
}
int main() {
// char text[] = "ABCCDXAEFGX";
char text[] = "QWERTYUIOPASDFGHJKLXQWERTYUIOPASDFGHJKLX";
char pattern[] = "KLXQW";
int q = 13;
rabinKarp(pattern, text, q);
}

I believe the short answer is that the lower d is, the more hash collisions you will have, but you verify each match anyway, so the output is not affected.
A bit more verbose:
First, let me modify your code to use more expressive variable names:
// Rabin-Karp algorithm in C++
#include <string.h>
#include <iostream>
using namespace std;
#define HASH_BASE 0
void rabinKarp(char pattern[], char text[], int inputBase) {
int patternLen = strlen(pattern);
int textLen = strlen(text);
int i, j; //predefined iterators
int patternHash = 0;
int textHash = 0;
int patternLenOut = 1;
for (i = 0; i < patternLen - 1; i++)
patternLenOut = (patternLenOut * HASH_BASE) % inputBase; // hash of pattern len
// Calculate hash value for pattern and text
for (i = 0; i < patternLen; i++) {
patternHash = (HASH_BASE * patternHash + pattern[i]) % inputBase;
textHash = (HASH_BASE * textHash + text[i]) % inputBase;
}
// Find the match
for (i = 0; i <= textLen - patternLen; i++) {
if (patternHash == textHash) {
for (j = 0; j < patternLen; j++) {
if (text[i + j] != pattern[j])
break;
}
if (j == patternLen)
cout << "Pattern is found at position: " << i + 1 << endl;
}
if (i < textLen - patternLen) {
textHash = (HASH_BASE * (textHash - text[i] * patternLenOut) + text[i + patternLen]) % inputBase;
if (textHash < 0)
textHash = (textHash + inputBase);
}
}
}
int main() {
// char text[] = "ABCCDXAEFGX";
char text[] = "QWEEERTYUIOPASDFGHJKLXQWERTYUIOPASDFGHJKLX";
char pattern[] = "EE";
int q = 13;
rabinKarp(pattern, text, q);
}
The easiest way to attack it is to set HASH_BASE (previously d) to zero and see where we can simplify. The rabinKarp function can then be reduced to:
void rabinKarp(char pattern[], char text[], int inputBase) {
int patternLen = strlen(pattern);
int textLen = strlen(text);
int i, j; //predefined iterators
int patternHash = 0;
int textHash = 0;
int patternLenOut = 0;
// Calculate hash value for pattern and text
for (i = 0; i < patternLen; i++) {
patternHash = (pattern[i]) % inputBase;
textHash = (text[i]) % inputBase;
}
// Find the match
for (i = 0; i <= textLen - patternLen; i++) {
if (patternHash == textHash) {
for (j = 0; j < patternLen; j++) {
if (text[i + j] != pattern[j])
break;
}
if (j == patternLen)
cout << "Pattern is found at position: " << i + 1 << endl;
}
if (i < textLen - patternLen) {
textHash = (text[i + patternLen]) % inputBase;
if (textHash < 0)
textHash = (textHash + inputBase);
}
}
}
Now you'll notice that each hash becomes just the last letter of the window mod some number (in your case 13, in my case 2). This is a bad hash, meaning many different windows map to the same number. However, in this portion of the code:
if (patternHash == textHash) {
for (j = 0; j < patternLen; j++) {
if (text[i + j] != pattern[j])
break;
}
if (j == patternLen)
cout << "Pattern is found at position: " << i + 1 << endl;
}
you explicitly check the match, letter by letter, if the hashes match. The worse your hash function is, the more often you will have false positives (which will mean a longer runtime for your function). There are more details, but I believe that directly answers your question. What might be interesting is to record false positives and see how the false positive rate increases as d and q decrease.
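The experiment suggested above can be sketched as follows. This is an illustrative variant, not the code from the question: HitStats and countHits are hypothetical names, the base and modulus are passed as parameters instead of the d macro and q, and the arithmetic uses long long to avoid overflow for larger values.

```cpp
#include <string>

// Counts of hash hits: true matches versus false positives.
struct HitStats {
    int matches = 0;   // windows that really equal the pattern
    int spurious = 0;  // hash was equal but the characters differ
};

HitStats countHits(const std::string& text, const std::string& pattern,
                   long long base, long long mod) {
    HitStats stats;
    const long long m = static_cast<long long>(pattern.size());
    const long long n = static_cast<long long>(text.size());
    if (m == 0 || n < m || mod <= 0) return stats;

    // h = base^(m-1) mod mod, the weight of the window's leading character.
    long long h = 1 % mod;
    for (long long i = 0; i < m - 1; ++i) h = (h * base) % mod;

    // Hash of the pattern and of the first window.
    long long p = 0, t = 0;
    for (long long i = 0; i < m; ++i) {
        p = (base * p + static_cast<unsigned char>(pattern[i])) % mod;
        t = (base * t + static_cast<unsigned char>(text[i])) % mod;
    }

    for (long long i = 0; i + m <= n; ++i) {
        if (p == t) {
            // Verify character by character, as the original code does.
            if (text.compare(i, m, pattern) == 0) ++stats.matches;
            else ++stats.spurious;
        }
        if (i + m < n) {
            // Roll the window: drop text[i], append text[i + m].
            t = (base * (t - static_cast<unsigned char>(text[i]) * h)
                 + static_cast<unsigned char>(text[i + m])) % mod;
            if (t < 0) t += mod;
        }
    }
    return stats;
}
```

With the text and pattern from the question, base 0 and modulus 13 still report the single real match, plus two spurious hits at the windows that end in 'J', because 'J' and 'W' leave the same remainder mod 13. Running countHits over a range of bases and moduli shows how the spurious count changes while the match count stays the same.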

C++ - Output always pushes to the most left?

The fiboEncoding() function below reads an integer and returns its Fibonacci encoding.
When I test it in the main function, the result always gets pushed to the far left of the output. How can I solve this problem? What did I do wrong to cause it?
#include <iostream>
#include <vector>
using namespace std;
string fiboEncoding(int n) {
string word;
int fib[1000];
fib[0] = 1;
fib[1] = 2;
int i = 0;
for(i = 2; fib[i-1] <= n; i++) {
fib[i] = fib[i-1] + fib[i-2];
}
int r = i - 2;
int index = r;
vector<char> v(r+3);
while(n > 0) {
v[index] = '1';
n = n - fib[index];
index = index - 1;
while (index >= 0 && fib[index] > n) {
v[index] = '0';
index = index - 1;
}
}
v[r + 1] = '1';
for (int j = 0; j < v.size() - 1; j++) {
cout << v[j];
}
return word;
}
int main() {
int n;
string fibo;
cin >> n;
fibo = fiboEncoding(n);
cout << "code: " << fibo << endl;
}

Your function returns the empty string word. You forgot to copy the result into word.
What you see in the console is printed by the following loop inside the function, not by the cout in main.
for (int j = 0; j < v.size() - 1; j++) {
cout << v[j];
}
To fix it, replace the above for loop with
for (int j = 0; j < v.size() - 1; j++) {
//cout << v[j];
word += v[j];
}
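Putting that fix together, a version that builds and returns the string could look like the sketch below. It is not the original poster's code: it stores the Fibonacci numbers in a std::vector<long long> instead of a fixed int array, and it returns an empty string for n <= 0. Both are choices made here for illustration.

```cpp
#include <string>
#include <vector>

// Fibonacci coding of n: one bit per Fibonacci number 1, 2, 3, 5, ...
// (lowest first), followed by a terminating '1'.
std::string fiboEncoding(int n) {
    if (n <= 0) return "";  // assumption: only positive integers are encoded

    // Build Fibonacci numbers until the last one exceeds n.
    std::vector<long long> fib = {1, 2};
    while (fib.back() <= n) {
        fib.push_back(fib[fib.size() - 1] + fib[fib.size() - 2]);
    }

    // Greedy choice from the largest usable Fibonacci number downwards.
    std::string word(fib.size() - 1, '0');
    long long rest = n;
    for (int idx = static_cast<int>(fib.size()) - 2; idx >= 0; --idx) {
        if (fib[idx] <= rest) {
            word[idx] = '1';
            rest -= fib[idx];
        }
    }
    word += '1';  // terminating bit
    return word;  // nothing is printed here; the caller decides what to do
}
```

With this version, cout << "code: " << fiboEncoding(n) << endl; in main prints the label first and the code after it, because the function no longer writes to cout itself.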

How to resolve the segmentation fault in this code?

I am trying to make an infix calculator, and I am currently trying to convert numbers entered in a character array to double.
Here's my code:
#include <iostream>
#include<cmath>
using namespace std;
int main()
{
char exp[500];
const int SIZE = 100;
char temp[SIZE];
char op;
int strLen = 0, k, l, num = 0, fnum = 0;
double number = 0;
cin.getline(exp, 500,'\n');
int i = 0, j = 0, fpoint=0;
cout << exp;
for (i = 0, j = 0; exp[j] != 0; i++)
{
if (i % 2 == 0)
{
for (int m = 0; exp[m] != ','; m++) //stopped working
temp[m] = exp[m];
cout << temp;
for (k = 0; k < SIZE && temp[k] != 0; k++)
{
strLen = k;
if (temp[k] == '.')
fpoint = k + 1;
}
cout << fpoint<<endl;
cout << "strLen" << strLen;
for (k = 0; k <= fpoint; k++)
{
num = num + ((temp[fpoint - k] - '0') * pow(10, k));
}
for (k = fpoint + 1, l = 0; k <= strLen; k++, l++)
{
fnum = fnum + ((temp[strLen - l] - '0') * pow(10, l));
}
number = num + (fnum / pow(10, strLen - fpoint + 1));
cout << number;
j = j + strLen + 1;
}
else
{
char op = temp[j];
cout << op;
}
}
system("pause");
return 0;
}
Sample input:
2.5*3
It stops working and gives a segmentation fault at the marked position.

The loop for (int m = 0; exp[m] != ','; m++) //stopped working never terminates correctly if the input contains no ',' character: exp[m] != ',' stays true, so m runs past the end of the exp array, which triggers the segmentation fault.
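One way to avoid scanning for a delimiter that may never appear is to stop at the string terminator and let the standard library convert the digits. The sketch below is an assumption about how the parsing step could be organized, not the original poster's design: Token and tokenize are hypothetical names, and std::strtod replaces the hand-written digit arithmetic.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// One piece of the expression: either a number or a single operator character.
struct Token {
    bool isNumber;
    double value;  // meaningful when isNumber is true
    char op;       // meaningful when isNumber is false
};

std::vector<Token> tokenize(const std::string& expr) {
    std::vector<Token> tokens;
    const char* p = expr.c_str();
    while (*p != '\0') {  // the terminator bounds the loop, not a ','
        const unsigned char c = static_cast<unsigned char>(*p);
        if (std::isspace(c)) {
            ++p;
            continue;
        }
        if (std::isdigit(c) || c == '.') {
            char* end = nullptr;
            const double v = std::strtod(p, &end);
            if (end != p) {  // strtod consumed at least one character
                tokens.push_back({true, v, '\0'});
                p = end;
                continue;
            }
        }
        // Anything else is kept as a one-character operator token.
        tokens.push_back({false, 0.0, *p});
        ++p;
    }
    return tokens;
}
```

For the sample input 2.5*3 this yields three tokens: the number 2.5, the operator '*', and the number 3. Signs are treated as operators here, so unary minus would need separate handling.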

Matrix chain multiplication: print the sequence of the matrices

I have written code for matrix chain multiplication using dynamic programming in C++.
There is an error in the recursive call that prints the correct parenthesization of the matrices. I am taking input from a text file and writing the output to a text file. Please help.
#include <iostream>
#include <fstream>
#include <limits.h>
using namespace std;
int * MatrixChainOrder(int p[], int n)
{
static int m[100][100];
static int s[100][100];
int j, q;
int min = INT_MAX;
for (int i = 1; i <= n; i++)
m[i][i] = 0;
for (int L = 2; L <= n; L++) {
for (int i = 1; i <= n - L + 1; i++) {
j = i + L - 1;
m[i][j] = min;
for (int k = i; k <= j - 1; k++) {
q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
if (q < m[i][j]) {
m[i][j] = q;
s[i][j] = k;
}
}
}
}
return (*s);
}
void Print(int *s, int i, int j)
{
ofstream outfile("output.text");
if (i == j)
{
outfile << "a1";
}
else
outfile << "(";
{
Print(*s, i, s[i][j]);
Print(*s, s[i][j] + 1, j);
outfile << ")";
}
outfile.close();
}
int main()
{
int arr[100];
int num, i = 0;
ifstream infile("input.text");
while (infile)
{
infile >> num;
arr[i] = num;
i++;
}
i = i - 1;
infile.close();
Print(MatrixChainOrder(arr, i - 1), 0, i - 1);
return 0;
}

In C++ it is better to use std::vector for arrays. Aside from that, you can't mix pointers and arrays like that, because a plain int* parameter carries no information about the array dimensions.
For example this doesn't work:
int x[10][20];
void foo(int *ptr)
{
//the numbers 10 and 20 have not been passed through
}
But you can change it to
int x[10][20];
void foo(int arr[10][20])
{
//the numbers 10 and 20 are available
}
MatrixChainOrder is supposed to return a number, according to this link
int MatrixChainOrder(int s[100][100], int p[], int n)
{
int m[100][100];
for (int i = 0; i < 100; i++) m[i][i] = 0;
for (int i = 0; i < 100; i++) s[i][i] = 0;
int q = 0;
for (int L = 2; L <= n; L++) {
for (int i = 1; i <= n - L + 1; i++) {
int j = i + L - 1;
m[i][j] = INT_MAX;
for (int k = i; k <= j - 1; k++) {
q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
if (q < m[i][j]) {
m[i][j] = q;
s[i][j] = k;
}
}
}
}
return m[1][n]; // minimum cost for the whole chain; q only holds the last candidate tried
}
int main()
{
int arr[] = { 40, 20, 30, 10, 30 };
int array_size = sizeof(arr) / sizeof(int);
int n = array_size - 1;
int s[100][100];
int minimum = MatrixChainOrder(s, arr, n);
printf("{ 40, 20, 30, 10, 30 } should result in 26000 : %d\n", minimum);
return 0;
}
Likewise you can change your Print function
void Print(int s[100][100], int i, int j)
{
if (i < 0 || i >= 100 || j < 0 || j >= 100)
{
cout << "array bound error\n";
}
//safely access s[i][j] ...
}
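The std::vector suggestion can be combined with the printing step in one sketch. The names ChainResult, parenthesize and matrixChainOrder (taking a vector) are introduced here for illustration, and the matrices are labelled A1, A2, ... as an assumption, since the original Print always writes "a1". The parenthesization is returned as a string instead of being written to a file, so the caller can open the output file once.

```cpp
#include <climits>
#include <string>
#include <vector>

struct ChainResult {
    long long cost;     // minimum number of scalar multiplications
    std::string order;  // fully parenthesized product
};

// Rebuild the parenthesization from the split table s (1-based indices).
std::string parenthesize(const std::vector<std::vector<int>>& s, int i, int j) {
    if (i == j) return "A" + std::to_string(i);
    return "(" + parenthesize(s, i, s[i][j]) + parenthesize(s, s[i][j] + 1, j) + ")";
}

// p holds the dimensions: matrix Ai is p[i-1] x p[i].
ChainResult matrixChainOrder(const std::vector<int>& p) {
    const int n = static_cast<int>(p.size()) - 1;  // number of matrices
    if (n < 1) return {0, ""};

    std::vector<std::vector<long long>> m(n + 1, std::vector<long long>(n + 1, 0));
    std::vector<std::vector<int>> s(n + 1, std::vector<int>(n + 1, 0));

    for (int len = 2; len <= n; ++len) {
        for (int i = 1; i + len - 1 <= n; ++i) {
            const int j = i + len - 1;
            m[i][j] = LLONG_MAX;
            for (int k = i; k < j; ++k) {
                const long long q =
                    m[i][k] + m[k + 1][j] + 1LL * p[i - 1] * p[k] * p[j];
                if (q < m[i][j]) {
                    m[i][j] = q;
                    s[i][j] = k;  // best split point for Ai..Aj
                }
            }
        }
    }
    return {m[1][n], parenthesize(s, 1, n)};
}
```

For the dimensions { 40, 20, 30, 10, 30 } this gives a cost of 26000 and the order ((A1(A2A3))A4). Because the tables are vectors sized from the input, no fixed 100 x 100 bound is needed.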

Structure use in a function C++

I want to create a function that returns a structure, then call that function in a loop. The idea is to calculate the maximum sum, with its start point and end point, for a set of n numbers (1000 in this case), and to do this for each of 10 lines in a text file.
struct triple
{
float Max;
int sp;
int ep;
};
triple Max_line(string linename, int n)
{
int k, i, stp, enp, j;
triple Result;
float Maxi;
float T[n];
string s;
stringstream ss(linename);
j = 0;
Maxi = 0;
stp = 0;
enp = 0;
while (getline(ss, s, ',')and k < n)
{
T[j] = atof(s.c_str());
j++;
}
for (i = 0; i < n - 1; i++)
{
int k;
float S;
S = T[i];
k = i;
while (k < n - 1)
{
S = S + T[k + 1];
if (S > Maxi)
{
Maxi = S;
stp = i + 1;
enp = k + 2;
}
k++;
}
}
Result = { Maxi, stp, enp }
return Result;
}
int main(int argc, char *argv[]) {
int i, j;
triple fin;
fstream myfile("1000.txt"); //extract data from a file containing 10 lines each has 1000 different numbers
string a, b;
getline(myfile, a); //skip the first line
for (i = 0; i < 10; i++)
{
getline(myfile, b);
fin = Max_line(b, 1000);
cout << fin.Max << ";" << fin.sp << ";" << fin.ep << endl;
}
return 0;
}
When I printed the results inside the Max_line function it gave me the right values, I don't understand why it's not working inside the for loop.
Can anyone help me with that please?
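For comparison, here is a minimal sketch of a function that parses one comma-separated line and returns a struct by value. It is not a diagnosis of the code above: Triple and maxLine are hypothetical names, the values go into a std::vector instead of a variable-length array, every loop variable is initialized before use, and single-element ranges are allowed, which is an assumption about the intended result. It also assumes every field is a well-formed number.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Triple {
    float max;  // largest sum found
    int sp;     // 1-based start position, 0 if the line had no numbers
    int ep;     // 1-based end position, 0 if the line had no numbers
};

Triple maxLine(const std::string& line) {
    // Split the line on commas and convert each field.
    std::vector<float> values;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) {
        if (!field.empty()) values.push_back(std::stof(field));
    }

    // Try every contiguous range [i, k] and keep the one with the largest sum.
    Triple best{0.0f, 0, 0};
    for (std::size_t i = 0; i < values.size(); ++i) {
        float sum = 0.0f;
        for (std::size_t k = i; k < values.size(); ++k) {
            sum += values[k];
            if (best.sp == 0 || sum > best.max) {
                best = {sum, static_cast<int>(i) + 1, static_cast<int>(k) + 1};
            }
        }
    }
    return best;
}
```

In main, fin = maxLine(b); inside the loop over the lines would then receive a fresh result for each line.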

