How do I read a file faster in C++?

I am trying to write a program that takes a text file of words and a set of letters, and outputs the longest word(s) that can be made from the letters provided.
This is the code:
#include <algorithm>
#include <fstream>
#include <iostream>
#include <unordered_set>
#include <cstring>
#include <chrono>
using namespace std;
using namespace std::chrono;
bool can_make (string word, unordered_set<char> letters);
int main (int argc, char* argv[]) {
fstream dict_file(argv[1]);
unordered_set<char> letters;
unordered_set<string> words;
string word;
size_t max_len = 0;
for (int i = 2; i < argc; i++) {
for (int j = 0; j < strlen(argv[i]); j++) {
letters.insert(argv[i][j]);
}
}
while (getline(dict_file, word)) {
if (word.length() > max_len && can_make(word, letters)) {
words.clear();
max_len = word.length();
words.insert(word);
} else if (word.length() == max_len && can_make(word, letters)) {
words.insert(word);
}
}
dict_file.close();
auto start = high_resolution_clock::now();
cout << "Words: ";
for (string word : words) {
cout << word << " ";
}
cout << "\n" << endl;
auto stop = high_resolution_clock::now();
auto duration = duration_cast<microseconds>(stop - start);
cout << "Time elapsed: " << (duration.count()) << " microseconds" << endl;
cout << endl;
return 0;
}
bool sort_func (string first, string second) {
return first.length() > second.length();
}
bool can_make (string word, unordered_set<char> l) {
for (char i : word) {
if (l.find(i) == l.end()) return false;
}
return true;
}
I want to make the code as fast as possible, and the only thing that is taking a long time is the file reading. Is there a way to read a file faster?

Related

How to use function definitions to open, read, and close a file

We are asked to open a text file that contains a sentence, go through all the letters and whitespace in the file, and count how many of each ASCII character there are.
When all of my code was in the main function it worked fine. I just can't figure out how to call my functions successfully from main. The output should look like:
15 words
6 a
3 d
6 e
3 g
3 h
3 i
15 l
6 o
6 r
3 s
3 t
3 w
I got this output when everything was in the main function, so now it's just a matter of getting the same result with my function definitions.
My Code:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
//prototypes
//int openFile();
void readFile(ifstream &in);
void closeFile(ifstream &in);
//definitions
int openFile()
{
ifstream in;
in.open("word_data.txt");
}
void readFile(ifstream &in)
{
//store the frequency of the letters
int letters[128];
//declare variables
char let;
int wordCount = 0;
//for loop to initialize all counts to zero
for (int i = 0; i < 128; i++)
{
letters[i] = 0;
}
//get letters until we reach end of file
//whitespace = wordCount++;
let = in.get();
while (let != EOF)
{
if (let == ' ')
wordCount++;
//change to lowercase
let = tolower(let);
letters[let]++;
let = in.get();
}
//output
//num words
cout << wordCount + 1 << " words" << endl;
//count how many of each letter there are & print to screen in alphabetical order
for (char let = 'a'; let <= 'z'; let++)
{
if (letters[let] != 0)
{
cout << letters[let] << " "<< let <<endl;
}
}
}
void closeFile(ifstream &in)
{
in.close();
}
int main()
{
openFile();
readFile(in);
closeFile(in);
return 0;
}
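The problem is with your openFile() function. It only creates a local ifstream, which is destroyed when the function returns; it does not open an ifstream that the other functions can access. It is also declared to return int but returns nothing, and main() refers to a variable named in that it never declares.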
Try this instead:
#include <iostream>
#include <fstream>
#include <string>
#include <cctype>
using namespace std;
//prototypes
void openFile(ifstream &in);
void readFile(ifstream &in);
void closeFile(ifstream &in);
//definitions
void openFile(ifstream &in)
{
in.open("word_data.txt");
}
void readFile(ifstream &in)
{
//store the frequency of the letters
int letters[128] = {};
//declare variables
char ch;
int wordCount = 0;
//get letters until we reach end of file
while (in.get(ch))
{
if (ch == ' ')
wordCount++;
//change to lowercase
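ch = static_cast<char>(tolower(static_cast<unsigned char>(ch)));
//skip non-ASCII bytes so the index stays inside letters[]
if (static_cast<unsigned char>(ch) < 128)
letters[static_cast<unsigned char>(ch)]++;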
}
//output
//num words
cout << wordCount + 1 << " words" << endl;
//count how many of each letter there are & print to screen in alphabetical order
for (ch = 'a'; ch <= 'z'; ch++)
{
if (letters[ch] != 0)
{
cout << letters[ch] << " " << ch <<endl;
}
}
}
void closeFile(ifstream &in)
{
in.close();
}
int main()
{
ifstream in;
openFile(in);
readFile(in);
closeFile(in);
return 0;
}
With that said, you might consider using a std::map to track your frequencies rather than an int[] array, and using operator>> to read whole words at a time:
#include <map>
...
void readFile(ifstream &in)
{
//store the frequency of the letters
map<char, int> letters;
//declare variables
string word;
int wordCount = 0;
//get letters until we reach end of file
while (in >> word)
{
++wordCount;
//for(size_t idx = 0; idx < word.size(); ++idx)
//{
// char ch = word[idx];
for(char ch : word)
{
//change to lowercase
ch = static_cast<char>(tolower(static_cast<unsigned char>(ch)));
if (ch >= 'a' && ch <= 'z')
letters[ch]++;
}
}
//output
//num words
cout << wordCount << " words" << endl;
//count how many of each letter there are & print to screen in alphabetical order
//for (map<char, int>::iterator iter = letters.begin(); iter != letters.end(); ++iter)
//{
// cout << iter->second << " " << iter->first << endl;
//}
for (auto &elem : letters)
{
cout << elem.second << " " << elem.first << endl;
}
}

Recursive nested algorithm

I am trying to create an algorithm for brute-force cracking an MD5 hash.
My goal is to measure how the running time changes when the work is split across threads on the CPU and, optionally, across graphics cards in a compute cluster.
I got stuck writing the algorithm.
The input should be a string. I need as many nested for loops as the string has characters.
Written out statically for 3 characters, it looks like this:
#include <iostream>
#include <string>
#include "md5.h"
using namespace std;
int main()
{
string imput = "slv";
cout << "imput string: "<< imput << endl;
cout << "MD5 HASH: "<< wantedHash << endl;
do
{
cout << '\n' << "Enable BruteForce Craker";
} while (cin.get() != '\n');
string s;
for(int i=0; i != 256; i++)
{
for(int j=0; j != 256; j++)
{
for(int k=0; k != 256; k++)
{
string s = md5(string(1,(char)i) + string(1,(char)j) + string(1,(char)k));
serchCounter++;
if(s == wantedHash)
{
cout << "Find: " << string(1,(char)i) + string(1,(char)j) + string(1,(char)k) << endl;
cout << "Count TestedHash: " << serchCounter << endl;
return 0;
}
}
}
}
return 0;
}
My idea is something like this:
#include <iostream>
#include <string>
#include "md5.h"
using namespace std;
string imput = "s";
string wantedHash = md5(imput);
double serchCounter = 0;
int bruteForse(int longString, string s)
{
for(int i=0; i != 256; i++)
{
string s = md5(string(1,(char)i));
serchCounter++;
if(s == wantedHash)
{
cout << "Find: " << string(1,(char)i);
cout << "Count TestedHash: " << serchCounter << endl;
return 0;
}
}
if(longString > 1) bruteForse(--longString, s);
return 0;
}
int main()
{
cout << "imput string: "<< imput << endl;
cout << "MD5 HASH: "<< wantedHash << endl;
bruteForse(imput.length(),imput);
}
I would do:
bool increase(std::string& s)
{
for (auto rit = s.rbegin(); rit != s.rend(); ++rit) {
auto& c = *rit;
if (c == -1) {
c = 0;
continue;
} else if (c == 127) {
c = -128;
} else {
++c;
}
return true;
}
return false;
}
void bruteForce(std::size_t size, const std::string& wantedHash)
{
// start from `size` zero bytes and walk through every combination
std::string s;
s.resize(size);
do {
if (md5(s) == wantedHash) {
std::cout << "Find: " << s << std::endl;
}
} while (increase(s));
}
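The increment-with-carry idea in the answer above relies on char being signed. A minimal sketch of the same technique that goes through unsigned char instead, so it does not depend on that, could look like the following. The names next_candidate and count_candidates are hypothetical, and the MD5 comparison is left out so the block stands alone.

```cpp
#include <cstddef>
#include <string>

// Advance s to the next candidate, treating each char as a base-256 digit.
// Returns false once every position has wrapped back to 0 (all candidates seen).
bool next_candidate(std::string& s) {
    for (auto rit = s.rbegin(); rit != s.rend(); ++rit) {
        unsigned char byte = static_cast<unsigned char>(*rit);
        if (byte == 255) {
            *rit = '\0';  // this digit overflows: reset it and carry to the left
            continue;
        }
        *rit = static_cast<char>(byte + 1);
        return true;
    }
    return false;
}

// Count how many candidates of the given length the loop visits.
// In the real search the loop body would compare md5(s) with the wanted hash.
std::size_t count_candidates(std::size_t length) {
    std::string s(length, '\0');
    std::size_t visited = 0;
    do {
        ++visited;
    } while (next_candidate(s));
    return visited;
}
```

Because the state is just the string, the search space can be split between workers by giving each one a different starting string and a number of steps to take.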

How to read a 2d array from a file without knowing its length in C++?

Like the title says I'm trying to read an unknown number of integers from a file and place them in a 2d array.
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
fstream f;int i,j,n,a[20][20];char ch;
i=0;j=0;n=0;
f.open("array.txt", ios::in);
while(!f.eof())
{
i++;
n++;
do
{
f>>a[i][j];
j++;
f>>ch;
}
while(ch!='\n');
}
for(i=1;i<=n;i++)
{
for(j=1;j<=n;j++)
cout<<a[i][j]<<endl;
cout<<endl;
}
return 0;
}
and my "array.txt" file :
1 1 1
2 2 2
3 3 3
After compiling the program, it prints this
As your input file is line oriented, you should use getline (the C++ equivalent of C's fgets) to read a line, then an istringstream to parse the line into integers. And since you do not know the size in advance, you should use vectors, and check that all lines have the same size and that the number of lines equals the number of columns.
Last but not least, you should test for eof immediately after a read, not at the beginning of the loop.
The code becomes:
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <sstream>
using namespace std;
int main()
{
fstream f;
int i=0, j=0, n=0;
string line;
vector<vector<int>> a;
f.open("array.txt", ios::in);
for(;;)
{
std::getline(f, line);
if (! f) break; // test eof after read
a.push_back(vector<int>());
std::istringstream fline(line);
j = 0;
for(;;) {
int val;
fline >> val;
if (!fline) break;
a[i].push_back(val);
j++;
}
i++;
if (n == 0) n = j;
else if (n != j) {
cerr << "Error line " << i << " - " << j << " values instead of " << n << endl;
}
}
if (i != n) {
cerr << "Error " << i << " lines instead of " << n << endl;
}
for(vector<vector<int>>::const_iterator it = a.begin(); it != a.end(); it++) {
for (vector<int>::const_iterator jt = it->begin(); jt != it->end(); jt++) {
cout << " " << *jt;
}
cout << endl;
}
return 0;
}
You may want to look into using a vector so you can have a dynamic array.
Try:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;
int main() {
fstream f;
int i, j, n, a[20][20];
string buf;
i = 0;
j = 0;
n = 0;
f.open("array.txt", ios::in);
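// let the read itself control the loop, so a last line without '\n' is kept
while (i < 20 && getline(f, buf)) {
stringstream buf_stream(buf);
j = 0;
// stop at the first failed extraction instead of testing eof()
while (j < 20 && buf_stream >> a[i][j]) {
j++;
}
i++;
n++;
}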
for (i = 0; i < n; i++) {
for (j = 0; j < n; j++) cout << a[i][j] << " ";
cout << endl;
}
return 0;
}
Also, if you really want to read arbitrarily large arrays, then you should use std::vector or some such other container, not raw arrays.
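The getline plus istringstream approach recommended in both answers can also be packaged as a function that takes any std::istream, which makes it easy to try without a file. This is a minimal sketch; the names read_rows and is_square are hypothetical, and skipping blank lines is an assumption about the desired behavior.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read whitespace-separated integers, one row per input line.
// Blank lines are skipped; rows may have different lengths.
std::vector<std::vector<int>> read_rows(std::istream& in) {
    std::vector<std::vector<int>> rows;
    std::string line;
    while (std::getline(in, line)) {  // the loop ends when a read fails
        std::istringstream fields(line);
        std::vector<int> row;
        int value;
        while (fields >> value) row.push_back(value);
        if (!row.empty()) rows.push_back(row);
    }
    return rows;
}

// True if every row has as many columns as there are rows.
bool is_square(const std::vector<std::vector<int>>& rows) {
    for (const auto& row : rows) {
        if (row.size() != rows.size()) return false;
    }
    return true;
}
```

With a file, the call would be read_rows(f) on an opened std::ifstream; the vectors grow as needed, so no 20 by 20 limit applies.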

Lowercase string characters and add a _ in front of each converted capital letter

I have one more question. I want to convert every capital letter to lowercase and add a _ in front of it; the first letter cannot be a capital. I can't figure out how to do it. Example:
input: loLollL, output: lo_loll_l
I also want it to work in reverse: input: lo_loll_l, output: loLollL
The code is here:
#include <iostream>
#include <algorithm>
using namespace std;
int main ()
{
const int max = 100;
string slovo;
int pocet_r;
cout << "Enter the number of tasks:" << endl;
cin >> pocet_r;
if(pocet_r >= 1 && pocet_r <=100)
{
// loop that enforces the allowed number of chars
for (int i = 0; i <pocet_r; i++)
{
cout << "Task " << i+1 << ":" << endl;
cin >> slovo;
if(slovo.size() > max)
{
cout << "the word must have at least 1 and at most 100 characters" << endl;
}
while( slovo.size() > max)
{
cin >> slovo;
}
for (int i=0; i <= slovo.size(); i++)
{
int s = slovo[i];
while (s > 'A' && s <= 'Z')
{
if(s<='Z' && s>='A'){
return s-('Z'-'_z');
}else{
cout << "error";
}
}
}
cout << slovo[i] << endl;
}
}else{
cout << "At least 1 and at most 100 tasks" << endl;
}
system("pause");
}
EDIT:
for (int i=0; i <= slovo.size(); i++)
{
while (slovo[i] >= 'A' && slovo[i] <= 'Z')
{
string s = transform(slovo[i]);
cout << s << endl;
s = untransform(s);
cout << s << endl;
}
}
This should work:
#include <string>
#include <cctype>
#include <iostream>
using namespace std;
string
transform(const string& s)
{
const size_t n = s.size();
string t;
for (size_t i = 0; i < n; ++i)
{
const char c = s[i];
if (isupper(static_cast<unsigned char>(c)))
{
t.push_back('_');
}
t.push_back(static_cast<char>(tolower(static_cast<unsigned char>(c))));
}
return t;
}
string
untransform(const string& s)
{
string t;
const size_t n = s.size();
size_t i = 0;
while (i < n)
{
char c = s[i++];
if (c != '_')
{
t.push_back(c);
continue;
}
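// a trailing '_' has no letter after it, so stop there
if (i >= n) break;
c = s[i++];
t.push_back(static_cast<char>(toupper(static_cast<unsigned char>(c))));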
}
return t;
}
int
main()
{
string s = transform("loLollL");
cout << s << endl;
s = untransform(s);
cout << s << endl;
}

C++ "Count the number of collisions at each slot in the hash table"

I'm supposed to create a dictionary as a hash table with linked lists to spell check a text document. I read in the file "words.txt" to create the dictionary. I also have to count and display the number of collisions at each slot in the hash table when I load the dictionary "words.txt".
I'm given the source code for the HashTable class with linked lists, as follows:
hashtable.cpp (it has #include "listtools.cpp" since it's using templates)
#include <iostream>
#include <string>
#include "listtools.h"
#include "listtools.cpp"
#include "hashtable.h"
using LinkedListSavitch::Node;
using LinkedListSavitch::search;
using LinkedListSavitch::headInsert;
using namespace std;
#define HASH_WEIGHT 31
namespace HashTableSavitch
{
HashTable::HashTable()
{
for (int i = 0; i < SIZE; i++)
{
hashArray[i] = NULL;
//array for collisons
collisionArray[i] = 0;
}
}
HashTable::~HashTable()
{
for (int i=0; i<SIZE; i++)
{
Node<string> *next = hashArray[i];
while (next != NULL)
{
Node<string> *discard = next;
next = next->getLink( );
delete discard;
}
}
}
unsigned int HashTable::computeHash(string s) const
{
unsigned int hash = 0;
for (unsigned int i = 0; i < s.length( ); i++)
{
hash = HASH_WEIGHT * hash + s[i];
}
return hash % SIZE;
}
bool HashTable::containsString(string target) const
{
int hash = this->computeHash(target);
Node<string>* result = search(hashArray[hash], target);
if (result == NULL)
return false;
else
return true;
}
void HashTable::put(string s)
{
int count = 0;
int hash = computeHash(s);
if (search(hashArray[hash], s) == NULL)
{
// Only add the target if it's not in the list
headInsert(hashArray[hash], s);
}
else
{
collisionArray[hash]++;
}
}
void HashTable::printArray()
{
int number;
for(int i = 0; i < SIZE; i++)
{
number = collisionArray[i];
cout << "----------------\n";
cout << "index = " << i << endl;
cout << "Collisions = " << number << endl;
cout << "----------------\n";
}
}
} // HashTableSavitch
My main.cpp file:
#include <iostream>
#include <fstream>
#include <cctype>
#include <algorithm>
#include <cstring>
#include <string>
#include "hashtable.h"
using namespace std;
using HashTableSavitch::HashTable;
void upToLow(string & str);
void removePunct(string & str);
int main()
{
HashTable h;
string currWord;
string word;
int countMisspelled = 0;
int countCorrect = 0;
//Get input from words.txt
ifstream dictionary("words.txt");
//File checking
if (dictionary.fail())
{
cout << "File does not exist" << endl;
cout << "Exit program" << endl;
}
//Create the dictionary as a hash table
while(dictionary >> currWord)
{
h.put(currWord);
}
dictionary.close();
//display collisions
h.printArray();
//Get input from gettysburg_address.txt
ifstream input("gettysburg_address.txt");
//File checking
if (input.fail())
{
cout << "File does not exist" << endl;
cout << "Exit program" << endl;
}
//Spell check gettysburg_address.txt
cout << "Misspelled words : " << endl;
cout << endl;
//If a word is not in the dictionary assume misspelled
while(input >> word)
{
removePunct(word);
upToLow(word);
if(h.containsString(word) == false)
{
countMisspelled++; // Increment misspelled words count
cout << word << " ";
if(countMisspelled % 20 == 0) // Display misspelled words 20 per line
{
cout << endl;
}
}
else
{
countCorrect++; // Increment correct words count
}
}
input.close();
cout << endl;
cout << endl;
cout << "Number of misspelled words : " << countMisspelled << endl;
cout << "Number of correct words : " << countCorrect << endl;
return 0;
}
/*Function to convert uppercase letters to lowercase*/
void upToLow(string & str)
{
for (unsigned int i = 0; i < strlen(str.c_str()); i++)
if (str[i] >= 0x41 && str[i] <= 0x5A)
str[i] = str[i] + 0x20;
}
/*Function to remove punctuation from string*/
void removePunct(string & str)
{
str.erase(remove_if(str.begin(), str.end(), static_cast<int(*)(int)>(&ispunct)),str.end());
}
Is there a simple way to count the number of collisions at each slot when loading in "words.txt" ? If I implement a count variable in the "put" function I can get the total number of collisions, but I'm not quite sure how to count/display the number of collisions at each slot of the hash table. Any help/tips is appreciated.
EDIT:
I followed Joe's advice, and now I'm wondering how to display the number of collisions at each slot. I wrote a void function to do just that, but it reports 0 collisions for every slot. Does anyone know what I should do?
Probably the simplest way is to declare an array in an appropriate place
int collisionArray[SIZE];
initialize it to 0 in HashTable::HashTable()
HashTable::HashTable()
{
for (int i = 0; i < SIZE; i++)
{
hashArray[i] = NULL;
collisionArray[i] = 0;
}
}
then increment the appropriate element when a collision is found
void HashTable::put(string s)
{
int hash = computeHash(s);
if (search(hashArray[hash], s) == NULL)
{
// A collision is a new word landing in a slot that already holds one
if (hashArray[hash] != NULL)
collisionArray[hash]++;
// Only add the target if it's not in the list
headInsert(hashArray[hash], s);
}
}
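Note that in the edited put() from the question, the increment sits in the else branch, which only runs when the word is already in the list; a dictionary without duplicate words therefore reports 0 everywhere. The per-slot counting can be tried in isolation with a minimal sketch like the one below. It assumes std::forward_list in place of the Savitch linked list, the class name CollisionCountingTable is hypothetical, and it treats a collision as a new word landing in a slot that already holds at least one word.

```cpp
#include <cstddef>
#include <forward_list>
#include <string>
#include <vector>

// Minimal chained hash table that records collisions per slot.
// std::forward_list stands in for the Savitch linked list (an assumption).
class CollisionCountingTable {
public:
    // slots must be greater than 0.
    explicit CollisionCountingTable(std::size_t slots)
        : buckets_(slots), collisions_(slots, 0) {}

    void put(const std::string& s) {
        std::size_t slot = hash(s);
        auto& bucket = buckets_[slot];
        for (const auto& existing : bucket) {
            if (existing == s) return;  // duplicate word: not a collision
        }
        if (!bucket.empty()) ++collisions_[slot];  // slot was already occupied
        bucket.push_front(s);
    }

    int collisions_at(std::size_t slot) const { return collisions_[slot]; }

    // Polynomial hash with weight 31, following computeHash in the question.
    std::size_t hash(const std::string& s) const {
        unsigned int h = 0;
        for (unsigned char c : s) h = 31 * h + c;
        return h % buckets_.size();
    }

private:
    std::vector<std::forward_list<std::string>> buckets_;
    std::vector<int> collisions_;
};
```

Printing the result is then a loop over the slots that calls collisions_at(i), much like printArray() in the question.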