I was assigned a project to read a large text file of different articles and print out the top 10 most frequent words. I managed to remove all unnecessary information from the file and store the result in a string. For simplicity, I put a small part of the list of unigrams and their frequencies in a text file (text2.txt). This is essentially the format in which all the unigrams are written: "(unigram)":(its frequency within that article),"(another unigram)":(its frequency within that article), and so on.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
void something(string input, int sizeOfDoc, string unigrams[], int freq[]){
char found;
char last_found2;
char last_found;
char next_found;
bool b = false;
int pos = 0;
string word;
unsigned int u=0;
unsigned int f=0;
for(int x = 0; x<sizeOfDoc; x++){
found = input.at(x);
if(x==sizeOfDoc){
next_found= '*';
}else{next_found = input.at(x+1);}
if(x==0){
last_found = '*';
}else{last_found = input.at(x-1);}
if(x==0 || x==1){
last_found2 = '*';
}else{last_found2 = input.at(x-2);}
if((last_found2 >= '1' && last_found2 <= '9') && last_found == ',' && found == '\"' && //
(next_found >='a' && next_found <='z' || next_found >='A' && next_found <='Z')){ //finds first letter of unigram
word = next_found; //a
}
else if((found >='a' && found <='z' || found >='A' && found <='Z') && //
(last_found >='a' && last_found <='z' || last_found >='A' && last_found <='Z')){ //finds middle of unigram
word += found; //b abc word = "abc"
}
else if((last_found2 >='a' && last_found2 <='z' || last_found2 >='A' && last_found2 <='Z') //
&& last_found =='"' && found == ':' && (next_found >= '1' && next_found <= '9')){ //finds frequency
word += last_found2; //adds last letter to word
for(int i=0; i <= u; i++){ //
if(word == unigrams[i]){ //
b = true; //
pos = i; //checks for duplicate, if found, returns position to pos
}
}
if (b==false){
unigrams[u] = word; //adds word to unigrams array
freq[f] = next_found; //adds frequency to freq array
}
else if(b == true){ //
freq[pos]=freq[f]; //increments frequency if duplicate found
}
f++;
u++;
}
}
}
int main() {
string unigrams[1279];
int freq[1279];
string s;
std::string newstring;
ifstream file; //
file.open("text2.txt"); //
while (!file.eof()) { //
getline(file, s); //
newstring += s + "\n"; //reads original text and inputs it into newstring
}file.close();
something(newstring, newstring.size(), unigrams, freq); //calls function
for (int x = 0; x <= 1278; x++) { //
cout << unigrams[x]; //prints unigrams to console
}
}
When I run the code, it throws:
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::at: __n (which is 1278) >= this->size() (which is 1278)
I have tried using vectors instead of arrays, with emplace_back, push_back, and direct assignment, all to no avail. There is much more to work on in the project, submission is tomorrow, and the more I progress the more complex it gets ):
This is the text I'm using:
{"others":1,"air":1,"networks,":1,"conventional":1,"Environ.":1,"AHP":1,"Osterwalder,":1,"la":8,"Non-motorized":1,"(SHE).":1,"beer":1,"[7,8]":1,"provider":1,"futurible":1,"13(4),":1,"Agency":1,"24.":1,"concern":1,"eight":1,"facilitated":1,"2009":1,"review":1,"Car,":4,"viability.":1,"cycles":1,"contribute":1,"results,":1,"design":24,"CSIROPub.9780643094529,":1,"ecodesign":2,"reserves":1,"follow:":1,"sp\u00e9cifique":2,"(2017)[20,21,22].":1,"pp.":4,"Costs":1,"diversity":1,"In-depth":1,"Both":1,"\u2013":6,"Grenoble":1,"realistic":2,"Largepurchasecost:":1,"navale":1,"Est,":1,"petits":1,"Support":1,"eliminated":1,"relationship,":1,"progressed,":1,"Imnm":1,"significantly":2,"76":1,"Technical":1,"Tertre,":1,"(Fig.":1,"Freeman,":1,"(1.28>Ib":1,"IT":2,"defined":1,"maturity":1,"experimentation.":1,"review,":2,"interests":1,"tools":1,"Firm":3,"opportunities.":1,"behaviour":1,"2014":1,"fili\u00e8res":1,"feedback":3,"interviews":1,"60":1,"187":1,"d\u00e9fi":1,"strategies":3,"did":1,"Techniques":8,"In":8,"have":5,"issues.":1,"useful":1,"se":1,"QC,":1,"vision":1,"regarding":1,"take":4,"Brezet,":1,"such":2,"circulaire":1,"software":1,"parameter":9,"appliances":3,"wedging":1,"Prod.":6,"domains\u201d.":1,"typologie,":1,"D\u00e9veloppement":3,"real":1,"desACVcomparatives":1}}
It's a sample of one list of unigrams and their frequencies. The formatting is horrible: as you can see, I had to create a million conditions to extract the words without running into issues caused by things like words with quotation marks inside them. The original file has 1500 publications and this text is just a small part of one of them. Thank you for reading this, at least.
Your bug is in the lines below: the largest valid argument to input.at() is sizeOfDoc - 1.
for(int x = 0; x<sizeOfDoc; x++){
found = input.at(x);
if(x==sizeOfDoc){
next_found= '*';
}else{next_found = input.at(x+1);}
If you consider the case where x == sizeOfDoc - 1, the last line shown will result in calling at with an argument of sizeOfDoc, which is too high. To fix this, change
if(x==sizeOfDoc){
to
if(x==sizeOfDoc - 1){
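The same boundary check is needed for every neighbouring character, so it may be easier to put it in one place. Below is a minimal sketch of that idea; the helper name charAtOr, the function countQuoteColon, and the '*' placeholder are illustrative assumptions, not part of the original code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns text[index] when the index is valid,
// otherwise the fallback character.
char charAtOr(const std::string& text, std::size_t index, char fallback) {
    return index < text.size() ? text[index] : fallback;
}

// Illustrative scan: counts the positions where a '"' is directly
// followed by ':' (the end of a key in the "word":count format).
std::size_t countQuoteColon(const std::string& text) {
    std::size_t hits = 0;
    for (std::size_t x = 0; x < text.size(); ++x) {
        char found = text[x];
        char next_found = charAtOr(text, x + 1, '*'); // safe at the last index
        if (found == '"' && next_found == ':') {
            ++hits;
        }
    }
    return hits;
}
```

With a helper like this the loop body never calls at() with an index equal to the string's size, so the std::out_of_range exception cannot occur there.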
How do I extract the first full number in a string as an integer in C++?
For instance, given the string "thdfwrhwh456dfhdfh764",
I would need to pull out only the first number, 456, as an integer.
Thanks
Start by finding the first digit:
std::size_t pos = str.find_first_of("0123456789");
then check whether a digit was found:
if (pos != std::string::npos)
and then extract the tail of the string:
std::string tail = str.substr(pos);
and then extract the value:
int value = std::stoi(tail);
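Put together, the steps above might look like the sketch below. The function name firstNumber and the bool-plus-output-parameter interface are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical wrapper around the steps above. Returns true and stores
// the first number in `value`; returns false if the string has no digit.
// Note: std::stoi throws std::out_of_range if the number does not fit in an int.
bool firstNumber(const std::string& str, int& value) {
    std::size_t pos = str.find_first_of("0123456789");
    if (pos == std::string::npos) {
        return false; // no digit anywhere in the string
    }
    // std::stoi parses the leading digits and stops at the first non-digit.
    value = std::stoi(str.substr(pos));
    return true;
}
```

For the string from the question, firstNumber("thdfwrhwh456dfhdfh764", n) sets n to 456.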
Here is a good example of how you might go about reading only the first number of a string that appears:
const char string_c[] = "this is a number 67theaksjdhflkajsh 78";
std::string string_n;
bool exitable = false;
for (int i = 0; i < sizeof(string_c); i++)
{
char value = string_c[i];
if (value == '0' ||
value == '1' ||
value == '2' ||
value == '3' ||
value == '4' ||
value == '5' ||
value == '6' ||
value == '7' ||
value == '8' ||
value == '9')
{
string_n += string_c[i];
exitable = true;
} else if (exitable == true)
{
printf("break\n");
break;
}
}
printf("this is the number: %s ", string_n.c_str());
If you need the number as int then you can use the std::stoi() function.
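The ten separate comparisons can probably be replaced by std::isdigit. A minimal sketch of the same loop, assuming a std::string input; the function name firstDigitRun is made up for this example.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns the first consecutive run of digits in
// `text`, or an empty string if there is none.
std::string firstDigitRun(const std::string& text) {
    std::string digits;
    for (char ch : text) {
        // The cast avoids undefined behaviour for negative char values.
        if (std::isdigit(static_cast<unsigned char>(ch))) {
            digits += ch;
        } else if (!digits.empty()) {
            break; // the first number has ended
        }
    }
    return digits;
}
```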
I am trying to create a for loop that has a conditional statement which reads until an operation is found, ex. (+,-,/,*), but every time I try I get an error:
Unhandled exception at 0x7936F2F6 (ucrtbased.dll) in CIS310 Project 44.exe: An invalid parameter was passed to a function that considers invalid parameters fatal.
while (getline(infile, hex))
{
n = hex.length();//find the length of the line
for (i = 0; hex[i] != '/'||'+'||'-'||'*'; i++,++k) //loop to split the first hexadecimal number
h1 = h1 + hex[i];
for (i++; i < n - 1; i++) //loop to get the second hexadecimal number
h2 = h2 + hex[i];
n1 = convertDecimal(h1); //convert the first hexadecimal number to decimal
n2 = convertDecimal(h2);
Your condition hex[i] != '/'||'+'||'-'||'*' is malformed. C++ requires that you specify both sides of the comparison each time, and because the loop should continue only while the character is none of the operators, the comparisons must be joined with &&: hex[i] != '/' && hex[i] != '+' && ....
You have to repeat the comparison after every || (OR), like:
hex[i] != '/' || hex[i] != '+' || hex[i] != '-' || hex[i] != '*'
This is a similar code to what you wrote:
while(getline(file,line))
{
string firstPart = "";
unsigned int i;
//We can use the algorithm library to search for them but its ok
for(i=0;(line[i] != '+') || (line[i] != '-') || (line[i] != '*') || (line[i] != '/') || (line[i] != '\0');i++ );
firstPart = line.substr(0,i);
}
Now, if you try this, it will cause the same error (or at least a similar one). If we try to print every character inside the loop:
for(/*stuff*/)
cout << line[i];
Then you will notice that this becomes an infinite loop. The problem is that the condition is true whenever line[i] differs from at least one of +, -, * or /, and a single character can never equal all of them at once, so the condition is always true. Fix this by changing the || to &&.
I'll suppose that your file (named testfile.txt) has the content below:
0xAB+0xCD
0x11-0x03
Sample working code:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
int main()
{
ifstream file("testfile.txt");
///Don't forget to check if file has opened
if(!file.is_open())
{
cout << "File didn\'t open :(";
return 0;
}
string line;
while(getline(file,line))
{
string firstPart = "",secondPart = "";
char operation;
unsigned int i;
//We can use the algorithm library to search for them but its ok
for(i=0;(i < line.size()) && (line[i] != '+') && (line[i] != '-') && (line[i] != '*') && (line[i] != '/');i++ );
if(i == line.size()) continue; //no operator on this line, skip it
firstPart = line.substr(0,i);
operation = line[i];
secondPart = line.substr(i+1); //everything after the operator
}
file.close();
return 0;
}
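As the comment in the code hints, the search loop can also be replaced by a library call. Here is a minimal sketch using std::string::find_first_of; the Expression struct and the splitExpression name are assumptions for illustration only.

```cpp
#include <cstddef>
#include <string>

// Hypothetical container for the two operands and the operator.
struct Expression {
    std::string left;
    char op;
    std::string right;
};

// Splits a line such as "0xAB+0xCD" at the first operator character.
// Returns false if the line contains none of + - * /.
bool splitExpression(const std::string& line, Expression& out) {
    std::size_t pos = line.find_first_of("+-*/");
    if (pos == std::string::npos) {
        return false;
    }
    out.left = line.substr(0, pos);
    out.op = line[pos];
    out.right = line.substr(pos + 1); // everything after the operator
    return true;
}
```

find_first_of stops at the first character that matches any character in the set, which is the same "none of these" test as the chain of && comparisons.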
I'm working on a Caesar Cipher program for an assignment and I have the general understanding planned out, but my function for determining the decipher key is unnecessarily long and messy.
while(inFile().peek != EOF){
inFile.get(character);
if (character = 'a'|| 'A')
{ aCount++; }
else if (character = 'b' || 'B')
{ bCount++; }
so on and so on.
What way, if it's possible, can I turn this into an array?
You can use the following code:
int count [26] = {0};
while(inFile.peek() != EOF){
inFile.get(character);
if (int (character) >=65 && int (character) <=90) //'A'..'Z'
{ count [(int (character)) - 65] ++; }
else if (int (character) >=97 && int (character) <=122) //'a'..'z'
{ count [(int (character)) - 97] ++; }
}
P.S. This checks the ASCII value of each character and then increments the matching element of the array, with index 0 for A/a, index 1 for B/b, and so on.
Hope this helps...
P.S. - There was an error in your code: = is the assignment operator and == is the comparison operator. An if statement should check a condition, not assign a value, so always use == to test for equality. Also, 'a' || 'A' does not compare against both letters; write character == 'a' || character == 'A'.
You can use an array in the following manner
int letterCount['z' + 1] = {0}; //'z' is the highest letter code, so 'z' + 1 elements are needed to index it
while(inFile.peek() != EOF){
inFile.get(character);
if ((character >= 'A' && character <= 'Z') || (character >= 'a' && character <= 'z'))
letterCount[character]++;
}
You can also use a hashmap like this
#include <unordered_map>
std::unordered_map<char,int> charMap;
while(inFile.peek() != EOF){
inFile.get(character);
charMap[character]++; //operator[] creates a missing entry with the value 0 first
}
In case you do not know, a hash map works like an array whose index (the key) can be of any type you like, as long as a hash function and an equality comparison are available for it.
I'm fairly new to programming and I'm trying to get a function working that converts a string to an int. My idea with this function was to collect every number in the string and store it in another string, then convert it to an int.
The function is supposed to return the converted number, which should not be 0, but it always returns the value 0.
int getNumberFromString(int convertedNumber, string textToConvert)
{
for (int i = 0; i < textToConvert.size(); i++)
{
string collectNumbers;
int j = 0;
if (textToConvert[i] == '1' || textToConvert[i] == '2' || textToConvert[i] == '3' ||
textToConvert[i] == '4' || textToConvert[i] == '5' || textToConvert[i] == '6' ||
textToConvert[i] == '7' || textToConvert[i] == '8' || textToConvert[i] == '9' || textToConvert[i] == '0')
{
collectNumbers[j] = textToConvert[i];
j++;
}
if (collectNumbers.size() == 0)
{
return false;
}
else if (collectNumbers.size() > 0)
{
stringstream convert(collectNumbers);
if (!(convert >> convertedNumber))
{
convertedNumber = 0;
}
return convertedNumber;
}
}
}
Maybe you should just use the library function?
int stoi (const string& str, size_t* idx = 0, int base = 10);
You want something more like:
int getNumberFromString(int convertedNumber, string textToConvert) {
int retval = 0;
for (auto c: textToConvert) {
if (c < '0' || c > '9') continue; //skip anything that is not a digit
retval *= 10;
retval += c - '0';
}
return retval;
}
if you need to code it yourself; otherwise, simply use stoi().
Your MAIN problem is that you are trying to convert the number before you have collected all the digits. You should loop over all the digits (use isdigit or if (x >= '0' && x <= '9') to avoid long list of individual digits - or, if you really like to list all digits, use switch to make it more readable).
Once you have collected all the digits, then convert AFTER the loop.
The statement return false; is the same as return 0;, since false is converted to an integer with the value zero. So the caller cannot tell the difference between reading the value zero from a string and a failure (this is not PHP or JavaScript, where type information is included in return values).
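One possible way to apply both points, collecting every digit first and converting only after the loop, is sketched below. The changed signature (a bool result plus an output parameter) is an assumption introduced here so that "no digits found" and "the number 0" can be told apart; it is not the asker's original interface.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical rewrite: gathers all digits in the text, then converts once.
// Returns false if there are no digits or the number does not fit in an int.
bool getNumberFromString(const std::string& textToConvert, int& convertedNumber) {
    std::string collectNumbers; // declared outside the loop so it accumulates
    for (char c : textToConvert) {
        if (c >= '0' && c <= '9') {
            collectNumbers += c; // append; indexing an empty string is invalid
        }
    }
    if (collectNumbers.empty()) {
        return false;
    }
    try {
        convertedNumber = std::stoi(collectNumbers); // convert AFTER the loop
    } catch (const std::out_of_range&) {
        return false; // too many digits for an int
    }
    return true;
}
```

With this shape, the input "0" yields true with the value 0, while "abc" yields false.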