Remove stopwords, then apply case folding (how to combine two programs) - C++

I have a text file called aisha.txt that contains this:
This is a new file I did it for mediu.
Its about Removing stopwords fRom the file
and apply casefolding to it
I Tried doing that many Times
and finally now I could do
I wrote two programs. One removes some stop words from it:
#include <iostream>
#include <string>
#include <fstream>
int main()
{
using namespace std;
ifstream file("aisha.txt");
if(file.is_open())
{
string myArray[200];
for(int i = 0; i < 200; ++i)
{
file >> myArray[i];
if (myArray[i] !="is" && myArray[i]!="the" && myArray[i]!="that"&& myArray[i]!="it"&& myArray[i]!="to"){
cout<< myArray[i]<<" ";
}
}
}
system("PAUSE");
return 0;
}
The other applies case folding to four letters:
#include <iostream>
#include <string>
#include <fstream>
int main()
{
using namespace std;
ifstream file("aisha.txt");
if(file.is_open())
{
file >> std::noskipws;
char myArray[200];
for(int i = 0; i < 200; ++i)
{
file >> myArray[i];
if (myArray[i]=='I')
cout<<"i";
if (myArray[i]=='A')
cout<<"a";
if (myArray[i]=='T')
cout<<"t";
if (myArray[i]=='R')
cout<<"r";
else
if (myArray[i]!='I' && myArray[i]!='T' && myArray[i]!='R')
cout<<myArray[i];
}
file.close();
}
system("PAUSE");
return 0;
}
Now I need to combine these two programs into one that removes the stop words and then applies case folding.
The problem is that I used string myArray[200]; for the stop-word code and char myArray[200]; for the case-folding code,
and I can't use only string or only char.
What can I do?

Put each text processor in its own function and call them one after the other in main. That way there are no name or type collisions.
Here is a rough example:
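#include <fstream>
using namespace std;
// Streams cannot be copied, so pass them by reference.
void removeStopWords(ifstream& file) {
    // put your code here for removing the stopwords
}
void applyCaseFolding(ifstream& file) {
    // put your code here for applying case folding
}
int main() {
    ifstream file("aisha.txt");
    if (file.is_open()) {
        removeStopWords(file);
        // The first pass reads to the end of the file, so rewind before the second.
        file.clear();
        file.seekg(0);
        applyCaseFolding(file);
    }
    return 0;
}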

Related

C++ Problem with space detection in an Array String

I'm currently writing a program in which I try to filter out extra spaces: if there is more than one space in a row, I discard the rest, leaving only one.
This is only the first step, because the aim of the program is to parse a txt file with MIPS assembly instructions.
So far I've opened the file, stored the content in a vector and then copied the vector content into an array. Then I check whether a char occurs twice in a row and, if so, shift the array to the left.
The problem is that the code works well for any other character except the space character. (In the code below I test it with the 'D' character and it works.)
#include <iostream>
#include <cmath>
#include <fstream>
#include <cstdlib>
#include <vector>
#include <algorithm>
using namespace std;
class myFile {
vector<string> myVector;
public:
void FileOpening();
void file_filter();
};
void myFile::FileOpening() {
string getcontent;
ifstream openfile; //creating an object so we can open files
char filename[50];
int i = 0;
cout << "Enter the name of the file you wish to open: ";
cin.getline(filename, 50); //whatever name file the user enters, it's going to be stored in filename
openfile.open(filename); //opening the file with the object I created
if (!openfile.is_open()) //if the file is not opened, exit the program
{
cout << "File is not opened! Exiting the program.";
exit(EXIT_FAILURE);
};
while (!openfile.eof()) //as long as it's not the end of the file do..
{
getline(openfile, getcontent); //get the whole text line and store it in the getcontent variable
myVector.push_back(getcontent);
i++;
}
}
void myFile::file_filter() {
unsigned int i = 0, j = 0, flag = 0, NewLineSize, k, r;
string Arr[myVector.size()];
for (i = 0; i < myVector.size(); i++) {
Arr[i] = myVector[i];
}
//removing extra spaces,extra line change
for (i = 0; i < myVector.size(); i++) {
cout << "LINE SIZE" << myVector[i].size() << endl;
for (j = 0; j < myVector[i].size(); j++) {
//If I try with this character for example,
//it works (Meaning that it successfully discards extra 'Ds' leaving only one.
// But if I replace it with ' ', it won't work. It gets out of the loop as soon
//as it detects 2 consecutive spaces.
if ((Arr[i][j] == 'D') && (Arr[i][j + 1] == 'D')) {
for (k = j; k < myVector[i].size(); k++) {
Arr[i][k] = Arr[i][k + 1];
flag = 0;
j--;
}
}
}
}
for (i = 0; i < myVector.size(); i++) {
for (j = 0; j < myVector[i].size(); j++) // here I go through each instruction
{
cout << Arr[i][j];
}
}
}
int main() {
myFile myfile;
myfile.FileOpening();
myfile.file_filter();
}
My question is, why does it work with all the characters except the space one, and how do I fix this?
Thanks in advance.
Wow, that is a lot of code. I can only recommend learning more about the STL and its algorithms.
You can read the complete file into a vector using the vector's range constructor and std::istream_iterator. Then you can replace one or more spaces in a string by using a std::regex. This is really not complicated.
In the example below, I do all the work with two lines of code in the function main. Please have a look:
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
#include <string>
#include <fstream>
#include <regex>
using LineBasedTextFile = std::vector<std::string>;
class CompleteLine { // Proxy for the input Iterator
public:
// Overload extractor. Read a complete line
friend std::istream& operator>>(std::istream& is, CompleteLine& cl) { std::getline(is, cl.completeLine); return is; }
// Cast the type 'CompleteLine' to std::string
operator std::string() const { return completeLine; }
protected:
// Temporary to hold the read string
std::string completeLine{};
};
int main()
{
// Open the input file
std::ifstream inputFile("r:\\input.txt");
if (inputFile)
{
// This vector will hold all lines of the file. Read the complete file into the vector through its range constructor
LineBasedTextFile text{ std::istream_iterator<CompleteLine>(inputFile), std::istream_iterator<CompleteLine>() };
// Replace all "more-than-one" spaces by one space
std::for_each(text.begin(), text.end(), [](std::string& s) { s = std::regex_replace(s, std::regex("[\\ ]+"), " "); });
// For Debug purposes. Print Result to std::out
std::copy(text.begin(), text.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
}
return 0;
}
I hope this gives you some idea of how to proceed.
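If pulling in a regex feels heavy, runs of spaces can also be collapsed with std::unique and a custom predicate. This is a small alternative sketch rather than part of the answer above; the function name collapseSpaces is made up for illustration.

```cpp
#include <algorithm>
#include <string>

// Collapse every run of consecutive spaces into a single space.
std::string collapseSpaces(std::string s) {
    // std::unique keeps the first element of each run for which the
    // predicate holds, so only a space directly after a space is dropped.
    auto bothSpaces = [](char a, char b) { return a == ' ' && b == ' '; };
    s.erase(std::unique(s.begin(), s.end(), bothSpaces), s.end());
    return s;
}
```

Applied to each line read from the file, this leaves other repeated characters (such as "DD") untouched and avoids shifting array elements by hand.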

How do I limit the file size so my program creates a new file after it becomes too large, and how do I set the name of the newly created file?

As I run my code it builds the text file which quickly exceeds a size where I can effectively use it.
How would I set a size limit to the text size so that it creates a new text file with a title +1 of the previous file?
#include <iostream>
#include <vector>
#include <string.h>
#include <iostream>
#include <vector>
#include <fstream>
using namespace std;
void Crack(string password, vector<char> Chars)
{
ofstream myfile;
myfile.open ("pass.txt");
myfile<<"PASSWORDs TO CRACK: 0 to "<<password<<endl;
int n = Chars.size();
int i = 0;
while(true)
{
i++;
int N = 1;
for(int j=0;j<i;j++)N*=n;
for(int j=0;j<N;j++)
{
int K = 1;
string crack = "";
for(int k=0;k<i;k++)
{
crack += Chars[j/K%n];
K *= n;
}
myfile<< crack<<" "<<endl;
if(password.compare(crack) == 0){
myfile<<"Cracked password: "<<crack<<endl;
return;
myfile.close();
}
}
}
}
int main()
{
vector<char> Chars;
for(char c = '0';c<='z';c++){
if(islower(c) || isdigit(c))Chars.push_back(c);
}
Crack("zzzzzzzzzzzzzzzzzz", Chars);
}
You can use ostream::tellp to give you the current file size. If it exceeds the limit, close the stream and create a new file.
What do you mean by "title+1"? Do you mean the file name or do you want to print a header in each file?
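The tellp approach could look roughly like the sketch below. It assumes that "title +1" means a numbered file name; the class name RotatingWriter and the naming scheme baseName_0.txt, baseName_1.txt, ... are invented for illustration. The size is checked before each line is written, so a file can exceed the limit by at most one line.

```cpp
#include <fstream>
#include <string>
#include <utility>

// Writes lines to baseName_0.txt, baseName_1.txt, ... and starts a new
// file whenever the current one has reached maxBytes.
class RotatingWriter {
public:
    RotatingWriter(std::string baseName, std::streamoff maxBytes)
        : baseName_(std::move(baseName)), maxBytes_(maxBytes) {
        openNext();
    }

    void writeLine(const std::string& line) {
        // tellp() reports the current write position, i.e. the file size so far.
        if (static_cast<std::streamoff>(out_.tellp()) >= maxBytes_) {
            out_.close();
            openNext();
        }
        out_ << line << '\n';
    }

    // Number of files opened so far.
    int filesOpened() const { return index_; }

private:
    void openNext() {
        out_.open(baseName_ + "_" + std::to_string(index_) + ".txt");
        ++index_;
    }

    std::string baseName_;
    std::streamoff maxBytes_;
    std::ofstream out_;
    int index_ = 0;
};
```

In the Crack function, the ofstream would be replaced by something like RotatingWriter myfile("pass", 10 * 1024 * 1024); and each myfile << crack << " " << endl; by myfile.writeLine(crack);.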

C++ Array pointer-to-object error

I am having what seems to be a common issue. However, after reading through the replies to similar questions, I can't find the solution, as I have already done what they suggest, such as making the variable an array. I have the following code:
#include "stdafx.h"
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
#include <algorithm>
#include <future>
using namespace std;
string eng2Str[4] = { "money", "politics", "RT", "#"};
int resArr[4];
int main()
{
engine2(eng2Str[4], resArr[4]);
system("Pause");
system("cls");
return 0;
}
void engine2(string &eng2Str, int &resArr)
{
ifstream fin;
fin.open("sampleTweets.csv");
int fcount = 0;
string line;
for (int i = 0; i < 4; i++) {
while (getline(fin, line)) {
if (line.find(eng2Str[i]) != string::npos) {
++fcount;
}
}
resArr[i] = fcount;
}
fin.close();
return;
}
Before you mark as duplicate I have made sure of the following:
The array and variable I am trying to assign are both int
It's an array
The error is:
expression must have pointer-to-object type
The error occurs at the "resArr[i] = fcount;" line, and I am not sure why, as resArr is an int array and I am trying to assign it a value from another int variable. I am quite new to C++, so any help would be great as I am really stuck!
Thanks!
The problem is that you've declared your function to take a reference to a single string and int, not arrays. It should be:
void engine2(string *eng2Str, int *resArr)
or:
void engine2(string eng2Str[], int resArr[])
Then when you call it, you can give the array names as arguments:
engine2(eng2Str, resArr);
Another problem is the while loop in the function. This will read the entire file during the first iteration of the for() loop. Other iterations will not have anything to read, since it will be at the end of the file already. You could seek back to the beginning of the file, but a better way would be to rearrange the two loops so you just need to read the file once.
while (getline(fin, line)) {
for (int i = 0; i < 4; i++) {
if (line.find(eng2Str[i]) != string::npos) {
resArr[i]++;
}
}
}
I would suggest using std::vector instead of a plain C array.
There are more issues in your code.
You are passing the element at index 4 of both arrays to the engine2 function, which is one past the end of each four-element array.
Your definition void engine2(string &eng2Str, int &resArr) expects a reference to a single string (not an array or vector) and a reference to a single int. You need to pass the whole container of strings and a pointer to the first element of resArr.
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
#include <future>
using namespace std;
vector<string> eng2Str = { "money", "politics", "RT", "#" };
int resArr[4] = {};
void engine2(const vector<string>& eng2Str, int* resArr)
{
ifstream fin;
fin.open("sampleTweets.csv");
string line;
// Read the file once and check every keyword against each line.
while (getline(fin, line))
{
for (size_t i = 0; i < eng2Str.size(); i++)
{
if (line.find(eng2Str[i]) != string::npos)
{
++resArr[i];
}
}
}
fin.close();
return;
}
int main()
{
engine2(eng2Str, resArr);
system("Pause");
system("cls");
return 0;
}

Reading in input to construct an object

I am trying to read a .txt file line by line in order to initialize an array of objects using a constructor that takes a string.
The text file is written like
TransAm
Mustang
Corvette
I feel like my loop is not iterating correctly over the information I want to set. Is there an easy way of accomplishing this?
main.cc
#include <string>
#include <iostream>
#include "Car.cc"
#include <fstream>
using namespace std;
int main()
{
Car cars[3];
string STRING;
ifstream infile;
infile.open("cars.txt");
// THIS IS HOW IT'S ACHIEVED USING FOR-LOOP - Sam
for(int i = 0; i<3 && infile;++i){
getline(infile,STRING);
cars[i].setName(STRING);
}
/* THIS IS WHAT I HAD
while(!infile)
{
getline(infile,STRING);
for(int i = 0; i<sizeof(cars);i++){
cars[i].setName(STRING);
}
}
*/
infile.close();
for(int j = 0;j<sizeof(cars);j++){
cars[j].print();
}
}
Car.h
#include <string>
using namespace std;
class Car{
public:
Car();
Car(string);
string getName();
void setName(string);
void print();
private:
string name;
};
Car.cc
#include <string>
#include "Car.h"
using namespace std;
Car::Car()
{
}
Car::Car(string s)
{
setName(s);
}
void Car::setName(string s)
{
name = s;
}
string Car::getName()
{
return name;
}
void Car::print()
{
cout << name;
}
These points need to be corrected:
while (!infile) prevents you from entering the loop.
You don't need two loops.
You can modify your loop like this:
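const int carCount = sizeof(cars) / sizeof(cars[0]); // sizeof(cars) alone is the size in bytes, not the number of elements
for (int i = 0; i < carCount && getline(infile, STRING); ++i)
cars[i].setName(STRING);
Or like this:
for (int i = 0; i < carCount && infile; ++i) {
getline(infile, STRING);
cars[i].setName(STRING);
}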
At the moment your loop does nothing if the file was opened correctly. It is only entered if the call to open was unsuccessful.
Change your loop to either
while (getline(infile,STRING))
{
//...
}
or
while (infile)
{
//...
}
As it's been said, "Change while(!infile) to while(getline(infile,STRING))" but do not forget to remove the getline(infile,STRING); afterwards.

Searching for an int inside a file

So I am supposed to take all ints in source3.txt and check which of them occur in source.txt. If any of them don't occur, I'm supposed to print the corresponding line from source2.txt to output.txt (source2.txt contains descriptions of the numbers in source3.txt, in the same order, one description per line). I wrote this code, but it only prints the last line from source2.txt, and even that is the wrong line.
I have no idea what might be wrong. Can you help me?
#include <bits/stdc++.h>
using namespace std;
int main()
{
ifstream source ("source.txt");
ifstream source2 ("source2.txt");
ifstream source3 ("source3.txt");
vector<int> tab(1051,0);
vector<string> tab2(857,*new string);
vector<int> tab3(857,0);
ofstream output("output.txt");
for(int i=0;i<1050;++i)
{
source>>tab[i];
}
for(int i=0;i<856;++i)
{
string a;
getline(source2,a);
tab2[i]=a;
source3>>tab3[i];
}
for(int i=0;i<856;++i)
{
if(std::find(tab.begin(), tab.end(), tab3[i]) != tab.end())
{
continue;
}
else
{
output<<tab2[i]<<endl;
}
}
}
I think the modifications below should work for you. Replace the value of SOURCE_COUNT with 1051 and SOURCE2_COUNT with 857.
#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
const int SOURCE_COUNT = 4;
const int SOURCE2_COUNT = 3;
using namespace std;
int main()
{
ifstream source ("source.txt");
ifstream source2 ("source2.txt");
ifstream source3 ("source3.txt");
vector<int> tab(SOURCE_COUNT,0);
vector<string> tab2(SOURCE2_COUNT,"");
vector<int> tab3(SOURCE2_COUNT,0);
ofstream output("output.txt");
for(int i=0;i<SOURCE_COUNT;++i)
{
source>>tab[i];
}
for(int i=0;i<SOURCE2_COUNT;++i)
{
string a;
getline(source2,a);
tab2[i]=a;
source3>>tab3[i];
}
for(int i=0;i<SOURCE2_COUNT;++i)
{
if(std::find(tab.begin(), tab.end(), tab3[i]) != tab.end())
{
continue;
}
else
{
output<<tab2[i]<<endl;
}
}
}
It looks to me like you are printing only in those cases where you have not found the number. In other words, the cases in your if-statement are reversed. It should read:
if(std::find(tab.begin(), tab.end(), tab3[i]) != tab.end())
output<<tab2[i]<<endl;
[EDIT] Oops, I did not read the question carefully enough. It should print the line if the number is NOT contained in source.txt. So the condition should read:
if(std::find(tab.begin(), tab.end(), tab3[i]) == tab.end())
output<<tab2[i]<<endl;
Also, I would strongly suggest doing away with all those constants like 856 and 1050. Why don't you simply read each file until you reach the end?