class Read
{
public:
Read(const char* filename)
:mFile(filename)
{
}
void setString()
{
while(getline(mFile, str, '.'))
{
getline(mFile, str, '.');
str.erase(std::remove(str.begin(), str.end(), '\n'), str.end());
}
}
private:
ifstream mFile;
string str;
};
int main()
{
Read r("sample.txt");
return 0;
}
My ultimate goal is to parse through each sentence in the file so I used getline setting the delimiter to '.' to get each individual sentence. I want to create a sentence vector but am not really sure how to do so.
The file is pretty big so it will have a lot of sentences. How do I create a vector for each sentence?
Will it simply be vector < string > str? How will it know the size?
EDIT: I added a line of code to remove the '\n'
EDIT: Got rid of !eof
while(!myFile.eof())
getline(mFile, str, '.');
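Where did you find while(!myFile.eof())? Please put it back where you found it: testing eof() before reading typically runs the loop one time too many. Try: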
std::vector<std::string> sentences;
while(std::getline(mFile, str, '.'))
sentences.push_back(str);
The vector container has a .size() function to return the number of populated elements. You should google "std::vector" and read through the functions in the API.
Vectors are dynamic arrays, so you need not worry about the size of the vector. You can use the push_back() function to add an element to the vector. I have made some changes to your code. Please check whether this works for you:
#include <fstream>
#include <string>
#include <vector>
using namespace std;
class Read
{
public:
Read(const char* filename)
:mFile(filename)
{
}
void setString()
{
while(getline(mFile, str, '.'))
{
vec.push_back(str);
}
}
private:
ifstream mFile;
string str;
vector<string> vec;
};
int main()
{
Read r("sample.txt");
r.setString(); // actually read the sentences into the vector
return 0;
}
#include <vector>
using namespace std;
...
vector<string> sentences;
sentences.push_back(line);
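The vector is a dynamic array and it will resize itself as you keep adding sentences. If you know the number of sentences in advance, you can avoid repeated reallocations by calling reserve before the push_back calls (resize would instead create that many empty strings, and push_back would then append after them):
sentences.reserve(number of sentences here)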
Related
I want to store words separated by spaces into single string elements in a vector.
The input is a string that may end or may not end in a symbol( comma, period, etc.)
All symbols will be separated by spaces too.
I created this function but it doesn't return me a vector of words.
vector<string> single_words(string sentence)
{
vector<string> word_vector;
string result_word;
for (size_t character = 0; character < sentence.size(); ++character)
{
if (sentence[character] == ' ' && result_word.size() != 0)
{
word_vector.push_back(result_word);
result_word = "";
}
else
result_word += character;
}
return word_vector;
}
What did I do wrong?
Your problem has already been resolved by answers and comments.
I would like to give you the additional information that such functionality is already existing in C++.
You could take advantage of the fact that the extraction operator (>>) extracts whitespace-separated tokens from a stream. Because a std::string is not a stream, we first put the string into a std::istringstream and then extract from this stream via the std::istream_iterator.
We can make life even easier.
For roughly 10 years, C++ has had dedicated functionality for splitting strings into tokens, explicitly designed for this purpose: the std::sregex_token_iterator. And because we have such a dedicated facility, we should simply use it.
The idea behind it is the iterator concept. In C++ we have many containers and always iterators to iterate over the similar elements in these containers. A string with similar elements (tokens) separated by a delimiter can also be seen as such a container. With the std::sregex_token_iterator, we can iterate over the elements/tokens/substrings of the string, effectively splitting it up.
This iterator is very powerful and you can do much fancier things with it, but that is too much to cover here. The important point is that splitting a string into tokens is a one-liner, for example a variable definition that uses a range constructor to iterate over the tokens.
See some examples below:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>
#include <regex>
const std::regex delimiter{ " " };
const std::regex reWord{ "(\\w+)" };
int main() {
// Some debug print function
auto print = [](const std::vector<std::string>& sv) -> void {
std::copy(sv.begin(), sv.end(), std::ostream_iterator<std::string>(std::cout, "\n")); std::cout << "\n"; };
// The test string
std::string test{ "word1 word2 word3 word4." };
//-----------------------------------------------------------------------------------------
// Solution 1: use istringstream and then extract from there
std::istringstream iss1(test);
// Define a vector (CTAD), use its range constructor and, the std::istream_iterator as iterator
std::vector words1(std::istream_iterator<std::string>(iss1), {});
print(words1); // Show debug output
//-----------------------------------------------------------------------------------------
// Solution 2: directly use dedicated function sregex_token iterator
std::vector<std::string> words2(std::sregex_token_iterator(test.begin(), test.end(), delimiter, -1), {});
print(words2); // Show debug output
//-----------------------------------------------------------------------------------------
// Solution 3: directly use dedicated function sregex_token iterator and look for words only
std::vector<std::string> words3(std::sregex_token_iterator(test.begin(), test.end(), reWord, 1), {});
print(words3); // Show debug output
//-----------------------------------------------------------------------------------------
// Solution 4: Use such iterator in an algorithm, to copy data to a vector
std::vector<std::string> words4{};
std::copy(std::sregex_token_iterator(test.begin(), test.end(), reWord, 1), {}, std::back_inserter(words4));
print(words4); // Show debug output
//-----------------------------------------------------------------------------------------
// Solution 5: Use such iterator in an algorithm for direct output
std::copy(std::sregex_token_iterator(test.begin(), test.end(), reWord, 1), {}, std::ostream_iterator<std::string>(std::cout,"\n"));
return 0;
}
You added the index instead of the character:
vector<string> single_words(string sentence)
{
vector<string> word_vector;
string result_word;
for (size_t i = 0; i < sentence.size(); ++i)
{
char character = sentence[i];
if (character == ' ' && result_word.size() != 0)
{
word_vector.push_back(result_word);
result_word = "";
}
else
result_word += character;
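}
if (!result_word.empty())
word_vector.push_back(result_word); // push the final word, which has no trailing space
return word_vector;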
}
Your mistake came from naming the loop variable character even though it is actually an index, not a character. I would suggest a range-based for loop here, since it avoids this kind of confusion. The clean solution is obviously to do what @ArminMontigny said, but I assume you are not allowed to use string streams. The code would look like this:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
vector<string> single_words(string sentence)
{
vector<string> word_vector;
string result_word;
for (char& character: sentence) // Now `character` is actually a character.
{
if (character==' ' && result_word.size() != 0)
{
word_vector.push_back(result_word);
result_word = "";
}
else
result_word += character;
}
word_vector.push_back(result_word); // In your solution, you forgot to push the last word into the vector.
return word_vector;
}
int main() {
string sentence="Maybe try range based loops";
vector<string> result= single_words(sentence);
for(string& word: result)
cout<<word<<" ";
return 0;
}
I'm having slight trouble creating a 2D Vector of String that's created by reading values from a text file. I initially thought I needed to use an array. however I've come to realise that a vector would be much more suited to what I'm trying to achieve.
Here's my code so far:
I've initialised the vector globally, but haven't given it the number of rows or columns because I want that to be determined when we read the file:
vector<vector<string>> data;
Test data in the file called "test" currently looks like this:
test1 test2 test3
blue1 blue2 blue3
frog1 frog2 frog3
I then have a function that opens the file and attempts to copy over the strings from text.txt to the vector.
void createVector()
{
ifstream myReadFile;
myReadFile.open("text.txt");
while (!myReadFile.eof()) {
for (int i = 0; i < 5; i++){
vector<string> tmpVec;
string tmpString;
for (int j = 0; j < 3; j++){
myReadFile >> tmpString;
tmpVec.push_back(tmpString);
}
data.push_back(tmpVec);
}
}
}
However, when I attempt to check the size of my vector in my main function, it returns the value '0'.
int main()
{
cout << data.size();
}
I think I just need a pair of fresh eyes to tell me where I'm going wrong. I feel like the issues lies within the createVector function, although I'm not 100% sure.
Thank you!
You should use std::getline to get the line of data first, then extract each string from the line and add to your vector. This avoids the while -- eof() issue that was pointed out in the comments.
Here is an example:
#include <string>
#include <iostream>
#include <vector>
#include <sstream>
typedef std::vector<std::string> StringArray;
std::vector<StringArray> data;
void createVector()
{
//...
std::string line, tempStr;
while (std::getline(myReadFile, line))
{
// add empty vector
data.push_back(StringArray());
// now parse the line
std::istringstream strm(line);
while (strm >> tempStr)
// add string to the last added vector
data.back().push_back(tempStr);
}
}
int main()
{
createVector();
std::cout << data.size();
}
Here's part of my code:
#include <stdio.h>
#include<string>
#include<string.h>
#include<algorithm>
#include <vector>
#include <iostream>
using namespace std;
int main(){
FILE *in=fopen("C.in","r");
//freopen("C.out","w",stdout);
int maxl=0;
int i;
string word;
vector<string> words;
while(!feof(in)){
fscanf(in,"%s ",word.c_str());
int t=strlen(word.c_str());
if(t>maxl){
maxl=t;
words.clear();
words.insert(words.end(),word);
}else if (t==maxl){
words.insert(words.end(),word);
}
}
The problem occurs at words.insert(words.end(), word): while word contains the word from my file, the vector item words[i] contains an empty string.
How is this possible?
fscanf(in,"%s ",word.c_str());
That's never going to work. c_str() is a const pointer to the string's current contents, which you mustn't modify. Even if you do subvert const (using a cast or, in this case, a nasty C-style variadic function), writing beyond the end of that memory won't change the length of the string - it will just give undefined behaviour.
Why not use C++ style I/O, reading into a string so that it automatically grows to the correct size?
std::ifstream in(filename);
std::string word;
while (in >> word) {
if (word.size() > maxl) {
maxl = word.size();
words.clear();
words.push_back(word);
} else if (word.size() == maxl) {
words.push_back(word);
}
}
string *parse(string str,int from){
int i=0,n=0,j,k;
i=j=from;
string *data=new string[6];
while(str[i]){
if(str[i]==' '){
for(k=0;k<(i-j-1);k++){
data[n][k]=str[j+k]; // << Error takes place here
}
data[n][k]='\0';
j=i;
n++;
}
i++;
}
return data;
}
Thanks for your help. I tried to debug but without success, what am I missing?
The problem is that elements data[i] of the data array all have the length of zero. That is why the assignment data[n][k] is always outside of data[n]'s range.
One way of fixing this would be using concatenation:
data[n] += str[j+k];
A better approach would be eliminating the loop altogether, and using substr member function of std::string instead: it lets you cut out a portion of str knowing the desired length and the starting position.
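In addition, you are returning a raw pointer to an array allocated with new[]: the caller has to remember to delete[] it, and the fixed size of six overflows as soon as the input has more words. You should replace the array with a vector<string>, and add items to it using push_back.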
Finally, you need to push the final word when the str does not end in a space.
Here is your modified program that uses the above suggestions:
vector<string> parse(string str,int from){
int i=from, j=from;
vector<string> data;
while(str[i]){
if(str[i]==' '){
data.push_back(str.substr(j, i-j)); // i-j characters: the word without the trailing space
j=i+1;
}
i++;
}
if (j != str.size()) {
data.push_back(str.substr(j));
}
return data;
}
data[n] starts with length 0, so data[n][k] is out of bounds. Writing data[n][k]='\0' is not the correct way to terminate a C++ string, and returning a string* is considered bad practice.
To separate a string by space, try:
#include <string>
#include <vector>
#include <sstream>
std::string data("hi hi hi hi hi");
std::stringstream ss(data);
std::string word;
std::vector<std::string> v;
while(std::getline(ss, word, ' '))
{
v.push_back(word);
}
I'm trying to load lines of a text file containing dictionary words into an array object. I want an array to hold all the words that start with "a", another one for "b" ... for all the letters in the alphabet.
Here's the class I wrote for the array object.
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
class ArrayObj
{
private:
string *list;
int size;
public:
~ArrayObj(){ delete list;}
void loadArray(string fileName, string letter)
{
ifstream myFile;
string str = "";
myFile.open(fileName);
size = 0;
while(!myFile.eof())
{
myFile.getline(str, 100);
if (str.at(0) == letter.at(0))
size++;
}
size -= 1;
list = new string[size];
int i = 0;
while(!myFile.eof())
{
myFile.getline(str, 100);
if(str.at(0) == letter.at(0))
{
list[i] = str;
i++;
}
}
myFile.close();
}
};
I'm getting an error saying:
2 IntelliSense: no instance of overloaded function "std::basic_ifstream<_Elem, _Traits>::getline [with _Elem=char, _Traits=std::char_traits<char>]" matches the argument list d:\champlain\spring 2012\algorithms and data structures\weeks 8-10\map2\arrayobj.h 39
I guess it's requiring me to overload the getline function, but I'm not quite certain how to go about or why it's necessary.
Any advice?
The getline that works with std::string is not a member function of istream but a free function, used as shown below. (The member function version works with char*.)
std::string str;
std::ifstream file("file.dat");
std::getline(file, str);
It is worth noting that there are better, safer ways to do what you are trying to do, like so:
#include <fstream>
#include <string>
#include <vector>
//typedeffing is optional, I would give it a better name
//like vector_str or something more descriptive than ArrayObj
typedef std::vector<std::string> ArrayObj;
ArrayObj load_array(const std::string& file_name, char letter)
{
std::ifstream file(file_name);
ArrayObj lines;
std::string str;
while(std::getline(file, str)){
if(!str.empty() && str[0]==letter){ // skip empty lines, where at(0) would throw
lines.push_back(str);
}
}
return lines;
}
int main(){
//loads lines from a file
ArrayObj awords=load_array("file.dat", 'a');
ArrayObj bwords=load_array("file.dat", 'b');
//awords.at(0); //access elements
}
Don't reinvent the wheel; check out vectors. They are standard and will save you a lot of time and pain.
Finally, try not to use using namespace std; it is bad for a whole host of reasons I won't go into. Instead, prefix standard library names with std::, as in std::cout or std::string.
http://en.cppreference.com/w/cpp/container/vector
http://en.cppreference.com/w/cpp/string/basic_string/getline
http://en.cppreference.com/w/cpp/string