I have an input file, dna.txt, which looks like the text below. I'm trying to read all of the characters into a single string, and I'm not allowed to use a character array. How would I go about doing this?
cggccgattgtattctgtatagaaaaacac
atacagatggattttaactagagc
aagtcgcaataaccagcgagtattaca
cctcgaccaaatcctcgaattctc
Try the following:
std::string dna;
std::string text_read;
while (getline(input_file, text_read))
{
dna += text_read;
}
In the above loop, each line is read into a separate variable.
After the line is read, then it is appended to the DNA string.
Edit 1: Example working program:
Note: on some platforms, there may be a '\r' in the buffer, which causes portions of the output to be overwritten when displayed.
#include <iostream>
#include <fstream>
#include <string>
int main()
{
std::ifstream input_file("./data.txt");
std::string dna;
std::string text_read;
while (std::getline(input_file, text_read))
{
const std::string::size_type position = text_read.find('\r');
if (position != std::string::npos)
{
text_read.erase(position);
}
dna += text_read;
}
std::cout << "As one long string:\n"
<< dna;
return 0;
}
Output:
$ ./dna.exe
As one long string:
cggccgattgtattctgtatagaaaaacacatacagatggattttaactagagcaagtcgcaataaccagcgagtattacacctcgaccaaatcctcgaattctc
The file "data.txt":
cggccgattgtattctgtatagaaaaacac
atacagatggattttaactagagc
aagtcgcaataaccagcgagtattaca
cctcgaccaaatcctcgaattctc
The program compiled using g++ version 5.3.0 on Cygwin terminal.
The issue was found by using the gdb debugger.
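The carriage-return handling can also be factored into small helpers. The following is only a sketch under assumptions, not the program above: the names strip_trailing_cr and read_as_one_string are made up for illustration, it needs C++11 for back() and pop_back(), and it removes just one trailing '\r' instead of everything from the first '\r' onward.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical helper: drop one trailing '\r' left over from a CRLF line ending.
void strip_trailing_cr(std::string& line)
{
    if (!line.empty() && line.back() == '\r')
    {
        line.pop_back();
    }
}

// Hypothetical helper: concatenate every line of the stream into one string.
std::string read_as_one_string(std::istream& input)
{
    std::string result;
    std::string line;
    while (std::getline(input, line))
    {
        strip_trailing_cr(line);
        result += line;
    }
    return result;
}
```

Because the helper takes a std::istream&, it accepts a std::ifstream opened on the data file as well as a std::istringstream used for a quick check.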
Related
I want to be able to find the strings that contain a certain character in a file that contains one string per line.
#include <iostream>
#include <fstream>
#include <string>
int main(){
string line;
ifstream infile;
infile.open("words.txt");
while(getline(infile, line,' ')){
if(line.find('z')){
cout << line;
}
}
}
That's my attempt at finding all the strings that contain the character z.
The text file contains random strings such as
fhwaofhz
cbnooeht
rhowhrj
perwqreh
dsladsap
zpuaszu
so with my implementation, it should only print out the strings with the character z in it. However, it seems to be reprinting out all the contents from the text file again.
Problem:
In your file the strings aren't separated by a space (' '), which is the delimiter you passed; they are separated by an end-of-line character ('\n'), which is a different character. As a consequence, the first getline call puts everything into line. line then contains all the text in the file, including the z's, so all the content is printed. Finally, the code exits the while block after running once because the next getline call finds nothing left to read and fails.
If you run this code
#include <iostream>
#include <fstream>
#include <string>
int main(){
std::string line;
std::ifstream infile;
infile.open("words.txt");
while(getline(infile, line,' ')){
std::cout << "Hi";
if(line.find('z')){
std::cout << line;
}
}
}
"Hi" will be only printed once. That is because the while block is only executed once.
Additionally, note that line.find('z') won't return 0 if no match is found; it will return npos. You can see this by running the following code (as it says here):
#include <iostream>
#include <fstream>
#include <string>
int main(){
std::string line;
std::ifstream infile;
infile.open("words.txt");
while(getline(infile,line)){
std::cout << line.find('z');
if(line.find('z')){
std::cout << line << "\n";
}
}
}
Solution:
Use getline(infile,line), which is more suitable for this case, and replace if(line.find('z')) with if(line.find('z') != line.npos).
while(getline(infile,line)){
if(line.find('z') != line.npos){
std::cout << line << "\n";
}
}
If you need to put more than one string per line you can use the operator >> of ifstream.
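As a rough sketch of the operator >> variant, assuming the words are separated by any whitespace: the helper name words_containing is hypothetical and does not come from the question.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical helper: collect every whitespace-separated word containing `wanted`.
std::vector<std::string> words_containing(std::istream& input, char wanted)
{
    std::vector<std::string> matches;
    std::string word;
    // operator>> skips spaces, tabs and newlines, so it does not matter
    // whether the words share a line or each sit on their own line.
    while (input >> word)
    {
        if (word.find(wanted) != std::string::npos)
        {
            matches.push_back(word);
        }
    }
    return matches;
}
```

Passing an std::ifstream opened on words.txt and 'z' would return only the words that contain a z.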
Additional information:
Note that the code you posted won't compile because string, cout and ifstream are in the namespace std. Probably it was a part of a longer file where you were using using namespace std;. If that is the case, consider that it is a bad practice (More info here).
Full code:
#include <iostream>
#include <fstream>
#include <string>
int main(){
std::string line;
std::ifstream infile;
infile.open("words.txt");
while(getline(infile,line)){
if(line.find('z') != line.npos){
std::cout << line << "\n";
}
}
}
getline extracts characters from the source and stores them into the variable line until the delimitation character is found. Your delimiter character is a space (" "), which isn't present in the file, so line will contain the whole file.
Try getline(infile, line, '\n') or simply getline(infile, line) instead.
The method find returns the index of the found character, where 0 is a perfectly valid index. If the character is not found, it returns npos. This is a special value which indicates "not found", and it's nonzero so that 0 can refer to a valid index. So the correct check is:
if (line.find('z') != string::npos)
{
// found
}
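To make the difference concrete, here is a small sketch that puts the correct check next to the one from the question. Both function names are made up for illustration.

```cpp
#include <string>

// Hypothetical helper: true if `text` contains `c` anywhere, including index 0.
bool contains_char(const std::string& text, char c)
{
    return text.find(c) != std::string::npos;
}

// The test from the question, kept for comparison: the result of find is
// converted to bool, so index 0 reads as false and npos reads as true.
bool buggy_contains_char(const std::string& text, char c)
{
    return static_cast<bool>(text.find(c));
}
```

With the sample data, "zpuaszu" starts with z, so find returns 0 and the buggy version rejects it, while "cbnooeht" has no z, so find returns npos and the buggy version accepts it.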
I'm trying to read a bunch of words from a file and sort them into what kind of words they are (Nouns, Adjective, Verbs ..etc). For example :
-Nouns;
zyrian
zymurgy
zymosis
zymometer
zymolysis
-Verbs_participle;
zoom in
zoom along
zoom
zonk out
zone
I'm using getline to read until the delimiter ';' but how can I know when it read in a type and when it read in a word?
The function below stops right after "-Nouns;"
int main()
{
map<string,string> data_base;
ifstream source ;
source.open("partitioned_data.txt");
char type [MAX];
char word [MAX];
if(source) //check to make sure we have opened the file
{
source.getline(type,MAX,';');
while( source && !source.eof())//make sure we're not at the end of file
{
source.getline(word,MAX);
cout<<type<<endl;
cout<<word<<endl;
source.getline(type,MAX,';');//read the next line
}
}
source.close();
source.clear();
return 0;
}
I am not fully sure about the format of your input file. But you seem to have a file with lines, and in that, items separated by a semicolon.
Reading this should be done differently.
Please see the following example:
#include <iostream>
#include <string>
#include <sstream>
#include <fstream>
std::istringstream source{R"(noun;tree
noun;house
verb;build
verb;plant
)"};
int main()
{
std::string type{};
std::string word{};
//ifstream source{"partitioned_data.txt"};
if(source) //check to make sure we have opened the file
{
std::string line{};
while(getline(source,line)) // the loop ends when no further line can be read
{
size_t pos = line.find(';');
if (pos != std::string::npos) {
type = line.substr(0,pos);
word = line.substr(pos+1);
}
std::cout << type << " --> " << word << '\n';
}
}
return 0;
}
There is no need for open and close statements. The constructor and destructor of std::ifstream will do that for us.
Do not check eof in the while statement.
Do not use C-style arrays like char type [MAX];
Read a line in the while statement and check the validity of the operation there. Then work on the line that was read.
Search for the ';' in the string and, if it is found, take out the substrings.
If I knew the format of the input file, I could write an even better example for you.
Since I do not have files on SO, I use a std::istringstream instead. But there is NO difference compared to a file. Simply delete the std::istringstream and uncomment the ifstream definition in the source code.
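If the file really looks like the sample in the question, one possible reading of the format is that a line such as "-Nouns;" opens a section and every following line is a word of that type. That is an assumption, as are the names WordsByType and read_sections; the sketch below needs C++11 and only illustrates the idea.

```cpp
#include <istream>
#include <map>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

using WordsByType = std::map<std::string, std::vector<std::string>>;

// Assumed format: a header line such as "-Nouns;" starts a section,
// and every following non-empty line is a word of that type.
WordsByType read_sections(std::istream& source)
{
    WordsByType words_by_type;
    std::string current_type;
    std::string line;
    while (std::getline(source, line))
    {
        if (line.empty())
        {
            continue;
        }
        if (line.front() == '-' && line.back() == ';')
        {
            // Keep the text between the leading '-' and the trailing ';'.
            current_type = line.substr(1, line.size() - 2);
        }
        else if (!current_type.empty())
        {
            words_by_type[current_type].push_back(line);
        }
    }
    return words_by_type;
}
```

Reading whole lines keeps multi-word entries such as "zoom in" intact, and the header test answers the question of how to tell a type from a word.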
I am working on creating a program that is supposed to read a text file (ex. dog, buddy,,125,,,cat,,,etc...) line by line and parse it based on commas. This is what I have so far, but when I run it, nothing happens. I am not entirely sure what I'm doing wrong, and I am fairly new to the higher-level concepts.
#include <iostream>
#include <fstream>
#include <string>
#include <iomanip>
#include <cstdlib>
#include <sstream>
#include <vector>
using namespace std;
int main()
{
std::ifstream file_("file.txt"); //open file
std::string line_; //declare line_ as a string
std::stringstream ss(line_); //using line as stringstream
vector<string> result; //declaring vector result
while (file_.is_open() && ss.good())
{ //while the file is open and stringstream is good
std::string substr; //declares substr as a string
getline( ss, substr, ',' ); //getting the stringstream line_ and substr and parsing
result.push_back(substr);
}
return 0;
}
Did you forget to add a line like std::getline(file_, line_);? file_ was not read from at all and line_ was put into ss right after it was declared when it was empty.
I'm not sure why you checked if file_ is open in your loop condition since it will always be open unless you close it.
As far as I know, using good() as a loop condition is not a good idea. The flags will only be set the first time an attempt is made to read past the end of the file (it won't be set if you read to exactly the end of the file when hitting the delimiter), so if there was a comma at the end of the file the loop will run one extra time. Instead, you should somehow put the flag check after the extraction and before you use the result of the extraction. A simple way is to just use the getline() call as your loop condition since the function returns the stream itself, which when cast into a bool is equivalent to !ss.fail(). That way, the loop will not execute if the end of the file is reached without extracting any characters.
By the way, comments like //declaring vector result are pretty much useless since they give no useful information that you can't easily see from the code.
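The difference between the two loop conditions can be seen in a small side-by-side sketch. The function names are hypothetical, and a std::stringstream stands in for the file.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Loop controlled by good(): the check happens before the read,
// so a failed final read still pushes an (empty) token.
std::vector<std::string> split_checking_good(const std::string& text)
{
    std::stringstream ss(text);
    std::vector<std::string> tokens;
    while (ss.good())
    {
        std::string token;
        std::getline(ss, token, ',');
        tokens.push_back(token);
    }
    return tokens;
}

// Loop controlled by getline itself: the body runs only after a successful read.
std::vector<std::string> split_checking_getline(const std::string& text)
{
    std::stringstream ss(text);
    std::vector<std::string> tokens;
    std::string token;
    while (std::getline(ss, token, ','))
    {
        tokens.push_back(token);
    }
    return tokens;
}
```

For the input "dog,cat," the first version returns three tokens, the last one empty, while the second returns two. Without the trailing comma both return two.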
My code:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <sstream>
int main()
{
std::ifstream file("input.txt");
std::string line, word;
std::vector<std::vector<std::string>> result; //result[i][j] = the jth word in the input of the ith line
while(std::getline(file, line))
{
std::stringstream ss(line);
result.emplace_back();
while(std::getline(ss, word, ','))
{
result.back().push_back(word);
}
}
//printing results
for(auto &i : result)
{
for(auto &j : i)
{
std::cout << j << ' ';
}
std::cout << '\n';
}
}
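The sample input in the question has a space after a comma and several empty fields. A possible refinement, sketched here with the hypothetical helpers trim and split_fields, strips surrounding spaces and tabs from each field and keeps empty fields between commas.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: remove spaces and tabs from both ends of a field.
std::string trim(const std::string& field)
{
    const std::string::size_type first = field.find_first_not_of(" \t");
    if (first == std::string::npos)
    {
        return "";   // the field was empty or all whitespace
    }
    const std::string::size_type last = field.find_last_not_of(" \t");
    return field.substr(first, last - first + 1);
}

// Hypothetical helper: split one line on commas, keeping empty fields.
std::vector<std::string> split_fields(const std::string& line)
{
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ','))
    {
        fields.push_back(trim(field));
    }
    return fields;
}
```

For "dog, buddy,,125,,,cat" this yields seven fields, three of them empty. Note that an empty field after a final trailing comma is not reported, because getline finds nothing left to extract.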
I have a file which contains text (ASCII + unicode) and I am trying to count total words in it using a C++ program. It is a requirement that I should read the file line by line (using getline) and then process each line to count the words within it.
So I have written the following simple program:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
int main(int argc, char* argv[]) {
uint64_t ct = 0;
std::string line;
std::ifstream infile(argv[1]);
while(std::getline(infile, line)) {
std::stringstream inputStream(line);
std::string token;
while (inputStream >> token) {
++ct;
}
}
std::cout << ct << std::endl;
return 0;
}
However, the above program outputs a number that is less than what the wc -w command gives. To narrow down the problem, I modified the program to simply output whatever it reads. So now the program becomes:
int main(int argc, char* argv[]) {
uint64_t ct = 0;
std::string line;
std::ifstream infile(argv[1]);
while(std::getline(infile, line)) {
std::stringstream inputStream(line);
std::string token;
while (inputStream >> token) {
std::cout << token << " ";
}
std::cout << std::endl;
}
return 0;
}
I redirected the output of this program to another file. Now, when I run wc -w on this new file, the number is same as running wc -w on the original file. This means, I am reading all the words (i.e., "words" defined by wc) in my program. And hence, a reasonable explanation would be that one of the values of token that is read using inputStream >> token consists of some unicode character that is interpreted as a white space by wc program. So how do I change my program to also support such interpretation of unicode white space characters?
You can go by either:
A. Java's definition of Unicode (not non-breaking) whitespace.
or
B. Wikipedia's list of all 25 Unicode code points defined as whitespace.
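Option B could be sketched as follows. Several things here are assumptions: the file is taken to be UTF-8 (the question does not say), the function names are made up, malformed byte sequences are not validated, and whether the result matches wc -w depends on how wc and its locale classify these characters. The code needs C++11 for char32_t.

```cpp
#include <cstddef>
#include <string>

// True for the 25 code points listed as Unicode whitespace.
bool is_unicode_space(char32_t cp)
{
    return (cp >= 0x0009 && cp <= 0x000D) || cp == 0x0020 || cp == 0x0085 ||
           cp == 0x00A0 || cp == 0x1680 || (cp >= 0x2000 && cp <= 0x200A) ||
           cp == 0x2028 || cp == 0x2029 || cp == 0x202F || cp == 0x205F ||
           cp == 0x3000;
}

// Decodes one UTF-8 code point starting at text[i] and advances i past it.
char32_t next_code_point(const std::string& text, std::size_t& i)
{
    const unsigned char lead = static_cast<unsigned char>(text[i++]);
    char32_t cp = lead;   // plain ASCII: the byte is the code point
    int extra = 0;        // number of continuation bytes still to read
    if (lead >= 0xF0)      { cp = lead & 0x07; extra = 3; }
    else if (lead >= 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if (lead >= 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if (lead >= 0x80) { cp = 0xFFFD; }   // stray continuation byte
    while (extra-- > 0 && i < text.size())
    {
        cp = (cp << 6) | (static_cast<unsigned char>(text[i++]) & 0x3F);
    }
    return cp;
}

// Counts runs of non-whitespace code points in one line.
std::size_t count_words_utf8(const std::string& line)
{
    std::size_t words = 0;
    bool in_word = false;
    std::size_t i = 0;
    while (i < line.size())
    {
        if (is_unicode_space(next_code_point(line, i)))
        {
            in_word = false;
        }
        else if (!in_word)
        {
            in_word = true;
            ++words;
        }
    }
    return words;
}
```

In the original program, the inner while (inputStream >> token) loop would be replaced by ct += count_words_utf8(line);, which keeps the line-by-line getline requirement.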
I am using STL. I need to read lines from a text file. How to read lines till the first \n but not till the first ' ' (space)?
For example, my text file contains:
Hello world
Hey there
If I write like this:
ifstream file("FileWithGreetings.txt");
string str("");
file >> str;
then str will contain only "Hello" but I need "Hello world" (till the first \n).
I thought I could use the method getline() but it demands to specify the number of symbols to be read. In my case, I do not know how many symbols I should read.
You can use getline:
#include <string>
#include <iostream>
int main() {
std::string line;
if (getline(std::cin,line)) {
// line is the whole line
}
}
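The length argument mentioned in the question belongs to the member function istream::getline, which fills a char buffer. The free function std::getline from <string> takes the stream and a std::string and needs no length. A minimal sketch that reads every line this way, with the helper name read_lines made up for illustration:

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical helper: read every line of the stream, however long each one is.
std::vector<std::string> read_lines(std::istream& input)
{
    std::vector<std::string> lines;
    std::string line;
    // The free function std::getline grows `line` as needed; no length is passed.
    while (std::getline(input, line))
    {
        lines.push_back(line);
    }
    return lines;
}
```

Called with an std::ifstream opened on FileWithGreetings.txt, the first element would be "Hello world", space included.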
Using the getline function is one option.
Another is to use getc to read each character in a do-while loop.
If the file consists of numbers, this would be a better way to read it:
// Assumes in is an open FILE*, c is an int, and list is a std::vector<int>.
do {
int item=0, pos=0;
c = getc(in);
while((c >= '0') && (c <= '9')) {
item *=10;
item += int(c)-int('0');
c = getc(in);
pos++;
}
if(pos) list.push_back(item);
}while(c != '\n' && !feof(in));
Try modifying this method if your file consists of strings.
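For text, the same character-by-character idea might look like the sketch below. It swaps getc for std::istream::get so that it works with C++ streams, and the helper name read_line_by_char is hypothetical.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical helper: read characters one at a time up to (not including) '\n'.
// Returns false when nothing at all could be read, i.e. at end of input.
bool read_line_by_char(std::istream& input, std::string& line)
{
    line.clear();
    char c;
    bool read_anything = false;
    while (input.get(c))
    {
        read_anything = true;
        if (c == '\n')
        {
            break;
        }
        line += c;
    }
    return read_anything;
}
```

In practice std::getline does the same job with less code; the loop is mainly useful when each character needs inspecting as it arrives.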
Thanks to all of the people who answered me. I made new code for my program, which works:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(int argc, char** argv)
{
ifstream ifile(argv[1]);
// ...
while (!ifile.eof())
{
string line("");
if (getline(ifile, line))
{
// the line is a whole line
}
// ...
}
ifile.close();
return 0;
}
I suggest:
#include <fstream>
#include <string>
std::ifstream reader(filename, std::ios_base::in); // filename: the path to your file
if(reader){ // confirm stream is in a good state
    std::string line;
    while(std::getline(reader, line)){
        // Then process the std::string as described below
    }
}
For the std::string, any variable name will do. Note that istream::read fills a char buffer of a given size rather than a std::string, so std::getline, as above, is the simpler way to get a whole line.
To process the line, just use an iterator on the std::string:
std::string::begin() and std::string::end(), which both return a std::string::iterator,
and step through the string character by character until you reach the \n or ' ' you are looking for.
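A minimal sketch of that iterator walk, assuming the goal is to collect the characters before the first space. The helper name first_word is made up for illustration.

```cpp
#include <string>

// Hypothetical helper: return the characters before the first space,
// walking the string with iterators.
std::string first_word(const std::string& line)
{
    std::string word;
    for (std::string::const_iterator it = line.begin(); it != line.end(); ++it)
    {
        if (*it == ' ')
        {
            break;   // stop at the first space
        }
        word += *it;
    }
    return word;
}
```

For a line obtained with std::getline there is no '\n' left in the string, so only the space needs checking.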