Extract all directory names from a text file - c++

I have a text file in which file names, along with their subdirectory names, can appear anywhere. E.g.
input_file.txt
This is a text file. This line has a file name and location Folder1/file1.dat appearing here.
This is another line: Folder2/file2.txt
Here is yet another line with ../Folder3/Folder4/file3.doc filename and location.
This will be on a Linux system; hence the forward-slashes.
I need C++ code that can extract all the directory names/locations from this type of file. In the above example, the strings to be extracted would be:
Folder1
Folder2
../Folder3/Folder4
Given the above format of the input file, I suppose the algorithm ought to be something like this:
Go through each line in the file and see if the line has a forward-slash (/) anywhere in it.
If a forward-slash is found in a line, extract the substring between the last occurrence of the forward-slash (/) in that line and the last space character that appeared before it.
I have tried several different ways, such as below, but just cannot get it to work, I am afraid.
#include <iostream>
#include <string>
int main(int argc, char* argv[])
{
using std::cout; using std::endl;
unsigned first, last;
if(argc < 2)
{
cout << "\nPlease give valid file name!"<<endl;
exit(1);
}
std::string para_file_name = argv[1]; //The name of the input file.
std::ifstream configfile(para_file_name);
while (getline(configfile, line)) {
if (line.find(" ")) {
if (line.find(" ")!=std::string::npos) first = line.find(" ");
if (line.find("/")!=std::string::npos) last = line.find("/");
std::string DirName = line.substr (first,last-first);
cout << " DirName = " << DirName << endl;
}
}
The code has to be compatible with versions older than C++11 and cannot use fancy external libraries such as Boost. Just native C++ please.

Not the most concise, but more performant than <regex> and works with C++98.
#include <cstdlib> // exit
#include <fstream> // fstream
#include <iostream> // cout
#include <sstream> // istringstream
#include <string> // getline
int main(int argc, char **argv)
{
if (argc < 2)
{
std::cout << "\nPlease give valid file name!\n";
exit(1);
}
// Load the file in
std::string line;
std::fstream file(argv[1]);
// For each line of file...
while (std::getline(file, line))
{
std::istringstream iss(line);
std::string word;
char delim = ' ';
// For each word of line...
while (std::getline(iss, word, delim))
{
size_t pos = word.find_last_of('/');
// Word includes '/'
if (pos != std::string::npos)
{
std::string dir_name = word.substr(0, pos);
std::cout << dir_name << "\n";
}
}
}
}
Output
Folder1
Folder2
../Folder3/Folder4
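For comparison, the algorithm described in the question (find the last '/' on the line, then the last space before it) can be sketched directly with rfind. The function name last_dir_in_line is made up for illustration, and the sketch assumes at most one path per line, so the word-by-word loop above remains the better fit when a line may contain several paths. It uses nothing newer than C++98.

```cpp
#include <string>

// Hypothetical helper: returns the directory part of the last path on the
// line, or an empty string when the line contains no '/'.
std::string last_dir_in_line(const std::string& line)
{
    // Position of the last '/' on the line.
    std::string::size_type slash = line.rfind('/');
    if (slash == std::string::npos)
        return std::string();

    // Last space at or before that slash; the path starts right after it.
    std::string::size_type space = line.rfind(' ', slash);
    std::string::size_type start =
        (space == std::string::npos) ? 0 : space + 1;

    // Everything from the start of the path up to (not including) the slash.
    return line.substr(start, slash - start);
}
```

Calling it on each line read with std::getline and printing the non-empty results should reproduce the three directories listed in the question.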

Maybe overkill, but you could use regex.
#include <iostream>
#include <regex>
#include <string>
int main() {
std::cmatch m;
std::regex_match("This is another line: Folder2/file2.txt", m,
std::regex(".*?([^/ ]+/)+.*"));
std::cout << m.str(1) << std::endl;
return 0;
}
Output
Folder2/
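The pattern above keeps only the last directory component and its trailing slash. If the whole directory part is wanted (for example ../Folder3/Folder4), one possible variation is to search for every whitespace-free token that contains a slash and capture everything before its last slash. This is only a sketch: extract_dirs is a hypothetical name, and it needs C++11 <regex>, so it does not meet the pre-C++11 requirement in the question.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects the directory part of every path-like
// token (a run of non-space characters containing '/') found in `line`.
std::vector<std::string> extract_dirs(const std::string& line)
{
    // (\S+) is greedy, so it gives back characters only until the LAST '/'
    // of the token can be matched by the literal '/' that follows it.
    static const std::regex path_re("(\\S+)/\\S*");

    std::vector<std::string> dirs;
    for (std::sregex_iterator it(line.begin(), line.end(), path_re), end;
         it != end; ++it)
    {
        dirs.push_back((*it)[1].str()); // capture group 1 = directory part
    }
    return dirs;
}
```

Unlike regex_match, the iterator finds every match on the line, so a line with several paths yields several entries.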

Related

How to cout individual words from a input file containing multiple sentences. C++

#include <iostream>
#include <fstream>
//#include <cstring>
//#include <string>
using namespace std;
int main()
{
string word;
ifstream infile;
infile.open ("inputfile.txt");
if (infile.fail())
{
cout<<"UNABLE TO ACCESS INPUT FILE";
}
while(!infile.eof())
{
while (infile>> word)
{
cout<<word<<endl<<endl;
}
}
infile.close ();
system("pause");
return 0;
}
The above code couts all the words in the input text file. How do I cout just one word of my choice? I am asking this because I want to eventually be able to cin a word from user, and find that word in the input file either to delete or replace it with another word.
Here is an example of finding a word in a string:
std::string str ("There are two needles in this haystack with needles.");
std::string str2 ("needle");
std::size_t found = str.find(str2);
if (found!=std::string::npos)
std::cout << "first 'needle' found at: " << found << '\n';
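Since the stated goal is to eventually replace the word, the same find call can be put in a loop together with std::string::replace. The sketch below is one way to do that; replace_all is a made-up name, and it matches plain substrings, so "needle" is also replaced inside "needles". Matching whole words only would need an extra check on the neighbouring characters.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `text` with every occurrence of
// `from` replaced by `to`. Matches substrings, not whole words.
std::string replace_all(std::string text,
                        const std::string& from,
                        const std::string& to)
{
    if (from.empty())
        return text; // an empty pattern would match everywhere, forever

    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.length(), to);
        pos += to.length(); // continue after the text just inserted
    }
    return text;
}
```

Passing an empty string as the replacement deletes the word instead.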

How to parse table of numbers in C++

I need to parse a table of numbers formatted as ascii text. There are 36 space delimited signed integers per line of text and about 3000 lines in the file. The input file is generated by me in Matlab so I could modify the format. On the other hand, I also want to be able to parse the same file in VHDL and so ascii text is about the only format possible.
So far, I have a little program like this that can loop through all the lines of the input file. I just haven't found a way to get individual numbers out of the line. I am not a C++ purist. I would consider fscanf() but 36 numbers is a bit much for that. Please suggest practical ways to get numbers out of a text file.
int main()
{
string line;
ifstream myfile("CorrOut.dat");
if (!myfile.is_open())
cout << "Unable to open file";
else{
while (getline(myfile, line))
{
cout << line << '\n';
}
myfile.close();
}
return 0;
}
Use std::istringstream. Here is an example:
#include <sstream>
#include <string>
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
string line;
int num;
ifstream ifs("YourData");
while (getline(ifs, line))
{
istringstream strm(line);
while ( strm >> num )
cout << num << " ";
cout << "\n";
}
}
If you want to create a table, use a std::vector or other suitable container:
#include <sstream>
#include <string>
#include <fstream>
#include <iostream>
#include <vector>
using namespace std;
int main()
{
string line;
// our 2 dimensional table
vector<vector<int>> table;
int num;
ifstream ifs("YourData");
while (getline(ifs, line))
{
vector<int> vInt;
istringstream strm(line);
while ( strm >> num )
vInt.push_back(num);
table.push_back(vInt);
}
}
The table vector gets populated, row by row. Note we created an intermediate vector to store each row, and then that row gets added to the table.
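Because the file is known to hold a fixed number of integers per line (36 in the question), the same loop can also check each row as it is read. The following is a sketch under that assumption; parse_table and its parameters are hypothetical names, and it takes any std::istream so it works with an std::ifstream as well as an in-memory stream.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated integers, one row per
// line, into `table`. Returns false as soon as a line contains a
// non-numeric token or does not have exactly `columns` values
// (a blank line therefore counts as a malformed row).
bool parse_table(std::istream& in, std::size_t columns,
                 std::vector<std::vector<int> >& table)
{
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream strm(line);
        std::vector<int> row;
        int num;
        while (strm >> num)
            row.push_back(num);

        // eof() is set only if extraction stopped at the end of the line;
        // otherwise it stopped at something that is not an integer.
        if (!strm.eof() || row.size() != columns)
            return false;
        table.push_back(row);
    }
    return true;
}
```

For the file in the question this would be called as parse_table(ifs, 36, table) after opening ifs.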
You can use a few different approaches. The one offered above is probably the quickest of them; however, if you have different delimiter characters you may consider one of the following solutions.
The first solution reads the file line by line. It then uses the find function to locate the first position of the specified delimiter, stores the number before it, removes that number from the line, and continues until the delimiter is no longer found.
You can customize the delimiter by modifying the value of the delimiter variable.
#include <cstdlib> // atoi
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
using namespace std;
int main()
{
string line;
ifstream myfile("CorrOut.dat");
string delimiter = " ";
size_t pos = 0;
string token;
vector<vector<int>> data;
if (!myfile.is_open())
cout << "Unable to open file";
else {
while (getline(myfile, line))
{
vector<int> temp;
pos = 0;
while ((pos = line.find(delimiter)) != std::string::npos) {
token = line.substr(0, pos);
std::cout << token << std::endl;
line.erase(0, pos + delimiter.length());
temp.push_back(atoi(token.c_str()));
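}
// Whatever is left after the final delimiter is the last number on the line.
if (!line.empty())
temp.push_back(atoi(line.c_str()));
data.push_back(temp);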
}
myfile.close();
}
return 0;
}
The second solution makes use of regex and doesn't care which delimiter is used; it will search for and match any integers found in the string.
#include <cstdlib> // atoi
#include <iostream>
#include <string>
#include <regex> // The new library introduced in C++ 11
#include <fstream>
#include <vector>
using namespace std;
int main()
{
string line;
ifstream myfile("CorrOut.dat");
std::smatch m;
std::regex e("[-+]?\\d+");
vector<vector<int>> data;
if (!myfile.is_open())
cout << "Unable to open file";
else {
while (getline(myfile, line))
{
vector<int> temp;
while (regex_search(line, m, e)) {
for (auto x : m) {
std::cout << x.str() << " ";
temp.push_back(atoi(x.str().c_str()));
}
std::cout << std::endl;
line = m.suffix().str();
}
data.push_back(temp);
}
myfile.close();
}
return 0;
}

How can I identify a blank line in C++?

I'm reading in information from a file. I need a counter that counts how many text filled lines there are. I need that counter to stop if there is any blank line (even if there are text filled lines after that blank line).
How would I do this? Because I'm not exactly sure how to identify a blank line to stop the counter there.
If you are using std::getline then you can just detect an empty line by checking if the std::string you have just read is empty.
std::ifstream stream;
stream.open("file.txt");
std::string text;
while(std::getline(stream,text))
if(!text.size())
std::cout << "empty" << std::endl;
I'd suggest using std::getline for it:
#include <string>
#include <iostream>
int main()
{
unsigned counter = 0;
std::string line;
while (std::getline(std::cin, line) && line != "")
++counter;
std::cout << counter << std::endl;
return 0;
}
@Edward commented about handling whitespace, which might be important. If lines containing only whitespace should count as "empty lines" too, I'd suggest changing it to:
#include <string>
#include <iostream>
#include <algorithm>
#include <cctype>
int main()
{
unsigned counter = 0;
std::string line;
while (std::getline(std::cin, line) &&
std::find_if_not(line.begin(), line.end(),
[](unsigned char c) { return std::isspace(c) != 0; }) != line.end()) {
++counter;
}
std::cout << counter << std::endl;
return 0;
}
It's quite verbose, but the advantage is that it uses std::isspace to handle all the different kinds of whitespace (e.g. ' ', '\t', '\v', etc.), so you don't have to worry about handling each of them correctly yourself.
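In C++11 you can also use std::isblank (from <cctype>), which in the default C locale is true only for the space and horizontal tab characters.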
In a loop, read all lines, one-by-one, into a single string variable. You can use a std::getline function for that.
Each time after reading a line into that variable, check its length. If it's zero, then the line is empty, and in that case break the loop.
However, checking for empty lines like this is not always the right thing to do. If you are sure that the lines will be truly empty, then it's OK. But if your "empty" lines can contain whitespace,
123 2 312 3213
12 3123 123
// <--- Here are SPACEs. Is it "empty"?
123 123 213
123 21312 3
then you might need to not check for "zero-length", but rather whether "all characters are whitespaces".
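The loop and the "all characters are whitespace" check described above could look roughly like this. It is a sketch, not a tested drop-in: is_blank and count_until_blank are hypothetical names, and the functions take any std::istream so they can be tried on an std::ifstream or on in-memory text.

```cpp
#include <cctype>
#include <istream>
#include <sstream> // std::istringstream, handy for trying this on in-memory text
#include <string>

// Hypothetical helper: true when the line is empty or holds only whitespace.
bool is_blank(const std::string& line)
{
    for (std::string::size_type i = 0; i < line.size(); ++i) {
        // Cast first: passing a negative char value to std::isspace is undefined.
        if (!std::isspace(static_cast<unsigned char>(line[i])))
            return false;
    }
    return true;
}

// Hypothetical helper: counts lines up to, but not including, the first
// blank one. Lines after that blank line are never read.
unsigned count_until_blank(std::istream& in)
{
    unsigned counter = 0;
    std::string line;
    while (std::getline(in, line) && !is_blank(line))
        ++counter;
    return counter;
}
```

If lines made only of spaces should count as text instead, replacing the is_blank call with line.empty() gives the zero-length behaviour from the first paragraph.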
No error checking, no protection, just a simple example... It is not tested, but you get the gist.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
string str = "";
int blank = 0, text = 0;
ifstream myfile;
myfile.open("pathToFile");
while(getline(myfile,str))
{
if(str == "")
{
++blank;
}
else
{
++text;
}
}
cout << "Blank: " << blank << "\t" << "With text: " << text;
return 0;
}
Simply check the string length and use a line counter. When the string length is zero (i.e., the string is blank) print the line counter. Sample code is provided for your reference:
// Reading a text file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main () {
string line;
ifstream myfile ("example.txt");
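int i = 0;
bool found = false; // set once a zero-length line has been read
if (myfile.is_open())
{
while (getline (myfile, line))
{
i++;
// cout << line << '\n';
if (line.length() == 0)
{
found = true;
break;
}
}
if (found)
cout << "blank line found at line no. " << i << "\n";
else
cout << "no blank line found\n";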
myfile.close();
}
else
cout << "Unable to open file";
return 0;
}

Trying to read from a file and skip punctuation in C++, tips?

I'm trying to read from a file, and make a vector of all the words from the file. What I tried to do below is have the user input the filename, and then have the code open the file, and skip characters if they aren't alphanumeric, then input that to a file.
Right now it just closes immediately when I input the filename. Any idea what I could be doing wrong?
#include <vector>
#include <string>
#include <iostream>
#include <iomanip>
#include <fstream>
using namespace std;
int main()
{
string line; //for storing words
vector<string> words; //unspecified size vector
string whichbook;
cout << "Welcome to the book analysis program. Please input the filename of the book you would like to analyze: ";
cin >> whichbook;
cout << endl;
ifstream bookread;
//could be issue
//ofstream bookoutput("results.txt");
bookread.open(whichbook.c_str());
//assert(!bookread.fail());
if(bookread.is_open()){
while(bookread.good()){
getline(bookread, line);
cout << line;
while(isalnum(bookread)){
words.push_back(bookread);
}
}
}
cout << words[];
}
I think I'd do the job a bit differently. Since you want to ignore all but alphanumeric characters, I'd start by defining a locale that treats all other characters as white space:
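#include <algorithm>
#include <iostream>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>
// Despite the name, this facet keeps letters as well as digits;
// every other character is classified as white space.
struct digits_only: std::ctype<char> {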
digits_only(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table() {
static std::vector<std::ctype_base::mask>
rc(std::ctype<char>::table_size,std::ctype_base::space);
std::fill(&rc['0'], &rc['9']+1, std::ctype_base::digit);
std::fill(&rc['a'], &rc['z']+1, std::ctype_base::lower);
std::fill(&rc['A'], &rc['Z']+1, std::ctype_base::upper);
return &rc[0];
}
};
That makes reading words/numbers from the stream quite trivial. For example:
int main() {
char const test[] = "This is a bunch=of-words and 2#numbers#4(with)stuff to\tseparate,them, I think.";
std::istringstream infile(test);
infile.imbue(std::locale(std::locale(), new digits_only));
std::copy(std::istream_iterator<std::string>(infile),
std::istream_iterator<std::string>(),
std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
For the moment, I've copied the words/numbers to standard output, but copying to a vector just means giving a different iterator to std::copy. For real use, we'd undoubtedly want to get the data from an std::ifstream as well, but (again) it's just a matter of supplying the correct iterator. Just open the file, imbue it with the locale, and read your words/numbers. All the punctuation, etc., will be ignored automatically.
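As a sketch of that last step, the words can be collected into a vector by handing std::copy a std::back_inserter. The block repeats the facet under the hypothetical name alnum_only so that it stands alone, and read_words is likewise a made-up name; it accepts any std::istream, so an opened std::ifstream can be passed in the same way as the string stream used for trying it out.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <locale>
#include <sstream> // std::istringstream, for trying this without a file
#include <string>
#include <vector>

// Same idea as the facet above, under a hypothetical name: letters and
// digits keep their class, every other character is treated as space.
struct alnum_only : std::ctype<char> {
    alnum_only() : std::ctype<char>(get_table()) {}
    static std::ctype_base::mask const* get_table() {
        static std::vector<std::ctype_base::mask>
            rc(std::ctype<char>::table_size, std::ctype_base::space);
        std::fill(&rc['0'], &rc['9'] + 1, std::ctype_base::digit);
        std::fill(&rc['a'], &rc['z'] + 1, std::ctype_base::lower);
        std::fill(&rc['A'], &rc['Z'] + 1, std::ctype_base::upper);
        return &rc[0];
    }
};

// Hypothetical helper: imbues the stream with the facet and reads every
// remaining word/number into a vector. The locale takes ownership of the
// facet object, so it is not deleted here.
std::vector<std::string> read_words(std::istream& in)
{
    in.imbue(std::locale(std::locale(), new alnum_only));
    std::vector<std::string> words;
    std::copy(std::istream_iterator<std::string>(in),
              std::istream_iterator<std::string>(),
              std::back_inserter(words));
    return words;
}
```

With a file, this would be used as std::ifstream book(filename); followed by read_words(book).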
The following would read every line, skip non-alphanumeric characters and add each line as an item to the output vector. You can adapt it so it outputs words instead of lines. I did not want to provide the entire solution, as this looks a bit like a homework problem.
#include <cctype> // isalnum
#include <vector>
#include <sstream>
#include <string>
#include <iostream>
#include <iomanip>
#include <fstream>
using namespace std;
int main()
{
string line; //for storing words
vector<string> words; //unspecified size vector
string whichbook;
cout << "Welcome to the book analysis program. Please input the filename of the book you would like to analyze: ";
cin >> whichbook;
cout << endl;
ifstream bookread;
//could be issue
//ofstream bookoutput("results.txt");
bookread.open(whichbook.c_str());
//assert(!bookread.fail());
if(bookread.is_open()){
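// Loop on getline itself so that a failed read is never processed
// (testing eof() first can run the body once more after the last line).
while(getline(bookread, line)){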
string lineToAdd = "";
for(int i = 0 ; i < line.size(); ++i)
{
// keep letters, digits and spaces; drop everything else
if(isalnum(static_cast<unsigned char>(line[i])) || line[i] == ' ')
lineToAdd.push_back(line[i]);
}
words.push_back(lineToAdd);
}
}
for(int i = 0 ; i < words.size(); ++i)
cout << words[i] + " ";
return 0;
}

The last word in line not read

I am currently working on a program that reads each line from a file and extracts the words from the line using specific delimiters.
So basically my code looks like this
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(int argv, char **argc)
{
ifstream fin(argc[1]);
char delimiter[] = "|,.\n ";
string sentence;
while (getline(fin,sentence)) {
int pos;
pos = sentence.find_first_of(delimiter);
while (pos != string::npos) {
if (pos > 0) {
cout << sentence.substr(0,pos) << endl;
}
sentence =sentence.substr(pos+1);
pos = sentence.find_first_of(delimiter);
}
}
}
However, my code does not read the last word in the line. For example, my file looks like this.
hello world
The output from the program is only the word "hello" but not "world". I have used '\n' as a delimiter, so why doesn't it work?
Any hint would be appreciated.
getline does not save the new line character in the string. For example, if your file has the line
"Hello World\n"
getline will read this string
"Hello World\0"
So your code misses the "World".
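A quick way to confirm this behaviour is to read a line from an in-memory stream and inspect the result. This is only a small demonstration; first_line is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the first line of `text` exactly as
// std::getline stores it, i.e. without the terminating '\n'.
std::string first_line(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::getline(in, line); // extracts up to '\n' and discards the '\n'
    return line;
}
```

Calling first_line("Hello World\n") gives a string of 11 characters with no newline in it, which is why searching it for '\n' finds nothing after "World".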
You could alter your code to work like this:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(int argv, char **argc)
{
ifstream fin(argc[1]);
char delimiter[] = "|,.\n ";
string sentence;
while (getline(fin, sentence)) {
// getline drops the '\n', so put a delimiter back to terminate the last word
sentence += "\n";
string::size_type pos = sentence.find_first_of(delimiter);
while (pos != string::npos) {
if (pos > 0) {
cout << sentence.substr(0, pos) << "\n";
}
sentence = sentence.substr(pos + 1);
pos = sentence.find_first_of(delimiter);
}
}
}
Note, I borrowed Bill the Lizard's more elegant solution of appending the last delimiter. My previous version had a loop exit condition.
Paraphrasing this reference document:
Characters are extracted until the delimiting character (\n) is found, discarded and the remaining characters returned.
Your string doesn't end with an \n, it is ^`hello world`$, so no delimiter or new pos is found.
As others have mentioned, getline doesn't return the newline character at the end. The simplest way to fix your code is to append one to the end of the sentence after the getline call.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(int argv, char **argc)
{
ifstream fin(argc[1]);
char delimiter[] = "|,.\n ";
string sentence;
while (getline(fin,sentence)) {
sentence += "\n";
int pos;
pos = sentence.find_first_of(delimiter);
while (pos != string::npos) {
if (pos > 0) {
cout << sentence.substr(0,pos) << endl;
}
sentence =sentence.substr(pos+1);
pos = sentence.find_first_of(delimiter);
}
}
}
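An alternative that avoids appending a delimiter is to walk the string with find_first_not_of and find_first_of, treating the end of the string as the end of the last word. The sketch below shows the idea; split_words is a hypothetical name, and it leaves the input string unmodified instead of repeatedly shortening it.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `sentence` at any character contained in
// `delimiters`, skipping empty tokens. The last word needs no trailing
// delimiter because reaching the end of the string also ends a token.
std::vector<std::string> split_words(const std::string& sentence,
                                     const std::string& delimiters)
{
    std::vector<std::string> words;
    // Start of the first word: first character that is not a delimiter.
    std::string::size_type start = sentence.find_first_not_of(delimiters);
    while (start != std::string::npos) {
        // End of the word: next delimiter, or npos if there is none.
        std::string::size_type end = sentence.find_first_of(delimiters, start);
        // With end == npos the count is huge and substr clamps it to the
        // rest of the string, which is exactly the last word.
        words.push_back(sentence.substr(start, end - start));
        start = sentence.find_first_not_of(delimiters, end);
    }
    return words;
}
```

Inside the getline loop this would be called as split_words(sentence, "|,. ") and the returned words printed one per line.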