Extract the file name from filename with path which comes from argument - c++

My program takes a filename, with or without a path (direct or indirect).
I'd like to use the filename from argv as part of the output filename.
The problem is that the filename from argv sometimes includes a path and sometimes doesn't.
What I want to do is:
1. If the filename includes a path, extract only the filename and return it.
2. If the filename doesn't include a path, return the filename as is.
My current code is:
std::string input_trace_filename = argv[1];
std::string read_filename = input_trace_filename + ".read.";
std::string write_filename = input_trace_filename + ".write.";
Thanks in advance.

You can use this:
std::string filename = argv[1];
std::size_t index = filename.find_last_of("/\\"); // npos if there is no separator
std::string input_trace_filename = filename.substr(index + 1); // npos + 1 wraps to 0, so a bare name is kept whole
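The same idea can be wrapped in a small helper with an explicit check for the "no path" case. This is only a minimal sketch: the name base_name is hypothetical, and it assumes that both '/' and '\\' should count as path separators.

```cpp
#include <string>

// Hypothetical helper: returns the part of `path` after the last '/' or '\\'.
// If no separator is present, the whole string is returned unchanged.
std::string base_name(const std::string& path) {
    const std::string::size_type pos = path.find_last_of("/\\");
    if (pos == std::string::npos) {
        return path;  // no path component at all
    }
    return path.substr(pos + 1);  // everything after the last separator
}
```

With something like this, the output names from the question could be built as base_name(argv[1]) + ".read." and base_name(argv[1]) + ".write.".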

Related

Fstream not reading a complete struct from binary data (C++)

I've been trying to make my program write a string into a binary file using ofstream::write(), but I could not find out how to (through the interwebs), so I tried writing a struct with a string into the file. That worked perfectly; I could open the file and read the string (with my human eyes), but when I tried to use ifstream::read() to read the struct, I got one empty string and one string that I wrote (in this case, "dir" was the empty one, and "fileName" was correctly read).
Any and all help is appreciated :)
PS: Both strings are saved in the file...
This is my writing code:
StringStruct texPath;
texPath.dir = "src/Assets/";
texPath.fileName = "bricks_top.png";
file.write((char*)&texPath, sizeof(texPath));
This is my reading code:
StringStruct texFile;
file.read((char*)&texFile, sizeof(texFile));
std::string filepath = "";
filepath += texFile.dir;
filepath += texFile.fileName;
std::cout << filepath;
And this is the "StringStruct" code:
struct StringStruct {
std::string dir = "src/Assets/";
std::string fileName = "Example.png";
};
OK, I received some comments (thanks manni66) saying that I have to write the strings as C strings. So I changed my struct to this:
struct StringStruct {
char* dir = "src/Assets/";
char* fileName = "Example.png";
};
So that I was writing each string as a c-string instead.

Adding to the name of a file

I'm working on an application that processes text files, and I want to create a new file with a similar name to that file, but slightly modified.
So for instance, I have a function that takes a string fileName as a parameter and creates a new file with the word "PROCESSED" added before ".txt."
so if fileName = "testFile.txt"
the new file should be named "testFilePROCESSED.txt"
string newFile = filename + "PROCESSED"; obviously doesn't work since the filename would be "testFile.txtPROCESSED" in this case.
You just need more practice with strings:
std::string::size_type ii = filename.rfind('.');
if (ii != std::string::npos) // only insert when the name has an extension
filename.insert(ii, "PROCESSED");
Let's keep it simple, I assume fileName is a string.
#include <sstream>
using namespace std;
stringstream ss;
fileName.erase(fileName.end() - 4, fileName.end()); // Extension removal; assumes a 4-character extension such as ".txt".
ss << fileName << "PROCESSED.txt";
string newFileName = ss.str();
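Both answers can be folded into one function that does not depend on the extension being exactly four characters long. This is a minimal sketch: the name add_tag is hypothetical, and it assumes fileName is a bare file name (a dot in a directory name would confuse it).

```cpp
#include <string>

// Hypothetical helper: inserts `tag` before the last '.' in `fileName`,
// or appends it when the name has no extension.
std::string add_tag(const std::string& fileName, const std::string& tag) {
    const std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos) {
        return fileName + tag;  // no extension: just append
    }
    std::string result = fileName;
    result.insert(dot, tag);  // place the tag right before the extension
    return result;
}
```

For the example in the question, add_tag("testFile.txt", "PROCESSED") yields "testFilePROCESSED.txt".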

C++ string equivalent for strrchr

Using C strings I would write the following code to get the file name from a file path:
#include <string.h>
const char* filePath = "dir1\\dir2\\filename"; // example
// extract file name (including extension)
const char* fileName = strrchr(filePath, '\\');
if (fileName)
++fileName;
else
fileName = filePath;
How to do the same with C++ strings? (i.e. using std::string from #include <string>)
The closest equivalent is rfind:
#include <string>
std::string filePath = "dir1\\dir2\\filename"; // example
// extract file name (including extension)
std::string::size_type filePos = filePath.rfind('\\');
if (filePos != std::string::npos)
++filePos;
else
filePos = 0;
std::string fileName = filePath.substr(filePos);
Note that rfind returns an index into the string (or npos), not a pointer.
To find the last occurrence of a character in a string, use std::string::rfind:
std::string filename = "dir1\\dir2\\filename";
std::size_t pos = filename.rfind( "\\" );
However, if you handle filenames and paths more often, have a look at boost::filesystem:
boost::filesystem::path p("dir1\\dir2\\filename");
std::string filename = p.filename().generic_string(); //or maybe p.filename().native();
Either call string::rfind(), or call std::find using reverse iterators (which are returned from string::rbegin() and string::rend()).
std::find might be a little more efficient, since it explicitly looks for a single matching character. rfind() given a length-1 string looks for a substring, so it finds the same thing; rfind() also has an overload that takes a single char.
Apart from rfind(), you can also use find_last_of()
There is also an example on cplusplus.com that matches your requirement.
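For completeness, the std::find variant with reverse iterators mentioned above could look roughly like this. It is a sketch only: the function name file_name_rfind is hypothetical, and it assumes the backslash is the only separator, as in the question.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: extracts the file name using std::find
// over reverse iterators instead of std::string::rfind.
std::string file_name_rfind(const std::string& filePath) {
    // Search backwards for the last backslash.
    const std::string::const_reverse_iterator rit =
        std::find(filePath.rbegin(), filePath.rend(), '\\');
    // rit.base() points just past the found separator, or to begin()
    // when no separator exists, so one expression covers both cases.
    return std::string(rit.base(), filePath.end());
}
```

The base() conversion is what replaces the explicit npos check from the rfind version.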

Changing last 5 char of array

I have a program that encrypts files, but adds the extension ".safe" to the end. So the end result is something like "file.txt.safe"
When I go to decrypt the file, the user enters the file name again: "file.txt.safe", which is saved to a char array. Now I want to remove ".safe" and rename the file to its original name.
I have tried the following, but nothing seems to happen and there are no errors.
Decrypt (myFile); //decrypts myFile
char * tmp = myFile;
char * newFile;
newFile = strstr (tmp,".safe"); //search tmp for ".safe"
strncpy (newFile,"",5); //replace .safe with ""
rename (myFile, newFile);
I'm sure I'm missing something obvious, but if this approach doesn't work, I'm looking for any simple method.
Edited to add:
(copied by moderator from poster's response to K-ballo)
Thanks everyone. I took the std::string approach and found this to work:
Decrypt(myFile);
string str = myFile;
size_t pos = str.find(".safe");
str.replace(pos,5,"");
rename(myFile, str.c_str());
For what you want to do, simply changing the strncpy line to this will work:
*newFile = '\0';
This would still have problems if the filename contains an early .safe (as in file.safest.txt.safe), or if it does not contain the substring .safe at all. You would be better off searching from the end of the array and making sure you actually find something.
This seems like a better approach (although in C++ it would be better to just go with std::string):
char* filename = ...;
size_t filename_length = strlen( filename );
int safe_ext_pos = filename_length - 5; // 5 == length of ".safe"
if( safe_ext_pos > 0 && strcmp( ".safe", filename + safe_ext_pos ) == 0 )
filename[ safe_ext_pos ] = '\0';
This is the std::string version of the code:
std::string filename = ...;
int safe_ext_pos = filename.length() - 5; // 5 == length of ".safe"
if( safe_ext_pos > 0 && filename.compare( safe_ext_pos, 5, ".safe" ) == 0 )
filename.erase( safe_ext_pos );
You should take care:
my.safe.file.txt.safe
Instead of just searching for '.safe' and removing it or truncating the filename at the first one, you should ensure that it's actually at the end of the string:
std::string myfile = ...;
Decrypt(myfile);
const std::string extension_to_remove = ".safe";
if (decryption is successful &&
myfile.size() >= extension_to_remove.size() &&
myfile.substr(myfile.size() - extension_to_remove.size()) == extension_to_remove)
{
std::string newFile = myfile.substr(0, myfile.size() - extension_to_remove.size());
rename(myfile.c_str(), newFile.c_str()); // rename() takes C strings
}
Also a note on filename extensions. It's really a pretty awful practice for software to identify file types using a special format in the filename.* It's fine for humans to organize their files with special naming conventions, but software should by and large be oblivious to it, except perhaps to make it easy for humans to use the conventions they want.
So your code for decrypting a file shouldn't be doing this task. Instead your decryption code should take a file to decrypt and a file to contain the output. Then your code for computing the output filename from the encrypted file's name should exist somewhere else, such as in the user interface where the user tells you the output filename. Your code would remove '.safe' if it exists and supply the modified name as the default output filename, to be confirmed by the user.
void perform_decryption(std::string const &encrypted, std::string const &decrypted) {
Decrypt(encrypted);
if (decryption is successful && encrypted!=decrypted)
rename(encrypted.c_str(), decrypted.c_str()); // rename() takes C strings
}
std::string default_decrypted_name(std::string const &filename) {
const std::string extension_to_remove = ".safe";
if (filename.size() >= extension_to_remove.size() &&
filename.substr(filename.size()-extension_to_remove.size()) == extension_to_remove)
{
return filename.substr(0, filename.size()-extension_to_remove.size());
}
return filename + ".decrypted";
}
* Here are some reasons against filename extensions:
Filename extensions are not unique, which in some circumstances causes conflicts where a file's type cannot be positively identified. (The fact that they can't even perform their intended purpose really ought to be enough...)
It degrades the usability of the filename for organizing. When 'myfile.txt' is renamed to 'myfile.txt.old' it's no longer seen as a text file.
It's caused security issues because fake type metadata can be mistaken for real type metadata when the real type metadata is hidden.
and more...

C++ program snippet: what is this doing?

I'm trying to figure out how to output hOCR using Tesseract. Documentation is limited, so I am looking into the code. I found this in the main() function:
bool output_hocr = tessedit_create_hocr;
outfile = argv[2];
outfile += output_hocr ? ".html" : tessedit_create_boxfile ? ".box" : ".txt";
A typical command for Tesseract is this: tesseract input.tif output_file.txt (the output file will be appended with another .txt in this example). main()'s signature is int main(int argc, char **argv).
What exactly is the code snippet doing?
It's generating the output filename.
bool output_hocr = tessedit_create_hocr;
Saves the tessedit_create_hocr flag in a locally scoped variable.
outfile = argv[2];
Initializes the outfile variable with the base filename from the command line. Something like "Scann0000.tif".
outfile += output_hocr ? ".html" : tessedit_create_boxfile ? ".box" : ".txt";
Appends the appropriate extension based on flags. Could be re-written as
if( output_hocr )
outfile += ".html";
else if( tessedit_create_boxfile )
outfile += ".box";
else
outfile += ".txt";
It's taking a base filename from the second command-line argument (output_file.txt in your example) then choosing the extension with the ternary operator.
If output_hocr, ".html"
Otherwise, if tessedit_create_boxfile, ".box"
Otherwise, ".txt"
Note that this is C++.
If the output_hocr variable is true it appends ".html" to outfile.
If it is false, it checks tessedit_create_boxfile: if that is true, it appends ".box" to outfile; otherwise it appends ".txt".
This code is just deciding what file extension to give outfile based on the value of tessedit_create_hocr (it is unclear how or where this variable is initialized given the code snippet provided).
If the value is true, the program will give the output file the extension ".html". Otherwise, the extension will be ".box" or ".txt", depending on the value of tessedit_create_boxfile (it is also unclear where this is initialized).
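To see the nested ternary in isolation, it can be pulled into a small function. This is a sketch, not Tesseract's code: the name output_extension is hypothetical, and the two bool parameters stand in for tessedit_create_hocr and tessedit_create_boxfile.

```cpp
#include <string>

// Hypothetical helper mirroring the nested ternary from the snippet.
// The hOCR flag is checked first, so it wins when both flags are set.
std::string output_extension(bool create_hocr, bool create_boxfile) {
    return create_hocr ? ".html" : create_boxfile ? ".box" : ".txt";
}
```

The conditional operator groups right to left, so the expression reads as create_hocr ? ".html" : (create_boxfile ? ".box" : ".txt"), which matches the if / else if / else rewrite given in the answers.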