C++ program snippet: what is this doing?

I'm trying to figure out how to output hOCR using Tesseract. Documentation is limited, so I am looking into the code. I found this in the main() function:
bool output_hocr = tessedit_create_hocr;
outfile = argv[2];
outfile += output_hocr ? ".html" : tessedit_create_boxfile ? ".box" : ".txt";
A typical command for Tesseract is this: tesseract input.tif output_file.txt (the output file will be appended with another .txt in this example). main()'s signature is int main(int argc, char **argv).
What exactly is the code snippet doing?

It's generating the output filename.
bool output_hocr = tessedit_create_hocr;
Saves the tessedit_create_hocr flag in a locally scoped variable.
outfile = argv[2];
Initializes the outfile variable with the output base name from the command line (the second argument). Something like "output_file.txt" in your example.
outfile += output_hocr ? ".html" : tessedit_create_boxfile ? ".box" : ".txt";
Appends the appropriate extension based on flags. Could be re-written as
if( output_hocr )
outfile += ".html";
else if( tessedit_create_boxfile )
outfile += ".box";
else
outfile += ".txt";
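The chained conditional can also be wrapped in a small function to see how it groups. This is only a sketch: make_output_name is a hypothetical helper, and its two bool parameters stand in for the tessedit_create_hocr and tessedit_create_boxfile settings from the snippet.

```cpp
#include <string>

// Hypothetical helper, not part of Tesseract. The two flags stand in for
// tessedit_create_hocr and tessedit_create_boxfile.
std::string make_output_name(const std::string& base,
                             bool create_hocr,
                             bool create_boxfile) {
    std::string outfile = base;
    // The conditional operator groups right to left, so this reads as
    // create_hocr ? ".html" : (create_boxfile ? ".box" : ".txt").
    outfile += create_hocr ? ".html" : create_boxfile ? ".box" : ".txt";
    return outfile;
}
```

Note that the hOCR flag wins when both flags are set, exactly as in the if/else chain above.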

It's taking a base filename from the second command-line argument (output_file.txt in your example) then choosing the extension with the ternary operator.
If output_hocr, ".html"
Otherwise, if tessedit_create_boxfile, ".box"
Otherwise, ".txt"
Note that this is C++.

If the output_hocr variable is true it appends ".html" to outfile.
If it is false, it checks tessedit_create_boxfile: if that is true, it appends ".box" to outfile, otherwise it appends ".txt".

This code is just deciding what file extension to give outfile based on the value of tessedit_create_hocr (it is unclear how or where this variable is initialized given the code snippet provided).
If the value is true, the program appends ".html" to the output filename. Otherwise, it appends ".box" or ".txt", depending on the value of tessedit_create_boxfile (it is also unclear where this is initialized).

Related

Extract the file name from filename with path which comes from argument

My program gets a filename with or without a path (direct or indirect).
I'd like to use the filename from argv as part of the output filename.
The problem is that the filename from argv sometimes includes a path and sometimes doesn't.
What I want to do is:
1. If the filename includes a path, extract and return only the filename.
2. If the filename doesn't include a path, return it as is.
My current code is:
std::string input_trace_filename = argv[1];
std::string read_filename = input_trace_filename + ".read.";
std::string write_filename = input_trace_filename + ".write.";
Thanks in advance.
You can use this:
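std::string filename = argv[1];
// find_last_of returns std::string::npos when no separator is found;
// npos + 1 wraps around to 0, so substr then returns the whole name.
std::string::size_type index = filename.find_last_of("/\\");
std::string input_trace_filename = filename.substr(index + 1);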
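For readers who prefer the "no separator" case spelled out rather than relying on wraparound, here is a minimal sketch of the same idea. The name file_name_only is hypothetical, and both '/' and '\' are treated as separators, as in the answer above.

```cpp
#include <string>

// Hypothetical helper: returns the part of path after the last '/' or '\'.
// If there is no separator, the input is returned unchanged.
std::string file_name_only(const std::string& path) {
    const std::string::size_type pos = path.find_last_of("/\\");
    if (pos == std::string::npos) {
        return path;  // no directory component
    }
    return path.substr(pos + 1);
}
```

A path that ends in a separator, such as "dir/", yields an empty string, so callers may want to check for that.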

opening output filestreams with string names

Hi I have some C++ code that uses user defined input to generate file-names for some output files:
std::string outputName = fileName;
for(int i = 0; i < 4; i++)
{
outputName.pop_back();
}
std::string outputName1 = outputName;
std::string outputName2 = outputName;
outputName.append(".fasta");
outputName1.append("_Ploid1.fasta");
outputName2.append("_Ploid2.fasta");
Where fileName could be any word the user can define with .csv after it e.g. '~/Desktop/mytest.csv'
The code chomps .csv off and makes three filenames / paths for 3 output streams.
It then creates them and attempts to open them:
std::ofstream outputFile;
outputFile.open(outputName.c_str());
std::ofstream outputFile1;
outputFile1.open(outputName1.c_str());
std::ofstream outputFile2;
outputFile2.open(outputName2.c_str());
I made sure to pass the names to open as const char* with the c_str method, however if I test my code by adding the following line:
std::cout << outputFile.is_open() << " " << outputFile1.is_open() << " " << outputFile2.is_open() << std::endl;
and compile with fileName set to "test.csv", the program compiles and runs successfully. However,
three zeros are printed to the screen, showing that the three output filestreams are not in fact open. Why are they not opening? I know passing strings as filenames does not work, which is why I thought conversion with c_str() would be sufficient.
Thanks,
Ben W.
Your issue is likely to be due to the path beginning with ~, which isn't expanded to /{home,Users}/${LOGNAME}. Tilde expansion is done by the shell, not by the file stream classes.
See also the question "ifstream open file C++".
The answer to "How to create a folder in the home directory?" may be of use to you.
Unfortunately, there is no standard, portable way of finding out exactly why open() failed; see
"Detecting reason for failure to open an ofstream when fail() is true".
You wrote: "I know passing strings as filenames does not work which is why I thought conversion with c_str() would be sufficient."
std::basic_ofstream::open() does accept a const std::string & (since C++11)!
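A minimal sketch of expanding a leading ~ by hand before opening the stream. It assumes a POSIX-style environment where the HOME variable holds the home directory; expand_tilde is a hypothetical helper, not a standard library function, and forms such as "~user/..." are deliberately left untouched.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: replaces a leading "~" (alone or followed by '/')
// with the given home directory. Anything else is returned unchanged.
std::string expand_tilde(const std::string& path, const std::string& home) {
    if (path.empty() || path[0] != '~') {
        return path;
    }
    if (path.size() > 1 && path[1] != '/') {
        return path;  // "~user/..." is not handled in this sketch
    }
    return home + path.substr(1);
}

// Convenience overload that reads HOME from the environment (an assumption:
// the variable may be unset, in which case the path is returned as is).
std::string expand_tilde(const std::string& path) {
    const char* home = std::getenv("HOME");
    return home ? expand_tilde(path, std::string(home)) : path;
}
```

The result can then be passed to open(), for example outputFile.open(expand_tilde(outputName)); with a C++11 compiler.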

QFile is not reading nor opening my file

I have a file called "sequence_30.dat" that contains a sequence of 1 and -1 in a vertical representation (i.e.: each 1 or -1 is in a separate line) .. I am trying to read the file for another operation using the following code:
int length = 31;
QFile file("sequence_"+ (static_cast<QString>(length)) +".dat");
if(file.exists()){
file.open(QIODevice::ReadOnly);
if(file.isOpen()){
....
....
}
file.close();
}
but when debugging, execution skips the "if(file.exists())" block, and when that check is removed it skips the "if(file.isOpen())" block instead.
I am quite sure that the path is correct, but if it is not, how can I make sure that I am in the right path (i.e., is there a way to check where I am reading from)? And if the path is correct, why is my file not opening?
static_cast<QString>(length)
Should be:
QString::number( length )
You can check it by just printing it out to the console:
cout << qPrintable( QString( "sequence_" ) +
QString::number( length ) + ".dat" ) << endl;
static_cast does not turn an integer into its decimal text, so instead of a static_cast, you should use QString::number to convert an int into a QString.
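The same pitfall can be illustrated in standard C++ without Qt. This is only an analogy, not a description of what Qt does internally: casting an int to a character type yields the character with that code, while std::to_string produces the digits. Both function names below are hypothetical.

```cpp
#include <string>

// Analogy only: the cast reinterprets the integer as a character code,
// so the file name contains one (often unprintable) character, not digits.
std::string name_via_cast(int length) {
    return "sequence_" + std::string(1, static_cast<char>(length)) + ".dat";
}

// std::to_string plays the role that QString::number plays in Qt.
std::string name_via_number(int length) {
    return "sequence_" + std::to_string(length) + ".dat";
}
```

Printing both results side by side makes the difference visible immediately, which is the same check the answer suggests with qPrintable.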

Console main Input

Okay, some may remember me from earlier. I am fairly new to programming, so I may not seem up to par with many others. However, at the moment I am very much stuck.
int main(int argc, char* argv[]) {
string temp,input,output;//store input from file, and get which file//
ofstream out("output.txt");
if(argc == 3)
{
if(ifstream(argv[2]))
{
input = argv[2];
ifstream in(input);
while(in.good())
{
in >> temp;
ReverseWord(temp);
cout << temp << endl;
out << temp << endl;
}
out.close();
in.close();
}
}
}
This code is meant to reverse the letter order of the words it reads from a file when I type "revstr < input.txt", with input.txt being the file name. However, at the moment the program just opens and closes right away without anything happening and nothing being printed to the console. Does anyone know how to fix this?
If you call your program as revstr < input.txt your main() function will be called (on usual platforms) with:
argv = { "revstr", NULL }
argc = 1
In this case you get the contents of input.txt by reading from std::cin. That is what 'input redirection' means: your standard input stream is redirected to read from a file rather than the keyboard (aka terminal) device. No need to deal with the filename in that case.
To pass a filename as an argument, use revstr input.txt. That should call main() with
argv = { "revstr", "input.txt", NULL }
argc = 2
so the filename will be available as argv[1].
The behavior in the former case is typically due to command shells, which treat '<' as a redirection directive (which ends the preceding command). You may have expected to get
argv = { "revstr", "<", "input.txt", NULL }
argc = 3
For that you would need to apply some form of quoting or escaping to disable the shell behavior, for example revstr "<" input.txt or revstr \< input.txt. But as far as I understand where you are coming from, you want the redirection. In that case forget about argc and argv and simply read your input from std::cin.
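A minimal sketch of the stream-based approach, assuming std::reverse as a stand-in for the asker's ReverseWord (whose definition is not shown). The function names are hypothetical. Taking std::istream& lets the same code read from std::cin or from a file, and the loop tests the extraction itself rather than in.good(), so a failed read is never processed.

```cpp
#include <algorithm>
#include <iostream>
#include <sstream>
#include <string>

// Reads whitespace-separated words from in and writes each one reversed,
// one per line, to out. std::reverse stands in for ReverseWord.
void reverse_words(std::istream& in, std::ostream& out) {
    std::string word;
    while (in >> word) {  // loop ends cleanly at end of input
        std::reverse(word.begin(), word.end());
        out << word << '\n';
    }
}

// Convenience wrapper for trying the function on an in-memory string.
std::string reverse_words_in(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    reverse_words(in, out);
    return out.str();
}
```

With redirection (revstr < input.txt), main would simply call reverse_words(std::cin, std::cout).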

Changing last 5 char of array

I have a program that encrypts files, but adds the extension ".safe" to the end. So the end result is something like "file.txt.safe"
When I go to decrypt the file, the user enters the file name again: "file.txt.safe", which is saved to a char array. Now I want to remove ".safe" and rename the file to its original name.
I have tried the following, but nothing seems to happen and there are no errors.
Decrypt (myFile); //decrypts myFile
char * tmp = myFile;
char * newFile;
newFile = strstr (tmp,".safe"); //search tmp for ".safe"
strncpy (newFile,"",5); //replace .safe with ""
rename (myFile, newFile);
I'm sure I'm missing something obvious, but if this approach doesn't work, I'm looking for any simple method.
Edited to add:
(copied by moderator from poster's response to K-ballo)
Thanks everyone. I took the std::string approach and found this to work:
Decrypt(myFile);
string str = myFile;
size_t pos = str.find(".safe");
str.replace(pos,5,"");
rename(myFile, str.c_str());
For what you want to do, simply changing the strncpy line to this will work:
*newFile = '\0';
This would still have problems if the filename contains an early .safe (like in file.safest.txt.safe), or if it does not contain the substring .safe at all. You would be better off searching from the end of the array, and making sure you do find something.
This seems like a better approach (although in C++ it would be better to just go with std::string):
char* filename = ...;
size_t filename_length = strlen( filename );
int safe_ext_pos = filename_length - 5; // 5 == length of ".safe"
if( safe_ext_pos > 0 && strcmp( ".safe", filename + safe_ext_pos ) == 0 )
filename[ safe_ext_pos ] = '\0';
This is the std::string version of the code:
std::string filename = ...;
int safe_ext_pos = filename.length() - 5; // 5 == length of ".safe"
if( safe_ext_pos > 0 && filename.compare( safe_ext_pos, 5, ".safe" ) == 0 )
filename.erase( safe_ext_pos );
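The std::string version can be generalized into a small function that avoids the hard-coded 5. This is a sketch under the same rule as above (the suffix must sit at the very end, and a name consisting only of the suffix is left alone); strip_suffix is a hypothetical name.

```cpp
#include <string>

// Hypothetical helper: removes suffix from the end of name if present.
// A name that is not strictly longer than the suffix is returned unchanged,
// mirroring the safe_ext_pos > 0 check above.
std::string strip_suffix(const std::string& name, const std::string& suffix) {
    if (name.size() > suffix.size() &&
        name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0) {
        return name.substr(0, name.size() - suffix.size());
    }
    return name;
}
```

For "file.safest.txt.safe" this removes only the trailing ".safe", which is the case the find-based approach gets wrong.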
You should take care:
my.safe.file.txt.safe
Instead of just searching for '.safe' and removing it or truncating the filename at the first one, you should ensure that it's actually at the end of the string:
std::string myFile = ...;
Decrypt(myFile);
const std::string extension_to_remove = ".safe";
if (decryption is successful &&
myFile.size() >= extension_to_remove.size() &&
myFile.substr(myFile.size() - extension_to_remove.size()) == extension_to_remove)
{
// rename() takes C strings, so convert with c_str()
std::string newFile = myFile.substr(0, myFile.size() - extension_to_remove.size());
rename(myFile.c_str(), newFile.c_str());
}
Also a note on filename extensions. It's really a pretty awful practice for software to identify file types using a special format in the filename.* It's fine for humans to organize their files with special naming conventions, but software should by and large be oblivious to it, except perhaps to make it easy for humans to use the conventions they want.
So your code for decrypting a file shouldn't be doing this task. Instead your decryption code should take a file to decrypt and a file to contain the output. Then your code for computing the output filename from the encrypted file's name should exist somewhere else, such as in the user interface where the user tells you the output filename. Your code would remove '.safe' if it exists and supply the modified name as the default output filename, to be confirmed by the user.
void perform_decryption(std::string const &encrypted, std::string const &decrypted) {
Decrypt(encrypted);
if (decryption is successful && encrypted!=decrypted)
rename(encrypted.c_str(), decrypted.c_str());
}
std::string default_decrypted_name(std::string const &filename) {
const std::string extension_to_remove = ".safe";
if (filename.size() >= extension_to_remove.size() &&
filename.substr(filename.size()-extension_to_remove.size()) == extension_to_remove)
{
return filename.substr(0, filename.size()-extension_to_remove.size());
}
return filename + ".decrypted";
}
* Here are some reasons against filename extensions:
Filename extensions are not unique, which in some circumstances causes conflicts where a file's type cannot be positively identified. (The fact that they can't even perform their intended purpose really ought to be enough...)
It degrades the usability of the filename for organizing. When 'myfile.txt' is renamed to 'myfile.txt.old' it's no longer seen as a text file.
It's caused security issues because fake type metadata can be mistaken for real type metadata when the real type metadata is hidden.
and more...