C++ function to add an extra '\' to a file path?

I have about 3500 full file paths to sort through (e.g. "C:\Users\Nick\Documents\ReadIns\NC_000852.gbk"). I just learned that C++ does not accept a single backslash in a file path written in code, and with that many paths it would be overly tedious to change each one manually.
I have this for loop that finds each single backslash and inserts a second backslash at that index:
string line = "C:\Users\Nick\Documents\ReadIns\NC_000852.gbk";
for (unsigned int i = 0; i < filepath.size(); i++) {
if(filepath[i] == '\') {
filepath.insert(i, '\');
}
}
However, C++, specifically in Code::Blocks, does not compile this because of the backslash character. Is there a way to add the extra backslash character with a function?
I am reading the file paths in from a text file, so they end up in the string variable filepath; the literal above is just a test.

Write the backslash as '\\' in character literals and as "C:\\Users..." in string literals, because a single backslash combines with the next character to form an escape sequence.
Also, the string::insert() overload you need takes the number of characters as its 2nd argument, which is missing in your code.
With those fixes, it compiles:
string filepath = "C:\\Users\\Nick\\Documents\\ReadIns\\NC_000852.gbk"; // every backslash doubled
for (unsigned int i = 0; i < filepath.size(); i++) {
    if (filepath[i] == '\\') {        // '\\' instead of '\'
        filepath.insert(i, 1, '\\');  // count argument (1) added
    }
}
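That only fixes compilation. As written, the loop never terminates: after the insert, the original backslash moves to index i + 1 and is matched again on the next iteration. My preferred way is: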
for(auto pos = filepath.find('\\'); pos != string::npos; pos = filepath.find('\\', ++pos))
filepath.insert(++pos, 1, '\\');
If each backslash only needs to be replaced by a single character (for example a forward slash, which Linux uses and which Windows file APIs generally accept too), you can use std::replace() and avoid the explicit loop, as mentioned in this answer:
std::replace(filepath.begin(), filepath.end(), '\\', '/');
I assumed that you already have a file that contains single backslashes and that you are parsing it.
But from your comments, I notice that you are apparently getting the file paths at runtime (i.e. while running the .exe). In that case, as @MSalters has mentioned, you need not worry about such transformations (i.e. changing the backslashes).

The problem that you're seeing is because in C++, string literals are commonly enclosed in "" quotes. This brings up one minor problem: how do you put a quote inside a string literal, when that quote would end the string literal. The solution is escaping it with a \. This can also be used to add a few other characters to a string, such as \n (newline). And since \ now has a special meaning in string literals, it's also used to escape itself. So "\\" is a string containing just one character (and of course a trailing NUL).
This also applies to character literals: char example[4] = {'a', '\\', 'b', 0} is an alternative way to write "a\\b".
Now this is all about compile time, when the compiler needs to separate C++ code and string contents. Once your executable is running, a backslash is just one char. std::cout << "a\\b" prints a single backslash, because there's only one in memory. std::string word; std::cin >> word will read a single word, and if you enter one backslash then word will contain one backslash. The compiler isn't involved in that.
So if you read 3500 filenames from a std::ifstream list_of_filenames and then use that to create a further 3500 std::ifstreams, you only need to worry about backslashes in specifying that very first filename in code. And if you take that filename from argv[1] instead, you don't need to care at all.
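A minimal sketch of that point, assuming a hypothetical helper named read_paths and using a std::istringstream to stand in for the list file. The only doubled backslashes appear in the literal that builds the test data; each path read at runtime contains single backslashes.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read one path per line from any input stream.
// Nothing is escaped or unescaped here; the characters are stored as read.
std::vector<std::string> read_paths(std::istream& in) {
    std::vector<std::string> paths;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate CRLF line endings when the stream was opened in binary mode.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (!line.empty()) paths.push_back(line);
    }
    return paths;
}
```

In real code the stream would be a std::ifstream opened on the list file, and each returned string could be passed directly to another std::ifstream.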

One way to avoid any special handling of backslashes is to keep all file names in a separate file on disk, exactly as they are, and read them with a file stream object such as ifstream.
char szFilename[MAX_PATH] = {0}; // plain char, because ifstream::getline expects a char buffer
ifstream ObjInFiles( "E:\\filenames.txt" );
ObjInFiles.getline( szFilename, MAX_PATH );
ObjInFiles.close();
Suppose the first file name stored in filenames.txt is e:\temp\abc.txt. After getline() above, the buffer holds exactly those characters, with single backslashes; a debugger may display them as "e:\\temp\\abc.txt" because it shows the value in C++ literal form.

Related

How can I read CSV file in to vector in C++

I'm working on a project that converts Python code to C++ for better performance. The Python project is called Advanced EAST. For now, I have the input data for the nms function in a .csv file like this:
"[ 5.9358170e-04 5.2773970e-01 5.0061589e-01 -1.3098677e+00
-2.7747922e+00 1.5079222e+00 -3.4586751e+00]","[ 3.8175487e-05 6.3440394e-01 7.0218205e-01 -1.5393494e+00
-5.1545496e+00 4.2795391e+00 -3.4941311e+00]","[ 4.6003381e-05 5.9677261e-01 6.6983813e-01 -1.6515008e+00
-5.1606908e+00 5.2009044e+00 -3.0518508e+00]","[ 5.5172237e-05 5.8421570e-01 5.9929764e-01 -1.8425952e+00
-5.2444854e+00 4.5013981e+00 -2.7876694e+00]","[ 5.2929961e-05 5.4777789e-01 6.4851379e-01 -1.3151239e+00
-5.1559062e+00 5.2229333e+00 -2.4008298e+00]","[ 8.0250458e-05 6.1284608e-01 6.1014801e-01 -1.8556541e+00
-5.0002270e+00 5.2796564e+00 -2.2154367e+00]","[ 8.1256607e-05 6.1321974e-01 5.9887391e-01 -2.2241254e+00
-4.7920742e+00 5.4237065e+00 -2.2534993e+00]
One unit is 7 numbers, but there is a '\n' after the first four numbers.
I want to read this CSV file into my C++ project
so that I can do the math in C++ and make it faster.
using namespace std;
void read_csv(const string &filename)
{
//File pointer
fstream fin;
//open an existing file
fin.open(filename, ios::in);
vector<vector<vector<double>>> predict;
string line;
while (getline(fin, line))
{
std::istringstream sin(line);
vector<double> preds;
double pred;
while (getline(sin, pred, ']'))
{
preds.push_back(preds);
}
}
}
For now my code is, of course, not working,
and I have no idea how to proceed.
Please help me read the CSV data into my code.
Thanks
Unfortunately parsing strings (and consequently files) is very tedious in C++.
I highly recommend using a library, ideally a header-only one, like this one.
If you insist on writing it yourself, maybe you can draw some inspiration from this StackOverflow question on how to parse general CSV files in C++.
You could look at POSIX getdelim() with ',' as the delimiter (note that it reads from a C FILE*, not from an fstream).
The other issue is those quotes: unless you /know/ the file is always formatted exactly this way, it becomes difficult.
One hack I have used in the past, which is NOT PERFECT: if the first character is a quote, then the last character before the comma must also be a matching quote, and not escaped.
If it is not, call getdelim() some more; because getdelim allocates its buffer automatically, each call must use another buffer. In C++ I end up with a vector of all the pieces returned by getdelim, which then need to be concatenated to make the final string:
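// Assumes POSIX <stdio.h>, <stdlib.h>, <vector>, and an already opened FILE* fp.
std::vector<char*> gotLine;
int first = fgetc(fp);
gotLine.push_back(static_cast<char*>(malloc(2)));
gotLine.back()[0] = static_cast<char>(first);
gotLine.back()[1] = 0;
bool gotquote = first == '"'; // perhaps different classes of quote
if (first != ',' && first != EOF)
    for (;;)
    {
        char* gotSub = nullptr; // null makes getdelim allocate a fresh buffer
        size_t gotCap = 0;
        ssize_t subLen = getdelim(&gotSub, &gotCap, ',', fp);
        if (subLen < 0) { free(gotSub); break; } // end of file or error
        gotLine.push_back(gotSub);
        if (!gotquote) break;
        // gotSub[subLen-1] is normally the ',' itself, so a closing quote sits just before it
        if (subLen > 1 && gotSub[subLen - 2] == '"') // again different classes of quote
            if (subLen == 2 || gotSub[subLen - 3] != '\\') // needs to be a while loop
                break;
    }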
Then just concatenate all these string segments back together.
Note that getdelim supports null bytes. If you expect null bytes in the content, and not represented by the character sequences \000 or \# you need to store the actual length returned by getdelim, and use memcpy to concatenate them.
Oh, and if you allow utf-8 extended quotes it gets very messy!
The case this doesn't cover is a string that ends in \\" or \\\\". Ideally you need a while loop that counts the backslashes immediately before the quote, and accepts the quote as the closing one only if the count is even.
Note that this leaves the issue of unescaping the quoted content, i.e. converting any \" into ", and \\ into \, etc., and also discarding the enclosing quotes.
In the end a library may be easier if you need to deal with completely arbitrary content. But if the content is "known" you can live without.
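For the specific file shown in the question, where the layout is "known", a much simpler approach may be enough. The sketch below is not a general CSV parser: it assumes every unit is enclosed in [ and ], that the numbers inside are separated only by whitespace (including the newline after the fourth number), and that everything outside the brackets can be ignored. The function name read_units is hypothetical.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect the numbers found between each '[' and ']' pair.
std::vector<std::vector<double>> read_units(std::istream& in) {
    // Read the whole input so that units spanning two lines stay together.
    std::string text((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    std::vector<std::vector<double>> units;
    std::string::size_type open = 0;
    while ((open = text.find('[', open)) != std::string::npos) {
        const std::string::size_type close = text.find(']', open);
        if (close == std::string::npos) break; // unterminated unit: stop
        std::istringstream fields(text.substr(open + 1, close - open - 1));
        std::vector<double> unit;
        double value;
        while (fields >> value) unit.push_back(value); // >> skips spaces and newlines
        units.push_back(unit);
        open = close + 1;
    }
    return units;
}
```

Passing a std::ifstream opened on the .csv file should give one inner vector per unit; checking that each has 7 elements is a cheap sanity test on the input.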

Create argument string from argv [duplicate]

Suppose I want to write an application that launches another application, like this:
# This will launch another_app.exe
my_app.exe another_app.exe
# This will launch another_app.exe with arg1, arg2 and arg3 arguments
my_app.exe another_app.exe arg1 arg2 arg3
The problem here is that I'm getting char* argv[] in my main function, but I need to merge it to LPTSTR in order to pass it to CreateProcess.
There is a GetCommandLine function, but I cannot use it because I'm porting code from Linux and tied to argc/argv (otherwise, it's a very ugly hack for me).
I cannot easily merge arguments by hand, because argv[i] might contain spaces.
Basically, I want the reverse of CommandLineToArgvW. Is there a standard way to do this?
The definitive answer on how to quote arguments is on Daniel Colascione's blog:
https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
I am reluctant to quote the code here because I don't know the license. The basic idea is:
for each single argument:
    if it is not empty and does not contain space, \t, \n, \v or "
        just use as is
    else
        output "
        for each character
            backslashes = 0
            if character is backslash
                count how many successive backslashes there are
            fi
            if eow
                output the backslashes doubled
                break
            else if char is "
                output the backslashes doubled
                output \"
            else
                output the backslashes (*not* doubled)
                output character
            fi
        rof
        output "
    fi // needs quoting
rof // each argument
If you need to pass the command line to cmd.exe, see the article (it's different).
I think it is crazy that the Microsoft C runtime library doesn't have a function to do this.
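The outline above can be sketched in C++ roughly as follows. This is my own reading of the pseudocode, not the article's code: the names quote_arg and build_command_line are hypothetical, narrow std::string is assumed, and arguments that are empty or contain a space are also quoted, as the linked article does. It targets the Microsoft C runtime parsing rules only, not cmd.exe.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Quote a single argument so that MSVC-style argv parsing returns it unchanged.
std::string quote_arg(const std::string& arg) {
    if (!arg.empty() && arg.find_first_of(" \t\n\v\"") == std::string::npos)
        return arg; // nothing special inside: use as is

    std::string out = "\"";
    for (std::size_t i = 0;; ++i) {
        std::size_t backslashes = 0;
        while (i < arg.size() && arg[i] == '\\') { ++i; ++backslashes; }

        if (i == arg.size()) {
            // Backslashes before the closing quote must be doubled.
            out.append(backslashes * 2, '\\');
            break;
        } else if (arg[i] == '"') {
            // Double the backslashes, then add one more to escape the quote.
            out.append(backslashes * 2 + 1, '\\');
            out.push_back('"');
        } else {
            // Backslashes not followed by a quote are literal.
            out.append(backslashes, '\\');
            out.push_back(arg[i]);
        }
    }
    out.push_back('"');
    return out;
}

// Join the quoted arguments with single spaces.
std::string build_command_line(const std::vector<std::string>& args) {
    std::string cmd;
    for (const std::string& a : args) {
        if (!cmd.empty()) cmd.push_back(' ');
        cmd += quote_arg(a);
    }
    return cmd;
}
```

With argv from main, something like build_command_line(std::vector<std::string>(argv + 1, argv + argc)) would produce the string for CreateProcess. Since CreateProcessW may modify the command-line buffer, a writable copy should be passed rather than the result of c_str().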
There is no Win32 API that does the reverse of CommandLineToArgvW(). You have to format the command line string yourself. This is nothing more than basic string concatenation.
Microsoft documents the format for command-line arguments (or at least the format expected by VC++-written apps, anyway):
Parsing C++ Command-Line Arguments
Microsoft C/C++ startup code uses the following rules when
interpreting arguments given on the operating system command line:
Arguments are delimited by white space, which is either a space or a
tab.
The caret character (^) is not recognized as an escape character or
delimiter. The character is handled completely by the command-line
parser in the operating system before being passed to the argv array
in the program.
A string surrounded by double quotation marks ("string") is
interpreted as a single argument, regardless of white space contained
within. A quoted string can be embedded in an argument.
A double quotation mark preceded by a backslash (\") is interpreted
as a literal double quotation mark character (").
Backslashes are interpreted literally, unless they immediately
precede a double quotation mark.
If an even number of backslashes is followed by a double quotation
mark, one backslash is placed in the argv array for every pair of
backslashes, and the double quotation mark is interpreted as a string
delimiter.
If an odd number of backslashes is followed by a double quotation
mark, one backslash is placed in the argv array for every pair of
backslashes, and the double quotation mark is "escaped" by the
remaining backslash, causing a literal double quotation mark (") to be
placed in argv.
It should not be hard for you to write a function that takes an array of strings and concatenates them together, applying the reverse of the above rules to each string in the array.
You need to recreate the command line, taking care to enclose the program name and every argument in double quotes. This is done by concatenating a \" at the beginning and at the end of each of these strings.
Assume the program to launch is argv[1], its first argument is argv[2], and so on:
char command[1024]; // size to be adjusted
int i;
for (*command=0, i=1 ; i<argc ; i++) {
if (i > 1) strcat(command, " ");
strcat(command, "\"");
strcat(command, argv[i]);
strcat(command, "\"");
}
Use the 2nd argument of CreateProcess
CreateProcess(NULL, command, ...);
You can check whether the code below suits your needs; the TCHAR array sz can be used as a string pointer. I have added support for both Unicode and MBCS:
#include <tchar.h>
#include <windows.h>
#include <string>
#include <vector>
#ifdef _UNICODE
#define String std::wstring
#else
#define String std::string
#endif
int _tmain(int argc, _TCHAR* argv[])
{
    TCHAR sz[1024] = {0};
    std::vector<String> allArgs(argv, argv + argc);
    for (unsigned i = 1; i < allArgs.size(); i++)
    {
        // Append to sz instead of printing sz into itself (overlapping
        // source and destination), and pass the size in characters, not bytes.
        if (i > 1) _tcscat_s(sz, _countof(sz), _T(" "));
        _tcscat_s(sz, _countof(sz), allArgs[i].c_str());
    }
    return 0;
}

seekg() not working as expected

I have a small program, that is meant to copy a small phrase from a file, but it appears that I am either misinformed as to how seekg() works, or there is a problem in my code preventing the function from working as expected.
The text file contains:
//Intro
previouslyNoted=false
The code is meant to copy the word "false" into a string
std::fstream stats("text.txt", std::ios::out | std::ios::in);
//String that will hold the contents of the file
std::string statsStr = "";
//Integer to hold the index of the phrase we want to extract
int index = 0;
//COPY CONTENTS OF FILE TO STRING
while (!stats.eof())
{
static std::string tempString;
stats >> tempString;
statsStr += tempString + " ";
}
//FIND AND COPY PHRASE
index = statsStr.find("previouslyNoted="); //index is equal to 8
//Place the get pointer where "false" is expected to be
stats.seekg(index + strlen("previouslyNoted=")); //get pointer is placed at 24th index
//Copy phrase
stats >> previouslyNotedStr;
//Output phrase
std::cout << previouslyNotedStr << std::endl;
But for whatever reason, the program outputs:
=false
What I expected to happen:
I believe that I placed the get pointer at the 24th index of the file, which is where the phrase "false" begins. Then the program would've inputted from that index onward until a space character would have been met, or the end of the file would have been met.
What actually happened:
For whatever reason, the get pointer started an index before expected. And I'm not sure as to why. An explanation as to what is going wrong/what I'm doing wrong would be much appreciated.
Also, I do understand that I could simply make previouslyNotedStr a substring of statsStr, starting from where I wish, and I've already tried that with success. I'm really just experimenting here.
The VisualC++ tag means you are on Windows. On Windows the end of line takes two characters (\r\n). When you read the file one string at a time, this end-of-line sequence is treated as a delimiter and you replace it with a single space character.
Therefore, after you read the file, your statsStr does not match the contents of the file. Everywhere there is a new line in the file, you have replaced two characters with one. Hence when you use seekg to position yourself in the file based on numbers you got from the statsStr string, you end up in the wrong place.
Even if you get the new line handling correct, you will still encounter problems if the file contains two or more consecutive white space characters, because these will be collapsed into a single space character by your read loop.
You are reading the file word by word. There are better methods:
while (getline(stats, statsStr))
{
// An entire line is read into statsStr.
std::string::size_type posn = statsStr.find("previouslyNoted=");
// ...
}
By reading entire text lines into a string, there is no need to reposition the file.
Also, there is a white-space issue when reading by word. This will affect where you think the text is in the file. For example, white space is skipped, and there is no telling how many spaces, newlines or tabs were skipped.
By the way, don't even think about replacing the text in the same file. Replacement of text only works if the replacement text has the same length as the original text in the file. Write to a new file instead.
Edit 1:
A better method is to declare your key strings as arrays. This helps with computing positions within a string:
static const char key_text[] = "previouslyNoted=";
while (getline(stats, statsStr))
{
std::string::size_type key_position = statsStr.find(key_text);
std::string::size_type value_position = key_position + sizeof(key_text) - 1; // - 1 excludes the nul terminator counted by sizeof.
// value_position points to the character after the '='.
// ...
}
You may want to save programming time by making your data file conform to an existing format, such as INI or XML, and using an appropriate library to parse it.
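The line-by-line approach can be put together as a small function. This is a sketch under a few assumptions: the helper name find_value is hypothetical, the value is taken to be everything after the key up to the end of the line, and a trailing '\r' is dropped in case the file has CRLF endings but was not translated on reading.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: return the text that follows `key` on the first line
// containing it, or an empty string when no line contains the key.
std::string find_value(std::istream& in, const std::string& key) {
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back(); // stray CR
        const std::string::size_type key_position = line.find(key);
        if (key_position != std::string::npos)
            return line.substr(key_position + key.size());
    }
    return std::string();
}
```

Because the position is computed inside the line that was just read, no seekg call and no file offset arithmetic are needed.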

Split string by delimiter by using vectors - how to split by newline?

I have function like this (I found it somewhere, it works with \t separator).
vector<string> delimited_str_to_vector(string& str, string delimiter)
{
vector<string> retVect;
size_t pos = 0;
while(str.substr(pos).find(delimiter) != string::npos)
{
retVect.push_back(str.substr(pos, str.substr(pos).find(delimiter)));
pos += str.substr(pos).find(delimiter) + delimiter.size();
}
retVect.push_back(str.substr(pos));
return retVect;
}
I have a problem splitting a string by the "\r\n" delimiter. What am I doing wrong?
string data = get_file_contents("csvfile.txt");
vector<string> csvRows = delimited_str_to_vector(data, "\r\n");
I'm sure, that my file uses CRLF for new line.
You can use getline to read the file line by line, which:
Extracts characters from is and stores them into str until the delimitation character delim is found (or the newline character, '\n' ...) If the delimiter is found, it is extracted and discarded, i.e. it is not stored and the next input operation will begin after it.
Perhaps you are already reading the file through a function that removes line endings.
If you open your file in text mode, i.e., you don't mention std::ios_base::binary (or one of its alternate spellings), it is likely that the system-specific end-of-line sequence is replaced by a \n character. That is, even if your source file used \r\n, you may not see this character sequence when reading the file. Add the binary flag when opening the file if you really want to process these sequences.
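To illustrate, here is a minimal splitting sketch (the name split is hypothetical, and it is a rewrite rather than the function from the question). It searches from a running offset instead of building substrings for every search. If the file was read in text mode and the \r\n pairs were already turned into \n, splitting on "\r\n" finds nothing and the whole text comes back as one piece.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split `text` at every occurrence of `delimiter`; the delimiter is not kept.
std::vector<std::string> split(const std::string& text, const std::string& delimiter) {
    std::vector<std::string> parts;
    if (delimiter.empty()) { // avoid an endless loop on an empty delimiter
        parts.push_back(text);
        return parts;
    }
    std::string::size_type start = 0;
    std::string::size_type hit;
    while ((hit = text.find(delimiter, start)) != std::string::npos) {
        parts.push_back(text.substr(start, hit - start));
        start = hit + delimiter.size();
    }
    parts.push_back(text.substr(start)); // remainder after the last delimiter
    return parts;
}
```

So either open the file with std::ios::binary and split on "\r\n", or keep text mode and split on "\n".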

How to read a word into a string ignoring a certain character

I am reading a text file which contains a word with a punctuation mark on it and I would like to read this word into a string without the punctuation marks.
For example, a word may be " Hello, "
I would like the string to get " Hello " (without the comma). How can I do that in C++ using the ifstream library only?
Can I use the ignore function to ignore the last character?
Thank you in advance.
Try ifstream::get(Ch* p, streamsize n, Ch term).
An example:
char buffer[64];
std::cin.get(buffer, 64, ',');
// will read up to 64 characters until a ',' is found
// For the string "Hello," it would stream in "Hello"
If you need to be more robust than simply a comma, you'll need to post-process the string. The steps might be:
Read the stream into a string
Use string::find_first_of() to help "chunk" the words
Return the word as appropriate.
If I've misunderstood your question, please feel free to elaborate!
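A possible shape for the post-processing route, as a sketch only: it reads a whitespace-delimited word with operator>> and then erases every character that std::ispunct reports as punctuation, which is a simpler rule than the find_first_of chunking mentioned above. The helper name read_word is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read the next word and strip punctuation from it.
// Returns false when no more words can be read.
bool read_word(std::istream& in, std::string& word) {
    if (!(in >> word)) return false;
    word.erase(std::remove_if(word.begin(), word.end(),
                              [](unsigned char c) { return std::ispunct(c) != 0; }),
               word.end());
    return true;
}
```

The same function works with a std::ifstream, since it only relies on the std::istream interface.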
If you only want to ignore the comma, you can use getline with ',' as the delimiter.
const int MAX_LEN = 128;
ifstream file("data.txt");
char buffer[MAX_LEN];
while(file.getline(buffer,MAX_LEN,','))
{
cout<<buffer;
}
EDIT: This uses std::string and does away with MAX_LEN
ifstream file("data.txt");
string string_buffer;
while(getline(file,string_buffer,','))
{
cout<<string_buffer;
}
One way would be to use the Boost String Algorithms library. There are several "replace" functions that can be used to replace (or remove) specific characters or strings in strings.
You can also use the Boost Tokenizer library for splitting the string into words after you have removed the punctuation marks.