How to parse a path into a vector using C++

I need to parse the path part of a URL for a "router" as part of a REST web service. I'm using the PION library to handle HTTP requests, but it does not seem to offer any way to retrieve the individual parts of the URL path, and I cannot find another library that does. http://www.w3.org/Library/src/HTParse.c, for example, does not return the parts of the path.
Is there a quicker, more robust way of doing this:
std::vector<std::string> parsePath(std::string path)
{
std::string delimiter = "/";
std::string part = "";
std::size_t firstPos = 0;
std::size_t secondPos = 0;
std::vector<std::string> parts;
while (firstPos != std::string::npos)
{
firstPos = path.find(delimiter, firstPos);
secondPos = path.find(delimiter, firstPos + 1);
part = path.substr(firstPos + 1, (secondPos - 1) - firstPos);
if (part != "") parts.push_back(part);
firstPos = secondPos;
}
return parts;
}

If you have the freedom to use Boost, the easiest way to parse filesystem paths would be to use the filesystem library, which has the advantage of being platform-independent and handling both POSIX and Windows path variants:
boost::filesystem::path p1("/usr/local/bin");
boost::filesystem::path p2("c:\\");
std::cout << p1.filename() << std::endl; // prints "bin"
std::cout << p1.parent_path() << std::endl; // prints "/usr/local"
To iterate through each element of the path, you can use a path iterator:
for (auto const& element : p1)
std::cout << element << std::endl;
prints
"/"
"usr"
"local"
"bin"
Without Boost, choose one of the many ways to parse a delimited string.
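As one Boost-free option, here is a minimal sketch that splits on '/' with std::getline and drops empty segments, so leading, trailing, and doubled slashes are ignored. The name splitPath is made up for illustration, and the function does no percent-decoding or query-string handling, which a real URL router would probably need.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `path` on `delimiter`, skipping empty segments.
std::vector<std::string> splitPath(const std::string& path, char delimiter = '/')
{
    std::vector<std::string> parts;
    std::istringstream stream(path);
    std::string part;
    // std::getline reads up to (and discards) the next delimiter.
    while (std::getline(stream, part, delimiter))
    {
        if (!part.empty())
            parts.push_back(part);
    }
    return parts;
}
```

Called as splitPath("/api/users/42"), this should yield the three parts api, users and 42.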

Related

How can I print specific name files from filesystem

So I can print all the files in a folder, but I'd like to print only the files that I want. For example, if I input Minecraft and the folder contains minecraft_server and minecraft launcher, it should print every name that contains Minecraft, so it would print minecraft_server and minecraft launcher.
I've tried putting it in a for loop, but I cannot index the path by position i; that's not possible.
for (const auto& entry : fs::directory_iterator(path))
{
cout << entry.path() << endl;
}
That would just print all the files.
UPDATED CODE (still doesn't work), where search is whatever the user inputs:
for (const auto& entry : fs::directory_iterator(path)) {
if (entry.path().string().find(search) != string::npos) {
cout << entry.path().string() << endl;
}
}
If I understand your question correctly, which I seriously doubt, you want to loop through a folder and its subfolders and only do something for files whose path contains a certain string.
The following (off the top of my head) should work:
#include <experimental/filesystem>
#include <string>
namespace fs = std::experimental::filesystem;
for (auto& file : fs::recursive_directory_iterator(yourPath))
{
    if (file.path().u8string().find(yourString) != std::string::npos)
    {
        // do your stuff
    }
}
This example comes straight from code I used for 8 weeks straight and it never failed on me:
for (auto file : fs::recursive_directory_iterator("./"))
{
//std::cout << file.path().u8string() << std::endl;
if (includedFiles.find(file.path().u8string()) != includedFiles.end()
|| skipFile(config, files, &file)
|| file.path().u8string().find((*config)["testFile"].get<std::string>()) != std::string::npos
|| file.path().u8string().find((*config)["outputFile"].get<std::string>()) != std::string::npos
|| matchRegex(&fileOrder, &file.path().u8string())) // Last one does ordering
{
//if (file.path().u8string().find("ValidateModel") != std::string::npos)
//{
// std::cout << "skipped model string " << file.path().u8string() << std::endl;
//}
continue;
}
includedFiles[file.path().u8string()] = true;
std::cout << file.path().u8string() << std::endl;
functor(file);
}
Full code minus the library is available at github: https://github.com/erikknaake/IseProjectSQLFileCombiner/blob/master/SQLFileCombiner.cpp
When the folder name is read from the user:
std::string path;
std::getline(std::cin, path);
for (auto& file : fs::recursive_directory_iterator(path))
{
    // do your stuff
}
You may need to prepend a / to the path.
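One possible reason the question's updated loop prints nothing (this is an assumption, since the question does not say what goes wrong) is that std::string::find is case-sensitive and is run against the whole path rather than the file name. A minimal C++17 sketch that compares only the file name, ignoring ASCII case, could look like this. The helper names toLower, nameContains and findByName are hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: ASCII-only lower-casing.
std::string toLower(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// Hypothetical helper: true if `name` contains `search`, ignoring ASCII case.
bool nameContains(const std::string& name, const std::string& search)
{
    return toLower(name).find(toLower(search)) != std::string::npos;
}

// Collect the entries directly inside `dir` whose file name contains `search`.
std::vector<fs::path> findByName(const fs::path& dir, const std::string& search)
{
    std::vector<fs::path> matches;
    for (const auto& entry : fs::directory_iterator(dir))
    {
        if (nameContains(entry.path().filename().string(), search))
            matches.push_back(entry.path());
    }
    return matches;
}
```

Swapping fs::directory_iterator for fs::recursive_directory_iterator would extend the search to subfolders.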

How to convert C++ string array to json?

I have started implementing Microsoft Cognitive Services using C++. I have a C++ String array(faceIds array)
string faceIds[] ={
"29e874a8-a08f-491f-84e8-eac263d51fe1",
"6f89f38a-2411-4f6c-91b5-15eb72c17c22",
"7284b730-6dd7-47a3-aed3-5dadaef75d76",
"1fc794fa-3fd4-4a78-af11-8f36c4cbf14c",
"3e57afca-bd1d-402e-9f96-2cae8dbdfbfa",
"c2a4e0f5-4277-4f5a-ae28-501085b05209",
"23b5910e-9c32-46dd-95f3-bc0434dff641"
};
Then, I try to convert string array(C++) to json string.
JSONObject jsnobject = new JSONObject(10);
JSONArray jsonArray = jsnobject.getJSONArray(faceIds);
for (int i = 0; i < jsonArray.length(); i++) {
JSONObject explrObject = jsonArray.getJSONObject(i);
}
But I ran into problems. So my question is: how do I convert a C++ string array to JSON?
Thanks in advance.
Your question doesn't precisely identify your input and expected output. Are you parsing the C++ from a file? I can't tell.
If the first code block is an autogenerated input file that will always have that whitespace pattern, and the JSON equivalent is your desired output, replace the first line with "[\n" and the last line with "]\n" and you're done.
If you can't guarantee the whitespace pattern of the input file, then you will need a C++ parser to generate an AST (abstract syntax tree) that you can traverse to find the faceIds array RHS (right-hand side), and then do the same thing as shown below from that AST collection.
If you simply want to iterate in C++ through faceIds, then the following code should produce the desired JSON string:
#include <iostream>
#include <sstream>
#include <string>
#include <type_traits> // std::extent
std::string faceIds[] = {
"29e874a8-a08f-491f-84e8-eac263d51fe1",
"6f89f38a-2411-4f6c-91b5-15eb72c17c22",
"7284b730-6dd7-47a3-aed3-5dadaef75d76",
"1fc794fa-3fd4-4a78-af11-8f36c4cbf14c",
"3e57afca-bd1d-402e-9f96-2cae8dbdfbfa",
"c2a4e0f5-4277-4f5a-ae28-501085b05209",
"23b5910e-9c32-46dd-95f3-bc0434dff641"
};
int main() {
std::ostringstream ostr;
ostr << '[' << std::endl;
int last = std::extent<decltype(faceIds)>::value - 1;
int i = 0;
while (i < last)
ostr << " \"" << faceIds[i ++] << "\"," << std::endl;
ostr << " \"" << faceIds[i] << "\"" << std::endl;
ostr << ']' << std::endl;
std::cout << ostr.str();
return 0;
}
If you want some library's object representation, then you'll have to identify what library you are using so we can review its API. Whatever library you use, you could always just run whatever parse method it has on ostr.str() above, but we could find a more efficient method to build the equivalent JSON tree if you identified the JSON library. One can't uniquely identify the library from an object name like JSONObject, which is a class name used in dozens of libraries.
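If no JSON library is at hand, the hand-rolled loop can be wrapped in a small helper that also escapes the characters JSON does not allow unescaped inside a string. This is only a sketch: the names escapeJson and toJsonArray are hypothetical, and only quotes, backslashes, newlines and tabs are escaped, which is enough for IDs like the ones in the question but not for arbitrary text.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: escape quotes, backslashes, newlines and tabs.
// Other control characters are passed through unchanged.
std::string escapeJson(const std::string& text)
{
    std::string out;
    for (char c : text)
    {
        switch (c)
        {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n";  break;
        case '\t': out += "\\t";  break;
        default:   out += c;      break;
        }
    }
    return out;
}

// Hypothetical helper: serialize strings as a compact JSON array.
std::string toJsonArray(const std::vector<std::string>& items)
{
    std::ostringstream out;
    out << '[';
    for (std::size_t i = 0; i < items.size(); ++i)
    {
        if (i != 0)
            out << ',';
        out << '"' << escapeJson(items[i]) << '"';
    }
    out << ']';
    return out.str();
}
```

A plain array such as faceIds can be passed by first copying it into a std::vector<std::string> with std::begin and std::end.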
This is a robust cross platform solution to working with JSON in C++ https://github.com/nlohmann/json. I'm sure Microsoft has some library locked to their own OS too. The examples are clear.
I think the nlohmann c++ library is useful in your case.

Split string by regex in VC++

I am using VC++ 10 in a project. Being new to C/C++, I just Googled it, and it appears that standard C++ doesn't have regex? VC++ 10 seems to have regex. However, how do I do a regex split? Do I need Boost just for that?
Searching the web, I found that many recommend Boost for many things: tokenizing/splitting strings, parsing (PEG), and now even regex (though this should be built in...). Can I conclude Boost is a must-have? It's 180 MB for just trivial things that are supported natively in many languages.
The C++11 standard has std::regex. It is also included in TR1 for Visual Studio 2010. Actually, TR1 has been available since VS2008, hidden under the std::tr1 namespace. So you don't need Boost.Regex for VS2008 or later.
Splitting can be performed using regex_token_iterator:
#include <iostream>
#include <string>
#include <regex>
const std::string s("The-meaning-of-life-and-everything");
const std::tr1::regex separator("-");
const std::tr1::sregex_token_iterator endOfSequence;
std::tr1::sregex_token_iterator token(s.begin(), s.end(), separator, -1);
while(token != endOfSequence)
{
std::cout << *token++ << std::endl;
}
If you also need the separator itself, you can obtain it from the sub_match object pointed to by token; it is a pair containing the start and end iterators of the token.
while(token != endOfSequence)
{
const std::tr1::sregex_token_iterator::value_type& subMatch = *token;
if(subMatch.first != s.begin())
{
const char sep = *(subMatch.first - 1);
std::cout << "Separator: " << sep << std::endl;
}
std::cout << *token++ << std::endl;
}
This sample covers the case of a single-character separator. If the separator itself can be any substring, you need to do some more complex iterator work and possibly store the previous token's sub_match object.
Or you can use regex groups, placing the separators in the first group and the real tokens in the second:
const std::string s("The-meaning-of-life-and-everything");
const std::tr1::regex separatorAndStr("(-*)([^-]*)");
const std::tr1::sregex_token_iterator endOfSequence;
// Separators will be the 0th, 2nd, 4th... tokens
// Real tokens will be the 1st, 3rd, 5th... tokens
int subMatches[] = { 1, 2 };
std::tr1::sregex_token_iterator token(s.begin(), s.end(), separatorAndStr, subMatches);
while(token != endOfSequence)
{
std::cout << *token++ << std::endl;
}
Not sure it is 100% correct, but just to illustrate the idea.
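With a C++11 compiler, the same split can be written against std::regex directly instead of std::tr1. The sketch below wraps it in a function that returns the tokens as a vector. The name regexSplit is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split `text` wherever `separator` matches.
std::vector<std::string> regexSplit(const std::string& text, const std::regex& separator)
{
    std::vector<std::string> parts;
    // Submatch index -1 selects the text between matches, not the matches.
    std::sregex_token_iterator token(text.begin(), text.end(), separator, -1);
    const std::sregex_token_iterator endOfSequence;
    for (; token != endOfSequence; ++token)
        parts.push_back(token->str());
    return parts;
}
```

Because the separator is a regex, a pattern such as "[,;]\\s*" can split on either a comma or a semicolon and swallow the whitespace that follows.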
Here is an example from this blog.
You'll have all your matches in res:
std::tr1::cmatch res;
std::string str = "<h2>Egg prices</h2>";
std::tr1::regex rx("<h(.)>([^<]+)");
std::tr1::regex_search(str.c_str(), res, rx);
std::cout << res[1] << ". " << res[2] << "\n";

how to subtract one path from another?

So... I have a base path and a new path. The new path contains the base path. I need to see what is different in the new path. For example, if the base path is /home/ and the new path is /home/apple/one, I need to get apple/one from it. Note: when I later build a path from homePath/diffPath, I need to get /home/apple/one again. How can I do this with Boost.Filesystem?
By using filename() and parent_path() to walk backwards from the new path until we get back to the base path, this works, but I am not sure if it is very safe. (filename() is used rather than stem() so that a component such as one.txt keeps its extension.)
Be cautious, as the paths "/home" and "/home/" are treated as different paths. The code below only works if the base path is /home (without a trailing slash) and the new path is guaranteed to be below the base path in the directory tree.
#include <iostream>
#include <boost/filesystem.hpp>
int main(void)
{
    namespace fs = boost::filesystem;
    fs::path basepath("/home");
    fs::path newpath("/home/apple/one");
    fs::path diffpath;
    fs::path tmppath = newpath;
    while (tmppath != basepath) {
        // Prepend the last component, then move one level up.
        diffpath = tmppath.filename() / diffpath;
        tmppath = tmppath.parent_path();
    }
    std::cout << "basepath: " << basepath << std::endl;
    std::cout << "newpath: " << newpath << std::endl;
    std::cout << "diffpath: " << diffpath << std::endl;
    std::cout << "basepath/diffpath: " << basepath/diffpath << std::endl;
    return 0;
}
Assuming you have:
namespace fs = std::filesystem; // or boost::filesystem
fs::path base = "/home/usera";
fs::path full = "/home/usera/documents/doc";
If you want to extract documents/doc, you can do that with lexically_relative:
fs::path diff = full.lexically_relative(base);
assert( diff == fs::path("documents/doc") );
This works for base = "/home/usera" or base = "/home/usera/". If full does not contain base, this may give you a pretty long path with lots of .. instead of an error.
std::filesystem::path::lexically_relative requires C++17.
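Another solution, if you know that newpath really lies below basepath, could be:
auto nit = newpath.begin();
// Skip the leading components that newpath shares with basepath.
for (auto bit = basepath.begin(); bit != basepath.end(); ++bit, ++nit)
    ;
// Join the remaining components into the difference.
fs::path diffpath;
for (; nit != newpath.end(); ++nit)
    diffpath /= *nit;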
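Since lexically_relative answers with a chain of .. rather than an error when the new path is not below the base, a small wrapper can make that case explicit. This is a sketch assuming C++17 std::filesystem. The name pathBelow is hypothetical, and the check is purely lexical (no symlinks are resolved and the disk is never touched).

```cpp
#include <filesystem>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical helper: return `full` relative to `base`, or std::nullopt
// when `full` does not lie below `base`.
std::optional<fs::path> pathBelow(const fs::path& base, const fs::path& full)
{
    const fs::path diff = full.lexically_relative(base);
    // An empty result means no relative path could be formed; a leading
    // ".." means `full` is outside `base`.
    if (diff.empty() || *diff.begin() == "..")
        return std::nullopt;
    return diff;
}
```

For base /home and full /home/apple/one this should give apple/one, and appending that result to the base with operator/ gives back the original path.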

How can I extract the file name and extension from a path in C++

I have a list of files stored in a .log in this syntax:
c:\foto\foto2003\shadow.gif
D:\etc\mom.jpg
I want to extract the name and the extension from these files. Can you give an example of a simple way to do this?
To extract a filename without its extension, use boost::filesystem::path::stem instead of the ugly std::string::find_last_of("."):
boost::filesystem::path p("c:/dir/dir/file.ext");
std::cout << "filename and extension : " << p.filename() << std::endl; // file.ext
std::cout << "filename only : " << p.stem() << std::endl; // file
For C++17:
#include <filesystem>
std::filesystem::path p("c:/dir/dir/file.ext");
std::cout << "filename and extension: " << p.filename() << std::endl; // "file.ext"
std::cout << "filename only: " << p.stem() << std::endl; // "file"
Reference about filesystem: http://en.cppreference.com/w/cpp/filesystem
std::filesystem::path::filename
std::filesystem::path::stem
As suggested by @RoiDanto, regarding output formatting: streaming a path to std::cout may surround the output with quotation marks, e.g.:
filename and extension: "file.ext"
You can convert std::filesystem::path to std::string by p.filename().string() if that's what you need, e.g.:
filename and extension: file.ext
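Since the question asks for both the name and the extension, here is a minimal C++17 sketch that returns the two as plain strings. The name nameAndExtension is hypothetical. Note that on POSIX systems std::filesystem does not treat \ as a separator, so the Windows-style lines from the question's log would only split as expected in a Windows build or after replacing the backslashes.

```cpp
#include <filesystem>
#include <string>
#include <utility>

// Hypothetical helper: return {name without extension, extension with its dot}.
std::pair<std::string, std::string> nameAndExtension(const std::filesystem::path& p)
{
    return { p.stem().string(), p.extension().string() };
}
```

For c:/dir/dir/file.ext the pair should hold file and .ext. A name without a dot yields an empty extension.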
If you want a safe way (i.e. portable between platforms and not making assumptions about the path), I'd recommend using boost::filesystem.
It would look something like this:
boost::filesystem::path my_path( filename );
Then you can extract various data from this path. Here's the documentation of path object.
BTW: Also remember that in order to use path like
c:\foto\foto2003\shadow.gif
you need to escape the \ in a string literal:
const char* filename = "c:\\foto\\foto2003\\shadow.gif";
Or use / instead:
const char* filename = "c:/foto/foto2003/shadow.gif";
This only applies to specifying literal strings in "" quotes, the problem doesn't exist when you load paths from a file.
You'll have to read your filenames from the file into a std::string. You can use the string extraction operator of std::istream (or std::getline if a path may contain spaces). Once you have your filename in a std::string, you can use the std::string::find_last_of method to find the last separator.
Something like this:
std::ifstream input("file.log");
std::string path;
while (std::getline(input, path)) // one path per line
{
    std::string name;
    std::string ext;
    // Strip the directory part, accepting either separator.
    size_t sep = path.find_last_of("\\/");
    if (sep != std::string::npos)
        path = path.substr(sep + 1);
    // Split at the last dot; ext keeps the dot (e.g. ".gif").
    size_t dot = path.find_last_of(".");
    if (dot != std::string::npos)
    {
        name = path.substr(0, dot);
        ext = path.substr(dot);
    }
    else
    {
        name = path;
    }
    // use name and ext here
}
Not the code, but here is the idea:
Read a std::string from the input stream (std::ifstream), each instance read will be the full path
Do a find_last_of on the string for the \
Extract a substring from this position to the end, this will now give you the file name
Do a find_last_of for ., and a substring either side will give you name + extension.
The following trick extracts the file name without its extension from a file path in C++ (no external libraries required):
#include <iostream>
#include <string>
using std::string;
string getFileName(const string& s) {
    char sep = '/';
#ifdef _WIN32
    sep = '\\';
#endif
    // If there is no separator, the whole string is the file name.
    size_t i = s.rfind(sep);
    string filename = (i != string::npos) ? s.substr(i + 1) : s;
    // Drop everything from the last dot onwards, if there is one.
    size_t lastindex = filename.find_last_of(".");
    return filename.substr(0, lastindex);
}
int main() {
    string path = "/home/aymen/hello_world.cpp";
    string ss = getFileName(path);
    std::cout << "The file name is \"" << ss << "\"\n";
}
I also use this snippet to determine the appropriate slash character:
boost::filesystem::path slash("/");
boost::filesystem::path::string_type preferredSlash = slash.make_preferred().native();
and then replace the slashes with the preferred slash for the OS. Useful if one is constantly deploying between Linux/Windows.
On Linux or Unix machines, the OS provides two functions for dealing with path and file names. Use man 3 basename to get more information about these functions.
The advantage of using the system-provided functionality is that you don't have to install Boost or write your own functions.
#include <libgen.h>
char *dirname(char *path);
char *basename(char *path);
Example code from the man page:
char *dirc, *basec, *bname, *dname;
char *path = "/etc/passwd";
dirc = strdup(path);
basec = strdup(path);
dname = dirname(dirc);
bname = basename(basec);
printf("dirname=%s, basename=%s\n", dname, bname);
Because of the non-const argument type of the basename() function, using it inside C++ code is not entirely straightforward. Here is a simple example from my code base:
string getFileStem(const string& filePath) const {
char* buff = new char[filePath.size()+1];
strcpy(buff, filePath.c_str());
string tmp = string(basename(buff));
string::size_type i = tmp.rfind('.');
if (i != string::npos) {
tmp = tmp.substr(0,i);
}
delete[] buff;
return tmp;
}
The use of new/delete is not good style. I could have put it into a try/catch block in case something happened between the two calls.
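One way to sidestep the new/delete concern is to let a std::vector<char> own the writable copy, so the buffer is released even if an exception is thrown. The sketch below is a hypothetical free-function variant of getFileStem, assuming a POSIX system where <libgen.h> is available.

```cpp
#include <libgen.h>   // POSIX basename()
#include <string>
#include <vector>

// Hypothetical free-function variant of getFileStem without new/delete.
std::string getFileStem(const std::string& filePath)
{
    // basename() may modify its argument, so hand it a writable,
    // null-terminated copy that cleans itself up.
    std::vector<char> buffer(filePath.begin(), filePath.end());
    buffer.push_back('\0');

    std::string name = basename(buffer.data());
    // Strip everything from the last dot onwards, if there is one.
    const std::string::size_type dot = name.rfind('.');
    if (dot != std::string::npos)
        name = name.substr(0, dot);
    return name;
}
```

The result is copied into a std::string before the buffer goes out of scope, because the pointer returned by basename() may point into that buffer.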
Nickolay Merkin's and Yuchen Zhong's answers are great, but as the comments point out, they are not fully accurate.
Streaming a path with operator<< wraps the file name in quotation marks; no conversion to std::string is involved. The comments aren't accurate either.
path::filename() and path::stem() return a new path object. With std::filesystem, path::string() returns a std::string by value, so std::cout << file_path.filename().string() << "\n" is safe. With boost::filesystem, path::string() may return a reference into the temporary path. That temporary lives until the end of the full expression, so printing it directly is still fine, but keeping the reference (for example const std::string& s = file_path.filename().string();) leaves it dangling once the temporary has been destroyed.