I am trying to pass an sf::String to std::filesystem::u8path. My first approach was to convert it to a std::string with (std::string)sfstringbar, but the result is treated as single-byte characters. I also tried auto x = sfstringbar.toUtf8() followed by std::string(x.begin(), x.end()), with the same result. My second approach was to pass it as a char array, hoping it would be read as UTF-8, but the same thing happens.
EDIT:
char* makeutf8str(str string) {
std::basic_string<sf::Uint8> utf8 = string.toUtf8();
std::vector<char>* out = new std::vector<char>;
for (auto x = utf8.begin(); x != utf8.end(); x++) {
out->push_back(*x);
}
return &(out->at(0));
}
bool neaxfile::isfile(str file) {
std::cout << "\nThis: " << makeutf8str(file) << "\n";
return std::filesystem::is_regular_file(std::filesystem::u8path(makeutf8str(file)));
}
This is the second approach I tried. As an example, I have a file called Яyes.txt, but when I check whether it exists, the function says it doesn't, because makeutf8str() splits Я into Ð and ¯. I can't seem to get the encoding to work properly.
EDIT 2:
str neaxfile::getcwd() {
std::error_code ec;
str path = std::filesystem::current_path(ec).u8string();
if (ec.value() == 0) {
return path;
} else {
return '\0';
}
}
std::vector<str> neaxfile::listfiles() {
std::vector<str> res;
for (auto entry : std::filesystem::directory_iterator((std::string)neaxfile::getcwd())) {
if (neaxfile::isfile(entry.path().wstring())) res.push_back(entry.path().wstring());
}
return res;
}
I tried the first solution below. It no longer prints Я. But it still doesn't confirm that this is a file. I tried to list the files using that ^
std::filesystem::u8path() "Constructs a path p from a UTF-8 encoded sequence of chars [or char8_ts (since C++20)], supplied either as an std::string, or as std::string_view, or as a null-terminated multibyte string, or as a [first, last) iterator pair."
A std::string can hold a UTF-8 encoded char sequence (better to use std::u8string in C++20, though). sf::String::toUtf8() returns a UTF-8 encoded std::basic_string<Uint8>. You can simply cast the Uint8 data to char to construct a std::string; there is no need for your makeutf8str() function to use std::vector<char> or return a raw char* at all (especially since it leaks the std::vector anyway).
You can use the std::string constructor that takes a char* and a size_t as input, eg:
std::string makeutf8str(const str &string) {
auto utf8 = string.toUtf8();
return std::string(reinterpret_cast<const char*>(utf8.c_str()), utf8.size());
}
Or, you can use the std::string constructor that takes a range of iterators as input (despite your claim, this should work just fine), eg:
std::string makeutf8str(const str &string) {
auto utf8 = string.toUtf8();
return std::string(utf8.begin(), utf8.end());
}
Either way will work fine with std::cout and std::filesystem::u8path(), eg:
bool neaxfile::isfile(const str &file) {
auto utf8 = makeutf8str(file);
std::cout << "\nThis: " << utf8 << "\n";
return std::filesystem::is_regular_file(std::filesystem::u8path(utf8));
}
That being said, the Unicode character Я is encoded in UTF-8 as bytes 0xD0 0xAF, which when interpreted as Latin-1 instead of UTF-8 will appear as Я. This means the std::string data is properly UTF-8 encoded; it is just not being processed correctly. For instance, if your console cannot handle UTF-8 output, then you will see Я instead of Я. However, u8path() should process the UTF-8 encoded std::string just fine, and convert it to the filesystem's native encoding as needed. There is still no guarantee that the underlying filesystem will actually handle a Unicode filename like Яyes.txt properly, but that would be an OS issue, not a C++ issue.
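To tell an encoding problem apart from a display problem, it can help to look at the raw bytes rather than at what the console renders. The following is only a minimal sketch; hexBytes is a hypothetical helper, not part of SFML or <filesystem>:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from SFML or the standard library): formats each
// byte of a string as two uppercase hex digits, separated by single spaces.
std::string hexBytes(const std::string &s) {
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(c));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

If hexBytes(makeutf8str(file)) starts with D0 AF for Яyes.txt, the string is correctly UTF-8 encoded and only the console output is misleading.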
UPDATE: your listfiles() function is not making use of UTF-8 at all when using directory_iterator. It is type-casting the sf::String from getcwd() to an ANSI encoded std::string (which is a lossy conversion), not to a UTF-8 encoded std::string. But worse, that sf::String is being constructed by getcwd() from a UTF-8 encoded std::string, but the std::string constructor of sf::String requires ANSI by default, not UTF-8 (to fix that, you have to give it a UTF-8 std::locale). So, you are passing through several lossy conversions trying to get a string from the std::filesystem::path returned from std::filesystem::current_path to std::filesystem::directory_iterator.
sf::String can convert to/from std::wstring, which std::filesystem::path can also use, so there is no need to go through UTF-8 and std::filesystem::u8path() at all, at least on Windows where std::wstring uses UTF-16 and Windows underlying filesystem APIs also use UTF-16.
Try this instead:
bool neaxfile::isfile(const str &file) {
std::wstring wstr = file;
std::wcout << L"\nThis: " << wstr << L"\n";
return std::filesystem::is_regular_file(std::filesystem::path(wstr));
}
str neaxfile::getcwd() {
std::error_code ec;
str path = std::filesystem::current_path(ec).wstring();
if (ec.value() == 0) {
return path;
} else {
return L"";
}
}
std::vector<str> neaxfile::listfiles() {
std::vector<str> res;
std::filesystem::path cwdpath(neaxfile::getcwd().toWideString());
for (auto entry : std::filesystem::directory_iterator(cwdpath)) {
str filepath = entry.path().wstring();
if (neaxfile::isfile(filepath)) res.push_back(filepath);
}
return res;
}
If you really want to use UTF-8 for conversions between C++ strings and SFML strings, then try this instead to avoid any data loss:
std::string makeutf8str(const str &string) {
auto utf8 = string.toUtf8();
return std::string(reinterpret_cast<const char*>(utf8.c_str()), utf8.size());
}
str fromutf8str(const std::string &string) {
return str::fromUtf8(string.begin(), string.end());
}
bool neaxfile::isfile(const str &file) {
auto utf8 = makeutf8str(file);
std::cout << "\nThis: " << utf8 << "\n";
return std::filesystem::is_regular_file(std::filesystem::u8path(utf8));
}
str neaxfile::getcwd() {
std::error_code ec;
auto path = std::filesystem::current_path(ec).u8string();
if (ec.value() == 0) {
return fromutf8str(path);
} else {
return "";
}
}
std::vector<str> neaxfile::listfiles() {
std::vector<str> res;
auto cwdpath = std::filesystem::u8path(makeutf8str(neaxfile::getcwd()));
for (auto entry : std::filesystem::directory_iterator(cwdpath)) {
str filepath = fromutf8str(entry.path().u8string());
if (neaxfile::isfile(filepath)) res.push_back(filepath);
}
return res;
}
That being said, you are doing a lot of unnecessary conversions between C++ strings and SFML strings. You really shouldn't be using SFML strings when you are not directly interacting with SFML's API. You really should be using C++ strings as much as possible, especially with the <filesystem> API, eg:
bool neaxfile::isfile(const std::string &file) {
std::cout << "\nThis: " << file << "\n";
return std::filesystem::is_regular_file(std::filesystem::u8path(file));
}
std::string neaxfile::getcwd() {
std::error_code ec;
std::string path = std::filesystem::current_path(ec).u8string();
if (ec.value() == 0) {
return path;
} else {
return "";
}
}
std::vector<std::string> neaxfile::listfiles() {
std::vector<std::string> res;
auto cwdpath = std::filesystem::u8path(neaxfile::getcwd());
for (auto entry : std::filesystem::directory_iterator(cwdpath)) {
auto filepath = entry.path().u8string();
if (neaxfile::isfile(filepath)) res.push_back(filepath);
}
return res;
}
Alternatively:
bool neaxfile::isfile(const std::wstring &file) {
std::wcout << L"\nThis: " << file << L"\n";
return std::filesystem::is_regular_file(std::filesystem::path(file));
}
std::wstring neaxfile::getcwd() {
std::error_code ec;
auto path = std::filesystem::current_path(ec).wstring();
if (ec.value() == 0) {
return path;
} else {
return L"";
}
}
std::vector<std::wstring> neaxfile::listfiles() {
std::vector<std::wstring> res;
std::filesystem::path cwdpath(neaxfile::getcwd());
for (auto entry : std::filesystem::directory_iterator(cwdpath)) {
auto filepath = entry.path().wstring();
if (neaxfile::isfile(filepath)) res.push_back(filepath);
}
return res;
}
A better option is to simply not pass around strings at all. std::filesystem::path is an abstraction to help shield you from that, eg:
bool neaxfile::isfile(const std::filesystem::path &file) {
std::wcout << L"\nThis: " << file.wstring() << L"\n";
return std::filesystem::is_regular_file(file);
}
std::filesystem::path neaxfile::getcwd() {
std::error_code ec;
auto path = std::filesystem::current_path(ec);
if (ec.value() == 0) {
return path;
} else {
return {};
}
}
std::vector<std::filesystem::path> neaxfile::listfiles() {
std::vector<std::filesystem::path> res;
for (auto entry : std::filesystem::directory_iterator(neaxfile::getcwd())) {
auto filepath = entry.path();
if (neaxfile::isfile(filepath)) res.push_back(filepath);
}
return res;
}
I want my function() to always return a "" string under error conditions else return a string that is converted to string from an unsigned long integer variable.
My initial implementation is as follows:
uint32 cfgVariable_1 = 4;
uint32 cfgVariable_2 = 1;
const char* getCfgVariable (const char* msg)
{
char* retValue = "";
if(strcmp("cfgVariable_1", msg)==0)
{
// Since I want my function to return a const char* and returning uint32_t is not an option
sprintf((retValue), "%lu", cfgVariable_1);
return (const char*)retValue;
}
else if(strcmp("cfgVariable_2", msg)==0)
{
// Since I want my function to return a const char* and returning uint32_t is not an option
sprintf((retValue), "%lu", cfgVariable_2);
return (const char*)retValue;
}
else
{
//error
}
return (const char*) retValue;
}
When the function is called at different instances to get the cfgVariables, I expect my function getCfgVariable() to return "" on error condition, when no match found.
Somewhere in code:
const char* CfgValue = NULL;
CfgValue = getCfgVariable("cfgVariable_1");
Here CfgValue gets pointed to location which contains 4
later
const char* CfgValue = NULL;
CfgValue = getCfgVariable("cfgVariable_3");
I expect to get a "" back but I get 4 instead (CfgValue gets the same address as before).
Fix implemented by me works, but I fail to understand the logic behind it, fix:
const char* getCfgVariable (const char* msg)
{
const char* defValue = "";
char* retValue = "\0";
if(strcmp("cfgVariable_1", msg)==0)
{
// Since I want my function to return a const char* and returning uint32_t is not an option
sprintf((retValue), "%lu", cfgVariable_1);
return (const char*)retValue;
}
else if(strcmp("cfgVariable_2", msg)==0)
{
// Since I want my function to return a const char* and returning uint32_t is not an option
sprintf((retValue), "%lu", cfgVariable_2);
return (const char*)retValue;
}
else
{
//error
}
return defValue;
}
I see during debugging that defValue and retValue get pointed to two different locations that do not get overwritten. defValue always gets pointed to the same address when its initialized with "" and retValue gets pointed to a different address when initialized with "\0". Can anyone explain the logic behind this ? Is there a better implementation for my use case ?
My Solution after considering the comments:
const char* getCfgVariable (const char* msg)
{
const char* retValue = "";
std::ostringstream oss;
if(!strcmp("cfgVariable_1", msg))
{
oss << cfgVariable_1;
}
else if(!strcmp("cfgVariable_2", msg))
{
oss << cfgVariable_2;
}
else
{
//error
return retValue;
}
const std::string tmp = oss.str();
retValue = tmp.c_str();
return retValue;
}
Thanks for the comments so far and this solution is still open to further improvement suggestions.
String literals such as "\0", "", "cfgVariable_1", etc. are constant strings compiled into your resulting executable. Attempting to write values into those strings is undefined behavior and downright dangerous! In old-style C, you'd have to use malloc to allocate a bit of memory to use for your string. This is a real pain to deal with in practice (and not ideal for someone who's learning C++).
A far simpler solution is to start using the C++ string object, std::string (which handles all of the dynamic memory allocation for you!). This should reduce your problem to something a little simpler (and most importantly, safer!):
#include <cstring>
#include <string>
#include <sstream>
std::string getCfgVariable (const char* const msg)
{
std::ostringstream oss;
if(!strcmp("cfgVariable_1", msg))
{
oss << cfgVariable_1;
}
else
if(!strcmp("cfgVariable_2", msg))
{
oss << cfgVariable_2;
}
return oss.str();
}
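If the values are plain integers, std::to_string might be enough and the stream is not needed. This is only a sketch under assumptions: getCfgVariableText is a made-up name, and std::uint32_t stands in for the question's uint32 type.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Stand-ins for the question's globals; std::uint32_t is assumed for uint32.
static std::uint32_t cfgVariable_1 = 4;
static std::uint32_t cfgVariable_2 = 1;

// Hypothetical variant: returns the value as text, or an empty string when
// the name matches no known variable.
std::string getCfgVariableText(const char *msg) {
    if (std::strcmp("cfgVariable_1", msg) == 0) return std::to_string(cfgVariable_1);
    if (std::strcmp("cfgVariable_2", msg) == 0) return std::to_string(cfgVariable_2);
    return {};
}
```

If a const char* is needed at the call site, keep the returned std::string in a variable and call c_str() on that variable; a pointer taken from a temporary or from a local inside the function dangles as soon as that string is destroyed.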
Doing this in C, you have 2 choices. Allocate the memory for the returned string, or use a static buffer that is always available (which is what this example does).
uint32 cfgVariable_1 = 4;
uint32 cfgVariable_2 = 1;
const char* getCfgVariable (const char* msg)
{
static char retValue[32] = {0};
if(strcmp("cfgVariable_1", msg)==0)
{
// Since I want my function to return a const char* and returning uint32_t is not an option
sprintf(retValue, "%lu", cfgVariable_1);
return retValue;
}
else if(strcmp("cfgVariable_2", msg)==0)
{
// Since I want my function to return a const char* and returning uint32_t is not an option
sprintf(retValue, "%lu", cfgVariable_2);
return retValue;
}
else
{
retValue[0] = '\0'; // error: clear the buffer so that "" is returned
}
return retValue;
}
However, because now the retValue is an array fixed in memory, the string returned would only be valid until the next call to getCfgVariable, which could be a little strange....
const char* A = getCfgVariable("cfgVariable_1");
printf("%s\n", A); // prints '4'
const char* B = getCfgVariable("cfgVariable_2");
printf("%s\n", B); // prints '1'
printf("%s\n", A); // now this will print '1', and not '4'.
const char* C = getCfgVariable("anythingElse");
printf("%s\n", C); // prints ""
printf("%s\n", B); // prints ""
printf("%s\n", A); // also prints ""
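One way around the shared static buffer, still in C style, is to let the caller own the storage. The sketch below is an assumption-laden illustration rather than the code from the question: getCfgVariableInto is a hypothetical name, and std::uint32_t stands in for uint32.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

// Stand-ins for the question's globals; std::uint32_t is assumed for uint32.
static std::uint32_t cfgVariable_1 = 4;
static std::uint32_t cfgVariable_2 = 1;

// Hypothetical variant: writes the text into a buffer owned by the caller.
// Returns true on a match; when nothing matches, the buffer holds "".
bool getCfgVariableInto(const char *msg, char *out, std::size_t outSize) {
    if (out == nullptr || outSize == 0) return false;
    out[0] = '\0';
    const std::uint32_t *value = nullptr;
    if (std::strcmp("cfgVariable_1", msg) == 0) value = &cfgVariable_1;
    else if (std::strcmp("cfgVariable_2", msg) == 0) value = &cfgVariable_2;
    if (value == nullptr) return false;
    // snprintf never writes more than outSize bytes, terminator included.
    std::snprintf(out, outSize, "%lu", static_cast<unsigned long>(*value));
    return true;
}
```

Each caller passes its own array, so a later call no longer changes what an earlier result points to.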
I have a key like:
wchar_t key[] = L"764frtfg88fgt320nolmo098vfr";
and a char* row[i] returned by a query from a Database.
I'd like to compare my Key with row[i]. I tried with
wcscmp (key,row[i]) != 0)
but it gives me an error. Any suggestions ?
This might help: C++ Convert string (or char*) to wstring (or wchar_t*)
As a summary:
#include <iostream>
#include <string>
wchar_t Key[] = L"764frtfg88fgt320nolmo098vfr";
std::wstring k(Key);
const char* text = "test"; // your row[i]
std::string t(text);
// only works well if the string being converted contains only ASCII characters.
std::wstring a(t.begin(), t.end());
if(a.compare(k) == 0)
{
std::cout << "same" << std::endl;
}
I'd use C++ tools:
#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>
// construct a wstring from a string
std::wstring to_wstring(std::string const& str)
{
const size_t len = ::mbstowcs(nullptr, &str[0], 0);
if (len == size_t(-1)) {
throw std::runtime_error("to_wstring()");
}
std::wstring result(len, 0);
::mbstowcs(&result[0], &str[0], result.size());
return result;
}
//
// TEST CASES ---
//
const wchar_t key[] = L"764frtfg88fgt320nolmo098vfr";
const auto wkey = std::wstring(key);
bool operator==(std::string const& lhs, std::wstring const& rhs)
{
return to_wstring(lhs) == rhs;
}
bool operator==(std::wstring const& lhs, std::string const& rhs) { return rhs == lhs; }
int main() {
std::cout << std::boolalpha << ("hello" == wkey) << "\n"
<< (wkey == "764frtfg88fgt320nolmo098vfr") << "\n";
}
Prints
false
true
Its perks are that it (should) work(s) with non-ASCII characters on both *nix and windows.
There are other answers already, but you could also convert char* to wchar_t* like this.
Declare the following:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t* wc = new wchar_t[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
Then use it like this:
wchar_t * temprow;
temprow = (wchar_t *)GetWC(row[i]);
/* replace following line with your own */
std::cout << "i " << i << " is " << (wcscmp (key,temprow) != 0) << "\n";
/* avoid memory leak */
delete[] temprow; // allocated with new[], so release it with delete[], not free()
Say thanks to this thread: How to convert char* to wchar_t*?
I need to split LPWSTR with multiple delimiters & return array of LPWSTR in c++. How to do it?
I tried to do from the following question:
How to split char pointer with multiple delimiters & return array of char pointers in c++?
But it prints ?? for each wstring. What's wrong with it?
Can I do it the way I tried below? If so, what mistake did I make? If not, how should I do it?
std::vector<wstring> splitManyW(const wstring &original, const wstring &delimiters)
{
std::wstringstream stream(original);
std::wstring line;
vector <wstring> wordVector;
while (std::getline(stream, line))
{
std::size_t prev = 0, pos;
while ((pos = line.find_first_of(delimiters, prev)) != std::wstring::npos)
{
if (pos > prev)
{
wstring toPush = line.substr(prev, pos-prev);
//wstring toPushW = toWide(toPush);
wordVector.push_back(toPush);
}
prev = pos + 1;
}
if (prev < line.length())
{
wstring toPush = line.substr(prev, std::wstring::npos);
//wstring toPushW = toWide(toPush);
wordVector.push_back(toPush);
}
}
for (int i = 0; i< wordVector.size(); i++)
{
//cout << wordVector[i] << endl;
wprintf(L"Event message string: %s\n", wordVector[i]);
}
return wordVector;
}
int main()
{
wstring original = L"This:is\nmy:tst?why I hate";
wstring separators = L":? \n";
vector<wstring> results = splitManyW(original, separators);
getchar();
}
You're not properly accessing the wchar_t* exposed by std::wstring when you print your final tokens. Further, your output format specifier is incorrect. Per the wprintf documentation: "If the l specifier is used, the argument must be a pointer to the initial element of an array of wchar_t."
A few modifications and stripping out some redundancies gives the following:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using std::wstring;
using std::vector;
std::vector<wstring> splitManyW(const wstring &original, const wstring &delimiters)
{
std::wstringstream stream(original);
std::wstring line;
vector <wstring> wordVector;
while (std::getline(stream, line))
{
std::size_t prev = 0, pos;
while ((pos = line.find_first_of(delimiters, prev)) != std::wstring::npos)
{
if (pos > prev)
wordVector.emplace_back(line.substr(prev, pos-prev));
prev = pos + 1;
}
if (prev < line.length())
wordVector.emplace_back(line.substr(prev, std::wstring::npos));
}
return wordVector;
}
int main()
{
wstring original = L"This:is\nmy:tst?why I hate";
wstring separators = L":? \n";
vector<wstring> results = splitManyW(original, separators);
for (auto const& w : results)
wprintf(L"Event message string: %ls\n", w.c_str());
getchar();
}
Output
Event message string: This
Event message string: is
Event message string: my
Event message string: tst
Event message string: why
Event message string: I
Event message string: hate
Note: I would have preferred using formatted stream output using operator <<, but that is somewhat unrelated to your question.
Best of luck.
You should have executed it under a debugger. You would have seen immediately that your parsing is correct, so is your vector.
The problem is that you are using the old C function wprintf with the format %s, which expects a C string (a null-terminated character array), and you pass it a std::wstring, which is a totally different object.
You can either:
do it the C way, getting the C string contained in the std::wstring (with %ls, which takes a wchar_t*):
wprintf(L"Event message string: %ls\n", wordVector[i].c_str());
do it the C++ way, using wcout:
wcout << L"Event message string: " << wordVector[i] << std::endl;
But your return value is not an array of LPWSTR but a vector of std::wstring.
For that, you would first allocate the array of pointers, then individually allocate each wchar_t array, and return that... and do not forget to free everything.
LPWSTR is wchar_t*, so basically what you need is wcstok.
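A rough sketch of that idea, assuming the standard three-argument std::wcstok from <cwchar> (some older Windows toolchains shipped a two-argument form instead). splitWithWcstok is a made-up name, and the tokens are copied into std::wstring objects because wcstok only hands back pointers into the buffer it is chopping up:

```cpp
#include <cwchar>
#include <string>
#include <vector>

// Sketch: tokenize with the standard three-argument std::wcstok.
// wcstok overwrites delimiters with L'\0', so it works on a private copy.
std::vector<std::wstring> splitWithWcstok(const std::wstring &original,
                                          const std::wstring &delimiters) {
    std::vector<wchar_t> buffer(original.begin(), original.end());
    buffer.push_back(L'\0');
    std::vector<std::wstring> tokens;
    wchar_t *state = nullptr;
    for (wchar_t *tok = std::wcstok(buffer.data(), delimiters.c_str(), &state);
         tok != nullptr;
         tok = std::wcstok(nullptr, delimiters.c_str(), &state)) {
        tokens.emplace_back(tok); // copy the token out of the buffer
    }
    return tokens;
}
```

If a real array of LPWSTR is required, each element can point at the c_str() of one of these strings for as long as the vector is kept alive and unmodified.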
What is the simplest way to get the file name from a path?
string filename = "C:\\MyDirectory\\MyFile.bat";
In this example, I should get "MyFile", without the extension.
The task is fairly simple, as the base filename is just the part of the string after the last folder delimiter:
std::string base_filename = path.substr(path.find_last_of("/\\") + 1);
If the extension is to be removed as well the only thing to do is find the last . and take a substr to this point
std::string::size_type const p(base_filename.find_last_of('.'));
std::string file_without_extension = base_filename.substr(0, p);
Perhaps there should be a check to cope with files solely consisting of extensions (ie .bashrc...)
If you split this up into separate functions, you can reuse the individual tasks:
template<class T>
T base_name(T const & path, T const & delims = "/\\")
{
return path.substr(path.find_last_of(delims) + 1);
}
template<class T>
T remove_extension(T const & filename)
{
typename T::size_type const p(filename.find_last_of('.'));
return p > 0 && p != T::npos ? filename.substr(0, p) : filename;
}
The code is templated to be able to use it with different std::basic_string instances (i.e. std::string & std::wstring...)
The downside of templating is that the template parameter has to be specified if a const char * is passed to the functions.
So you could either:
A) Use only std::string instead of templating the code
std::string base_name(std::string const & path)
{
return path.substr(path.find_last_of("/\\") + 1);
}
B) Provide wrapping function using std::string (as intermediates which will likely be inlined / optimized away)
inline std::string string_base_name(std::string const & path)
{
return base_name(path);
}
C) Specify the template parameter when calling with const char *.
std::string base = base_name<std::string>("some/path/file.ext");
Result
std::string filepath = "C:\\MyDirectory\\MyFile.bat";
std::cout << remove_extension(base_name(filepath)) << std::endl;
Prints
MyFile
A possible solution:
string filename = "C:\\MyDirectory\\MyFile.bat";
// Remove directory if present.
// Do this before extension removal in case the directory has a period character.
const size_t last_slash_idx = filename.find_last_of("\\/");
if (std::string::npos != last_slash_idx)
{
filename.erase(0, last_slash_idx + 1);
}
// Remove extension if present.
const size_t period_idx = filename.rfind('.');
if (std::string::npos != period_idx)
{
filename.erase(period_idx);
}
The Simplest way in C++17 is:
use the #include <filesystem> and filename() for filename with extension and stem() without extension.
#include <iostream>
#include <string>
#include <filesystem>
namespace fs = std::filesystem;
int main()
{
std::string filename = "C:\\MyDirectory\\MyFile.bat";
std::cout << fs::path(filename).filename() << '\n'
<< fs::path(filename).stem() << '\n'
<< fs::path("/foo/bar.txt").filename() << '\n'
<< fs::path("/foo/bar.txt").stem() << '\n'
<< fs::path("/foo/.bar").filename() << '\n'
<< fs::path("/foo/bar/").filename() << '\n'
<< fs::path("/foo/.").filename() << '\n'
<< fs::path("/foo/..").filename() << '\n'
<< fs::path(".").filename() << '\n'
<< fs::path("..").filename() << '\n'
<< fs::path("/").filename() << '\n';
}
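Which can be compiled with g++ -std=c++17 main.cpp -lstdc++fs (GCC 9 and later no longer need -lstdc++fs). On Windows, where the backslash is a directory separator, it outputs:
"MyFile.bat"
"MyFile"
"bar.txt"
"bar"
".bar"
""
"."
".."
"."
".."
""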
Reference: cppreference
The simplest solution is to use something like boost::filesystem. If
for some reason this isn't an option...
Doing this correctly will require some system dependent code: under
Windows, either '\\' or '/' can be a path separator; under Unix,
only '/' works, and under other systems, who knows. The obvious
solution would be something like:
std::string
basename( std::string const& pathname )
{
return std::string(
std::find_if( pathname.rbegin(), pathname.rend(),
MatchPathSeparator() ).base(),
pathname.end() );
}
, with MatchPathSeparator being defined in a system dependent header
as either:
struct MatchPathSeparator
{
bool operator()( char ch ) const
{
return ch == '/';
}
};
for Unix, or:
struct MatchPathSeparator
{
bool operator()( char ch ) const
{
return ch == '\\' || ch == '/';
}
};
for Windows (or something still different for some other unknown
system).
EDIT: I missed the fact that he also wanted to suppress the extension.
For that, more of the same:
std::string
removeExtension( std::string const& filename )
{
std::string::const_reverse_iterator
pivot
= std::find( filename.rbegin(), filename.rend(), '.' );
return pivot == filename.rend()
? filename
: std::string( filename.begin(), pivot.base() - 1 );
}
The code is a little bit more complex, because in this case, the base of
the reverse iterator is on the wrong side of where we want to cut.
(Remember that the base of a reverse iterator is one behind the
character the iterator points to.) And even this is a little dubious: I
don't like the fact that it can return an empty string, for example.
(If the only '.' is the first character of the filename, I'd argue
that you should return the full filename. This would require a little
bit of extra code to catch the special case.)
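That special case could be handled along these lines. This is only a sketch, written with rfind instead of reverse iterators for brevity; removeExtensionKeepDotfiles is a hypothetical name, and the function expects a bare filename with the directory part already stripped:

```cpp
#include <string>

// Hypothetical variant of removeExtension: a '.' at position 0 is treated as
// part of the name (as in ".bashrc"), so such a filename is returned whole.
std::string removeExtensionKeepDotfiles(const std::string &filename) {
    const std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos || dot == 0) return filename;
    return filename.substr(0, dot);
}
```

With this check the function can no longer return an empty string for a non-empty filename.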
_splitpath should do what you need. You could of course do it manually but _splitpath handles all special cases as well.
EDIT:
As BillHoag mentioned it is recommended to use the more safe version of _splitpath called _splitpath_s when available.
Or if you want something portable you could just do something like this
std::vector<std::string> splitpath(
const std::string& str
, const std::set<char> delimiters)
{
std::vector<std::string> result;
char const* pch = str.c_str();
char const* start = pch;
for(; *pch; ++pch)
{
if (delimiters.find(*pch) != delimiters.end())
{
if (start != pch)
{
std::string str(start, pch);
result.push_back(str);
}
else
{
result.push_back("");
}
start = pch + 1;
}
}
result.push_back(start);
return result;
}
...
std::set<char> delims{'\\'};
std::vector<std::string> path = splitpath("C:\\MyDirectory\\MyFile.bat", delims);
cout << path.back() << endl;
If you can use boost,
#include <boost/filesystem.hpp>
boost::filesystem::path p("C:\\MyDirectory\\MyFile.bat");
string basename = p.filename().string();
//or
//string basename = boost::filesystem::path("C:\\MyDirectory\\MyFile.bat").filename().string();
This is all.
I recommend you to use boost library. Boost gives you a lot of conveniences when you work with C++. It supports almost all platforms.
If you use Ubuntu, you can install boost library by only one line sudo apt-get install libboost-all-dev (ref. How to install Boost on Ubuntu)
You can also use the shell Path APIs PathFindFileName, PathRemoveExtension. Probably worse than _splitpath for this particular problem, but those APIs are very useful for all kinds of path parsing jobs and they take UNC paths, forward slashes and other weird stuff into account.
wstring filename = L"C:\\MyDirectory\\MyFile.bat";
wchar_t* filepart = PathFindFileName(filename.c_str());
PathRemoveExtension(filepart);
http://msdn.microsoft.com/en-us/library/windows/desktop/bb773589(v=vs.85).aspx
The drawback is that you have to link to shlwapi.lib, but I'm not really sure why that's a drawback.
Function:
#include <string>
std::string
basename(const std::string &filename)
{
if (filename.empty()) {
return {};
}
auto len = filename.length();
auto index = filename.find_last_of("/\\");
if (index == std::string::npos) {
return filename;
}
if (index + 1 >= len) {
len--;
index = filename.substr(0, len).find_last_of("/\\");
if (len == 0) {
return filename;
}
if (index == 0) {
return filename.substr(1, len - 1);
}
if (index == std::string::npos) {
return filename.substr(0, len);
}
return filename.substr(index + 1, len - index - 1);
}
return filename.substr(index + 1, len - index);
}
Tests:
#define CATCH_CONFIG_MAIN
#include <catch/catch.hpp>
TEST_CASE("basename")
{
CHECK(basename("") == "");
CHECK(basename("no_path") == "no_path");
CHECK(basename("with.ext") == "with.ext");
CHECK(basename("/no_filename/") == "no_filename");
CHECK(basename("no_filename/") == "no_filename");
CHECK(basename("/no/filename/") == "filename");
CHECK(basename("/absolute/file.ext") == "file.ext");
CHECK(basename("../relative/file.ext") == "file.ext");
CHECK(basename("/") == "/");
CHECK(basename("c:\\windows\\path.ext") == "path.ext");
CHECK(basename("c:\\windows\\no_filename\\") == "no_filename");
}
From C++ Docs - string::find_last_of
#include <iostream> // std::cout
#include <string> // std::string
void SplitFilename (const std::string& str) {
std::cout << "Splitting: " << str << '\n';
std::size_t found = str.find_last_of("/\\");
std::cout << " path: " << str.substr(0,found) << '\n';
std::cout << " file: " << str.substr(found+1) << '\n';
}
int main () {
std::string str1 ("/usr/bin/man");
std::string str2 ("c:\\windows\\winhelp.exe");
SplitFilename (str1);
SplitFilename (str2);
return 0;
}
Outputs:
Splitting: /usr/bin/man
path: /usr/bin
file: man
Splitting: c:\windows\winhelp.exe
path: c:\windows
file: winhelp.exe
C++11 variant (inspired by James Kanze's version) with uniform initialization and anonymous inline lambda.
std::string basename(const std::string& pathname)
{
return {std::find_if(pathname.rbegin(), pathname.rend(),
[](char c) { return c == '/'; }).base(),
pathname.end()};
}
It does not remove the file extension though.
The boost filesystem library is also available as the experimental/filesystem library and was merged into ISO C++ for C++17. You can use it like this:
#include <iostream>
#include <experimental/filesystem>
namespace fs = std::experimental::filesystem;
int main () {
std::cout << fs::path("/foo/bar.txt").filename() << '\n';
}
Output:
"bar.txt"
It also works for std::string objects.
this is the only thing that actually finally worked for me:
#include "Shlwapi.h"
CString some_string = "c:\\path\\hello.txt";
LPCSTR file_path = some_string.GetString();
LPCSTR filepart_c = PathFindFileName(file_path);
LPSTR filepart = LPSTR(filepart_c);
PathRemoveExtension(filepart);
pretty much what Skrymsli suggested but doesn't work with wchar_t*,
VS Enterprise 2015
_splitpath worked as well, but I don't like having to guess at how many char[?] characters I'm going to need; some people probably need this control, i guess.
CString c_model_name = "c:\\path\\hello.txt";
char drive[200];
char dir[200];
char name[200];
char ext[200];
_splitpath(c_model_name, drive, dir, name, ext);
I don't believe any includes were needed for _splitpath. No external libraries (like boost) were needed for either of these solutions.
std::string getfilename(std::string path)
{
path = path.substr(path.find_last_of("/\\") + 1);
size_t dot_i = path.find_last_of('.');
return path.substr(0, dot_i);
}
I would do it by...
Search backwards from the end of the string until you find the first backslash/forward slash.
Then search backwards again from the end of the string until you find the first dot (.)
You then have the start and end of the file name.
Simples...
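Those steps could look roughly like this. It is a sketch, not a complete path parser: fileTitle is a made-up name, both '/' and '\\' are assumed to be separators, and a dot is ignored when it belongs to a directory name or is the first character of the file name:

```cpp
#include <string>

// Sketch of the steps above: find the last separator, then the last dot,
// and return what lies between them.
std::string fileTitle(const std::string &path) {
    const std::string::size_type sep = path.find_last_of("/\\");
    const std::string::size_type start = (sep == std::string::npos) ? 0 : sep + 1;
    const std::string::size_type dot = path.rfind('.');
    // No usable dot: none at all, one inside a directory name, or one that
    // starts the file name (".bashrc"). Return the whole file name.
    if (dot == std::string::npos || dot <= start) return path.substr(start);
    return path.substr(start, dot - start);
}
```

For the path from the question, fileTitle("C:\\MyDirectory\\MyFile.bat") gives MyFile.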
You can use the std::filesystem to do it quite nicely:
#include <filesystem>
namespace fs = std::filesystem;
fs::path myFilePath("C:\\MyDirectory\\MyFile.bat");
fs::path filename = myFilePath.stem();
m_szFilePath.MakeLower();
CFileFind finder;
DWORD buffSize = MAX_PATH;
char longPath[MAX_PATH];
DWORD result = GetLongPathName(m_szFilePath, longPath, MAX_PATH );
if( result == 0)
{
m_bExists = FALSE;
return;
}
m_szFilePath = CString(longPath);
m_szFilePath.Replace("/","\\");
m_szFilePath.Trim();
//check if it does not ends in \ => remove it
int length = m_szFilePath.GetLength();
if( length > 0 && m_szFilePath[length - 1] == '\\' )
{
m_szFilePath.Truncate( length - 1 );
}
BOOL bWorking = finder.FindFile(this->m_szFilePath);
if(bWorking){
bWorking = finder.FindNextFile();
finder.GetCreationTime(this->m_CreationTime);
m_szFilePath = finder.GetFilePath();
m_szFileName = finder.GetFileName();
this->m_szFileExtension = this->GetExtension( m_szFileName );
m_szFileTitle = finder.GetFileTitle();
m_szFileURL = finder.GetFileURL();
finder.GetLastAccessTime(this->m_LastAccesTime);
finder.GetLastWriteTime(this->m_LastWriteTime);
m_ulFileSize = static_cast<unsigned long>(finder.GetLength());
m_szRootDirectory = finder.GetRoot();
m_bIsArchive = finder.IsArchived();
m_bIsCompressed = finder.IsCompressed();
m_bIsDirectory = finder.IsDirectory();
m_bIsHidden = finder.IsHidden();
m_bIsNormal = finder.IsNormal();
m_bIsReadOnly = finder.IsReadOnly();
m_bIsSystem = finder.IsSystem();
m_bIsTemporary = finder.IsTemporary();
m_bExists = TRUE;
finder.Close();
}else{
m_bExists = FALSE;
}
The variable m_szFileName contains the fileName.
Don't use _splitpath() and _wsplitpath(). They are not safe, and they are obsolete!
Instead, use their safe versions, namely _splitpath_s() and _wsplitpath_s()
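This should work too:
// strPath = "C:\\Dir\\File.bat" for example
std::string getFileName(const std::string& strPath)
{
// Start just after the last separator, or at 0 if there is none.
size_t iLastSeparator = strPath.find_last_of("\\");
size_t iStart = (iLastSeparator != std::string::npos) ? iLastSeparator + 1 : 0;
// substr takes a length, not an end position, so measure up to the last dot.
size_t iLastDot = strPath.find_last_of(".");
size_t iLength = (iLastDot != std::string::npos && iLastDot >= iStart) ? iLastDot - iStart : std::string::npos;
return strPath.substr(iStart, iLength);
}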
If you can use it, Qt provides QString (with split, trim, etc.), QFile, QDir, QFileInfo, etc. to manipulate files, filenames and directories. And of course it's also cross-platform.
A really simple and short function I made that returns the filename (with its extension) from a path and uses no dependencies:
const char* GetFileNameFromPath(const char* _buffer)
{
char c;
int i;
for (i = 0; ;++i) {
c = *((char*)_buffer+i);
if (c == '\\' || c == '/')
return GetFileNameFromPath((char*)_buffer + i + 1);
if (c == '\0')
return _buffer;
}
return "";
}
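Note that this returns a pointer into the original buffer, so the extension is still part of the result; changing c == '\0' to c == '.' does not strip it. To get the filename without the extension, copy the characters up to the last '.' into a separate buffer.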
For a long time I was looking for a function able to properly decompose a file path. For me this code works perfectly on both Linux and Windows.
void decomposePath(const char *filePath, char *fileDir, char *fileName, char *fileExt)
{
#if defined _WIN32
const char *lastSeparator = strrchr(filePath, '\\');
#else
const char *lastSeparator = strrchr(filePath, '/');
#endif
const char *lastDot = strrchr(filePath, '.');
const char *endOfPath = filePath + strlen(filePath);
const char *startOfName = lastSeparator ? lastSeparator + 1 : filePath;
const char *startOfExt = lastDot > startOfName ? lastDot : endOfPath;
// "%.*s" expects an int precision, so cast the pointer differences.
if(fileDir)
_snprintf(fileDir, MAX_PATH, "%.*s", (int)(startOfName - filePath), filePath);
if(fileName)
_snprintf(fileName, MAX_PATH, "%.*s", (int)(startOfExt - startOfName), startOfName);
if(fileExt)
_snprintf(fileExt, MAX_PATH, "%s", startOfExt);
}
Example results are:
[]
fileDir: ''
fileName: ''
fileExt: ''
[.htaccess]
fileDir: ''
fileName: '.htaccess'
fileExt: ''
[a.exe]
fileDir: ''
fileName: 'a'
fileExt: '.exe'
[a\b.c]
fileDir: 'a\'
fileName: 'b'
fileExt: '.c'
[git-archive]
fileDir: ''
fileName: 'git-archive'
fileExt: ''
[git-archive.exe]
fileDir: ''
fileName: 'git-archive'
fileExt: '.exe'
[D:\Git\mingw64\libexec\git-core\.htaccess]
fileDir: 'D:\Git\mingw64\libexec\git-core\'
fileName: '.htaccess'
fileExt: ''
[D:\Git\mingw64\libexec\git-core\a.exe]
fileDir: 'D:\Git\mingw64\libexec\git-core\'
fileName: 'a'
fileExt: '.exe'
[D:\Git\mingw64\libexec\git-core\git-archive.exe]
fileDir: 'D:\Git\mingw64\libexec\git-core\'
fileName: 'git-archive'
fileExt: '.exe'
[D:\Git\mingw64\libexec\git.core\git-archive.exe]
fileDir: 'D:\Git\mingw64\libexec\git.core\'
fileName: 'git-archive'
fileExt: '.exe'
[D:\Git\mingw64\libexec\git-core\git-archiveexe]
fileDir: 'D:\Git\mingw64\libexec\git-core\'
fileName: 'git-archiveexe'
fileExt: ''
[D:\Git\mingw64\libexec\git.core\git-archiveexe]
fileDir: 'D:\Git\mingw64\libexec\git.core\'
fileName: 'git-archiveexe'
fileExt: ''
I hope this helps you also :)
shlwapi.lib/dll uses the HKCU registry hive internally.
It's best not to link to shlwapi.lib if you're creating a library or the product does not have a UI. If you're writing a lib then your code can be used in any project including those that don't have UIs.
If you're writing code that runs when a user is not logged in (e.g. service [or other] set to start at boot or startup) then there's no HKCU. Lastly, shlwapi are settlement functions; and as a result high on the list to deprecate in later versions of Windows.
A slow but straightforward regex solution:
std::string file = std::regex_replace(path, std::regex("(.*\\/)|(\\..*)"), "");
I implemented a function that might meet your needs.
It is based on string_view's constexpr member function find_last_of (since C++17), which can be evaluated at compile time.
constexpr const char* base_filename(const char* p) {
const size_t i = std::string_view(p).find_last_of('/');
return std::string_view::npos == i ? p : p + i + 1 ;
}
//in the file you used this function
base_filename(__FILE__);
Here is the simplest version:
#include <iostream>
#include <string>
int main()
{
std::string filepath = "directory/file-name.txt";
std::string filename = filepath.substr(filepath.find_last_of("/")+1, filepath.find_last_of(".") - filepath.find_last_of("/") - 1);
std::cout << filename << std::endl;
}
Returns:
file-name
I have a file which is similar to /etc/passwd (colon-separated values), and I need to extract all three values per line into variables, then compare them to what has been given to the program. Here is my code:
typedef struct _UserModel UserModel;
struct _UserModel {
char username[50];
char email[55];
char pincode[30];
};
void get_user(char *username) {
ifstream io("test.txt");
string line;
while (io.good() && !io.eof()) {
getline(io, line);
if (line.length() > 0 && line.substr(0,line.find(":")).compare(username)==0) {
cout << "found user!\n";
UserModel tmp;
sscanf(line.c_str(), "%s:%s:%s", tmp.username, tmp.pincode, tmp.email);
assert(0==strcmp(tmp.username, username));
}
}
}
I can't strcmp the values, as the trailing '\0' means the strings are different, so the assertion fails. I only really want to hold the memory for the values anyway, and not use up memory that I don't need for them. What do I need to change to get this to work?
sscanf is so C'ish.
struct UserModel {
string username;
string email;
string pincode;
};
void get_user(char *username) {
ifstream io("test.txt");
string line;
while (getline(io, line)) {
UserModel tmp;
istringstream str(line);
if (getline(str, tmp.username, ':') && getline(str, tmp.pincode, ':') && getline(str, tmp.email)) {
if (username == tmp.username)
cout << "found user!\n";
}
}
}
If you are using C++, I would try to use std::string, iostreams and all those things that come with C++, but then again...
I understand that your problem is that one of the C strings is null-terminated while the other is not, so strcmp steps to the '\0' on one string while the other has a different value at that position. If that is the only thing you want to change, use strncmp with the length of the string that is known.
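A minimal sketch of that length-limited comparison, assuming the line has the form username:... and that the user name is the first field. The helper name first_field_matches is made up for this example.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: true when the first colon-delimited field of `line`
// is exactly `username`.
bool first_field_matches(const std::string& line, const char* username)
{
    // The known length is the distance to the first ':'.
    const std::size_t field_len = line.find(':');
    if (field_len == std::string::npos) {
        return false;  // no separator, so not a well-formed record
    }
    // Check the lengths first so that "ali" does not match "alice",
    // then compare only the first field_len characters.
    return std::strlen(username) == field_len &&
           std::strncmp(line.c_str(), username, field_len) == 0;
}
```

The length check matters: strncmp on its own only looks at the first field_len characters, so without it a prefix or a longer name could be accepted by mistake.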
Here's a complete example that does what I think you asked about.
Things you didn't ask for but it does anyway:
It uses exceptions to report data file format errors so that GetUserModelForUser() can simply return an object (instead of a boolean or something like that).
It uses a template function for splitting the line into fields. This is really the heart of the original question, so it's a bit unfortunate that this is arguably over-complex. But the idea of making it a template function is that it separates the concern of splitting a string into fields from the choice of a data structure to represent the result.
/* Parses a file of user data.
* The data file is of this format:
* username:email-address:pincode
*
* The pincode field is actually one-way-encrypted with a secret salt
* in order to avoid catastrophic loss of customer data when the file
* or a backup tape is lost/leaked/compromised. However, this code
* simply treats it as an opaque value.
*
* Internationalisation: this code assumes that the data file is
* encoded in the execution character set, whatever that is. This
* means that updates to the file must first transcode the
* username/mail-address/pincode data into the execution character
* set.
*/
#include <string>
#include <vector>
#include <fstream>
#include <iostream>
#include <iterator>
#include <exception>
const char* MODEL_DATA_FILE_NAME = "test.txt";
// This stuff should really go in a header file.
class UserUnknown : public std::exception { };
class ModelDataIsMissing : public std::exception { };
class InvalidModelData : public std::exception { }; // base: don't throw this directly.
class ModelDataBlankLine : public InvalidModelData { };
class ModelDataEmptyUsername : public InvalidModelData { };
class ModelDataWrongNumberOfFields : public InvalidModelData { };
class UserModel {
std::string username_;
std::string email_address_;
std::string pincode_;
public:
UserModel(std::string username, std::string email_address, std::string pincode)
: username_(username), email_address_(email_address), pincode_(pincode) {
}
UserModel(const UserModel& other)
: username_(other.username_),
email_address_(other.email_address_),
pincode_(other.pincode_) {
}
std::string GetUsername() const { return username_; }
std::string GetEmailAddress() const { return email_address_; }
std::string GetPincode() const { return pincode_; }
};
UserModel GetUserModelForUser(const std::string& username)
throw (InvalidModelData, UserUnknown, ModelDataIsMissing);
// This stuff is the implementation.
namespace { // use empty namespace for modularity.
template <typename ForwardIterator> void SplitStringOnSeparator(
std::string input, char separator, ForwardIterator output)
{
// field_start always points at the first character of the current field.
std::string::const_iterator field_start = input.begin();
for (std::string::const_iterator pos = input.begin(); pos != input.end(); ++pos) {
if (*pos == separator) {
*output++ = std::string(field_start, pos);
field_start = pos + 1;
}
}
// Emit the final field; a blank line yields no fields at all.
if (!input.empty()) {
*output++ = std::string(field_start, input.end());
}
}
}
// Returns a UserModel instance for the specified user.
//
// Don't call this more than once per program invocation, because
// you'll end up with quadratic performance. Instead modify this code
// to return a map from username to model data.
UserModel GetUserModelForUser(const std::string& username)
throw (InvalidModelData, UserUnknown, ModelDataIsMissing)
{
std::string line;
std::ifstream in(MODEL_DATA_FILE_NAME);
if (!in) {
throw ModelDataIsMissing();
}
while (std::getline(in, line)) {
std::vector<std::string> fields;
SplitStringOnSeparator(line, ':', std::back_inserter(fields));
if (fields.size() == 0) {
throw ModelDataBlankLine();
} else if (fields.size() != 3) {
throw ModelDataWrongNumberOfFields();
} else if (fields[0].empty()) {
throw ModelDataEmptyUsername();
} else if (fields[0] == username) {
return UserModel(fields[0], fields[1], fields[2]);
}
// We don't diagnose duplicate usernames in the file.
}
throw UserUnknown();
}
namespace {
bool Example (const char *arg)
{
const std::string username(arg);
try
{
UserModel mod(GetUserModelForUser(username));
std::cout << "Model data for " << username << ": "
<< "username=" << mod.GetUsername()
<< ", email address=" << mod.GetEmailAddress()
<< ", encrypted pin code=" << mod.GetPincode()
<< std::endl;
return true;
}
catch (UserUnknown) {
std::cerr << "Unknown user " << username << std::endl;
return false;
}
}
}
int main (int argc, char *argv[])
{
int i, returnval=0;
for (i = 1; i < argc; ++i)
{
try
{
if (!Example(argv[i])) {
returnval = 1;
}
}
catch (InvalidModelData) {
std::cerr << "Data file " << MODEL_DATA_FILE_NAME << " is invalid." << std::endl;
return 1;
}
catch (ModelDataIsMissing) {
std::cerr << "Data file " << MODEL_DATA_FILE_NAME << " is missing." << std::endl;
return 1;
}
}
return returnval;
}
/* Local Variables: */
/* c-file-style: "stroustrup" */
/* End: */
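I don't see a problem with strcmp, but you have one in your sscanf format. %s reads up to the first whitespace character, so it will read straight past the ':'. You probably want "%49[^:]:%54[^:]:%29s" as the format string. I've added field widths in order to prevent buffer overflow; each width is one less than the size of its buffer, because sscanf also writes the terminating '\0', and the widths must be listed in the same order as the buffers you pass.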
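Here is a small sketch of the scanset approach, assuming the record layout is username:email:pincode and the buffer sizes from the question (50, 55 and 30). The names UserFields, parse_user_line and is_user are hypothetical and exist only for this example.

```cpp
#include <cstdio>
#include <cstring>

// Same buffer sizes as the struct in the question.
struct UserFields {
    char username[50];
    char email[55];
    char pincode[30];
};

// Parses "username:email:pincode". Returns true only when all three fields
// were read. %N[^:] reads up to N characters that are not ':', and each
// width is the buffer size minus one to leave room for the '\0'.
bool parse_user_line(const char* line, UserFields* out)
{
    return std::sscanf(line, "%49[^:]:%54[^:]:%29s",
                       out->username, out->email, out->pincode) == 3;
}

// The parsed fields are null-terminated, so strcmp compares them correctly.
bool is_user(const UserFields& fields, const char* username)
{
    return std::strcmp(fields.username, username) == 0;
}
```

One limitation of this format: a scanset must match at least one character, so a record with an empty field (for example "alice::1234") makes sscanf stop early and parse_user_line returns false.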