Removing certain characters at the beginning of string - c++

Here's the use case.
I have a string containing a relative path to a folder. Its format may vary a little depending on where it came from (I am dealing with files exported from different software).
For example: ./path/to/folder, /path/to/folder, path/to/folder.
What I need to do is delete all '.' and '/' characters from the beginning of the string. Of course I can just do this manually in a for loop, but I thought maybe there's some kind of STL function exactly for such use-cases.

I thought maybe there's some kind of stl function exactly for such use-cases
#include <regex>
const std::string src("./path/to/folder");
// Match one or more leading '.' or '/' characters, so "../" and ".//" are stripped too.
static const std::regex re("^[./]+");
const std::string result = std::regex_replace(src, re, "");
If you need more efficiency than what <regex> provides, do it manually.
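For reference, the manual route can be quite short. The following is only a sketch: strip_leading is a hypothetical helper name, not a standard library function, and it assumes that every leading '.' and '/' should go, as the question asks.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the standard library): returns a copy of
// `path` with every leading character that appears in `chars` removed.
std::string strip_leading(const std::string& path, const char* chars = "./")
{
    assert(chars != nullptr);
    // Index of the first character that is NOT one of `chars`.
    const std::string::size_type first = path.find_first_not_of(chars);
    // npos means the string is empty or consists only of stripped characters.
    return first == std::string::npos ? std::string() : path.substr(first);
}
```

Calling strip_leading("./path/to/folder") should give "path/to/folder". Be aware that a leading dot belonging to a name is removed as well, so ".git/hooks" would become "git/hooks"; whether that is acceptable depends on the data.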

Related

Stripping characters of a wxString

I am working on an application written in C++ that primarily uses wxWidgets for its objects. Now, suppose I have the following wxString variable:
wxString path = "C:\\Program Files\\Some\\Path\\To\\A\\Directory\\";
Are there any ways to remove the trailing slashes? While wxString provides a Trim() method, it only applies to whitespace characters. I could think of converting the string to another string type and perform the stripping there and switch back to the wxString type (it is essential that I use the wxString type) but if there is a less convoluted way of doing things, I'd prefer that.
The others have mentioned how this can be achieved using wxString methods, however I would strongly advise using an appropriate class, i.e. either wxFileName or, maybe, std::filesystem::path, for working with paths instead of raw strings. E.g. to get a canonical representation of the path in your case I would use wxFileName::DirName(path).GetFullPath().
This is what I would use, if I had no proper path-parsing alternative:
wxString& remove_trailing_backslashes(wxString& path)
{
auto inb = path.find_last_not_of(L'\\');
if(inb != wxString::npos)
path.erase(inb + 1); //inb + 1 <= size(), valid for erase()
else //empty string or only backslashes
path.clear();
return path; //to allow chaining
}
Notes:
Unless you're doing something unusual, wxString stores wchar_ts internally, so it makes sense to use wide string and character literals (prefixed with L) to avoid unnecessary conversions.
Even in the unusual case when you'd have strings encoded in UTF-8, the code above still works, as \ is ASCII, so it cannot appear in the encoding of another code point (the L prefix wouldn't apply anymore in this case, of course).
Even if you're forced to use wxString, I suggest you try to use its std::basic_string-like interface whenever possible, instead of the wx-specific functions. The code above works fine if you replace wxString with std::wstring.
In support of what VZ. said in his answer, note that all these simplistic string-based solutions will strip C:\ to C:, and \ to the empty string, which may not be what you want. To avoid such issues, I would go for the Boost.Filesystem library, which is, as far as I know, the closest to the standard library filesystem functionality (std::filesystem, part of the standard since C++17).
For completeness, here's what it would look like using Boost.Filesystem:
wxString remove_trailing_backslashes(const wxString& arg)
{
using boost::filesystem::path;
static const path dotp = L".";
path p = arg.wc_str();
if(p.filename() == dotp)
p.remove_filename();
return p.native();
}
It's not as efficient as the ad-hoc solution above, mainly because the string is not modified in-place, but more resilient to problems caused by special path formats.
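The note above says the in-place version also works when wxString is replaced with std::wstring. A minimal sketch of that variant follows, assuming nothing beyond the standard library; the function name is simply reused from the answer.

```cpp
#include <cassert>
#include <string>

// Same logic as the wxString version, expressed on std::wstring.
std::wstring& remove_trailing_backslashes(std::wstring& path)
{
    // Index of the last character that is not a backslash.
    const std::wstring::size_type last = path.find_last_not_of(L'\\');
    if (last != std::wstring::npos)
        path.erase(last + 1);   // last + 1 <= size(), so erase() is valid
    else
        path.clear();           // empty string or only backslashes
    // Postcondition: nothing left to strip.
    assert(path.empty() || path[path.size() - 1] != L'\\');
    return path;                // returned by reference to allow chaining
}
```

It also makes the caveat easy to see: a std::wstring holding C:\ ends up as C:, and a lone \ ends up empty.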
path.erase(path.end() - 1);
or
path.RemoveLast();
My use case actually also considers scenarios without trailing slashes.
I came up with two solutions. The first makes use of regular expression:
wxRegEx StripRegex("(.+?)\\\\*$", wxRE_ADVANCED);
if (StripRegex.Matches(path))
{
path = StripRegex.GetMatch(path,1);
}
The second, as @catalin suggested, uses RemoveLast:
while (path.EndsWith("\\"))
{
path.RemoveLast();
}
Edit: Using @VZ's suggestion, I came up with the following:
// for some reason, the 'Program Files' part gets taken out of the resulting string
// so I have to first replace the double slashes
path.Replace("\\\\","\\");
path = wxFileName::DirName(path).GetPath();

Replacing instances of a given std::string with another std::string in C++

I have been looking online without success for something that does the following. I have some ugly string returned as part of a Betfair SOAP response that uses a number of different char delimiters to identify certain parts of the information. What makes it awkward is that they are not always just one character length. Specifically, I need to split a string at ':' characters, but only after I have first replaced all instances of "\\:" with my personal flag "-COLON-" (which must then be replaced again AFTER the first split).
Basically I need all portions of a string like this
"6(2.5%,11\:08)~true~5.0~1162835723938~"
to become
"6(2.5%,11-COLON-08)~true~5.0~1162835723938~"
In perl it is (from memory)
$mystring =~ s/\\:/-COLON-/g;
I have been looking for some time at the functions of std::string, specifically std::find and std::replace and I know that I can code up how to do what I need using these basic functions, but I was wondering if there was a function in the standard library (or elsewhere) that already does this??
boost::replace_all(input_string, "\\:", "-COLON-");
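If you have C++11, something like this ought to do the trick:
#include <string>
#include <regex>
int main()
{
std::string str("6(2.5%,11\\:08)~true~5.0~1162835723938~");
// The regex engine must see \\: (an escaped backslash followed by a colon),
// which takes four backslashes in a C++ string literal.
std::regex rx("\\\\:");
std::string fmt("-COLON-");
// regex_replace returns the new string; it does not modify str in place.
str = std::regex_replace(str, rx, fmt);
return 0;
}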
Edit: There is also an optional fourth parameter, a set of match flags from the std::regex_constants namespace. For example, std::regex_constants::format_first_only replaces only the first match of the regular expression with the supplied format.
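If neither Boost nor <regex> is available, the find/replace route mentioned in the question is short enough to write by hand. This is only a sketch; replace_all here is a hypothetical free function, not the Boost one.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replaces every non-overlapping occurrence of `from`
// in `text` with `to`, scanning left to right.
void replace_all(std::string& text, const std::string& from, const std::string& to)
{
    assert(!from.empty());  // an empty pattern would match forever
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.length(), to);
        pos += to.length();  // continue after the inserted text, never rescan it
    }
}
```

With the string from the question, replace_all(s, "\\:", "-COLON-") should turn the escaped colon into -COLON- and leave any plain ':' alone, since the pattern is a literal backslash followed by a colon.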

How to determine a character within a string in c++

I have a variable named "String" that may have values like the following ones:
const char* String = "/v1/AUTH_abb52a71-fc76-489b-b56b-732b66bf50b1/test/DSC_0188.JPG";
or
const char* String = "/auth/v1.0";
or
const char* String = "/v2/AUTH_abb52a71-fc76-489b-b56b-732b66bf50b1/images?limit=1000&delimiter=/&format=xml";
Now I want to check whether or not "String" contains the character 'v1'. The check has to be precise. I tried strchr, but it's not quite accurate as it doesn't take 'v1' as one character; it takes 'v' and '1' as two separate characters. Moreover, I can't use namespace std or the string library, I can only use "string.h". Within these limitations, how can I accurately check whether the variable "String" contains 'v1'?
Thank you.
I want to make sure whether or not "String" has the character 'v1'
I can only use "string.h"
Then you probably want strstr. Also v1 is not a character, it's a string.
Side note: why use cstring in C++ ? What kind of teacher is still calling it string.h ?!
v1 is not a character in any alphabet. It is a proper string, "v1", and as @cnicutar mentioned, the C way to search for a string within a string is strstr. It is easy to use, and library implementations are generally well optimized, although the standard does not mandate a particular search algorithm (for strings as short as yours it hardly matters).
I would advise you to:
always start variable names with a lowercase letter (i.e. String -> my_string)
declare your string as const char[]; there is no need to involve pointers when you can avoid them. Declaring it as a pointer might mislead you into thinking that the memory for the string was dynamically allocated.
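To make the strstr suggestion concrete, here is a small sketch. The wrapper name contains is hypothetical; with "string.h" instead of <cstring> the call is simply strstr without the std:: prefix.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true if `needle` occurs anywhere inside `haystack`.
bool contains(const char* haystack, const char* needle)
{
    assert(haystack != nullptr && needle != nullptr);
    // strstr returns a pointer to the first match, or a null pointer if none.
    return std::strstr(haystack, needle) != nullptr;
}
```

Note that a plain search for "v1" also matches "/auth/v1.0". If the check must only accept v1 as a whole path segment, searching for "/v1/" instead is one way to tighten it, assuming the segment is always followed by a slash.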

boost split with a single character or just one string

I wish to split a string on a single character or a string. I would like to use boost::split since boost string is our standard for basic string handling (I don't wish to mix several techniques).
In the single character case I could do split(vec,str,is_any_of(":")) but I'd like to know if there is a way to specify just a single character. It may improve performance, but more importantly I think the code would be clearer with just a single character, since is_any_of conveys a different meaning than what I want.
For matching against a string I don't know what syntax to use. I don't wish to construct a regex; some simple syntax like split(vec,str,match_str("::")) would be good.
I was looking for the same answer but I couldn't find one. Finally I managed to produce one on my own.
You can use std::equal_to to form the predicate you need. Here's an example:
boost::split(container, str, std::bind1st(std::equal_to<char>(), ','));
This is exactly how I do it when I need to split a string using a single character.
In the following code, let me assume using namespace boost for brevity.
As for splitting on a character, if only algorithm/string is allowed,
is_from_range might serve the purpose:
split(vec,str, is_from_range(':',':'));
Alternatively, if lambda is allowed:
split(vec,str, lambda::_1 == ':');
or if preparing a dedicated predicate is allowed:
struct match_char {
char c;
match_char(char c) : c(c) {}
bool operator()(char x) const { return x == c; }
};
split(vec,str, match_char(':'));
As for matching against a string, as David Rodríguez mentioned,
there seems to be no way to do it with split.
If iter_split is allowed, probably the following code will meet the purpose:
iter_split(vec,str, first_finder("::"));
On the simple token, I would just leave is_any_of as it is quite easy to understand what is_any_of( single_option ) means. If you really feel like changing it, the third element is a functor, so you could pass an equals functor to the split function.
That approach will not really work with multi-character tokens, as the iteration is meant to be character by character. I don't know the library well enough to offer prebuilt alternatives, but you can implement the functionality on top of split_iterators.
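To show what splitting on a whole string rather than on a set of characters means, here is a standard-library-only sketch. It is not Boost code and split_on is a hypothetical name; it is meant only to illustrate the behaviour one would roughly expect from a string delimiter such as "::".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at every occurrence of the whole
// string `delim`. A single ':' is NOT a split point when delim is "::".
std::vector<std::string> split_on(const std::string& text, const std::string& delim)
{
    assert(!delim.empty());
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    std::string::size_type hit;
    while ((hit = text.find(delim, start)) != std::string::npos) {
        parts.push_back(text.substr(start, hit - start));
        start = hit + delim.length();   // resume after the delimiter
    }
    parts.push_back(text.substr(start)); // trailing piece, possibly empty
    return parts;
}
```

For "a::b:c::d" with delimiter "::" this yields three pieces, and the middle one keeps its single colon. A character-set predicate would instead split at every ':'.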

Overloading a method on default arguments

Is it possible to overload a method on default parameters?
For example, if I have a method split() to split a string, but the string has two delimiters, say '_' and "delimit". Can I have two methods something like:
split(const char *str, char delim = ' ')
and
split(const char *str, const char* delim = "delimit");
Or, is there a better way of achieving this? Somehow, my brain isn't working now and am unable to think of any other solution.
Edit: The problem in detail:
I have a string with two delimiters, say for example, nativeProbableCause_Complete|Alarm|Text. I need to separate nativeProbableCause and Complete|Alarm|Text and then further, I need to separate Complete|Alarm|Text into individual words and join them back with space as a separator sometime later (for which I already have written a utility and isn't a big deal). It is only the separation of the delimited string that is troubling me.
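No, you can't, at least not usefully: both declarations compile, but any call that leaves out the delimiter is ambiguous. If you think about it, the notion of a default means 'use this unless I say otherwise'. If the compiler has 2 options for a default, which one will it choose?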
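What does work is overloading on the delimiter type while giving a default to only one of the overloads. The sketch below uses a hypothetical count_delims function purely to illustrate the overload resolution; it is not a split implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Counts occurrences of a single-character delimiter.
// This is the ONLY overload with a default argument.
std::size_t count_delims(const char* str, char delim = ' ')
{
    assert(str != nullptr);
    std::size_t n = 0;
    for (; *str != '\0'; ++str)
        if (*str == delim)
            ++n;
    return n;
}

// Counts non-overlapping occurrences of a string delimiter.
// No default here, so count_delims(str) always picks the overload above.
std::size_t count_delims(const char* str, const char* delim)
{
    assert(str != nullptr && delim != nullptr && *delim != '\0');
    const std::size_t len = std::strlen(delim);
    std::size_t n = 0;
    for (const char* hit = std::strstr(str, delim); hit != nullptr;
         hit = std::strstr(hit + len, delim))
        ++n;
    return n;
}
```

A one-argument call uses the char overload with its default, while passing '_' or "delimit" explicitly selects the overload by the argument's type. Adding a default to the second overload as well would make the one-argument call ambiguous.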
How about implementing as 2 different methods like
split_with_default_delimiter_space
split_with_default_delimiter_delimit
Personally I'd prefer using something like this (more readable.. intent conveying) over the type of overloading that you mentioned... even if it was somehow possible for the compiler to do that.
Why not just call split() twice and explicitly pass the delimiter the second time? Will delimiters always be single characters?
Do you perform any other processing on the 2nd set of words before joining them? If not, then for the second task what you really want to do is replace substrings. This is most easily done with std::string::find and std::string::replace. If you must use c-strings, you could use strstr/strchr/strpbrk, strcpy and strcat, or use just strstr/strchr/strpbrk and join them in place.
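For the concrete string in the question, that idea might look like the sketch below. The function name split_alarm is hypothetical, and it assumes the first '_' separates the two parts. Because both separators are single characters, it uses std::replace from <algorithm> rather than std::string::replace.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: splits `input` at the first '_' and turns the '|'
// separators in the second part into spaces.
std::pair<std::string, std::string> split_alarm(const std::string& input)
{
    const std::string::size_type cut = input.find('_');
    if (cut == std::string::npos)
        return std::make_pair(input, std::string());  // no '_' present
    std::string head = input.substr(0, cut);
    std::string tail = input.substr(cut + 1);
    // Replace each '|' with ' ' in place; no tokenizing and re-joining needed.
    std::replace(tail.begin(), tail.end(), '|', ' ');
    assert(tail.find('|') == std::string::npos);
    return std::make_pair(head, tail);
}
```

For nativeProbableCause_Complete|Alarm|Text this gives nativeProbableCause as the first part and Complete Alarm Text as the second.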
You could use a version of split that accepts a variable number of delimiters (split(const char*, vector<string>) or, if you prefer, split(const char*, const char**)), or just use Boost Tokenizer.