parsing user string, escape characters - c++

How can I parse a string and replace all occurrences of \. with something, while at the same time replacing every \\ with a literal \? Examples:
hello \. world => hello "." world
hello \\. world=> hello \. world
hello \\\. world => hello \"." world
The first reaction was to use std::replace_if, as in the following:
bool escape(false);
std::replace_if(str.begin(), str.end(), [&] (char c) {
if (c == '\\') {
escape = !escape;
} else if (escape && c == '.') {
return true;
}
return false;
},"\".\"");
However, that only turns \. into \"." (the backslash is left in place). It also does not handle the \\ parts of the string.
Is there an elegant approach to this? Before I start doing a hack job with a for loop & rebuilding the string?

Elegant approach: a finite state machine with three states:
looking for '\' (iterating through string)
found '\' and next character is '.'
found '\' and next character is '\'
To implement it, you could use the iterators of the standard string class and its replace method.
http://www.cplusplus.com/reference/string/string/replace/
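The state machine described above can be sketched roughly like this. It is only a minimal sketch: the function name unescape_dots is made up for illustration, the two "found '\'" states are collapsed into one AfterBackslash state that branches on the next character, and it builds a new string instead of calling replace in place. Keeping unknown escapes and a trailing backslash verbatim is an assumption, since the question does not say what should happen to them.

```cpp
#include <string>

// Hypothetical helper (not from the question): single pass over the input.
//   \.  -> replacement
//   \\  -> one literal backslash
std::string unescape_dots(const std::string& in, const std::string& replacement) {
    enum class State { Normal, AfterBackslash };
    State state = State::Normal;
    std::string out;
    out.reserve(in.size());

    for (char c : in) {
        if (state == State::AfterBackslash) {
            if (c == '.') {
                out += replacement;   // escaped dot
            } else if (c == '\\') {
                out += '\\';          // escaped backslash
            } else {
                out += '\\';          // assumption: unknown escapes are kept as-is
                out += c;
            }
            state = State::Normal;
        } else if (c == '\\') {
            state = State::AfterBackslash;
        } else {
            out += c;
        }
    }
    if (state == State::AfterBackslash) {
        out += '\\';                  // assumption: a trailing lone backslash is kept
    }
    return out;
}
```

Called as unescape_dots(str, "\".\""), this should reproduce the three examples from the question.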

Related

How can I remove all trailing backslashes from a string in Scala?

I want to remove all trailing backslashes ('\') from a string.
For example:
"ab" -> "ab"
"ab\\\\" -> "ab"
"\\\\ab\\" -> "\\\\ab"
"\\" -> ""
I am able to do this using below code but unable to handle the scenario where the String has only slash(es). Please let me know if this can be achieved through a different regex.
val str = """\\\\q\\"""
val regex = """^(.*[^\\])(\\+)$""".r
str match {
case regex(rest, slashes) => str.stripSuffix(slashes)
case _ => str
}
Converting my comment as an answer. This should work for removing all trailing backslashes:
str = str.replaceFirst("\\\\+$", "");
\\\\+ matches 1+ backslashes (a single regex backslash is written as \\\\ in a regular Java/Scala string literal).
While not a regex, I suggest a simpler solution : str.reverse.dropWhile(_ == '\\').reverse
Not using a regex, but you could use String.lastIndexWhere(p: (Char) ⇒ Boolean) to get the position of the last character which is not a '\' in order to substring until this character:
str.substring(0, str.lastIndexWhere(_ != '\\') + 1)
If, for some reason, you're committed to a regex solution, it can be done.
val regex = """[^\\]?(\\*)$""".r.unanchored
str match {
case regex(slashes) => str.stripSuffix(slashes)
}
You can do the same with slice function
str.slice(0,str.lastIndexWhere(_ != '\\')+1)

Regex to match tokens in a string using string.gmatch

I need a regex to use in string.gmatch that matches sequences of alphanumeric characters and non alphanumeric characters (quotes, brackets, colons and the like) as separated, single, matches, so basically:
str = [[
function test(arg1, arg2) {
dosomething(0x12f, "String");
}
]]
for token in str:gmatch(regex) do
print(token)
end
Should print:
function
test
(
arg1
,
arg2
)
{
dosomething
(
0x12f
,
"
String
"
)
;
}
How can I achieve this? In standard regex I've found that ([a-zA-Z0-9]+)|([\{\}\(\)\";,]) works for me but I'm not sure on how to translate this to Lua's regex.
local str = [[
function test(arg1, arg2) {
dosomething(0x12f, "String");
}
]]
for p, w in str:gmatch"(%p?)(%w*)" do
if p ~= "" then print(p) end
if w ~= "" then print(w) end
end
You need a workaround involving a temporary char that is not used in your code. E.g., use a § to insert it after the alphanumeric and non-alphanumeric characters:
str = str:gsub("%s*(%w+)%s*", "%1§") -- Trim chunks of 1+ alphanumeric characters and add a temp char after them
str = str:gsub("(%W)%s*", "%1§") -- Right trim the non-alphanumeric char one by one and add the temp char after each
for token in str:gmatch("[^§]+") do -- Match chunks of chars other than the temp char
print(token)
end
Note that %w in Lua is an equivalent of JS [a-zA-Z0-9], as it does not match an underscore, _.

(Vim regex) Following by anything except bracket character

Test string:
best.string_a = true;
best.string_b + bad.string_c;
best.string_d ();
best.string_e );
I want to match the string that comes after '.' and is followed by anything except '('. My expression:
\.\@<=[_a-z]\+\(\s*[^(]\)\@=
I want :
string_a
string_b
string_c
string_e
But it doesn't work and result :
string_a
string_b
string_c
string_d
string_e
I am new to Vim regex and I don't know why :(
Make this \.\@<=\<[_a-z]\+\>\(\s*(\)\@!
This matches:
\.\@<= Assure a dot is in front of the match followed by
\<[_a-z]\+\> A word containing only lowercase or '_' chars
\(\s*(\)\@! not followed by (any amount of spaces in front of a '(')
this would work for your needs too:
\.\zs[_a-z]\+\>\ze\s*[^( ]

Is it possible to return "weird" characters in a char?

I would like to know whether it is possible to return "weird" characters, or rather ones that are important to the language.
For example: \ ; '
I need this because one of my functions checks the Unicode value of the text key and returns the character by its number, and I need these characters too.
I get a 356|error: missing terminating ' character
Line 356 looks as following
return '\';
Ideas?
The backslash is an escape for special characters. If you want a literal backslash you have to escape it with another backslash. Try:
return '\\';
The only problem here is that a backslash is used to escape characters in a literal. For example \n is a new line, \t is a horizontal tab. In your case, the compiler is seeing \' and thinking you mean a ' character (this is so you could have the ' character like so: '\''). You just need to escape your backslash:
return '\\';
Despite this looking like a character literal with two characters in it, it's not. \\ is an escape sequence which represents a single backslash.
Similarly, to return a ', you would do:
return '\'';
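As a rough illustration of the asker's lookup function, here is a minimal sketch. The name char_from_code and the '?' fallback are made up for this example, and the numeric codes assume an ASCII-compatible character set. Only the backslash and the single quote need escaping inside a character literal; the semicolon does not.

```cpp
// Hypothetical lookup (name and fallback are illustrative only).
// Codes assume an ASCII-compatible execution character set.
char char_from_code(int code) {
    switch (code) {
        case 92: return '\\';  // backslash: must be escaped
        case 39: return '\'';  // single quote: must be escaped inside '...'
        case 59: return ';';   // semicolon: no escape needed
        default: return '?';   // assumption: placeholder for unmapped codes
    }
}
```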
The list of available escape sequences is given by Table 7.
You can have a character literal containing any character from the execution character set and the resulting char will have the value of that character. However, if the value does not fit in a char, it will have an implementation-defined value.
Any character can be returned.
Yet for some of them, you have to escape it using backslash: \.
So for returning backslash, you have to return:
return '\\';
To get a plain backslash use '\\'.
In C the following characters are represented using a backslash (escape sequences are case-sensitive, so uppercase forms such as \N or \Xhh are not valid):
\a : A bell
\b : A backspace
\f : A formfeed
\n : A new line
\r : A carriage return
\t : A horizontal tab
\v : A vertical tab
\xhh : A hexadecimal bit pattern
\ooo : An octal bit pattern
\0 : A null character
\" : The " character
\' : The ' character
\? : The ? character
\\ : A backslash (\)
A plain backslash confuses the system because it expects a character to follow it. Thus, you need to "escape" it. The octal/hexadecimal bit patterns may not seem too useful at first, but they let you use ANSI escape codes.
If the character following the backslash does not specify a legal escape sequence, as shown above, the result is implementation defined, but often the character following the backslash is taken literally, as though the escape were not present.
If you have to return such characters (", ', \, {, ] etc.) more than once, you should write a function that escapes those characters. I once wrote such a function in JavaScript:
function EscapeSpecialChars (_data) {
try {
if (!GUI_HELPER.NOU(_data)) {
return _data;
}
if (typeof _data !== "string") {
return _data;
}
while (_data.indexOf("
") > 0) {
_data = _data.replace("
", "");
}
while (_data.indexOf("\n") !== -1) {
_data = _data.replace("\n", "\\n");
}
while (_data.indexOf("\r") !== -1) {
_data = _data.replace("\r", "\\r");
}
while (_data.indexOf("\t") !== -1) {
_data = _data.replace("\t", "\\t");
}
while (_data.indexOf("\b") !== -1) {
_data = _data.replace("\b", "\\b");
}
while (_data.indexOf("\f") !== -1) {
_data = _data.replace("\f", "\\f");
}
return _data;
} catch (err) {
alert(err);
}
},
then use it like this:
return EscapeSpecialChars("'\"{[}]");
You should improve the function. It was working for me, but it is not escaping all special characters.

How do I add a backslash after every character in a string?

I need to transform a literal filepath (C:/example.txt) to one that is compatible with the various WinAPI Registry functions (C://example.txt) and I have no idea on how to go about doing it.
I've broken it down to having to add a backslash after a certain character (/ in this case) but I'm completely stuck after that.
Guidance and Code Examples will be greatly appreciated.
I'm using C++ and VS2012.
In C++, strings are made up of individual characters, like "foo". Strings can be composed of printable characters, such as the letters of the alphabet, or non-printable characters, such as the enter key or other control characters.
You cannot type one of these non-printable characters in the normal way when populating a string. For example, if you want a string that contains "foo" then a tab, and then "bar", you can't create this by typing:
fooTABbar
because this will simply insert that many spaces -- it won't actually insert the TAB character.
You can specify these non-printable characters by escaping them. This is done by inserting a backslash character (\) followed by the character's code. In the case of the string above, TAB is represented by the escape sequence \t, so you would write: "foo\tbar".
The character \ is not itself a non-printable character, but C++ (and C) recognize it to be special -- it always denotes the beginning of an escape sequence. To include the character "\" in a string, it has to itself be escaped, with \\.
So in C++ if you want a string that contains:
c:\windows\foo\bar
You code this using escape sequences:
std::string s = "c:\\windows\\foo\\bar";
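To see that each \\ in the source really is one character in the resulting string, here is a small sketch that compares the escaped literal with a C++11 raw string literal. The function names are made up for illustration.

```cpp
#include <string>

// Hypothetical helpers: both are meant to return the same path,
// c:\windows\foo\bar (18 characters).
std::string escaped_path() { return "c:\\windows\\foo\\bar"; }   // every \\ is one backslash
std::string raw_path()     { return R"(c:\windows\foo\bar)"; }   // raw literal: no escaping
```

The two strings compare equal, and size() reports 18 for both, not 21.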
'\\' is not two chars, it is one char:
for(size_t i = 0, sz = sPath.size() ; i < sz ; i++)
if(sPath[i]=='/') sPath[i] = '\\';
But be aware that some APIs work with \ and some with /, so you need to check in which cases to use this replacement.
If replacing every occurrence of a forward slash with two backslashes is really what you want, then this should do the job:
size_t i = str.find('/');
while (i != string::npos)
{
string part1 = str.substr(0, i);
string part2 = str.substr(i + 1);
str = part1 + R"(\\)" + part2; // Use "\\\\" instead of R"(\\)" if your compiler doesn't support C++11's raw string literals
i = str.find('/', i + 1);
}
EDIT:
P.S. If I misunderstood the question and your intention is actually to replace every occurrence of a forward slash with just one backslash, then there is a simpler and more efficient solution (as @RemyLebeau points out in a comment):
size_t i = str.find('/');
while (i != string::npos)
{
str[i] = '\\';
i = str.find('/', i + 1);
}
Or, even better:
std::replace_if(str.begin(), str.end(), [] (char c) { return (c == '/'); }, '\\');
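Since the predicate above only tests for equality with one character, plain std::replace should do the same job without a lambda. A minimal sketch, where the wrapper name to_backslashes is made up for illustration:

```cpp
#include <algorithm>
#include <string>

// Hypothetical wrapper: returns a copy of `path` with every '/' turned into
// a single backslash.
std::string to_backslashes(std::string path) {
    std::replace(path.begin(), path.end(), '/', '\\');
    return path;
}
```

Note that this only covers the one-for-one replacement. Doubling the backslashes changes the string length, so that case still needs something like the find loop shown earlier.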