Validator pattern not working Regex [duplicate] - regex

In Javascript, when I put a backslash in some variables like:
var ttt = "aa ///\\\";
var ttt = "aa ///\";
Javascript shows an error.
If I try to restrict user in entering this character, I also get an error:
(("aaa ///\\\").indexOf('"') != -1)
Restricting backslashes from user input is not a good strategy, because you have to show an annoying message to the user.
Why am I getting an error with backslash?

The backslash (\) is an escape character in Javascript (along with a lot of other C-like languages). This means that when Javascript encounters a backslash, it tries to escape the following character. For instance, \n is a newline character (rather than a backslash followed by the letter n).
In order to output a literal backslash, you need to escape it. That means \\ will output a single backslash (and \\\\ will output two, and so on). The reason "aa ///\" doesn't work is because the backslash escapes the " (which will print a literal quote), and thus your string is not properly terminated. Similarly, "aa ///\\\" won't work, because the last backslash again escapes the quote.
Just remember, for each backslash you want to output, you need to give Javascript two.

You may want to try the following, which is more or less the standard way to escape user input:
function stringEscape(s) {
return s ? s.replace(/\\/g,'\\\\').replace(/\n/g,'\\n').replace(/\t/g,'\\t').replace(/\v/g,'\\v').replace(/'/g,"\\'").replace(/"/g,'\\"').replace(/[\x00-\x1F\x80-\x9F]/g,hex) : s;
function hex(c) { var v = '0'+c.charCodeAt(0).toString(16); return '\\x'+v.substr(v.length-2); }
}
This replaces all backslashes with an escaped backslash, and then proceeds to escape other non-printable characters to their escaped form. It also escapes single and double quotes, so you can use the output as a string constructor even in eval (which is a bad idea by itself, considering that you are using user input). But in any case, it should do the job you want.

You have to escape each \ to be \\:
var ttt = "aa ///\\\\\\";
Updated: I think this question is not about the escape character in string at all. The asker doesn't seem to explain the problem correctly.
because you had to show a message to user that user can't give a name which has (\) character.
I think the scenario is like:
var user_input_name = document.getElementById('the_name').value;
Then the asker wants to check if user_input_name contains any [\]. If so, then alert the user.
If user enters [aa ///\] in HTML input box, then if you alert(user_input_name), you will see [aaa ///\]. You don't need to escape, i.e. replace [\] to be [\\] in JavaScript code. When you do escaping, that is because you are trying to make of a string which contain special characters in JavaScript source code. If you don't do it, it won't be parsed correct. Since you already get a string, you don't need to pass it into an escaping function. If you do so, I am guessing you are generating another JavaScript code from a JavaScript code, but it's not the case here.
I am guessing asker wants to simulate the input, so we can understand the problem. Unfortunately, asker doesn't understand JavaScript well. Therefore, a syntax error code being supplied to us:
var ttt = "aa ///\";
Hence, we assume the asker having problem with escaping.
If you want to simulate, you code must be valid at first place.
var ttt = "aa ///\\"; // <- This is correct
// var ttt = "aa ///\"; // <- This is not.
alert(ttt); // You will see [aa ///\] in dialog, which is what you expect, right?
Now, you only need to do is
var user_input_name = document.getElementById('the_name').value;
if (user_input_name.indexOf("\\") >= 0) { // There is a [\] in the string
alert("\\ is not allowed to be used!"); // User reads [\ is not allowed to be used]
do_something_else();
}
Edit: I used [] to quote text to be shown, so it would be less confused than using "".

The backslash \ is reserved for use as an escape character in Javascript.
To use a backslash literally you need to use two backslashes
\\

If you want to use special character in javascript variable value, Escape Character (\) is required.
Backslash in your example is special character, too.
So you should do something like this,
var ttt = "aa ///\\\\\\"; // --> ///\\\
or
var ttt = "aa ///\\"; // --> ///\
But Escape Character not require for user input.
When you press / in prompt box or input field then submit, that means single /.

Related

Powershell find non printing characters [duplicate]

i would appreciate your help on this, since i do not know which range of characters to use, or if there is a character class like [[:cntrl:]] that i have found in ruby?
by means of non printable, i mean delete all characters that are not shown in ie output, when one prints the input string. Please note, i look for a c# regex, i do not have a problem with my code
You may remove all control and other non-printable characters with
s = Regex.Replace(s, #"\p{C}+", string.Empty);
The \p{C} Unicode category class matches all control characters, even those outside the ASCII table because in .NET, Unicode category classes are Unicode-aware by default.
Breaking it down into subcategories
To only match basic control characters you may use \p{Cc}+, see 65 chars in the Other, Control Unicode category. It is equal to a [\u0000-\u0008\u000E-\u001F\u007F-\u0084\u0086-\u009F \u0009-\u000D \u0085]+ regex.
To only match 161 other format chars including the well-known soft hyphen (\u00AD), zero-width space (\u200B), zero-width non-joiner (\u200C), zero-width joiner (\u200D), left-to-right mark (\u200E) and right-to-left mark (\u200F) use \p{Cf}+. The equivalent including astral place code points is a (?:[\xAD\u0600-\u0605\u061C\u06DD\u070F\u08E2\u180E\u200B-\u200F\u202A-\u202E\u2060-\u2064\u2066-\u206F\uFEFF\uFFF9-\uFFFB]|\uD804[\uDCBD\uDCCD]|\uD80D[\uDC30-\uDC38]|\uD82F[\uDCA0-\uDCA3]|\uD834[\uDD73-\uDD7A]|\uDB40[\uDC01\uDC20-\uDC7F])+ regex.
To match 137,468 Other, Private Use control code points you may use \p{Co}+, or its equivalent including astral place code points, (?:[\uE000-\uF8FF]|[\uDB80-\uDBBE\uDBC0-\uDBFE][\uDC00-\uDFFF]|[\uDBBF\uDBFF][\uDC00-\uDFFD])+.
To match 2,048 Other, Surrogate code points that include some emojis, you may use \p{Cs}+, or [\uD800-\uDFFF]+ regex.
You can try with :
string s = "Täkörgåsmrgås";
s = Regex.Replace(s, #"[^\u0000-\u007F]+", string.Empty);
Updated answer after comments:
Documentation about non-printable character:
https://en.wikipedia.org/wiki/Control_character
Char.IsControl Method:
https://msdn.microsoft.com/en-us/library/system.char.iscontrol.aspx
Maybe you can try:
string input; // this is your input string
string output = new string(input.Where(c => !char.IsControl(c)).ToArray());
To remove all control and other non-printable characters
Regex.Replace(s, #"\p{C}+", String.Empty);
To remove the control characters only (if you don't want to remove the emojis 😎)
Regex.Replace(s, #"\p{Cc}+", String.Empty);
you can try this:
public static string TrimNonAscii(this string value)
{
string pattern = "[^ -~]*";
Regex reg_exp = new Regex(pattern);
return reg_exp.Replace(value, "");
}

Replace every " with \" in Lua

X-Problem: I want to dump an entire lua-script to a single string-line, which can be compiled into a C-Program afterwards.
Y-Problem: How can you replace every " with \" ?
I think it makes sense to try something like this
data = string.gsub(line, "c", "\c")
where c is the "-character. But this does not work of course.
You need to escape both quotes and backslashes, if I understand your Y problem:
data = string.gsub(line, "\"", "\\\"")
or use the other single quotes (still escape the backslash):
data = string.gsub(line, '"', '\\"')
A solution to your X-Problem is to safely escape any sequence that could interfere with the interpreter.
Lua has the %q option for string.format that will format and escape the provided string in such a way, that it can be safely read back by Lua. It should be also true for your C interpreter.
Example string: This \string's truly"tricky
If you just enclosed it in either single or double-quotes, there'd still be a quote that ended the string early. Also there's the invalid escape sequence \s.
Imagine this string was already properly handled in Lua, so we'll just pass it as a parameter:
string.format("%q", 'This \\string\'s truly"tricky')
returns (notice, I used single-quotes in code input):
"This \\string's truly\"tricky"
Now that's a completely valid Lua string that can be written and read from a file. No need to manually escape every special character and risk implementation mistakes.
To correctly implement your Y approach, to escape (invalid) characters with \, use proper pattern matching to replace the captured string with a prefix+captured string:
string.gsub('he"ll"o', "[\"']", "\\%1") -- will prepend backslash to any quote

Processing a string with the null character

I have a text file full of strings (computer paths) which I want to process by replacing every backslash with an underscore, in addition to replacing every number ( integer or float) with an underscore as well, the original string looks like that :
string = "\Software\Microsoft\0\Windows\CurrentVersion\Internet Settings\5.0\Cache"
Usually, I could replace easily the backslash with the following command:
string=string.replace('\\','_')
and apply some regular expressions such as: '(\d(?:\.\d)?)' to replace the numbers.
However in my case I couldn't do either, because python recognise always '\0' as a null character and '\5.0' as ENQ, in fact any number follow the backslash will be treated the same way as well.
Any suggested way to replace them ?
e.g. is there a way to convert my string to raw string as a start ?
Always remember: Backslash(\) escapes special characters. If you want to use the backslash itself, you need to escape it too. Your string should look like this:
string = "\\Software\\Microsoft\\0\\Windows\\CurrentVersion\\Internet Settings\\5.0\\Cache"

Double-escaping regex from inside a Groovy expression

Note: I had to simplify my actual use case to spare SO a lot of backstory. So if your first reaction to this question is: why would you ever do this, trust me, I just need to.
I'm trying to write a Groovy expression that replaces double-quotes (""") that appear in a string with single-quotes ("'").
// BEFORE: Replace my "double" quotes with 'single' quotes.
String toReplace = "Replace my \"double-quotes\" with 'single' quotes.";
// Wrong: compiler error
String replacerExpression = "toReplace.replace(""", "'");";
Binding binding = new Binding();
binding.setVariable("toReplace", toReplace);
GroovyShell shell = new GroovyShell(binding);
// AFTER: Replace my 'double' quotes with 'single' quotes.
String replacedString = (String)shell.evaluate(replacerExpression);
The problem is, I'm getting a compile error on the line where I assign replacerExpression:
Syntax error on token ""toReplace.replace("", { expected
I think it's because I need to escape the string that contains the double-quote character (""") but since it's a string-inside-a-string, I'm not sure how to properly escape it here. Any ideas?
You need to escape the quote within quotes in this line:
String replacerExpression = "toReplace.replace(""", "'");";
The string will be evaluated twice: once as a string literal, and once as a script. This means you have to escape it with a backslash, and escape the backslash too. Also, with the embedded quotes, it'll be much more readable if you use triple quotes.
Try this (in groovy):
String replacerExpression = """toReplace.replace("\\"", "'");""";
In Java, you're stuck with using backslashes to escape all the quotes and the embedded backslash:
String replacerExpression = "toReplace.replace(\"\\\"\", \"\'\");";
Triple-quotes work well, but one can also use single-quoted string to specify a double-quote, and a double-quoted string for a single-quote.
Consider this:
String toReplace = "Replace my \"double-quotes\" with 'single' quotes."
// key line:
String replacerExpression = """toReplace.replace('"', "'");"""
Binding binding = new Binding(); binding.setVariable("toReplace", toReplace)
GroovyShell shell = new GroovyShell(binding)
String replacedString = (String)shell.evaluate(replacerExpression)
That is, after the string literal evaluation, this is evaluated in the Groovy shell:
toReplace.replace('"', "'");
If that is too hard on the eyes, replace the "key line" above with another style (using slashy strings):
String ESC_DOUBLE_QUOTE = /'"'/
String ESC_SINGLE_QUOTE = /"'"/
String replacerExpression = """toReplace.replace(${ESC_DOUBLE_QUOTE}, ${ESC_SINGLE_QUOTE});"""
Please try to use regular expressions to solve this kind of problems, instead of messing your head to tackle the escaping of quotes.
I have put up a solution using groovy console. Please see if that helps.

Why doesn't my Regular Expression for an empty string work?

I've tested this on Regexr and it works, but it doesn't seem to work in AS3:
var emptyResult:* = new RegExp("^\s*$", "gi").exec(myField.text);
or
var emptyResult:* = /^\s*$/gi.exec(myField.text);
No matter whether I have text in the field or not, whitespace or non-whitespace, emptyResult is always null. I've tried with and without the g and i tags, but nothing seems to work.
Anyone know why this might be?
You don't need the 'i' flag - it stands for 'ignore case', which is only applicable to letters of Latin alphabet - you aren't using them.
In the first example, you need to escape the backslash, otherwise it's treated as if it was meant to escape the following letter 's'.
You don't need the 'g' flag either, since you are trying to test the entire string (in your case the line and the string are the same thing, \s will first encounter the end of the line, before $ can).
When using your second regexp, however, with the 'i' flag removed, it gives me the results I'd expect, i.e. if the entire text of the tested string consists of white space, tabulation, carriage return or line feed, then that entire string is returned.
For instance:
trace(/^\s*$/.exec(" \t\r\n")[0].length); // 4