Correcting Wrong Marker Folding in VIM

I mistakenly wrote the fold markers in my .vimrc like this:
{{{8 #CS
something..
}}}8
{{{9 #Math
...
}}}9
... many more!
I need to switch the format to "#SOMETHING {{{NUMBER" like:
#CS {{{8
something..
}}}8
#Math {{{9
...
}}}9
... many more!
What is wrong with the following command?
:%s$/({{{\d/) /(#[:alpha:]/)$\2 \1$g
[Solution]
%s$\({{{\d\) \(#[[:alnum:]]*\)$\2 \1$g
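To sanity-check the swap outside Vim, here is a minimal Python sketch of the same substitution. It assumes the markers look exactly like the sample in the question (three braces, one digit, a space, then a # label). The name swap_markers is hypothetical, and Python's re syntax differs from Vim's: groups are unescaped and the braces are escaped.

```python
import re

# Same idea as the Vim substitution: capture "{{{N" and "#Label", then swap them.
MARKER = re.compile(r"(\{\{\{\d) (#[A-Za-z0-9]*)")


def swap_markers(text: str) -> str:
    """Rewrite '{{{8 #CS' as '#CS {{{8'; closing markers such as '}}}8' are untouched."""
    return MARKER.sub(r"\2 \1", text)


sample = "{{{8 #CS\nsomething..\n}}}8\n{{{9 #Math\n...\n}}}9"
print(swap_markers(sample))
```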

You forgot to escape the parentheses, and POSIX character classes are only valid inside a bracket expression, as in [[:alpha:]]:
:%s$/\({{{\d/\) /\(#[[:alpha:]]/\)$\2 \1$g
Note, however, that your example text doesn't contain any slashes - is this what your sample text is actually like?
The above regex changes this:
/{{{8/ /#A/
to this:
#A/ {{{8/

:%s/{{{\(\d\) \(.*\)/\2 {{{\1/g
This works, but I don't understand why you have a $ after the s in your regex.

Related

Scala. Regexp can't remove symbol ^

I need to split a sentence into words, removing redundant characters.
I prepared a regexp for that:
val wordCharacters = """[^A-z'\d]""".r
Right now I apply it in the following way:
wordCharacters.split(words)
.filterNot(_.isEmpty)
where words is any sentence I need to parse.
The issue is that when I try to handle "car: carpet, as,,, java: javascript!!&#$%^&" I get one extra word, ^. When I change my regex to remove the ^, I run into many more issues in other cases...
Any ideas how to solve it?
P.S.
If somebody wants to play with it, please try the code below:
val wordCharacters = """[^A-z'\d]""".r
val stringToInt =
wordCharacters.split("car: carpet, as,,, java: javascript!!&#$%^&")
.filterNot(_.isEmpty)
.toList
println(stringToInt)
Expected result is:
List(car, carpet, as, java, javascript)

The part A-z is not exactly what you want. Probably you assume that lower a comes immediately after upper Z, but there are some other characters in between, and one of them is ^.
So, correcting the regex as
"""[^A-Za-z'\d]""".r
would fix the issue.
Have a look at the order of characters:
https://en.wikipedia.org/wiki/List_of_Unicode_characters
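A quick way to see this for yourself, sketched in Python rather than Scala (the character ranges behave the same way in both regex engines). The helper name words is made up for this illustration.

```python
import re

# Characters that sit between 'Z' and 'a' in ASCII, and are therefore inside A-z.
between = [chr(c) for c in range(ord("Z") + 1, ord("a"))]
print(between)

sentence = "car: carpet, as,,, java: javascript!!&#$%^&"


def words(pattern: str, text: str) -> list[str]:
    """Split on the separator pattern and drop empty pieces."""
    return [w for w in re.split(pattern, text) if w]


print(words(r"[^A-z'\d]", sentence))     # '^' survives as a "word"
print(words(r"[^A-Za-z'\d]", sentence))  # only the five expected words
```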

I'd be tempted to start with \W and expand from there.
"\\W+".r.split("car: carpet, as,,, java: javascript!!&#$%^&")
//res0: Array[String] = Array(car, carpet, as, java, javascript)
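One thing to keep in mind when starting from \W: it also treats the apostrophe as a separator, which the character class in the question deliberately kept. Here is a rough Python sketch of "expanding from there". The sample sentence with apostrophes is an assumption added for illustration, and note that \w also accepts underscores.

```python
import re

# Hypothetical sample containing apostrophes (not from the question).
sentence = "don't panic: it's java, javascript!!&#$%^&"


def split_words(pattern, text):
    """Split on the separator pattern and drop empty pieces."""
    return [w for w in re.split(pattern, text) if w]


print(split_words(r"\W+", sentence))      # apostrophes split the words apart
print(split_words(r"[^\w']+", sentence))  # apostrophes are kept inside words
```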

Deleting comments in a large file

I am trying to delete a bunch of comments that are all in the following format:
/**
* #ngdoc
... comment body (delete me, too!)
*/
I have tried using this command: %s/\/**\n * #ngdoc.\{-}*\///g
Here is the regex without the patterns: %s/pattern1.\{-}pattern2//g
Here are the individual patterns: \/**\n * #ngdoc and *\/
When I try my pattern in vim I get the following error:
E871: (NFA regexp) Can't have a multi follow a multi !
E61: Nested *
E476: Invalid command
Thanks for any help with this regexp nightmare!

Instead of trying to cram this into one complex regex, it's much easier to search for the start of a comment and delete from there to the end of the comment:
:g/^\/\*\*$/,/\*\/$/d_
This breaks down into:
:g            start a global command
/^\/\*\*$/    search for the start of a comment: <sol>/**<eol>
,/\*\/$/      extend the range to the end of a comment: */<eol>
d             delete the range
_             use the black hole register (performance optimization)
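If you ever need the same thing outside Vim, the start-to-end range idea can be sketched in a few lines of Python. This is only an illustration under the same assumptions as the :g command: the opening line is exactly /** and the closing line ends with */. Like the :g command, it removes every such block, not only the ones containing #ngdoc. The function name is made up.

```python
def strip_block_comments(lines):
    """Drop each run of lines from one that is exactly '/**' through the next line ending in '*/'."""
    kept = []
    inside = False
    for line in lines:
        if not inside and line == "/**":
            inside = True          # start of a comment: skip this line
        elif inside:
            if line.endswith("*/"):
                inside = False     # end of the comment: skip it, then resume keeping lines
        else:
            kept.append(line)
    return kept


source = ["var a = 1;", "/**", " * #ngdoc", " * comment body", " */", "var b = 2;"]
print(strip_block_comments(source))
```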

Your problem is you have \{-} followed by * which are the multis referenced in the error message. Quote the *:
%s/\/\*\*\n \* #ngdoc\_.\{-}\*\/\n//g
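For comparison, here is roughly the same pattern in Python's re syntax. It is a sketch that assumes the comment lines start with a single space before the *, as in the pattern above. The lazy .*? plays the role of Vim's \{-}, and re.DOTALL plays the role of \_. by letting the dot cross newlines.

```python
import re

source = "keep();\n/**\n * #ngdoc\n * comment body\n */\nalso_keep();\n"

# Every literal * is escaped; .*? stops at the first closing */ it can reach.
NGDOC = re.compile(r"/\*\*\n \* #ngdoc.*?\*/\n", re.DOTALL)

cleaned = NGDOC.sub("", source)
print(cleaned)
```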

Using embedded newlines in the pattern is the wrong approach. You should instead use an address range. Something like:
sed '\#^/\*\*$#,\#^\*/$#d' file
This will delete all lines starting from one that matches /** anchored at column 1 to the line matching */ anchored at column 1. If your comments are well behaved (eg, no trailing space after /**), this should do what you want.

Try this, using the gc flags so that each deletion asks for confirmation:
%s/\v\/\*\*\n\s\*\s\#ngdoc\n((\s*\n)?(\s\*.*\n)?){-}\s?\*\///gc
It matches comments like:
/**
* #ngdoc
* ... comment body (delete me, too!)
*
*/

My approach is to use a macro:
qa/\/\*\*<enter><shift-v>/\*\/<enter>d
qa ........ starts recording macro "a"
/\/\*\* ... searches for the comment beginning
<Enter> ... use Ctrl-v Enter
V ......... starts linewise visual mode (until...)
/\*\/ ..... end of your comment
<Enter> ... Ctrl-v Enter again
d ......... deletes the selected area
To insert special keys such as <Enter>, press Ctrl-v followed by the key you want.

Regex to ignore commented lines C++

I'm trying to use regex to find all variable initializations or assignments in code.
Currently I have
(\w+|\w[_])\s*=\s*(\d+\.\d+|.*)
which works but also finds commented out code like
// a = 100;
which I don't want it to do. I've tried
([^/]\w+|\w[_])\s*=\s*(\d+\.\d+|.*)
which I thought should ignore strings that start with / but that doesn't work.
Edit:
For example I'd like it to find lines like
b = 200;
but not // c = 3;

Try this, and adapt it if necessary:
^(?:(?!\/\/).)*[a-z][a-z0-9\_]*\s*=\s*[0-9]+;
SEE DEMO: http://regex101.com/r/jE4vM0/3
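Here is a small check of that pattern in Python (the regex itself is engine-neutral, only the unneeded escapes are dropped). The sample lines are made up. Note the pattern only covers lowercase identifiers assigned an integer literal, so it is a starting point rather than a full solution.

```python
import re

# (?:(?!//).)* consumes characters one at a time, but never steps onto a "//".
ASSIGN = re.compile(r"^(?:(?!//).)*[a-z][a-z0-9_]*\s*=\s*[0-9]+;")

lines = ["b = 200;", "// c = 3;", "int x = 5; // y = 7;", "    // a = 100;"]
results = {line: bool(ASSIGN.search(line)) for line in lines}
for line, hit in results.items():
    print(repr(line), hit)
```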

Use this regex and check if the first sub-match is "//", if yes, it is after a comment.
(//)*\s*(\w+|\w[_])\s*=\s*(\d+\.\d+|.*)
For example, "var=5;" yields three sub-matches: an empty one, "var", and "5;" (the trailing .* also captures the semicolon), while "//var=5;" yields "//", "var", and "5;".
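A sketch of that check in Python, where an unmatched group comes back as None rather than an empty string. The helper name is_commented is hypothetical. Keep in mind that the last group captures everything up to the end of the line, including the semicolon.

```python
import re

PATTERN = re.compile(r"(//)*\s*(\w+|\w[_])\s*=\s*(\d+\.\d+|.*)")


def is_commented(line: str) -> bool:
    """True when the assignment found on this line is preceded by '//'."""
    m = PATTERN.search(line)
    return m is not None and m.group(1) == "//"


plain = PATTERN.search("var=5;").groups()
commented = PATTERN.search("//var=5;").groups()
print(plain, commented)
print(is_commented("b = 200;"), is_commented("// c = 3;"))
```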

Regex: How to match a string that is not only numbers

Is it possible to write a regular expression that matches all strings that do not contain only digits? If we have these strings:
abc
a4c
4bc
ab4
123
It should match the first four, but not the last one. I have tried fiddling around in RegexBuddy with lookaheads and stuff, but I can't seem to figure it out.

(?!^\d+$)^.+$
This says: look ahead to make sure the line does not consist entirely of digits, then match the entire line.
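A quick check against the strings from the question, sketched in Python (the lookahead syntax is the same):

```python
import re

NOT_ONLY_DIGITS = re.compile(r"(?!^\d+$)^.+$")

samples = ["abc", "a4c", "4bc", "ab4", "123"]
matched = [s for s in samples if NOT_ONLY_DIGITS.search(s)]
print(matched)
```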

Unless I am missing something, I think the most concise regex is...
/\D/
...or in other words, is there a not-digit in the string?
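Note that this relies on search semantics, i.e. finding one non-digit anywhere in the string. An API that anchors the pattern to the whole string needs something like .*\D.* instead. Here is a small Python sketch of the difference; the empty string in the sample list is an added assumption to show that edge case.

```python
import re

samples = ["abc", "a4c", "4bc", "ab4", "123", ""]

# re.search looks for one non-digit anywhere in the string.
found = [s for s in samples if re.search(r"\D", s)]
print(found)

# re.fullmatch anchors to the whole string, so a bare \D would only accept
# one-character strings; none of the samples qualify.
whole = [s for s in samples if re.fullmatch(r"\D", s)]
print(whole)
```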

jjnguy had it correct (if slightly redundant) in an earlier revision.
.*?[^0-9].*
@Chad, your regex,
\b.*[a-zA-Z]+.*\b
should probably allow for non letters (eg, punctuation) even though Svish's examples didn't include one. Svish's primary requirement was: not all be digits.
\b.*[^0-9]+.*\b
Then, you don't need the + in there since all you need is to guarantee 1 non-digit is in there (more might be in there as covered by the .* on the ends).
\b.*[^0-9].*\b
Next, you can do away with the \b on either end since these are unnecessary constraints (invoking reference to alphanum and _).
.*[^0-9].*
Finally, note that this last regex shows that the problem can be solved with just the basics, those basics which have existed for decades (eg, no need for the look-ahead feature). In English, the question was logically equivalent to simply asking that 1 counter-example character be found within a string.
We can test this regex in a browser by copying the following into the location bar, replacing the string "6576576i7567" with whatever you want to test.
javascript:alert(new String("6576576i7567").match(".*[^0-9].*"));

/^\d*[a-z][a-z\d]*$/
Or, case insensitive version:
/^\d*[a-z][a-z\d]*$/i
That is: optional digits at the beginning, then at least one letter, then any mix of letters and digits.

Try this:
/^.*\D+.*$/
It returns true if there is any symbol that is not a digit. Works fine with all languages.

Since you said "match", not just validate, the following regex will match correctly
\b.*[a-zA-Z]+.*\b
Passing Tests:
abc
a4c
4bc
ab4
1b1
11b
b11
Failing Tests:
123

If you are trying to match words that contain at least one letter and consist of digits and letters (or just letters), this is what I have used:
(\d*[a-zA-Z]+\d*)+

If we want to restrict valid characters so that the string can only be made from a limited set of characters, try this:
(?!^\d+$)^[a-zA-Z0-9_-]{3,}$
or
(?!^\d+$)^[\w-]{3,}$
/\w+/ matches any run of word characters, i.e. letters, digits, or underscores.

.*[^0-9]{1,}.*
Works fine for us.
We wanted to use the accepted answer, but it does not work within a YANG model.
The one I provided here is easy to understand:
the start and end can be any characters, but there must be at least one non-numeric character.

I am using /^[0-9]*$/gm in my JavaScript code to see if a string contains only digits. If it does, the check fails; otherwise the string is returned.
Below is a working code snippet with test cases:
function isValidURL(string) {
  // match() returns null when no line of the string consists only of digits (or is empty)
  var res = string.match(/^[0-9]*$/gm);
  if (res == null)
    return string;
  else
    return "fail";
}
var testCase1 = "abc";
console.log(isValidURL(testCase1)); // abc
var testCase2 = "a4c";
console.log(isValidURL(testCase2)); // a4c
var testCase3 = "4bc";
console.log(isValidURL(testCase3)); // 4bc
var testCase4 = "ab4";
console.log(isValidURL(testCase4)); // ab4
var testCase5 = "123";
console.log(isValidURL(testCase5)); // fail

I had to do something similar in MySQL, and the following, whilst oversimplified, seems to have worked for me:
WHERE fieldname REGEXP '^[a-zA-Z0-9]+$'
AND fieldname NOT REGEXP '^[0-9]+$'
This shows all fields that are alphabetic or alphanumeric, while any fields that are purely numeric are hidden.
example:
name1 - Displayed
name - Displayed
name2 - Displayed
name3 - Displayed
name4 - Displayed
n4ame - Displayed
324234234 - Not Displayed

Regex - If contains '%', can only contain '%20'

I want to create a regular expression for the following scenario:
If a string contains the percentage character (%) then it can only contain the following: %20, and cannot be preceded by another '%'.
So if there was for instance, %25 it would be rejected. For instance, the following string would be valid:
http://www.test.com/?&Name=My%20Name%20Is%20Vader
But these would fail:
http://www.test.com/?&Name=My%20Name%20Is%20VadersAccountant%25
%%%25
Any help would be greatly appreciated,
Kyle
EDIT:
The scenario in a nutshell is that a link is written to an encoded state and then launched via JavaScript. No decoding works. I tried .net decoding and JS decoding, each having the same result - The results stay encoded when executed.

Doesn't require a %:
/^[^%]*(%20[^%]*)*$/
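To see how this behaves on the examples from the question, here is a small Python sketch. The last URL, which contains no % at all, is an added assumption to show that a % is not required.

```python
import re

# Runs of non-% characters, optionally separated by literal %20 sequences.
ONLY_PERCENT20 = re.compile(r"^[^%]*(%20[^%]*)*$")

urls = [
    "http://www.test.com/?&Name=My%20Name%20Is%20Vader",
    "http://www.test.com/?&Name=My%20Name%20Is%20VadersAccountant%25",
    "%%%25",
    "http://www.test.com/",
]
verdicts = [bool(ONLY_PERCENT20.search(u)) for u in urls]
for url, ok in zip(urls, verdicts):
    print(ok, url)
```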

Which language are you using?
Most languages have a Uri Encoder / Decoder function or class.
I would suggest you decode the string first and then check for valid (or invalid) characters,
e.g. with something like /[\w ]/ (the blank in the class is a space).
If you apply a regex to the encoded string instead, you need to take into account that www.example.com/index.html?user=admin&pass=%%250 means that the pass really is "%250".

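Another solution if look-arounds are not available. Note that, as written, this pattern accepts every two-digit hex escape except %20, so it implements the inverse of the rule in the question: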
^([^%]|%([013-9a-fA-F][0-9a-fA-F]|2[1-9a-fA-F]))*$

Reject the string if it matches %[^2][^0]
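One caveat worth checking: as written, %[^2][^0] only fires when the first character after the % is not 2 and the second is not 0, so it misses escapes such as %25 and %30. Here is a hedged Python sketch of the difference. The negative-lookahead variant %(?!20) is a swapped-in alternative for comparison, not part of the suggestion above, and the sample strings are made up.

```python
import re

naive = re.compile(r"%[^2][^0]")   # the pattern as suggested
strict = re.compile(r"%(?!20)")    # alternative: any % not followed by "20"

samples = ["My%20Name", "Accountant%25", "x%30", "%%%25"]
naive_hits = [bool(naive.search(s)) for s in samples]
strict_hits = [bool(strict.search(s)) for s in samples]
for s, n, t in zip(samples, naive_hits, strict_hits):
    print(s, n, t)
```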

I think that would find what you need
/^([^%]|%%|%20)+$/
Edit: Added case where %% is valid string inside URI
Edit2: And fixed it for case where it should fail :-)
Edit3:
In case you need to use it in an editor (which would explain why you can't use a more programmatic way), you have to escape all special characters correctly. For example, in Vim that regex should look like:
/^\([^%]\|%%\|%20\)\+$/

Maybe a better approach is to deal with that validation after you decode that string:
string name = HttpUtility.UrlDecode(Request.QueryString["Name"]);

/^([^%]|%20)*$/

This requires a test against the "bad" patterns. If we're allowing %20, we don't need to make sure it exists.
As others have said before, %% is valid too, and %%25 would be %25.
The regex below matches anything that doesn't fit the above rules:
/(?<![^%]%)%(?!(20|%))/
The first brackets check whether there is a % before the character (meaning that it's %%) and also check that it's not %%%. The regex then checks for a %, and checks that what follows is neither 20 nor another %.
This means that if anything is identified by the regex, then you should probably reject the string.

I agree with dominic's comment on the question. Don't use Regex.
If you want to avoid scanning the string twice, you can just iteratively search for % and then check that it is followed by 20 and nothing else. (Update: a % followed by another % is treated as a literal % and skipped.)
// pseudo code
pos = mystring.find('%', 0)
while (pos != -1)
{
    if mystring[pos+1] == "%" then
        pos = pos + 2 // ok, this is a literal %, skip ahead
    else if mystring.substring(pos+1, 2) == "20" then
        pos = pos + 3 // ok, this is %20, skip ahead
    else
        return false; // string is invalid
    end if
    pos = mystring.find('%', pos)
}
return true;
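A runnable take on the same scan, sketched in Python. The function name only_percent20 is made up, and treating %% as a literal percent sign follows the update above (the question itself would reject it).

```python
def only_percent20(s: str) -> bool:
    """Return True when every '%' in s is part of '%%' or '%20'."""
    pos = s.find("%")
    while pos != -1:
        if s.startswith("%%", pos):
            pos += 2            # literal percent sign, skip both characters
        elif s.startswith("%20", pos):
            pos += 3            # encoded space, skip the whole escape
        else:
            return False        # any other escape makes the string invalid
        pos = s.find("%", pos)
    return True


print(only_percent20("http://www.test.com/?&Name=My%20Name%20Is%20Vader"))
print(only_percent20("http://www.test.com/?&Name=My%20Name%20Is%20VadersAccountant%25"))
print(only_percent20("%%%25"))
```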
