Regular expression to replace special characters in a node property in Neo4j - regex

I have a property whose value may contain the following characters: ~!##$%^&*() and the space character.
I want to replace all of them with an empty string.
Please suggest a suitable regular expression to do this.

You already have the regular expression. It is the character class made of all the characters you listed:
[~!##$%^&*() ]
You just have to replace all occurrences with an empty string, using the regex/string API of your language.
For example, in Java:
// The pattern can be declared as a constant, computed only once.
Pattern p = Pattern.compile("[~!##$%^&*() ]");
String newPropName = p.matcher(propName).replaceAll("");
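For comparison, here is a minimal sketch of the same replacement in Python. The function name strip_special and the sample strings are made up for illustration; the character class is the one from the question (a repeated # inside a class is harmless, so it is written once here).

```python
import re

# Character class listing every character to drop, plus the space.
# Compiled once so it can be reused across calls.
SPECIAL = re.compile(r"[~!#$%^&*() ]")


def strip_special(value: str) -> str:
    """Return value with every listed special character removed."""
    return SPECIAL.sub("", value)


print(strip_special("~!#1~!#"))        # 1
print(strip_special("my prop (v2)!"))  # mypropv2
```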

There is a thus-far undocumented APOC function, apoc.text.replace, that you can use from your Cypher code. It accepts a regular expression as its second parameter. (Since it is a function, it is not invoked in a CALL clause.)
For example:
RETURN apoc.text.replace('~!#1~!#', '[~!##$%^&*() ]', '') AS res;
returns:
╒═══╕
│res│
╞═══╡
│1 │
└───┘


Regular expression

I need a regular expression for the string cases below:
String value = "�江苏银行股份有限公司南京迈皋桥支行";
String value = "�/CNYXB/02112";
In both cases only the character "�" needs to be removed, and the final string values after applying the regular expression should be as follows:
String value = "江苏银行股份有限公司南京迈皋桥支行";
String value = "/CNYXB/02112";
Thanks in advance!
Yes, I have tried the regex below:
value = value.replaceAll("[^\\p{ASCII}]", "");
I'm not sure if this is what you're actually asking, but you can easily remove the first character from the string:
^.
matches the first character at the start of the string.
If you want to remove an out-of-range character, then you need to define your range. Use multiple classes with octal escapes, something like:
[\o{2444}-\o{3444}\o{40}-\o{77}]
Without knowing what the characters you're looking for really are, it's difficult to be more specific.
Try using replaceFirst instead of replaceAll:
value = value.replaceFirst("[^\\p{ASCII}]", "");
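Note that a pattern such as [^\\p{ASCII}] matches every non-ASCII character, so it also removes the Chinese text in the first sample. Here is a minimal Python sketch that removes only the unwanted character, assuming it really is U+FFFD (the Unicode replacement character), which the question does not confirm. The name drop_replacement_char is made up for illustration.

```python
import re

# Assumption: the stray character is U+FFFD, the Unicode replacement character.
REPLACEMENT_CHAR = "\ufffd"


def drop_replacement_char(value: str) -> str:
    """Remove only U+FFFD and leave every other character untouched."""
    return value.replace(REPLACEMENT_CHAR, "")


samples = ["\ufffd江苏银行股份有限公司南京迈皋桥支行", "\ufffd/CNYXB/02112"]
for sample in samples:
    print(drop_replacement_char(sample))

# For comparison: an ASCII-only filter wipes the Chinese text entirely.
print(repr(re.sub(r"[^\x00-\x7F]", "", samples[0])))  # ''
```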

Using regex to find data in between certain data [duplicate]

How can I extract a substring from within a string in Ruby?
Example:
String1 = "<name> <substring>"
I want to extract substring from String1 (i.e. everything within the last occurrence of < and >).
"<name> <substring>"[/.*<([^>]*)/,1]
=> "substring"
No need to use scan, if we need only one result.
No need to use Python's match, when we have Ruby's String[regexp,#].
See: http://ruby-doc.org/core/String.html#method-i-5B-5D
Note: str[regexp, capture] → new_str or nil
String1.scan(/<([^>]*)>/).last.first
scan creates an array which, for each <item> in String1 contains the text between the < and the > in a one-element array (because when used with a regex containing capturing groups, scan creates an array containing the captures for each match). last gives you the last of those arrays and first then gives you the string in it.
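For readers more at home in Python, the scan-then-take-the-last idea can be sketched with re.findall, which returns the captured text for each match when the pattern has a single group. The function name last_bracketed is made up for illustration.

```python
import re
from typing import Optional


def last_bracketed(text: str) -> Optional[str]:
    """Return the text inside the last <...> pair, or None if there is none."""
    # With exactly one capturing group, findall returns a list of the captures.
    found = re.findall(r"<([^>]*)>", text)
    return found[-1] if found else None


print(last_bracketed("<name> <substring>"))  # substring
print(last_bracketed("no brackets here"))    # None
```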
You can use a regular expression for that pretty easily…
Allowing spaces around the word (but not keeping them):
str.match(/< ?([^>]+) ?>\Z/)[1]
Or without the spaces allowed:
str.match(/<([^>]+)>\Z/)[1]
Here's a slightly more flexible approach using the match method. With this, you can extract more than one string:
s = "<ants> <pants>"
matchdata = s.match(/<([^>]*)> <([^>]*)>/)
# Use 'captures' to get an array of the captures
matchdata.captures # ["ants","pants"]
# Or use raw indices
matchdata[0] # whole regex match: "<ants> <pants>"
matchdata[1] # first capture: "ants"
matchdata[2] # second capture: "pants"
A simpler scan would be:
String1.scan(/<(\S+)>/).last.first

Regular expression to filter special characters not working

I am searching for a regular expression I can use to check whether a user input contains special characters from a specified list.
Here are the special characters that are not allowed, in the regular expression I tried to write: ^[`~!##$%^&*()_+={}\[\]|\\:;“’<,>.?๐฿]*$
I went to https://regex101.com/ and expected the following input to match, but it did not. Why?
127 elmer road ??<>()
So in Android Java (but it can be any language) I wrote the following function, but it also always returns true. How can I filter all these special characters? I want a function that returns true if a given string does NOT match.
public boolean isValid(EditText et) {
    String string = et.getText().toString();
    boolean isValid = true;
    final Pattern sPattern
            = Pattern.compile("^[`~!##$%^&*()_+={}\\[\\]|\\\\:;“’<,>.?๐฿]*$");
    isValid = !sPattern.matcher(string).matches();
    return isValid;
}
Update: I also tried the following:
I want a function that returns true if a given string does NOT match.
You can negate the character set. (Note the ^ symbol within the square brackets). This will return true for strings that don't contain any of these special characters.
^[^`~!##$%^&*()_+={}\[\]|\\:;“’<,>.?๐฿]*$
https://regex101.com/r/CqtqoK/1
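An equivalent way to express the check is to search for a single forbidden character anywhere in the input instead of anchoring a negated class over the whole string. Below is a minimal Python sketch of that variant. The character class is copied from the question, and the function name is_valid mirrors the question's Java method.

```python
import re

# One forbidden character anywhere in the input makes it invalid.
# The [ ] and \ characters are escaped inside the class; the rest are literal.
SPECIAL = re.compile(r"[`~!##$%^&*()_+={}\[\]|\\:;“’<,>.?๐฿]")


def is_valid(text: str) -> bool:
    """Return True when text contains none of the forbidden characters."""
    return SPECIAL.search(text) is None


print(is_valid("127 elmer road"))         # True
print(is_valid("127 elmer road ??<>()"))  # False
```

The original pattern ^[...]*$ only matches strings made up entirely of special characters, which is why negating matches() returned true for almost every input.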

Playframework with Deadbolt 2: Pattern regular expression not match

I am using Deadbolt 2 with Play Framework 2.3.x. When I try to access a controller protected by a Deadbolt Pattern that uses a regular expression, I get a Not Found error. According to this sample, it is possible to use regular expressions with Pattern in our application, but when I declare one, I am not able to use it. My code looks like this:
def pattern_one = Pattern("CH{4,}", PatternType.REGEX, new MyDeadboltHandler) {} // NOT ACCESSED
def pattern_one = Pattern("CH*", PatternType.REGEX, new MyDeadboltHandler) { // NOT ACCESSED
def pattern_one = Pattern("CHANNEL", PatternType.REGEX, new MyDeadboltHandler) { // ACCESSED SUCCESSFULLY
Regular expressions are not wildcards. Where a * wildcard matches anything any number of times, in a regex you need to use .*, where . means any character but a newline, and * means 0 or more times.
Moreover, if you want to match a string that contains a word starting with CH, you can use a word boundary, \\b: \\bCH.*.
If you want to specify that the string must start with CH and match the whole string, you can use ^CH.*.
You need to use CH.* or CH.{4,} if you want something (not just Hs) after CH. The . means any character, just like in any other regular expression.
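To see why the first two patterns fail, here is a small Python sketch. It assumes the pattern has to match the whole permission string (which is what the behavior in the question suggests, but is not confirmed for Deadbolt). The helper name allowed is made up for illustration.

```python
import re


def allowed(pattern: str, permission: str) -> bool:
    """Return True when pattern matches the entire permission string."""
    return re.fullmatch(pattern, permission) is not None


# H{4,} means "four or more H characters", not "four or more of anything".
print(allowed("CH{4,}", "CHANNEL"))   # False
# H* means "zero or more H characters".
print(allowed("CH*", "CHANNEL"))      # False
# .* and .{4,} repeat "any character", which is what was intended.
print(allowed("CH.*", "CHANNEL"))     # True
print(allowed("CH.{4,}", "CHANNEL"))  # True
print(allowed("CHANNEL", "CHANNEL"))  # True
```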

how to use a regular expression to extract json fields?

Beginner RegExp question. I have lines of JSON in a text file, each with slightly different fields, but there are 3 fields I want to extract from each line if it has them, ignoring everything else. How would I use a regex (in EditPad or anywhere else) to do this?
Example:
"url":"http://www.netcharles.com/orwell/essays.htm",
"domain":"netcharles.com",
"title":"Orwell Essays & Journalism Section - Charles' George Orwell Links",
"tags":["orwell","writing","literature","journalism","essays","politics","essay","reference","language","toread"],
"index":2931,
"time_created":1345419323,
"num_saves":24
I want to extract url, title and tags.
/"(url|title|tags)":"((\\"|[^"])*)"/i
I think this is what you're asking for. I'll provide an explanation momentarily. This regular expression (delimited by / - you probably won't have to put those in editpad) matches:
"
A literal ".
(url|title|tags)
Any of the three literal strings "url", "title" or "tags". In regular expressions, parentheses are used by default to create groups, and the pipe character is used to alternate, like a logical 'or'. To match a parenthesis or a pipe literally, you'd have to escape it.
":"
Another literal string.
(
The beginning of another group. (Group 2)
(
Another group (3)
\\"
The literal string \" - you have to escape the backslash because otherwise it will be interpreted as escaping the next character, and you never know what that'll do.
|
or...
[^"]
Any single character except a double quote. The brackets denote a character class (or set), a list of characters to match. Any given class matches exactly one character in the string. Using a caret (^) at the beginning of a class negates it, causing the matcher to match anything that's not contained in the class.
)
End of group 3...
*
The asterisk causes the previous regular expression (in this case, group 3) to be repeated zero or more times, in this case causing the matcher to match anything that could be inside the double quotes of a JSON string.
)"
The end of group 2, and a literal ".
I've done a few non-obvious things here, that may come in handy:
Group 2 - when dereferenced using Backreferences - will be the actual string assigned to the field. This is useful when getting the actual value.
The i at the end of the expression makes it case insensitive.
Group 1 contains the name of the captured field.
EDIT: So I see that the tags are an array. I'll update the regular expression here in a second when I've had a chance to think about it.
Your new Regex is:
/"(url|title|tags)":("(\\"|[^"])*"|\[("(\\"|[^"])*"(,"(\\"|[^"])*")*)?\])/i
All I've done here is alternate the string regular expression I had been using ("((\\"|[^"])*)"), with a regular expression for finding arrays (\[("(\\"|[^"])*"(,"(\\"|[^"])*")*)?\]). Not so easy to read, is it? Well, substituting the letter S for our string regex, we can rewrite it as:
\[(S(,S)*)?\]
Which matches a literal opening bracket (hence the backslashes), optionally followed by a comma separated list of strings, and a closing bracket. The only new concept I've introduced here is the question mark (?), which is itself a type of repetition. Commonly referred to as 'making the previous expression optional', it can also be thought of as exactly 0 or 1 matches.
With our same S Notation, here's the whole dirty Regular Expression:
/"(url|title|tags)":(S|\[(S(,S)*)?\])/i
If it helps, here's a view of it in action.
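As a rough check, the combined expression can be run from Python as well. This is only a sketch: the sample line is an abbreviated version of the one in the question (shortened title and tag list), and the names FIELD and fields are made up for illustration.

```python
import re

# The combined pattern from the answer: a quoted string OR an array of quoted strings.
FIELD = re.compile(
    r'"(url|title|tags)":("(\\"|[^"])*"|\[("(\\"|[^"])*"(,"(\\"|[^"])*")*)?\])',
    re.IGNORECASE,
)

# Abbreviated sample line, assumed to have no whitespace around the colons.
line = (
    '"url":"http://www.netcharles.com/orwell/essays.htm",'
    '"domain":"netcharles.com",'
    '"title":"Orwell Essays",'
    '"tags":["orwell","writing"],'
    '"index":2931'
)

# Group 1 is the field name, group 2 the raw value (quotes or brackets included).
fields = {m.group(1): m.group(2) for m in FIELD.finditer(line)}
for name, raw_value in fields.items():
    print(name, "=>", raw_value)
```

Group 2 keeps the surrounding quotes or brackets, so the values still need unquoting afterwards.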
This question is a bit old, but I browsed around on my PC and found this expression. I posted it as a Gist; it could be useful to others.
EDIT:
# Expression was tested with PHP and Ruby
# This regular expression finds a key-value pair in JSON formatted strings
# Match 1: Key
# Match 2: Value
# https://regex101.com/r/zR2vU9/4
# http://rubular.com/r/KpF3suIL10
(?:\"|\')(?<key>[^"]*)(?:\"|\')(?=:)(?:\:\s*)(?:\"|\')?(?<value>true|false|[0-9a-zA-Z\+\-\,\.\$]*)
# test document
[
{
"_id": "56af331efbeca6240c61b2ca",
"index": 120000,
"guid": "bedb2018-c017-429E-b520-696ea3666692",
"isActive": false,
"balance": "$2,202,350",
"object": {
"name": "am",
"lastname": "lang"
}
}
]
The JSON string you'd like to extract a field value from:
{"fid":"321","otherAttribute":"value"}
The following regular expression extracts exactly the "fid" field value "321":
(?<=\"fid\":\")[^\"]*
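A quick way to try the lookbehind is the Python sketch below. Python's re module only accepts fixed-width lookbehinds, which this one is. The variable names are made up for illustration.

```python
import re

text = '{"fid":"321","otherAttribute":"value"}'

# (?<="fid":") asserts that the match is preceded by "fid":" without consuming it,
# and [^"]* then takes everything up to the closing quote.
match = re.search(r'(?<="fid":")[^"]*', text)
fid = match.group(0) if match else None
print(fid)  # 321
```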
Please try below expression:
/"(url|title|tags)":("([^""]+)"|\[[^[]+])/gm
Explanation:
1st Capturing Group (url|title|tags): matches one of the literal strings 'url', 'title' or 'tags' (case sensitive).
2nd Capturing Group ("([^""]+)"|\[[^[]+]):
1st Alternative "([^""]+)" matches all the text between " and ", including the quotes themselves.
2nd Alternative \[[^[]+] matches all the text between [ and ], including the brackets themselves.
I have tested it here.
I adapted regex to work with JSON in my own library. I've detailed the algorithm's behavior below.
First, stringify the JSON object. Then, you need to store the starts and lengths of the matched substrings. For example:
"matched".search("ch") // yields 3
For a JSON string, this works exactly the same, unless you are searching explicitly for commas and curly brackets (think :, { and }), in which case I'd recommend transforming your JSON object before running the regex.
Next, you need to reconstruct the JSON object. The algorithm I authored does this by detecting JSON syntax by recursively going backwards from the match index. For instance, the pseudo code might look as follows:
find the next key preceding the match index, call this theKey
then find the number of all occurrences of this key preceding theKey, call this theNumber
using the number of occurrences of all keys with the same name as theKey up to the position of theKey, traverse the object until keys named theKey have been discovered theNumber times
return this object, called parentChain
With this information, it is possible to use regex to filter a JSON object to return the key, the value, and the parent object chain.
You can see the library and code I authored at http://json.spiritway.co/
If your JSON is
{"key1":"abc","key2":"xyz"}
then the regex below will extract key1 or key2, depending on the key you put in the regex:
"key2(.*?)(?=,|}|$)
You can verify it at regex101.com.
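Here is a small Python sketch to check what this pattern actually captures. The variable names are made up for illustration. Note that group 1 holds the raw remainder after the key name, including the closing quote of the key, the colon and the quoted value.

```python
import re

text = '{"key1":"abc","key2":"xyz"}'

# Lazy (.*?) grows until the lookahead sees a comma, a closing brace or the end.
match = re.search(r'"key2(.*?)(?=,|}|$)', text)
whole = match.group(0) if match else None
captured = match.group(1) if match else None
print(whole)     # "key2":"xyz"
print(captured)  # ":"xyz"
```

A value that itself contains a comma would be cut short by the lookahead, so this only suits simple values.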
Why does it have to be a Regular Expression object?
Here we can just use a Hash object first and then go search it.
mh = {"url":"http://www.netcharles.com/orwell/essays.htm","domain":"netcharles.com","title":"Orwell Essays & Journalism Section - Charles' George Orwell Links","tags":["orwell","writing","literature","journalism","essays","politics","essay","reference","language","toread"],"index":2931,"time_created":1345419323,"num_saves":24}
The output of which would be
=> {:url=>"http://www.netcharles.com/orwell/essays.htm", :domain=>"netcharles.com", :title=>"Orwell Essays & Journalism Section - Charles' George Orwell Links", :tags=>["orwell", "writing", "literature", "journalism", "essays", "politics", "essay", "reference", "language", "toread"], :index=>2931, :time_created=>1345419323, :num_saves=>24}
Not that I want to avoid using Regexp, but don't you think it would be easier to take it a step at a time until you're getting the data you want to search through further? Just MHO.
mh.values_at(:url, :title, :tags)
The output:
["http://www.netcharles.com/orwell/essays.htm", "Orwell Essays & Journalism Section - Charles' George Orwell Links", ["orwell", "writing", "literature", "journalism", "essays", "politics", "essay", "reference", "language", "toread"]]
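The same parse-first idea can be sketched in Python with the standard json module. This assumes each line of the file is a complete JSON object (the question's sample omits the surrounding braces); the sample line is abbreviated and the name pick_fields is made up for illustration.

```python
import json

# Abbreviated sample line; assumed to be one complete JSON object per line.
line = (
    '{"url":"http://www.netcharles.com/orwell/essays.htm",'
    '"domain":"netcharles.com",'
    '"title":"Orwell Essays",'
    '"tags":["orwell","writing"],'
    '"num_saves":24}'
)


def pick_fields(json_line, names=("url", "title", "tags")):
    """Parse one JSON line and keep only the wanted fields that are present."""
    record = json.loads(json_line)
    return {name: record[name] for name in names if name in record}


print(pick_fields(line))
print(pick_fields('{"url":"http://example.com"}'))  # fields that are missing are skipped
```

Parsing first avoids the escaping and nesting edge cases that a hand-written pattern has to handle.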
Taking the pattern that FrankieTheKneeman gave you:
pattern = /"(url|title|tags)":"((\\"|[^"])*)"/i
we can search the mh hash by converting it to a JSON string (to_json needs require 'json'):
/#{pattern}/.match(mh.to_json)
The output:
=> #<MatchData "\"url\":\"http://www.netcharles.com/orwell/essays.htm\"" 1:"url" 2:"http://www.netcharles.com/orwell/essays.htm" 3:"m">
Of course this is all done in Ruby, which is not one of your tags, but I hope it is relevant.
But oops! Looks like we can't get all three at once with that pattern, so I will do them one at a time for the sake of illustration.
pattern = /"(title)":"((\\"|[^"])*)"/i
/#{pattern}/.match(mh.to_json)
#<MatchData "\"title\":\"Orwell Essays & Journalism Section - Charles' George Orwell Links\"" 1:"title" 2:"Orwell Essays & Journalism Section - Charles' George Orwell Links" 3:"s">
pattern = /"(tags)":"((\\"|[^"])*)"/i
/#{pattern}/.match(mh.to_json)
=> nil
Sorry about that last one. It will have to be handled differently.