Regular expression with Chinese characters - regex

I tried to use a regular expression to capture both the English and the Chinese company name into two groups.
However, I am stuck on a whitespace issue. Also, if the line doesn't contain a Chinese name, nothing is captured. Can anyone help me find the problem with my regular expression?
Link (https://regex101.com/r/VLwr7b/1/)

This will do:
^([^\p{Han}]*?)(?:\s+([\p{Han}].*))?$
Demo: https://regex101.com/r/VLwr7b/3
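For anyone trying the pattern above outside regex101, here is a minimal JavaScript sketch. It assumes a JavaScript engine with Unicode property escapes, where the Han script class is spelled \p{Script=Han} and the u flag is required. The function name splitCompanyName and the sample company name are made up for illustration.

```javascript
// Same idea as the answer's pattern, rewritten for JavaScript:
// group 1 = leading non-Han text, group 2 = optional part starting with a Han character.
const companyName = /^([^\p{Script=Han}]*?)(?:\s+(\p{Script=Han}.*))?$/u;

// Hypothetical helper: returns both parts of one line, or null if the line does not match.
function splitCompanyName(line) {
  const m = companyName.exec(line);
  if (m === null) return null;
  return { english: m[1], chinese: m[2] === undefined ? null : m[2] };
}

// Made-up sample data, not taken from the question.
console.log(splitCompanyName("Acme Trading Limited 艾克米貿易有限公司"));
console.log(splitCompanyName("Acme Trading Limited"));
```

Because the second group is optional, a line without a Chinese name still matches and simply reports no Chinese part.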

Hopefully this regex helps
https://regex101.com/r/VLwr7b/2
^([\s\S])*(.?[[:space:]])[(\[\p{Han}\/]?.+$ (new)
^([\s\S].*)[[:space:]]((?!\/)[\p{Han}\/].+)$ (old)
I think the underlying problem is that your old regex didn't allow for brackets, so it got stuck on those lines.

Related

Regex to get all characters AFTER first comma

I'm trying to find a regex pattern to extract the names that appear in a string after the first comma (David Peter Richard in the example below).
Example string:
PALMER, David Peter Richard
I came across this thread that successfully extracts the name before the comma, but I need all the names after the comma.
I've tried to modify ^.*?(?=,) without success. It needs to be a JavaScript regex, and capture groups are not supported on the platform I'm using (Bubble).
Any help appreciated, thanks a lot!
I tried this: (?<=,)[^,]+
Which seems to work on Desktop, however on a wrapped mobile app, it doesn't seem to work.
Similarly, for the name before the comma I was using ^[^,]+ and experiencing the same issue, but when I use the pattern ^.*?(?=,) it works fine.
So now I just need the pattern to be adjusted for the names after.
In JavaScript, I would suggest a simple string split:
var input = "PALMER, David Peter Richard";
var names = input.split(/,\s*/);
console.log(names);
Try this regex for the string after the comma:
(?<=,\s)(.*)$
Hope this helps.
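Both lookbehind patterns above depend on (?<=...), which some older JavaScript engines do not support. That may be why the pattern fails in the wrapped mobile app, though this is only a guess. A sketch of an alternative that uses neither lookbehind nor capture groups follows. The helper name givenNames is hypothetical, and the pattern assumes the string contains a single comma, since it really matches the text after the last comma.

```javascript
// Match from the first non-space, non-comma character up to the end of the string,
// provided no further comma follows. Only the whole match is used, no capture groups.
const afterComma = /[^,\s][^,]*$/;

// Hypothetical helper: returns the text after the comma, or "" if nothing matches.
function givenNames(fullName) {
  const m = fullName.match(afterComma);
  return m ? m[0] : "";
}

console.log(givenNames("PALMER, David Peter Richard"));
```

Note that a string without any comma matches in full, so check for a comma first if that case matters.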

Regular expression missing a match - C++

I'm using a regular expression to find character entries (e.g. '[any single character]' or '\[any single character]') and I noticed my current regex is missing '\''. Can anyone help me understand why and how to fix it? My current regex is ('.'|'\\.')
I'm writing my program using C++, in case that matters to anyone.
Thanks.
Answer: ('\\.'|'.')
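The search wasn't working because the first alternative, '.', matched first: on the input '\'' it consumes the opening quote, the backslash and the next quote, so the second alternative is never tried. Switching the order makes the regex try '\\.' first.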
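The effect of the alternation order can be reproduced in a few lines. This sketch uses JavaScript rather than C++ purely for illustration, and assumes the two engines treat this simple alternation the same way (leftmost alternative that succeeds wins). The variable names are made up.

```javascript
// The source text is the four characters: quote, backslash, quote, quote.
const source = String.raw`'\''`;

const original = /('.'|'\\.')/;   // plain character literal tried first
const reordered = /('\\.'|'.')/;  // escaped character literal tried first

// With the original order, '.' succeeds on the first three characters.
const first = source.match(original)[0];
// With the escaped alternative first, all four characters are matched.
const second = source.match(reordered)[0];

console.log(first.length, second.length);
```

In C++ source code the backslashes in the pattern would additionally need to be doubled or written in a raw string literal.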

Matching Any Word Regex

I would like to remove hundreds of onmouseover events from my code. The events all pass different variables, and I want to be able to use Dreamweaver to find and replace all the strings with nothing.
Here is an example
onmouseover="parent.mv_mapTipOver(evt,'Wilson');"
onmouseover="parent.mv_mapTipOver(evt,'Harris');"
onmouseover="parent.mv_mapTipOver(evt,'Walker');"
I want to run a search that will identify all of these and replace/remove them.
I have tried seemingly infinite permutations of things like:
onmouseover="parent.mv_mapTipOver(evt,'[^']');"
or
onmouseover="parent.mv_mapTipOver(evt,'[^']);"
or
onmouseover="parent.mv_mapTipOver(evt,[^']);"
or
onmouseover="parent.mv_mapTipOver(evt,'[^']+');"
And many more. I cannot find the regular expression that will work.
Any/all help would be appreciated.
Thanks a ton!
"." and "(" have special meaning in regular expressions, so you need to escape them:
onmouseover="parent\.mv_mapTipOver\(evt,'[^']+'\);"
I'm not sure if this is correct Dreamweaver regex syntax, but this stuff is standard enough.
Try this one:
onmouseover="parent\.mv_mapTipOver\(evt,'.+?'\);"
When using regular expressions you have to be very careful about how you handle whitespace. For example, the following piece of code will not be caught by most of the regular expressions mentioned so far because of the space after the equals sign and the comma, even though it is most likely valid syntax in the language you are using.
onmouseover= "parent.mv_mapTipOver(evt, 'Walker');"
To create a regexp that ignores whitespace, you must insert \s* everywhere in the regexp that whitespace might occur.
The following regexp should work even if there is additional whitespace in your code.
onmouseover\s*=\s*"parent\.mv_mapTipOver\(\s*evt\s*,\s*'[A-Za-z]+'\s*\);"
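To check a whitespace-tolerant pattern like this before running it across a whole site, it can be tried on a couple of sample lines. The sketch below is JavaScript rather than Dreamweaver's find-and-replace, and makes two assumptions of its own: it uses [^']+ for the name so that names with non-letters also match, and it adds a leading \s* so the space in front of the attribute is removed too. The area tags are invented sample markup.

```javascript
// Invented sample markup: one attribute without extra spaces, one with.
const html = [
  `<area onmouseover="parent.mv_mapTipOver(evt,'Wilson');" href="#">`,
  `<area onmouseover= "parent.mv_mapTipOver(evt, 'Walker');" href="#">`,
].join("\n");

// \s* wherever whitespace may legally appear; . and ( ) are escaped.
const handler = /\s*onmouseover\s*=\s*"parent\.mv_mapTipOver\(\s*evt\s*,\s*'[^']+'\s*\);"/g;

// Replace every match with nothing.
const cleaned = html.replace(handler, "");

console.log(cleaned);
```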

Regular expression to amend Sysprep.inf file

I currently have a requirement to parse a sysprep.inf file and insert a value input by the end user.
I'm coding this utility using AutoIT and my regular expression is slightly out.
The line I need amending is as follows:
ComputerName=%DeviceName%
DeviceName is a variable injected by LANDesk. If the device has previously been in the LANDesk database, the name is injected into the file. If not, the variable name remains. The device name must go after the =.
Here is a snippet of my current code:
$FileContents = StringRegExpReplace($FileContents,'ComputerName=[a-z]','ComputerName='& $deviceNameInput)
Thanks for any guidance anyone can offer.
I'm not familiar with AutoIT or BASIC... but it looks like you need to be using something like this:
$FileContents = StringRegExpReplace($FileContents,'.*ComputerName=(\%[a-zA-Z]*\%).*', $deviceNameInput)
OR
$FileContents = StringRegExpReplace($FileContents,'ComputerName=\%[a-zA-Z]*\%', 'ComputerName='&$deviceNameInput)
This will only replace a placeholder made up of the letters a-z or A-Z between the % signs, not one that contains digits or spaces.
Writing regular expressions can be tough because there are so many dialects of regular expressions. Assuming you are using a regex library that supports a Perl-like dialect you might want to try this for your regex:
^\s*ComputerName\s*=\s*(?:%DeviceName%|[a-zA-Z0-9_-]+)
Basically, this regex will match a line containing either the literal string ComputerName=%DeviceName% or ComputerName=<some actual device name that only contains the characters a-z, A-Z, 0-9, _, and ->. The regex is also a bit lenient in that it will match a line that contains whitespace at the beginning of the line as well as before and/or after the equals sign.
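The behaviour of that pattern can be tried out in any Perl-like engine. The sketch below uses JavaScript rather than AutoIt and changes two things, both of which are my assumptions: \s is narrowed to [ \t] so the match cannot swallow a preceding blank line, and the m flag makes ^ match at each line start. The function name setComputerName and the sample file contents are hypothetical.

```javascript
// Matches either the untouched placeholder or a previously injected name.
// Group 1 keeps any indentation in front of the key.
const computerNameLine =
  /^([ \t]*)ComputerName[ \t]*=[ \t]*(?:%DeviceName%|[a-zA-Z0-9_-]+)/m;

// Hypothetical helper: returns the file contents with the ComputerName value replaced.
function setComputerName(contents, name) {
  // A replacer function is used so that a "$" in the name is not treated specially.
  return contents.replace(computerNameLine, (match, indent) => `${indent}ComputerName=${name}`);
}

// Invented sample contents for illustration.
const sample = "[UserData]\nComputerName=%DeviceName%\nOrgName=Example";
console.log(setComputerName(sample, "PC-042"));
```

The same call also rewrites a line such as "ComputerName = OLDNAME", since spaces around the equals sign are allowed by the pattern.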
AutoIt has a great way of dealing with INI files: IniWrite
IniWrite("SysPrep.ini", "write_section_here", "ComputerName", $deviceNameInput)
creates or updates SysPrep.ini with:
[write_section_here]
ComputerName=localhost

Regex Replacing characters with zero

I have the following string 3}HFB}4AF4}1 -M}1.
I have searched for this string using the regex :
([0-9])(\})([A-Z]{3})(\})([0-9][A-Z]{2}[0-9])(\})([0-9])(\s\-)([A-Z])(\})([0-9]).
I want to replace the } with 0. The result I am looking for is 30HFB04AF401-M01. Any assistance is appreciated. The tool I am using is RegexBuddy.
A possible solution
Problem solved? In JavaScript at least :-)
"3}HFB}4AF4}1 -M}1".replace(/\}/g, "0");
// "30HFB04AF401 -M01"
I'm missing the point, right?
Assuming the language is JavaScript, we can write something like
"dfghj456783}HFB}4AF4}1 -M}1fghjkl8765".replace(/(?:[\d\w\s]+)([0-9]}[A-Z]{3}}[0-9][A-Z]{2}[0-9]}[0-9] -[A-Z]}[0-9])(?:[\d\w\s]+)/g, function () {
    return arguments[1].replace(/}/g, "0");
});
What's possible in other languages though may be a different story.
Try the home of RegexBuddy for details.
So you've already got an expression to find instances of the string. Now you can either use groups to replace the characters, or you can use a separate regular expression over the string you found, simply replacing the } character within group(0) (which is the entire matched part of the input). I would certainly prefer the latter.
Fred seems to have created the replacement method for you already, so I won't repeat it here.
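For reference, the two-step approach described above (find the whole match, then replace inside it) might look like this in JavaScript. This is only a sketch: the function name normalizeCode and the surrounding sample text are made up, and the extra step that drops the space before the hyphen is an assumption based on the result the question asks for.

```javascript
// Step 1: locate the code using the structure from the question, without groups.
const code = /[0-9]\}[A-Z]{3}\}[0-9][A-Z]{2}[0-9]\}[0-9]\s-[A-Z]\}[0-9]/g;

// Step 2: inside each match only, turn every } into 0 and drop the space before "-".
function normalizeCode(text) {
  return text.replace(code, (match) =>
    match.replace(/\}/g, "0").replace(/\s-/, "-")
  );
}

// Braces outside the matched code are left alone.
console.log(normalizeCode("id 3}HFB}4AF4}1 -M}1 and {other} text"));
```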
I have managed to find a solution to the formatting in the JGSoft flavor used by RegexBuddy. Thanks to all who provided suggestions that helped channel my thoughts in the right direction.
Solution (I am still a beginner with regex, so the syntax might not be efficient, but it does the job!)
I used named groups instead of numbered groups with backreference and $ syntax.
To replace } with 0 in the string 3}HFB}4AF4}1 -M}1 or any similar string, I used the following search and replacement syntax:
Search : (?<Gp1>([0-9]))(?:})(?<Gp2>([A-Z]){3})(?:})(?<Gp3>([0-9])([A-Z]{2})([0-9]))(?:})(?<Gp4>([0-9]))(?:\s-)(?<Gp5>([A-Z]))(?:})(?<Gp6>[0-9])
Replace : ${Gp1}0${Gp2}0${Gp3}0${Gp4}-${Gp5}0${Gp6}
Result : 30HFB04AF401-M01
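The same named-group idea carries over to other flavors with small syntax changes. As a sketch, here is a JavaScript version. It assumes an engine with named capture groups, where the replacement string refers to a group as $<name> instead of ${name}, and it drops the nested unnamed groups, which are not needed for the replacement.

```javascript
// Named groups hold the parts to keep; the } characters and the space sit between them.
const search =
  /(?<Gp1>[0-9])\}(?<Gp2>[A-Z]{3})\}(?<Gp3>[0-9][A-Z]{2}[0-9])\}(?<Gp4>[0-9])\s-(?<Gp5>[A-Z])\}(?<Gp6>[0-9])/;

// Rebuild the string from the groups, writing 0 where each } was.
const replacement = "$<Gp1>0$<Gp2>0$<Gp3>0$<Gp4>-$<Gp5>0$<Gp6>";

const result = "3}HFB}4AF4}1 -M}1".replace(search, replacement);
console.log(result);
```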