Python regex to get string in front of hyphen and plus signs - regex

I have the string shown in the code below. I want to get ['AA', 'BB', 'CC'] as the final result, but what I get is ['AA', 'BB']. Could you please give me a suggestion? Thank you.
import re
s = "AA-ZZ, BB+ZZ, CC"
a = re.findall(r'(\w+)[-|\\+\\]\w',s)

Use a lookahead to check whether the string is followed by +, - or the end of the string.
a = re.findall(r'(\w+)(?=[-+]|$)',s)
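To see why the original pattern misses CC, it may help to run both patterns side by side. This is a minimal sketch assuming Python 3 and the sample string from the question; the variable names `original` and `fixed` are only illustrative.

```python
import re

s = "AA-ZZ, BB+ZZ, CC"

# Original attempt: the token must be followed by one of - | \ + and then a
# word character, so the trailing "CC" (followed by nothing) never matches.
original = re.findall(r'(\w+)[-|\\+\\]\w', s)

# Lookahead version: the token must be followed by - or +, or sit at the end
# of the string. The lookahead checks this without consuming any characters.
fixed = re.findall(r'(\w+)(?=[-+]|$)', s)

print(original)  # ['AA', 'BB']
print(fixed)     # ['AA', 'BB', 'CC']
```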

Related

get number after matching string in ruby with regex

I'm trying to extract the tracking_number value from this string.
string = '{:rate_type=>"PAYOR_ACCOUNT_PACKAGE", :rated_weight_method=>"ACTUAL", tracking_number=>"795856589804"}, :group_number=>"0", :package_rating=>{:actual_rate_type=>"PAYOR_ACCOUNT_PACKAGE", :package_rate_details=>{:rate_type=>"PAYOR_ACCOUNT_PACKAGE", :rated_weight_method=>"ACTUAL", :minimum_charge_type=>"CUSTOMER_FREIGHT_WEIGHT", :billing_weight=>{:units=>"LB", :value=>"1.0"}}'
I have tried /tracking_number=>['"]((.*?)['"])*/ but I am getting the whole string after the match.
Can anybody help me with this?
I have tried this at https://rubular.com/r/ZcmJinTHDQSDsZ
The output I want is 795856589804
Remove the * from the end of your regex. It is the reason you are getting the whole string after the match.
If you want just the number part, use this regex:
/tracking_number=>"(\d+)"/
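The effect of the trailing * can be reproduced outside Ruby as well. The sketch below uses Python's `re` module rather than Ruby, and a shortened sample string in the same shape as the one in the question; both are assumptions made for illustration.

```python
import re

# Shortened sample in the same shape as the string from the question.
sample = 'tracking_number=>"795856589804"}, :group_number=>"0"'

# With the trailing *, the group repeats and keeps consuming up to each
# later quote, so the overall match runs to the last quote in the string.
greedy = re.search(r'''tracking_number=>['"]((.*?)['"])*''', sample)

# Matching only digits between the quotes captures just the number.
number = re.search(r'tracking_number=>"(\d+)"', sample).group(1)

print(greedy.group(0) == sample)  # True
print(number)                     # 795856589804
```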

Scala. Regexp can't remove symbol ^

I need to split a sentence into words, removing redundant characters.
I prepared a regexp for that:
val wordCharacters = """[^A-z'\d]""".r
Now I have a rule that can be used to handle the task in the following way:
wordCharacters.split(words)
.filterNot(_.isEmpty)
where words is any sentence I need to parse.
The issue is that when I try to handle "car: carpet, as,,, java: javascript!!&#$%^&" I get one extra word, ^. If I change my regex and remove the ^, I get many more issues in other cases...
Are there any ideas on how to solve it?
P.S.
If somebody wants to play with it, please try the link or the code below:
val wordCharacters = """[^A-z'\d]""".r
val stringToInt =
wordCharacters.split("car: carpet, as,,, java: javascript!!&#$%^&")
.filterNot(_.isEmpty)
.toList
println(stringToInt)
Expected result is:
List(car, carpet, as, java, javascript)
The part A-z is not exactly what you want. You probably assume that lowercase a comes immediately after uppercase Z, but there are six other characters in between ([ \ ] ^ _ `), and one of them is ^.
So, correcting the regex as
"""[^A-Za-z'\d]""".r
would fix the issue.
Have a look at the order of characters:
https://en.wikipedia.org/wiki/List_of_Unicode_characters
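The gap between Z and a can be checked directly. The following is a Python sketch rather than Scala, and the helper name `words` is hypothetical; it lists the characters that A-z picks up and compares the two character classes on the sentence from the question.

```python
import re

# Characters that sit between 'Z' and 'a' in ASCII, and so fall inside A-z.
between = [chr(code) for code in range(ord('Z') + 1, ord('a'))]
print(between)  # ['[', '\\', ']', '^', '_', '`']

text = "car: carpet, as,,, java: javascript!!&#$%^&"

def words(pattern: str) -> list:
    # Split on runs of characters matched by the pattern, drop empty pieces.
    return [w for w in re.split(pattern, text) if w]

buggy = words(r"[^A-z'\d]+")
fixed = words(r"[^A-Za-z'\d]+")
print(buggy)  # ['car', 'carpet', 'as', 'java', 'javascript', '^']
print(fixed)  # ['car', 'carpet', 'as', 'java', 'javascript']
```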
I'd be tempted to start with \W and expand from there.
"\\W+".r.split("car: carpet, as,,, java: javascript!!&#$%^&")
//res0: Array[String] = Array(car, carpet, as, java, javascript)
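One thing to keep in mind when starting from \W is that it treats the apostrophe as a separator and the underscore as a word character, which differs from the character class in the question. A small Python sketch (the sample text is made up for illustration):

```python
import re

text = "don't stop_here"

# \W splits on the apostrophe but keeps the underscore inside the word.
by_non_word = [w for w in re.split(r"\W+", text) if w]

# The explicit class keeps the apostrophe and splits on the underscore.
by_class = [w for w in re.split(r"[^A-Za-z'\d]+", text) if w]

print(by_non_word)  # ['don', 't', 'stop_here']
print(by_class)     # ["don't", 'stop', 'here']
```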

Regex with Replace String Python

I have a sentence with misplaced dots (.) to process:
sentence = 'Hi. Long time no see .how are you ?can you follow .#abcde?'
I am trying to normalize this sentence. As you can see, some parts are badly formatted (.how, ?can, and .#abcde). I am thinking of using a regex to handle this because the sentence keeps changing. This is my code so far:
import re
character = ['.','?','#']
sentence = 'Hi. Long time no see .how are you ?can you follow .#abcde?'
sentence = str(sentence)
for i in character:
    charac = str(i)
    charac_after = re.findall(r'\\'+charac+r'\S*', sentence)
    if charac_after:
        print("Exist")
        sentence = sentence.replace(charac, charac+' ')
print(sentence)
The result somehow skips the dot (.) and the (#) and only processes the question mark (?). This is the result:
Exist
Hi. Long time no see .how are you ? can you follow .#abcde?
It's supposed to be "Hi. Long time no see . how are you ? can you follow . # abcde?". I don't know whether the double backslash in r'\\'+charac+r'\S*' is wrong. Did I miss something?
How can I process all the characters? Please help.
Without any knowledge of Python, I think you need to do it like this
(as per the suggestion from @Sebastian Proske):
import re
character = ['.','?','#']
sentence = str('Hi. Long time no see .how are you ?can you follow .#abcde?')
# Insert a space after each listed character that is directly followed by a non-space.
sentence = re.sub(r'([' + ''.join(map(re.escape, character)) + r'])(?=\S)', r'\1 ', sentence)
print(sentence)
I am not sure about the code, only about the regex. See here:
https://regex101.com/r/HXdeuK/2
See the demo here: https://repl.it/Fw5b/3

How to add a space between two consecutive uppercase letters

I need to know how to insert a space between two consecutive uppercase letters.
I have a large list of customers with first name, middle name and last name. GaryACloud should be split as Gary A Cloud. I used (.)([A-Z]) and replaced it with \1 \2, but I have no clue what it means, so if anyone can explain it I will be really grateful. It gave me only a partial output: I got Gary ACloud. How can I insert a space before every uppercase letter? An explanation of the solution would also be very helpful.
You can match:
"([A-Z])(?=[A-Z])"
And replace with:
"\1 "
var input = "CategoryName";
var result = Regex.Replace(input, "([a-z])([A-Z])", @"$1 $2"); //Category Name
UPDATE (this will treat sequence of capital letters as one word)
var input = "SimpleHTTPRequest";
var result = Regex.Replace(input, "([a-z]|[A-Z]{2,})([A-Z])", @"$1 $2");
//Simple HTTP Request
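The partial output in the question comes from the pattern consuming both characters of each match. The sketch below, written in Python rather than C# and using illustrative variable names, compares the question's pattern, the uppercase-pair lookahead suggested above, and a variant that puts only the uppercase letter in a lookahead. The last one is an assumption about what the asker wants, not a pattern taken from the answers.

```python
import re

name = "GaryACloud"

# Both characters are consumed, so after matching "yA" the search resumes at
# "C" and the "A" can no longer act as the character before an uppercase letter.
consuming = re.sub(r'(.)([A-Z])', r'\1 \2', name)

# Only an uppercase letter followed by another uppercase letter gets a space.
upper_pairs = re.sub(r'([A-Z])(?=[A-Z])', r'\1 ', name)

# Any character followed by an uppercase letter gets a space; the lookahead
# leaves that uppercase letter available for the next match.
lookahead = re.sub(r'(.)(?=[A-Z])', r'\1 ', name)

print(consuming)    # Gary ACloud
print(upper_pairs)  # GaryA Cloud
print(lookahead)    # Gary A Cloud
```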

Regex: How to match a string that is not only numbers

Is it possible to write a regular expression that matches all strings that do not contain only numbers? If we have these strings:
abc
a4c
4bc
ab4
123
It should match the first four, but not the last one. I have tried fiddling around in RegexBuddy with lookaheads and such, but I can't seem to figure it out.
(?!^\d+$)^.+$
This uses a negative lookahead to reject lines that consist only of digits, and otherwise matches the entire line.
Unless I am missing something, I think the most concise regex is...
/\D/
...or in other words, is there a not-digit in the string?
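Both ideas can be checked against the five strings from the question. This is a minimal Python sketch; the list and variable names are illustrative only.

```python
import re

samples = ["abc", "a4c", "4bc", "ab4", "123"]

# Negative lookahead: refuse strings that are digits from start to end.
lookahead = re.compile(r'(?!^\d+$)^.+$')
# Simplest form: is there any non-digit character anywhere in the string?
non_digit = re.compile(r'\D')

by_lookahead = [s for s in samples if lookahead.match(s)]
by_non_digit = [s for s in samples if non_digit.search(s)]

print(by_lookahead)  # ['abc', 'a4c', '4bc', 'ab4']
print(by_non_digit)  # ['abc', 'a4c', '4bc', 'ab4']
```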
jjnguy had it correct (if slightly redundant) in an earlier revision.
.*?[^0-9].*
@Chad, your regex,
\b.*[a-zA-Z]+.*\b
should probably allow for non letters (eg, punctuation) even though Svish's examples didn't include one. Svish's primary requirement was: not all be digits.
\b.*[^0-9]+.*\b
Then, you don't need the + in there since all you need is to guarantee 1 non-digit is in there (more might be in there as covered by the .* on the ends).
\b.*[^0-9].*\b
Next, you can do away with the \b on either end since these are unnecessary constraints (invoking reference to alphanum and _).
.*[^0-9].*
Finally, note that this last regex shows that the problem can be solved with just the basics, those basics which have existed for decades (eg, no need for the look-ahead feature). In English, the question was logically equivalent to simply asking that 1 counter-example character be found within a string.
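The simplification steps above can be compared on a few inputs. The sketch below uses Python's `re.search`; the test strings "12!3" and "!" are added here as assumptions to show where the intermediate patterns differ, and the helper `matches` is hypothetical.

```python
import re

steps = [
    r'\b.*[a-zA-Z]+.*\b',  # requires a letter
    r'\b.*[^0-9]+.*\b',    # any non-digit will do
    r'\b.*[^0-9].*\b',     # the + is dropped
    r'.*[^0-9].*',         # the \b anchors are dropped too
]

def matches(pattern: str, text: str) -> bool:
    return re.search(pattern, text) is not None

# One row per input, one column per pattern in the order listed above.
results = {text: [matches(p, text) for p in steps]
           for text in ["a4c", "12!3", "!", "123"]}

for text, row in results.items():
    print(text, row)
# a4c  [True, True, True, True]
# 12!3 [False, True, True, True]
# !    [False, False, False, True]
# 123  [False, False, False, False]
```

The "!" row shows the effect of \b: a string with no word characters has no word boundary, so only the last pattern accepts it.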
We can test this regex in a browser by copying the following into the location bar, replacing the string "6576576i7567" with whatever you want to test.
javascript:alert(new String("6576576i7567").match(".*[^0-9].*"));
/^\d*[a-z][a-z\d]*$/
Or, case insensitive version:
/^\d*[a-z][a-z\d]*$/i
There may be digits at the beginning, then at least one letter, then letters or digits.
Try this:
/^.*\D+.*$/
It returns true if there is any symbol that is not a number. Works fine with all languages.
Since you said "match", not just validate, the following regex will match correctly
\b.*[a-zA-Z]+.*\b
Passing Tests:
abc
a4c
4bc
ab4
1b1
11b
b11
Failing Tests:
123
If you are trying to match words that have at least one letter but are formed by numbers and letters (or just letters), this is what I have used:
(\d*[a-zA-Z]+\d*)+
If we want to restrict valid characters so that string can be made from a limited set of characters, try this:
(?!^\d+$)^[a-zA-Z0-9_-]{3,}$
or
(?!^\d+$)^[\w-]{3,}$
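A quick check of the restricted-character variant, as a Python sketch with made-up candidate strings. Note that in Python 3 `\w` also accepts non-ASCII letters and digits, so the `[a-zA-Z0-9_-]` form is the stricter of the two there.

```python
import re

# At least three characters from the allowed set, but not digits only.
pattern = re.compile(r'(?!^\d+$)^[\w-]{3,}$')

candidates = ["abc", "a-1", "123", "ab", "a b"]
accepted = [c for c in candidates if pattern.match(c)]

print(accepted)  # ['abc', 'a-1']
```

Here "123" is rejected by the lookahead, "ab" is too short, and "a b" contains a character outside the allowed set.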
/\w+/:
Matches any letter, number or underscore. any word character
.*[^0-9]{1,}.*
Works fine for us.
We want to use the used answer, but it's not working within YANG model.
And the one I provided here is easy to understand and it's clear:
The start and end can be any characters, but there must be at least one non-numeric character.
I am using /^[0-9]*$/gm in my JavaScript code to check whether a string contains only digits. If it does, the check fails; otherwise the string is returned.
Below is a working code snippet with test cases:
function isValidURL(string) {
  var res = string.match(/^[0-9]*$/gm);
  if (res == null)
    return string;
  else
    return "fail";
};
var testCase1 = "abc";
console.log(isValidURL(testCase1)); // abc
var testCase2 = "a4c";
console.log(isValidURL(testCase2)); // a4c
var testCase3 = "4bc";
console.log(isValidURL(testCase3)); // 4bc
var testCase4 = "ab4";
console.log(isValidURL(testCase4)); // ab4
var testCase5 = "123"; // fail here
console.log(isValidURL(testCase5));
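The same invert-the-check idea can be sketched in Python. The function name `reject_digits_only` is hypothetical, and unlike the JavaScript version with the m flag, this sketch tests the whole string rather than each line. Because the pattern uses *, an empty string also counts as digits only and fails.

```python
import re

def reject_digits_only(text: str) -> str:
    # fullmatch succeeds only if the whole string is digits (or empty).
    if re.fullmatch(r'[0-9]*', text):
        return "fail"
    return text

results = [reject_digits_only(t) for t in ["abc", "a4c", "4bc", "ab4", "123", ""]]
print(results)  # ['abc', 'a4c', '4bc', 'ab4', 'fail', 'fail']
```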
I had to do something similar in MySQL, and the following, whilst oversimplified, seems to have worked for me:
where fieldname regexp '^[a-zA-Z0-9]+$'
and fieldname NOT REGEXP '^[0-9]+$'
This shows all fields that are alphabetical and alphanumeric but any fields that are just numeric are hidden. This seems to work.
example:
name1 - Displayed
name - Displayed
name2 - Displayed
name3 - Displayed
name4 - Displayed
n4ame - Displayed
324234234 - Not Displayed