undefined method `match?' for true:TrueClass (NoMethodError) - regex

I am trying to return a list/array of values from a range of (100..1000) that match the following criteria:
3 digit value
All the digits in each value are unique.
$global_range = Array (100..999)
$fun = []
def listOfFunPossibilities
  # FUN values should meet the criteria below:
  # 1. 3 digit value
  # 2. All are unique
  $global_range.each do |i|
    if (!(/([0-9]).*?\1/)).match?(i)
      $fun.push(i)
    end
  end
  return $fun
end
listOfFunPossibilities()

You apply the negation ! too early:
if (!(/([0-9]).*?\1/)).match?(i)
This negates the regex first. A bare regex literal under ! appears to be evaluated as a match against $_ (nil here), so the negation yields true, and you then try to call match? on that true value.
Use instead:
if !(/([0-9]).*?\1/.match?(i))
or even
if !/([0-9]).*?\1/.match?(i)
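A minimal sketch of the corrected check, assuming each value is converted with to_s first (Regexp#match? expects a string, so passing the Integer itself would raise a TypeError). The method name fun_values and the use of reject instead of a global array are illustrative choices, not part of the original code.

```ruby
# Hypothetical helper: returns the 3-digit numbers whose digits are all distinct.
def fun_values(range = 100..999)
  repeated_digit = /([0-9]).*?\1/ # matches when some digit occurs twice
  range.reject { |i| repeated_digit.match?(i.to_s) }
end

values = fun_values
p values.first(5) # => [102, 103, 104, 105, 106]
p values.size     # => 648
```

The negation disappears entirely here because reject already drops every value for which the block returns true.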

Shorten Code Logic For String Manipulation

Examples
"123456" would be ["123", "456"].
"1234567891011" would be ["123", "456", "789", "10", "11"].
I have come up with this logic using regex to solve the challenge but I am being asked if there is a way to make the logic shorter.
def ft(str)
end
The result from the scan contains a lot of whitespace, so after the join operation I am left with double or triple dashes; I used .gsub(/-+/, '-') to fix that. I also noticed that sometimes there is a dash at the beginning or the end of the string, so I used .gsub(/^-|-$/, '') to fix that too.
Any Ideas?
Slice the string into chunks of at most 3 digits: s.scan(/.{1,3}/)
Check if the last chunk has only 1 character. If so, take the last char of the chunk before and prepend it to the last.
Glue the chunks together using join(" ")
Inspired by @steenslag's recommendation. (There are quite a few other ways to achieve the same with varying levels of verbosity and esotericism.)
Here is how I would go about it:
def format_str(str)
  numbers = str.delete("^0-9").scan(/.{1,3}/)
  # there are a number of ways this operation can be performed
  numbers.concat(numbers.pop(2).join.scan(/../)) if numbers.last.length == 1
  numbers.join('-')
end
Breakdown:
numbers = str.delete("^0-9") - delete any non numeric character from the string
.scan(/.{1,3}/) - scan them into groups of 1 to 3 characters
numbers.concat(numbers.pop(2).join.scan(/../)) if numbers.last.length == 1 - If the last element has a length of 1 then remove the last 2 elements join them and then scan them into groups of 2 and add these groups back to the Array
numbers.join('-') - join the numbers with a hyphen to return a formatted String
Example:
require 'securerandom'
10.times do
s = SecureRandom.hex
puts "Original: #{s} => Formatted: #{format_str(s)}"
end
# Original: fd1bbce41b1c784ce6ad5303d868bbe9 => Formatted: 141-178-465-303-86-89
# Original: af04bd4d4d6beb5a0412a692d5d3d42d => Formatted: 044-465-041-269-253-42
# Original: 9a1833a43cbef51c3f3c21baa66fe996 => Formatted: 918-334-351-332-166-996
# Original: 4104ae13c998cec896997b9919bdafb3 => Formatted: 410-413-998-896-997-991-93
# Original: 0eb49065472240ba32b3c029f897b30d => Formatted: 049-065-472-240-323-029-897-30
# Original: 4c68f9f68e8f6132c0ed5b966d639cf4 => Formatted: 468-968-861-320-596-663-94
# Original: 65987ee04aea8fb533dbe38c0fea7d63 => Formatted: 659-870-485-333-807-63
# Original: aa8aaf1cf59b52c9ad7db6d4b1ae0cbb => Formatted: 815-952-976-410
# Original: 8eb6b457059f91fd06ccbac272db8f4e => Formatted: 864-570-599-106-272-84
# Original: 1c65825ed59dcdc6ec18af969938ea57 => Formatted: 165-825-596-189-699-38-57
That being said, to modify your existing code, this will work as well:
def format_str(str)
  str
    .delete("^0-9")
    .scan(/(?=\d{5})\d{3}|(?=\d{3}$)\d{3}|\d{2}/)
    .join('-')
end
Here are three more ways to do that.
Use String#scan with a regular expression
def fmt(str)
  str.delete("^0-9").scan(/\d{2,3}(?!\d\z)/)
end
The regular expression reads, "match two or three digits provided they are not followed by a single digit at the end of the string". (?!\d\z) is a negative lookahead (which is not part of the match). As matches are greedy by default, the regex engine will always match three digits if possible.
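To see what the lookahead contributes, here is a small sketch comparing the pattern with and without it. The constant names and the sample strings are made up for illustration.

```ruby
# Pattern from the answer: 2 or 3 digits, not followed by a lone final digit.
WITH_LOOKAHEAD = /\d{2,3}(?!\d\z)/
# Hypothetical comparison pattern without the lookahead.
WITHOUT_LOOKAHEAD = /\d{2,3}/

digits = "1234567"
p digits.scan(WITHOUT_LOOKAHEAD) # => ["123", "456"] (the trailing "7" is lost)
p digits.scan(WITH_LOOKAHEAD)    # => ["123", "45", "67"]
p "123456".scan(WITH_LOOKAHEAD)  # => ["123", "456"]
```

In the second call the engine first tries "456", sees that it would leave a single "7" at the end, and backtracks to "45".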
Solve by recursion
def fmt(str)
  recurse(str.delete("^0-9"))
end
def recurse(s)
  case s.size
  when 2,3
    [s]
  when 4
    [s[0,2], s[2,2]]
  else
    # the string is already digits-only here, so recurse directly
    [s[0,3], *recurse(s[3..])]
  end
end
Determine the last two matches from the size of the string
def fmt(str)
  s = str.delete("^0-9")
  if s.size % 3 == 1
    s[0..-5].scan(/\d{3}/) << s[-4,2] << s[-2,2]
  else
    s.scan(/\d{2,3}/)
  end
end
All methods exhibit the following behaviour.
["5551231234", "18883319", "123456", "1234567891011"].each do |str|
puts "#{str}: #{fmt(str)}"
end
5551231234: ["555", "123", "12", "34"]
18883319: ["188", "833", "19"]
123456: ["123", "456"]
1234567891011: ["123", "456", "789", "10", "11"]
An approach:
def foo(s)
  s.gsub(/\D/, '').scan(/\d{1,3}/).each_with_object([]) do |x, arr|
    if x.size == 3 || arr == []
      arr << x
    else
      y = arr.last
      arr[-1] = y[0...-1]
      arr << "#{y[-1]}#{x}"
    end
  end
end
Remove all non-digit characters, then scan for chunks of 1 to 3 digits. Iterate over them. If it's the first time through or the chunk is three digits, add it to the array. Otherwise, take the last digit from the previous chunk, prepend it to the current chunk, and add that to the array.
Alternatively, without generating a new array.
def foo(s)
  s.gsub(/\D/, '').scan(/\d{1,3}/).instance_eval do |y|
    y.each_with_index do |x, i|
      if x.size == 1
        y[i] = "#{y[i-1][-1]}#{x}"
        y[i-1] = y[i-1][0...-1]
      end
    end
  end
end
Without changing your code too much and without adjusting your actual regex, I might suggest replacing scan with split in order to avoid all the extra nil values; replacing gsub with tr which is much faster; and then using reject(&:empty?) to loop through and remove any blank array elements before joining with whatever character you want:
string = "12345fg\n\t\t\t 67"
string.tr("^0-9", "")
.split(/(?=\d{5})(\d{3})|(?=\d{3}$)(\d{3})|(\d{1,2})/)
.reject(&:empty?)
.join("-")
#=> 123-45-67
Not suggesting this is the best approach, but wanted to offer a little food for thought:
You can basically reduce the logic for your challenge to test for 1 single condition and to use 2 very simple pattern matches:
Condition to test for: Number of characters is more than 3 and has a modulo(3) of 1. This condition will require the use of both pattern matches.
All other conditions will use a single pattern match so no reason to test for those.
This could probably be made a little less verbose but it’s all spelled out pretty well for clarity:
def format(s)
  n = s.delete("^0-9")
  regex_1 = /.{1,3}/
  regex_2 = /../
  if [n.length-3, 0].max.modulo(3) == 1
    a = n[0..-5].scan(regex_1) + n[-4..-1].scan(regex_2)
  else
    a = n.scan(regex_1)
  end
  a.join("-")
end

How to find any non-digit characters using RegEx in ABAP

I need a Regular Expression to check whether a value contains any other characters than digits between 0 and 9.
I also want to check the length of the value.
The RegEx I've made: ^([0-9]\d{6})$
My test values are: 123Z45 and 123456
The ABAP code:
FIND ALL OCCURRENCES OF REGEX '^([0-9]\d{6})$' IN L_VALUE RESULTS DATA(LT_RESULTS).
I'm expecting a result in LT_RESULTS when I test the first value '123Z45', because it contains a non-digit character.
But LT_RESULTS is empty in nearly every test case.
Your expression ^([0-9]\d{6})$ translates to:
^ - start of input
( - begin capture group
[0-9] - a character between 0 and 9
\d{6} - six digits (digit = character between 0 and 9)
) - end capture group
$ - end of input
So it will only match 1234567 (7 digit strings), not 123456, or 123Z45.
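The breakdown above can be checked quickly outside ABAP. The following is a sketch in Ruby rather than ABAP, purely to illustrate how the pattern behaves; \A and \z stand in for ^ and $, and the six-digit variant is an assumption about what the question is after, not something taken from the original post.

```ruby
# Pattern from the question, with Ruby's string anchors in place of ^ and $.
seven_digits = /\A([0-9]\d{6})\z/

%w[123Z45 123456 1234567].each do |value|
  puts "#{value}: #{seven_digits.match?(value)}"
end
# 123Z45: false
# 123456: false
# 1234567: true

# Hypothetical variant for the stated goal: exactly six characters, digits only.
six_digits = /\A\d{6}\z/
puts six_digits.match?("123456") # true
puts six_digits.match?("123Z45") # false
```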
If you just need to find a string that contains non digits you could use the following instead: ^\d*[^\d]+\d*$
* - previous element may occur zero, one or more times
[^\d] - ^ right after [ means "NOT", i.e. any character which is not a digit
+ - previous element may occur one or more times
Example:
const expression = /^\d*[^\d]+\d*$/;
const inputs = ['123Z45', '123456', 'abc', 'a21345', '1234f', '142345'];
console.log(inputs.filter(i => expression.test(i)));
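One caveat worth checking: the pattern above only matches when all non-digit characters form a single contiguous run. The sketch below (Ruby, for illustration only) compares it with a plain \D test, which is an assumed simpler alternative and not part of the original answer; the variable names and the third sample value are made up.

```ruby
# Pattern from the answer, with Ruby's string anchors in place of ^ and $.
single_run = /\A\d*[^\d]+\d*\z/
# Assumed alternative: any one non-digit character anywhere in the string.
any_non_digit = /\D/

%w[123Z45 123456 1a2b3].each do |value|
  puts "#{value}: single_run=#{single_run.match?(value)} any_non_digit=#{any_non_digit.match?(value)}"
end
# 123Z45: single_run=true any_non_digit=true
# 123456: single_run=false any_non_digit=false
# 1a2b3: single_run=false any_non_digit=true
```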
You can also use this character class if you want to extract a non-digit (alphabetic) group:
DATA(l_guid) = '0074162D8EAA549794A4EF38D9553990680B89A1'.
DATA(regx) = '[[:alpha:]]+'.
DATA(substr) = match( val   = l_guid
                      regex = regx
                      occ   = 1 ).
It finds the first group of alphabetic characters in the string and returns it.
If you just want to check whether such characters exist, or how many matches there are in your string, the built-in count function is your friend:
DATA(how_many) = count( val = l_guid regex = regx ).
DATA(yes) = boolc( count( val = l_guid regex = regx ) > 0 ).
Match and count exist since ABAP 7.50.
If you don't need a Regular Expression for something more complex, ABAP has some nice comparison operators CO (Contains Only), CA, NA etc for you. Something like:
IF L_VALUE CO '0123456789' AND STRLEN( L_VALUE ) = 6.

Is there a pythonic way to count the number of leading matching characters in two strings?

For two given strings, is there a pythonic way to count how many consecutive characters of both strings (starting at position 0 of the strings) are identical?
For example in aaa_Hello and aa_World the "leading matching characters" are aa, having a length of 2. In another and example there are no leading matching characters, which would give a length of 0.
I have written a function to achieve this, which uses a for loop and thus seems very unpythonic to me:
def matchlen(string0, string1): # Note: does not work if a string is ''
    for counter in range(min(len(string0), len(string1))):
        # run until there is a mismatch between the characters in the strings
        if string0[counter] != string1[counter]:
            # in this case the function terminates
            return(counter)
    return(counter+1)
matchlen(string0='aaa_Hello', string1='aa_World') # returns 2
matchlen(string0='another', string1='example') # returns 0
You could use zip and enumerate:
def matchlen(str1, str2):
    i = -1 # needed if you don't enter the loop (an empty string)
    for i, (char1, char2) in enumerate(zip(str1, str2)):
        if char1 != char2:
            return i
    return i+1
An unexpected function in os.path, commonprefix, can help (because it is not limited to file paths, any strings work). It can also take in more than 2 input strings.
Return the longest path prefix (taken character-by-character) that is a prefix of all paths in list. If list is empty, return the empty string ('').
from os.path import commonprefix
print(len(commonprefix(["aaa_Hello","aa_World"])))
output:
2
from itertools import takewhile
common_prefix_length = sum(
    1 for _ in takewhile(lambda x: x[0]==x[1], zip(string0, string1)))
zip will pair up letters from the two strings; takewhile will yield them as long as they're equal; and sum will see how many there are.
As bobble bubble says, this indeed does exactly the same thing as your loopy thing. Its sole pro (and also its sole con) is that it is a one-liner. Take it as you will.

Extract substring based on regex to use in RDD.filter

I am trying to filter out rows of a text file whose second column value begins with words from a list.
I have the list such as:
val mylist = ["Inter", "Intra"]
If I have a row like:
Cricket Inter-house
Inter is in the list, so that row should get filtered out by the RDD.filter operation. I am using the following regex:
`[A-Za-z0-9]+`
I tried using """[A-Za-z0-9]+""".r to extract the substring but the result is in a non empty iterator.
My question is how to access the above result in the filter operation?
You need to construct a regular expression like ".* Inter.*".r, since """[A-Za-z0-9]+""" matches any word. Here is a working example; hope it helps:
val mylist = List("Inter", "Intra")
val textRdd = sc.parallelize(List("Cricket Inter-house", "Cricket Int-house",
"AAA BBB", "Cricket Intra-house"))
// map over my list to dynamically construct regular expressions and check if it is within
// the text and use reduce to make sure none of the pattern exists in the text, you have to
// call collect() to see the result or take(5) if you just want to see the first five results.
(textRdd.filter(text => mylist.map(word => !(".* " + word + ".*").r
.pattern.matcher(text).matches).reduce(_&&_)).collect())
// res1: Array[String] = Array(Cricket Int-house, AAA BBB)
filter keeps only the elements for which the function passed to it returns true, so the predicate has to be negated to drop matching rows. A regex isn't strictly needed for this. Instead, let's develop a function that takes a row and a candidate string and returns true if the second column in that row starts with the candidate:
val filterFunction: (String, String) => Boolean =
(row, candidate) => row.split(" ").tail.head.startsWith(candidate)
We can convince ourselves that this works pretty easily using a worksheet:
// Test data
val mylist = List("Inter", "Intra")
val file = List("Cricket Inter-house", "Boom Shakalaka")
filterFunction("Cricket Inter-house", "Inter") // true
filterFunction("Cricket Inter-house", "Intra") // false
filterFunction("Boom Shakalaka", "Inter") // false
filterFunction("Boom Shakalaka", "Intra") // false
Now all that remains is to utilize this function in the filter. Essentially, for every row, we want to test the filter against every line in our candidate list. That means taking the candidate list and 'folding left' to check every item on it against the function. If any candidate reports true, then we know that row should be filtered out of the final result:
val result = file.filter((row: String) => {
!mylist.foldLeft(false)((x: Boolean, candidate: String) => {
x || filterFunction(row, candidate)
})
})
// result: List[String] = List(Boom Shakalaka)
The above can be a little dense to unpack. We are passing to the filter method a function that takes in a row and produces a boolean value. We want that value to be true if and only if the row does not match our criteria. We've already embedded our criteria in the filterFunction: we just need to run it against every combination of item in mylist.
To do this we use foldLeft, which takes a starting value (in this case false) and iteratively moves through the list, updating that starting value and returning the final result.
To 'update' that value we write a function that logically-ORs the starting value with the result of running our filter function against the row and the current item in mylist.

How to check sequence in string using regular expression

I want to check that an input string is in the correct format. The expression ^[\d-.]+$ only checks for the existence of digits, . (dot) and - (minus), but I also want to check their sequence.
Suppose I want to use a calculator with . and - only. How do I get a regular expression which satisfies all the conditions below?
Regex.IsMatch(input, @"^[\d-\.]+$")
//this expression works for below conditions only
if string v1="10-20-30"; // should return true
if string v1="10-20"; // should return true
if string v1="10.20"; // should return true
if string v1="10R20"; // should return false
if string v1="10#20"; // should return false
if string v1="10-20.30.40-50"; // should return true
if string v1="10"; // should return true
//above expression not works for below conditions
if string v1="10--20.30"; // should return false
if string v1="10-20-30.."; // should return false
if string v1="--10-20.30"; // should return false
if string v1="-10-20.30"; // should return false
if string v1="10-20.30."; // should return false
So something like
var pattern = @"^(\d+(-|\.))*\d+$";
should do the job for you.
What this regex "is saying" is:
Find one or more digits (\d+)
Followed by a minus sign or dot (-|\.) - the dot needs to be escaped with \
This could occur 0 or more times in the string - the star sign at the end: (\d+(-|\.))*
And then another one or more digits (\d+).
All this should be right after the beginning of the string and right before the end (the ^ and $ I believe you know about).
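As a quick sanity check, the pattern can be run against the cases listed in the question. This sketch uses Ruby rather than C#, so \A and \z replace ^ and $ (in Ruby those two match at line boundaries); the constant and variable names are made up for illustration.

```ruby
# Same pattern as above, with string anchors instead of ^ and $.
SEQUENCE = /\A(\d+(-|\.))*\d+\z/

valid   = %w[10-20-30 10-20 10.20 10-20.30.40-50 10]
invalid = ['10R20', '10#20', '10--20.30', '10-20-30..', '--10-20.30', '-10-20.30', '10-20.30.']

valid.each   { |s| puts "#{s}: #{SEQUENCE.match?(s)}" } # each line prints true
invalid.each { |s| puts "#{s}: #{SEQUENCE.match?(s)}" } # each line prints false
```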
Note: If the numbers may be negative, you will need to add an optional minus sign before both \d+ instances in the regex:
var pattern = @"^(-?\d+(-|\.))*-?\d+$";
Regards