Convert duration in seconds to hours and minutes and skip empty values - regex

We have a Time Tracking model with a duration (in seconds) value that is always greater than 60 (1 minute).
I need to convert the duration to hours and minutes if they are not zero and without zero at the start of hours or minutes.
For example:
duration1 = 63000 # expected value: 17 h 30 m
duration2 = 28800 # expected value: 8 h
duration3 = 1800 # expected value: 30 m
duration4 = 300 # expected value: 5 m
I almost have it, but there is a small problem with zero values:
Time.at(duration1).utc.strftime('%H h %M m').sub!(/^0/, '')
# 17 h 30 m
Time.at(duration2).utc.strftime('%H h %M m').sub!(/^0/, '')
# 8 h 00 m
Time.at(duration3).utc.strftime('%H h %M m').sub!(/^0/, '')
# 0 h 30 m
Time.at(duration4).utc.strftime('%H h %M m').sub!(/^0/, '')
# 0 h 05 m
Thanks for answers.

Why not just something simple like this:
def to_hms(time)
hours = time / 3600
minutes = (time / 60) % 60
if (hours > 0 and minutes > 0)
'%d h %d m' % [ hours, minutes ]
elsif (hours > 0)
'%d h' % hours
elsif (minutes > 0)
'%d m' % minutes
end
end
Where this produces the desired results:
to_hms(63000)
# => "17 h 30 m"
to_hms(28800)
# => "8 h"
to_hms(1800)
# => "30 m"

Use
.gsub(/\b0+(?:\B|\s[hm](?:\s|$))/, '')
See proof.
Explanation
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
0+ '0' (1 or more times (matching the most
amount possible))
--------------------------------------------------------------------------------
(?: group, but do not capture:
--------------------------------------------------------------------------------
\B the boundary between two word chars (\w)
or two non-word chars (\W)
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
\s whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
[hm] any character of: 'h', 'm'
--------------------------------------------------------------------------------
(?: group, but do not capture:
--------------------------------------------------------------------------------
\s whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
$ before an optional \n, and the end of
the string
--------------------------------------------------------------------------------
) end of grouping
--------------------------------------------------------------------------------
) end of grouping
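As a rough check, the pattern above can be applied to the four `strftime` results from the question. This is only a sketch: the constant `ZERO_PARTS` and the helper `strip_zero_parts` are hypothetical names, and the trailing `strip` is an added assumption, because removing a zero minutes part leaves the space after "h" behind.

```ruby
# Pattern quoted from the answer above.
ZERO_PARTS = /\b0+(?:\B|\s[hm](?:\s|$))/

# Hypothetical helper: format the seconds with strftime, then remove
# leading zeros and all-zero parts. The final strip is an assumption
# added here to drop the space left after "8 h".
def strip_zero_parts(seconds)
  Time.at(seconds).utc.strftime('%H h %M m').gsub(ZERO_PARTS, '').strip
end

[63000, 28800, 1800, 300].each do |duration|
  puts "#{duration} -> #{strip_zero_parts(duration).inspect}"
end
# 63000 -> "17 h 30 m"
# 28800 -> "8 h"
# 1800 -> "30 m"
# 300 -> "5 m"
```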

If you are using Rails 6.1+, you can try to utilize the following function:
def format_seconds(numeric)
seconds = numeric.abs.seconds
hours = seconds.in_hours.floor.hours
minutes = seconds.in_minutes.minutes - hours.in_minutes.minutes
hours = hours.parts[:hours].to_i
minutes = minutes.parts[:minutes].to_i
formatted =
case
when hours.nonzero? && minutes.nonzero?
"#{hours} h #{minutes} m"
when hours.nonzero?
"#{hours} h"
when minutes.nonzero?
"#{minutes} m"
else
'0 m'
end
formatted = "-#{formatted}" if numeric.negative?
formatted
end
# format_seconds(63_000)
# => "17 h 30 m"
# format_seconds(28_800)
# => "8 h"
# format_seconds(1_800)
# => "30 m"
# format_seconds(0)
# => "0 m"
# format_seconds(1_000_000)
# => "277 h 46 m"
# format_seconds(-1_000_000)
# => "-277 h 46 m"
# format_seconds(300.25)
# => "5 m"
Sources:
Numeric#abs.
Numeric#seconds.
ActiveSupport::Duration#in_hours.
Numeric#floor.
Numeric#hours.
ActiveSupport::Duration#in_minutes.
Numeric#minutes.
ActiveSupport::Duration#parts.
Numeric#nonzero?.
Numeric#negative?.

You can either match 00 followed by h or m, or you can match a 0 and assert a digit 1-9 directly to the right followed by either h or m
\b(?:00 [hm]|0(?=[1-9] [hm]))\s*
Rubular demo and a Ruby demo.
Time.at(duration1).utc.strftime('%H h %M m').gsub(/\b(?:00 [hm]|0(?=[1-9] [hm]))\s*/, '')

Taking a page from @tadman's answer, I suggest the following.
def doit(duration)
hr, min = (duration/60).divmod(60)
case
when hr == 0 then "#{min} m"
when min == 0 then "#{hr} h"
else "#{hr} h #{min} m"
end
end
doit 63000 #=> "17 h 30 m"
doit 28800 #=> "8 h"
doit 1800 #=> "30 m"
doit 0 #=> "0 m"
See Integer#divmod (which references Numeric#divmod). For example,
(63000/60).divmod(60) #=> 1050.divmod(60)
#=> [17, 30]
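The two steps (integer division to whole minutes, then `divmod` to split off the hours) can be tried in isolation. A minimal sketch, assuming integer input; the method name `hours_and_minutes` is made up for illustration.

```ruby
# Hypothetical helper: returns [hours, minutes] for a duration in seconds.
def hours_and_minutes(seconds)
  total_minutes = seconds / 60   # integer division drops leftover seconds
  total_minutes.divmod(60)       # [quotient, remainder] => [hours, minutes]
end

p hours_and_minutes(63000)  #=> [17, 30]
p hours_and_minutes(28800)  #=> [8, 0]
p hours_and_minutes(300)    #=> [0, 5]
```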
I'll leave my original answer below for anyone wanting to practice the construction of regular expressions, but I cannot recommend its use in practice, or for that matter, the use of a regular expression generally (or conversion to a Time object).
def doit(duration)
hrs, secs = duration.divmod(3600)
("%d h %d m" % [hrs, secs/60]).gsub(/\A0 h |(?<=h) 0 m\z/, '')
end
doit 63000 #=> "17 h 30 m"
doit 28800 #=> "8 h"
doit 1800 #=> "30 m"
doit 0 #=> ""
I will explain the regular expression below.
Notice that doit 0 above returns an empty string. If it is desired that "0 m" be returned in that case the regular expression can be modified as follows.
def doit(duration)
hrs, secs = duration.divmod(3600)
("%d h %d m" % [hrs, secs/60]).gsub(/\A0 h |(?<=h)(?<!\A0 h) 0 m\z/, '')
end
doit 63000 #=> "17 h 30 m"
doit 28800 #=> "8 h"
doit 1800 #=> "30 m"
doit 0 #=> "0 m"
The second regular expression can be written in free-spacing mode to make it self-documenting.
/
\A # match beginning of string
0[ ]h[ ] # match '0 h '
| # or
(?<=h) # use positive lookbehind to assert the current match is preceded by 'h'
(?<!\A0[ ]h) # use negative lookbehind to assert the current match is not preceded by
# '0 h' at the beginning of the string
[ ]0[ ]m # match ' 0 m'
\z # match end of string
/x # invoke free-spacing regex definition mode
The first regular expression above differs from this one only in that it does not contain the negative lookbehind.
When using free-spacing mode the regex engine removes all whitespace outside comments before parsing the expression. Spaces that are part of the expression must therefore be protected. I've done that by putting each space in a character class. There are other ways to do that, one being to escape spaces (\ ).
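The effect of an unprotected space in free-spacing mode is easy to see with a tiny pattern. A small sketch (the constant names are illustrative only):

```ruby
UNPROTECTED = /0 m/x     # the space is dropped, so this behaves like /0m/
CHAR_CLASS  = /0[ ]m/x   # space protected by a character class
ESCAPED     = /0\ m/x    # space protected by a backslash

p "0 m".match?(UNPROTECTED)  #=> false
p "0m".match?(UNPROTECTED)   #=> true
p "0 m".match?(CHAR_CLASS)   #=> true
p "0 m".match?(ESCAPED)      #=> true
```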

Just for fun: chaining Kernel#then, returning a hash with the data, then manipulating it (rejecting zeros, etc.).
duration0 = 322200
res = duration0.divmod(24*60*60).then do |day, sec|
sec.divmod(60*60).then do |hour, sec|
sec.divmod(60).then do |min, sec|
{d: day, h: hour, m: min, s: sec}
end
end
end.reject { |_, v| v.zero? }.map { |k, v| "#{v}#{k}"}.join(' ')
res #=> "3d 17h 30m"
You could add days to hours if you want: {h: 24 * day + hour, m: min, s: sec}.
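Folding days into hours amounts to skipping the first `divmod` by a day. A possible sketch without the `then` chain, assuming a non-negative integer number of seconds; `compact_duration` is a hypothetical name.

```ruby
# Hypothetical variant: hours are not capped at 24, so days never appear.
def compact_duration(seconds)
  hour, rest = seconds.divmod(60 * 60)
  min, sec = rest.divmod(60)
  { h: hour, m: min, s: sec }
    .reject { |_, v| v.zero? }        # drop empty units
    .map { |k, v| "#{v}#{k}" }
    .join(' ')
end

p compact_duration(322200)  #=> "89h 30m"
p compact_duration(63000)   #=> "17h 30m"
```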

Related

Remove only non-leading and non-trailing spaces from a string in Ruby?

I'm trying to write a Ruby method that will return true only if the input is a valid phone number, which means, among other rules, it can have spaces and/or dashes between the digits, but not before or after the digits.
In a sense, I need a method that does the opposite of String#strip! (remove all spaces except leading and trailing spaces), plus the same for dashes.
I've tried using String#gsub!, but when I try to match a space or a dash between digits, then it replaces the digits as well as the space/dash.
Here's an example of the code I'm using to remove spaces. I figure once I know how to do that, it will be the same story with the dashes.
def valid_phone_number?(number)
phone_number_pattern = /^0[^0]\d{8}$/
# remove spaces
number.gsub!(/\d\s+\d/, "")
return number.match?(phone_number_pattern)
end
What happens is if I call the method with the following input:
valid_phone_number?(" 09 777 55 888 ")
I get false because line 5 transforms the number into " 0788 ", i.e. it gets rid of the digits around the spaces as well as the spaces. What I want it to do is just to get rid of the inner spaces, so as to produce " 0977755888 ".
I've tried
number.gsub!(/\d(\s+)\d/, "") and number.gsub!(/\d(\s+)\d/) { |match| "" } to no avail.
Thank you!!
If you want to return a boolean, you might for example use a pattern that accepts leading and trailing spaces, and matches 10 digits (as in your example data) where there can be optional spaces or hyphens in between.
^ *\d(?:[ -]*\d){9} *$
For example
def valid_phone_number?(number)
phone_number_pattern = /^ *\d(?:[ -]*\d){9} *$/
return number.match?(phone_number_pattern)
end
See a Ruby demo and a regex demo.
To remove spaces and hyphens in between digits, try:
(?:\d+|\G(?!^)\d+)\K[- ]+(?=\d)
See an online regex demo
(?: - Open non-capture group;
\d+ - Match 1+ digits;
| - Or;
\G(?!^)\d+ - Assert position at end of previous match but (negate start-line) with following 1+ digits;
)\K - Close non-capture group and reset matching point;
[- ]+ - Match 1+ space/hyphen;
(?=\d) - Assert position is followed by digits.
p " 09 777 55 888 ".gsub(/(?:\d+|\G(?!^)\d+)\K[- ]+(?=\d)/, '')
Prints: " 0977755888 "
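The reason the original `gsub!(/\d\s+\d/, "")` eats digits is that the digits are part of the match. One alternative, not taken from the answers above, is to assert the digits with a lookbehind and lookahead so that only the separators are consumed. The sketch below is a hypothetical rewrite of the asker's method: it keeps the asker's pattern but swaps `^`/`$` for `\A`/`\z` and avoids mutating the argument, both of which are assumptions about what is wanted.

```ruby
# Separators (spaces or dashes) that sit between two digits.
INNER_SEPARATORS = /(?<=\d)[ -]+(?=\d)/

def valid_phone_number?(number)
  # Only inner separators are removed; leading/trailing ones stay
  # and make the final match fail.
  cleaned = number.gsub(INNER_SEPARATORS, '')
  cleaned.match?(/\A0[^0]\d{8}\z/)
end

p " 09 777 55 888 ".gsub(INNER_SEPARATORS, '')  #=> " 0977755888 "
p valid_phone_number?("09 777 55 888")          #=> true
p valid_phone_number?("09-777-55-888")          #=> true
p valid_phone_number?(" 09 777 55 888 ")        #=> false
```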
Using a very simple regex (/\d/ tests for a digit):
str = " 09 777 55 888 "
r = str.index(/\d/)..str.rindex(/\d/)
str[r] = str[r].delete(" -")
p str # => " 0977755888 "
Passing a block to gsub is an option, capture groups available as globals:
>> str = " 09 777 55 888 "
# simple, easy to understand
>> str.gsub(/(^\s+)([\d\s-]+?)(\s+$)/){ "#$1#{$2.delete('- ')}#$3" }
=> " 0977755888 "
# a different take on @steenslag's answer, to avoid using range.
>> s = str.dup; s[/^\s+([\d\s-]+?)\s+$/, 1] = s.delete("- "); s
=> " 0977755888 "
Benchmark, not that it matters that much:
require 'benchmark'
n = 1_000_000
puts(Benchmark.bmbm do |x|
# just a match
x.report("match") { n.times {str.match(/^ *\d(?:[ -]*\d){9} *$/) } }
# use regex in []=
x.report("[//]=") { n.times {s = str.dup; s[/^\s+([\d\s-]+?)\s+$/, 1] = s.delete("- "); s } }
# use range in []=
x.report("[..]=") { n.times {s = str.dup; r = s.index(/\d/)..s.rindex(/\d/); s[r] = s[r].delete(" -"); s } }
# block in gsub
x.report("block") { n.times {str.gsub(/(^\s+)([\d\s-]+?)(\s+$)/){ "#$1#{$2.delete('- ')}#$3" }} }
# long regex
x.report("regex") { n.times {str.gsub(/(?:\d+|\G(?!^)\d+)\K[- ]+(?=\d)/, "")} }
end)
Rehearsal -----------------------------------------
match 0.997458 0.000004 0.997462 ( 0.998003)
[//]= 1.822698 0.003983 1.826681 ( 1.827574)
[..]= 3.095630 0.007955 3.103585 ( 3.105489)
block 3.515401 0.003982 3.519383 ( 3.521392)
regex 4.761748 0.007967 4.769715 ( 4.772972)
------------------------------- total: 14.216826sec
user system total real
match 1.031670 0.000000 1.031670 ( 1.032347)
[//]= 1.859028 0.000000 1.859028 ( 1.860013)
[..]= 3.074159 0.003978 3.078137 ( 3.079825)
block 3.751532 0.011982 3.763514 ( 3.765673)
regex 4.634857 0.003972 4.638829 ( 4.641259)

How to write a regex for a date-time string

dateTime = "SATURDAY1200PM1230PMWEEKLY"
Desired Result: "12:00 PM - 12:30 PM"
I tried doing this: let str = "SATURDAY600PM630PMWEEKLY".split(/[^A-Z][0-9]{3,4}(A|P)M/);
But I keep getting an array with chars/numbers. I am unsure if split is the way to go here.
Try a match approach:
var dateTime = "SATURDAY1200PM1230PMWEEKLY";
var ts = dateTime.match(/\d{3,4}[AP]M/g)
.map(x => x.replace(/(\d{1,2})(\d{2})([AP]M)/, "$1:$2 $3"))
.join(" - ");
console.log(ts);
As the programming language was not given I will provide a straightforward solution in Ruby which I expect could be converted easily to most other languages.
str = "SATURDAY1130AM130PMWEEKLY"
rgx = /\A[A-Z]+(\d{1,2})(\d{2})([AP]M)(\d{1,2})(\d{2})([AP]M)[A-Z]+\z/
m = str.match(rgx)
#=> #<MatchData "SATURDAY1130AM130PMWEEKLY" 1:"11" 2:"30" 3:"AM" 4:"1" 5:"30" 6:"PM">
"%s:%s %s - %s:%s %s" % [$1, $2, $3, $4, $5, $6]
#=> "11:30 AM - 1:30 PM"
Demo
The regular expression could be broken down as follows.
\A # match beginning of string
[A-Z]+ # match one or more uppercase letters
(\d{1,2}) # match 1 or 2 digits, save to capture group 1
(\d{2}) # match 2 digits, save to capture group 2
([AP]M) # match 'AM' or 'PM', save to capture group 3
(\d{1,2}) # match 1 or 2 digits, save to capture group 4
(\d{2}) # match 2 digits, save to capture group 5
([AP]M) # match 'AM' or 'PM', save to capture group 6
[A-Z]+ # match one or more uppercase letters
\z # match end of string
The last statement could also be written:
"%s:%s %s - %s:%s %s" % m.captures
#=> "11:30 AM - 1:30 PM"
which of course is specific to Ruby.
Another way is to make use of a language's date-time library. Again, this could be done as follows in Ruby.
require 'time'
s1, s2 = str.scan(/\d{3,4}[AP]M/).map do |s|
s.sub(/(?=\d{2}[AP])/, ' ')
end
#=> ["11 30AM", "1 30PM"]
t1 = DateTime.strptime(s1, '%I %M%p')
#=> #<DateTime: 2022-02-01T11:30:00+00:00
# ((2459612j,41400s,0n),+0s,2299161j)>
t2 = DateTime.strptime(s2, '%I %M%p')
#=> #<DateTime: 2022-02-01T13:30:00+00:00
# ((2459612j,48600s,0n),+0s,2299161j)>
t1.strftime('%l:%M %p') + " - " + t2.strftime('%l:%M %p')
#=> "11:30 AM - 1:30 PM"
If you are wondering why .map do |s| s.sub(/(?=\d{2}[AP])/, ' ') end is needed in calculating s1 and s2 try removing it and changing the format string to '%I%M%p'.
The solution is to use match and then convert the result to your string:
let str = "SATURDAY600PM630PMWEEKLY"
.match(/[\d]{3,4}(A|P)M/g)
.map((time) => {
const AMPM = time.slice(-2);
const m = time.slice(-4,-2);
const h = time.slice(0,-4);
return `${h}:${m} ${AMPM}`;
})
.join(' - ')
console.log(str)
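The match-then-reformat approach of the JavaScript answers can also be sketched in Ruby with `String#scan`, which returns the capture groups of every match. The method name `time_range` is hypothetical, and the sketch assumes the input contains times written as one or two hour digits, two minute digits, and AM or PM.

```ruby
# Hypothetical helper: pull out every time and join them with " - ".
def time_range(str)
  str.scan(/(\d{1,2})(\d{2})([AP]M)/)
     .map { |hour, min, meridiem| "#{hour}:#{min} #{meridiem}" }
     .join(' - ')
end

p time_range("SATURDAY1200PM1230PMWEEKLY")  #=> "12:00 PM - 12:30 PM"
p time_range("SATURDAY600PM630PMWEEKLY")    #=> "6:00 PM - 6:30 PM"
```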

Try to split a string with particular regex expression

I'm trying to split a string using two separators and a regex. My string is, for example,
"test 10 20 middle 30 - 40 mm",
and I would like to split it into ["test 10", "20 middle 30", "40 mm"], that is, splitting on and dropping ' - ' and the space between two numbers.
I tried to do
result = re.split(r'[\d+] [\d+]', s)
> ['test 1', '0 middle 30 - 40 mm']
result2 = re.split(r' - |{\d+} {\d+}', s)
> ['test 10 20 middle 30', '40 mm']
Is there any reg expression to split in ['test 10', '20 middle 30', '40 mm'] ?
You may use
(?<=\d)\s+(?:-\s+)?(?=\d)
See the regex demo.
Details
(?<=\d) - a digit must appear immediately on the left
\s+ - 1+ whitespaces
(?:-\s+)? - an optional sequence of a - followed with 1+ whitespaces
(?=\d) - a digit must appear immediately on the right.
See the Python demo:
import re
text = "test 10 20 middle 30 - 40 mm"
print( re.split(r'(?<=\d)\s+(?:-\s+)?(?=\d)', text) )
# => ['test 10', '20 middle 30', '40 mm']
Data
k="test 10 20 middle 30 - 40 mm"
Please Try
result2 = re.split(r"(^[a-z]+\s\d+|\^d+\s[a-z]+|\d+)$",k)
result2
^[a-z]+ - match lowercase letters at the start of the string, greedily, followed by:
\s - a whitespace character
\d+ - digits, matched greedily
| - or match the start of the string with digits \d+, also matched greedily, followed by:
\s - a whitespace character
[a-z]+ - lowercase letters, matched greedily
| - or match digits \d+ greedily at the end of the string $
Output

Regex to get total price with space as separator

I need to build a regex that captures the total price. Here are some examples:
Total: 145.01 $
Total: 1 145.01 $
Total: 00.01 $
Total: 12 345.01 $
It needs to get any price that follows 'Total: ', without the '$'.
This is what I have so far: (?<=\bTotal:\s*)(\d+.\d+)
RegExr
I assume:
each string must begin with 'Total:' followed by three spaces, the prefix;
the last digit in the string must be followed by ' $' (one space), the suffix, which is at the end of the string;
the substring between the prefix and suffix must end '.dd', where 'd' represents any digit, the cents;
the substring between the prefix and cents must match one of the following patterns, where 'd' represents any digit: 'd', 'dd', 'ddd', 'd ddd', 'dd ddd', 'ddd ddd', 'd ddd ddd', 'dd ddd ddd', 'ddd ddd ddd', 'd ddd ddd ddd' and so on;
the return value is the substring between the prefix and suffix that meets the above requirements; and
spaces will be removed from the substring returned as a separate step at the end.
We can use the following regular expression.
r = /\ATotal: {3}(\d{1,3}(?: \d{3})*\.\d{2}) \$\z/
In Ruby (but if you don't know Ruby you'll get the idea):
arr = <<~_.split(/\n/)
Total: 145.01 $
Total: 1 145.01 $
Total: 00.01 $
Total: 12 345.01 $
Total: 1 241 345.01 $
Total: 1.00 $
Total: 1.00$
Total: 1.00 $x
My Total: 1.00 $
Total: 12 34.01 $
_
The following matches each string in the array arr and extracts the contents of capture group 1, which is shown on the right side of each line.
arr.each do |s|
puts "\"#{(s + '"[r,1]').ljust(30)}: #{s[r,1] || 'no match'}"
end
"Total: 145.01 $"[r,1] : 145.01
"Total: 1 145.01 $"[r,1] : 1 145.01
"Total: 00.01 $"[r,1] : 00.01
"Total: 12 345.01 $"[r,1] : 12 345.01
"Total: 1 241 345.01 $"[r,1] : 1 241 345.01
"Total: 1.00 $"[r,1] : no match
"Total: 1.00$"[r,1] : no match
"Total: 1.00 $x"[r,1] : no match
"My Total: 1.00 $"[r,1] : no match
"Total: 12 34.01 $"[r,1] : no match
The regular expression can be written in free-spacing mode to make it self-documenting.
r = /
\A # match the beginning of the string
Total:\ {3} # match 'Total:' followed by 3 spaces
( # begin capture group 1
\d{1,3} # match 1, 2 or 3 digits
(?:\ \d{3}) # match a space followed by 3 digits
* # perform the previous match zero or more times
\.\d{2} # match a period followed by 2 digits
) # end capture group 1
\ \$ # match a space followed by a dollar sign
\z # match end of string
/x # free-spacing regex definition mode
The regex can be seen in action here.
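The final assumption above, removing spaces from the captured substring as a separate step, is not shown in the code. A minimal sketch of that step, assuming the three-space prefix the answer stipulates; `TOTAL_RE` and `total_amount` are hypothetical names.

```ruby
# Same pattern as the answer's regex, stored under an illustrative name.
TOTAL_RE = /\ATotal: {3}(\d{1,3}(?: \d{3})*\.\d{2}) \$\z/

# Hypothetical helper: returns the price with thousands spaces removed,
# or nil when the line does not match.
def total_amount(line)
  captured = line[TOTAL_RE, 1]
  captured && captured.delete(' ')
end

p total_amount("Total:   1 241 345.01 $")  #=> "1241345.01"
p total_amount("Total:   145.01 $")        #=> "145.01"
p total_amount("Total:   12 34.01 $")      #=> nil
```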

R regex: how to remove "*" only in between a group of variables

I have a group of variables var:
> var
[1] "a1" "a2" "a3" "a4"
Here is what I want to achieve: using regex, change strings such as this:
3*a1 + a1*a2 + 4*a3*a4 + a1*a3
to
3a1 + a1*a2 + 4a3*a4 + a1*a3
Basically, I want to trim "*" that is not in between any values in var. Thank you in advance
You can find (?<![\da-z])(\d+)\* and replace it with $1
(?<! [\da-z] )
( \d+ ) # (1)
\*
Or, ((?:[^\da-z]|^)\d+)\* for the assertion impaired engines
( # (1 start)
(?: [^\da-z] | ^ )
\d+
) # (1 end)
\*
Leading assertions are bad anyways.
Benchmark
Regex1: (?<![\da-z])(\d+)\*
Options: < none >
Completed iterations: 100 / 100 ( x 1000 )
Matches found per iteration: 2
Elapsed Time: 1.09 s, 1087.84 ms, 1087844 µs
Regex2: ((?:[^\da-z]|^)\d+)\*
Options: < none >
Completed iterations: 100 / 100 ( x 1000 )
Matches found per iteration: 2
Elapsed Time: 0.77 s, 767.04 ms, 767042 µs
You can create a dynamic regex out of the var to match and capture *s that are inside your variables, and reinsert them back with a backreference in gsub, and remove all other asterisks:
var <- c("a1","a2","a3","a4")
s = "3*a1 + a1*a2 + 4*a3*a4 + a1*a3"
block = paste(var, collapse="|")
pat = paste0("\\b((?:", block, ")\\*)(?=\\b(?:", block, ")\\b)|\\*")
gsub(pat, "\\1", s, perl=T)
## "3a1 + a1*a2 + 4a3*a4 + a1*a3"
See the IDEONE demo
Here is the regex:
\b((?:a1|a2|a3|a4)\*)(?=\b(?:a1|a2|a3|a4)\b)|\*
Details:
\b - leading word boundary
((?:a1|a2|a3|a4)\*) - Group 1 matching
(?:a1|a2|a3|a4) - either one of your variables
\* - asterisk
(?=\b(?:a1|a2|a3|a4)\b) - a lookahead check that there must be one of your variables (otherwise, no match is returned, the * is matched with the second branch of the alternation)
| - or
\* - a "wild" literal asterisk to be removed.
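For readers who want to experiment with the keep-or-drop alternation outside R, the same idea can be sketched in Ruby. This is a port under assumptions, not the answer's code: `drop_coefficient_stars` is a made-up name, `Regexp.union` stands in for `paste(var, collapse="|")`, and a block replaces the `\\1` backreference so that an unmatched group yields an empty string.

```ruby
# Hypothetical Ruby port of the dynamic-pattern idea above.
def drop_coefficient_stars(expr, vars)
  block = Regexp.union(vars)                  # alternation of the variable names
  pat = /\b(#{block}\*)(?=#{block}\b)|\*/
  # First branch: keep "var*" when another variable follows.
  # Second branch: a lone asterisk, replaced by nothing.
  expr.gsub(pat) { Regexp.last_match(1).to_s }
end

vars = %w[a1 a2 a3 a4]
p drop_coefficient_stars("3*a1 + a1*a2 + 4*a3*a4 + a1*a3", vars)
#=> "3a1 + a1*a2 + 4a3*a4 + a1*a3"
```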
Taking the equation as a string, one option is
gsub('((?:^| )\\d)\\*(\\w)', '\\1\\2', '3*a1 + a1*a2 + 4*a3*a4 + a1*a3')
# [1] "3a1 + a1*a2 + 4a3*a4 + a1*a3"
which looks for
a captured group of characters, ( ... )
containing a non-capturing group, (?: ... )
containing the beginning of the line ^
or, |
a space (or \\s)
followed by a digit 0-9, \\d.
The capturing group is followed by an asterisk, \\*,
followed by another capturing group ( ... )
containing an alphanumeric character \\w.
It replaces the above with
the first captured group, \\1,
followed by the second captured group, \\2.
Adjust as necessary.
Thanks to @alistaire for offering a solution with a non-capturing group. However, that solution relies on there being a space between the coefficient and the "+" in front of it. Here's my modified solution based on his suggestion:
> ss <- "3*a1 + a1*a2+4*a3*a4 +2*a1*a3+ 4*a2*a3"
# my modified version
> gsub('((?:^|\\s|\\+|\\-)\\d)\\*(\\w)', '\\1\\2', ss)
[1] "3a1 + a1*a2+4a3*a4 +2a1*a3+ 4a2*a3"
# alistaire's
> gsub('((?:^| )\\d)\\*(\\w)', '\\1\\2', ss)
[1] "3a1 + a1*a2+4*a3*a4 +2*a1*a3+ 4a2*a3"