How to extract integer (seconds), from three possible different inputs? - regex

As described in the title, how can I get the integer data from a string input that contains both digits and other characters?
Possible inputs for the function are:
("1 min." ... "9 min."),
("11:59" ... "12:00") and
(">>"), which I can assume is 0.
I came up with this solution, but it returns the exact input string. How can I get only the number matched by the pattern?
def toSeconds(time : String) : String = {
val pattern = """(\d+) min.""".r
val pattern2 = """(\d+):(\d+).""".r
if(pattern.findFirstIn(time) != "None")
{
pattern.findFirstIn(time).toString.concat("h")
}
if (pattern2.findFirstIn(time) != "None")
{
pattern2.findFirstIn(time).toString.concat("x")
}
if (time == ">>") 0.toString
else time
}

I'd do it like this:
time match {
case pattern(m) => s"${m}h"
case pattern2(h,m) => s"${h}h${m}"
case ">>" => "0"
case _ => time
}
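To get actual seconds rather than a string, the same match can return a number. This is a minimal sketch, assuming "N min." means N minutes and "HH:MM" means hours and minutes; the question does not say how a clock-style value should be converted, so that rule is a guess. The names TimeParser, MinutesR and ClockR are made up for illustration. Note that pattern2 in the question ends in an unescaped ".", which swallows the last digit of "11:59" when the regex is used as an extractor, so the sketch drops it.

```scala
object TimeParser {
  // "\." matches a literal dot; a bare "." would match any character.
  private val MinutesR = """(\d+) min\.""".r
  private val ClockR   = """(\d+):(\d+)""".r

  // Returns None for input that matches none of the three expected shapes.
  def toSeconds(time: String): Option[Int] = time match {
    case MinutesR(m)  => Some(m.toInt * 60)
    // Assumption: the two numbers are hours and minutes.
    case ClockR(h, m) => Some((h.toInt * 60 + m.toInt) * 60)
    case ">>"         => Some(0)
    case _            => None
  }
}
```

Returning Option[Int] keeps the "no match" case visible to the caller instead of echoing the input back.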

Related

Regex scala: Format matching and padding

Correct input format: xxxx/yyyy/zzzz i.e. 4 chars for each part. Total length of the string (not counting "/") should always be 12.
Input can be: xxx/yyy/zzz then it should be padded to come out as 0xxx/0yyy/0zzz
At this stage at least one "/" will be there. If there are 2 parts then we need 6 chars for both.
Looking for a regex with padding logic in Scala.
// line to tune:
val matchThis = raw"(\d{4})/(\d{4})/(\d{4})".r
val valids = List ("1/6", "123456/1", "1/123456", "123456/123456", "1/2/3", "1234/1234/1234", "012/12/3", "1/01/012")
val invalids = List ("/6", "1234567/1", "1/1234567", "1234567/1234567", "/2/3", "1/2/", "12345/1234/1234", "012/12345/3", "1/01/012345")
def tester (input: String) = {
input match {
case matchThis(_*) => "It's valid!"
case _ => "Need some work" /*???*/
}
}
valids.map (s => tester(s))
invalids.map (s => tester(s))
This isn't bulletproof but I think it covers most of what you've described.
val valid = raw"(\d{1,6})/(\d{1,6})(?:/(\d{1,4}))?".r
val output = input match {
case valid(a,b,null) => f"$a%6s/$b%6s" replaceAll(" ","0")
case valid(a,b,c) => f"$a%4s/$b%4s/$c%4s" replaceAll(" ","0")
case _ => "invalid"
}
A little more complete.
val valid = raw"(\d{1,4})/(\d{1,4})/(\d{1,4})|(\d{1,6})/(\d{1,6})".r
val output = input match {
case valid(null,null,null,a,b) => f"$a%6s/$b%6s" replaceAll(" ","0")
case valid(a,b,c,null,null) => f"$a%4s/$b%4s/$c%4s" replaceAll(" ","0")
case _ => "invalid"
}
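The same rules can also be applied without a regex, by splitting on "/" and left-padding each part. This is a sketch under the question's stated rules (two parts padded to 6 digits, three parts to 4); PadParts and normalize are illustrative names, and returning None for invalid input is an assumption.

```scala
object PadParts {
  def normalize(input: String): Option[String] = {
    // limit -1 keeps trailing empty parts, so "1/2/" is seen as three parts
    val parts = input.split("/", -1)
    val width = parts.length match {
      case 2 => 6
      case 3 => 4
      case _ => 0 // any other number of parts is rejected below
    }
    def isDigits(p: String) = p.nonEmpty && p.forall(c => c >= '0' && c <= '9')
    val ok = width > 0 && parts.forall(p => isDigits(p) && p.length <= width)
    // Left-pad each part with zeros up to the required width.
    if (ok) Some(parts.map(p => "0" * (width - p.length) + p).mkString("/"))
    else None
  }
}
```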

Scala string pattern match regex a star multiple findallin

I want to parse this string: "er1r2r3" with: """(e|w|n|s)(r[1-3])*""".r
val SideR = """(e|w|n|s)""".r
val PieceR = """(r)([1-3])""".r
def parseSidedPieces(str: String): (Char, List[Char]) = {
val side = str(0) match {
case SideR(s) => s
}
val pieces = parsePieces(str.tail)
(side, pieces)
}
def parsePieces(str: String): List[Char] = {
PieceR.findAllIn(str).toList map {
case PieceR(c, n) => n
}
}
But this throws on the empty string "" because of str(0).
How can I fix this using regexes only?
I don't think this can be fixed 'regexes only' (whatever that is supposed to mean), because the code fails before the first regex is used.
It fails because you call apply(index: Int) on an empty String. So, either you do an isEmpty check before calling str(0) or even parseSidedPieces, or you change the code and match the whole String:
val PieceR = """(r)([1-3])""".r
val CombinedR = "(e|w|n|s)((?:r[1-3])*)".r
def parseSidedPieces(str: String): (Char, List[Char]) = {
str match {
case CombinedR(side, pieces) =>
(side(0), parsePieces(pieces))
case "" =>
// hmm, what kind of tuple would be a good return value here? maybe:
throw new IllegalArgumentException(s"Unexpected input: $str")
case _ =>
// handle unmatched strings however you like, I'd do:
throw new IllegalArgumentException(s"Unexpected input: $str")
}
}
def parsePieces(str: String): List[Char] = {
PieceR.findAllIn(str).toList map {
case PieceR(c, n) => n(0)
}
}
parseSidedPieces("er1r2r3") // res0: (Char, List[Char]) = (e,List(1, 2, 3))
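For the open question in the comments about what to return for unmatched input, one option is to return an Option instead of throwing. This is a minimal sketch of that idea; SidedPieces and parse are illustrative names, not part of the original answer.

```scala
object SidedPieces {
  // Whole-string match: one side letter followed by zero or more r1..r3 pieces.
  private val CombinedR = "([ewns])((?:r[1-3])*)".r
  private val PieceR    = "r([1-3])".r

  def parse(str: String): Option[(Char, List[Char])] = str match {
    case CombinedR(side, pieces) =>
      // Collect the digit of every piece; an empty pieces string yields Nil.
      val digits = PieceR.findAllMatchIn(pieces).map(_.group(1).head).toList
      Some((side.head, digits))
    case _ => None // covers "" as well as any other malformed input
  }
}
```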

How to split string by delimiter in scala?

I have a string like this:
val str = "3.2.1"
And I want to do some manipulations based on it.
Here is what I want to do; suggestions are welcome.
I'm automating a website, and I need to perform some actions based on this string:
the first field - I need to choose by value: value="str[0]"
the second field - I need to choose by value: value="str[0]+"."+str[1]"
the third field - I need to choose by value: value="str[0]+"."+str[1]+"."+str[2]"
As you can see, the second field I need to choose is named firstdigit.seconddigit, and the third is firstdigit.seconddigit.thirddigit.
You can use pattern matching for this.
First create regex:
@ val pattern = """(\d+)\.(\d+)\.(\d+)""".r
pattern: util.matching.Regex = (\d+)\.(\d+)\.(\d+)
Then you can use it to pattern match:
@ "3.4.342" match { case pattern(a, b, c) => println(a, b, c) }
(3,4,342)
If you don't need all the numbers, you can ignore some of them:
@ "1.2.0" match { case pattern(a, _, _) => println(a) }
1
If you want just the first two numbers, you can do:
@ val twoNumbers = "1.2.0" match { case pattern(a, b, _) => s"$a.$b" }
twoNumbers: String = "1.2"
I can only add one more variant to @Lukasz's answer, with value extraction:
@ val pattern = """(\d+)\.(\d+)\.(\d+)""".r
pattern: scala.util.matching.Regex = (\d+)\.(\d+)\.(\d+)
@ val pattern(firstdigit, seconddigit, thirddigit) = "3.2.1"
firstdigit: String = "3"
seconddigit: String = "2"
thirddigit: String = "1"
This way all the values can be treated as regular vals further in the code.
val str="vaquar.khan"
val strArray=str.split("\\.")
strArray.foreach(println)
Try the following (split takes a regular expression, so the dot must be escaped):
scala> "3.2.1".split("\\.")
res0: Array[java.lang.String] = Array(3, 2, 1)
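String.split interprets its String argument as a regular expression, so an unescaped "." matches every character and the result is an empty array. A small sketch comparing the three spellings (SplitDemo is an illustrative name):

```scala
object SplitDemo {
  // Regex "." matches any character, so every token is empty and gets dropped.
  val byRegexDot: Array[String] = "3.2.1".split(".")
  // Escaping the dot makes it a literal separator.
  val byEscaped: Array[String] = "3.2.1".split("\\.")
  // The Char overload is not a regex, so no escaping is needed.
  val byChar: Array[String] = "3.2.1".split('.')
}
```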
This one:
object Splitter {
def splitAndAccumulate(string: String) = {
val s = string.split("\\.")
s.tail.scanLeft(s.head){ case (acc, elem) =>
acc + "." + elem
}
}
}
passes this test:
test("Simple"){
val t = Splitter.splitAndAccumulate("1.2.3")
val answers = Seq("1", "1.2", "1.2.3")
t.zip(answers).foreach{ case (l, r) =>
assert(l == r)
}
}

filter/map structure to map/guard structure in Scala

I have a summaryPool mutable map that maps a String to a Summary object.
The function namesToSummaries has two parameters: the first is a series of names (an Iterable[String]), and the second is the summaryPool. It returns the series of Summary objects that correspond to the names.
It's a little more complicated than that, because each name must be checked with a regular expression to extract the part that is used as the key into summaryPool.
For example, "summary1b" should be split into "summary1" and "b"; "summary1" is the key into the pool. In some cases the trailing "b" may be absent.
My implementation uses the inSummaryPool function to filter out names that are wrongly formatted or not in the pool. Then I use map to get a copy of each Summary object from the pool.
import scala.collection.mutable.{Map => mm}
def namesToSummaries(names: Iterable[String], summaryPool: mm[String, Summary]) = {
val namePattern = """([a-zA-Z]+\d+)([a-z])?""".r
def inSummaryPool(name: String) = {
name match {
case namePattern(summaryName, summaryType) => {
if (summaryPool.contains(summaryName)) true
else false
}
case _ => false
}
}
names filter inSummaryPool map { name =>
name match {
case namePattern(summaryName, summaryType) => {
var sType = summaryType
if (sType == null || !(sType == "b" || sType == "l")) sType = "b"
summaryPool.get(summaryName).get.copy(sType)
}
}
}
}
It works fine, but I don't like the implementation as it checks regular expression matching twice.
I think I can merge the filter/map into a single map with a guard. To do so, I think I need to implement something like this:
import scala.collection.mutable.{Map => mm}
def namesToSummaries(names: Iterable[String], summaryPool: mm[String, Summary]) = {
val namePattern = """([a-zA-Z]+\d+)([a-z])?""".r
names map { name =>
name match {
case namePattern(summaryName, summaryType) => {
if (summaryPool.contains(summaryName)) {
var sType = summaryType
if (sType == null || !(sType == "b" || sType == "l")) sType = "b"
summaryPool.get(summaryName).get.copy(sType)
}
else
???
}
case _ => ???
}
}
}
I'm not sure what expression should be given in ??? to teach Scala to ignore these cases.
What might be the solution?
EDIT1
I can think about making a ListBuffer object to add Summary object when necessary.
But, I'm not sure about the case when the pattern does not match.
val list = ListBuffer.empty[Summary]
names foreach { name =>
name match {
case namePattern(summaryName, summaryType) => {
if (summaryPool.contains(summaryName)) {
var sType = summaryType
if (sType == null || !(sType == "b" || sType == "l")) sType = "b"
list += summaryPool.get(summaryName).get.copy(sType)
}
}
case _ => ???
}
}
}
EDIT2
From Shadowlands' answer, flatMap with None return works fine.
def namesToSummaries(names: Iterable[String], summaryPool: mm[String, Summary]) = {
val namePattern = """([a-zA-Z]+\d+)([a-z])?""".r
names flatMap { name =>
name match {
case namePattern(summaryName, summaryType) => {
if (summaryPool.contains(summaryName)) {
var sType = summaryType
if (sType == null || !(sType == "b" || sType == "l")) sType = "b"
Some(summaryPool.get(summaryName).get.copy(sType))
}
else None
}
case _ => None
}
}
}
EDIT3
Following Jilen's hint, collect seems to be a good way to reduce the code further.
def namesToSummaries(names: Iterable[String], summaryPool: mm[String, Summary]) = {
val namePattern = """([a-zA-Z]+\d+)([a-z])?""".r
names collect { name =>
name match {
case namePattern(summaryName, summaryType) if (summaryPool.contains(summaryName)) => {
var sType = summaryType
if (sType == null || !(sType == "b" || sType == "l")) sType = "b"
summaryPool.get(summaryName).get.copy(sType)
}
}
}
}
However, IntelliJ 14 reports a false-positive error for this code; see the bug report (https://youtrack.jetbrains.com/issue/SCL-9094#).
Instead of calling map on the names, try using flatMap. Wrap your successful cases in Some(...), and the ??? becomes None. The 'flattening' part of the flatMap will reduce the 'mapped' Iterable[Option[Summary]] back to an Iterable[Summary], ditching all the None cases.
Edit: I didn't drill into your code quite carefully enough - in the 'successful' case you appear to be doing pure side-effecting stuff (ie. updating the mutable map), not returning a result of any kind.
You could instead return a (summaryName, summaryType) tuple at this point (wrapped in Some) and apply the side-effecting code to the contents of the resulting flatMap (probably my preference as being a slightly more functional style), or simply go back to using map and just write _ (meaning here: 'do nothing - ignore any result') instead of ???.
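The collect approach can also be written with a pattern-matching anonymous function, so the guard does the filtering and the regex runs once per name. This is a minimal sketch, assuming a hypothetical Summary case class with a kind field (the real Summary is not shown in the question); SummaryLookup is an illustrative name.

```scala
import scala.collection.mutable

object SummaryLookup {
  // Hypothetical stand-in for the question's Summary class.
  final case class Summary(name: String, kind: String = "b")

  private val NamePattern = """([a-zA-Z]+\d+)([a-z])?""".r

  def namesToSummaries(names: Iterable[String],
                       pool: mutable.Map[String, Summary]): Iterable[Summary] =
    names.collect {
      // Names that do not match, or whose key is not in the pool, are skipped.
      case NamePattern(key, suffix) if pool.contains(key) =>
        // suffix is null when the optional group did not match;
        // anything other than "b" or "l" falls back to "b".
        val kind = Option(suffix).filter(s => s == "b" || s == "l").getOrElse("b")
        pool(key).copy(kind = kind)
    }
}
```

Wrapping the nullable group in Option avoids the var and the explicit null check.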

Number of occurrences of substring in string in Swift

My main string is "hello Swift Swift and Swift" and substring is Swift.
I need to get the number of times the substring "Swift" occurs in the mentioned string.
This code can determine whether the pattern exists.
var string = "hello Swift Swift and Swift"
if string.rangeOfString("Swift") != nil {
println("exists")
}
Now I need to know the number of occurrences.
A simple approach would be to split on "Swift", and subtract 1 from the number of parts:
let s = "hello Swift Swift and Swift"
let tok = s.components(separatedBy:"Swift")
print(tok.count-1)
This code prints 3.
Edit: Before Swift 3, the call looked like this:
let tok = s.componentsSeparatedByString("Swift")
Should you want to count characters rather than substrings:
extension String {
func count(of needle: Character) -> Int {
return reduce(0) {
$1 == needle ? $0 + 1 : $0
}
}
}
This optimises dwsolberg's solution to count faster. It is also faster than componentsSeparatedByString.
extension String {
/// stringToFind must be at least 1 character.
func countInstances(of stringToFind: String) -> Int {
assert(!stringToFind.isEmpty)
var count = 0
var searchRange: Range<String.Index>?
while let foundRange = range(of: stringToFind, options: [], range: searchRange) {
count += 1
searchRange = Range(uncheckedBounds: (lower: foundRange.upperBound, upper: endIndex))
}
return count
}
}
Usage:
// return 2
"aaaa".countInstances(of: "aa")
If you want to ignore accents, you may replace options: [] with options: .diacriticInsensitive, as dwsolberg did.
If you want to ignore case, you may replace options: [] with options: .caseInsensitive, as ConfusionTowers suggested.
If you want to ignore both accents and case, you may replace options: [] with options: [.caseInsensitive, .diacriticInsensitive], as ConfusionTowers suggested.
If, on the other hand, you want the fastest comparison possible and you can guarantee some canonical form for composed character sequences, then you may consider the option .literal, which performs only exact matches.
Swift 5 Extension
extension String {
func numberOfOccurrencesOf(string: String) -> Int {
return self.components(separatedBy:string).count - 1
}
}
Example use
let string = "hello Swift Swift and Swift"
let numberOfOccurrences = string.numberOfOccurrencesOf(string: "Swift")
// numberOfOccurrences = 3
I'd recommend an extension to string in Swift 3 such as:
extension String {
func countInstances(of stringToFind: String) -> Int {
var stringToSearch = self
var count = 0
while let foundRange = stringToSearch.range(of: stringToFind, options: .diacriticInsensitive) {
stringToSearch = stringToSearch.replacingCharacters(in: foundRange, with: "")
count += 1
}
return count
}
}
It's a loop that finds and removes each instance of stringToFind, incrementing the count on each pass. Once stringToSearch no longer contains stringToFind, the loop ends and the count is returned.
Note that I'm using .diacriticInsensitive so it ignores accents (for example résumé and resume would both be found). You might want to add or change the options depending on the types of strings you want to find.
I needed a way to count substrings that may contain the start of the next matched substring. Building on dwsolberg's extension and String's range(of:options:range:locale:) method, I came up with this String extension:
extension String
{
/**
Counts the occurrences of a given substring by calling Strings `range(of:options:range:locale:)` method multiple times.
- Parameter substring : The string to search for, optional for convenience
- Parameter allowOverlap : Bool flag indicating whether the matched substrings may overlap. Count of "🐼🐼" in "🐼🐼🐼🐼" is 2 if allowOverlap is **false**, and 3 if it is **true**
- Parameter options : String compare-options to use while counting
- Parameter range : An optional range to limit the search, default is **nil**, meaning search whole string
- Parameter locale : Locale to use while counting
- Returns : The number of occurrences of the substring in this String
*/
public func count(
occurrencesOf substring: String?,
allowOverlap: Bool = false,
options: String.CompareOptions = [],
range searchRange: Range<String.Index>? = nil,
locale: Locale? = nil) -> Int
{
guard let substring = substring, !substring.isEmpty else { return 0 }
var count = 0
let searchRange = searchRange ?? startIndex..<endIndex
var searchStartIndex = searchRange.lowerBound
let searchEndIndex = searchRange.upperBound
while let rangeFound = range(of: substring, options: options, range: searchStartIndex..<searchEndIndex, locale: locale)
{
count += 1
if allowOverlap
{
searchStartIndex = index(rangeFound.lowerBound, offsetBy: 1)
}
else
{
searchStartIndex = rangeFound.upperBound
}
}
return count
}
}
Why not just use some length maths?
extension String {
func occurences(of search:String) -> Int {
guard search.count > 0 else {
preconditionFailure()
}
let shrunk = self.replacingOccurrences(of: search, with: "")
return (self.count - shrunk.count)/search.count
}
}
Try this
var mainString = "hello Swift Swift and Swift"
var count = 0
mainString.enumerateSubstrings(in: mainString.startIndex..<mainString.endIndex, options: .byWords) { (subString, subStringRange, enclosingRange, stop) in
if case let s? = subString{
if s.caseInsensitiveCompare("swift") == .orderedSame{
count += 1
}
}
}
print(count)
For the sake of completeness, and because there is a regex tag, here is a solution using a regular expression:
let string = "hello Swift Swift and Swift"
let regex = try! NSRegularExpression(pattern: "swift", options: .caseInsensitive)
let numberOfOccurrences = regex.numberOfMatches(in: string, range: NSRange(string.startIndex..., in: string))
The option .caseInsensitive is optional.
My solution. It might be better to use String.Index instead of an Int range, but I think this way is a bit easier to read.
extension String {
func count(of char: Character, range: (Int, Int)? = nil) -> Int {
let range = range ?? (0, self.count)
return self.enumerated().reduce(0) {
guard ($1.0 >= range.0) && ($1.0 < range.1) else { return $0 }
return ($1.1 == char) ? $0 + 1 : $0
}
}
}
A solution that uses higher-order functions:
func subStringCount(str: String, substr: String) -> Int {
{ $0.isEmpty ? 0 : $0.count - 1 } ( str.components(separatedBy: substr))
}
Unit Tests
import XCTest
class HigherOrderFunctions: XCTestCase {
func testSubstringWhichIsPresentInString() {
XCTAssertEqual(subStringCount(str: "hello Swift Swift and Swift", substr: "Swift"), 3)
}
func testSubstringWhichIsNotPresentInString() {
XCTAssertEqual(subStringCount(str: "hello", substr: "Swift"), 0)
}
}
Another way, using RegexBuilder on iOS 16+ and Swift 5.7+:
import RegexBuilder
let text = "hello Swift Swift and Swift"
let match = text.matches(of: Regex{"Swift"})
print(match.count) // prints 3
Using this as a function
func countSubstrings(string : String, subString : String)-> Int{
return string.matches(of: Regex{subString}).count
}
print(countSubstrings(string: text, subString: "Swift")) //prints 3
Using this as an Extension
extension String {
func countSubstrings(subString : String)-> Int{
return self.matches(of: Regex{subString}).count
}
}
print(text.countSubstrings(subString: "Swift")) // prints 3