I have a list in Groovy which contains the names in the below format:
My_name_is_Jack
My_name_is_Rock
My_name_is_Sunn
How can I trim the list and get only the last part of it; i.e. Names - Jack, Rock and Sunn. (Please note that the names are only 4 characters long)
Either of the following approaches works:
You can use substring with lastIndexOf,
or use the replace method to replace My_name_is_ with an empty string.
Script (using the first approach):
def list = ['My_name_is_Jack', 'My_name_is_Rock', 'My_name_is_Sunn']
//Closure to get the name
def getName = { s -> s.substring(s.lastIndexOf('_')+1, s.size()) }
println list.collect{getName it}
If you want to use replace, use the closure below instead.
def getName = { s -> s.replace('My_name_is_','') }
You can quickly try it in the online demo.
Or
def list = ['My_name_is_Jack', 'My_name_is_Rock', 'My_name_is_Sunn']
println list*.split('_')*.getAt(-1)
You can either remove the common prefix:
def names = [ "My_name_is_Jack", "My_name_is_Rock", "My_name_is_Sunn", ]
assert ['Jack', 'Rock', 'Sunn'] == names*.replaceFirst('My_name_is_','')
or, since you are actually interested in the last four chars, you can also take those:
assert ['Jack', 'Rock', 'Sunn'] == names*.getAt(-4..-1)
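For comparison, here is a minimal Java sketch of the approaches discussed above: taking the substring after the last underscore, removing the known prefix, and taking the last four characters. Groovy strings are java.lang.String, so the same substring and lastIndexOf calls apply. The class and method names (NameExtractor, afterLastUnderscore, withoutPrefix, lastChars) are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; the names are illustrative only.
class NameExtractor {
    // Everything after the last underscore. If there is no underscore,
    // lastIndexOf returns -1, so the whole string is returned.
    static String afterLastUnderscore(String s) {
        return s.substring(s.lastIndexOf('_') + 1);
    }

    // Removes the fixed prefix when it is present, otherwise returns s unchanged.
    static String withoutPrefix(String s, String prefix) {
        return s.startsWith(prefix) ? s.substring(prefix.length()) : s;
    }

    // Last n characters; assumes s has at least n characters.
    static String lastChars(String s, int n) {
        return s.substring(s.length() - n);
    }

    // Applies the first approach to every element of the list.
    static List<String> names(List<String> items) {
        return items.stream()
                .map(NameExtractor::afterLastUnderscore)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("My_name_is_Jack", "My_name_is_Rock", "My_name_is_Sunn");
        System.out.println(names(list)); // [Jack, Rock, Sunn]
    }
}
```

The last-four-characters variant relies on every name being exactly four characters long, as stated in the question; the other two do not.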
My requirement is to transform some textual message ids. Input is
a.messageid=X0001E
b.messageid=Y0001E
The task is to turn that into
a.messageid=Z00001E
b.messageid=Z00002E
In other words: fetch the first part of each line (like: a.), and append a slightly different id.
My current solution:
val matcherForIds = Regex("(.*)\\.messageid=(X|Y)\\d{4,6}E")
var idCounter = 5
fun transformIds(line: String): String {
    val result = matcherForIds.matchEntire(line) ?: return line
    return "${result.groupValues.get(1)}.messageid=Z%05dE".format(idCounter++)
}
This works, but I find the way I access the first match, "${result.groupValues.get(1)}", not very elegant.
Is there a more readable or more concise way to access that first match?
You may get the result without a separate function:
val line = s.replace("""^(.*\.messageid=)[XY]\d{4,6}E$""".toRegex()) {
    "${it.groupValues[1]}Z%05dE".format(idCounter++)
}
However, since you need to format idCounter into the result, you cannot use a plain string replacement pattern, and you cannot get rid of ${it.groupValues[1]}.
Also, note:
You can get rid of the double backslashes by using a triple-quoted string literal.
There is no need to add .messageid= to the replacement if you capture that part into Group 1 (see (.*\.messageid=)).
There is no need to capture X or Y since you are not using them later; thus, (X|Y) can be replaced with the more efficient character class [XY].
The ^ and $ anchors ensure the pattern matches the entire string; otherwise there is no match and the string is returned as is, without any modification.
See the Kotlin demo online.
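The notes above can also be sketched in plain Java, which is what Kotlin's Regex wraps on the JVM. This is only an illustration under assumptions: the class name MessageIdRewriter, the method name transform, and the configurable start value are hypothetical and not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class for illustration; it mirrors the regex from the answer above.
class MessageIdRewriter {
    // Group 1 captures everything up to and including ".messageid=".
    // [XY] is a character class, so X or Y is matched without being captured.
    private static final Pattern ID_LINE =
            Pattern.compile("^(.*\\.messageid=)[XY]\\d{4,6}E$");

    private int counter;

    MessageIdRewriter(int start) {
        this.counter = start;
    }

    // Returns the rewritten line, or the line unchanged if it does not match entirely.
    String transform(String line) {
        Matcher m = ID_LINE.matcher(line);
        if (!m.matches()) {
            return line;
        }
        return m.group(1) + String.format("Z%05dE", counter++);
    }

    public static void main(String[] args) {
        MessageIdRewriter rewriter = new MessageIdRewriter(1);
        System.out.println(rewriter.transform("a.messageid=X0001E")); // a.messageid=Z00001E
        System.out.println(rewriter.transform("b.messageid=Y0001E")); // b.messageid=Z00002E
        System.out.println(rewriter.transform("unrelated line"));     // unrelated line
    }
}
```

Concatenating the captured group with the formatted id, rather than formatting the whole string, keeps any % characters in the captured prefix from being interpreted as format specifiers.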
Maybe not really what you are looking for, but maybe it is. What if you first ensure (filter) the lines of interest and just replace what needs to be replaced instead, e.g. use the following transformation function:
val matcherForIds = Regex("(.*)\\.messageid=(X|Y)\\d{4,6}E")
val idRegex = Regex("[XY]\\d{4,6}E")
var idCounter = 5
fun transformIds(line: String) = idRegex.replace(line) {
    "Z%05dE".format(idCounter++)
}
with the following filter:
"a.messageid=X0001E\nb.messageid=Y0001E"
    .lineSequence()
    .filter(matcherForIds::matches)
    .map(::transformIds)
    .forEach(::println)
If there are also other relevant strings that you want to keep, the following is possible too, although it is not as nice as the solution at the end:
"a.messageid=X0001E\nnot interested line, but required in the output!\nb.messageid=Y0001E"
    .lineSequence()
    .map {
        when {
            matcherForIds.matches(it) -> transformIds(it)
            else -> it
        }
    }
    .forEach(::println)
Alternatively (now just copying Wiktor's regex, as it already contains all we need, i.e. a complete match from the beginning of the line ^ up to the end of the line $, etc.):
val matcherForIds = Regex("""^(.*\.messageid=)[XY]\d{4,6}E$""")
fun transformIds(line: String) = matcherForIds.replace(line) {
    "${it.groupValues[1]}Z%05dE".format(idCounter++)
}
This way you ensure that lines that completely match the desired input are replaced and the others are kept but not replaced.
I have been given two lists and I need to check whether any items in the sites list are in ignoredSites. When I run the code below it only prints google.co.uk. Should it not also print amazon.co.uk and groovy-lang.org?
Could someone explain why it doesn't?
def ignoredSites = ["www.amazon.com", /amazon.co.*/, /www.scala-lang.org/,/google.co.uk/, ~/htt(p|ps):\/\/www\.amazon\.co.*/, "groovy-lang.org"]
def sites = ["amazon.co.uk", ~/groo{2}vy-lang\.org/, "google.co.uk", "amazon.com", ~/scala.*/]
sites.each { site ->
    ignoredSites.contains(site) ? println("Ignored: ${site}") : ""
}
First of all, you are mixing regexes and strings in the same lists. I suggest keeping them in separate lists.
Second, be aware of how Groovy slashy strings work.
I modified your code so you can see that the slashy strings (amazon.co.*, www.scala-lang.org, google.co.uk) are actually interpreted as strings and not as regexes, as you expected.
And since you are mixing regexes and strings in the lists, the check has to be done differently:
def ignoredSites = ["www.amazon.com", /amazon.co.*/, /www.scala-lang.org/,/google.co.uk/, ~/htt(p|ps):\/\/www\.amazon\.co.*/, "groovy-lang.org"]
def sites = ["amazon.co.uk", ~/groo{2}vy-lang\.org/, "google.co.uk", "amazon.com", ~/scala.*/]
println '==========sites============'
sites.each { site ->
    println site.toString() + " == "+ site.class
}
println '==========ignoredSites============'
ignoredSites.each { site ->
    println site.toString() + " == "+ site.class
}
println '======================'
sites.each { site ->
    if (site.class.equals(java.util.regex.Pattern)) {
        ignoredSites.each { is ->
            if (is.class.equals(java.lang.String)) {
                if (is.matches(site)) println("Ignored: ${site}") // ignored string matches the site regex
            } else {
                // can't match two regexes against each other
            }
        }
    } else {
        ignoredSites.each { is ->
            if (is.class.equals(java.lang.String)) {
                if (is.equals(site)) println("Ignored: ${site}") // string equals string
            } else {
                if (site.matches(is)) println("Ignored3: ${site}") // site string matches the ignored regex
            }
        }
    }
}
If you run the code, which prints the element types, you will notice the following:
==========sites============
amazon.co.uk == class java.lang.String
groo{2}vy-lang\.org == class java.util.regex.Pattern
google.co.uk == class java.lang.String
amazon.com == class java.lang.String
scala.* == class java.util.regex.Pattern
==========ignoredSites============
www.amazon.com == class java.lang.String
amazon.co.* == class java.lang.String
www.scala-lang.org == class java.lang.String
google.co.uk == class java.lang.String
htt(p|ps)://www\.amazon\.co.* == class java.util.regex.Pattern
groovy-lang.org == class java.lang.String
======================
So amazon.co.uk is not matched, because the regular expression that should match it:
amazon.co.* == class java.lang.String
is interpreted by Groovy as a String, since a slashy string without the ~ operator is just a String.
On the other hand
groo{2}vy-lang\.org == class java.util.regex.Pattern
is a regex, but the {2} in it applies to the preceding o, which must therefore appear exactly 2 times.
In short, groo{2}vy-lang\.org will match grooovy-lang.org (note the three o's in there).
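The effect of the {2} quantifier can be checked with a small Java sketch, since Groovy patterns are java.util.regex.Pattern instances. The class name QuantifierDemo and its helper method are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing how {2} binds to the preceding character only.
class QuantifierDemo {
    // True if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // "groo{2}" means g, r, o, then o repeated twice: three o's in total.
        System.out.println(matches("groo{2}vy-lang\\.org", "groovy-lang.org"));  // false
        System.out.println(matches("groo{2}vy-lang\\.org", "grooovy-lang.org")); // true
        // "gro{2}" means g, r, then o repeated twice: two o's in total.
        System.out.println(matches("gro{2}vy-lang\\.org", "groovy-lang.org"));   // true
    }
}
```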
It would be rather unusual to have a site being a pattern but assuming that is what you meant:
def ignoredSites = ["www.amazon.com", /amazon.co.*/, /www.scala-lang.org/,/google.co.uk/, ~/htt(p|ps):\/\/www\.amazon\.co.*/, "groovy-lang.org"]
def sites = ["amazon.co.uk", ~/gro{2}vy-lang\.org/, "google.co.uk", "amazon.com", ~/scala.*/]
sites.findAll { site ->
    ignoredSites.find{ it == site || (site in String && site.matches(it) || (it in String && it.matches(site))) }
}.each{ println "Ignored: $it" }
Actually, I disagree with the accepted answer and it looks like the trap the interviewer wants you to fall into.
To check this, you can simply change ~/groo{2}vy-lang\.org/ to ~/gro{2}vy-lang\.org/ and see for yourself that "groovy-lang.org" still won't be ignored.
This is because java.util.Collection.contains() isn't trying to be clever (probably because it isn't overridden by Groovy) and simply checks, in this particular case, for equality (as defined here).
So "groovy-lang.org" ==~ /gro{2}vy-lang.org/ (the pattern matches) but "groovy-lang.org" != ~/gro{2}vy-lang.org/ (they're not equal objects, and Groovy truth doesn't abstract that particular case).
The "ignore" test is based on object equality, not on pattern matching, as the interviewer probably deliberately misleads you to believe.
Hope this helps, and I'm not mistaken.
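The difference between equality and pattern matching can be illustrated with a short Java sketch. It assumes a mixed list of String and Pattern entries like the one in the question; the class ContainsDemo and the method ignoredByMatching are hypothetical and only show one way the check could be written if matching were intended.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class contrasting contains() with explicit regex matching.
class ContainsDemo {
    // Assumed alternative check: Pattern entries are matched as regexes,
    // String entries are compared literally.
    static boolean ignoredByMatching(List<Object> ignored, String site) {
        for (Object entry : ignored) {
            if (entry instanceof Pattern && ((Pattern) entry).matcher(site).matches()) {
                return true;
            }
            if (entry instanceof String && entry.equals(site)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Object> ignored = Arrays.<Object>asList(
                "google.co.uk", Pattern.compile("gro{2}vy-lang\\.org"));

        // contains() relies on equals(): an equal String is found...
        System.out.println(ignored.contains("google.co.uk"));    // true
        // ...but a String is never equal to a Pattern that would match it,
        System.out.println(ignored.contains("groovy-lang.org")); // false
        // and Pattern does not override equals(), so even an identical regex is not found.
        System.out.println(ignored.contains(Pattern.compile("gro{2}vy-lang\\.org"))); // false

        System.out.println(ignoredByMatching(ignored, "groovy-lang.org")); // true
    }
}
```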
Regex isn't my strongest point. Let's say I need a custom parser that strips a string of any letters and of any decimal points after the first.
For example, input string is "--1-2.3-gf5.47", the parser would return
"-12.3547".
I could only come up with variations of this:
string.replaceAll("[^(\\-?)(\\.?)(\\d+)]", "")
which removes the letters but retains everything else. Any pointers?
More examples:
Input: -34.le.78-90
Output: -34.7890
Input: df56hfp.78
Output: 56.78
Some rules:
Consider only the first negative sign before the first number, everything else can be ignored.
I'm trying to do this using Java.
Assume the -ve sign, if there is one, will always occur before the decimal point.
Just tested this on ideone and it seemed to work. The comments should explain the code well enough. You can copy/paste this into Ideone.com and test it if you'd like.
It might be possible to write a single regex pattern for it, but you're probably better off implementing something simpler/more readable like below.
The three examples you gave print out:
--1-2.3-gf5.47 -> -12.3547
-34.le.78-90 -> -34.7890
df56hfp.78 -> 56.78
import java.util.*;
import java.lang.*;
import java.io.*;
/* Name of the class has to be "Main" only if the class is public. */
class Ideone
{
    public static void main (String[] args) throws java.lang.Exception
    {
        System.out.println(strip_and_parse("--1-2.3-gf5.47"));
        System.out.println(strip_and_parse("-34.le.78-90"));
        System.out.println(strip_and_parse("df56hfp.78"));
    }

    public static String strip_and_parse(String input)
    {
        //remove anything not a period or digit (including hyphens) for output string
        String output = input.replaceAll("[^\\.\\d]", "");
        //add a hyphen to the beginning of 'output' if the original string started with one
        if (input.startsWith("-"))
        {
            output = "-" + output;
        }
        //if the string contains a decimal point, remove all but the first one by splitting
        //the output string into two strings and removing all the decimal points from the
        //second half
        if (output.indexOf(".") != -1)
        {
            output = output.substring(0, output.indexOf(".") + 1)
                + output.substring(output.indexOf(".") + 1, output.length()).replaceAll("[^\\d]", "");
        }
        return output;
    }
}
In terms of regex, the secondary, tertiary, etc., decimals seem tough to remove. However, this one should remove the additional dashes and alphas: (?<=.)-|[a-zA-Z]. (Hopefully the syntax is the same in Java; this is a Python regex but my understanding is that the language is relatively uniform).
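As a quick check of how that pattern behaves in Java, here is a minimal sketch; the class name DashAndLetterStripper and the method strip are hypothetical. The lookbehind (?<=.) requires some character before the dash, so only a dash at the very start of the string survives, and the extra decimal points are left in place.

```java
// Hypothetical demo class applying the regex from the paragraph above in Java.
class DashAndLetterStripper {
    // Removes every dash that is preceded by any character, and every ASCII letter.
    static String strip(String input) {
        return input.replaceAll("(?<=.)-|[a-zA-Z]", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("--1-2.3-gf5.47")); // -12.35.47
        System.out.println(strip("df56hfp.78"));     // 56.78
    }
}
```

The first result still contains two decimal points, which is why a second step (or a scanning loop) is needed for the full requirement.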
That being said, it seems like you could just run a pretty short "finite state machine"-type piece of code to scan the string and rebuild the reduced string yourself like this:
a = "--1-2.3-gf5.47"
new_a = ""
dash = False
dot = False
nums = '0123456789'
for char in a:
    if char in nums:
        new_a = new_a + char # record a match to nums
        dash = True # since we saw a number first, turn on the dash flag, we won't use any dashes from now on
    elif char == '-' and not dash:
        new_a = new_a + char # if we see a dash and haven't seen anything else yet, we append it
        dash = True # activate the flag
    elif char == '.' and not dot:
        new_a = new_a + char # take the first dot
        dot = True # put up the dot flag
(Again, sorry for the syntax, I think you need some curly brackets around the statements vs. Python's indentation-only style)
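Since the question asks for Java, the same scan could be sketched in Java roughly as follows. This is a direct port of the loop above under the same assumptions (a sign, if any, occurs before the decimal point); the class name NumberStripper and the method stripAndParse are hypothetical.

```java
// Hypothetical Java port of the flag-based scan shown above.
class NumberStripper {
    static String stripAndParse(String input) {
        StringBuilder out = new StringBuilder();
        boolean signClosed = false; // true once a dash or a digit has been seen
        boolean dotSeen = false;    // true once the first decimal point has been kept

        for (char c : input.toCharArray()) {
            if (c >= '0' && c <= '9') {
                out.append(c);      // keep every digit
                signClosed = true;  // no dash is accepted after the first digit
            } else if (c == '-' && !signClosed) {
                out.append(c);      // keep only a dash that precedes all digits
                signClosed = true;
            } else if (c == '.' && !dotSeen) {
                out.append(c);      // keep only the first decimal point
                dotSeen = true;
            }
            // every other character is dropped
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripAndParse("--1-2.3-gf5.47")); // -12.3547
        System.out.println(stripAndParse("-34.le.78-90"));   // -34.7890
        System.out.println(stripAndParse("df56hfp.78"));     // 56.78
    }
}
```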
How can I get a submap of a map using a string as a pattern? For example, I have this map:
def map = [val1:'ATOPKLPP835', val2: 'ATOPKLPP847', val3:'ATOPKLPP739', val4:'YYHSTYSTX439', val5:'UUSTETSFEE34']
The first three values are identical up to the ninth character. I would like to get a submap containing only the entries with the string "ATOPKLPP". How can I do this?
Have a look at this:
def map = [val1:'ATOPKLPP835', val2: 'ATOPKLPP847', val3:'ATOPKLPP739', val4:'YYHSTYSTX439', val5:'UUSTETSFEE34']
def found = map.findAll { it.value.startsWith('ATOPKLPP')}
assert found == [val1:'ATOPKLPP835', val2:'ATOPKLPP847', val3:'ATOPKLPP739']
You can define any criterion you like in the closure passed to findAll.
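To show what swapping the criterion looks like, here is a rough Java equivalent using streams, where the criterion is passed in as a predicate on the value. The class name SubmapDemo and the method filterByValue are hypothetical; the regex variant is just one assumed example of a different criterion.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical demo class: keeps the entries whose value satisfies a criterion.
class SubmapDemo {
    static Map<String, String> filterByValue(Map<String, String> map,
                                             Predicate<String> criterion) {
        return map.entrySet().stream()
                .filter(e -> criterion.test(e.getValue()))
                // LinkedHashMap keeps the original insertion order of the entries.
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("val1", "ATOPKLPP835");
        map.put("val2", "ATOPKLPP847");
        map.put("val3", "ATOPKLPP739");
        map.put("val4", "YYHSTYSTX439");
        map.put("val5", "UUSTETSFEE34");

        // Same criterion as the Groovy answer: the value starts with the prefix.
        System.out.println(filterByValue(map, v -> v.startsWith("ATOPKLPP")).keySet());
        // prints [val1, val2, val3]

        // A different criterion: the whole value matches a regex.
        System.out.println(filterByValue(map, v -> v.matches("ATOPKLPP8\\d+")).keySet());
        // prints [val1, val2]
    }
}
```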
Currently I use this to pick the first element of a list:
def Get_Read_Key =
{
    logger.entering (TAG, "Get_Read_Key")
    val Retval = if (Read_Key_Available)
    {
        val Retval = Keystrokes.head
        Keystrokes = Keystrokes.tail
        Retval
    }
    else
    {
        calculator.ui.IKey.No_Key
    } // if
    logger.exiting (TAG, "Get_Read_Key", Retval)
    Retval
} // Get_Read_Key
def Read_Key_Available = Keystrokes.size > 0
But it all looks kind of clumsy, especially the double 'Retval'. Is there a better way of doing this? Or is it just the price to pay for using an immutable list?
Background: The routine is used on a Unit Test Mock class – return types are set.
The following code will get you the first element of Keystrokes list if it's not empty and calculator.ui.IKey.No_Key otherwise:
Keystrokes.headOption.getOrElse( calculator.ui.IKey.No_Key )
P.S. Reassigning Keystrokes to its tail is a definite sign of bad design. Instead, as themel already mentioned, you should use the existing iteration capabilities of a list in your algorithm. Most probably, methods such as map or foreach will solve your problem.
P.P.S. You've violated several Scala naming conventions:
Variable, value, method and function names begin with a lowercase letter.
camelCase is used to delimit words instead of underscores. In fact, using underscores for this purpose is strongly discouraged because Scala treats that character specially.
You're implementing an Iterator on a List, that's already in the standard library.
val it = Keystrokes.iterator
def Read_Key_Available = it.hasNext
def Get_Read_Key = if(it.hasNext) it.next() else calculator.ui.IKey.No_Key
You can use pattern matching:
Keystrokes match {
    case h :: t =>
        Keystrokes = t
        h
    case _ =>
        calculator.ui.IKey.No_Key
}