Struggling with simple Groovy List operations

I've abstracted a very simple situation here, in which I want to pass a list of strings into my cleanLines function and get a list of strings back out. Unfortunately, I'm new to Groovy and I've spent about a day trying to get this to work, to no avail. Here's a stand-alone test that exhibits the problem I'm having:
import static org.junit.Assert.*;
import java.util.List;
import org.junit.Test;
class ConfigFileTest {
private def tab = '\t'
private def returnCarriage = '\r'
private def equals = '='
List <String> cleanLines(List <String> lines) {
lines = lines.collect(){it.findAll(){c -> c != tab && c != returnCarriage}}
lines = lines.findAll(){it.contains(equals)}
lines = lines.collect{it.trim()}
}
@Test
public void test() {
List <String> dirtyLines = [" Colour=Red",
"Shape=Square "]
List <String> cleanedLines = ["Colour=Red",
"Shape=Square"]
assert cleanLines(dirtyLines) == cleanedLines
}
}
I believe that I've followed the correct usage for collect(), findAll() and trim(). But when I run the test, it crashes on the trim() line, stating:
groovy.lang.MissingMethodException: No signature of method:
java.util.ArrayList.trim() is applicable for argument types: ()
values: []
Something's suspicious.
I've been staring at this for too long and noticed that my IDE thinks the type of my first lines within the cleanLines function is List<String>, but that by the second line it has type Collection and by the third it expects type List<Object<E>>. I think that String is an Object and so this might be okay, but it certainly hints at a misunderstanding on my part. What am I doing wrong? How can I get my test to pass here?

Here's a corrected script:
import groovy.transform.Field
@Field
def tab = '\t'
@Field
def returnCarriage = '\r'
@Field
def equals = '='
List <String> cleanLines(List <String> lines) {
lines = lines.findAll { it.contains(equals) }
lines = lines.collect { it.replaceAll('\\s+', '') }
lines = lines.collect { it.trim() }
}
def dirtyLines = [" Colour=Red",
"Shape=Square "]
def cleanedLines = ["Colour=Red", "Shape=Square"]
assert cleanLines(dirtyLines) == cleanedLines
In general, findAll and collect are not mutually exclusive, but they have different purposes. Use findAll to find the elements that match certain criteria, and collect when you need to process or transform every element of the list.

This line
lines = lines.collect(){it.findAll(){c -> c != tab && c != returnCarriage}}
replaces the original list of Strings with a list of lists, hence the MissingMethodException for ArrayList.trim(). You might want to replace findAll{} with find{}
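The effect can be reproduced in a few lines. This is only a sketch: the sample input and the variable names nested and flat are made up for illustration, and joining the characters back together is one possible fix, not necessarily the one the answer above had in mind.

```groovy
def tab = '\t'
def returnCarriage = '\r'
def lines = ["\tColour=Red\r", "Shape=Square"]

// findAll on a String iterates over its characters and returns a List,
// so collect produces a list of lists instead of a list of Strings.
def nested = lines.collect { it.findAll { c -> c != tab && c != returnCarriage } }
assert nested[0] instanceof List
assert nested[0].size() == 10   // the ten characters of "Colour=Red"

// Joining each inner list turns it back into a String, so trim() is available again.
def flat = lines.collect { it.findAll { c -> c != tab && c != returnCarriage }.join('') }
assert flat == ["Colour=Red", "Shape=Square"]
```

Calling trim() on an element of nested is what raises the MissingMethodException from the question, because each element is an ArrayList.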

You can clean the lines like this:
def dirtyLines = [" Colour=Red", "Shape=Square "]
def cleanedLines = ["Colour=Red", "Shape=Square"]
assert dirtyLines.collect { it.trim() } == cleanedLines

If you separate the first line of cleanLines() into two separate lines and print lines after each, you will see the problem.
it.findAll { c -> c != tab && c != returnCarriage }
will return a list of strings that match the criteria. The collect method is called on every string in the list of lines. So you end up with a list of lists of strings. I think what you are looking for is something like this:
def cleanLines(lines) {
return lines.findAll { it.contains(equals) }
.collect { it.replaceAll(/\s+/, '') }
}

Related

Change a tuple using pig

I need to substitute characters in a tuple using a Pig UDF. For example, if I have a line in the file such as "hello world, Hello WORLD, hello\WORLD", it needs to be transformed into "hello_world,hello_world,hello_world". To accomplish this, I tried the UDF below:
package myUDF;
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;
public class ReplaceValues extends EvalFunc<Tuple>
{
public Tuple exec(Tuple input) throws IOException {
if (input == null || input.size() == 0)
return null;
try{
String str = (String)input.get(0);
str=str.replace(" ", "_");
str=str.replace("/","");
str=str.replace("\\","");
TupleFactory tf = TupleFactory.getInstance();
Tuple t = tf.newTuple();
t.append(str);
return t;
}catch(Exception e){
throw new IOException("Caught exception processing input row ", e);
}
}
}
But when I call this UDF from a Pig script I run into problems. Please help me resolve this:
A = load '/user/cloudera/Stage/ActualDataSet.csv' using PigStorage(',') AS (Rank:chararray,NCTNumber:chararray,Title:chararray,Recruitment:chararray);
B = FILTER A by Rank == 'Rank';
C = FOREACH B GENERATE PigUDF.ReplaceValues(B);
Error: Pig script failed to parse:
Invalid scalar projection: B : A column needs to be projected from a relation for it to be used as a scalar
You have to pass the field that you are trying to modify, not the relation B. Assuming the field that you are trying to modify is Title, you would call the UDF as below:
C = FOREACH B GENERATE B.Rank,B.NCTNumber,PigUDF.ReplaceValues(B.Title),B.Recruitment;
Note that if you are trying to replace characters in the entire record, then your load statement is incorrect. You will have to load the entire record as one line:chararray and then pass the line to your UDF.
Also, instead of a UDF you can use a regular expression to match and replace the string of your choice.
In your Pig script, you are passing the entire bag "B" to the UDF, while it accepts a tuple as its argument.
Instead, pass a field such as B.Title, as given below.
C = FOREACH B GENERATE PigUDF.ReplaceValues(B.Title);
You can also call this UDF on other fields in the same line:
C = FOREACH B GENERATE PigUDF.ReplaceValues(B.Title), PigUDF.ReplaceValues(B.Rank);

Compressing series of if-else statement in Groovy

I have a series of if-else statements in Groovy:
String invoke = params?.target
AuxService aux = new AuxService()
def output
if(invoke?.equalsIgnoreCase("major")) {
output = aux.major()
}
else if(invoke?.equalsIgnoreCase("minor")) {
output = aux.minor()
}
else if(invoke?.equalsIgnoreCase("repository")) {
output = aux.repository()
}
...
else if(invoke?.equalsIgnoreCase("temp")) {
output = aux.temp()
}
else {
output = aux.propagate()
}
The ellipsis stands for another 14 if-else branches, 19 in total. As you can see, the value of invoke determines which AuxService method is called. Now I'm thinking of the following to reduce the number of lines:
String invoke = params?.target()
AuxService aux = new AuxService()
def output = aux?."$invoke"() ?: aux.propagate()
But I think the third line might not work; it looks very unconventional, and I have a hunch that it is error-prone. Is this valid code, or is there a better approach to compress these lines?
Just test String invoke before using it. Note that aux won't be null, so no need to use safe navigation (?.).
class AuxService {
def major() { 'major invoked' }
def minor() { 'minor invoked' }
def propagate() { 'propagate invoked' }
}
def callService(invoke) {
def aux = new AuxService()
return invoke != null ? aux.invokeMethod(invoke, null) : aux.propagate()
}
assert callService('major') == 'major invoked'
assert callService(null) == 'propagate invoked'
Note that this will fail if the input doesn't name a valid method in class AuxService.
Firstly, your code is valid Groovy. However, if equalsIgnoreCase is required, your reduced code will not work. The same applies if params is null, since invoke would then be null. But I think your basic idea is right. What I would do is create a map (static final somewhere) with the method names in uppercase as String keys and the real method names in their correct casing as String values. Then you can use that to ensure correctness for different cases. I would handle null separately:
def methodsMap = ["MAJOR":"major",/* more mappings here */]
String invoke = params?.target()
AuxService aux = new AuxService()
def methodName = methodsMap[invoke?.toUpperCase()]
def output = methodName ? aux."$methodName"() : aux.propagate()
A slightly different approach would be to use Closure values in the map. I personally find that a bit overkill, but it allows you to do more than just the plain invocation:
def methodsMap = ["MAJOR":{it.major()},/* more mappings here */]
String invoke = params?.target()
AuxService aux = new AuxService()
def stub = methodsMap[invoke?.toUpperCase()]
def output = stub != null ? stub(aux) : aux.propagate()
I thought about using Map#withDefault, but since that creates a new entry for every missing key I decided not to; it could potentially cause memory problems. In Java 8 you can use Map#getOrDefault:
String invoke = params?.target()
AuxService aux = new AuxService()
def methodName = methodsMap.getOrDefault(invoke?.toUpperCase(), "propagate")
def output = aux."$methodName"()
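Put together, the map-based dispatch could look like the following self-contained sketch. The AuxService stub, the two map entries, and the helper name dispatch are assumptions made for illustration; the real service has many more methods.

```groovy
// Stand-in for the real service, reduced to three methods for the example.
class AuxService {
    def major() { 'major invoked' }
    def minor() { 'minor invoked' }
    def propagate() { 'propagate invoked' }
}

// Upper-case keys give case-insensitive lookup; values are the real method names.
def methodsMap = [MAJOR: 'major', MINOR: 'minor']

// Hypothetical helper: resolves the method name, falling back to propagate.
def dispatch(Map methodsMap, String invoke) {
    def aux = new AuxService()
    def methodName = methodsMap.getOrDefault(invoke?.toUpperCase(), 'propagate')
    aux."$methodName"()
}

assert dispatch(methodsMap, 'Major') == 'major invoked'
assert dispatch(methodsMap, null) == 'propagate invoked'
assert dispatch(methodsMap, 'unknown') == 'propagate invoked'
```

Because only names listed in the map are ever invoked, an unexpected value of invoke falls through to propagate() instead of raising a MissingMethodException.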
The Elvis operator is used to shorten the equivalent Java ternary operator expression.
For example,
def nationality = (user.nationality!=null) ? user.nationality : "undefined"
can be shortened using the Elvis operator to
def nationality = user.nationality ?: "undefined"
Note that the Elvis operator evaluates the expression to the left of the "?:" symbol. If the result is true according to Groovy truth (non-null, and also not false, zero, or empty), it returns that result immediately; otherwise it evaluates the expression to the right of "?:" and returns that.
What this means is that you cannot use the Elvis operator to run some different logic when the condition evaluates to true; it can only return the tested value itself. So, the ternary expression
user.nationality ? reportCitizen(user) : reportAlien(user)
cannot be (directly) expressed using the Elvis operator.
Coming back to the original question, the Elvis operator cannot be (directly) applied to checking if a method exists on an object and invoking it if present. So,
def output = aux?."$invoke"() ?: aux.propagate()
will not work as expected, because the Elvis operator will try to evaluate "aux?."$invoke"()" first. If "invoke" refers to a method that does not exist, you will get a MissingMethodException.
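A small sketch of both points, using a cut-down AuxService and made-up sample values (the variable names here are illustrative only):

```groovy
// Cut-down stand-in for the service from the question.
class AuxService {
    def major() { 'major invoked' }
    def propagate() { 'propagate invoked' }
}

// Elvis returns the left-hand value when it is "truthy", otherwise the right-hand one.
def missing = null
def empty = ''
def present = 'PH'
assert (missing ?: 'undefined') == 'undefined'
assert (empty ?: 'undefined') == 'undefined'   // an empty String is also falsy
assert (present ?: 'undefined') == 'PH'

// The left-hand side is evaluated first, so an unknown method name throws
// before the fallback on the right is ever reached.
def aux = new AuxService()
def invoke = 'noSuchMethod'
def caught = false
try {
    def output = aux."$invoke"() ?: aux.propagate()
} catch (MissingMethodException e) {
    caught = true
}
assert caught
```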
One way I can think of to work around this is -
class AuxService {
def major() { println 'major invoked' }
def minor() { println 'minor invoked' }
def propagate() { println 'propagate invoked' }
}
def auxService = new AuxService()
def allowedMethods = ["major", "minor", "propagate"]
def method = null
allowedMethods.contains(method?.toLowerCase()) ? auxService."${method?.toLowerCase()}"() : auxService.propagate() // Prints "propagate invoked"
method = "MaJoR"
allowedMethods.contains(method?.toLowerCase()) ? auxService."${method?.toLowerCase()}"() : auxService.propagate() // Prints "major invoked"
method = "undefined"
allowedMethods.contains(method?.toLowerCase()) ? auxService."${method?.toLowerCase()}"() : auxService.propagate() // Prints "propagate invoked"
In a nutshell, store the list of invoke-able methods in a list and check to see if we're trying to invoke a method from this list. If not, invoke the default method.

Need the Groovy way to do partial file substitutions

I have a file that I need to modify. The part I need to modify (not the entire file) is similar to the properties shown below. The problem is that I only need to replace part of the "value", the "ConfigurablePart" if you will. I receive this file, so I cannot control its format.
alpha.beta.gamma.1 = constantPart1ConfigurablePart1
alpha.beta.gamma.2 = constantPart2ConfigurablePart2
alpha.beta.gamma.3 = constantPart3ConfigurablePart3
I made this work this way, though I know it is really bad!
def updateFile(String pattern, String updatedValue) {
def myFile = new File(".", "inputs/fileInherited.txt")
StringBuffer updatedFileText = new StringBuffer()
def ls = System.getProperty('line.separator')
myFile.eachLine{ line ->
def regex = Pattern.compile(/$pattern/)
def m = (line =~ regex)
if (m.matches()) {
def buf = new StringBuffer(line)
buf.replace(m.start(1), m.end(1), updatedValue)
line = buf.toString()
}
println line
updatedFileText.append(line).append(ls)
}
myFile.write(updatedFileText.toString())
}
The passed in pattern is required to contain a group that is substituted in the StringBuffer. Does anyone know how this should really be done in Groovy?
EDIT -- to define the expected output
The file that contains the example lines needs to be updated such that the "ConfigurablePart" of each line is replaced with the updated text provided. For my ugly solution, I would need to call the method 3 times, once to replace ConfigurablePart1, once for ConfigurablePart2, and finally for ConfigurablePart3. There is likely a better approach to this too!!!
*UPDATED -- Answer that did what I really needed *
In case others ever hit a similar issue, the groovy code improvements I asked about are best reflected in the accepted answer. However, for my problem that did not quite solve my issues. As I needed to substitute only a portion of the matched lines, I needed to use back-references and groups. The only way I could make this work was to define a three-part regEx like:
(.*)(matchThisPart)(.*)
Once that was done, I was able to use:
it.replaceAll(~/$pattern/, "\$1$replacement\$3")
Thanks to both replies - each helped me out a lot!
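As a rough illustration of the three-group approach on a single line (the sample line comes from the file above; the values of pattern and replacement are made up):

```groovy
def pattern = 'ConfigurablePart2'
def replacement = 'NewValue'
def line = 'alpha.beta.gamma.2 = constantPart2ConfigurablePart2'

// Group 1 is everything before the match, group 2 the part to replace,
// and group 3 everything after it. \$1 and \$3 are back-references.
def updated = line.replaceAll(/(.*)(${pattern})(.*)/, "\$1${replacement}\$3")

assert updated == 'alpha.beta.gamma.2 = constantPart2NewValue'
```

One caveat: if the replacement text itself can contain "$" or "\", it would need to be escaped first, for example with java.util.regex.Matcher.quoteReplacement.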
It can be made less verbose by passing a closure as an argument. Here is how this can be done:
//abc.txt
abc.item.1 = someDummyItem1
abc.item.2 = someDummyItem2
abc.item.3 = someDummyItem3
alpha.beta.gamma.1 = constantPart1ConfigurablePart1
alpha.beta.gamma.2 = constantPart2ConfigurablePart2
alpha.beta.gamma.3 = constantPart3ConfigurablePart3
abc.item.4 = someDummyItem4
abc.item.5 = someDummyItem5
abc.item.6 = someDummyItem6
Groovy Code:-
//Replace the pattern in file and write to file sequentially.
def replacePatternInFile(file, Closure replaceText) {
file.write(replaceText(file.text))
}
def file = new File('abc.txt')
def patternToFind = ~/ConfigurablePart/
def patternToReplace = 'NewItem'
//Call the method
replacePatternInFile(file){
it.replaceAll(patternToFind, patternToReplace)
}
println file.getText()
//Prints:
abc.item.1 = someDummyItem1
abc.item.2 = someDummyItem2
abc.item.3 = someDummyItem3
alpha.beta.gamma.1 = constantPart1NewItem1
alpha.beta.gamma.2 = constantPart2NewItem2
alpha.beta.gamma.3 = constantPart3NewItem3
abc.item.4 = someDummyItem4
abc.item.5 = someDummyItem5
abc.item.6 = someDummyItem6
Check the contents of abc.txt to confirm. I have not used an updateFile() method as you did, but you can easily parameterize it as below:
def updateFile(file, patternToFind, patternToReplace){
replacePatternInFile(file){
it.replaceAll(patternToFind, patternToReplace)
}
}
For a quick answer I'd just go this route:
// Placeholder map: each key is a pattern to match, each value its replacement text
patterns = [pattern1 : constantPart1ConfigurablePart1,
pattern2 : constantPart2ConfigurablePart2,
pattern3 : constantPart3ConfigurablePart3]
def myFile = new File(".", "inputs/fileInherited.txt")
StringBuffer updatedFileText = new StringBuffer()
def ls = System.getProperty('line.separator')
myFile.eachLine{ line ->
patterns.each { pattern, replacement ->
line = line.replaceAll(pattern, replacement)
}
println line
updatedFileText.append(line).append(ls)
}
myFile.write(updatedFileText.toString())

Scala way / idiom of dealing with immutable List

I have had success using immutable Lists, but I am stumped when it comes to this piece of code. I find that I have written something more in the style of Java than of Scala. I would prefer to use List(...) instead of Buffer(...), but I don't see how I can pass the same modified immutable List to the next function. guesses is also modified within eliminate(...).
Any suggestions to help me do this the Scala way are appreciated. Thanks
val randomGuesses = List(...) // some long list of random integers
val guesses = randomGuesses.zipWithIndex.toBuffer
for ( s <- loop()) {
val results = alphaSearch(guesses)
if (results.size == 1) {
guesses(results.head._2) = results.head._1
eliminate(guesses, results.head._2)
}
else {
val results = betaSearch(guesses)
if (results.size == 1) {
guesses(results.head._2) = results.head._1
eliminate(guesses, results.head._2)
} else {
val results = betaSearch(guesses)
if (results.size == 1) {
guesses(results.head._2) = results.head._1
eliminate(guesses, results.head._2)
}
}
}
}
Here are some general tips, since this might be better suited to Code Review and the posted code is incomplete, with no sample data.
You can use pattern matching instead of if and else for checking the size.
results.size match{
case 1 => ... //Code in the if block
case _ => ... //Code in the else block
}
Instead of mutating guesses create a new List.
val newGuesses = ...
Then pass newGuesses into eliminate.
Lastly, it looks like eliminate modifies guesses. Change this to return a new list. e.g.
def eliminate(list: List[Int]) = {
//Eliminate something from list and return a new `List`
}

Attributes are empty during unit test of a taglib in grails

I am trying to test my code in taglib (grails 2.0.1):
class ATagLib {
static namespace = "s"
def person = {attrs, body -> out << attrs.person;}
}
@TestFor(ATagLib)
class ATagLibTests {
@Test
void test() {
String p = 'Joe'
// None of these work for me.
assert applyTemplate('<s:person person="${p}"/>') == 'Joe'
assert applyTemplate('<s:person/>', [person:p]) == 'Joe'
}
}
The test always fails, because attrs.person is null. How do I properly set attributes?
This will work:
String p = 'Joe'
assert applyTemplate('<s:person person="${person}"/>', [person:p]) == 'Joe'
assertOutputEquals('Joe is cool !', '<s:person person="${person}"/>', [person:p], { it.toString() + " is cool !" } )
It calls the first signature of applyTemplate, which is:
String applyTemplate(String contents, Map model = [:])
Is the problem that you are using single quotes for your template text? Only GStrings can use the $ notation for inserting variables. Single quotes make it a regular Java String which won't substitute your value in.
Try this:
assert applyTemplate("<s:person person=\"${p}\"/>") == "Joe"
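The quoting difference on its own can be checked outside Grails. This sketch only illustrates how Groovy treats the two string literals (the variable names are made up); it does not call applyTemplate, and it does not show how the GSP engine itself handles a literal ${...} in the template.

```groovy
def p = 'Joe'

// Single quotes: a plain String, the ${p} text is kept literally.
def singleQuoted = '<s:person person="${p}"/>'
// Double quotes: a GString, ${p} is replaced by Groovy before the template is parsed.
def doubleQuoted = "<s:person person=\"${p}\"/>"

assert singleQuoted.contains('${p}')
assert doubleQuoted == '<s:person person="Joe"/>'
```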