Is there default(in SDK) scala support for string templating? Example: "$firstName $lastName"(named not numbered parameters) or even constructs like for/if. If there is no such default engine, what is the best scala library to accomplish this.
If you want a templating engine, I suggest you have a look at scalate. If you just need string interpolation, "%s %s".format(firstName, lastName) is your friend.
Complementing Kim's answer, note that Java's Formatter accepts positional parameters. For example:
"%2$s %1$s".format(firstName, lastName)
Also, there's the Enhanced Strings plugin, which allows one to embed arbitrary expressions on Strings. For example:
#EnhanceStrings // enhance strings in this scope
trait Example1 {
val x = 5
val str = "Inner string arithmetics: #{{ x * x + 12 }}"
}
See also this question for more answers, as this is really a close duplicate.
In Scala 2.10 and up, you can use string interpolation
val name = "James"
println(s"Hello, $name") // Hello, James
val height = 1.9d
println(f"$name%s is $height%2.2f meters tall") // James is 1.90 meters tall
This compiler plug-in has provided string interpolation for a while:
http://jrudolph.github.com/scala-enhanced-strings/Overview.scala.html
More recently, the feature seems to be making it into the scala trunk: https://lampsvn.epfl.ch/trac/scala/browser/scala/trunk/test/files/run/stringInterpolation.scala -- which generates some interesting possiblities: https://gist.github.com/a69d8ffbfe9f42e65fbf (not sure if these were possible with the plug-in; I doubt it).
Related
I'm using textacy's pos_regex_matches method to find certain chunks of text in sentences.
For instance, assuming I have the text: Huey, Dewey, and Louie are triplet cartoon characters., I'd like to detect that Huey, Dewey, and Louie is an enumeration.
To do so, I use the following code (on testacy 0.3.4, the version available at the time of writing):
import textacy
sentence = 'Huey, Dewey, and Louie are triplet cartoon characters.'
pattern = r'<PROPN>+ (<PUNCT|CCONJ> <PUNCT|CCONJ>? <PROPN>+)*'
doc = textacy.Doc(sentence, lang='en')
lists = textacy.extract.pos_regex_matches(doc, pattern)
for list in lists:
print(list.text)
which prints:
Huey, Dewey, and Louie
However, if I have something like the following:
sentence = 'Donald Duck - Disney'
then the - (dash) is recognised as <PUNCT> and the whole sentence is recognised as a list -- which it isn't.
Is there a way to specify that only , and ; are valid <PUNCT> for lists?
I've looked for some reference about this regex language for matching PoS tags with no luck, can anybody help? Thanks in advance!
PS: I tried to replace <PUNCT|CCONJ> with <[;,]|CCONJ>, <;,|CCONJ>, <[;,]|CCONJ>, <PUNCT[;,]|CCONJ>, <;|,|CCONJ> and <';'|','|CCONJ> as suggested in the comments, but it didn't work...
Is short, it is not possible: see this official page.
However the merge request contains the code of the modified version described in the page, therefore one can recreate the functionality, despite it's less performing than using a SpaCy's Matcher (see code and example -- though I have no idea how to reimplement my problem using a Matcher).
If you want to go down this lane anyway, you have to change the line:
words.extend(map(lambda x: re.sub(r'\W', '', x), keyword_map[w]))
with the following:
words.extend(keyword_map[w])
otherwise every symbol (like , and ; in my case) will be stripped off.
We have a file that contains data that we want to match to a case class. I know enough to brute force it but looking for an idiomatic way in scala.
Given File:
#record
name:John Doe
age: 34
#record
name: Smith Holy
age: 33
# some comment
#record
# another comment
name: Martin Fowler
age: 99
(field values on two lines are INVALID, e.g. name:John\n Smith should error)
And the case class
case class Record(name:String, age:Int)
I Want to return a Seq type such as Stream:
val records: Stream records
The couple of ideas i'm working with but so far haven't implemented is:
Remove all new lines and treat the whole file as one long string. Then grep match on the string "((?!name).)+((?!age).)+age:([\s\d]+)" and create a new object of my case class for each match but so far my regex foo is low and can't match around comments.
Recursive idea: Iterate through each line to find the first line that matches record, then recursively call the function to match name, then age. Tail recursively return Some(new Record(cumulativeMap.get(name), cumulativeMap.get(age)) or None when hitting the next record after name (i.e. age was never encountered)
?? Better Idea?
Thanks for reading! The file is more complicated than above but all rules are equal. For the curious: i'm trying to parse a custom M3U playlist file format.
I'd use kantan.regex for a fairly trivial regex based solution.
Without fancy shapeless derivation, you can write the following:
import kantan.regex._
import kantan.regex.implicits._
case class Record(name:String, age:Int)
implicit val decoder = MatchDecoder.ordered(Record.apply _)
input.evalRegex[Record](rx"(?:name:\s*([^\n]+))\n(?:age:\s*([0-9]+))").toList
This yields:
List(Success(Record(John Doe,34)), Success(Record(Smith Holy,33)), Success(Record(Martin Fowler,99)))
Note that this solution requires you to hand-write decoder, but it can often be automatically derived. If you don't mind a shapeless dependency, you could simply write:
import kantan.regex._
import kantan.regex.implicits._
import kantan.regex.generic._
case class Record(name:String, age:Int)
input.evalRegex[Record](rx"(?:name:\s*([^\n]+))\n(?:age:\s*([0-9]+))").toList
And get the exact same result.
Disclaimer: I'm the library's author.
You could use Parser Combinators.
If you have the file format specification in BNF or can write one, then Scala can create a parser for you from those rules. This may be more robust than hand-made regex based parsers. It's certainly more "Scala".
I don't have much experience in Scala, but could these regexes work:
You could use (?<=name:).* to match name value, and (?<=age:).* to match the age value. If you use this, remove spaces in found matches, otherwise name: bob will match bob with a space before, you might not want that.
If name: or any other tag is in comment, or comment is after value, something will be matched. Please leave a comment if you want to avoid that.
You could try this:
Path file = Paths.get("file.txt");
val lines = Files.readAllLines(file, Charset.defaultCharset());
val records = lines.filter(s => s.startsWith("age:") || s.startsWith("name:"))
.grouped(2).toList.map {
case List(a, b) => Record(a.replaceAll("name:", "").trim,
b.replaceAll("age:", "").trim.toInt)
}
As you can see in the above picture I can use Regex class like a data source when declaring it. Why is that?
And I also noticed this LinQ lines are starting with dots. How is this possible?
Regex.Matches(string, string) returns a MatchCollection instance which implements ICollection and IEnumerable. So you can't use LINQ directly since the LINQ extension methods in System.Linq.Enumerable require IEnumerable<T>(the generic version, differences).
That's why Enumerable.OfType was used. This returns IEnumerable<Match>, so now you can use LINQ. Instead of OfType<Match> he could also have used Cast<Match>.
In general you can use Linq-To-Objects with any kind of type that implements IEnumerable<T>, even with a String since it implements IEnumerable<char>. A small example which creates a dictionary of chars and their occurrences:
Dictionary<char, int> charCounts = "Sample Text" // bad example because only unique letters but i hope you got it
.GroupBy(c => c)
.ToDictionary(g => g.Key, g => g.Count());
To answer the .dot part of your question. LINQ basically consists of many extension methods, so you call them also like any other method, you could use one line:
Dictionary<char, int> charCounts = "Sample Text".GroupBy(c => c).ToDictionary(g => g.Key, g => g.Count());
I'm newbie for python 2.7. I would like to create some function that knows which are variables in the given probability notation.
For example: Given a probability P(A,B,C|D,E,F) as string input. The function should return a list of events ['A','B','C'] and a list of sample spaces ['D','E','F']. If it is impossible to return two lists in the same time. Returning a list of two lists would be fine.
In summary:
Input:
somefunction('P(A,B,C|D,E,F)')
Expected output: [['A','B','C'],['D','E','F']]
Thank you in advance
A simple brute-force implementation. As #fjarri pointed out if you want to do anything more complex you might need a parser (like PyParser) or at least some regular expressions.
def somefunction(str):
str = str.strip()[str.index("(")+1:-1]
left, right = str.split("|")
return [left.split(","), right.split(",")]
I am new in swift, I have been working with it only few weeks and now I am trying to parse something like a price list from incoming string. It has the next format:
2.99 X 3.00 = 10 A
Some text here
1.22 X 1.5 10 A
And the hardest part is that sometime A or some digit is missing but X should be in the place.
I would like to find out how it is possible to use regex in swift (or something like that if it does not exist) to write a template for parsing the next value
d.dd X d.d SomeValueIfExists
I would very appreciate any useful information, topics to read or any other resources to get more knowledge about swift.
PS. I have access to the dev. forums but I've never used them before.
I did an example recentl, and maybe a little harder than necessary, to demonstrate RegEx use in Swift:
let str1: NSString = "I run 12 miles"
let str2 = "I run 12 miles"
let match = str1.rangeOfString("\\d+", options: .RegularExpressionSearch)
let finalStr = str1.substringWithRange(match).toInt()
let n: Double = 2.2*Double(finalStr!)
let newStr = str2.stringByReplacingOccurrencesOfString("\\d+", withString: "\(n)", options: NSStringCompareOptions.RegularExpressionSearch, range: nil)
println(newStr) //I run 26.4 miles
Two of these have "RegularExpressionSearch". If you put this in a playground you can see what each line does. Note the double \ escapes. One for the normal RegEx use and anther because \ is a special character in Swift.
Also a good article:
http://benscheirman.com/2014/06/regex-in-swift/