final Pattern PATTERN = Pattern.compile("\"[^\"]*\"");
@Test
public void parseCsvTest() {
StringBuffer result = new StringBuffer();
Matcher m = null;
String csv="\"foo$\n" + "bar\"";
try {
m = PATTERN.matcher(csv);
while (m.find()) {
m.appendReplacement(result, m.group().replaceAll("\\R+", ""));
}
m.appendTail(result);
} catch (Exception e) {
e.printStackTrace();
}
String escaped_csv = result.toString();
log.info(escaped_csv);
}
With String csv="\"foo\n" + "bar\"";
I get the expected result: "foobar"
But with String csv="\"foo$\n" + "bar\""; (notice the $ character after foo), the pattern doesn't identify the group. Note: $ here is a literal character, not the end-of-line anchor, although it can be followed by a line break.
I tried PATTERN = Pattern.compile("\"[^\"]*^$?\""); without success: it returns foo and bar on two lines.
Any ideas?
Got it to work with: Pattern.compile("\"*[^$]|\"[^\"]*\"");
Results
csv = "\"foo\n" + "bar\n" + "doe\"" => foobardoe
csv = "\"foo$\n" + "bar\n" + "doe\"" => foo$bardoe
csv = "\"foo$\n" + "bar$\n" + "doe\"" => foo$bar$doe
csv = "\"foo$\n" + "bar$\n" + "doe$\"" => foo$bar$doe$
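A likely cause of the original failure is the replacement string rather than the pattern: Matcher.appendReplacement treats $ in its second argument as a group reference, so a replacement such as "foo$bar" throws IllegalArgumentException, which the catch block above swallows. A minimal sketch that keeps the original pattern and escapes the replacement with Matcher.quoteReplacement follows; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedNewlineStripper {
    // Matches one double-quoted field, including any line breaks inside it.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    // Removes line breaks that occur inside double-quoted fields.
    static String stripNewlinesInQuotes(String csv) {
        StringBuffer result = new StringBuffer();
        Matcher m = QUOTED.matcher(csv);
        while (m.find()) {
            String cleaned = m.group().replaceAll("\\R+", "");
            // quoteReplacement escapes '$' and '\' so they are copied literally
            // instead of being parsed as group references.
            m.appendReplacement(result, Matcher.quoteReplacement(cleaned));
        }
        m.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripNewlinesInQuotes("\"foo$\n" + "bar\""));
    }
}
```

With this change the simpler pattern "\"[^\"]*\"" can stay as it is, and the surrounding quotes are preserved in the output.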
Related
I have a string and I have to filter the following:
"#Subject = \"#hb\" + #uv_EmployeeID + \" fdsaas\" + #test"
I need to extract only #uv_EmployeeID and #test, not the values inside the inner double quotes.
This works: new Regex(@"[^""]#{1}[a-zA-Z_]+");
You just have to remove the first character from each match, like this:
var reg = new Regex(@"[^""]#{1}[a-zA-Z_]+");
var matches = reg.Matches("#Subject = \"#hb\" + #uv_EmployeeID + \" fdsaas\" + #test");
var empId = matches[0].Value.Substring(1); // #uv_EmployeeID
var test = matches[1].Value.Substring(1); // #test
I would like to match the following text in my Word file, but I am not sure how to use Pattern and Matcher:
(P // TRIF)
(P)
(U//TRIF)
(U)
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class ExtractDemo {
public static void main(String[] args) {
String input = "I have a ( U) but I (P) like my (P//TRIF) better (U//TRIF).";
Pattern p = Pattern.compile("(P|U|P//TRIF|U//TRIF)");
Matcher m = p.matcher(input);
List<String> animals = new ArrayList<String>();
while (m.find()) {
System.out.println("Found a " + m.group() + ".");
animals.add(m.group());
}
}
}
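Your regex matches U, P, P, U, because alternation tries the alternatives from left to right and the single letters P and U succeed before the longer alternatives are ever tried.
If you would like to match (P // TRIF) or (P) or (U//TRIF) or (U), you could change the order in your alternation to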
(P//TRIF|U//TRIF|P|U)
If you want to capture the text including the surrounding parenthesis in a group, you could try:
(\(\s*(?:P|U|P//TRIF|U//TRIF)\))
public static void main(String args[])
{
String input = "I have a ( U) but I (P) like my (P//TRIF) better (U//TRIF).";
Pattern p = Pattern.compile("(\\(\\s*(?:P|U|P//TRIF|U//TRIF)\\))");
Matcher m = p.matcher(input);
List<String> animals = new ArrayList<String>();
while (m.find()) {
System.out.println("Found a " + m.group() + ".");
animals.add(m.group());
}
}
Another way to match this could be
\(\s*[PU](?://TRIF)?\)
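The compact character-class pattern can be tried out as below. This is only a sketch: the class and method names are hypothetical, and the extra \s* around // and before the closing parenthesis are an assumption added so that the spaced form (P // TRIF) from the question also matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MarkerExtractor {
    // "(" + optional spaces + P or U + optional "//TRIF" (spaces tolerated) + ")"
    private static final Pattern MARKER =
            Pattern.compile("\\(\\s*[PU](?:\\s*//\\s*TRIF)?\\s*\\)");

    // Returns every marker found in the input, parentheses included.
    static List<String> findMarkers(String input) {
        List<String> markers = new ArrayList<>();
        Matcher m = MARKER.matcher(input);
        while (m.find()) {
            markers.add(m.group());
        }
        return markers;
    }

    public static void main(String[] args) {
        String input = "I have a ( U) but I (P) like my (P//TRIF) better (U//TRIF).";
        for (String marker : findMarkers(input)) {
            System.out.println("Found a " + marker + ".");
        }
    }
}
```

Because the parentheses are part of the match, a bare P or U elsewhere in the sentence is not picked up.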
I have a Scala method that will be given a String like so:
"blah blah sediejdri \"foos\": {\"fizz\": \"buzz\"}, odedrfj49 blah"
And I need to strip the "foos JSON" out of it using pure Java/Scala (no external libs). That is, find the substring matching the pattern:
\"foos\" : {ANYTHING},
...and strip it out, so that the input string is now:
"blah blah sediejdri odedrfj49 blah"
The token to search for will always be \"foos\", but the content inside the JSON curly braces will always be different. My best attempt is:
// Ex: "blah \"foos\": { flim flam }, blah blah" ==> "blah blah blah", etc.
def stripFoosJson(toClean: String): String = {
val regex = ".*\"foos\" {.*},.*"
toClean.replaceAll(regex, "")
}
However, my regex is clearly not correct. Can anyone spot where I'm going awry?
Here are 2 solutions I came up with, hope it helps. I think you forgot to handle possible spaces with \s* etc.
object JsonStrip extends App {
// SOLUTION 1, hard way, handles nested braces also:
def findClosingParen(text: String, openPos: Int): Int = {
var closePos = openPos
var parensCounter = 1 // if (parensCounter == 0) it's a match!
while (parensCounter > 0 && closePos < text.length - 1) {
closePos += 1
val c = text(closePos)
if (c == '{') {
parensCounter += 1
} else if (c == '}') {
parensCounter -= 1
}
}
if (parensCounter == 0) closePos else openPos
}
val str = "blah blah sediejdri \"foos\": {\"fizz\": \"buzz\"}, odedrfj49 blah"
val indexOfFoos = str.indexOf("\"foos\"")
val indexOfFooOpenBrace = str.indexOf('{', indexOfFoos)
val indexOfFooCloseBrace = findClosingParen(str, indexOfFooOpenBrace)
// here you would handle if the brace IS found etc...
val stripped = str.substring(0, indexOfFoos) + str.substring(indexOfFooCloseBrace + 2)
println("WITH BRACE COUNT: " + stripped)
// SOLUTION 2, with regex:
val reg = "\"foos\"\\s*:\\s*\\{(.*)\\}\\s*,\\s*"
println("WITH REGEX: " + str.replaceAll(reg, ""))
}
The regex "foos": \{(.*?)\} should match what you want; the lazy .*? stops at the first closing brace. Inside a Java or Scala string literal, write each " as \" and each backslash as \\. If your JSON can contain nested curly brackets, you can use "foos": (\{(?>[^{}]|(?1))*\}) instead, which uses recursion to match balanced groups of brackets. Note that this one only works in PCRE-style regex engines; others, including java.util.regex, don't support recursion.
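For the non-nested case, the lazy variant can be sketched in plain Java as follows (the same pattern string works from Scala). The class and method names are hypothetical, and the sketch assumes the "foos" object contains no inner braces; nested content would need the brace-counting approach instead.

```java
import java.util.regex.Pattern;

public class FoosStripper {
    // "foos" + optional spaces + ":" + optional spaces + "{...}" matched lazily,
    // followed by the trailing comma and any whitespace after it.
    private static final Pattern FOOS =
            Pattern.compile("\"foos\"\\s*:\\s*\\{.*?\\}\\s*,\\s*");

    // Removes the "foos": {...}, fragment from the input.
    static String stripFoos(String toClean) {
        return FOOS.matcher(toClean).replaceAll("");
    }

    public static void main(String[] args) {
        String str = "blah blah sediejdri \"foos\": {\"fizz\": \"buzz\"}, odedrfj49 blah";
        System.out.println(stripFoos(str));
    }
}
```

The lazy quantifier matters when another closing brace appears later in the string: a greedy .* would run to the last } that is followed by a comma and remove everything in between.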
I'm checking an array of strings for a specific combination of patterns. I'm having trouble using Meteor's Match function and regex literal together. I want to check if the second string in the array is a url.
addCheck = function(line) {
var firstString = _.first(line);
var secondString = line[1]; // second element; _.indexOf(line, 1) would return the index of the value 1
console.log(secondString);
var urlRegEx = /((([A-Za-z]{3,9}:(?:\/\/)?)(?:[\-;:&=\+\$,\w]+#)?[A-Za-z0-9\.\-]+|(?:www\.|[\-;:&=\+\$,\w]+#)[A-Za-z0-9\.\-]+)((?:\/[\+~%\/\.\w\-]*)?\??(?:[\-\+=&;%#\.\w]*)#?(?:[\.\!\/\\\w]*))?)/g;
if ( firstString == "+" && Match.test(secondString, urlRegEx) === true ) {
console.log( "detected: + | line = " + line )
} else {
// do stuff if we don't detect a
console.log( "line = " + line );
}
}
Any help would be appreciated.
Match.test is used to test the structure of a variable. For example: "it's an array of strings, or an object including the field createdAt", etc.
RegExp.test on the other hand, is used to test if a given string matches a regular expression. That looks like what you want.
Try something like this instead:
if ((firstString === '+') && urlRegEx.test(secondString)) {
...
}
For example,
string="help/nsomething/ncrayons"
Output:
String word count is: 3
This is what I have, but the program is looping through the method several times and it looks like I am only getting the last string created. Here's the code block:
Regex regx = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*", RegexOptions.IgnoreCase);
MatchCollection matches = regx.Matches(output);
//int counte = 0;
foreach (Match match in matches)
{
//counte = counte + 1;
links = links + match.Value + '\n';
if (links != null)
{
string myString = links;
string[] words = Regex.Split(myString, @"\n");
word_count.Text = words.Length.ToString();
}
}
A newline is written \n, not /n.
Not sure if regex is a must for your case but you could use split:
string myString = "help/nsomething/ncrayons";
string[] separator = new string[] { "/n" };
string[] result = myString.Split(separator, StringSplitOptions.None);
MessageBox.Show(result.Length.ToString());
Another way using regex:
string myString = "help/nsomething/ncrayons";
string[] words = Regex.Split(myString, @"/n");
word_count.Text = words.Length.ToString();
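The same split-and-count idea, sketched in Java for comparison with the other examples on this page. The class and method names are hypothetical; the separator is passed in so that either the literal "/n" from the example or a real "\n" can be used.

```java
import java.util.regex.Pattern;

public class SegmentCounter {
    // Counts the pieces of text separated by the given literal separator.
    static int countSegments(String text, String separator) {
        if (text.isEmpty()) {
            return 0;
        }
        // Pattern.quote makes the separator literal; the -1 limit keeps
        // trailing empty segments so that "a/n" counts as two pieces.
        return text.split(Pattern.quote(separator), -1).length;
    }

    public static void main(String[] args) {
        System.out.println("String word count is: "
                + countSegments("help/nsomething/ncrayons", "/n"));
    }
}
```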