Splitting a string into parts, including quoted strings - regex

So suppose I have this line:
print "Hello world!" out.txt
And I want to split it into:
print
"Hello world!"
out.txt
What would be the regular expression to match these?
Note that there must be a space between each of them. For example, if I had this:
print"Hello world!"out.txt
I would get:
print"Hello
world!"out.txt
The language I'm using is Haxe.

Expanding on Mark Knol's answer, this should work as expected for all your test strings so far:
class Test {
    static function main() {
        var command = 'print "Hello to you world!" out.txt';
        // Match either a double-quoted run or a run of non-whitespace characters.
        var regexp:EReg = ~/("[^"]+"|[^\s]+)/g;
        var result = [];
        var pos = 0;
        // Collect each match, then resume searching just past it.
        while (regexp.matchSub(command, pos)) {
            result.push(regexp.matched(0));
            var match = regexp.matchedPos();
            pos = match.pos + match.len;
        }
        trace(result);
    }
}
Demo: http://try.haxe.org/#5c0B1
EDIT:
As pointed out in the comments, if your use case is splitting the parts of a command line, it would be better to let a parser handle it rather than a regex.
These libs might help:
https://github.com/Simn/hxargs
https://github.com/Ohmnivore/HxCLAP
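For readers more familiar with TypeScript, the same match-the-tokens idea can be sketched as below. This is only an illustration and not part of the Haxe answer: the function name `tokenize` is made up here, and the pattern is the one from the answer with `[^\s]` written as `\S`.

```typescript
// Hypothetical helper: collect every token that is either a double-quoted
// run or a run of non-whitespace characters.
function tokenize(command: string): string[] {
  const matches = command.match(/"[^"]+"|\S+/g);
  return matches === null ? [] : Array.from(matches);
}

// Quoted text stays together: print, "Hello to you world!", out.txt
console.log(tokenize('print "Hello to you world!" out.txt'));
// Without surrounding spaces the quotes are not token boundaries,
// which matches the behaviour the question asks for.
console.log(tokenize('print"Hello world!"out.txt'));
```

Matching tokens rather than splitting on separators means the length of the quoted string does not matter.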

You can use regular expressions in Haxe using the EReg API class:
Demo:
http://try.haxe.org/#76Ea0
class Test {
    static function main() {
        var command = 'print "Hello world!" out.txt';
        var regexp:EReg = ~/\s(?![\w!.]+")/g;
        var result = regexp.replace(command, "\n");
        js.Browser.alert(result);
    }
}
About Haxe regular expressions:
http://haxe.org/manual/std-regex.html
About regular expressions replacement:
http://haxe.org/manual/std-regex-replace.html
EReg class API documentation:
http://api.haxe.org/EReg.html
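To see what the lookahead pattern above does and where it stops working, here is a small TypeScript sketch. It is an illustration only: `splitOnLookahead` is a made-up name, and `String.prototype.split` stands in for the Haxe `replace` call.

```typescript
// Hypothetical helper: split on a whitespace character unless the run of
// word characters, "!" or "." that follows it ends in a double quote.
function splitOnLookahead(command: string): string[] {
  return command.split(/\s(?![\w!.]+")/);
}

// Two-word quoted string: the space before world!" is protected.
console.log(splitOnLookahead('print "Hello world!" out.txt'));
// Longer quoted string: only the last space inside the quotes is protected,
// so the earlier ones still split.
console.log(splitOnLookahead('print "Hello to you world!" out.txt'));
```

The lookahead only inspects the single word after each space, so it protects just the last space inside a quoted string.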

Regex demo:
\s(?![\w!.]+"\s)
This worked for these two cases; maybe someone has a better solution.


Typescript regex exclude whole string if followed by specific string

I've been running into weird issues with regex and TypeScript. I'm trying to have my expression replace the value test, except for the first instance when it is followed by another test. In other words, replace test in the first two lines below, but in the third line replace only the second test.
[test]
[test].[db]
[test].[test]
Where it should look like:
[newvalue]
[newvalue].[db]
[test].[newvalue]
I've come up with lots of variations but this is the one that I thought was simple enough to solve it and regex101 can confirm this works:
\[(\w+)\](?!\.\[test\])
But when using Typescript (custom task in VSTS build), it actually replaces the values like this:
[newvalue]
[newvalue].[db]
[newvalue].[test]
Update: A regex like (test)(?!.test) also breaks when I change the test cases to remove the square brackets, which makes me think the problem is somewhere in the code. Could the problem be with the index that the value is replaced at?
Some of the code in Typescript that is calling this:
var filePattern = tl.getInput("filePattern", true);
var tokenRegex = tl.getInput("tokenRegex", true);
for (var i = 0; i < files.length; i++) {
    var file = files[i];
    console.info(`Starting regex replacement in [${file}]`);
    var contents = fs.readFileSync(file).toString();
    var reg = new RegExp(tokenRegex, "g");
    // loop through each match
    var match: RegExpExecArray;
    // keep a separate var for the contents so that the regex index doesn't get messed up
    // by replacing items underneath it
    var newContents = contents;
    while ((match = reg.exec(contents)) !== null) {
        var vName = match[1];
        // find the variable value in the environment
        var vValue = tl.getVariable(vName);
        if (typeof vValue === 'undefined') {
            tl.warning(`Token [${vName}] does not have an environment value`);
        } else {
            newContents = newContents.replace(match[0], vValue);
            console.info(`Replaced token [${vName}]`);
        }
    }
}
The full code for the task I'm using this with: https://github.com/colindembovsky/cols-agent-tasks/blob/master/Tasks/ReplaceTokens/replaceTokens.ts
For me, this regex works the way you expect:
\[(test)\](?!\.\[test\])
with TypeScript code like this:
myString.replace(/\[(test)\](?!\.\[test\])/g, "[newvalue]");
The regex you are using, by contrast, also matches the [db] part, because \w+ matches any token name.
I've tried with this code:
class Greeter {
    myString1: string;
    myString2: string;
    myString3: string;
    greeting: string;
    constructor(str1: string, str2: string, str3: string) {
        this.myString1 = str1.replace(/\[(test)\](?!\.\[test\])/g, "[newvalue]");
        this.myString2 = str2.replace(/\[(test)\](?!\.\[test\])/g, "[newvalue]");
        this.myString3 = str3.replace(/\[(test)\](?!\.\[test\])/g, "[newvalue]");
        this.greeting = this.myString1 + "\n" + this.myString2 + "\n" + this.myString3;
    }
    greet() {
        return "Hello, these are your replacements:\n" + this.greeting;
    }
}
let greeter = new Greeter("[test]", "[test].[db]", "[test].[test]");
let button = document.createElement('button');
button.textContent = "Say Hello";
button.onclick = function() {
    alert(greeter.greet());
}
document.body.appendChild(button);
Online playground here.
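One possible explanation for the behaviour in the question, offered as a sketch rather than a confirmed diagnosis of the linked task: `newContents.replace(match[0], vValue)` passes a string as the pattern, and `String.prototype.replace` with a string pattern replaces the first occurrence of that text, not the occurrence found at `match.index`. For `[test].[test]` the regex matches the second `[test]`, but the first one gets replaced. Passing the regex itself with a replacer callback avoids the problem. The names `replaceTokens` and `lookup` are made up for this example; `lookup` stands in for `tl.getVariable`.

```typescript
// Hypothetical helper: replace each regex match in place, using capture
// group 1 as the variable name. Tokens without a value are left untouched.
function replaceTokens(
  contents: string,
  tokenRegex: string,
  lookup: (name: string) => string | undefined
): string {
  const reg = new RegExp(tokenRegex, "g");
  return contents.replace(reg, (whole: string, name: string) => {
    const value = lookup(name);
    return value === undefined ? whole : value;
  });
}

// Assumed sample data: only "test" has a value.
const sampleVars: Record<string, string> = { test: "[newvalue]" };
const sampleInput = "[test]\n[test].[db]\n[test].[test]";
console.log(
  replaceTokens(sampleInput, "\\[(\\w+)\\](?!\\.\\[test\\])", (n) => sampleVars[n])
);
```

Because the callback receives each match at its own position, text that the lookahead excluded is never touched.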

phrase search in meteor search-source package

I have a Meteor app to which I added the search-source package to search certain collections, and it works partially. That is, when I search for the term foo bar it returns results for each of "foo" and "bar". This is fine, but I also want to be able to wrap the terms in quotes, like "foo bar", and get results for an exact match only. At the moment, when I do this I get an empty set. Here is my server code:
//Server.js
SearchSource.defineSource('FruitBasket', function(searchText, options) {
    // options = options || {}; // to be sure that options is at least an empty object
    if (searchText) {
        var regExp = buildRegExp(searchText);
        var selector = {$or: [
            {'fruit.name': regExp},
            {'fruit.season': regExp},
            {'fruit.treeType': regExp}
        ]};
        return Basket.find(selector, options).fetch();
    } else {
        return Basket.find({}, options).fetch();
    }
});
function buildRegExp(searchText) {
    // this is a dumb implementation
    var parts = searchText.trim().split(/[ \-\:]+/);
    return new RegExp("(" + parts.join('|') + ")", "ig");
}
and my client code:
//Client.js
Template.dispResults.helpers({
    getPackages_fruit: function() {
        return PackageSearch_fruit.getData({
            transform: function(matchText, regExp) {
                return matchText.replace(regExp, "<b>$&</b>")
            },
            sort: {isoScore: -1}
        });
    }
});
Thanks in advance!
I've modified the .split pattern so that it ignores everything between double quotes.
/[ \-\:]+(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)/
Thus, you can simply wrap an exact phrase search in double quotes and it won't get split.
There is one more thing; since we don't need the quotes, they are removed in the next line using a .map function with a regex that replaces double quotes at the start or the end of a string part: /^"|"$/
Sample code:
function buildRegExp(searchText) {
    // exact phrase search in double quotes won't get split
    var arr = searchText.trim().split(/[ \-\:]+(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)/);
    var parts = arr.map(function(x){return x.replace(/^"|"$/g, '');});
    return new RegExp("(" + parts.join('|') + ")", "ig");
}
console.log(buildRegExp("foo bar"));
console.log(buildRegExp("\"foo bar\""));
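The sample above inserts the search terms into the pattern as they are, so a term containing regex metacharacters (for example `c++`) would change the pattern's meaning or make it invalid. The TypeScript sketch below adds an escaping step. It rests on assumptions: `escapeRegExp` and `buildSearchRegExp` are made-up names, and whether escaping is wanted depends on whether users are meant to type raw regex syntax.

```typescript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical variant of buildRegExp: split outside double quotes,
// strip the surrounding quotes, drop empty parts, escape each part.
function buildSearchRegExp(searchText: string): RegExp {
  const parts = searchText
    .trim()
    // The lookahead requires an even number of quotes after the separator,
    // so separators inside a quoted phrase are skipped.
    .split(/[ \-:]+(?=(?:[^"]*"[^"]*")*[^"]*$)/)
    .map((part) => part.replace(/^"|"$/g, ""))
    .filter((part) => part.length > 0)
    .map(escapeRegExp);
  return new RegExp("(" + parts.join("|") + ")", "ig");
}

console.log(buildSearchRegExp("foo bar").source);
console.log(buildSearchRegExp('"foo bar"').source);
console.log(buildSearchRegExp('c++ "foo bar"').source);
```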

What is the equivalent of JavaScript's "s.replace(/[^\w]+/g, '-')" in the Dart language?

I am trying to get the following working code in JavaScript also working in Dart.
https://jsfiddle.net/8xyxy8jp/1/
var s = "We live, on the # planet earth";
var results = s.replace(/[^\w]+/g, '-');
document.getElementById("output").innerHTML = results;
Which gives the output
We-live-on-the-planet-earth
I have tried this Dart code
void main() {
    print( "We live, on the # planet earth".replaceAll("[^\w]+","-"));
}
But the output is the same as the input.
What am I missing here?
If you want replaceAll() to treat the argument as a regular expression, you need to pass a RegExp instance. I usually prefix the regex string with r to make it a raw string, in which no interpolation or escape processing ($, \, ...) takes place.
main() {
    var s = "We live, on the # planet earth";
    var result = s.replaceAll(new RegExp(r'[^\w]+'), '-');
    print(result);
}
Try it in DartPad

Regex- Get file name after last '\'

I have a file name like
C:\fakepath\CI_Logo.jpg.
I need a regex for getting CI_Logo.jpg. I tried \\[^\\]+$, but it didn't work.
Below is my JavaScript code:
var regex = "\\[^\\]+$";
var fileGet = $('input.clsFile').val();
var fileName = fileGet.match(regex);
alert(fileName);
Minimalist approach: demo
([\w\d_\.]+\.[\w\d]+)[^\\]
Use this (in Java):
String oldFileName = "slashed file name";
String[] fileNameWithPath = oldFileName.split("\\\\");
int pathLength = fileNameWithPath.length;
oldFileName = fileNameWithPath[pathLength - 1];
You can adapt this for other languages.
Edit:
Make sure you split with "\\\\" (four backslashes): the Java string literal reduces them to two, and the regex engine then reads those two as one literal backslash.
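A note on why the pattern in the question fails, with a sketch of one way around it. Inside a JavaScript string literal each `\\` collapses to a single backslash, so the engine receives `\[^\]+$`, where the brackets are escaped literals rather than a character class. A regex literal avoids the extra layer of escaping. The function name `fileNameOf` is made up for this example, and the sketch handles backslash separators only.

```typescript
// Hypothetical helper: return everything after the last backslash,
// or an empty string if the path ends with a backslash.
function fileNameOf(path: string): string {
  // [^\\]+$ matches the final run of characters that are not backslashes.
  const match = path.match(/[^\\]+$/);
  return match === null ? "" : match[0];
}

console.log(fileNameOf("C:\\fakepath\\CI_Logo.jpg")); // CI_Logo.jpg
```

Leaving the leading `\\` out of the pattern keeps the separator out of the result.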

URL rewriting with regular expressions

I want to extract 2 pieces of information from the URL below, "events/festivals" and "sandiego.storeboard.com".
How can I do that with a regular expression?
http://sandiego.storeboard.com/classifieds/events/festivals
I need this information for a URL rewrite in IIS 7
Try this:
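^http://([^/]*)/classifieds/([^/]*/[^/]*)/?
The [^/] snippet means "any character that is not a /". The final /? makes the trailing slash optional, so the sample URL, which has none, still matches.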
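As a quick check of this kind of pattern outside IIS, here is a TypeScript sketch. It assumes the trailing slash is optional (the sample URL has none), and `extractParts` and `UrlParts` are made-up names. IIS URL Rewrite itself is configured differently, so this only illustrates what the two capture groups contain.

```typescript
interface UrlParts {
  host: string;
  path: string;
}

// Hypothetical helper: group 1 is the host, group 2 is the two path
// segments that follow /classifieds/.
function extractParts(url: string): UrlParts | null {
  const match = /^http:\/\/([^\/]*)\/classifieds\/([^\/]*\/[^\/]*)\/?/.exec(url);
  if (match === null) {
    return null;
  }
  return { host: match[1], path: match[2] };
}

console.log(extractParts("http://sandiego.storeboard.com/classifieds/events/festivals"));
```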
The following C# code will return the two strings that you requested.
using System;
using System.Text.RegularExpressions;
class Program
{
    static void Main(string[] args)
    {
        GroupCollection result = GetResult("http://sandiego.storeboard.com/classifieds/events/festivals");
        Console.Write(result[1] + " " + result[2]);
        Console.ReadLine();
    }
    private static GroupCollection GetResult(string url)
    {
        // Verbatim string (@"...") so the backslashes reach the regex engine unchanged.
        string reg = @".*?(\w+\.\w+\.com).*?(events\/festivals)";
        return Regex.Match(url, reg).Groups;
    }
}
It's not the fastest solution but it works:
(.*?)/classifieds/(.*)