How to split string on comma, except when comma is parameter separator or part of string [duplicate] - regex

Given
2,1016,7/31/2008 14:22,Geoff Dalgas,6/5/2011 22:21,http://stackoverflow.com,"Corvallis, OR",7679,351,81,b437f461b3fd27387c5d8ab47a293d35,34
How to use C# to split the above information into strings as follows:
2
1016
7/31/2008 14:22
Geoff Dalgas
6/5/2011 22:21
http://stackoverflow.com
Corvallis, OR
7679
351
81
b437f461b3fd27387c5d8ab47a293d35
34
As you can see one of the column contains , <= (Corvallis, OR)
Based on
C# Regex Split - commas outside quotes
string[] result = Regex.Split(samplestring, ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

Use the Microsoft.VisualBasic.FileIO.TextFieldParser class. This will handle parsing a delimited file, TextReader or Stream where some fields are enclosed in quotes and some are not.
For example:
using Microsoft.VisualBasic.FileIO;
string csv = "2,1016,7/31/2008 14:22,Geoff Dalgas,6/5/2011 22:21,http://stackoverflow.com,\"Corvallis, OR\",7679,351,81,b437f461b3fd27387c5d8ab47a293d35,34";
TextFieldParser parser = new TextFieldParser(new StringReader(csv));
// You can also read from a file
// TextFieldParser parser = new TextFieldParser("mycsvfile.csv");
parser.HasFieldsEnclosedInQuotes = true;
parser.SetDelimiters(",");
string[] fields;
while (!parser.EndOfData)
{
fields = parser.ReadFields();
foreach (string field in fields)
{
Console.WriteLine(field);
}
}
parser.Close();
This should result in the following output:
2
1016
7/31/2008 14:22
Geoff Dalgas
6/5/2011 22:21
http://stackoverflow.com
Corvallis, OR
7679
351
81
b437f461b3fd27387c5d8ab47a293d35
34
See Microsoft.VisualBasic.FileIO.TextFieldParser for more information.
You need to add a reference to Microsoft.VisualBasic in the Add References .NET tab.

It is so much late but this can be helpful for someone. We can use RegEx as bellow.
Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
String[] Fields = CSVParser.Split(Test);

I see that if you paste csv delimited text in Excel and do a "Text to Columns", it asks you for a "text qualifier". It's defaulted to a double quote so that it treats text within double quotes as literal. I imagine that Excel implements this by going one character at a time, if it encounters a "text qualifier", it keeps going to the next "qualifier". You can probably implement this yourself with a for loop and a boolean to denote if you're inside literal text.
public string[] CsvParser(string csvText)
{
List<string> tokens = new List<string>();
int last = -1;
int current = 0;
bool inText = false;
while(current < csvText.Length)
{
switch(csvText[current])
{
case '"':
inText = !inText; break;
case ',':
if (!inText)
{
tokens.Add(csvText.Substring(last + 1, (current - last)).Trim(' ', ','));
last = current;
}
break;
default:
break;
}
current++;
}
if (last != csvText.Length - 1)
{
tokens.Add(csvText.Substring(last+1).Trim());
}
return tokens.ToArray();
}

You could split on all commas that do have an even number of quotes following them.
You would also like to view at the specf for CSV format about handling comma's.
Useful Link : C# Regex Split - commas outside quotes

Use a library like LumenWorks to do your CSV reading. It'll handle fields with quotes in them and will likely overall be more robust than your custom solution by virtue of having been around for a long time.

It is a tricky matter to parse .csv files when the .csv file could be either comma separated strings, comma separated quoted strings, or a chaotic combination of the two. The solution I came up with allows for any of the three possibilities.
I created a method, ParseCsvRow() which returns an array from a csv string. I first deal with double quotes in the string by splitting the string on double quotes into an array called quotesArray. Quoted string .csv files are only valid if there is an even number of double quotes. Double quotes in a column value should be replaced with a pair of double quotes (This is Excel's approach). As long as the .csv file meets these requirements, you can expect the delimiter commas to appear only outside of pairs of double quotes. Commas inside of pairs of double quotes are part of the column value and should be ignored when splitting the .csv into an array.
My method will test for commas outside of double quote pairs by looking only at even indexes of the quotesArray. It also removes double quotes from the start and end of column values.
public static string[] ParseCsvRow(string csvrow)
{
const string obscureCharacter = "ᖳ";
if (csvrow.Contains(obscureCharacter)) throw new Exception("Error: csv row may not contain the " + obscureCharacter + " character");
var unicodeSeparatedString = "";
var quotesArray = csvrow.Split('"'); // Split string on double quote character
if (quotesArray.Length > 1)
{
for (var i = 0; i < quotesArray.Length; i++)
{
// CSV must use double quotes to represent a quote inside a quoted cell
// Quotes must be paired up
// Test if a comma lays outside a pair of quotes. If so, replace the comma with an obscure unicode character
if (Math.Round(Math.Round((decimal) i/2)*2) == i)
{
var s = quotesArray[i].Trim();
switch (s)
{
case ",":
quotesArray[i] = obscureCharacter; // Change quoted comma seperated string to quoted "obscure character" seperated string
break;
}
}
// Build string and Replace quotes where quotes were expected.
unicodeSeparatedString += (i > 0 ? "\"" : "") + quotesArray[i].Trim();
}
}
else
{
// String does not have any pairs of double quotes. It should be safe to just replace the commas with the obscure character
unicodeSeparatedString = csvrow.Replace(",", obscureCharacter);
}
var csvRowArray = unicodeSeparatedString.Split(obscureCharacter[0]);
for (var i = 0; i < csvRowArray.Length; i++)
{
var s = csvRowArray[i].Trim();
if (s.StartsWith("\"") && s.EndsWith("\""))
{
csvRowArray[i] = s.Length > 2 ? s.Substring(1, s.Length - 2) : ""; // Remove start and end quotes.
}
}
return csvRowArray;
}
One downside of my approach is the way I temporarily replace delimiter commas with an obscure unicode character. This character needs to be so obscure, it would never show up in your .csv file. You may want to put more handling around this.

This question and its duplicates have a lot of answers. I tried this one that looked promising, but found some bugs in it. I heavily modified it so that it would pass all of my tests.
/// <summary>
/// Returns a collection of strings that are derived by splitting the given source string at
/// characters given by the 'delimiter' parameter. However, a substring may be enclosed between
/// pairs of the 'qualifier' character so that instances of the delimiter can be taken as literal
/// parts of the substring. The method was originally developed to split comma-separated text
/// where quotes could be used to qualify text that contains commas that are to be taken as literal
/// parts of the substring. For example, the following source:
/// A, B, "C, D", E, "F, G"
/// would be split into 5 substrings:
/// A
/// B
/// C, D
/// E
/// F, G
/// When enclosed inside of qualifiers, the literal for the qualifier character may be represented
/// by two consecutive qualifiers. The two consecutive qualifiers are distinguished from a closing
/// qualifier character. For example, the following source:
/// A, "B, ""C"""
/// would be split into 2 substrings:
/// A
/// B, "C"
/// </summary>
/// <remarks>Originally based on: https://stackoverflow.com/a/43284485/2998072</remarks>
/// <param name="source">The string that is to be split</param>
/// <param name="delimiter">The character that separates the substrings</param>
/// <param name="qualifier">The character that is used (in pairs) to enclose a substring</param>
/// <param name="toTrim">If true, then whitespace is removed from the beginning and end of each
/// substring. If false, then whitespace is preserved at the beginning and end of each substring.
/// </param>
public static List<String> SplitQualified(this String source, Char delimiter, Char qualifier,
Boolean toTrim)
{
// Avoid throwing exception if the source is null
if (String.IsNullOrEmpty(source))
return new List<String> { "" };
var results = new List<String>();
var result = new StringBuilder();
Boolean inQualifier = false;
// The algorithm is designed to expect a delimiter at the end of each substring, but the
// expectation of the caller is that the final substring is not terminated by delimiter.
// Therefore, we add an artificial delimiter at the end before looping through the source string.
String sourceX = source + delimiter;
// Loop through each character of the source
for (var idx = 0; idx < sourceX.Length; idx++)
{
// If current character is a delimiter
// (except if we're inside of qualifiers, we ignore the delimiter)
if (sourceX[idx] == delimiter && inQualifier == false)
{
// Terminate the current substring by adding it to the collection
// (trim if specified by the method parameter)
results.Add(toTrim ? result.ToString().Trim() : result.ToString());
result.Clear();
}
// If current character is a qualifier
else if (sourceX[idx] == qualifier)
{
// ...and we're already inside of qualifier
if (inQualifier)
{
// check for double-qualifiers, which is escape code for a single
// literal qualifier character.
if (idx + 1 < sourceX.Length && sourceX[idx + 1] == qualifier)
{
idx++;
result.Append(sourceX[idx]);
continue;
}
// Since we found only a single qualifier, that means that we've
// found the end of the enclosing qualifiers.
inQualifier = false;
continue;
}
else
// ...we found an opening qualifier
inQualifier = true;
}
// If current character is neither qualifier nor delimiter
else
result.Append(sourceX[idx]);
}
return results;
}
Here are the test methods to prove that it works:
[TestMethod()]
public void SplitQualified_00()
{
// Example with no substrings
String s = "";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "" }, substrings);
}
[TestMethod()]
public void SplitQualified_00A()
{
// just a single delimiter
String s = ",";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "", "" }, substrings);
}
[TestMethod()]
public void SplitQualified_01()
{
// Example with no whitespace or qualifiers
String s = "1,2,3,1,2,3";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_02()
{
// Example with whitespace and no qualifiers
String s = " 1, 2 ,3, 1 ,2\t, 3 ";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_03()
{
// Example with whitespace and no qualifiers
String s = " 1, 2 ,3, 1 ,2\t, 3 ";
// whitespace should be preserved
var substrings = s.SplitQualified(',', '"', false);
CollectionAssert.AreEquivalent(
new List<String> { " 1", " 2 ", "3", " 1 ", "2\t", " 3 " },
substrings);
}
[TestMethod()]
public void SplitQualified_04()
{
// Example with no whitespace and trivial qualifiers.
String s = "1,\"2\",3,1,2,\"3\"";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
s = "\"1\",\"2\",3,1,\"2\",3";
substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_05()
{
// Example with no whitespace and qualifiers that enclose delimiters
String s = "1,\"2,2a\",3,1,2,\"3,3a\"";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2,2a", "3", "1", "2", "3,3a" },
substrings);
s = "\"1,1a\",\"2,2b\",3,1,\"2,2c\",3";
substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1,1a", "2,2b", "3", "1", "2,2c", "3" },
substrings);
}
[TestMethod()]
public void SplitQualified_06()
{
// Example with qualifiers enclosing whitespace but no delimiter
String s = "\" 1 \",\"2 \",3,1,2,\"\t3\t\"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" },
substrings);
}
[TestMethod()]
public void SplitQualified_07()
{
// Example with qualifiers enclosing whitespace but no delimiter
String s = "\" 1 \",\"2 \",3,1,2,\"\t3\t\"";
// whitespace should be preserved
var substrings = s.SplitQualified(',', '"', false);
CollectionAssert.AreEquivalent(new List<String> { " 1 ", "2 ", "3", "1", "2", "\t3\t" },
substrings);
}
[TestMethod()]
public void SplitQualified_08()
{
// Example with qualifiers enclosing whitespace but no delimiter; also whitespace btwn delimiters
String s = "\" 1 \", \"2 \" , 3,1, 2 ,\" 3 \"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" },
substrings);
}
[TestMethod()]
public void SplitQualified_09()
{
// Example with qualifiers enclosing whitespace but no delimiter; also whitespace btwn delimiters
String s = "\" 1 \", \"2 \" , 3,1, 2 ,\" 3 \"";
// whitespace should be preserved
var substrings = s.SplitQualified(',', '"', false);
CollectionAssert.AreEquivalent(new List<String> { " 1 ", " 2 ", " 3", "1", " 2 ", " 3 " },
substrings);
}
[TestMethod()]
public void SplitQualified_10()
{
// Example with qualifiers enclosing whitespace and delimiter
String s = "\" 1 \",\"2 , 2b \",3,1,2,\" 3,3c \"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2 , 2b", "3", "1", "2", "3,3c" },
substrings);
}
[TestMethod()]
public void SplitQualified_11()
{
// Example with qualifiers enclosing whitespace and delimiter; also whitespace btwn delimiters
String s = "\" 1 \", \"2 , 2b \" , 3,1, 2 ,\" 3,3c \"";
// whitespace should be preserved
var substrings = s.SplitQualified(',', '"', false);
CollectionAssert.AreEquivalent(new List<String> { " 1 ", " 2 , 2b ", " 3", "1", " 2 ", " 3,3c " },
substrings);
}
[TestMethod()]
public void SplitQualified_12()
{
// Example with tab characters between delimiters
String s = "\t1,\t2\t,3,1,\t2\t,\t3\t";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_13()
{
// Example with newline characters between delimiters
String s = "\n1,\n2\n,3,1,\n2\n,\n3\n";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_14()
{
// Example with qualifiers enclosing whitespace and delimiter, plus escaped qualifier
String s = "\" 1 \",\"\"\"2 , 2b \"\"\",3,1,2,\" \"\"3,3c \"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "\"2 , 2b \"", "3", "1", "2", "\"3,3c" },
substrings);
}
[TestMethod()]
public void SplitQualified_14A()
{
// Example with qualifiers enclosing whitespace and delimiter, plus escaped qualifier
String s = "\"\"\"1\"\"\"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "\"1\"" },
substrings);
}
[TestMethod()]
public void SplitQualified_15()
{
// Instead of comma-delimited and quote-qualified, use pipe and hash
// Example with no whitespace or qualifiers
String s = "1|2|3|1|2,2f|3";
var substrings = s.SplitQualified('|', '#', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2,2f", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_16()
{
// Instead of comma-delimited and quote-qualified, use pipe and hash
// Example with qualifiers enclosing whitespace and delimiter
String s = "# 1 #|#2 | 2b #|3|1|2|# 3|3c #";
// whitespace should be removed
var substrings = s.SplitQualified('|', '#', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2 | 2b", "3", "1", "2", "3|3c" },
substrings);
}
[TestMethod()]
public void SplitQualified_17()
{
// Instead of comma-delimited and quote-qualified, use pipe and hash
// Example with qualifiers enclosing whitespace and delimiter; also whitespace btwn delimiters
String s = "# 1 #| #2 | 2b # | 3|1| 2 |# 3|3c #";
// whitespace should be preserved
var substrings = s.SplitQualified('|', '#', false);
CollectionAssert.AreEquivalent(new List<String> { " 1 ", " 2 | 2b ", " 3", "1", " 2 ", " 3|3c " },
substrings);
}

I had a problem with a CSV that contains fields with a quote character in them, so using the TextFieldParser, I came up with the following:
private static string[] parseCSVLine(string csvLine)
{
using (TextFieldParser TFP = new TextFieldParser(new MemoryStream(Encoding.UTF8.GetBytes(csvLine))))
{
TFP.HasFieldsEnclosedInQuotes = true;
TFP.SetDelimiters(",");
try
{
return TFP.ReadFields();
}
catch (MalformedLineException)
{
StringBuilder m_sbLine = new StringBuilder();
for (int i = 0; i < TFP.ErrorLine.Length; i++)
{
if (i > 0 && TFP.ErrorLine[i]== '"' &&(TFP.ErrorLine[i + 1] != ',' && TFP.ErrorLine[i - 1] != ','))
m_sbLine.Append("\"\"");
else
m_sbLine.Append(TFP.ErrorLine[i]);
}
return parseCSVLine(m_sbLine.ToString());
}
}
}
A StreamReader is still used to read the CSV line by line, as follows:
using(StreamReader SR = new StreamReader(FileName))
{
while (SR.Peek() >-1)
myStringArray = parseCSVLine(SR.ReadLine());
}

With Cinchoo ETL - an open source library, it can automatically handles columns values containing separators.
string csv = #"2,1016,7/31/2008 14:22,Geoff Dalgas,6/5/2011 22:21,http://stackoverflow.com,""Corvallis, OR"",7679,351,81,b437f461b3fd27387c5d8ab47a293d35,34";
using (var p = ChoCSVReader.LoadText(csv)
)
{
Console.WriteLine(p.Dump());
}
Output:
Key: Column1 [Type: String]
Value: 2
Key: Column2 [Type: String]
Value: 1016
Key: Column3 [Type: String]
Value: 7/31/2008 14:22
Key: Column4 [Type: String]
Value: Geoff Dalgas
Key: Column5 [Type: String]
Value: 6/5/2011 22:21
Key: Column6 [Type: String]
Value: http://stackoverflow.com
Key: Column7 [Type: String]
Value: Corvallis, OR
Key: Column8 [Type: String]
Value: 7679
Key: Column9 [Type: String]
Value: 351
Key: Column10 [Type: String]
Value: 81
Key: Column11 [Type: String]
Value: b437f461b3fd27387c5d8ab47a293d35
Key: Column12 [Type: String]
Value: 34
For more information, please visit codeproject article.
Hope it helps.

Related

how to create a filter to search for a word with special characters while writing in the input without special characters

it's my first post.
I work to Quasar (Vue.js)
I have list of jobs, and in this list, i have words with special caractere.
Ex :
[ ...{ "libelle": "Agent hôtelier" },{"libelle": "Agent spécialisé / Agente spécialisée des écoles maternelles -ASEM-"},{ "libelle": "Agriculteur / Agricultrice" },{ "libelle": "Aide aux personnes âgées" },{ "libelle": "Aide de cuisine" },...]
And on "input" i would like to search "Agent spécialisé" but i want to write "agent specialise" (without special caractere) or the initial name, i want to write both and autocomplete my "input".
I just don't fin the solution for add to my filter code ...
My input :
<q-select
filled
v-model="model"
use-input
hide-selected
fill-input
input-debounce="0"
:options="options"
hint="Votre métier"
style="width: 250px; padding-bottom: 32px"
#filter="filterFn"
>
</q-select>
</div>
My code :
export default {
props: ['data'],
data() {
return {
jobList: json,
model: '',
options: [],
stringOptions: []
}
},
methods: {
jsonJobsCall(e) {
this.stringOptions = []
json.forEach(res => {
this.stringOptions.push(res.libelle)
})
},
filterFn(val, update) {
if (val === '') {
update(() => {
this.jsonJobsCall(val)
this.options = this.stringOptions
})
return
}
update(() => {
const regex = /é/i
const needle = val.toLowerCase()
this.jsonJobsCall(val)
this.options = this.stringOptions.filter(
v => v.replace(regex, 'e').toLowerCase().indexOf(needle) > -1
)
})
},
}
}
To sum up : i need filter for write with or witouth special caractere in my input for found in my list the job which can contain a special character.
I hope i was clear, ask your questions if i haven't been.
Thanks you very much.
I am not sure if its work for you but you can use regex to create valid filter for your need. For example, when there is "e" letter you want to check "e" or "é" (If I understand correctly)
//Lets say we want to match "Agent spécialisé" with the given search text
let searchText = "Agent spe";
// Lets create a character map for matching characters
let characterMap = {
e: ['e', 'é'],
a: ['a', '#']
}
// Replacing special characters with a regex part which contains all equivelant characters
// !Remember replaceAll depricated
Object.keys(characterMap).forEach((key) => {
let replaceReg = new RegExp(`${key}`, "g")
searchText = searchText.replace(replaceReg, `[${characterMap[key].join("|")}]`);
})
// Here we create a regex to match
let reg = new RegExp(searchText + ".*")
console.log("Agent spécialisé".match(reg) != null);
Another approach could be the reverse of this. You can normalize "Agent spécialisé". (I mean replace all é with normal e with a regex like above) and store in the object along with the original text. But search on this normalized string instead of original.

Sort documents based on first character in field value

I have a set of data like this:
[{name: "ROBERT"}, {name: "PETER"}, {name: "ROBINSON"} , {name: "ABIGAIL"}]
I want to make a single mongodb query that can find:
Any data which name starts with letter "R" (regex: ^R)
Followed by any data which name contains letter "R" NOT AS THE FIRST CHARACTER, like: peteR, adleR, or caRl
so it produces:
[{name: "ROBERT"}, {name: "ROBINSON"}, {name: "PETER"}]
it basically just display any data that contains "R" character in it but I want to sort it so that data with "R" as the first character appears before the rest
So far I've come out with 2 separate query then followed by an operation to eliminate any duplicated results, then joined them. So is there any more efficient way to do this in mongo ?
What you want is add a weight to you documents and sort them accordingly.
First you need to select only those documents that $match your criteria using regular expressions.
To do that, you need to $project your documents and add the "weight" based on the value of the first character of your string using a logical $condition processing.
The condition here is $eq which add weight 1 to the document if the lowercase of the first character in the name is "r" or 0 if it's not.
Of course the $substr and the $toLower string aggregation operators respectively return the the first character in lowercase.
Finally you $sort your documents by weight in descending order.
db.coll.aggregate(
[
{ "$match": { "name": /R/i } },
{ "$project": {
"_id": 0,
"name": 1,
"w": {
"$cond": [
{ "$eq": [
{ "$substr": [ { "$toLower": "$name" }, 0, 1 ] },
"r"
]},
1,
0
]
}
}},
{ "$sort": { "w": -1 } }
]
)
which produces:
{ "name" : "ROBERT", "w" : 1 }
{ "name" : "ROBINSON", "w" : 1 }
{ "name" : "PETER", "w" : 0 }
try this :
db.collectioname.find ("name":/R/)

Mongodb Regex Query first 2 characters of the string

In one of my mongodb collection, I have a date string that has a mm/dd/yyyy format. Now, I want to query the 'mm' string.
Example, 05/20/2016 and 04/05/2015.
I want to get the first 2 characters of the string and query '05'. With that, the result I will get should only be 05/20/2016.
How can I achieve this?
Thanks!
For a regex solution, the following will suffice
var search = "05",
rgx = new RegExp("^"+search); // equivalent to var rgx = /^05/;
db.collection.find({ "leave_start": rgx });
Testing
var leave_start = "05/06/2016",
test = leave_start.match(/^05/);
console.log(test); // ["05", index: 0, input: "05/06/2016"]
console.log(test[0]); // "05"
or
var search = "05",
rgx = new RegExp("^"+search),
leave_start = "05/12/2016";
var test = leave_start.match(rgx);
console.log(test); // ["05", index: 0, input: "05/06/2016"]
console.log(test[0]); // "05"
Another alternative is to use the aggregation framework and take advantage of the $substr operator to extract the first 2 characters of a field and then the $match operator will filter documents based on the new substring field above:
db.collection.aggregate([
{
"$project": {
"leaves_start": 1,
"monthSubstring": { "$substr": : [ "$leaves_start", 0, 2 ] }
}
},
{ "$match": { "monthSubstring": "05" } }
])

Swift splitting "abc1.23.456.7890xyz" into "abc", "1", "23", "456", "7890" and "xyz"

In Swift on OS X I am trying to chop up the string "abc1.23.456.7890xyz" into these strings:
"abc"
"1"
"23"
"456"
"7890"
"xyz"
but when I run the following code I get the following:
=> "abc1.23.456.7890xyz"
(0,3) -> "abc"
(3,1) -> "1"
(12,4) -> "7890"
(16,3) -> "xyz"
which means that the application correctly found "abc", the first token "1", but then the next token found is "7890" (missing out "23" and "456") followed by "xyz".
Can anyone see how the code can be changed to find ALL of the strings (including "23" and "456")?
Many thanks in advance.
import Foundation
import XCTest
public
class StackOverflowTest: XCTestCase {
public
func testRegex() {
do {
let patternString = "([^0-9]*)([0-9]+)(?:\\.([0-9]+))*([^0-9]*)"
let regex = try NSRegularExpression(pattern: patternString, options: [])
let string = "abc1.23.456.7890xyz"
print("=> \"\(string)\"")
let range = NSMakeRange(0, string.characters.count)
regex.enumerateMatchesInString(string, options: [], range: range) {
(textCheckingResult, _, _) in
if let textCheckingResult = textCheckingResult {
for nsRangeIndex in 1 ..< textCheckingResult.numberOfRanges {
let nsRange = textCheckingResult.rangeAtIndex(nsRangeIndex)
let location = nsRange.location
if location < Int.max {
let startIndex = string.startIndex.advancedBy(location)
let endIndex = startIndex.advancedBy(nsRange.length)
let value = string[startIndex ..< endIndex]
print("\(nsRange) -> \"\(value)\"")
}
}
}
}
} catch {
}
}
}
It's all about your regex pattern. You want to find a series of contiguous letters or digits. Try this pattern instead:
let patternString = "([a-zA-Z]+|\\d+)"
alternative 'Swifty' way
let str = "abc1.23.456.7890xyz"
let chars = str.characters.map{ $0 }
enum CharType {
case Number
case Alpha
init(c: Character) {
self = .Alpha
if isNumber(c) {
self = .Number
}
}
func isNumber(c: Character)->Bool {
return "1234567890".characters.map{ $0 }.contains(c)
}
}
var tmp = ""
tmp.append(chars[0])
var type = CharType(c: chars[0])
for i in 1..<chars.count {
let c = CharType(c: chars[i])
if c != type {
tmp.append(Character("."))
}
tmp.append(chars[i])
type = c
}
tmp.characters.split(".", maxSplit: Int.max, allowEmptySlices: false).map(String.init)
// ["abc", "1", "23", "456", "7890", "xyz"]

Rexexp to match all the numbers,alphabets,special characters in a string

I want a pattern to match a string that has everything in it(alphabets,numbers,special charactres)
public static void main(String[] args) {
String retVal=null;
try
{
String s1 = "[0-9a-zA-Z].*:[0-9a-zA-Z].*:(.*):[0-9a-zA-Z].*";
String s2 = "BNTPSDAE31G:BNTPSDAE:Healthcheck:Major";
Pattern pattern = null;
//if ( ! StringUtils.isEmpty(s1) )
if ( ( s1 != null ) && ( ! s1.matches("\\s*") ) )
{
pattern = Pattern.compile(s1);
}
//if ( ! StringUtils.isEmpty(s2) )
if ( s2 != null )
{
Matcher matcher = pattern.matcher( s2 );
if ( matcher.matches() )
{
retVal = matcher.group(1);
// A special case/kludge for Asentria. Temp alarms contain "Normal/High" etc.
// Switch Normal to return CLEAR. The default for this usage will be RAISE.
// Need to handle switches in XML. This won't work if anyone puts "normal" in their event alias.
if ("Restore".equalsIgnoreCase ( retVal ) )
{
}
}
}
}
catch( Exception e )
{
System.out.println("Error evaluating args : " );
}
System.out.println("retVal------"+retVal);
}
and output is:
Healthcheck
Hera using this [0-9a-zA-Z].* am matching only alpahbets and numbers,but i want to match the string if it has special characters also
Any help is highly appreciated
Try this:
If you want to match individual elements try this:
2.1.2 :001 > s = "asad3435##:$%adasd1213"
2.1.2 :008 > s.scan(/./)
=> ["a", "s", "a", "d", "3", "4", "3", "5", "#", "#", ":", "$", "%", "a", "d", "a", "s", "d", "1", "2", "1", "3"]
or you want match all at once try this:
2.1.2 :009 > s.scan(/[^.]+/)
=> ["asad3435##:$%adasd1213"]
Try the following regex, it works for me :)
[^:]+
You might need to put a global modifier on it to get it to match all strings.