Dart List count showing one when splitting an empty string - list

I'm trying to read a Text column from SQLite and convert it into a model field of type List<String>. If the data is empty, I want the field to be an empty list []. However, when I do this I get a list that looks empty but actually has a length of 1. I reduced the problem to the following:
String stringList = '';
List<String> aList = [];
aList = stringList.split(',');
print(aList.length);
Why does this print 1? Shouldn't it return 0 since there are no values with a comma in it?

This should print 1.
When you split a string on commas, you are finding all the positions of commas in the string, then returning a list of the strings around those positions. That includes the string before the first comma and the string after the last comma.
In the case where the input contains no commas, you still get the whole initial string as a single part.
If your example had been:
String input = "451";
List<String> parts = input.split(",");
print(parts.length);
you would probably expect parts to be the list ["451"]. That is also what happens here because the split function doesn't distinguish empty parts from non-empty.
If a string does contain a comma, say the string ",", you get two parts when splitting, in this case two empty parts. In general, you get n+1 parts for a string containing n matches of the split pattern.
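Python's str.split with an explicit separator follows the same n+1 rule, so the behaviour can be sketched there. This is an illustration in Python, not Dart code, and the helper name split_or_empty is made up for this example. It also shows one way to get the empty list the question asks for: check for the empty string before splitting.

```python
def split_or_empty(text, sep=","):
    """Split text on sep, but return [] for an empty string (hypothetical helper)."""
    if text == "":
        return []
    return text.split(sep)


# n separators always give n + 1 parts, whether or not the parts are empty
print(len("".split(",")))     # no comma -> 1 part (the empty string itself)
print(len("451".split(",")))  # no comma -> 1 part
print(len(",".split(",")))    # one comma -> 2 parts, both empty
print(split_or_empty(""))     # guarded version gives an empty list
print(split_or_empty("a,b"))
```

The same guard in Dart would be a check such as stringList.isEmpty before calling split.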


compare list items against another list

So let's say I have a 3-item list:
myString = "prop zebra cool"
items = myString.split(" ")
#items = ["prop", "zebra", "cool"]
And another list, content, containing hundreds of string items. It's actually a list of files.
Now I want to get only the entries of content that contain every string in items.
So I started this way:
assets = []
for c in content:
    for item in items:
        if item in c:
            assets.append(c)
And then somehow isolate only the entries that are duplicated in the assets list.
This would work, but I don't like it; it's not elegant, and I'm sure there is a better way to do this in Python.
If I interpret your question correctly, you can use all.
In your case, assuming:
content = [
"z:/prop/zebra/rig/cool_v001.ma",
"sjasdjaskkk",
"thisIsNoGood",
"shakalaka",
"z:/prop/zebra/rig/cool_v999.ma"
]
string = "prop zebra cool"
You can do the following:
assets = []
matchlist = string.split(' ')
for c in content:
    # keep c only if every word in matchlist occurs somewhere in it
    if all(s in c for s in matchlist):
        assets.append(c)
print(assets)
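The loop above can also be written as a single list comprehension. This is a minimal sketch that reuses the sample data from this answer; it assumes plain substring matching is what you want, with no regard to word order.

```python
content = [
    "z:/prop/zebra/rig/cool_v001.ma",
    "sjasdjaskkk",
    "thisIsNoGood",
    "shakalaka",
    "z:/prop/zebra/rig/cool_v999.ma",
]
matchlist = "prop zebra cool".split(" ")

# keep only the entries that contain every word in matchlist
assets = [c for c in content if all(s in c for s in matchlist)]
print(assets)
```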
Alternative Method
If you want more control (i.e. you want to match only strings where your words appear in the specified order), then you could use regular expressions:
import re
# convert content to a single, tab-separated, string
contentstring = '\t'.join(content)
# generate a regex string to match
matchlist = [r'(?:{0})[^\t]+'.format(s) for s in string.split(' ')]
matchstring = r'([^\t]+{0})'.format(''.join(matchlist))
assets = re.findall(matchstring, contentstring)
print(assets)
Assuming \t does not appear in the strings of content, you can use it as a separator and join the list into a single string (obviously, you can pick any other separator that better suits you).
Then you can build your regex so that it matches any substring containing your words and any other character, except \t.
In this case, matchstring results in:
([^\t]+(?:prop)[^\t]+(?:zebra)[^\t]+(?:cool)[^\t]+)
where:
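(?:word) is a non-capturing group: word has to match, but it is not returned as a separate group
[^\t]+ matches one or more characters other than \t (because each one needs at least one character, an entry that starts directly with prop or ends directly with cool would generally not match)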
the outer () will return whole strings matching your rule (in this case z:/prop/zebra/rig/cool_v001.ma and z:/prop/zebra/rig/cool_v999.ma)

Swift 3: extract regex matches with non matching parts

I want to analyze a string against many different patterns for numbers, dates and other strings. So I have an array of patterns that I want to check in order.
let patterns = [... "\\d{6}", "\\d{4}", "\\d" ] // to be extended :-)
let s = "IMG_123456_2006.10.03-13.52.59 Testfile_2009_5"
Starting with the first item in patterns, I need to search the string s. If matches are found, the string should be split into the matched parts (e.g. "2006" and "2009") and the non-matching parts. The remaining parts are then searched with the next pattern, and so on. Assuming I had already defined a pattern for the date/time in the middle and placed it first in the array, the split string should look like:
"IMG_", "123456", "_", "2006.10.03-13.52.59", " Testfile_", "2009", "_", "5"
Can I use built-in functionality of regex.matches, or do I have to write everything on my own?
I have already been able to find a match. But then I have to use the ranges to split the string and repeat this for the remaining parts until no further matches are found. This needs a lot more computation than I would expect given the results in match.numberOfRanges. Is there a compact solution?
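The procedure described in the question can be sketched as a loop over the patterns that only re-splits the parts that have not matched yet. The sketch below is in Python rather than Swift, purely to show the shape of the algorithm. The function name split_by_patterns and the date/time pattern are assumptions made for this example and do not come from the question.

```python
import re


def split_by_patterns(text, patterns):
    """Split text into matched and unmatched parts, trying patterns in order."""
    # each segment is (string, already_matched)
    segments = [(text, False)]
    for pattern in patterns:
        regex = re.compile(pattern)
        next_segments = []
        for piece, matched in segments:
            if matched:
                # parts claimed by an earlier pattern are left alone
                next_segments.append((piece, True))
                continue
            pos = 0
            for m in regex.finditer(piece):
                if m.start() > pos:
                    next_segments.append((piece[pos:m.start()], False))
                next_segments.append((m.group(0), True))
                pos = m.end()
            if pos < len(piece):
                next_segments.append((piece[pos:], False))
        segments = next_segments
    return [piece for piece, _ in segments]


patterns = [
    r"\d{4}\.\d{2}\.\d{2}-\d{2}\.\d{2}\.\d{2}",  # assumed date/time pattern
    r"\d{6}",
    r"\d{4}",
    r"\d",
]
s = "IMG_123456_2006.10.03-13.52.59 Testfile_2009_5"
parts = split_by_patterns(s, patterns)
print(parts)
```

Each pattern is applied once per unmatched part, so the amount of work grows with the number of patterns rather than with repeated searches of the whole string.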

Remove fullstop, commas, quotation from list in Python

I have Python code that counts word frequencies in a text file. The problem is that it treats full stops as part of the words, which alters the counts. To count words I use a sorted list of unique words. I tried to remove the full stops using
words = open(f, 'r').read().lower().split()
uniqueword = sorted(set(words))
uniqueword = uniqueword.replace(".","")
but I get the error
AttributeError: 'list' object has no attribute 'replace'
Any help would be appreciated :)
You can process the words before you make the set, using a list comprehension:
words = [word.replace(".", "") for word in words]
You could also remove them after (uniquewords = [word.replace...]), but then you will reintroduce duplicates.
Note that if you want to count these words, a Counter may be more useful:
from collections import Counter
counts = Counter(words)
You might be better off with
import re
words = re.findall(r'\w+', open(f, 'r').read().lower())
which will grab all the strings composed of one or more “word characters” and will ignore punctuation and whitespace.
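Putting the two suggestions together, a word-frequency count might look like the sketch below. It uses an inline sample string instead of a file so that it runs on its own; the sample text is made up for illustration.

```python
import re
from collections import Counter

# sample text standing in for the contents of the file
text = "The cat sat. The dog sat, too."

# \w+ keeps runs of word characters, so "sat." and "sat," both become "sat"
words = re.findall(r"\w+", text.lower())
counts = Counter(words)

for word in sorted(counts):
    print(word, counts[word])
```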

Using Regex to parse Comma Separated List

I have a list of specific valid values: XX,SX,FC,SC,Jump.
Basically I need to look at user-supplied list of values and if one of the values does not match the above list I will throw an error. Can I use a regular expression to accomplish this?
This will match a comma-separated list of 5 sequences of alphanumeric characters (the + quantifiers are needed; without them each item could only be a single character):
[A-Za-z0-9]+(,[A-Za-z0-9]+){4}
However, and depending on the language you are using, I'd normally split the string and then check the length of the resulting array. For instance, in Java:
String csvList = "XX,SX,FC,SC,Jump";
String[] elements = csvList.split(",");
if (elements.length != 5) {
    throw new Exception();
}
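The length check above only counts the items. To check each user-supplied value against the list of valid values from the question, a split followed by a set lookup may be simpler than a regex. The sketch below is in Python; the names VALID_VALUES, invalid_values and validate are made up for this example, and it assumes the comparison is case-sensitive and that the input contains no spaces around the commas.

```python
VALID_VALUES = {"XX", "SX", "FC", "SC", "Jump"}


def invalid_values(csv_list):
    """Return the items of a comma-separated string that are not valid."""
    return [item for item in csv_list.split(",") if item not in VALID_VALUES]


def validate(csv_list):
    """Raise ValueError if any item is not in VALID_VALUES."""
    bad = invalid_values(csv_list)
    if bad:
        raise ValueError("invalid values: " + ", ".join(bad))


validate("XX,SC,Jump")            # passes silently
print(invalid_values("XX,foo"))   # lists the offending items
```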

Struggling with regex logic: how do I remove a param from a url query string?

I'm comparing 2 URL query strings to see if they're equal; however, I want to ignore a specific query parameter (always with a numeric value) if it exists. So, these 2 query strings should be equal:
firstName=bobby&lastName=tables&paramToIgnore=2
firstName=bobby&lastName=tables&paramToIgnore=5
So, I tried to use a regex replace using the REReplaceNoCase function:
REReplaceNoCase(myQueryString, "&paramToIgnore=[0-9]*", "")
This works fine for the above example. I apply the replace to both strings and then compare. The problem is that I can't be sure that the param will be the last one in the string... the following 2 query strings should also be equal:
firstName=bobby&lastName=tables&paramToIgnore=2
paramToIgnore=5&firstName=bobby&lastName=tables
So, I changed the regex to make the preceding ampersand optional... "&?paramToIgnore=[0-9]*". But - these strings will still not be equal as I'll be left with an extra ampersand in one of the strings but not the other:
firstName=bobby&lastName=tables
&firstName=bobby&lastName=tables
Similarly, I can't just remove preceding and following ampersands ("&?paramToIgnore=[0-9]*&?") as if the query param is in the middle of the string I'll strip one ampersand too many in one string and not the other - e.g.
firstName=bobby&lastName=tables&paramToIgnore=2
firstName=bobby&paramToIgnore=5&lastName=tables
will become
firstName=bobby&lastName=tables
firstName=bobbylastName=tables
I can't seem to get my head around the logic of this... Can anyone help me out with a solution?
If you can't be sure of the order in which the parameters appear, I would recommend that you don't compare the strings themselves.
I recommend splitting the strings up like this:
String stringA = "firstName=bobby&lastName=tables&paramToIgnore=2";
String stringB = "firstName=bobby&lastName=tables&paramToIgnore=5";
String[] partsA = stringA.split("&");
String[] partsB = stringB.split("&");
Then go through the arrays and make the paramToIgnore entries equal:
for(int i = 0; i < partsA.length; i++)
{
    if(partsA[i].startsWith("paramToIgnore")){
        partsA[i] = "IgnoreMePlease";
    }
}
for(int j = 0; j < partsB.length; j++)
{
    if(partsB[j].startsWith("paramToIgnore")){
        partsB[j] = "IgnoreMePlease";
    }
}
Then you can sort and compare the arrays to see if they are equal:
Arrays.sort(partsA);
Arrays.sort(partsB);
boolean b = Arrays.equals(partsA, partsB);
I'm pretty sure it's possible to make this more compact and faster. But if you compare the raw strings as you do now, you always have to care about the order of your parameters.
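A more compact version of the same split, sort and compare idea is sketched below in Python. It drops the ignored parameter instead of overwriting it with a placeholder, and it compares the exact parameter name rather than using a prefix test. The function names are made up for this example.

```python
def normalized_params(query, ignore="paramToIgnore"):
    """Split a query string on '&', drop the ignored parameter, and sort."""
    parts = [
        p for p in query.split("&")
        if p and p.split("=", 1)[0] != ignore
    ]
    return sorted(parts)


def same_query(a, b, ignore="paramToIgnore"):
    """Compare two query strings regardless of parameter order."""
    return normalized_params(a, ignore) == normalized_params(b, ignore)


print(same_query(
    "firstName=bobby&lastName=tables&paramToIgnore=2",
    "paramToIgnore=5&firstName=bobby&lastName=tables",
))
```

Note that this compares the parameters as raw text, so differently encoded but equivalent values would still count as different.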
You can use the QueryStringDeleteVar UDF on cflib to remove the query string variables you want to ignore from both strings, then compare them.
Make it in two steps:
first remove your param, as you described in example
then, with a separate regex, remove any ampersand left at the beginning or the end of the query, and collapse any double/triple/... ampersands in the middle of the query
How about having an 'or' in the RegEx to match an ampersand at the start or the end?
&paramToIgnore=[0-9]*|paramToIgnore=[0-9]*&
Seems to do the job when testing in regexpal.com
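The alternation can be checked against the three positions from the question. The sketch below uses Python's re module in place of REReplaceNoCase, with re.IGNORECASE standing in for the case-insensitive match; the function name strip_param is made up for this example.

```python
import re

# either "&param=digits" or "param=digits&", whichever matches first
PATTERN = re.compile(
    r"&paramToIgnore=[0-9]*|paramToIgnore=[0-9]*&",
    re.IGNORECASE,
)


def strip_param(query):
    """Remove paramToIgnore together with exactly one adjacent ampersand."""
    return PATTERN.sub("", query)


for q in (
    "firstName=bobby&lastName=tables&paramToIgnore=2",
    "paramToIgnore=5&firstName=bobby&lastName=tables",
    "firstName=bobby&paramToIgnore=5&lastName=tables",
):
    print(strip_param(q))
```

One limitation: a query that consists only of paramToIgnore has no ampersand on either side, so this pattern leaves it unchanged.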
try changing it to:
REReplaceNoCase(myQueryString, "&?paramToIgnore=[0-9]+", "")
Using plus instead of star matches one or more of the preceding characters. It won't match anything but 0-9, so if there is another parameter after it, the match stops as soon as it runs out of digits.
Alternatively, you could use:
REReplaceNoCase(myQueryString, "&?paramToIgnore=[^&]*", "")
This will match any value that contains no ampersand. It also covers the case where the parameter exists but has no value, which is probably something you'd want to account for.