I encountered a very strange thing, to me at least.
if month not in (02, 04, 06, 11):
    print "Good"
Whenever I add 09 to the tuple, I get the error SyntaxError: invalid token, and only for this particular number.
Any idea?
When you use a leading 0 on a number, Python interprets that as a base-8 (octal) number. Remove the leading 0:
>>> 10
10
>>> 010
8
>>> 9
9
>>> 09
File "<stdin>", line 1
09
^
SyntaxError: invalid token
Python 3 has tightened this: a nonzero number written with a leading 0 is now always a syntax error, and to create an octal number you have to use the 0o prefix instead:
>>> 010
File "<stdin>", line 1
010
^
SyntaxError: invalid token
>>> 0o10
8
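For reference, a minimal Python 3 sketch of the two usual ways around this: the 0o prefix when you really want octal, and int() when the value arrives as zero-padded text such as "09". The helper names parse_padded and is_valid_literal are made up for illustration.

```python
# Python 3 sketch; parse_padded and is_valid_literal are hypothetical names.

def parse_padded(text):
    """Convert zero-padded decimal text such as "09" to an int."""
    return int(text)  # base 10 by default, leading zeros are ignored


octal_eight = 0o10         # explicit octal literal
from_text = int("010", 8)  # parse text as base 8


def is_valid_literal(source):
    """Return True if `source` compiles as a Python 3 expression."""
    try:
        compile(source, "<literal>", "eval")
        return True
    except SyntaxError:
        return False
```

Under these assumptions, is_valid_literal("09") is False while parse_padded("09") gives the integer 9.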
Numbers with leading 0s are considered octal, so 09 is invalid... Just drop the 0's
I think you need to get rid of the zeros before the numbers; Python may be interpreting them as octal numbers.
You can try this (the reason why 09 gives an error has already been answered):
month='05'
if month not in ('02', '04', '06', '11','09'):
    print "Good"
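If the month arrives as zero-padded text, another option is to convert it once and compare plain integers. This is only a sketch in Python 3 syntax; the function name is_good_month and the constant EXCLUDED_MONTHS are illustrative, using the question's values plus 9.

```python
EXCLUDED_MONTHS = (2, 4, 6, 9, 11)  # plain decimal ints, no leading zeros


def is_good_month(month):
    """Accept an int or zero-padded text like "05" (hypothetical helper)."""
    return int(month) not in EXCLUDED_MONTHS


if is_good_month("05"):
    print("Good")
```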
I have this number: 003859389453604802410207622210986832370060. In this instance, I need to extract 07622210986832, which comes after 02 and is followed by 37.
In the real world, the number to extract is always 14 digits, and is always preceded by 02 and followed by 37, BUT it can appear at any point in a string of random length. All we know is that the number will be there somewhere.
I'm currently using the formula:
=IF(LEN(IFERROR(REGEXEXTRACT(A1:A&"", "02(.*)37")))=14,
However, you will notice in the number sample there is another 02 - "024102".
This is causing an issue.
What I really want to happen is:
Lookup 02
Find the string of 14 numbers and if number 15 is 3 and 16 is 7 (37), that is the number we need.
If you find another 02 number with a 14 digit string and the next two numbers are not 37 - ignore.
Use the pattern 02(\d{14})37, it will extract a sequence of 14 digits preceded by 02 and followed by 37.
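To check the pattern outside the spreadsheet, here is a small Python sketch that runs it against the sample number from the question. The function name extract_code is an assumption for illustration; the pattern itself is the one given above.

```python
import re

PATTERN = re.compile(r"02(\d{14})37")


def extract_code(text):
    """Return the 14 digits between 02 and 37, or None if there are none."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


sample = "003859389453604802410207622210986832370060"
print(extract_code(sample))  # prints 07622210986832
```

The earlier "02" in "024102" is skipped because the 14 digits after it are not followed by 37, so the search moves on to the next candidate.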
try like this:
=ARRAYFORMULA(REGEXEXTRACT(TO_TEXT({A2:A,B2:B,C2:C}), "02(\d{14})37"))
if you want to smash it into 1 column then:
=ARRAYFORMULA(TRIM(TRANSPOSE(QUERY(TRANSPOSE(REGEXEXTRACT(TO_TEXT({A2:A,B2:B,C2:C}),
"02(\d{14})37")),,999^99))))
Does anyone know why Python handles the below this way?
>>> a = 099
File "<stdin>", line 1
a = 099
^
SyntaxError: invalid token
>>> a = 088
File "<stdin>", line 1
a = 088
^
SyntaxError: invalid token
>>> a = 0559
File "<stdin>", line 1
a = 0559
^
SyntaxError: invalid token
>>> a = 077
>>>
It does not seem to accept numbers that start with 0 and contain an 8 or a 9. With any other digits it does not throw an error. Why is that?
In Python 2, like in C, an integer literal that starts with a 0 is in octal. Digits 8 and 9 do not exist in octal (they are written 010 and 011 respectively) so that is a syntax error.
>>> 010
8
>>> 08
File "<stdin>", line 1
08
^
SyntaxError: invalid token
In Python 3, this feature not many people know about is gone. There, nonzero literals that start with a 0 are syntax errors.
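The rule can be imitated in Python 3 by parsing the digits as base 8, which shows why 077 was accepted while 099, 088 and 0559 were not. This is a sketch only; python2_octal_value is a hypothetical helper, not something from the standard library.

```python
def python2_octal_value(literal):
    """Value Python 2 would give a leading-zero literal, or None if invalid.

    Hypothetical helper: it mimics the old rule by parsing the text as base 8.
    """
    try:
        return int(literal, 8)
    except ValueError:
        return None  # the text contains 8 or 9, which are not octal digits


for text in ("077", "010", "099", "088", "0559"):
    print(text, python2_octal_value(text))
```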
Is there a function in any package that can search a text file with a regex and return the line numbers of the matches? gsubfn's read.pattern can find and extract a pattern but can't return line numbers, and grep can't read files directly. Example:
file:
.122448110000D+06 .400000000000D+01
3 15 3 23 10 0 0.0 .267305411398D-03 .161435309564D-10 .000000000000D+01
.510000000000D+02 .625000000000D-01 .440982654411D-08 .306376855997D+00
5 15 3 23 11 59 44.0 -.263226218521D-03 .488853402202D-11 .000000000000D+01
pattern: reg="^ *\\d+ +(?:[0-9]+ +){5}[.0-9]+.*$" for 2nd and 4th line match. So what I generally want is:
>file.grep(file,reg)
[1] 2 4
Is there anything of the sort? I understand that the usual approach is readLines followed by some creative use of grep, which is fine when the files are not that big. But I have read here about many people having problems with large data sets that are not table-structured, problems that such a tool (or a readLines with a regex skip parameter) could solve, and I wonder whether anyone has made something like that.
EDITED1
I just found another post relating to this question with an alternative solution:
grep while reading file
ORIGINAL POST
Is this what you are looking for?
library(gsubfn)
cat(" .122448110000D+06 .400000000000D+01
3 15 3 23 10 0 0.0 .267305411398D-03 .161435309564D-10 .000000000000D+01
.510000000000D+02 .625000000000D-01 .440982654411D-08 .306376855997D+00
5 15 3 23 11 59 44.0 -.263226218521D-03 .488853402202D-11 .000000000000D+01", file = "test.txt")
read.pattern(text = readLines("test.txt"), pattern = "^ *\\d+ +(?:[0-9]+ +){5}[.0-9]+.*$")
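For comparison, the file.grep idea from the question can be sketched in a few lines of Python rather than R. The function name grep_line_numbers is an assumption; it reads the file one line at a time and returns the 1-based numbers of the lines that match, so the whole file never has to be held in memory.

```python
import re


def grep_line_numbers(path, pattern):
    """Return the 1-based numbers of the lines in `path` that match `pattern`."""
    regex = re.compile(pattern)
    with open(path) as handle:
        # Iterating over the file object reads one line at a time.
        return [number
                for number, line in enumerate(handle, start=1)
                if regex.search(line.rstrip("\n"))]
```

Called with the question's file and the pattern written as r"^ *\d+ +(?:[0-9]+ +){5}[.0-9]+.*$", this should report the second and fourth lines.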
In an application I have the need to validate a string entered by the user.
One number
OR
a range (two numbers separated by a '-')
OR
a list of comma separated numbers and/or ranges
AND
any number must be between 1 and 999999.
A space is allowed before and after a comma and/or '-'.
I thought the following regular expression would do it.
(\d{1,6}\040?(,|-)?\040?){1,}
This matches the following (which is excellent). (\040 in the regular expression is the character for space).
00001
12
20,21,22
100-200
1,2-9,11-12
20, 21, 22
100 - 200
1, 2 - 9, 11 - 12
However, I also get a match on:
!!!12
What am I missing here?
You need to anchor your regex:
^(\d{1,6}\040?(,|-)?\040?){1,}$
Otherwise you will get a partial match on "!!!12": the regex matches only the trailing digits.
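The difference is easy to reproduce in Python, whose re module also reads \040 as a space. The sketch below uses the two patterns exactly as written in this thread; the names UNANCHORED, ANCHORED and is_valid are assumptions for illustration.

```python
import re

UNANCHORED = re.compile(r"(\d{1,6}\040?(,|-)?\040?){1,}")
ANCHORED = re.compile(r"^(\d{1,6}\040?(,|-)?\040?){1,}$")


def is_valid(text):
    """True only when the whole input matches the anchored pattern."""
    return ANCHORED.search(text) is not None


# Without anchors the search succeeds on just the trailing digits.
print(UNANCHORED.search("!!!12").group(0))
# With anchors the stray characters make the input fail.
print(is_valid("!!!12"), is_valid("1, 2 - 9, 11 - 12"))
```

Anchoring only fixes the partial match. Checking that each number lies between 1 and 999999 is probably easier to do after splitting the input and converting the pieces to integers.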
See it here on Regexr
/\d*[-]?\d*/
I have tested this with Perl:
> cat temp
00001
12
20,21,22
100-200
1,2-9,11-12
20, 21, 22
100-200
1, 2-9, 11-12
> perl -lne 'push @a,/\d*[-]?\d*/g;END{print "@a"}' temp
00001 12 20 21 22 100-200 1 2-9 11-12 20 21 22 100-200 1 2-9 11-12
The result above comes from pushing all the regex matches into an array and printing the array elements at the end.
I'm trying to write a simple regex but I don't know why it is not working.
The user enters a two-digit number such as 01, 09, 23 or 55, up to 82. Anything above 82 should be refused.
Here is my regex; the two-digit number must be no greater than 82.
0[1-9]|[1-8][0-9]|8[0-2]
You should have [1-7] for the range 10-79, not [1-8]. Don't forget the ^ and $ to specify the start and ending of the string:
^(0[1-9]|[1-7]\d|8[0-2])$
Why not cast to an integer and then just test x <= 82?
Your second part is wrong. It'll match from 10 to 89, whereas you want it to match from 10 to 79 and let the third part handle 80 to 82.
0[1-9]|[1-7][0-9]|8[0-2]
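A quick way to convince yourself that the corrected pattern covers exactly 01 to 82 is to compare it with the integer test over every two-digit string. A Python sketch, with the function names accepted_by_regex and accepted_by_int made up for illustration:

```python
import re

TWO_DIGITS_UP_TO_82 = re.compile(r"^(0[1-9]|[1-7][0-9]|8[0-2])$")


def accepted_by_regex(text):
    """Check the input against the anchored, corrected pattern."""
    return TWO_DIGITS_UP_TO_82.search(text) is not None


def accepted_by_int(text):
    """Integer-based alternative: two ASCII digits with a value of 1 to 82."""
    return (len(text) == 2 and text.isascii() and text.isdigit()
            and 1 <= int(text) <= 82)


candidates = [f"{n:02d}" for n in range(100)]
print(all(accepted_by_regex(c) == accepted_by_int(c) for c in candidates))
```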