Lua REGEX, catch parenthesis - regex

How can I capture the values (r = 32, g = 36, b = 51) from the data below?
$color1: rgba(32, 36, 51, 1);
I tried:
v = "$color1: rgba(32, 36, 51, 1);"
id, r, g, b = v.match(v, "%$color(%d+)%:%s+rgba%((%d+)%,%s+(%d+)%,%s+(%d+)%,%s+%d+%)%;") or 0, 0, 0, 0
But it does not work.

v = "$color1: rgba(32, 36, 51, 1);"
string.gsub(v, "%((%d+),%s(%d+),%s(%d+)", function(r, g, b) print(r, g, b) end)
Output: 32 36 51
gsub finds occurrences of a pattern in a string and either replaces them with text or, as in this case, calls a function for each match.
I start my pattern with %( because I assume you will only run this on strings that look like the one you've given, so there can be no false positives and it is safe to start at the first opening parenthesis.
Each %d+ matches one or more digits and is wrapped in parentheses so that it is captured and passed to the function. ,%s matches a comma followed by a single whitespace character.
function(r, g, b) is called with r, g and b set to the first, second and third captures.

Your id, r, g, b = v.match(v, "%$color(%d+)%:%s+rgba%((%d+)%,%s+(%d+)%,%s+(%d+)%,%s+%d+%)%;") or 0, 0, 0, 0 line is parsed by Lua as
id = v:match(pattern) or 0
r = 0
g = 0
b = 0
The or operator keeps only the first value returned by match. Since there is a match, id is set to the value of capture group 1, and the remaining variables are set to 0.
You may fix this assignment using tables:
result = {string.match(v, "%$color(%d+):%s+rgba%((%d+),%s+(%d+),%s+(%d+),%s+%d+%);")}
id, r, g, b = result[1] or 0, result[2] or 0, result[3] or 0, result[4] or 0
See the Lua demo.
NOTE: ; and , are not special Lua pattern characters (neither is :), hence I removed the % escapes in front of them in the pattern.
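For comparison, the same capture-with-defaults idea can be sketched in Python. This is only an analogue, not Lua: the regex uses Python's re syntax, and parse_color is a hypothetical helper name introduced for illustration.

```python
import re

# Python analogue of the Lua pattern above (re syntax, not Lua patterns).
PATTERN = re.compile(r"\$color(\d+):\s+rgba\((\d+),\s+(\d+),\s+(\d+),\s+\d+\);")


def parse_color(line):
    """Return (id, r, g, b) as ints, or four zeros when the line does not match."""
    m = PATTERN.match(line)
    if m is None:
        return (0, 0, 0, 0)
    return tuple(int(group) for group in m.groups())


color_id, r, g, b = parse_color("$color1: rgba(32, 36, 51, 1);")
print(color_id, r, g, b)  # 1 32 36 51
```

Handling the no-match case once, before unpacking, avoids the trap of applying a fallback to only the first of several returned values.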

Related

Renaming variables in raster data using substr

I downloaded worldclimate data and changed it into raster data.
The layers have names like wc2.1_5m_bio_1 up to wc2.1_5m_bio_19, and I want to rename them to bio_1 and so on (start = 10, stop = 16) using the substr function. However, I don't know how to make the change permanent on the raster data.
substr(clim@ptr[["names"]], start = 10, stop = 16)
It gives what I want, but not permanently. Every time I reload the raster data, it still has the original long names.
You can get and set the names like this:
library(terra)
s <- rast(system.file("ex/logo.tif", package="terra"))[[1:2]]
names(s)
#[1] "red" "green"
names(s) <- substr(names(s), 1, 1)
names(s)
#[1] "r" "g"
(you should never directly use the @ptr slot)
To make this permanent you need to write the data to a new file:
writeRaster(s, "test.tif", overwrite=TRUE)
rast("test.tif")
#class : SpatRaster
#dimensions : 77, 101, 2 (nrow, ncol, nlyr)
#resolution : 1, 1 (x, y)
#extent : 0, 101, 0, 77 (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs
#source : test.tif
#names : r, g
#min values : 0, 0
#max values : 255, 255
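The index arithmetic behind substr(x, start = 10, stop = 16) can be checked outside R. The sketch below is in Python and only illustrates the string slicing; the list of layer names is an assumption built from the naming scheme described in the question, not read from any raster file.

```python
# Hypothetical layer names following the wc2.1_5m_bio_<n> scheme from the question.
long_names = [f"wc2.1_5m_bio_{i}" for i in range(1, 20)]

# R's substr(x, start = 10, stop = 16) is 1-based and inclusive,
# which corresponds to the 0-based, end-exclusive Python slice x[9:16].
short_names = [name[9:16] for name in long_names]

print(short_names[0], short_names[-1])  # bio_1 bio_19
```

A stop of 16 is past the end of the shortest names, which is harmless: both R's substr and Python slicing simply stop at the end of the string.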

How to use grepl with a character vector length greater than one?

I am trying to create a conditional dummy variable (c) that is 1 when b >= x and 0 when b < x.
An example output when x = 3:
a b c
1 1 0
2 3 1
3 4 1
4 2 0
df$c<-ifelse(grepl(b[b <= 3], df$b), as.numeric(1), as.numeric(0))
I've tried using the above ifelse() function, but grepl only accepts a pattern of length 1:
In grepl(b[b <= 3],df$b) :
(argument 'pattern' has length > 1 and only the first element will be used)
I think you're a bit confused about grepl and how (and when) regular expressions are used. Regular expressions are for finding patterns in strings; for example, to check whether "b", "d", or "g" is part of a character variable b, one could use grepl("[bdg]", b, ignore.case = TRUE). If b is numeric, you use a comparison instead.
To match the example output (c = 1 when b >= 3), you could use
df$c <- with(df, ifelse(b >= 3, 1, 0))
or
df$c <- ifelse(df$b >= 3, 1, 0)
or, similarly, using transform
df <- transform(df, c = ifelse(b >= 3, 1, 0))
If the aim is instead to pick out which rows satisfy the condition, you could use which
df$c <- 0
df$c[which(df$b >= 3)] <- 1
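As a quick cross-check of the thresholding logic, here is a minimal Python sketch using the b column from the example table in the question and the threshold x = 3. It uses plain lists rather than a data frame, which is a simplification for illustration.

```python
# Column b from the example table in the question.
b = [1, 3, 4, 2]
x = 3  # threshold

# A numeric comparison is enough; no regular expression is involved.
c = [1 if value >= x else 0 for value in b]

print(c)  # [0, 1, 1, 0]
```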

Excel nesting - IF / AND Query part two?

Hi, I had a query earlier and thought I had cracked it with the help of Richard, but it doesn't appear so.
I have attached an image and what I am trying to achieve to make my query clearer.
* If E is correct then cell F will be set to match D manually
* If E is yes and F is set to 111 then G will populate with the contents of C
* If E is no and F is set to anything but 111 then it will return 0
* If E is correct then cell F will be set to match D manually
* If E is yes and F is set to 112 then H will populate with the contents of C
* If E is no and F is set to anything but 112 then it will return 0
* If E is correct then cell F will be set to match D manually
* If E is yes and F is set to 118 then I will populate with the contents of C
* If E is no and F is set to anything but 118 then it will return 0
* If E is correct then cell F will be set to match D manually
* If E is yes and F is set to 119 then J will populate with the contents of C
* If E is no and F is set to anything but 119 then it will return 0
It's not 100% clear, but sounds like this is what you're after:
F2 = =IF(E2="Yes",IF(OR(D2=111,D2=112,D2=118,D2=119)=TRUE,D2,""),"")
G2 = =IF(AND(E2="Yes",F2=111)=TRUE,C2,"")
H2 = =IF(AND(E2="Yes",F2=112)=TRUE,C2,"")
I2 = =IF(AND(E2="Yes",F2=118)=TRUE,C2,"")
J2 = =IF(AND(E2="Yes",F2=119)=TRUE,C2,"")
Then just fill down. I've put "" instead of 0, because it's a lot easier to see what's going on without zeros everywhere. You can change them back once you're happy with the outcome.
Incidentally, sometimes it's easier to lay the formula out over several lines. Excel works fine if you have a formula on different lines, like the following for F2:
=
IF(
    E2="Yes",
    IF(
        OR(
            D2=111,D2=112,D2=118,D2=119
        )=TRUE,
        D2,
        ""
    ),
    ""
)
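To sanity-check the branching before filling the formulas down, the row logic can be mirrored in a few lines of Python. This is a sketch of the F2:J2 formulas above, not Excel itself; fill_row and CODES are hypothetical names, and the sample values are made up.

```python
CODES = (111, 112, 118, 119)  # the codes tied to columns G, H, I and J


def fill_row(c, d, e):
    """Mimic the F2:J2 formulas for one row; "" stands for an empty cell."""
    # F: copy D only when E is "Yes" and D is one of the four codes.
    f = d if e == "Yes" and d in CODES else ""
    # G..J: show C only when E is "Yes" and F equals that column's code.
    g_to_j = [c if e == "Yes" and f == code else "" for code in CODES]
    return [f] + g_to_j


print(fill_row(250, 112, "Yes"))  # [112, '', 250, '', '']
print(fill_row(250, 112, "No"))   # ['', '', '', '', '']
```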

R regex error replacement has x rows and data has y rows

I am new to using R and I am trying to match a pattern. I added a new column POA where I want 1 when there is a match and 0 when there is no match for every record.
The function I used in R is:
Bribery$POA<-ifelse(grepl( "\\babuse of office\\b|\\babuse of power\\b|\\babuse of position\\b|\\bsquandering public funds\\b|\\bmisuse of power\\b|\\bmisuse of public funds\\b|\\babuse of trust\\b", Bribery$),1,0)
But I am getting the following error:
Error in $<-.data.frame(*tmp*, "POA", value = c(0, 0, 0, 0, 0, 0,
: replacement has 29 rows, data has 95292
It would be great if anyone could tell me where I am going wrong.

Regular expression puzzle

This is not homework, but an old exam question. I am curious to see the answer.
We are given an alphabet S={0,1,2,3,4,5,6,7,8,9,+}. Define the language L as the set of strings w from this alphabet such that w is in L if:
a) w is a number such as 42 or w is the (finite) sum of numbers such as 34 + 16 or 34 + 2 + 10
and
b) The number represented by w is divisible by 3.
Write a regular expression (and a DFA) for L.
This should work:
^(?:0|(?:(?:[369]|[147](?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\
+)*[369]0*)*\+?(?:0\+)*[258])*(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]|0*(?:
\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147])|[
258](?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0
\+)*[147])*(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]|0*(?:\+?(?:0\+)*[369]0*)
*\+?(?:0\+)*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]))0*)+)(?:\+(?:0|(?:(?
:[369]|[147](?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\+)*[369]0*)
*\+?(?:0\+)*[258])*(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]|0*(?:\+?(?:0\+)*
[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147])|[258](?:0*(?
:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147])*
(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]|0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)
*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]))0*)+))*$
It works by having three states representing the sum of the digits so far modulo 3. It disallows leading zeros on numbers, and plus signs at the start and end of the string, as well as two consecutive plus signs.
Generation of regular expression and test bed:
# Build the regex from small pieces; a, b and c are placeholders substituted at the end.
a = r'0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*'
b = r'a[147]'
c = r'a[258]'
r1 = '[369]|[147](?:bc)*(?:c|bb)|[258](?:cb)*(?:b|cc)'
r2 = '(?:0|(?:(?:' + r1 + ')0*)+)'
r3 = '^' + r2 + r'(?:\+' + r2 + ')*$'
r = r3.replace('b', b).replace('c', c).replace('a', a)
print(r)

# Test on 10000 examples.
import random, re
random.seed(1)
r = re.compile(r)
for _ in range(10000):
    x = ''.join(random.choice('0123456789+') for j in range(random.randint(1, 50)))
    # Invalid: a leading '+', '++', a number with a leading zero, or a trailing '+'.
    if re.search(r'(?:\+|^)(?:\+|0[0-9])|\+$', x):
        valid = False
    else:
        valid = eval(x) % 3 == 0
    result = re.match(r, x) is not None
    if result != valid:
        print('Failed for ' + x)
Note that my memory of DFA syntax is woefully out of date, so my answer is undoubtedly a little broken. Hopefully this gives you a general idea. I've chosen to ignore + completely. As AmirW states, abc+def and abcdef are the same for divisibility purposes.
Accept state is C.
A=1,4,7,BB,AC,CA
B=2,5,8,AA,BC,CB
C=0,3,6,9,AB,BA,CC
Notice that the above language uses all 9 possible ABC pairings. It will always end at A, B, or C, and because every pair of variables reduces to a single variable, each iteration of processing shortens the string of variables.
Example:
1490 = AACC = BCC = BC = B (Fail)
1491 = AACA = BCA = BA = C (Success)
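The pairing rules above amount to tracking the digit sum modulo 3, which can be sketched as a three-state automaton in Python. The function name digit_sum_state is hypothetical, and, like the answer above, the sketch ignores + entirely, so it does not check where plus signs appear.

```python
def digit_sum_state(w):
    """Run the three-state automaton over w, ignoring '+' characters.

    Remainders 1, 2 and 0 correspond to states A, B and C; C is the accept state.
    """
    state = 0  # start in C: nothing read yet, digit sum is 0
    for ch in w:
        if ch == "+":
            continue
        state = (state + int(ch)) % 3
    return "CAB"[state]


print(digit_sum_state("1490"))  # B (fail)
print(digit_sum_state("1491"))  # C (success)
```

Because the start state is C, this sketch also accepts the empty string, which a complete solution would need to rule out.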
Not a full solution, just an idea:
(B) alone: The "plus" signs don't matter here. abc + def is the same as abcdef for the sake of divisibility by 3. For the latter case, there is a regexp here: http://blog.vkistudios.com/index.cfm/2008/12/30/Regular-Expression-to-determine-if-a-base-10-number-is-divisible-by-3
To combine this with requirement (A), we can take the solution of (B) and modify it:
First read character must be in 0..9 (not a plus)
Input must not end with a plus, so: Duplicate each state (will use S for the original state and S' for the duplicate to distinguish between them). If we're in state S and we read a plus we'll move to S'.
When reading a number we'll go to the new state as if we were in S. S' states cannot accept (another) plus.
Also, S' is not "accept state" even if S is. (because input must not end with a plus).
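The state-duplication idea can be sketched in Python as follows. It is a simulation under stated assumptions rather than a full solution: the input is assumed to contain only characters from the alphabet S, leading zeros are not rejected (the idea above does not address them), and accepts is a hypothetical name. Returning False early plays the role of a dead state.

```python
def accepts(w):
    """Simulate the DFA whose states are (digit sum mod 3, just read a plus)."""
    if not w or w[0] == "+":
        return False  # the first character read must be a digit
    total, after_plus = 0, False
    for ch in w:
        if ch == "+":
            if after_plus:
                return False  # S' states cannot accept another plus
            after_plus = True  # move from S to its duplicate S'
        else:
            total = (total + int(ch)) % 3
            after_plus = False  # a digit takes us back to an original state
    # S' is never an accept state, so the input may not end with a plus.
    return total == 0 and not after_plus


print(accepts("12+3"))   # True  (15 is divisible by 3)
print(accepts("34+16"))  # False (50 is not divisible by 3)
print(accepts("3+"))     # False (ends with a plus)
```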