I am assuming you would need a regex for this. The best I could come up with is
=REGEXREPLACE(C2, "\.(?=[^.]*$)", ".2")
but it only detects the period at the end, and Google Sheets returns #REF!
Other approaches, such as directly changing the cells C2:C5, are also welcome.
You can just check whether the last two characters are equal to .1: RIGHT(A1,2) gets the two characters from the right, and =".1" tests equality
RIGHT(A1,2)=".1"
Then, to convert matching values, you can slice off the last two chars (length-2) and append the .2
LEFT(A1,LEN(A1)-2)&".2"
All together
=IF(RIGHT(A1,2)=".1",LEFT(A1,LEN(A1)-2)&".2",A1)
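For readers who would rather do this outside the spreadsheet, the same check-then-slice logic can be sketched in Python. The function name `bump_suffix` and the sample values are made up for illustration; they are not from the question.

```python
def bump_suffix(value: str, old: str = ".1", new: str = ".2") -> str:
    """Replace a trailing `old` with `new`; return other values unchanged."""
    if value.endswith(old):                         # RIGHT(A1,2)=".1"
        return value[: len(value) - len(old)] + new  # LEFT(A1,LEN(A1)-2)&".2"
    return value


# Hypothetical sample values.
for cell in ["1.1.1", "2.3.1", "3.4"]:
    print(cell, "->", bump_suffix(cell))
```

Values that do not end in .1 pass through untouched, which mirrors the IF(...,A1) fallback.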
If you actually want to increment arbitrary values (and not just .1), you can skip the equality check and add 0.1 instead. The sum comes back as 0.2 rather than .2, so strip its leading 0 before appending
=LEFT(C3,LEN(C3)-2)&REGEXREPLACE((RIGHT(C3,2)+0.1)&"","^0","")
If you have values with more than a single digit, extract them into an intermediate column so you can use their length to
add the right power of ten (.5+0.1, .993+0.001, etc.)
exclude the right number of chars when appending
If you want a full version parser, consider a script (Apps Script in Google Sheets, VBA in Excel) or passing the column to a more practical language
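As a rough idea of what "a more practical language" could look like, here is a minimal Python sketch. It assumes the values are dot-separated integers and that only the last component should be incremented; `bump_last_component` is a hypothetical helper, not something from the question.

```python
def bump_last_component(version: str) -> str:
    """Increment the final dot-separated component of `version` by one."""
    # rpartition splits on the LAST dot; head and sep are empty if there is none.
    head, sep, last = version.rpartition(".")
    # int() raises ValueError if the last component is not a plain integer.
    return f"{head}{sep}{int(last) + 1}"


for v in ["1.2.1", "1.2.9", "7"]:
    print(v, "->", bump_last_component(v))
```

Because the last component is treated as an integer, multi-digit values need no special handling, although zero padding such as 09 would be lost.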
I want to be able to match and parse some parameters read from a file such as :
"type:int,register_id:15,value:123456"
"type:int,register_id:16,value:-456789"
"type:double,register_id:17,value:123.456"
"type:double,register_id:18,value:-456.789"
"type:bool,register_id:19,value:true"
"type:bool,register_id:20,value:false"
"type:string,register_id:17,value:Test Set Data Register"
I've come up with the following Regex expression :
(^(type:)\b(bool|int|double|string)\b,(\bregister_id:\b)([1-9][0-9]),(\bvalue:\b)(.)$)
but I have issues where there are negative floats or ints, I can't get the hyphen sorted properly ...
Can someone point me in the right direction ?
https://regex101.com/r/WhXmBE/3
Thanks !
Tried [\s\S] but it reads everything, tried -? as well
Given your example, this seems to work:
(^(type:)(bool|int|double|string),(register_id:)([1-9][0-9]*),(value:)(.*)$)
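At least from the example, I didn't see why the \b are necessary. The \b after value: is also what rejects negative numbers: both : and - are non-word characters, so there is no word boundary between them. Apologies if I missed something.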
Looking at what you try to achieve, I would actually consider moving away from regexes, as regexes by themselves add complexity. You will likely have an easier life if you approach it like this:
Split the line by "," to get the key value pairs
Split each key value pair by the first ":" to split key and value
Validate that all keys are present and that every value matches the format for the key (e.g. if the type is bool then the value should parse to a bool)
You can easily adjust every step, e.g. to trim whitespace.
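The steps above could be sketched roughly as follows in Python. This assumes the three fields always appear in the order shown in the question and that only the value may contain commas; `parse_record`, `PARSERS`, and `_parse_bool` are names invented for this sketch.

```python
from typing import Callable, Dict


def _parse_bool(text: str) -> bool:
    if text not in ("true", "false"):
        raise ValueError(f"not a bool: {text!r}")
    return text == "true"


# One parser per declared type; each raises ValueError on bad input.
PARSERS: Dict[str, Callable[[str], object]] = {
    "int": int,
    "double": float,
    "bool": _parse_bool,
    "string": str,
}


def parse_record(line: str) -> dict:
    # Split into at most three pairs so a string value may contain commas,
    # then split each pair on its first ":" only.
    fields = dict(pair.split(":", 1) for pair in line.split(",", 2))
    if set(fields) != {"type", "register_id", "value"}:
        raise ValueError(f"unexpected keys: {sorted(fields)}")
    parser = PARSERS.get(fields["type"])
    if parser is None:
        raise ValueError(f"unknown type: {fields['type']!r}")
    return {
        "type": fields["type"],
        "register_id": int(fields["register_id"]),
        "value": parser(fields["value"]),
    }


print(parse_record("type:int,register_id:16,value:-456789"))
print(parse_record("type:double,register_id:18,value:-456.789"))
```

Negative numbers need no special treatment here because int and float already accept a leading minus sign.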
Edit: Fixed typo
My client uses SKUs in which they change the two-letter prefix to represent changes/updates in models. As an analyst, I need to make a unique list of SKUs to use in my Data Studio dashboard. A sample of the SKUs would look like:
NP9151BM01
NL9151BM01
NL6004SL01
NN6004SL01
NP1927YM05
NN1927YM05
NQ1296BM01
NG1296BM01
NQ1044YL04
NN1044YL04
NP9151YM05
9151YM05
1044YL04
I need to use regex to check whether the first two characters are letters and remove them if they are. For example, if I have NP9151BM01 and NL9151BM01 as SKUs, I need to remove NP and NL from them to end up with the exact same SKU. However, if I have 9151YM05 or 1044YL04 as SKUs, I need to keep them as they are.
For my solution, I have searched Google and Stack Overflow and found this regex (?<=^..).*$ which will remove the first two characters in all SKUs, but I'm not sure how to customise it to only remove the first two characters if they are letters.
I would appreciate any help that I can get with this!
To remove the first two letters:
=REGEXREPLACE(A2,"^[A-Z]{2}","")
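If the list is ever de-duplicated outside the sheet, the same pattern works in Python. This is only a sketch: `strip_prefix` is a hypothetical helper, and it assumes prefixes are always two uppercase ASCII letters, as in the sample.

```python
import re

# Exactly two uppercase letters at the start of the string.
_PREFIX = re.compile(r"^[A-Z]{2}")


def strip_prefix(sku: str) -> str:
    """Remove a leading two-letter prefix; SKUs starting with digits are kept."""
    return _PREFIX.sub("", sku)


skus = ["NP9151BM01", "NL9151BM01", "NP9151YM05", "9151YM05"]
unique_skus = sorted({strip_prefix(s) for s in skus})
print(unique_skus)
```

The set comprehension collapses SKUs that differ only in their prefix, which is the unique list the dashboard needs.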
I have data in a spreadsheet describing amount of data transferred over a mobile network: data in one column (over 300 rows) has three possible forms:
123,45KB
123,45MB
1,23GB
How can I transform or use this data in order to sum or do other calculations on numbers properly?
Assuming your data is in column A and there are always two characters as unit ("KB", "MB" or "GB") at the end, then the formula for transforming the data to numeric could be:
=--LEFT(A2;LEN(A2)-2)*10^(IF(RIGHT(A2;2)="KB";3;IF(RIGHT(A2;2)="MB";6;IF(RIGHT(A2;2)="GB";9))))
Put the formula in B2 and fill downwards as needed.
I assumed the decimal delimiter in your locale is a comma. If not, please state what it is.
Also, since this site is in English, I have used English function names. You may need to translate them into your language version.
If the decimal delimiter in your locale is not a comma, then you need to substitute the comma with your decimal delimiter to get a proper numeric decimal value.
For example if the decimal delimiter is dot, then:
=SUBSTITUTE(LEFT(A2,LEN(A2)-2),",",".")*10^(IF(RIGHT(A2,2)="KB",3,IF(RIGHT(A2,2)="MB",6,IF(RIGHT(A2,2)="GB",9))))
An alternative formula:
=LEFT(A1,LEN(A1)-2)*10^(3*MATCH(RIGHT(LEFT(A1,LEN(A1)-1)),{"K","M","G"},0))
This uses the position of the next-to-last character in an array to determine the exponent.
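For comparison, the same conversion could be sketched in Python. It assumes, like the formulas above, that the unit is always the last two characters, that the decimal separator in the data is a comma, and that decimal (power-of-ten) units are intended; `to_bytes` is a made-up name.

```python
from decimal import Decimal

# Power of ten for each unit, matching the formulas above.
_EXPONENTS = {"KB": 3, "MB": 6, "GB": 9}


def to_bytes(text: str) -> Decimal:
    """Convert a string such as '123,45MB' to a number of bytes."""
    number, unit = text[:-2], text[-2:]
    # Decimal avoids binary floating-point rounding; KeyError on unknown units.
    return Decimal(number.replace(",", ".")) * 10 ** _EXPONENTS[unit]


sizes = ["123,45KB", "123,45MB", "1,23GB"]
total = sum(to_bytes(s) for s in sizes)
print(total)
```

Once every row is a plain number, summing or any other calculation works as usual.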
I have the following sets:
NUMBER [0-9]+
DECIMAL ("."{NUMBER})|({NUMBER}("."{NUMBER}?)?)
REAL {DECIMAL}([eE][+-]?{NUMBER})?
and I want my lexer to accept real numbers like:
0.002 or 0.004e-10 or .01
The problem is that I want it to ignore the leading zeros but keep the rest of the number. For example:
when I give 000.0002 I want to keep 0.0002, and when I give 0.2e-0100 I want to keep 0.2e-100
So I was thinking something like the atof function but I do not know how to do it exactly.
Any thoughts?
Thanks in advance
lex will return the complete token that your pattern matches as one string. You cannot change that. At the expense of considerable complexity you could use start conditions to match a leading zero (which may be the only digit), and collect tokens for the pieces, e.g.,
0.2e-0100
as
0.2e-
0
100
and glue the first/last tokens together but you would find it much simpler to develop your own string function which filters out the unwanted leading zeroes.
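To illustrate what such a string function might do, here is a minimal sketch. It is written in Python for brevity rather than in the C that a lex action would normally use, and it assumes the token has already been accepted by the REAL pattern. The name `strip_leading_zeros` is hypothetical.

```python
import re

# Integer digits, optional fraction, optional exponent marker and digits.
_REAL = re.compile(r"(\d*)(\.\d*)?(?:([eE][+-]?)(\d+))?")


def strip_leading_zeros(token: str) -> str:
    """Drop redundant leading zeros from the integer part and the exponent."""
    match = _REAL.fullmatch(token)
    if match is None or not (match.group(1) or match.group(2)):
        raise ValueError(f"not a real-number token: {token!r}")
    int_part, frac, exp_mark, exp_digits = match.groups()
    if int_part:
        # Keep a single zero if the integer part was all zeros.
        int_part = int_part.lstrip("0") or "0"
    result = int_part + (frac or "")
    if exp_mark is not None:
        result += exp_mark + (exp_digits.lstrip("0") or "0")
    return result


for tok in ["000.0002", "0.2e-0100", ".01"]:
    print(tok, "->", strip_leading_zeros(tok))
```

A token with no integer part, such as .01, is left as it is rather than gaining a leading zero.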
I need to subset rows that contain <three digit number>
I wrote
foo <- grepl("<^[0-9]{3}$>", log1[,2])
others <- log1[!foo,]
but I'm not really sure how to use regex...just been using cheat sheets and Google. I think the < and > characters are throwing it off.
You almost had it. Try
^<[0-9]{3}>$
It might behoove you to read about anchors (^ and $).
The ^ and $ signs refer to the beginning and end of the string, respectively. You shouldn't be matching anything before or after them.
If you want rows that contain that pattern, you shouldn't use the anchors at all. You should just use this: <[0-9]{3}> (or shorten it to <\\d{3}>)
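The difference between the anchored and unanchored patterns can be seen in a small sketch. It uses Python's re module rather than R's grepl, and the sample values are invented for illustration.

```python
import re

PATTERN = r"<[0-9]{3}>"

# Hypothetical sample values standing in for log1[,2].
values = ["<123>", "error <404> seen", "<12>", "<1234>", "plain"]

# Anchored: the whole value must be exactly <ddd> (like ^<[0-9]{3}>$).
exact = [v for v in values if re.fullmatch(PATTERN, v)]

# Unanchored: the value only has to contain <ddd> somewhere.
contains = [v for v in values if re.search(PATTERN, v)]

# Rows to keep when excluding anything that contains the pattern.
others = [v for v in values if not re.search(PATTERN, v)]

print(exact)
print(contains)
print(others)
```

Only the unanchored search picks up a value like "error <404> seen", which is why the anchors matter when the tag is embedded in longer text.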
Just for posterity, I thought I would contribute what I think is the implied answer to the OP's stated question.
It seems the OP wants to exclude rows of a data frame where the second column contains a 3-digit integer. This can be done quite easily using the 'nchar' function to count the number of characters in each number, like so:
others <- log1[nchar(log1[,2])!=3,]
We are simply creating a vector with the number of characters in each element of column 2 and keeping a row if that number does not equal 3.