Cast string to long in kdb - list

I want to first split a string into a list of strings and then, based on a condition, return one item of that list as a long.
Func:{[x]
Temp:vs "-" x;
if["AAA" ~ Temp[0];:"J"$Temp[1];:"J"$Temp[2]]
}
Func["AAA-809-AXSDF"]
This function returns 809 but when I do:
809 ~ Func["AAA-809-AXSDF"]
It returns 0b
This means it's not converting list item to long. Please suggest

There are a few errors in your code:
1: [x] is not necessary
2: vs "-" x should be "-" vs x
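3: if["AAA" ~ Temp[0];:"J"$Temp[1];:"J"$Temp[2]] does not do what you expect. if has no else branch: when the condition is true it runs every following expression in order, so it always returns "J"$Temp[1] and :"J"$Temp[2] is never executed. I think what you need is the conditional $ operator: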
q)func:{"J"$ $["AAA"~first a:"-"vs x;a 1;a 2]}
q)809~func["AAA-809-AXSDF"]
1b
q)111~func["AAB-AXSDF-111"]
1b

If I got the logic right, the following code solves the issue:
{x: "-"vs x; "J"$ $["AAA"~x 0;x 1; x 2]}"AAA-809-AXSDF"


Django queryset StringAgg on arrayfield

I have some data which includes sizes, much like the model below.
class Product(models.Model):
    width = models.CharField()
    height = models.CharField()
    length = models.CharField()
Through annotation we have a field called at_size which produces data like:
[None, None, None]
['200', '000', '210']
['180', None, None]
This was accomplished like so (thanks to https://stackoverflow.com/a/70266320/5731101):
class Array(Func):
    template = '%(function)s[%(expressions)s]'
    function = 'ARRAY'

out_format = ArrayField(CharField(max_length=200))
annotated_qs = Product.objects.all().annotate(
    at_size=Array(F('width'), F('height'), F('length'),
                  output_field=out_format)
)
I'm trying to get this to convert into:
''
'200 x 000 x 210'
'180'
In code, this could look a bit like ' x '.join([i for i in data if i]). But as I need to accomplish this with database functions, it's a bit more challenging.
I've been playing with StringAgg, but I keep getting:
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
It looks like I need to make sure the None values are excluded from the initial Array-func to begin with. But I'm not sure where to get started here.
How can I accomplish this?
It turns out the problem was two-fold:
1: The null values can be cleaned out with array_remove.
2: Gluing the strings together with a delimiter through StringAgg only works if the inputs are strings. Since we have an array here, that wasn't the way to go; use array_to_string instead.
The final result looks like:
class Array(Func):
    # https://www.postgresql.org/docs/9.6/functions-array.html
    template = '%(function)s[%(expressions)s]'
    function = 'ARRAY'

class ArrayRemove(Func):
    # https://www.postgresql.org/docs/9.6/functions-array.html
    function = 'array_remove'

class ArrayToString(Func):
    # https://stackoverflow.com/a/57873772/5731101
    function = 'array_to_string'

out_format = ArrayField(CharField(max_length=200))
annotated_qs = annotated_qs.annotate(
    at_size=ArrayToString(
        ArrayRemove(
            Array(F('width'), F('height'), F('length'), output_field=out_format),
            None,  # Remove None values from the Array with ArrayRemove
        ),
        Value(" x "),  # Delimiter.
        Value(''),  # If there are null-values, replace with... (fallback)
        output_field=CharField(max_length=200),
    )
)
This produces the desired format:
for product in annotated_qs:
    print(product.at_size)
180 x 000 x 200
180 x 026 x 200
180 x 7 x 200
180 x 000 x 200
200 x 000 x 220
180 x 000 x 200
175 x 230 x 033
160 x 000 x 200
60 x 220
Product.objects.annotate(display_format=Concat(F('width'), Value('×'), F('height'), Value('×'), F('length')))
Should do the trick, no?
No need to overcomplicate this. Let's keep it nice and simple and use the database to concatenate the 3 strings, separated by the multiplication symbol (replace it if you prefer another character). Note that literal strings have to be wrapped in Value(), because Concat treats a bare string as a field name.
Take a look at the docs over here:
https://docs.djangoproject.com/en/4.1/ref/models/database-functions/#concat

ROT 13 Cipher: Creating a Function Python

I need to create a function that replaces a letter with the letter 13 letters after it in the alphabet (without using encode). I'm relatively new to Python so it has taken me a while to figure out a way to do this without using Encode.
Here's what I have so far. When I use this to type in a normal word like "hello" it works but if I pass through a sentence with special characters I can't figure out how to JUST include letters of the alphabet and skip numbers, spaces or special characters completely.
def rot13(b):
    b = b.lower()
    a = [chr(i) for i in range(ord('a'),ord('z')+1)]
    c = []
    d = []
    x = a[0:13]
    for i in b:
        c.append(a.index(i))
    for i in c:
        if i <= 13:
            d.append(a[i::13][1])
        elif i > 13:
            y = len(a[i:])
            z = len(x)- y
            d.append(a[z::13][0])
    e = ''.join(d)
    return e
EDIT
I tried using .isalpha() but this doesn't seem to be working for me - characters are duplicating for some reason when I use it. Is the following format correct:
def rot13(b):
    b1 = b.lower()
    a = [chr(i) for i in range(ord('a'),ord('z')+1)]
    c = []
    d = []
    x = a[0:13]
    for i in b1:
        if i.isalpha():
            c.append(a.index(i))
            for i in c:
                if i <= 12:
                    d.append(a[i::13][1])
                elif i > 12:
                    y = len(a[i:])
                    z = len(x)- y
                    d.append(a[z::13][0])
        else:
            d.append(i)
    if message[0].istitle() == True:
        d[0] = d[0].upper()
    e = ''.join(d)
    return e
Following on from the comments: the OP was advised to use isalpha and is wondering why that causes duplication (see the OP's edit).
This isn't tied to the use of isalpha; it's caused by the second for loop:
for i in c:
That loop isn't necessary and is what causes the duplication, because it re-processes every index collected so far each time a new letter is read. You should remove it. Instead, you can do the same thing with index = a.index(i). You were already computing this, but appending it to a list, which caused the confusion.
Use the index variable anywhere you would have used i inside the for i in c loop. On a side note, try not to reuse the same variable name in nested for loops. It just causes confusion, but that's a matter for code review.
Assuming you do all that correctly, it should work.
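For reference, the single-loop approach described above could be sketched as follows. This is only one possible shape, not the OP's final code: it assumes plain ASCII letters, and it preserves the case of each character rather than lowercasing everything and capitalising the first letter, which is a design choice of this sketch.

```python
import string


def rot13(text):
    """Rotate ASCII letters by 13 places; leave every other character untouched."""
    alphabet = string.ascii_lowercase
    result = []
    for ch in text:
        if ch.lower() in alphabet:
            # Look up the position once and use it directly (no second loop).
            index = alphabet.index(ch.lower())
            shifted = alphabet[(index + 13) % 26]
            result.append(shifted.upper() if ch.isupper() else shifted)
        else:
            # Digits, spaces and punctuation pass through unchanged.
            result.append(ch)
    return ''.join(result)


print(rot13("Hello, World! 123"))  # Uryyb, Jbeyq! 123
```

Because 13 is half of 26, applying the function twice returns the original text, which makes for an easy sanity check.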

Use a regular expression extract substring from data frame columns in R

I am fairly new to R so please go easy on me if this is a stupid question.
I have a dataframe called foo:
> head(foo)
Old.Clone.Name New.Clone.Name File
1 A Aa A_mask_MF_final_IS2_SAEE7-1_02.nrrd
2 B Bb B_mask_MF_final_IS2ViaIS2h_SADQ15-1_02.nrrd
3 C Cc C_mask_MF_final_IS2ViaIS2h_SAEC16-1_02.nrrd
4 D Dd D_mask_MF_final_IS2ViaIS2h_SAEJ6-1_02.nrrd
5 E Ee F_mask_MF_final_IS2_SAED9-1_02.nrrd
6 F Ff F_mask_MF_final_IS2ViaIS2h_SAGP3-1_02.nrrd
I want to extract codes from the File column that match the regular expression (S[A-Z]{3}[0-9]{1,2}-[0-9]_02), to give me:
SAEE7-1_02
SADQ15-1_02
SAEC16-1_02
SAEJ6-1_02
SAED9-1_02
SAGP3-1_02
I then want to use these codes to search another directory for other files that contain the same code.
I fail, however, at the first hurdle and cannot extract the codes from that column of the data frame.
I have tried:
library('stringr')
str_extract(foo[3],regex("(S[A-Z]{3}[0-9]{1,2}-[0-9]_02)", ignore_case = TRUE))
but this just returns [1] NA.
Am I simply missing something obvious? I look forward to cracking this with a bit of help from the community.
Hello. If you read the data in as a table, then foo[3] is a list (a one-column data frame), and str_extract does not accept lists, only character vectors. You should use lapply to extract the match from every element:
lapply(foo[3], function(x) str_extract(x, "[sS][a-zA-Z]{3}[0-9]{1,2}-[0-9]_02"))
Result:
[1] "SAEE7-1_02" "SADQ15-1_02" "SAEC16-1_02" "SAEJ6-1_02" "SAED9-1_02"
[6] "SAGP3-1_02"
str_extract(foo[3],"(?i)S[A-Z]{3}[0-9]{1,2}-[0-9]_02")
seems to work. Somehow, my R gave me
"Error in check_pattern(pattern, string) : could not find function "regex""
when using your original expression.
The following code reproduces what you asked for (just copy and paste it into your R console):
library(stringr)
foo = scan(what='')
Old.Clone.Name New.Clone.Name File
A Aa A_mask_MF_final_IS2_SAEE7-1_02.nrrd
B Bb B_mask_MF_final_IS2ViaIS2h_SADQ15-1_02.nrrd
C Cc C_mask_MF_final_IS2ViaIS2h_SAEC16-1_02.nrrd
D Dd D_mask_MF_final_IS2ViaIS2h_SAEJ6-1_02.nrrd
E Ee F_mask_MF_final_IS2_SAED9-1_02.nrrd
F Ff F_mask_MF_final_IS2ViaIS2h_SAGP3-1_02.nrrd
foo = matrix(foo,ncol=3,byrow=T)
colnames(foo)=foo[1,]
foo = foo[-1,]
foo
str_extract(foo[,3],regex("(S[A-Z]{3}[0-9]{1,2}-[0-9]_02)", ignore_case = T))
The reason you get NA is subtle: R stores matrix entries by column, so for a matrix foo[3] is the single element in the 3rd row and 1st column, and for a data frame foo[3] is a one-column data frame rather than a character vector. To select the third column, use foo[,3], or foo <- data.frame(foo); foo[[3]].

Format long number to shorter version in Lua

I'm trying to figure out how I would go about formatting a large number to the shorter version by appending 'k' or 'm' using Lua. Example:
17478 => 17.5k
2832 => 2.8k
1548034 => 1.55m
I would like to have the rounding in there as well as per the example. I'm not very good at Regex, so I'm not sure where I would begin. Any help would be appreciated. Thanks.
Pattern matching doesn't seem like the right direction for this problem.
Assuming 2 digits after decimal point are kept in the shorter version, try:
function foo(n)
  if n >= 10^6 then
    return string.format("%.2fm", n / 10^6)
  elseif n >= 10^3 then
    return string.format("%.2fk", n / 10^3)
  else
    return tostring(n)
  end
end
Test:
print(foo(17478))
print(foo(2832))
print(foo(1548034))
Output:
17.48k
2.83k
1.55m
Here is a longer form, which uses the hint from Tom Blodget.
Maybe it's not the perfect form, but it's a little more specific.
For Lua 5.0, replace #steps with table.getn(steps).
function shortnumberstring(number)
  local steps = {
    {1,""},
    {1e3,"k"},
    {1e6,"m"},
    {1e9,"g"},
    {1e12,"t"},
  }
  for _,b in ipairs(steps) do
    if b[1] <= number+1 then
      steps.use = _
    end
  end
  local result = string.format("%.1f", number / steps[steps.use][1])
  if tonumber(result) >= 1e3 and steps.use < #steps then
    steps.use = steps.use + 1
    result = string.format("%.1f", tonumber(result) / 1e3)
  end
  --result = string.sub(result,0,string.sub(result,-1) == "0" and -3 or -1) -- Remove .0 (just if it is zero!)
  return result .. steps[steps.use][2]
end
print(shortnumberstring(100))
print(shortnumberstring(200))
print(shortnumberstring(999))
print(shortnumberstring(1234567))
print(shortnumberstring(999999))
print(shortnumberstring(9999999))
print(shortnumberstring(1345123))
Result:
> dofile"test.lua"
100.0
200.0
1.0k
1.2m
1.0m
10.0m
1.3m
>
And if you want to get rid of the "XX.0", uncomment the line before the return.
Then our result is:
> dofile"test.lua"
100
200
1k
1.2m
1m
10m
1.3m
>

Regular expression puzzle

This is not homework, but an old exam question. I am curious to see the answer.
We are given an alphabet S={0,1,2,3,4,5,6,7,8,9,+}. Define the language L as the set of strings w from this alphabet such that w is in L if:
a) w is a number such as 42 or w is the (finite) sum of numbers such as 34 + 16 or 34 + 2 + 10
and
b) The number represented by w is divisible by 3.
Write a regular expression (and a DFA) for L.
This should work:
^(?:0|(?:(?:[369]|[147](?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\
+)*[369]0*)*\+?(?:0\+)*[258])*(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]|0*(?:
\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147])|[
258](?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0
\+)*[147])*(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]|0*(?:\+?(?:0\+)*[369]0*)
*\+?(?:0\+)*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]))0*)+)(?:\+(?:0|(?:(?
:[369]|[147](?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\+)*[369]0*)
*\+?(?:0\+)*[258])*(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]|0*(?:\+?(?:0\+)*
[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147])|[258](?:0*(?
:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147])*
(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]|0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)
*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]))0*)+))*$
It works by having three states representing the sum of the digits so far modulo 3. It disallows leading zeros on numbers, and plus signs at the start and end of the string, as well as two consecutive plus signs.
Generation of regular expression and test bed:
a = r'0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*'
b = r'a[147]'
c = r'a[258]'
r1 = '[369]|[147](?:bc)*(?:c|bb)|[258](?:cb)*(?:b|cc)'
r2 = '(?:0|(?:(?:' + r1 + ')0*)+)'
r3 = '^' + r2 + r'(?:\+' + r2 + ')*$'
r = r3.replace('b', b).replace('c', c).replace('a', a)
print(r)
# Test on 10000 examples.
import random, re
random.seed(1)
r = re.compile(r)
for _ in range(10000):
    x = ''.join(random.choice('0123456789+') for j in range(random.randint(1,50)))
    # Invalid: leading '+', '++', a number with a leading zero, or a trailing '+'.
    if re.search(r'(?:\+|^)(?:\+|0[0-9])|\+$', x):
        valid = False
    else:
        valid = eval(x) % 3 == 0
    result = re.match(r, x) is not None
    if result != valid:
        print('Failed for ' + x)
Note that my memory of DFA syntax is woefully out of date, so my answer is undoubtedly a little broken. Hopefully this gives you a general idea. I've chosen to ignore + completely. As AmirW states, abc+def and abcdef are the same for divisibility purposes.
Accept state is C.
A=1,4,7,BB,AC,CA
B=2,5,8,AA,BC,CB
C=0,3,6,9,AB,BA,CC
Notice that the above language uses all 9 possible ABC pairings. It will always end at either A,B,or C, and the fact that every variable use is paired means that each iteration of processing will shorten the string of variables.
Example:
1490 = AACC = BCC = BC = B (Fail)
1491 = AACA = BCA = BA = C (Success)
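The reduction above can also be read left to right as a three-state machine. Here is a minimal Python sketch of that reading, assuming (as this answer does) that plus signs are simply skipped; the names DIGIT_CLASS, COMBINE and final_state are made up for illustration.

```python
# Digit classes from the answer: A = remainder 1, B = remainder 2, C = remainder 0.
DIGIT_CLASS = {}
DIGIT_CLASS.update({d: "A" for d in "147"})
DIGIT_CLASS.update({d: "B" for d in "258"})
DIGIT_CLASS.update({d: "C" for d in "0369"})

# The nine pairings listed above, written as (left, right) -> result.
COMBINE = {
    "BB": "A", "AC": "A", "CA": "A",
    "AA": "B", "BC": "B", "CB": "B",
    "AB": "C", "BA": "C", "CC": "C",
}


def final_state(w):
    """Return the state (A, B or C) reached after reading w, ignoring '+'."""
    state = "C"  # nothing read yet, so the digit sum is 0
    for ch in w:
        if ch == "+":
            continue  # '+' does not change divisibility by 3
        state = COMBINE[state + DIGIT_CLASS[ch]]
    return state


print(final_state("1490"))  # B (fail)
print(final_state("1491"))  # C (success)
```

Note that this sketch does not check that the plus signs are well placed; it only tracks the digit sum modulo 3.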
Not a full solution, just an idea:
(B) alone: The "plus" signs don't matter here. abc + def is the same as abcdef for the sake of divisibility by 3. For the latter case, there is a regexp here: http://blog.vkistudios.com/index.cfm/2008/12/30/Regular-Expression-to-determine-if-a-base-10-number-is-divisible-by-3
To combine this with requirement (A), we can take the solution of (B) and modify it:
First read character must be in 0..9 (not a plus)
Input must not end with a plus, so: Duplicate each state (will use S for the original state and S' for the duplicate to distinguish between them). If we're in state S and we read a plus we'll move to S'.
When reading a number we'll go to the new state as if we were in S. S' states cannot accept (another) plus.
Also, S' is not "accept state" even if S is. (because input must not end with a plus).
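As a rough illustration of the duplicated-state idea, here is a small Python sketch. It tracks the digit sum modulo 3 together with a flag that plays the role of the primed states S' (a digit must come next). The function name accepts is hypothetical, the start state is treated like a primed state so that a leading plus is rejected, and leading zeros are not checked because this answer does not mention them.

```python
def accepts(w):
    """Simulate the DFA: states are (remainder mod 3, primed)."""
    remainder = 0
    primed = True  # True at the start and right after '+': a digit must follow
    for ch in w:
        if ch == "+":
            if primed:
                return False  # leading '+' or two '+' in a row
            primed = True  # move from S to S'
        elif ch in "0123456789":
            remainder = (remainder + int(ch)) % 3
            primed = False  # reading a digit from S' behaves as if we were in S
        else:
            return False  # symbol outside the alphabet
    # Primed states never accept, so the input cannot end with '+' (or be empty).
    return not primed and remainder == 0


print(accepts("12+3"))   # True
print(accepts("34+16"))  # False
print(accepts("3+"))     # False
```

This gives six states in total (three remainders, each with a primed copy), of which only the unprimed remainder-0 state accepts.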