How to group data in kdb+ using customized groups?

How to group data in kdb+ using customized groups? - grouping

I have a table (allsales) with a column for time (sale_time). I want to group the data by sale_time. But I want to be able to bucket this. ex any data where time is between 00:00:00-03:00:00 should be grouped together, 03:00:00-06:00:00 should be grouped together and so on. Is there a way to write such a query?

xbar is useful for rounding to interval values e.g.
q)5 xbar 1 3 5 8 10 11 12 14 18
0 0 5 5 10 10 10 10 15
We can then use this to group rows into time groups, for your example:
q)s:([] t:13:00t+00:15t*til 24; v:til 24)
q)s
t v
--------------
13:00:00.000 0
13:15:00.000 1
13:30:00.000 2
13:45:00.000 3
14:00:00.000 4
14:15:00.000 5
..
q)select count i,sum v by xbar[`int$03:00t;t] from s
t | x v
------------| ------
12:00:00.000| 8 28
15:00:00.000| 12 162
18:00:00.000| 4 86
"by xbar[`int$03:00t;t]" rounds the time column t to the nearest three hour value, then this is used as the group by.

There are few more ways to achieve the same results.
q)select count i , sum v by t:01:00u*3 xbar t.hh from s
q)select count i , sum v by t:180 xbar t.minute from s
t | x v
-----| ------
12:00| 8 28
15:00| 12 162
18:00| 4 86
But in all cases, be careful of the date column if present in the table, otherwise same time window across different dates will generate the wrong results.
q)s:([] d:24#2013.05.07 2013.05.08; t:13:00t+00:15t*til 24; v:til 24)
q)select count i , sum v by d, t:180 xbar t.minute from s
d t | x v
----------------| ----
2013.05.07 12:00| 4 12
2013.05.07 15:00| 6 78
2013.05.07 18:00| 2 42
2013.05.08 12:00| 4 16
2013.05.08 15:00| 6 84
2013.05.08 18:00| 2 44

Related

Subtracting values based on a index column and using a condition in the same column in DAX

I've a lot of material on Stack about this, but i'm still not able to reproduce it.
Sample data set.
Asset
Value
Index
A
10
1
B
15
1
C
20
1
A
11
2
B
17
2
C
24
2
A
18
3
B
25
3
C
30
3
What i want to do is, subtract the Asset values individually based on the index column.
Ex:
Asset A [1] -> 10
Asset A [2] -> 11
11 - 10 = 1
So the table would be like this.
Asset
Value
Index
Diff
A
10
1
0
B
15
1
0
C
20
1
0
A
11
2
1
B
17
2
2
C
24
2
4
A
18
3
7
B
25
3
8
C
30
3
6
This need's to be done using DAX.
Can you guys help me ?
Best Regards!

I just did this and it worked.
Diff =
var Assets = 'Table'[Asset]
var Ind = 'Table'[Index] - 1
Return
IF(Ind = -1, 0, 'Table'[Value] - CALCULATE(SUM('Table'[Value]),FILTER('Table','Table'[Asset] = Assets && 'Table'[Index] = Ind)))

Get the max of the average for each group

have the following table :
EmpId DeptId WeekNumber Month NumberofCalls
1 3 4 1 34
2 3 2 3 59
I created a measure to calculate the average of number of calls :
AvgCalls = AVG('MyTable'[NumberofCalls])
now I want to get the max average calls by month, week.
I will be having 3 filters :
Month
Week
Once I select all of them, the result in the histogram bar will be the employee having the max average calls.
Once I select the Month and the Week I want the histogram to display the code of the Employee (W1,W2,W3...) having the maximum average, in my case I get the following result all the employees but not the employee having the max average.

Here is my solution:
I tested it with some random datasets, Here is my data:
EmpId DeptId WeekNumber Month NumberofCalls
Emp01 3 W4 1 34
Emp01 3 W2 3 59
Emp02 3 W5 4 68
Emp02 3 W6 4 76
Emp03 3 W10 5 90
Emp04 4 W10 6 98
Emp04 4 W11 6 45
Emp05 4 W12 7 56
Emp06 4 W13 7 23
Emp07 4 W15 9 45
Emp08 4 W34 8 56
Emp09 4 W52 8 44
Emp05 4 W36 9 23
Emp01 4 W17 10 51
Emp02 4 W23 9 67
Emp06 4 W29 11 28
Emp05 4 W34 12 34
Emp07 4 W41 11 21
Emp04 4 W37 12 33
I wrote this measure using Iterator Function (ADDCOLUMNS):
MaxAverageEmployer =
VAR TAvgCalls =
ADDCOLUMNS(
SUMMARIZE(MyTable,MyTable[EmpId],MyTable[Month ],MyTable[WeekNumber ]),
"AvgCall",CALCULATE(AVERAGE('MyTable'[NumberofCalls]))
)
VAR TMaxAvgCalls =
ADDCOLUMNS(
TAvgCalls,
"MaxAvg",CALCULATE(MAXX(TAvgCalls,[AvgCall]))
)
VAR MaxEmpID =
ADDCOLUMNS(
TMaxAvgCalls,
"MaxEmp",CALCULATE(VALUES(MyTable[EmpId]),FILTER(TMaxAvgCalls,[AvgCall] = [MaxAvg]))
)
RETURN
MAXX(MaxEmpID,[MaxEmp])
Here is the part:
It showed nothing when I tried to show it on histogram (or Bar Chart Visual); but It gave me correct values on a table visual:
WeekNumber : I put in on Rows
MonthNumber : I put it on Slicer to filter it!
Here is the final solution, and I hope It is what you are looking for!

Density of fractions between 2 given numbers

I'm trying to do some analysis over a simple Fraction class and I want some data to compare that type with doubles.
The problem
Right know I'm looking for some good way to get the density of Fractions between 2 numbers. Fractions is basically 2 integers (e.g. pair< long, long>), and the density between s and t is the amount of representable numbers in that range. And it needs to be an exact, or very good approximation done in O(1) or very fast.
To make it a bit simpler, let's say I want all the numbers (not fractions) a/b between s and t, where 0 <= s <= a/b < t <= M, and 0 <= a,b <= M (b > 0, a and b are integers)
Example
If my fractions were of a data type which only count to 6 (M = 6), and I want the density between 0 and 1, the answer would be 12. Those numbers are:
0, 1/6, 1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 5/6.
What I thought already
A very naive approach would be to cycle trough all the possible fractions, and count those which can't be simplified. Something like:
long fractionsIn(double s, double t){
long density = 0;
long M = LONG_MAX;
for(int d = 1; d < floor(M/t); d++){
for(int n = ceil(d*s); n < M; n++){
if( gcd(n,d) == 1 )
density++;
}
}
return density;
}
But gcd() is very slow so it doesn't works. I also try doing some math but i couldn't get to anything good.
Solution
Thanks to #m69 answer, I made this code for Fraction = pair<Long,Long>:
//this should give the density of fractions between first and last, or less.
double fractionsIn(unsigned long long first, unsigned long long last){
double pi = 3.141592653589793238462643383279502884;
double max = LONG_MAX; //i can't use LONG_MAX directly
double zeroToOne = max/pi * max/pi * 3; // = approx. amount of numbers in Farey's secuence of order LONG_MAX.
double res = 0;
if(first == 0){
res = zeroToOne;
first++;
}
for(double i = first; i < last; i++){
res += zeroToOne/(i * i+1);
if(i == i+1)
i = nextafter(i+1, last); //if this happens, i might not count some fractions, but i have no other choice
}
return floor(res);
}
The main change is nextafter, which is important with big numbers (1e17)
The result
As I explain at the begining, I was trying to compare Fractions with double. Here is the result for Fraction = pair<Long,Long> (and here how I got the density of doubles):
Density between 0,1: | 1,2 | 1e6,1e6+1 | 1e14,1e14+1 | 1e15-1,1e15 | 1e17-10,1e17 | 1e19-10000,1e19 | 1e19-1000,1e19
Doubles: 4607182418800017408 | 4503599627370496 | 8589934592 | 64 | 8 | 1 | 5 | 0
Fraction: 2.58584e+37 | 1.29292e+37 | 2.58584e+25 | 2.58584e+09 | 2.58584e+07 | 2585 | 1 | 0

Density between 0 and 1
If the integers with which you express the fractions are in the range 0~M, then the density of fractions between the values 0 (inclusive) and 1 (exclusive) is:
M: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0~(1): 1 2 4 6 10 12 18 22 28 32 42 46 58 64 72 80 96 102 120 128 140 150 172 180 200 212 230 242 270 278 308 ...
This is sequence A002088 on OEIS. If you scroll down to the formula section, you'll find information about how to approximate it, e.g.:
Φ(n) = (3 ÷ π2) × n2 + O[n × (ln n)2/3 × (ln ln n)4/3]
(Unfortunately, no more detail is given about the constants involved in the O[x] part. See discussion about the quality of the approximation below.)
Distribution across range
The interval from 0 to 1 contains half of the total number of unique fractions that can be expressed with numbers up to M; e.g. this is the distribution when M = 15 (i.e. 4-bit integers):
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
72 36 12 6 4 2 2 2 1 1 1 1 1 1 1 1
for a total of 144 unique fractions. If you look at the sequence for different values of M, you'll see that the steps in this sequence converge:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1: 1 1
2: 2 1 1
3: 4 2 1 1
4: 6 3 1 1 1
5: 10 5 2 1 1 1
6: 12 6 2 1 1 1 1
7: 18 9 3 2 1 1 1 1
8: 22 11 4 2 1 1 1 1 1
9: 28 14 5 2 2 1 1 1 1 1
10: 32 16 5 3 2 1 1 1 1 1 1
11: 42 21 7 4 2 2 1 1 1 1 1 1
12: 46 23 8 4 2 2 1 1 1 1 1 1 1
13: 58 29 10 5 3 2 2 1 1 1 1 1 1 1
14: 64 32 11 5 4 2 2 1 1 1 1 1 1 1 1
15: 72 36 12 6 4 2 2 2 1 1 1 1 1 1 1 1
Not only is the density between 0 and 1 half of the total number of fractions, but the density between 1 and 2 is a quarter, and the density between 2 and 3 is close to a twelfth, and so on.
As the value of M increases, the distribution of fractions across the ranges 0-1, 1-2, 2-3 ... converges to:
1/2, 1/4, 1/12, 1/24, 1/40, 1/60, 1/84, 1/112, 1/144, 1/180, 1/220, 1/264 ...
This sequence can be calculated by starting with 1/2 and then:
0-1: 1/2 x 1/1 = 1/2
1-2: 1/2 x 1/2 = 1/4
2-3: 1/4 x 1/3 = 1/12
3-4: 1/12 x 2/4 = 1/24
4-5: 1/24 x 3/5 = 1/40
5-6: 1/40 x 4/6 = 1/60
6-7: 1/60 x 5/7 = 1/84
7-8: 1/84 x 6/8 = 1/112
8-9: 1/112 x 7/9 = 1/144 ...
You can of course calculate any of these values directly, without needing the steps inbetween:
0-1: 1/2
6-7: 1/2 x 1/6 x 1/7 = 1/84
(Also note that the second half of the distribution sequence consists of 1's; these are all the integers divided by 1.)
Approximating the density in given interval
Using the formulas provided on the OEIS page, you can calculate or approximate the density in the interval 0-1, and multiplied by 2 this is the total number of unique values that can be expressed as fractions.
Given two values s and t, you can then calculate and sum the densities in the intervals s ~ s+1, s+1 ~ s+2, ... t-1 ~ t, or use an interpolation to get a faster but less precise approximate value.
Example
Let's assume that we're using 10-bit integers, capable of expressing values from 0 to 1023. Using this table linked from the OEIS page, we find that the density between 0~1 is 318452, and the total number of fractions is 636904.
If we wanted to find the density in the interval s~t = 100~105:
100~101: 1/2 x 1/100 x 1/101 = 1/20200 ; 636904/20200 = 31.53
101~102: 1/2 x 1/101 x 1/102 = 1/20604 ; 636904/20604 = 30.91
102~103: 1/2 x 1/102 x 1/103 = 1/21012 ; 636904/21012 = 30.31
103~104: 1/2 x 1/103 x 1/104 = 1/21424 ; 636904/21424 = 29.73
104~105: 1/2 x 1/104 x 1/105 = 1/21840 ; 636904/21840 = 29.16
Rounding these values gives the sum:
32 + 31 + 30 + 30 + 29 = 152
A brute force algorithm gives this result:
32 + 32 + 30 + 28 + 28 = 150
So we're off by 1.33% for this low value of M and small interval with just 5 values. If we had used linear interpolation between the first and last value:
100~101: 31.53
104~105: 29.16
average: 30.345
total: 151.725 -> 152
we'd have arrived at the same value. For larger intervals, the sum of all the densities will probably be closer to the real value, because rounding errors will cancel each other out, but the results of linear interpolation will probably become less accurate. For ever larger values of M, the calculated densities should converge with the actual values.
Quality of approximation of Φ(n)
Using this simplified formula:
Φ(n) = (3 ÷ π2) × n2
the results are almost always smaller than the actual values, but they are within 1% for n ≥ 182, within 0.1% for n ≥ 1880 and within 0.01% for n ≥ 19494. I would suggest hard-coding the lower range (the first 50,000 values can be found here), and then using the simplified formula from the point where the approximation is good enough.
Here's a simple code example with the first 182 values of Φ(n) hard-coded. The approximation of the distribution sequence seems to add an error of a similar magnitude as the approximation of Φ(n), so it should be possible to get a decent approximation. The code simply iterates over every integer in the interval s~t and sums the fractions. To speed up the code and still get a good result, you should probably calculate the fractions at several points in the interval, and then use some sort of non-linear interpolation.
function fractions01(M) {
var phi = [0,1,2,4,6,10,12,18,22,28,32,42,46,58,64,72,80,96,102,120,128,140,150,172,180,200,212,230,242,270,278,308,
324,344,360,384,396,432,450,474,490,530,542,584,604,628,650,696,712,754,774,806,830,882,900,940,964,1000,
1028,1086,1102,1162,1192,1228,1260,1308,1328,1394,1426,1470,1494,1564,1588,1660,1696,1736,1772,1832,1856,
1934,1966,2020,2060,2142,2166,2230,2272,2328,2368,2456,2480,2552,2596,2656,2702,2774,2806,2902,2944,3004,
3044,3144,3176,3278,3326,3374,3426,3532,3568,3676,3716,3788,3836,3948,3984,4072,4128,4200,4258,4354,4386,
4496,4556,4636,4696,4796,4832,4958,5022,5106,5154,5284,5324,5432,5498,5570,5634,5770,5814,5952,6000,6092,
6162,6282,6330,6442,6514,6598,6670,6818,6858,7008,7080,7176,7236,7356,7404,7560,7638,7742,7806,7938,7992,
8154,8234,8314,8396,8562,8610,8766,8830,8938,9022,9194,9250,9370,9450,9566,9654,9832,9880,10060];
if (M < 182) return phi[M];
return Math.round(M * M * 0.30396355092701331433 + M / 4); // experimental; see below
}
function fractions(M, s, t) {
var half = fractions01(M);
var frac = (s == 0) ? half : 0;
for (var i = (s == 0) ? 1 : s; i < t && i <= M; i++) {
if (2 * i < M) {
var f = Math.round(half / (i * (i + 1)));
frac += (f < 2) ? 2 : f;
}
else ++frac;
}
return frac;
}
var M = 1023, s = 100, t = 105;
document.write(fractions(M, s, t));
Comparing the approximation of Φ(n) with the list of the 50,000 first values suggests that adding M÷4 is a workable substitute for the second part of the formula; I have not tested this for larger values of n, so use with caution.
Blue: simplified formula. Red: improved simplified formula.
Quality of approximation of distribution
Comparing the results for M=1023 with those of a brute-force algorithm, the errors are small in real terms, never more than -7 or +6, and above the interval 205~206 they are limited to -1 ~ +1. However, a large part of the range (57~1024) has fewer than 100 fractions per integer, and in the interval 171~1024 there are only 10 fractions or fewer per integer. This means that small errors and rounding errors of -1 or +1 can have a large impact on the result, e.g.:
interval: 241 ~ 250
fractions/integer: 6
approximation: 5
total: 50 (instead of 60)
To improve the results for intervals with few fractions per integer, I would suggest combining the method described above with a seperate approach for the last part of the range:
Alternative method for last part of range
As already mentioned, and implemented in the code example, the second half of the range, M÷2 ~ M, has 1 fraction per integer. Also, the interval M÷3 ~ M÷2 has 2; the interval M÷4 ~ M÷3 has 4. This is of course the Φ(n) sequence again:
M/2 ~ M : 1
M/3 ~ M/2: 2
M/4 ~ M/3: 4
M/5 ~ M/4: 6
M/6 ~ M/5: 10
M/7 ~ M/6: 12
M/8 ~ M/7: 18
M/9 ~ M/8: 22
M/10 ~ M/9: 28
M/11 ~ M/10: 32
M/12 ~ M/11: 42
M/13 ~ M/12: 46
M/14 ~ M/13: 58
M/15 ~ M/14: 64
M/16 ~ M/15: 72
M/17 ~ M/16: 80
M/18 ~ M/17: 96
M/19 ~ M/18: 102 ...
Between these intervals, one integer can have a different number of fractions, depending on the exact value of M, e.g.:
interval fractions
202 ~ 203 10
203 ~ 204 10
204 ~ 205 9
205 ~ 206 6
206 ~ 207 6
The interval 204 ~ 205 lies on the edge between intervals, because M ÷ 5 = 204.6; it has 6 + 3 = 9 fractions because M modulo 5 is 3. If M had been 1022 or 1024 instead of 1023, it would have 8 or 10 fractions. (This example is straightforward because 5 is a prime; see below.)
Again, I would suggest using the hard-coded values for Φ(n) to calculate the number of fractions for the last part of the range. If you use the first 17 values as listed above, this covers the part of the range with fewer than 100 fractions per integer, so that would reduce the impact of rounding errors below 1%. The first 56 values would give you 0.1%, the first 182 values 0.01%.
Together with the values of Φ(n), you could hard-code the number of fractions of the edge intervals for each modulo value, e.g.:
modulo: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
M/ 2 1 2
M/ 3 2 3 4
M/ 4 4 5 5 6
M/ 5 6 7 8 9 10
M/ 6 10 11 11 11 11 12
M/ 7 12 13 14 15 16 17 18
M/ 8 18 19 19 20 20 21 21 22
M/ 9 22 23 24 24 25 26 26 27 28
M/10 28 29 29 30 30 30 30 31 31 32
M/11 32 33 34 35 36 37 38 39 40 41 42
M/12 42 43 43 43 43 44 44 45 45 45 45 46
M/13 46 47 48 49 50 51 52 53 54 55 56 57 58
M/14 58 59 59 60 60 61 61 61 61 62 62 63 63 64
M/15 64 65 66 66 67 67 67 68 69 69 69 70 70 71 72
M/16 72 73 73 74 74 75 75 76 76 77 77 78 78 79 79 80
M/17 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96
M/18 96 97 97 97 97 98 98 99 99 99 99 100 100 101 101 101 101 102

This is exactly the same as: (Sum of phi(k)) where m <= k <= M where phi(k) is the Euler Totient Function and with phi(0) = 1 (as defined by the problem). There is no known closed form for this sum. However there are many optimizations known as mentioned in the wiki link. This is known as the Totient Summatory Function in Wolfram. The same website also links to the series: A002088 and provides a few asymptotic approximations.
The reasoning is this: consider the number of values of the form {1/M, 2/M, ...., (M-1)/M, M/M}. All those fractions that will be reducible to a smaller value will not be counted in phi(M) because they are not relatively prime. They will appear in the summation of another totient.
For example, phi(6) = 12 and you have 1 + phi(6), since you also count the 0.

How to retain calculated values between rows when calculating running totals?

I have a tricky question about conditional sum in SAS. Actually, it is very complicated for me and therefore, I cannot explain it by words. Therefore I want to show an example:
A B
5 3
7 2
8 6
6 4
9 5
8 2
3 1
4 3
As you can see, I have a datasheet that has two columns. First of all, I calculated the conditional cumulative sum of column A ( I can do it by myself-So no need help for that step):
A B CA
5 3 5
7 2 12
8 6 18
6 4 8 ((12+8)-18)+6
9 5 17
8 2 18
3 1 10 (((17+8)-18)+3
4 3 14
So my condition value is 18. If the cumulative more than 18, then it equal 18 and next value if sum of the first value after 18 and exceeds amount over 18. ( As I said I can do it by myself )
So the tricky part is I have to calculate the cumulative sum of column B according to column A:
A B CA CB
5 3 5 3
7 2 12 5
8 6 18 9.5 (5+(6*((18-12)/8)))
6 4 8 5.5 ((5+6)-9.5)+4
9 5 17 10.5 (5.5+5)
8 2 18 10.75 (10.5+(2*((18-7)/8)))
3 1 10 2.75 ((10.5+2)-10.75)+1
4 3 14 5.75 (2.75+3)
As you can see from example the cumulative sum of column B is very specific. When column CA is equal to our condition value (18), then we calculate the proportion of the last value for getting our condition value (18) and then use this proportion for computing cumulative sum of column B.

Looks like when the sum of A reaches 18 or more you want to split the values of A and B between the current and the next record. One way is to remember the left over values for A and B and carry them forward in your new cumulative variables. Just make sure to output the observation before resetting those variables.
data want ;
set have ;
ca+a;
cb+b;
if ca >= 18 then do;
extra_a=ca - 18;
extra_b=b - b*((a - extra_a)/a) ;
ca=18;
cb=cb-extra_b ;
end;
output;
if ca=18 then do;
ca=extra_a;
cb=extra_b;
end;
drop extra_a extra_b ;
run;

Can I use regular expressions to search for multiples of a number?

I'm trying to search a big project for all examples of where I've declared an array with [48] as the size or any multiples of 48.
Can I use a regular expression function to find matches of 48 * n?
Thanks.

Here you go (In PHP's PCRE syntax):
^(0*|(1(01*?0)*?1|0)+?0{4})$
Usage:
preg_match('/^(0*|(1(01*?0)*?1|0)+?0{4})$/', decbin($number));
Now, why it works:
Well we know that 48 is really just 3 * 16. And 16 is just 2*2*2*2. So, any number divisible by 2^4 will have the 4 most bits in its binary representation 0. So by ending the regexp with 0{4}$ is equivalent to saying that the number is divisible by 2^4 (or 16). So then, the bits to the left need to be divisible by 3. So using the regexp from this answer, we can tell if they are divisible by 3. So if the whole regexp matches, the number is divisible by both 3 and 16, and hence 48...
QED...
(Note, the leading 0| case handles the failed match when $number is 0). I've tested this on all numbers from 0 to 48^5, and it correctly matches each time...

A generalization of your question is asking whether x is a string representing a multiple of n in base b. This is the same thing as asking whether the remainder of x divided by n is 0. You can easily create a DFA to compute this.
Create a DFA with n states, numbered from 0 to n - 1. State 0 is both the initial state and the sole accepting state. Each state will have b outgoing transitions, one for each symbol in the alphabet (since base-b gives you b digits to work with).
Each state represents the remainder of the portion of x we've seen so far, divided by n. This is why we have n of them (dividing a number by n yields a remainder in the range 0 to n - 1), and also why state 0 is the accepting state.
Since the digits of x are processed from left to right, if we have a number y from the first few digits of x and read the digit d, we get the new value of y from yb + d. But more importantly, the remainder r changes to (rb + d) mod n. So we now know how to connect the transition arcs and complete the DFA.
You can do this for any n and b. Here, for example, is one that accepts multiples of 18 in base-10 (states on the rows, inputs on the columns):
| 0 1 2 3 4 5 6 7 8 9
---+-------------------------------
→0 | 0 1 2 3 4 5 6 7 8 9 ←accept
1 | 10 11 12 13 14 15 16 17 0 1
2 | 2 3 4 5 6 7 8 9 10 11
3 | 12 13 14 15 16 17 0 1 2 3
4 | 4 5 6 7 8 9 10 11 12 13
5 | 14 15 16 17 0 1 2 3 4 5
6 | 6 7 8 9 10 11 12 13 14 15
7 | 16 17 0 1 2 3 4 5 6 7
8 | 8 9 10 11 12 13 14 15 16 17
9 | 0 1 2 3 4 5 6 7 8 9
10 | 10 11 12 13 14 15 16 17 0 1
11 | 2 3 4 5 6 7 8 9 10 11
12 | 12 13 14 15 16 17 0 1 2 3
13 | 4 5 6 7 8 9 10 11 12 13
14 | 14 15 16 17 0 1 2 3 4 5
15 | 6 7 8 9 10 11 12 13 14 15
16 | 16 17 0 1 2 3 4 5 6 7
17 | 8 9 10 11 12 13 14 15 16 17
These get really tedious as n and b get larger, but you can obviously write a program to generate them for you no problem.

1|48|2304|110592|5308416
You are unlikely to have declared an array of size 48^5 or larger.

No, regular expressions can't calculate multiples (except in the unary number system: decimal 4 = unary 1111; decimal 8 = unary 11111111, so the regex ^(1111)+$ matches multiples of 4).

import re
# For real example,
# construction of a chain with integers multiples of 48
# and integers not multiple of 48.
from random import *
w = [ 48*randint( 1,10) for j in xrange(10) ]
w.extend( 48*randint(11,20) for j in xrange(10) )
w.extend( 48*randint(21,70) for j in xrange(10) )
a = [ el if el%48!=0 else el+1 for el in sample(xrange(1000),40) ]
w.extend(a)
shuffle(w)
texte = [ ''.join(sample(' abcdefghijklmonopqrstuvwxyz',randint(1,7))) for i in xrange(40) ]
X = ''.join(texte[i]+str(w[i]) for i in xrange(40))
# Searching the multiples of 48 in the chain X
def mult48(match):
g1 = match.group()
if int(g1)%48==0:
return ( g1, X[0:match.end()] )
else:
return ( g1, 'not multiple')
for match in re.finditer('\d+',X):
print '%s %s\n' % mult48(match)

Any multiple is difficult, but here's a (python-style) regexp that matches the first 200 multiples of 48.
0$|1(?:0(?:08$|56$)|1(?:04$|52$)|2(?:00$|48$|96$)|3(?:44$|92$)|4(?:4(?:$|0$)|88$\
)|5(?:36$|84$)|6(?:32$|80$)|7(?:28$|76$)|8(?:24$|72$)|9(?:2(?:$|0$)|68$))|2(?:0(\
?:16$|64$)|1(?:12$|60$)|2(?:08$|56$)|3(?:04$|52$)|4(?:0(?:$|0$)|48$|96$)|5(?:44$\
|92$)|6(?:40$|88$)|7(?:36$|84$)|8(?:32$|8(?:$|0$))|9(?:28$|76$))|3(?:0(?:24$|72$\
)|1(?:20$|68$)|2(?:16$|64$)|3(?:12$|6(?:$|0$))|4(?:08$|56$)|5(?:04$|52$)|6(?:00$\
|48$|96$)|7(?:44$|92$)|8(?:4(?:$|0$)|88$)|9(?:36$|84$))|4(?:0(?:32$|80$)|1(?:28$\
|76$)|2(?:24$|72$)|3(?:2(?:$|0$)|68$)|4(?:16$|64$)|5(?:12$|60$)|6(?:08$|56$)|7(?\
:04$|52$)|8(?:$|0(?:$|0$)|48$|96$)|9(?:44$|92$))|5(?:0(?:40$|88$)|1(?:36$|84$)|2\
(?:32$|8(?:$|0$))|3(?:28$|76$)|4(?:24$|72$)|5(?:20$|68$)|6(?:16$|64$)|7(?:12$|6(\
?:$|0$))|8(?:08$|56$)|9(?:04$|52$))|6(?:0(?:00$|48$|96$)|1(?:44$|92$)|2(?:4(?:$|\
0$)|88$)|3(?:36$|84$)|4(?:32$|80$)|5(?:28$|76$)|6(?:24$|72$)|7(?:2(?:$|0$)|68$)|\
8(?:16$|64$)|9(?:12$|60$))|7(?:0(?:08$|56$)|1(?:04$|52$)|2(?:0(?:$|0$)|48$|96$)|\
3(?:44$|92$)|4(?:40$|88$)|5(?:36$|84$)|6(?:32$|8(?:$|0$))|7(?:28$|76$)|8(?:24$|7\
2$)|9(?:20$|68$))|8(?:0(?:16$|64$)|1(?:12$|6(?:$|0$))|2(?:08$|56$)|3(?:04$|52$)|\
4(?:00$|48$|96$)|5(?:44$|92$)|6(?:4(?:$|0$)|88$)|7(?:36$|84$)|8(?:32$|80$)|9(?:2\
8$|76$))|9(?:0(?:24$|72$)|1(?:2(?:$|0$)|68$)|2(?:16$|64$)|3(?:12$|60$)|4(?:08$|5\
6$)|5(?:04$|52$)|6(?:$|0$))

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to group data in kdb+ using customized groups? - grouping

Related

Subtracting values based on a index column and using a condition in the same column in DAX

Get the max of the average for each group

Density of fractions between 2 given numbers

How to retain calculated values between rows when calculating running totals?

Can I use regular expressions to search for multiples of a number?

Categories

Resources