Writing functions problems

New R programming student here. I am trying to write a function following these guidelines:
Use sample() to randomly select m integers from a vector y.
If input values are less than or equal to 100 & divisible by 3, then assign these values into a vector s.
If input values are greater than 100 & divisible by 4, then assign these values into a vector d.
Return a list l which contains the selected m integers, s & d.
Do not use the function which()
Apply the function to a sample which contains 20 randomly selected integers from vec = c(1:200). Show the output.
Below is the code I'm using. I have a couple of problems:
1) I don't understand where to input the m to select a specified number of integers from vector y.
2) I can't get my list to separate into labels for s & d vectors.
Any help would be much appreciated!
# Write function for problem
myfun <- function(y){
  x <- sample(y) # I need to select m integers from vector y; why doesn't sample(y, m) work here?
  s <- NULL
  d <- NULL
  for (i in x){
    if (i%%4==0 && i>100){
      d <- print(i)
    }
  }
  for (i in x){
    if (i%%3==0 && i<=100){
      s <- print(i)
    }
  }
  l <- sapply() # trying to combine vectors s & d into a list
  l[["d"]] <- d # trying to label values in vector d
  l[["s"]] <- s # trying to label values in vector s
  return(l)
}
myfun(aa) # prints correct numbers, but not properly labeled.
myfun(bb) # prints correct numbers, but not properly labeled.
vec = c(1:200) # make vector of integers from 1-200.
myfun(vec, 20) # run myfun on vector vec with 20 samples selected. Does not work; don't know where to put m in function.

Related

Find the same numbers between [a,b] intervals

Suppose I have 3 arrays of consecutive numbers
a = [1, 2, 3]
b = [2, 3, 4]
c = [3, 4]
Then the only number that appears in all 3 arrays is 3.
My algorithm is to use two nested for loops to find the numbers common to two arrays and push them into another array (let's call it d). Then
d = [2, 3] (d = a overlap b)
Then I use it again on arrays d and c => the final result is 1, because only 1 number appears in all 3 arrays.
e = [3] (e = c overlap d) => e.length = 1
Other than that, if there exists only 1 array, then the algo should return the length of the array, as all of its numbers appear in itself. But I think my said algo above would take too long because the numbers of array can go up to 10^5. So, any idea of a better algorithm?

But I think my said algo above would take too long because the numbers of array can go up to 10^5. So, any idea of a better algorithm?
Yes, since these are ranges, you basically want to calculate the intersection of the ranges. This means that you can calculate the maximum m of all the first elements of the lists, and the minimum n of all the last elements of the lists. All the numbers between m and n (both inclusive) are then members of all lists. If m > n, then no number appears in all of the lists.
You do not need to calculate the overlap by enumerating over the first list, and check if these are members of the last list. Since these are consecutive numbers, we can easily find out what the overlap is.
In short, the overlap of [a, ..., b] and [c, ..., d] is [ max(a,c), ..., min(b,d) ], there is no need to check the elements in between.
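
A minimal Python sketch of this idea, assuming each array of consecutive numbers is represented only by its (first, last) pair. The function name common_count is made up for illustration and is not from the question:

```python
def common_count(ranges):
    """Count the integers that appear in every inclusive range.

    ranges is a list of (first, last) pairs, one per array of
    consecutive numbers (an assumed representation).
    """
    if not ranges:
        return 0
    lo = max(first for first, _ in ranges)  # largest first element
    hi = min(last for _, last in ranges)    # smallest last element
    # If lo > hi the ranges do not all overlap, so nothing is shared.
    return max(0, hi - lo + 1)


# The three arrays from the question: [1, 2, 3], [2, 3, 4], [3, 4]
print(common_count([(1, 3), (2, 4), (3, 4)]))
```

This does one pass over the list of ranges and never looks at the elements in between, so the array lengths do not affect the running time.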

Finding the sum of all elements of a list in list which has characters in 1 line

ab = [ ['5','6','7','8','9','10'],['1','2','3'],['3','4','5']]
print sum([sum(int(x) for x in y for y in ab])])
I have to find the sum of all elements in ab with a single print statement. I'm trying to convert each element of each of the lists to int and creating a list which has sum of each individual list.
I get a syntax error and not sure how to do it.

You are defining the for loop for x (the inner loop) before the for loop for y (the outer loop). That is why it is not working. What you need is print sum(int(x) for y in ab for x in y)
Also, it is better to use a generator expression here rather than building a list: you are not reusing the list, so the generator is more efficient.
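
For reference, here is the same fix as a small runnable sketch. It assumes Python 3, so print is called as a function rather than used as the Python 2 statement from the question:

```python
ab = [['5', '6', '7', '8', '9', '10'], ['1', '2', '3'], ['3', '4', '5']]

# Outer loop (y over the sublists) comes first, inner loop (x over the
# strings in y) second, exactly as nested for statements would be written.
total = sum(int(x) for y in ab for x in y)
print(total)  # 63
```

The clauses of a comprehension or generator expression read left to right in the same order as the equivalent nested loops, which is why y must be introduced before x.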

Erase matrix element and give it new size and elements in rcpp

Example in R:
A: a = matrix(1:100,10,10)
B: a = matrix(1:9,3,3)
C: a = matrix(1:400,20,20)
What is the equivalent rcpp code for this simple example?
a is always one variable with mutable contents and size.
In A, I created the matrix a with this rcpp code:
NumericMatrix a(10,10)
and filled it with the sequence of numbers from 1 to 100.
I want to resize this matrix with a command like this:
a(3,3)
or
a(20,20)
and put 1 to 9 or 1 to 400 in it.

RcppArmadillo can solve the problem:
arma::mat m1 = arma::eye<arma::mat>( 10, 10 ) ;
m1.set_size(20,20); // set_size() does not preserve the old contents
m1.set_size(3,3);
I do not know whether this is possible in plain Rcpp.

haskell, counting how many prime numbers are there in a list

I'm a newbie to Haskell. I need a function 'f' which, given two integers, returns the number of prime numbers between them (i.e., greater than the first integer but smaller than the second).
Main> f 2 4
1
Main> f 2 10
3
Here is my code so far, but it doesn't work. Any suggestions? Thanks.
f :: Int -> Int -> Int
f x y
| x < y = length [ n | n <- [x..y], y 'mod' n == 0]
| otherwise = 0

Judging from your example, you want the number of primes in the open interval (x,y), which in Haskell is denoted [x+1 .. y-1].
Your primality testing is flawed; you're testing for factors of y.
To use a function name as an infix operator, use backticks (`), not single quotes (').
Try this instead:
-- note: no need for the otherwise, since [x..y] == [] if x>y
nPrimes a b = length $ filter isPrime [a+1 .. b-1]
Exercise for the reader: implement isPrime. Note that it only takes one argument.

Look at what your list comprehension does.
n <- [x..y]
Draw n from a list ranging from x to y.
y `mod` n == 0
Only select those n which evenly divide y.
length (...)
Find how many such n there are.
What your code currently does is find out how many of the numbers between x and y (inclusive) are factors of y. So if you do f 2 4, the list will be [2, 4] (the numbers that evenly divide 4), and the length of that is 2. If you do f 2 10, the list will be [2, 5, 10] (the numbers that evenly divide 10), and the length of that is 3.
It is important to try to understand for yourself why your code doesn't work. In this case, it's simply the wrong algorithm. For algorithms that find whether a number is prime, among many other sources, you can check the wikipedia article: Primality test.
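
To make the intended algorithm concrete, here is a rough sketch of the corrected logic. It is written in Python rather than Haskell, uses plain trial division as the primality test (one simple option among those in the article), and the names is_prime and n_primes are chosen only for illustration:

```python
def is_prime(n):
    """Trial division: n is prime if no i with 2 <= i*i <= n divides it."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def n_primes(a, b):
    """Count the primes strictly between a and b (open interval)."""
    return sum(1 for n in range(a + 1, b) if is_prime(n))


print(n_primes(2, 4))   # 1, matching f 2 4 in the question
print(n_primes(2, 10))  # 3, matching f 2 10 in the question
```

The key difference from the original code is that each candidate n is tested for its own divisors, instead of testing whether n divides the upper bound.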

If you want to work with large intervals, then it might be a better idea to compute a list of primes once (instead of doing an isPrime test for every number):
primes = -- A list with all prime numbers
candidates = [a+1 .. b-1]
myprimes = intersectSortedLists candidates primes
nPrimes = length $ myprimes
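
One way to read the "compute the primes once" idea is a sieve of Eratosthenes. The sketch below is in Python rather than Haskell and is only an illustration of that approach, not the code this answer had in mind; count_primes_between is a made-up name:

```python
def count_primes_between(a, b):
    """Count primes p with a < p < b using a sieve of Eratosthenes."""
    limit = b - 1              # largest candidate in the open interval
    if limit < 2:
        return 0
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    p = 2
    while p * p <= limit:
        if sieve[p]:
            # Cross out every multiple of p, starting at p*p.
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
        p += 1
    start = max(a + 1, 2)      # smallest candidate in the open interval
    return sum(sieve[start:limit + 1])


print(count_primes_between(2, 10))  # 3
```

The sieve is built once up to b - 1, and counting any sub-interval afterwards is just a slice, which is where the saving over a per-number test comes from.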

Enumerating combinations in a distributed manner

I have a problem where I must analyse 500C5 combinations (255244687600) of something. Distributing it over a 10-node cluster where each node processes roughly 10^6 combinations per second means the job will be complete in about seven hours.
The problem I have is distributing the 255244687600 combinations over the 10 nodes. I'd like to present each node with 25524468760 combinations; however, the algorithms I'm using can only produce the combinations sequentially. I'd like to be able to pass the set of elements and a range of combination indices, for example [0, 10^7), [10^7, 2×10^7), etc., and have the nodes themselves figure out the combinations.
The algorithms I'm using at the moment are from the following:
http://howardhinnant.github.io/combinations.html
Stack Overflow question Efficiently computing vector combinations
I've considered using a master node, that enumerates each of the combinations and sends work to each of the nodes. However, the overhead incurred in iterating the combinations from a single node and communicating back and forth work is enormous, and it will subsequently lead to the master node becoming the bottleneck.
Are there any good combination-iterating algorithms geared up for efficient/optimal distributed enumeration?

You may have some success with combinatorial numbers, which allow you to retrieve the N'th (n/10th) k-combination with a simple algorithm; then run the next_combination algorithm n/10 times on each of the ten nodes to iterate.
Sample code (in C#, but quite readable for a C++ programmer) can be found on MSDN.

Have node number n process every tenth combination, starting from the nth.
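
A small Python sketch of this round-robin scheme, using the standard itertools module. The function name combinations_for_node is hypothetical. Note that under this scheme every node still steps through the full sequence of combinations and only skips the work for the ones it does not own:

```python
from itertools import combinations, islice


def combinations_for_node(items, k, node, nodes):
    """Yield every nodes-th k-combination of items, starting at index node."""
    return islice(combinations(items, k), node, None, nodes)


# Example: 3 nodes sharing the 10 two-letter combinations of 'ABCDE'.
for node in range(3):
    print(node, list(combinations_for_node('ABCDE', 2, node, 3)))
```

No coordination is needed beyond telling each node its own number and the node count, but the enumeration cost is paid in full on every node.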

I know this question is old, but here is how it may be done efficiently.
All code is currently in Python, which I'm sure will be easy enough to translate to C++.
- You will probably want to move from using an integer for the characteristic vector to using a bit array, since the integers used will need 500 bits (not a problem in Python).
- Feel free to update to C++, anyone.
1) Distribute to the nodes their range of combinations (start number and length to process), the set of items from which combinations are to be chosen, and the number, k, of items to choose.
2) Initialise each node by having it find its starting combination directly from start and items.
3) Run each node by having it do the work for the first combination, then iterate through the rest of its combinations and do the associated work.
To perform 1, do as you suggest: find n-choose-k and divide it into ranges. In your case 500-choose-5 is, as you said, 255244687600, so for node = 0 to 9 you distribute: (start=node*25524468760, length=25524468760, items=items, k=5)
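
As an illustration of this first step, the sketch below computes the (start, length) pairs with math.comb (Python 3.8+). The helper name split_ranges is made up here, and giving the remainder to the first few nodes is an assumption for totals that do not divide evenly, a case this answer does not cover:

```python
from math import comb


def split_ranges(total, nodes):
    """Split the indices 0..total-1 into one (start, length) pair per node."""
    base, extra = divmod(total, nodes)
    ranges = []
    start = 0
    for node in range(nodes):
        # Assumption: the first `extra` nodes take one additional index.
        length = base + (1 if node < extra else 0)
        ranges.append((start, length))
        start += length
    return ranges


total = comb(500, 5)            # 255244687600
for start, length in split_ranges(total, 10):
    print(start, length)
```

For 500-choose-5 over 10 nodes the division is exact, so every node gets length 25524468760 and start = node * 25524468760, as stated above.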
To perform 2 you can find the starting combination directly (without iteration) using the combinatorial number system and find the integer representation of the combination's characteristic vector (to be used for the iteration in 3) at the same time:
def getCombination(index, items, k):
    '''Returns (combination, characteristicVector)
    combination - The single combination, of k elements of items, that would be at
    the provided index if all possible combinations had each been sorted in
    descending order (as defined by the order of items) and then placed in a
    sorted list.
    characteristicVector - an integer with chosen item's bits set.
    '''
    combination = []
    characteristicVector = 0
    n = len(items)
    nCk = 1
    for nMinusI, iPlus1 in zip(range(n, n - k, -1), range(1, k + 1)):
        nCk *= nMinusI
        nCk //= iPlus1
    curIndex = nCk
    for k in range(k, 0, -1):
        nCk *= k
        nCk //= n
        while curIndex - nCk > index:
            curIndex -= nCk
            nCk *= (n - k)
            nCk -= nCk % k
            n -= 1
            nCk //= n
        n -= 1
        combination.append(items[n])
        characteristicVector += 1 << n
    return combination, characteristicVector
The integer representation of the characteristic vector has k bits set in the positions of the items that make up the combination.
To perform 3 you can use Gosper's hack to iterate to the next characteristic vector for the combination in the same number system (the next combination that would appear in a sorted list of reverse sorted combinations relative to items) and, at the same time, create the combination:
def nextCombination(items, characteristicVector):
    '''Returns the next (combination, characteristicVector).
    combination - The next combination of items that would appear after the
    combination defined by the provided characteristic vector if all possible
    combinations had each been sorted in descending order (as defined by the order
    of items) and then placed in a sorted list.
    characteristicVector - an integer with chosen item's bits set.
    '''
    u = characteristicVector & -characteristicVector
    v = u + characteristicVector
    if v <= 0:
        raise OverflowError("Ran out of integers") # <- ready for C++
    characteristicVector = v + (((v ^ characteristicVector) // u) >> 2)
    combination = []
    copiedVector = characteristicVector
    index = len(items) - 1
    while copiedVector > 0:
        present, copiedVector = divmod(copiedVector, 1 << index)
        if present:
            combination.append(items[index])
        index -= 1
    return combination, characteristicVector
Repeat this length-1 times (since you already found the first one directly).
For example:
Five nodes processing 7-choose-3 letters:
>>> items = ('A','B','C','D','E','F','G')
>>> k = 3
>>> nodes = 5
>>> n = len(items)
>>> for nmip1, i in zip(range(n - 1, n - k, -1), range(2, k + 1)):
...     n = n * nmip1 // i
...
>>> for node in range(nodes):
...     length = n // nodes
...     start = node * length
...     print("Node {0} initialised".format(node))
...     combination, cv = getCombination(start, items, k)
...     doWork(combination)
...     for i in range(length-1):
...         combination, cv = nextCombination(items, cv)
...         doWork(combination)
...
Node 0 initialised
Doing work with: C B A
Doing work with: D B A
Doing work with: D C A
Doing work with: D C B
Doing work with: E B A
Doing work with: E C A
Doing work with: E C B
Node 1 initialised
Doing work with: E D A
Doing work with: E D B
Doing work with: E D C
Doing work with: F B A
Doing work with: F C A
Doing work with: F C B
Doing work with: F D A
Node 2 initialised
Doing work with: F D B
Doing work with: F D C
Doing work with: F E A
Doing work with: F E B
Doing work with: F E C
Doing work with: F E D
Doing work with: G B A
Node 3 initialised
Doing work with: G C A
Doing work with: G C B
Doing work with: G D A
Doing work with: G D B
Doing work with: G D C
Doing work with: G E A
Doing work with: G E B
Node 4 initialised
Doing work with: G E C
Doing work with: G E D
Doing work with: G F A
Doing work with: G F B
Doing work with: G F C
Doing work with: G F D
Doing work with: G F E
>>>
