How to find closest exceeding number between two lists? - python-2.7

I have two lists of numbers, list A and list B.
I want to map every number in list A to a number in list B: the closest number in list B that the number from list A exceeds.
So for example, if I have the number 5 in list A and there are the numbers 3 and 6 in list B, then I want the number 5 to map to 3.
I realize I could do this by taking the difference between each number in list A and each number in list B, then indexing and such, but list A and list B are extremely long, and I was wondering if there is a more efficient way to go about this.
Thanks!

You say you are looking for something faster than getting the difference. If you look at this answer, which computes the closest value for a single item in O(n), applying it to every item of your list takes O(n^2) overall, which is usually quick enough unless the lists are very large. Your solution would look like this:
>>> A = [100, 7, 9]
>>> B = [2, 5, 6, 8, 123, 12]
>>> [min(B, key=lambda x: 2**16 if x > y else abs(x-y)) for y in A]
[12, 6, 8]
The 2**16 sentinel is slightly dirty (it only works while all differences stay below 65536; float('inf') would be safer), but it gets the job done.
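For very long lists, sorting B once and then binary-searching it for each element of A avoids comparing every pair. The following is a minimal sketch, not the answer's method: it assumes that the wanted value is the largest number in B that is less than or equal to the number from A (the same comparison the one-liner uses), returns None when B has no such number, and uses a made-up function name.

```python
from bisect import bisect_right

def closest_not_exceeding(A, B):
    """For each a in A, return the largest b in B with b <= a (None if there is none)."""
    B_sorted = sorted(B)  # O(m log m), done once
    result = []
    for a in A:
        i = bisect_right(B_sorted, a)  # number of elements in B_sorted that are <= a
        result.append(B_sorted[i - 1] if i else None)
    return result

print(closest_not_exceeding([100, 7, 9], [2, 5, 6, 8, 123, 12]))  # [12, 6, 8]
```

If equal values should not count as a match, bisect_left can be used in place of bisect_right.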

Related

Simple way to comprehend a list from a secondary list, only for ascending values

I have data in a pandas dataframe that consists of values that increase to a point, and then start decreasing. I am wondering how to simply extract the values up to the point at which they stop increasing.
For example,
d = {'values' : [1, 2, 3, 3, 2, 1]}
df = pd.DataFrame(data=d)
desired result = [1, 2, 3]
This is my attempt, which I thought would check to see if the current list index is larger than the previous, then move on:
result = [i for i in df['values'] if df['values'][i-1] < df['values'][i]]
which returns
[1, 2, 2, 1]
I'm unsure what is happening for that to be the result.
Edit:
Utilizing the .diff() function, suggested by Andrej, combined with list comprehension, I get the same result. (the numpy np.isnan() is used to include the first element of the difference list, which is NaN).
result = [i for i in df['values']
if df['values'].diff().iloc[i]>0
or np.isnan(df['values'].diff().iloc[i])]
result = [1, 2, 2, 1]
You can use .diff() to get the difference between consecutive values. While the values are increasing, the difference is positive. As the next step, take the .cumsum() of these differences and look for the index of its maximum:
print(df.loc[: df["values"].diff().cumsum().idxmax()])
Prints:
   values
0       1
1       2
2       3
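As for why the attempt in the question returns [1, 2, 2, 1]: `for i in df['values']` iterates over the values themselves, so each value is then reused as an index label rather than a position. A minimal sketch of one alternative, written in plain Python rather than pandas (the helper name increasing_prefix is made up for illustration), stops at the first value that fails to increase:

```python
def increasing_prefix(values):
    """Return the leading run of strictly increasing values."""
    result = []
    for v in values:
        # Stop as soon as a value is not greater than the previous one kept.
        if result and v <= result[-1]:
            break
        result.append(v)
    return result

print(increasing_prefix([1, 2, 3, 3, 2, 1]))  # [1, 2, 3]
```

Because the function only iterates over its argument, it should also accept a pandas Series such as df['values'].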

Combination of elements of lists that meet some condition?

Given:
a = [5, 2, 8, 3, 9]
b = [3, 5, 7, 6, 8]
c = [8, 5, 7, 4, 9]
What is needed:
d = [(9, 8), (8, 7), ..., (5, 5, 5), (5, 6, 5), (5, 6, 7), ..., (8, 7, 7), (9, 8, 9), ...].
Description:
(1) In the above example, there are three lists a, b, c having integer elements and the output is another list d of tuples.
(2) The tuples in d have elements belonging to (a and b and c) or (a and b) or (b and c) such that difference between elements within any tuple is not greater than 1.
(3) Problem: How to find the complete list d, where we take any element from any input list and the difference is less than or equal to 1. Generalize to more than just three input lists: a, b, c, d, e, ..., each having ~ 1000 elements. I also need to retrieve the indices, relative to the input lists/arrays, of the elements that form the tuples.
(4) Clarification: (a) All such tuples which contain entries not differing by more than 1 are allowed.
(b) Tuples must have elements that are close to at least one other element by not more than 1.
(c) Entries within a tuple must belong to different input arrays/ lists.
Let me know if there are further clarifications needed!
You can use sorting to find the results faster than naive brute force. That said, this assumes the number of output tuples is reasonably small; otherwise no method can produce them in a reasonable time (it could take months). As @mosway pointed out in the comments, the number of combinations can be enormous, since it grows as O(N ** M) (i.e. exponentially), where N is the length of the lists and M is the number of lists.
The idea is to apply np.unique to every list to get sorted arrays of unique items. Then iterate over the first array and, for each number n in it, find the range of values in the second array that fall in [n-1; n+1] using np.searchsorted. You can then iterate over those filtered values of the second array and recursively do the same on the remaining arrays.
Note that the method can be significantly faster depending on which array is chosen first. A good heuristic could be to start with an array whose values are very distant from the others: computing a distance matrix over all the values of all arrays and selecting the array with the largest average distance should help.
Note also that using Numba should significantly speed up the recursive calls.
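The search step described above can be sketched for two lists as follows. This is only an illustration under assumptions: it uses the standard-library bisect module on sorted, de-duplicated lists in place of np.unique and np.searchsorted, it handles just two inputs rather than the recursive general case, and the function name close_pairs is hypothetical.

```python
from bisect import bisect_left, bisect_right

def close_pairs(a, b, tol=1):
    """Return all (x, y) with x in a, y in b and |x - y| <= tol."""
    a_sorted = sorted(set(a))  # unique sorted values, similar to np.unique
    b_sorted = sorted(set(b))
    pairs = []
    for x in a_sorted:
        lo = bisect_left(b_sorted, x - tol)   # first index with value >= x - tol
        hi = bisect_right(b_sorted, x + tol)  # first index with value > x + tol
        pairs.extend((x, y) for y in b_sorted[lo:hi])
    return pairs

a = [5, 2, 8, 3, 9]
b = [3, 5, 7, 6, 8]
print(close_pairs(a, b))
# [(2, 3), (3, 3), (5, 5), (5, 6), (8, 7), (8, 8), (9, 8)]
```

De-duplicating this way discards the original indices; recovering them, as the question asks, would need the positions to be carried along with the values.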

How would I compare a list (or equivalent) to another list in c++

I am attempting to learn C++ from scratch and possess a medium amount of python knowledge.
Here is some of my python code which takes a number, turns it into a list and checks if it contains all digits 0-9. If so it returns True, if not it returns False.
def val_checker(n):
    values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    lst = []
    for i in range(len(str(n))):
        lst.append((n // 10 ** i) % 10)
    lst = lst[::-1]
    return all(i in lst for i in values)
How would I achieve a similar thing in C++?
You would use the standard library container std::set or, better yet, std::unordered_set.
This container holds at most one of each distinct element; duplicate insertions are ignored.
So you can run through the digits of your original number in a loop, adding each digit to the set, and treat it as a success if s.size() == 10 (assuming your set is called s).

Find the same numbers between [a,b] intervals

Suppose I have 3 arrays of consecutive numbers
a = [1, 2, 3]
b = [2, 3, 4]
c = [3, 4]
Then the same number that appears in all 3 arrays is 3.
My algorithm is to use two nested for loops to check for the same numbers in two arrays and push them into another array (let's call it d). Then
d = [2, 3] (d = a overlap b)
Then I use it again on arrays d and c => the final result is 1, because there is only 1 number that appears in all 3 arrays.
e = [3] (e = c overlap d) => e.length = 1
Other than that, if there exists only 1 array, then the algo should return the length of the array, as all of its numbers appear in itself. But I think my said algo above would take too long because the numbers of array can go up to 10^5. So, any idea of a better algorithm?
But I think my said algo above would take too long because the numbers of array can go up to 10^5. So, any idea of a better algorithm?
Yes, since these are ranges, you basically want to calculate the intersection of the ranges. This means that you can calculate the maximum m of all the first elements of the lists, and the minimum n of all the last elements of the lists. All the numbers between m and n (both inclusive) are then members of all lists. If m > n, then no number appears in all of the lists.
You do not need to calculate the overlap by enumerating over the first list, and check if these are members of the last list. Since these are consecutive numbers, we can easily find out what the overlap is.
In short, the overlap of [a, ..., b] and [c, ..., d] is [ max(a,c), ..., min(b,d) ], there is no need to check the elements in between.
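The max/min rule above can be sketched in a few lines. This assumes every array is non-empty and holds consecutive integers in ascending order; the function name common_count is made up for illustration.

```python
def common_count(arrays):
    """Count the numbers present in every array of consecutive integers."""
    m = max(arr[0] for arr in arrays)   # largest first element
    n = min(arr[-1] for arr in arrays)  # smallest last element
    return max(0, n - m + 1)            # empty overlap when m > n

print(common_count([[1, 2, 3], [2, 3, 4], [3, 4]]))  # 1
print(common_count([[1, 2, 3]]))                     # 3
```

Only the first and last element of each array are read, so the cost depends on the number of arrays, not on their lengths.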

Python3 how to create a list of partial products

I have a very long list (of big numbers), let's say for example:
a=[4,6,7,2,8,2]
I need to get this output:
b=[4,24,168,336,2688,5376]
where each b[i]=a[0]*a[1]...*a[i]
I'm trying to do this recursively in this way:
b=[4] + [ a[i-1]*a[i] for i in range(1,6)]
but the (wrong) result is: [4, 24, 42, 14, 16, 16]
I don't want to compute all the products each time. I need an efficient way (if possible), because the list is very long.
At the moment this works for me:
b=[0]*6
b[0]=4
for i in range(1,6): b[i]=a[i]*b[i-1]
but it's too slow. Any ideas? Is it possible to avoid the "for" loop or to speed it up in some other way?
You can calculate the product step-by-step since every next calculation heavily depends on the previous one.
What I mean is:
1) Compute the product of the first i - 1 numbers
2) The i-th product is then equal to a[i] * the product of those first i - 1 numbers
This method is called dynamic programming:
Dynamic programming (also known as dynamic optimization) is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions
This is the implementation:
a = [4, 6, 7, 2, 8, 2]
b = []
product_so_far = 1
for i in range(len(a)):
    product_so_far *= a[i]
    b.append(product_so_far)
print(b)
This algorithm performs a linear number of multiplications (O(n)), which is the best complexity you'll get for such a task.
If you want a little optimization, you could preallocate the b list at its final length (b = [0] * len(a)) and, instead of appending, assign inside the loop:
b[i] = product_so_far
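Since the question is about Python 3, the explicit loop can also be replaced by the standard library's itertools.accumulate. This is an alternative to the loop above, not part of the original answer, and it does the same linear number of multiplications:

```python
from itertools import accumulate
from operator import mul

a = [4, 6, 7, 2, 8, 2]
b = list(accumulate(a, mul))  # running product of a
print(b)  # [4, 24, 168, 336, 2688, 5376]
```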