np.delete and np.s_. What's so special about np.s_? - python-2.7

I don't really understand why regular indexing can't be used for np.delete. What makes np.s_ so special?
For example, with this code, used to delete some of the rows of this array:
inlet_names = np.delete(inlet_names, np.s_[1:9], axis = 0)
Why can't I simply use regular indexing and do..
inlet_names = np.delete(inlet_names, [1:9], axis = 0)
or
inlet_names = np.delete(inlet_names, inlet_names[1:9], axis = 0)
From what I can gather, np.s_ is the same as np.index_exp except it doesn't return a tuple, but both can be used anywhere in Python code.
Then when I look into the np.delete function, it indicates that you can use something like [1,2,3] to delete those specific indexes along the entire array. So what's preventing me from using something similar to delete certain rows or columns from the array?
I'm simply assuming that this type of indexing is read as something else in np.delete so you need to use np.s_ in order to specify, but I can't get to the bottom of what exactly it would be reading it as because when I try the second piece of code it simply returns "invalid syntax". Which is weird because this code works...
inlet_names = np.delete(inlet_names, [1,2,3,4,5,6,7,8,9], axis = 0)
So I guess the answer could possibly be that np.delete only accepts a list of the indexes that you would like to delete, and that np.s_ returns a list of the indexes that you specify for the slice.
Just could use some clarification and some corrections on anything I just said about the functions that may be wrong, because a lot of this is just my take, the documents don't exactly explain everything that I was trying to understand. I think I'm just overthinking this, but I would like to actually understand it, if someone could explain it.

np.delete is not doing anything unique or special. It just returns a copy of the original array with some items missing. Most of the code just interprets the inputs in preparation to make this copy.
What you are asking about is the obj parameter
obj : slice, int or array of ints
In simple terms, np.s_ lets you supply a slice using the familiar : syntax. The x:y notation cannot be used as a function parameter.
Let's try your alternatives (you allude to these in results and errors, but they are buried in the text):
In [213]: x=np.arange(10)*2 # some distinctive values
In [214]: np.delete(x, np.s_[3:6])
Out[214]: array([ 0, 2, 4, 12, 14, 16, 18])
So delete with s_ removes a range of values, namely 6 8 10, the items at indexes 3 through 5.
In [215]: np.delete(x, [3:6])
  File "<ipython-input-215-0a5bf5cc05ba>", line 1
    np.delete(x, [3:6])
                   ^
SyntaxError: invalid syntax
Why the error? Because 3:6 is slice notation, which is only valid inside an indexing expression such as x[3:6]. Here [3:6] is read as a list literal, and np.delete is a function, not something being indexed. Even s_[[3:6]] has problems. np.delete(x, 3:6) is also bad, because Python only accepts the : syntax in an indexing context, where it automatically translates it into a slice object. Note that this is a syntax error, something that the interpreter catches before doing any calculations or function calls.
In [216]: np.delete(x, slice(3,6))
Out[216]: array([ 0, 2, 4, 12, 14, 16, 18])
A slice works instead of s_; in fact that is what s_ produces
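To see what the interpreter does with the colon notation, here is a minimal sketch of an s_-like helper. The class name SliceEcho is made up for illustration and is not NumPy's actual implementation; it only shows the same idea, namely that __getitem__ receives whatever object Python builds from the indexing expression.

```python
class SliceEcho:
    """Hypothetical stand-in for np.s_: hands back whatever it is indexed with."""

    def __getitem__(self, item):
        return item


echo = SliceEcho()

single = echo[3:6]       # Python builds slice(3, 6, None) before calling __getitem__
pair = echo[1:4, 5:10]   # several items arrive as a tuple of slices

print(single)
print(pair)
print(single == slice(3, 6))
```

Because the slice object is built by the indexing syntax itself, writing slice(3, 6) by hand gives np.delete exactly the same argument.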
In [233]: np.delete(x, [3,4,5])
Out[233]: array([ 0, 2, 4, 12, 14, 16, 18])
A list also works, though it works in a different way (see below).
In [217]: np.delete(x, x[3:6])
Out[217]: array([ 0, 2, 4, 6, 8, 10, 14, 18])
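This works, but produces a different result, because x[3:6] is not the same as range(3,6): it holds the values 6, 8 and 10, which delete then treats as indexes. (Index 10 is out of range for this array; the NumPy version used here ignores it, while recent releases raise an IndexError instead.) Also, np.delete does not work like the list delete. It deletes by index, not by matching value.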
np.index_exp fails for the same reason that np.delete(x, (slice(3,6),)) does. 1, [1], (1,) are all valid and remove one item. Even '1', the string, works. delete parses this argument and, at this level, expects something that can be turned into an integer array (obj.astype(intp)). (slice(None),) is not a slice; it is a 1-item tuple, so it's handled in a different spot in the delete code. The result is a TypeError produced by something that delete calls, which is very different from the SyntaxError. In theory delete could extract the slice from the tuple and proceed as in the s_ case, but the developers chose not to handle this variation.
A quick study of the code shows that np.delete uses 2 distinct copying methods: by slice and by boolean mask. If the obj is a slice, as in our example, it does (for a 1d array):
out = np.empty(7)
out[0:3] = x[0:3]
out[3:7] = x[6:10]
But with [3,4,5] (instead of the slice) it does:
keep = np.ones((10,), dtype=bool)
keep[[3,4,5]] = False
return x[keep]
Same result, but with a different construction method. x[np.array([1,1,1,0,0,0,1,1,1,1],bool)] does the same thing.
In fact boolean indexing or masking like this is more common than np.delete, and generally just as powerful.
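The two construction methods described above can be written out as a runnable sketch. This is only an illustration of the idea for the 1d example array, not the actual np.delete source; the names by_slice and by_mask are made up here.

```python
import numpy as np

x = np.arange(10) * 2  # same distinctive values as in the session above

# Slice path: copy the block before the slice and the block after it.
by_slice = np.empty(7, dtype=x.dtype)
by_slice[0:3] = x[0:3]
by_slice[3:7] = x[6:10]

# Mask path: mark the positions to drop, then keep everything else.
keep = np.ones(x.shape[0], dtype=bool)
keep[[3, 4, 5]] = False
by_mask = x[keep]

print(by_slice)
print(by_mask)
```

Both arrays hold the same values as np.delete(x, np.s_[3:6]); the mask version is the one you would typically write yourself when filtering an array.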
From the lib/index_tricks.py source file:
index_exp = IndexExpression(maketuple=True)
s_ = IndexExpression(maketuple=False)
They are slightly different versions of the same thing. And both are just convenience functions.
In [196]: np.s_[1:4]
Out[196]: slice(1, 4, None)
In [197]: np.index_exp[1:4]
Out[197]: (slice(1, 4, None),)
In [198]: np.s_[1:4, 5:10]
Out[198]: (slice(1, 4, None), slice(5, 10, None))
In [199]: np.index_exp[1:4, 5:10]
Out[199]: (slice(1, 4, None), slice(5, 10, None))
The maketuple business applies only when there is a single item, a slice or index.

Related

Printing a reverse list in python

In the following code I got None. Why?
Thanks,
numbers = [10, 15, 20, 30]
print(numbers)
print(numbers.reverse())
numbers.reverse()
print(numbers)
If you look at that operation in the documentation of the Mutable Sequence Types, you will see this note:
The reverse() method modifies the sequence in place for economy of space when reversing a large sequence. To remind users that it operates by side effect, it does not return the reversed sequence.
So numbers.reverse() returns None, which is what print shows, but it still reverses the list in place.
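If a reversed copy is what you want to print, a short sketch of the alternatives looks like this (the variable names are only for illustration):

```python
numbers = [10, 15, 20, 30]

result = numbers.reverse()   # reverses in place; the return value is None
in_place = list(numbers)     # snapshot of the list after the in-place reversal

original = [10, 15, 20, 30]
copy_by_slice = original[::-1]                # new reversed list, original untouched
copy_by_reversed = list(reversed(original))   # reversed() gives an iterator, so wrap it in list()

print(result)
print(in_place)
print(copy_by_slice, copy_by_reversed, original)
```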

Name of List with maximum value

I am new to Python. Using it with Grasshopper.
I have 5 lists, each actually with 8760 items, for which I have found the max values at each index, but I also need to know which list the value came from at any given index.
I would put a simple example to explain myself better.
For 2 lists
A = [5,10,15,20,25]
B = [4,9,16,19,26]
Max value per index = [5,10,16,20,26]
What I want is something like
Max value per index = [5(A), 10(A), 16(B), 20(A), 26(B)]
Or something along those lines that can relate the two. I am not sure whether it's possible.
I would really appreciate the help. Thank you.
This can be adapted to N lists.
[(max(a),a.index(max(a))) for a in list(zip(A,B))]
The .index(max(a)) gets the index at which the max(a) occurs.
The output for your example is
[(5, 0), (10, 0), (16, 1), (20, 0), (26, 1)]
Of course, if both A and B share the same value, then the index will be the first one found, A.
See https://docs.python.org/3.3/library/functions.html for a description of the very useful zip built-in function.
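As a sketch of how this adapts to N lists while reporting the list's name instead of its position, assuming you keep the names in a parallel list (names and lists below are illustrative, not part of the original code):

```python
A = [5, 10, 15, 20, 25]
B = [4, 9, 16, 19, 26]

names = ["A", "B"]   # assumed labels, in the same order as the lists
lists = [A, B]       # add further lists (and names) here

labelled = []
for column in zip(*lists):      # one tuple per index, one value per list
    best = max(column)
    labelled.append((best, names[column.index(best)]))  # ties go to the first list

print(labelled)
# [(5, 'A'), (10, 'A'), (16, 'B'), (20, 'A'), (26, 'B')]
```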

Python 3 list with range and other individual numbers

I need to make a list of numbers. These numbers represent binary masks. The first 100 or so masks are all included in this range. In the next group of masks only certain masks are included. I need a list similar to the following.
[1,2,3,5,6,7,8,9,10,30,34,48,53,62]
Can I do something like [range(1,10),30,34,48,53,62]
or do I need to create my list using range(1,10) and then append the next list to it?
Thanks
Python 3 (3.5 and later) actually allows you to build a list literal by prepending a * to any iterable object, which is then expanded in place:
>>> [1,2, *range(10), *range(2)]
[1, 2, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
If you need this in older Pythons, or if you'd prefer to keep the code readable for people not too proficient in Python who might have to walk through it, an option is just to concatenate your different list fragments using the + operator:
a = list(range(1,10)) + [ 30,34,48,53,62]
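One detail worth checking with either form: the end of a range is exclusive, so range(1,10) stops at 9. A small sketch, assuming the masks 1 through 10 are all wanted (the name extra is just for illustration):

```python
extra = [30, 34, 48, 53, 62]

unpacked = [*range(1, 11), *extra]          # unpacking form, Python 3.5+
concatenated = list(range(1, 11)) + extra   # concatenation form, works in any Python 3

print(unpacked)
print(unpacked == concatenated)
```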
Looks like I had to use the list(range(1,10)) + [47,34,57] solution.

Understanding; for i in range, x,y = [int(i) in i.... Python3

I am stuck trying to understand the mechanics behind this combined input(), loop & list-comprehension; from Codegaming's "MarsRover" puzzle. The sequence creates a 2D line, representing a cut-out of the topology in an area 6999 units wide (x-axis).
Understandably, my original question was put on hold for being too broad. I am trying to shorten and narrow the question: I basically understand list comprehensions, and I'm reasonably experienced with for-loops.
Like list comp:
land_y = [int(j) for j in range(k)]
if k = 5; land_y = [0, 1, 2, 3, 4]
For-loops:
for i in the range(4)
a = 2*i = 6
ab.append(a) = 0,2,4,6
But here, it just doesn't add up (in my head):
6999 points are created along the x-axis, from 6 points(x,y).
surface_n = int(input())
for i in range(surface_n):
    land_x, land_y = [int(j) for j in input().split()]
I do not understand where "i" makes a difference.
I do not understand how the data is "packaged" inside the input. I have split strings of integers on another task in almost exactly the same code, and I could easily create new lists and work with them, as I understood the structure I was unpacking (pretty simple, being one datatype with one purpose).
The fact that this line follows within the "game"-while-loop confuses me more, as it updates dynamically as the state of the game changes.
x, y, h_speed, v_speed, fuel, rotate, power = [int(i) for i in input().split()]
Maybe someone could give an example of how this could be written in javascript, haskell or c#? No need to be syntax-correct, I'm just struggling with the concept here.
input() takes a line from the standard input. So it’s essentially reading some value into your program.
The way that code works, it makes very strong assumptions about the format of the input strings, to the point that it gets confusing (and difficult to verify).
Let’s take a look at this line first:
land_x, land_y = [int(j) for j in input().split()]
You said you already understand list comprehension, so this is essentially equal to this:
inputs = input().split()
results = []
for j in inputs:
    results.append(int(j))
land_x, land_y = results
This is a combination of multiple things that happen here. input() reads a line of text into the program, split() separates that string into multiple parts, splitting it whenever a white space character appears. So a string 'foo bar' is split into ['foo', 'bar'].
Then, the list comprehension happens, which essentially just iterates over every item in that split input string and converts each item into an integer using int(j). So an input of '2 3' is first converted into ['2', '3'] (list of strings), and then converted into [2, 3] (list of ints).
Finally, the line land_x, land_y = results is evaluated. This is called iterable unpacking and essentially assumes that the iterable on the right has exactly as many items as there are variables on the left. If that’s the case then it’s just a nice way to write the following:
land_x = results[0]
land_y = results[1]
So basically, the whole list comprehension assumes that there is an input of two numbers separated by whitespace, it then splits those into separate strings, converts those into numbers and then assigns each number to a separate variable land_x and land_y.
Exactly the same thing happens again later with the following line:
x, y, h_speed, v_speed, fuel, rotate, power = [int(i) for i in input().split()]
It’s just that this time, it expects the input to have seven numbers instead of just two. But then it’s exactly the same.
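To make the role of i visible, here is a sketch that feeds the same loop from a list of strings instead of the keyboard. The function read_surface and the sample lines are made up for illustration; the puzzle's real input is not shown in the question.

```python
def read_surface(lines):
    """Parse 'count' followed by that many 'x y' lines (hypothetical helper)."""
    source = iter(lines)
    surface_n = int(next(source))          # first line: how many points follow
    points = []
    for i in range(surface_n):             # i only counts the repetitions
        # each pass reads one NEW line and unpacks its two numbers
        land_x, land_y = [int(j) for j in next(source).split()]
        points.append((land_x, land_y))
    return points


sample = ["3", "0 100", "1000 500", "6999 800"]   # invented sample data
print(read_surface(sample))
```

The loop variable is never used inside the body; it just makes the body run surface_n times, and every run consumes the next line of input.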

Append a list to a list

I have a list of numbers and I want to extract N elements as lists, and store them in another list.
Example:
list1 = [1,2,3,4,5,6,7,8,9]
resultList = [[1,2,3],[4,5,6],[7,8,9]]
I've done the following
def getLines(square, N):
    i = 0
    line = [None]*N
    lines = list()
    for elt in square:
        line[i] = elt
        i += 1
        if i == N:
            lines.append(line)
            i = 0
    return lines
Why do I always get the last list three times
[[7,8,9],[7,8,9],[7,8,9]]
when I call the function getLines(list1, 3).
I also tried to eliminate the temporary list and add the elements directly to resultList like this:
def getLines(square, N):
    i = 0
    j = 0
    lines = [[None]*N]*N # Need to be initialized to be able to index it.
    for elt in square:
        lines[i][j] = elt
        j += 1
        if j == N:
            i += 1
            j = 0
    return lines
The last group is still appearing N times. Any hints on how to fix that?
This is because you are creating only one inner list object, and altering it.
In pseudocode, what you are doing is:
Create a list called line assigning [None, None, None] to it
Create an empty list called lines
For three times:
-- Pick n items from the square list
-- Assign these three items to line[0], line[1] and line[2]
-- Append line to lines
So, what you are doing is assigning to individual items of line. This is important - you're not making a new object each time, you're changing individual items in the line list.
At the end of it all, line will point to the list [7, 8, 9]. And you can see lines as being substantially [line, line, line] (a list of three times the same object), so specifically now it will point to [[7,8,9], [7,8,9], [7,8,9]].
To solve this, possibly the solution that most keeps your original code is to re-define line after appending it. This way, the variable name line will refer to a different list each time, and you won't have this problem.
def getLines(square, N):
    i = 0
    line = [None]*N
    lines = list()
    for elt in square:
        line[i] = elt
        i += 1
        if i == N:
            lines.append(line)
            line = [None]*N # Now `line` points to a different object
            i = 0
    return lines
Of course, there is leaner, more Pythonic code that can do the same thing (I see that an answer has already been given).
EDIT - OK, here is a somewhat more detailed explanation.
Perhaps one of the key concepts is that lists are not containers of other objects; they merely hold references to other objects.
Another key concept is that when you change an item in a list (item assignment), you're not making the whole list object become another object. You're merely changing a reference inside it. This is something we take for granted in a lot of situations, but it somehow becomes counter-intuitive when we'd want things to go the other way and "recycle" a list.
As I was writing in the comments, if list was a cat named Fluffy, every time you're appending you're creating a mirror that points to Fluffy. So you can dress Fluffy with a party hat, put a mirror pointing to it, then give Fluffy a clown nose, put on another mirror, then dress Fluffy as a ballerina, add a third mirror, and when you look at the mirrors, all three of them will show the ballerina Fluffy. (Sorry Fluffy).
What I mean is that in practice in your first script, when you do the append:
lines.append(line)
by the first concept I mentioned, you are not making lines contain the current status of line as a separate object. You are appending a reference to the line list.
And when you do,
line[i] = elt
by the second concept, of course line is always the same object; you're just changing what's referenced at the i-th position.
This is why, at the end of your script, lines will appear to "contain three identical objects": because you actually appended three references to the same object. And when you ask to see the content of lines, you will read, three times, the same list object in its current state.
In the code I provided above, I re-define the name line to make it reference a brand new list every time it has been appended to lines:
lines.append(line)
line = [None]*N # Now `line` points to a different object
This way, at the end of the script I have "three different cats" appended, and each one was conveniently named Fluffy just until I had appended it, to give room for a new Fluffy list after that.
Now, in your second script, you do something similar. The key instruction is:
lines = [[None]*N]*N # Need to be initialized to be able to index it.
In this line, you are creating two objects:
- the list [None, None, None]
- the list named lines, which contains N references to the same list [None, None, None].
What you did was just to create straight away Fluffy and the three mirrors pointing at him.
In fact if you change lines[0][2], or lines[1][2], you're just changing the same item [2] of your same Fluffy.
What you actually wanted to do is,
lines = [[None]*N for i in range(N)]
which creates three different cats - I mean, lists - and has lines point to the three.
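A quick way to check the difference yourself is to compare the rows with the is operator. This is only a small sketch; the names shared and independent are made up for the demonstration.

```python
N = 3

shared = [[None] * N] * N                        # N references to ONE inner list
independent = [[None] * N for _ in range(N)]     # N separate inner lists

shared[0][2] = "x"        # shows up in every row: they are all the same object
independent[0][2] = "x"   # only the first row changes

print(shared)
print(independent)
print(shared[0] is shared[1], independent[0] is independent[1])
```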
You might consider solving this like:
def getLines(square, N):
    return [square[i:i + N] for i in range(0, len(square), N)]
For example: getLines([1, 2, 3, 4, 5, 6, 7, 8, 9], 3) will return [[1, 2, 3], [4, 5, 6], [7, 8, 9]], or getLines([1, 2, 3, 4, 5, 6, 7, 8, 9], 2) results in [[1, 2], [3, 4], [5, 6], [7, 8], [9]], and so on.