Scrapy/ item loader / How to load items in order? - python-2.7

I am trying to scrape map coordinates from a page, where they appear as var Data = {lat: 45.000000 , long : 68.00000}
I am able to scrape these as separate items, 'lng': 68.0000 and 'lat': 45.0000,
and I also put them under a new key "loc" as 'loc': {'lat': 45.0000, 'lng': 68.000000}. I am trying to store these scraped items in MongoDB. MongoDB needs the 'lng' and 'lat' values (the coordinates) in a particular order so that it recognizes them as a geo location: 'lng' first, followed by 'lat'.
How do I do that? Can someone help me?
This is my item file:
class Citylist(scrapy.Item):
    lng = scrapy.Field()
    lat = scrapy.Field()
    loc = scrapy.Field()
This is my spider file:
for newlist in HtmlXPathSelector(response).select('/html/body'):
    l = ItemLoader(item=Citylist(), response=response)
    l.add_xpath('lng', '//......text()')
    l.add_xpath('lat', '//......text()')
    l.add_value('loc', {'lng': l.get_output_value('lng'),
                        'lat': l.get_output_value('lat')})
    yield l.load_item()
My current output is:
'lng':'68.00000',
'lat':'45.00000',
'loc':{'lat':'45.00000','lng':'68.00000'}
I need my output to be only:
'loc':{'lng':'68.00000','lat':'45.00000'}
I do not need
'lng':'68.00000', 'lat':'45.00000'
as separate values. Please advise how to do this.
Thanks

First thing: here's a good answer about how to properly handle nested data in Scrapy: https://stackoverflow.com/a/25096896/2446893
Second thing: if order is important, you can use an OrderedDict: https://docs.python.org/2/library/collections.html#collections.OrderedDict
You can also use a tuple to store only the values, without the keys.
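A minimal sketch of both suggestions, assuming the scraped values arrive as strings. The helper name build_loc is hypothetical, and whether the storage layer preserves key order on write is an assumption to verify against your driver:

```python
from collections import OrderedDict


def build_loc(lng, lat):
    """Return a mapping whose keys iterate as 'lng' first, then 'lat'."""
    # OrderedDict remembers insertion order, unlike a plain dict in Python 2.7.
    return OrderedDict([("lng", lng), ("lat", lat)])


loc = build_loc("68.00000", "45.00000")

# Keep only the nested key; no separate top-level 'lng' and 'lat' entries.
item = {"loc": loc}

# Tuple alternative: values only, in (lng, lat) order, converted to floats.
loc_pair = (float(loc["lng"]), float(loc["lat"]))
```

Building the ordered mapping in one place keeps the ordering decision out of the spider loop.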

Related

Flutter: Create map from list using a delimiter

I am trying to store a list of activities with a specific color locally and trying to convert the list into either a map or a list of lists.
Using shared preferences to save the data locally I have the following list:
List<String> value = ['Sleep: Blue', 'Meditation: Green', 'Running: Red'];
prefs.setStringList('ActivityList', value); //save data locally
But I want to be able to retrieve an object of the form:
values = [ {'Sleep', 'Blue'}, {'Meditation', 'Green'}, {'Running', 'Red'} ];
What would be the best way to do this and how would I use the delimiter ':' to split the data accordingly?
Thanks in advance!
I am not sure what you mean by an array of objects. If you simply want a list of pairs, then the following should work for you:
value.map((item) => item.split(": ")).toList();
Or, if you want a key-value map from your data, you can do something like this:
Map.fromEntries(value.map((item) {
  List<String> pair = item.split(": ");
  return MapEntry(pair[0], pair[1]);
}));

Looping through a list (with sublists) and assign the matching IDs to the same key and all of the corresponding values from that sublist?

I'm quite new to Python coding and I can't solve the following problem:
I have a list of tracking points for different animals (ID, date, time, lat, lon), given as strings:
aList = [[id,date,time,lat,lon],
         [id2,date,time,lat,lon],
         [...]]
The txt file is very big and each ID (a unique animal) occurs multiple times,
e.g.:
aList = [['25','20-05-13','15:16:17','34.89932','24.09421'],
         ['24','20-05-13','15:16:18','35.89932','23.09421'],
         ['25','20-05-13','15:18:15','34.89932','24.13421'],
         [...]]
What I'm trying to do is order the ID's in dictionaries so each unique ID will be the key and all the dates, times, latitudes and longitudes will be the values. Then I would like to write each individual ID to a new txt file so all the values for a specific ID are in one txt file. The output should look like this:
{'25': [['20-05-13','15:16:17','34.89932','24.09421'],
        ['20-05-13','15:18:15','34.89932','24.13421'],
        [...]],
 '24': [['20-05-13','15:16:18','35.89932','23.09421'],
        [...]]
}
I have tried the following (and a lot of other solutions which didn't work):
items = {}
for line in aList:
    key, value = line[0], line[1:]
    items[key] = value
This results in each key holding only the last value in the list for that particular key:
{'25':['20-05-13','15:18:15','34.89932','24.13421'],
'24':['20-05-13','15:16:18','35.89932','23.09421']}
How can I loop through my list and assign the same IDs to the same key and all the corresponding values?
Is there any simple solution to this? Other "easier to implement" solutions are welcome!
I hope it makes sense :)
Try collecting all the rows that share an ID in a list of lists:
aList = [['25','20-05-13','15:16:17','34.89932','24.09421'],
         ['24','20-05-13','15:16:18','35.89932','23.09421'],
         ['25','20-05-13','15:18:15','34.89932','24.13421'],
        ]
items = {}
for line in aList:
    key, value = line[0], line[1:]
    if key in items:
        # ID seen before: add this row to its existing list
        items[key].append(value)
    else:
        # first row for this ID: start a new list of rows
        items[key] = [value]
print(items)
OUTPUT:
{'24': [['20-05-13', '15:16:18', '35.89932', '23.09421']], '25': [['20-05-13', '15:16:17', '34.89932', '24.09421'], ['20-05-13', '15:18:15', '34.89932', '24.13421']]}
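The same grouping can be sketched with collections.defaultdict, and extended to the second half of the question, writing one text file per ID. The function names, the comma-separated file layout, and the throwaway output directory are assumptions made for illustration, not part of the original answer:

```python
import os
import tempfile
from collections import defaultdict


def group_by_id(rows):
    """Map each ID (first column) to the list of its remaining columns."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row[1:])
    return dict(grouped)


def write_groups(grouped, out_dir):
    """Write one '<id>.txt' file per ID, one comma-separated record per line."""
    for animal_id, records in grouped.items():
        path = os.path.join(out_dir, animal_id + ".txt")
        with open(path, "w") as handle:
            for record in records:
                handle.write(",".join(record) + "\n")


a_list = [
    ['25', '20-05-13', '15:16:17', '34.89932', '24.09421'],
    ['24', '20-05-13', '15:16:18', '35.89932', '23.09421'],
    ['25', '20-05-13', '15:18:15', '34.89932', '24.13421'],
]
grouped = group_by_id(a_list)

out_dir = tempfile.mkdtemp()  # throwaway directory, used only for this demo
write_groups(grouped, out_dir)
```

defaultdict(list) removes the need for the explicit "if key in items" branch, because a missing key starts out as an empty list.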

Python 2.7 - How to call individual columns from transposed csv file

I understand that the csv module exists, however for my current project we are not allowed to use the module to call csv files.
My code is as follows:
table = []
for line in open("data.csv"):
    data = line.split(",")
    table.append(data)
transposed = [[table[j][i] for j in range(len(table))] for i in range(len(table[0]))]
rows = transposed[1][1:]
rows = [float(i) for i in rows]
I'm really new to Python, so this is probably a very basic question; I've been scouring the internet all day and am struggling to find a solution. All I need is to be able to pull the data from any individual column so I can analyse it. Thanks
Your data is organized as a list of lists, where each sublist represents a row. To illustrate this better, I would avoid list comprehensions here because they are more difficult to read. I would also avoid variable names like 'i' and 'j' and use more descriptive names such as row or column instead. Here is a simple example of how I would accomplish this:
def read_csv():
    table = []
    with open("data.csv") as fileobj:
        for line in fileobj:
            # strip the trailing newline, then split the row into cells
            data = line.strip().split(',')
            table.append(data)
    return table

def get_column_data(data, column_index):
    # collect the cell at column_index from every row
    column_data = []
    for row in data:
        cell_data = row[column_index]
        column_data.append(cell_data)
    return column_data

data = read_csv()
get_column_data(data, column_index=2)  # example usage
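As a follow-on sketch, a column can be converted to floats once it has been extracted, and the whole table can be transposed with zip instead of nested index loops. The sample table, the helper names, and the assumption that the first row is a header are all made up for illustration:

```python
def transpose(table):
    """Turn a list of rows into a list of columns.

    zip stops at the shortest row, so ragged rows lose their extra cells.
    """
    return [list(column) for column in zip(*table)]


def column_as_floats(table, column_index, has_header=True):
    """Return one column as floats, skipping the header row if present."""
    rows = table[1:] if has_header else table
    return [float(row[column_index]) for row in rows]


# Hypothetical data standing in for the parsed contents of data.csv.
table = [
    ["name", "height", "weight"],
    ["a", "1.5", "10"],
    ["b", "2.5", "20"],
]
columns = transpose(table)
heights = column_as_floats(table, 1)
```

Skipping the header before calling float avoids a ValueError on the column title.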

String from CSV to list - Python

I don't get it. I have a CSV file with the following content:
wurst;ball;hoden;sack
1;2;3;4
4;3;2;1
I want to iterate over the CSV data and put the heads in one list and the content in another list. Here's my code so far:
data = [ i.strip() for i in open('test.csv', 'r').readlines() ]
for i_c, i in enumerate(data):
    if i_c == 0:
        heads = i
    else:
        content = i
heads.split(";")
content.split(";")
print heads
That always prints the following string, not a list:
wurst;ball;hoden;sack
Why does split not work on this string?
Greetings and merry Christmas,
Jan
The split method returns a new list; it does not modify the string in place. Try:
heads = heads.split(";")
content = content.split(";")
I've noticed also that your data seems to all be integers. You might consider instead the following for content:
content = [int(i) for i in content.split(";")]
The reason is that split returns a list of strings, and it seems like you might need to deal with them as numbers in your code later on. Of course, disregard if you are expecting non-numeric data to show up at some point.
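Putting these points together, here is a minimal sketch that keeps every data row rather than only the last one (the loop in the question overwrites content on each pass). The function name is hypothetical, the lines are passed in as a list so the example runs without test.csv, and it assumes every data cell is an integer:

```python
def parse_semicolon_csv(lines):
    """Split semicolon-separated lines into a header list and integer rows."""
    # Ignore blank lines, strip newlines, and split each line on ';'.
    rows = [line.strip().split(";") for line in lines if line.strip()]
    heads = rows[0]
    # Assumes all data cells are integers; int() raises ValueError otherwise.
    content = [[int(cell) for cell in row] for row in rows[1:]]
    return heads, content


sample_lines = ["wurst;ball;hoden;sack\n", "1;2;3;4\n", "4;3;2;1\n"]
heads, content = parse_semicolon_csv(sample_lines)
```

With a real file, the same function could be given the open file object, since iterating over it yields lines.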

How not to order a list of pk's in a query?

I have a list of pk's and I would like to get the results in the same order in which my list is defined, but the order of the elements is being changed. Can anyone help me?
print list_ids
[31189, 31191, 31327, 31406, 31352, 31395, 31309, 30071, 31434, 31435]
obj_oportunidades=Opor.objects.in_bulk(list_ids).values()
for o in obj_oportunidades:
    print o
31395 31435 31434 30071 31309 31406 31189 31191 31352 31327
This object is used in a template to show some results to the user, but as you can see, the order is different from the original list_ids.
Would have been nice to have this feature in SQL - sorting by a known list of values.
Instead, what you could do is:
obj_oportunidades = Opor.objects.in_bulk(list_ids)  # dict mapping each pk to its object
for pk in list_ids:
    if pk in obj_oportunidades:
        print obj_oportunidades[pk]
Downside is that you have to fetch all the result rows first and hold them in memory before reading them out in the order you want. (in_bulk() already returns a dictionary with the PKeys as keys and the table records as values, so no separate lookup structure is needed.)
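The reorder-in-Python idea can also be written as a sort keyed on each id's position in the list. This is a sketch with plain dictionaries standing in for model instances; order_by_ids and get_id are hypothetical names, and nothing here calls Django:

```python
def order_by_ids(records, ids, get_id):
    """Return the records sorted to follow the order of ids.

    Records whose id is not in ids are dropped.
    """
    # Map each id to its index in the wanted order for O(1) lookups.
    position = {pk: index for index, pk in enumerate(ids)}
    known = [record for record in records if get_id(record) in position]
    return sorted(known, key=lambda record: position[get_id(record)])


list_ids = [31189, 31191, 31327]
# Stand-ins for fetched rows, deliberately in a different order.
records = [{"id": 31327}, {"id": 31189}, {"id": 31191}]
ordered = order_by_ids(records, list_ids, get_id=lambda record: record["id"])
```

With model instances, get_id would read the primary key attribute instead of a dictionary key.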
Another way: create a temp table with (Sort_Order, PKey) and join it into the query:
Sort_Order  PKey
1           31189
2           31191
...
When you join this table to the Opor query and sort on Sort_Order, you get the PKeys in the order you specify.
I found a solution at http://davedash.com/2010/02/11/retrieving-elements-in-a-specific-order-in-django-and-mysql/ that suited me perfectly:
ids = [a_list, of, ordered, ids]
addons = Addon.objects.filter(id__in=ids).extra(
    select={'manual': 'FIELD(id,%s)' % ','.join(map(str,ids))},
    order_by=['manual'])
This code does something similar to MySQL's "ORDER BY FIELD".
This guy: http://blog.mathieu-leplatre.info/django-create-a-queryset-from-a-list-preserving-order.html
solved the problem for both MySQL and PostgreSQL!
If you are using PostgreSQL, go to that page.