# given the following list
list = [3,10,10,10,6,7]
# how can I make a function that returns the following
list_2 = 310AA:67
I just want to edit a list using the remove and append methods.
input_list = ["SAS","R","PYTHON","SPSS"]
import ast,sys
input_list = (sys.stdin.read()).split(",")
input_list.remove("SPSS")
input_list.append("SPARK")
print(input_list)
Then I get this error: ValueError: list.remove(x): x not in list
It's not really clear what you are doing with stdin, but you reassign input_list from stdin before you try to modify it, so the original values are gone.
input_list = ["SAS","R","PYTHON","SPSS"]
# the list is correctly set
import ast,sys # import...
input_list = (sys.stdin.read()).split(",")
# you have changed the value of your list
input_list.remove("SPSS")
# so you can't remove this item if the list is changed.
# Maybe the new list doesn't contain "SPSS" item
input_list.append("SPARK")
# The append should work fine
print(input_list) # print...
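If the text read from stdin does not contain "SPSS" as an exact item after the split, remove fails with this ValueError. Note that sys.stdin.read() keeps the trailing newline, so a last item typed as SPSS arrives as "SPSS\n".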
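The unused ast import in the question suggests the input may have been meant to be parsed as a Python list literal. Here is a minimal sketch under that assumption. The helper names parse_list and swap_item are hypothetical, and the raw string stands in for whatever sys.stdin.read() returns.

```python
import ast

def parse_list(text):
    """Parse text such as '["SAS","R"]' or 'SAS,R' into a list of strings."""
    text = text.strip()  # drops the trailing newline that stdin.read() keeps
    if text.startswith("["):
        # Assumption: the input is written as a Python list literal
        return list(ast.literal_eval(text))
    # Otherwise treat it as comma-separated values and trim each item
    return [item.strip() for item in text.split(",")]

def swap_item(items, old, new):
    """Remove old if it is present, then append new; returns the same list."""
    if old in items:
        items.remove(old)
    items.append(new)
    return items

raw = '["SAS","R","PYTHON","SPSS"]\n'  # stands in for sys.stdin.read()
print(swap_item(parse_list(raw), "SPSS", "SPARK"))
# ['SAS', 'R', 'PYTHON', 'SPARK']
```

Checking membership before calling remove avoids the ValueError when the item is missing. Whether silently skipping the removal is acceptable depends on your use case.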
I am trying to populate a list in Python 3 with 3 random items read from a file using a regex; however, I keep getting duplicate items in the list.
Here is an example.
import re
import random as rn
data = '/root/Desktop/Selenium[FILTERED].log'
with open(data, 'r') as inFile:
    index = inFile.read()
    URLS = re.findall(r'https://www\.\w{1,10}\.com/view\?i=\w{1,20}', index)
    list_0 = []
    for i in range(3):
        list_0.append(URLS[rn.randint(1, 30)])
    inFile.close()
for i in range(len(list_0)):
    print(list_0[i])
What would be the cleanest way to prevent duplicate items being appended to the list?
(EDIT)
This is the code that I think has done the job quite well.
def random_sample(data):
    r_e = ['https://www\.\w{1,10}\.com/view\?i=\w{1,20}', '..']
    with open(data, 'r') as inFile:
        urls = re.findall(r'%s' % r_e[0], inFile.read())
        x = list(set(urls))
        inFile.close()
    return x
data = '/root/Desktop/[TEMP].log'
sample = random_sample(data)
for i in range(3):
    print(sample[i])
Use a set: an unordered collection with no duplicate entries.
Use the builtin random.sample.
random.sample(population, k)
Return a k length list of unique elements chosen from the population sequence or set.
Used for random sampling without replacement.
Addendum
After seeing your edit, it looks like you've made things much harder than they have to be. I've hardwired a list of URLs in the following, but the source doesn't matter. Selecting the (guaranteed unique) subset is essentially a one-liner with random.sample:
import random
# the following two lines are easily replaced
URLS = ['url1', 'url2', 'url3', 'url4', 'url5', 'url6', 'url7', 'url8']
SUBSET_SIZE = 3
# the following one-liner yields the randomized subset as a list
urlList = [URLS[i] for i in random.sample(range(len(URLS)), SUBSET_SIZE)]
print(urlList) # produces, e.g., => ['url7', 'url3', 'url4']
Note that by using len(URLS) and SUBSET_SIZE, the one-liner that does the work is not hardwired to the size of the set nor the desired subset size.
Addendum 2
If the original list of inputs contains duplicate values, the following slight modification will fix things for you:
URLS = list(set(URLS)) # this converts to a set for uniqueness, then back for indexing
urlList = [URLS[i] for i in random.sample(range(len(URLS)), SUBSET_SIZE)]
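Or more compactly, sample the deduplicated values directly. Note that random.sample needs a sequence, so the set is converted back to a list (passing a set raises TypeError from Python 3.11 on):
urlList = random.sample(list(set(URLS)), SUBSET_SIZE)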
seen = set(list_0)
randValue = URLS[rn.randint(1, 30)]
# [...]
if randValue not in seen:
    seen.add(randValue)
    list_0.append(randValue)
Now you just need to stop the loop once len(list_0) equals 3.
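For completeness, the seen-set approach above could be wrapped in a loop like the following. This is only a sketch: the function name pick_unique and the sample URLS list are hypothetical, and randrange(len(urls)) is swapped in for randint(1, 30) so that index 0 is reachable and the index can never run past the end of the list.

```python
import random as rn

def pick_unique(urls, k):
    """Pick k distinct values from urls by rejecting repeats."""
    if len(set(urls)) < k:
        # Without this guard the loop below would never finish
        raise ValueError("not enough distinct values to pick from")
    seen = set()
    picked = []
    while len(picked) < k:
        # randrange(len(urls)) covers every valid index, including 0
        value = urls[rn.randrange(len(urls))]
        if value not in seen:
            seen.add(value)
            picked.append(value)
    return picked

URLS = ["url1", "url2", "url2", "url3", "url3", "url4"]  # hypothetical data
list_0 = pick_unique(URLS, 3)
print(list_0)
```

Rejecting repeats works well when k is small compared with the number of distinct values. Otherwise random.sample on the deduplicated list is simpler.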
I'm using Scrapy to iteratively scrape some data, and each iteration outputs two lists. I want to combine the two lists into one at each iteration, so that in the end I have one big list of sublists (each sublist being the combination of the two lists created in that iteration).
That may be confusing, so I will show my current output and code.
Using Scrapy, I'm iterating in the following way:
for i in response.css("tr.insider....."):
    i.css('a.tab-link::text').extract() # creating the first list
    i.css('td::text').extract() # creating the second list
So the current output is something like this
[A,B,C] #first iteration
[1,2,3]
[D,E,F] #second iteration
[4,5,6]
[G,H,I] #third iteration
[7,8,9]
Desired output is
[[A,B,C,1,2,3], [D,E,F,4,5,6],[G,H,I,7,8,9]]
I tried the following code but I'm getting a list of None.
x = []
for i in response.css("tr.insider....."):
    x.append(i.css('a.tablink::text').extract().extend(i.css('td::text').extract()))
But the return is just
None
None
None
None
None.....
Thanks!
The extend method modifies the list in place and returns None, so you always append None to x.
For your purpose, I think this is what you want:
for i in response.css("tr.insider....."):
    i.css('a.tab-link::text, td::text').extract()
You can simply add two lists together and append them to your results list.
results = []
for i in response.css("tr.insider....."):
    first = i.css('a.tab-link::text').extract()
    second = i.css('td::text').extract()
    # combine both and append to results
    results.append(first + second)
print(results)
# e.g.: [[A,B,C,1,2,3], [D,E,F,4,5,6],[G,H,I,7,8,9]]
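To see why the original attempt produced None, here is a small sketch that needs no Scrapy at all. The rows list is hypothetical data standing in for the two lists extracted on each iteration.

```python
# Stand-ins for the two lists extracted on each iteration (hypothetical data)
rows = [
    (["A", "B", "C"], ["1", "2", "3"]),
    (["D", "E", "F"], ["4", "5", "6"]),
]

# list.extend changes the list in place and returns None
first = ["A", "B", "C"]
returned = first.extend(["1", "2", "3"])
print(returned)  # None
print(first)     # ['A', 'B', 'C', '1', '2', '3']

# The + operator builds and returns a new list, so it can be appended directly
results = [letters + numbers for letters, numbers in rows]
print(results)
# [['A', 'B', 'C', '1', '2', '3'], ['D', 'E', 'F', '4', '5', '6']]
```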
I'm trying to extract text from a tag based on its href containing a certain string. Below is part of my sample code:
Experience = soup.find_all(id='background-experience-container')
Exp = {}
for element in Experience:
    Exp['Experience'] = {}
for element in Experience:
    role = element.find(href=re.compile("title").get_text()
    Exp['Experience']["Role"] = role
for element in Experience:
    company = element.find(href=re.compile("exp-company-name").get_text()
    Exp['Experience']['Company'] = company
It doesn't like the syntax I've used for Exp['outer_key']['inner_key'] = value; it returns a SyntaxError.
I'm trying to build a dict of dicts that contains info on role and company. I will also look to include dates for each, but haven't got that far yet.
Can anyone spot any glaringly obvious mistakes in my code?
Really appreciate any help with this!
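The SyntaxError comes from a missing closing parenthesis: in element.find(href=re.compile("title").get_text() the call to find( is never closed, so Python reports the error on the following line. Also, find_all can return many elements (even if you search by id), so it is better to keep all the values in a list: Exp = [].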
Experience = soup.find_all(id='background-experience-container')
# create empty list
Exp = []
for element in Experience:
    # create empty dictionary
    dic = {}
    # add elements to dictionary
    dic['Role'] = element.find(href=re.compile("title")).get_text()
    dic['Company'] = element.find(href=re.compile("exp-company-name")).get_text()
    # add dictionary to list
    Exp.append(dic)
# display
print(Exp[0]['Role'])
print(Exp[0]['Company'])
print(Exp[1]['Role'])
print(Exp[1]['Company'])
# or
for x in Exp:
    print(x['Role'])
    print(x['Company'])
If you are sure that find_all gives you only one element (and you need the key 'Experience'), then you can do:
Experience = soup.find_all(id='background-experience-container')
# create main dictionary
Exp = {}
for element in Experience:
    # create empty dictionary
    dic = {}
    # add elements to dictionary
    dic['Role'] = element.find(href=re.compile("title")).get_text()
    dic['Company'] = element.find(href=re.compile("exp-company-name")).get_text()
    # add dictionary to main dictionary
    Exp['Experience'] = dic
# display
print(Exp['Experience']['Role'])
print(Exp['Experience']['Company'])
or
Experience = soup.find_all(id='background-experience-container')
# create main dictionary
Exp = {}
for element in Experience:
    Exp['Experience'] = {
        'Role': element.find(href=re.compile("title")).get_text(),
        'Company': element.find(href=re.compile("exp-company-name")).get_text()
    }
# display
print(Exp['Experience']['Role'])
print(Exp['Experience']['Company'])
Given the following list:
colors=['#c85200','#5f9ed1','lightgrey','#ffbc79','#006ba4','dimgray','#ff800e','#a2c8ec'
,'grey','salmon','cyan','silver']
And this list:
Hospitals=['a','b','c','d']
After I get the number of colors based on the length of the list 'Hospitals':
num_hosp=len(Hospitals)
colrs=colors[:num_hosp]
colrs
['#c85200', '#5f9ed1', 'lightgrey', '#ffbc79']
...and zip the lists together:
hcolrs=zip(Hospitals,colrs)
Next, I'd like to be able to select 1 or more colors from hcolrs if given a list of one or more hospitals from 'Hospitals'.
Like this:
newHosps=['a','c'] #input
newColrs=['#c85200','lightgrey'] #output
Thanks in advance!
Pass the result of zip to the dict constructor to make lookup simple/fast:
# Don't need to slice colors; zip stops when shortest iterable exhausted
hosp_to_color = dict(zip(Hospitals, colors))
then use it:
newHosps = ['a','c']
newColrs = [hosp_to_color[h] for h in newHosps]
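Putting the pieces together, a self-contained version might look like the sketch below. The helper colors_for and its default parameter are hypothetical additions that show one way to handle a hospital name missing from the mapping. Plain indexing as above raises KeyError instead.

```python
colors = ['#c85200', '#5f9ed1', 'lightgrey', '#ffbc79', '#006ba4', 'dimgray',
          '#ff800e', '#a2c8ec', 'grey', 'salmon', 'cyan', 'silver']
Hospitals = ['a', 'b', 'c', 'd']

# zip stops at the shorter input, so only the first four colors are used
hosp_to_color = dict(zip(Hospitals, colors))

def colors_for(hospitals, mapping, default=None):
    """Return the color for each hospital; unknown names map to default."""
    return [mapping.get(h, default) for h in hospitals]

print(colors_for(['a', 'c'], hosp_to_color))  # ['#c85200', 'lightgrey']
print(colors_for(['a', 'z'], hosp_to_color))  # ['#c85200', None]
```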