I'm writing code that fetches some text from a site and then, with a for loop, extracts the part of the text I'm interested in. I can print this text, but I would like to know how to store it in a list for later use. This is the code I've written so far:
import urllib2
keyword = raw_input('keyword: ')
URL = "http://www.uniprot.org/uniprot/?sort=score&desc=&compress=no&query=%s&fil=&limit=10&force=no&preview=true&format=fasta" % keyword
filehandle = urllib2.urlopen(URL)
url_text = filehandle.readlines()
for line in url_text:
    if line.startswith('>'):
        print line[line.index(' ') : line.index('OS')]
Just use append:
lines = []
for line in url_text:
    if line.startswith('>'):
        lines.append(line) # or whatever else you wanted to add to the list
        print line[line.index(' ') : line.index('OS')]
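If the goal is to keep only the extracted part rather than the whole header line, the slice itself can be appended. The following is a minimal sketch that runs without a network connection; the sample header lines are made up for illustration and only stand in for url_text.

```python
# Made-up FASTA-style lines standing in for url_text (not real UniProt output)
sample_lines = [
    ">sp|P00001|ABC_HUMAN Example protein OS=Homo sapiens GN=ABC\n",
    "MKTAYIAKQRQISFVK\n",
    ">sp|P00002|XYZ_YEAST Another protein OS=Saccharomyces cerevisiae\n",
]

names = []
for line in sample_lines:
    if line.startswith(">"):
        # Same slice as in the question: from the first space up to the first "OS"
        names.append(line[line.index(" "):line.index("OS")].strip())

print(names)
```

Note that line.index('OS') finds the first occurrence of "OS" anywhere in the line, so searching for ' OS=' instead may be safer if an identifier or protein name could contain those two letters.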
Edit: on a side note, Python can loop directly over a file object, so:
url_text = filehandle.readlines()
for line in url_text:
    pass
# can be shortened to:
for line in filehandle:
    pass
I have recently started building a bot with discord.py and got stuck on the following:
@client.command()
async def replace(ctx, *, arg):
    msg = f"{arg}" .format(ctx.message).replace('happe', '<:happe:869470107066302484>')
    await ctx.send(msg)
This command replaces the word "happe" with the emoji, so it looks like this:
Command : {prefix}replace i am very happe
Result : i am very "emoji for happe"
But I want to be able to replace multiple words with different emojis, not just a single one, like this:
Command : {prefix}replace i am not happe, i am sad
Result : i am not "emoji for happe", i am "emoji for sad"
Is there a way to replace multiple words in one sentence, for example by using a JSON file or a list of emojis and their IDs?
Also, this command doesn't seem to work in cogs; it says the command is invalid.
is there a way to edit multiple words in just one sentence like using a json file or making a list of emojis and its id?
Yes, you can do just that. Make a list of word-emoji pairs and iterate over all of them to replace every word.
async def replace(ctx, *, arg):
    l = [("happe", "<:happe:869470107066302484>"), ("sad", "..."), ...]
    msg = arg # Work on a separate name so the original arg stays available if you need it later
    for word, emoji in l:
        msg = msg.replace(word, emoji)
    await ctx.send(msg)
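Since the question mentions a JSON file, here is a minimal sketch of that variant, kept independent of discord.py so it can be run on its own. The JSON text, the helper name replace_words, and the emoji ID for "sad" are placeholders made up for illustration; in a bot the mapping might instead be loaded from a file with json.load. It uses a single regex pass rather than repeated str.replace calls, so text that was already inserted is never replaced a second time.

```python
import json
import re

# Placeholder JSON; the "sad" emoji ID is invented for this example
EMOJI_JSON = """
{
    "happe": "<:happe:869470107066302484>",
    "sad": "<:sad:000000000000000000>"
}
"""

def replace_words(text, mapping):
    """Replace every key of mapping found in text with its emoji, in one pass."""
    if not mapping:
        return text
    # Escape the words so they are matched literally, then join them with "|"
    pattern = re.compile("|".join(re.escape(word) for word in mapping))
    return pattern.sub(lambda match: mapping[match.group(0)], text)

mapping = json.loads(EMOJI_JSON)
print(replace_words("i am not happe, i am sad", mapping))
```

Inside the command, the body would then reduce to something like await ctx.send(replace_words(arg, mapping)).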
also this command doesn't seem to work in cogs and says command is invalid
In cogs the decorator is @commands.command(), not @client.command(). Don't forget the from discord.ext import commands import to get access to it.
Lastly, I'm a bit confused about what f"{arg}" .format(ctx.message) is supposed to be doing. You can remove the .format call and the f-string entirely. arg is already a string, so putting it in an f-string with nothing else has no effect, and .format(ctx.message) doesn't do anything either (unless the text itself happens to contain braces, in which case .format would try to interpret them). The result of that entire expression is the same as just arg.
>>> arg = "this is the arg"
>>> f"{arg}".format("this has no effect")
'this is the arg'
I'm trying to write a small crawler to crawl multiple Wikipedia pages.
I want to make the crawl somewhat dynamic by building the URL of each wiki page from a file that contains a list of names.
For example, the first line of "deutsche_Schauspieler.txt" says "Alfred Abel" and the concatenated string would be "https://de.wikipedia.org/wiki/Alfred Abel". Using the txt file results in heading being None, yet when I complete the link with a string inside the script, it works.
This is for Python 2.x.
I already tried to switch from " to ',
tried + instead of %s,
tried to put the whole URL into the txt file (so that the first line reads "http://..." instead of "Alfred Abel"),
tried to switch from "Alfred Abel" to "Alfred_Abel".
from bs4 import BeautifulSoup
import requests
file = open("test.txt","w")
f = open("deutsche_Schauspieler.txt","r")
content = f.readlines()
for line in content:
    link = "https://de.wikipedia.org/wiki/%s" % (str(line))
    response = requests.get(link)
    html = response.content
    soup = BeautifulSoup(html)
    heading = soup.find(id='Vorlage_Personendaten')
    uls = heading.find_all('td')
    for item in uls:
        file.write(item.text.encode('utf-8') + "\n")
f.close()
file.close()
I expect to get the content of the table "Vorlage_Personendaten", which actually works if I change the line that builds link to
link = "https://de.wikipedia.org/wiki/Alfred Abel"
# link = "https://de.wikipedia.org/wiki/Alfred_Abel" also works
But I want it to work using the text file.
It looks like the problem is in your text file, where you have written "Alfred Abel" with quotes; that is why you are getting the following exception:
    uls = heading.find_all('td')
AttributeError: 'NoneType' object has no attribute 'find_all'
Please remove the quotes around "Alfred Abel" so that the line in deutsche_Schauspieler.txt reads Alfred Abel. It will then work as expected.
I found the solution myself.
Although there are no additional lines in the file, the content list looks like
['Alfred Abel\n'], while printing its first element shows just 'Alfred Abel'. The trailing newline is still part of the string, so it ends up in the URL and produces a wrong link.
So you want to remove the last(!) character from the current line.
A solution could look like this:
from bs4 import BeautifulSoup
import requests
file = open("test.txt","w")
f = open("deutsche_Schauspieler.txt","r")
content = f.readlines()
print (content)
for line in content:
    line = line.rstrip("\n") # strip the trailing newline ("\n" is a single character)
    link = "https://de.wikipedia.org/wiki/%s" % str(line)
    response = requests.get(link)
    html = response.content
    soup = BeautifulSoup(html,"html.parser")
    try:
        heading = soup.find(id='Vorlage_Personendaten')
        uls = heading.find_all('td')
        for item in uls:
            file.write(item.text.encode('utf-8') + "\n")
    except AttributeError:
        # heading is None when the page has no 'Vorlage_Personendaten' element
        print ("That did not work")
f.close()
file.close()
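To see the effect of the trailing newline in isolation, here is a small sketch that needs neither the network nor BeautifulSoup. The in-memory file stands in for deutsche_Schauspieler.txt, and "Example Name" is a made-up second entry. Replacing spaces with underscores is optional, since the question notes that both URL forms work.

```python
import io

# Stand-in for open("deutsche_Schauspieler.txt"); "Example Name" is invented
name_file = io.StringIO(u"Alfred Abel\nExample Name\n\n")

links = []
for line in name_file:
    name = line.strip()  # drop the trailing "\n" and any surrounding spaces
    if not name:
        continue  # skip blank lines instead of requesting a bare /wiki/ URL
    links.append("https://de.wikipedia.org/wiki/%s" % name.replace(" ", "_"))

for link in links:
    print(link)
```

Using strip() rather than line[:-1] also avoids cutting off a real character when the last line of the file has no trailing newline.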
My code grabs information and stores it in a list. The sorted list is:
example/
example/text1.txt
example/text2.txt
example/text3.txt
I would like to refer to text1.txt and run a function on it, then move on to the next entry in the list (in this case, text2.txt).
I experimented a bit with regex, but it outputs nothing.
Here's a portion of my code so far:
FileNames = name in sorted(zip_file.namelist())
regex = r"[1-9]+ \d"
matches = re.findall(regex, str(FileNames))
for match in matches:
    print("%s" % (match))
EDIT:
Using a different technique, here's what I've got so far:
import zipfile
import re
zip_file = zipfile.ZipFile('/example.zip','r')
#zip_file = zipfile.ZipFile(input("What's the filepath?: "))
for name in sorted(zip_file.namelist()):
    #print(name)
    for file_path in name:
        file_name = file_path.split("/")[-1]
        if "1" in file_name:
            print(file_name)
        else:
            print("This line does not contain a valid path to a text file.")
zip_file.close()
It gives me a really gross output, something along the lines of
example/text1.txt
The Line does not contain a valid path to a text file.
^repeated a ton of times
I would not use regex here but a simple split, since you are dealing with paths. For each entry, take the last component after "/" (it is empty for a directory entry such as example/). If it contains ".txt", it's your file name; otherwise ignore that entry.
files_paths = ["example/",
               "example/text1.txt",
               "example/text2.txt",
               "example/text3.txt"]
for file_path in files_paths :
    file_name = file_path.split("/")[-1]
    if ".txt" in file_name:
        ... # call a function with file_name as an argument.
    else:
        print("This line does not contain a valid path to a text file.")
The print is just there for testing of course, feel free to delete the else clause if you want your script to stay silent.
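Applied to the zipfile code from the question, the same idea might look like the sketch below. It builds a small archive in memory so it can be run as-is; the file contents and the handle function are placeholders invented for the example. The repeated output in the question comes from the inner loop for file_path in name, which iterates over the characters of a single path; each name returned by namelist() is already one full path, so a single loop is enough.

```python
import io
import zipfile

# Build a small in-memory archive that mirrors the listing in the question
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example/", "")  # directory entry
    for number in (1, 2, 3):
        archive.writestr("example/text%d.txt" % number, "contents %d" % number)

def handle(file_name, data):
    """Hypothetical placeholder for the real per-file work."""
    return "%s: %d bytes" % (file_name, len(data))

results = []
with zipfile.ZipFile(buffer) as archive:
    for name in sorted(archive.namelist()):  # name is already one full path
        file_name = name.split("/")[-1]      # empty string for "example/"
        if file_name.endswith(".txt"):
            results.append(handle(file_name, archive.read(name)))

print(results)
```

Checking endswith(".txt") is slightly stricter than ".txt" in file_name, which would also accept names such as notes.txt.bak.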
Assuming that every entry in the list is a path to a file you want to work on, you can use the os.path module to get the base name of the file path.
import os
path_str = "example/text1.txt"
file_name = os.path.basename(os.path.normpath(path_str))
I'm reading a file outputfromextract and I want to split the contents of that file on the delimiter ',', which I have done.
After reading the contents into a list, there are two 'faff' entries at the beginning that I'm trying to remove, but I find myself unable to remove them by index.
import json

class device:
    ipaddress = None
    macaddress = None

    def __init__(self, ipaddress, macaddress):
        self.ipaddress = ipaddress
        self.macaddress = macaddress

listofItems = []
listofdevices = []

def format_the_data():
    file = open("outputfromextract")
    contentsofFile = file.read()
    individualItem = contentsofFile.split(',')
    listofItems.append(individualItem)
    print(listofItems[0][0:2]) #this here displays the entries I want to remove
    listofItems.remove[0[0:2]] # fails here and raises a TypeError (int object not subscriptable)
For reference, the first three entries of the list read from the file are shown below:
[u' #created by system\n', u'time at 12:05\n', u'192.168.1.1\n',...
I simply want to remove those first two items from the list; the rest will be passed to a constructor.
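listofItems.remove[0[0:2]] fails because 0[0:2] tries to slice the integer 0. Note also that remove is a method that takes the value to remove, not an index, so it cannot be subscripted; to delete the first two entries in place you could write del listofItems[0][0:2].
But slicing when you build the list will be much easier, for example: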
with open("outputfromextract") as f:
    contentsofFile = f.read()
individualItem = contentsofFile.split(',')[2:]
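As a quick check of both approaches, here is a sketch that uses an in-memory string in place of the outputfromextract file. The first three entries follow the sample shown in the question; the MAC address is a made-up value added so there is something left to pair with the IP address.

```python
# Stand-in for the file contents; the MAC address is invented for illustration
contents = " #created by system\n,time at 12:05\n,192.168.1.1\n,AA:BB:CC:DD:EE:FF\n"

# Option 1: slice while building the list, which leaves the header entries out
sliced = contents.split(",")[2:]

# Option 2: build the full list first, then delete the first two entries in place
items = contents.split(",")
del items[0:2]

# Either way, strip the newline that is still attached to each entry
cleaned = [item.strip() for item in items]
print(cleaned)
```

Both options give the same list; del is mainly useful when the list already exists and is referenced elsewhere.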
I am writing a script that can be placed in, e.g., the menu.py file to load all custom plugins/gizmos/.nk files into a new menu at startup. It is supposed to walk the subdirectories of the specified folder and create submenus that order the items by category. PROBLEM: It creates the menu, its submenus, and the items in place, but although the item names are different, they all create exactly the same node when executed. I don't understand what is happening there.
Here is what I have so far:
import os
pluginpath = 'C:\Users\Workstation\.nuke\userplugins'
#print nuke.pluginPath()
customMenu = nuke.menu('Nodes').addMenu('UserPlugIns')
for dirpath, dirnames, filenames in os.walk ( pluginpath ):
    print ('')
    print ('CurrentPath: ' , dirpath)
    nuke.pluginAddPath(dirpath)
    dirname = os.path.split(dirpath)[-1]
    subMenu = customMenu.addMenu(dirname)
    #print ('Directories: ' , dirnames)
    #print ('Filenames: ' , filenames)
    for x in filenames:
        print x
        subMenu.addCommand(x, lambda: nuke.createNode('{}'.format(x)))
I guess it is the last line causing the problem. Any ideas?
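That's a known pitfall with lambda inside a loop: the lambda looks up x when it is called, not when it is defined, so every command ends up seeing the last value of x. You probably want to use partial instead, which binds the current value immediately: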
from functools import partial
....
....
........
        subMenu.addCommand(x, partial(nuke.createNode, x))
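The difference can be reproduced without Nuke. In this sketch, create_node is a hypothetical stand-in for nuke.createNode and the file names are invented; it compares the lambda from the question with partial and with the common default-argument workaround.

```python
from functools import partial

def create_node(name):
    """Hypothetical stand-in for nuke.createNode."""
    return "created " + name

filenames = ["blur.gizmo", "glow.gizmo"]  # made-up file names

late_bound = []   # lambdas as in the question
bound_now = []    # partial binds the current value of x immediately
default_arg = []  # a default argument is evaluated when the lambda is defined

for x in filenames:
    late_bound.append(lambda: create_node(x))
    bound_now.append(partial(create_node, x))
    default_arg.append(lambda x=x: create_node(x))

print([command() for command in late_bound])   # every call sees the last x
print([command() for command in bound_now])    # one result per file name
print([command() for command in default_arg])  # one result per file name
```

Note that partial takes the function and its arguments separately; writing partial(nuke.createNode(x)) would call createNode right away instead of deferring it.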