I am trying to write a code that will scramble the words in a sentence and return a string that is in a different order
from random import shuffle
def scramble():
    a = len("this is a sentence")
    for i in range(a):
        random.shuffle("this is a sentence")
    print(random.shuffle)
Not sure if I am even on the right track, however I believe the loop might be the issue
random.shuffle works on a mutable sequence, not a string. So first use str.split to split the sentence into a list of words, call shuffle on that list, then turn it back into a string with str.join:
from random import shuffle
def scramble(sentence):
    split = sentence.split()   # Split the string into a list of words
    shuffle(split)             # This shuffles the list in place
    return ' '.join(split)     # Turn the list back into a string

print(scramble("this is a sentence"))
Output:
sentence a this is
I want to take words a user provides, store them in a list, and then modify those words so that every other letter is capitalized. I have working code, but it is repetitive. I cannot for the life of me figure out how to get all the words run through one function without it outputting one long string with the spaces removed. Any help is appreciated.
This is my current code:
def sarcastic_caps(lis1):
    list = []
    index = 0
    for ltr in lis1[0]:
        if index % 2 == 0:
            list.append(ltr.upper())
        else:
            list.append(ltr.lower())
        index = index + 1
    return ''.join(list)

final_list.append(sarcastic_caps(lis1))
Imagine 4 more iterations of this ^. Is it possible to do it all in one function?
I have tried expanding the list index but that returns all of the letters smashed together, not individual words. That is because of the .join but I need that to get all of the letters back together after running them through the .upper/.lower.
I am trying to go from ['hat', 'cat', 'fat'] to ['HaT', 'CaT', 'FaT'].
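A minimal sketch of one way to remove the repetition (not the original code; it pulls the per-letter logic into a helper that takes a single word and then maps it over the whole list):

def sarcastic_caps(word):
    # Upper-case the letters at even positions, lower-case the rest.
    return ''.join(
        ch.upper() if i % 2 == 0 else ch.lower()
        for i, ch in enumerate(word)
    )

def sarcastic_words(words):
    # Apply the helper to every word and keep the results as a list.
    return [sarcastic_caps(word) for word in words]

print(sarcastic_words(['hat', 'cat', 'fat']))  # ['HaT', 'CaT', 'FaT']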
I'm trying to solve a challenge that I found online. It gives an input word, and the expected output is a list of the indexes of all the capital letters. My program works unless there are duplicate capital letters. I can't figure out how to deal with it. Here's my code right now:
def capital_indexes(string):
    string = list(string)
    print(string)
    output = []
    for i in string:
        if i.isupper():
            output.append(string.index(i))
    return output
Like I said, it works for words like "HeLlO" but not for words like "TesT"
Try this version and compare it with the OP's code.

You don't need the index() method to search for the character again; str.index() always returns the first occurrence, which is why duplicate capitals (the second "T" in "TesT") get reported as index 0 again. Just use enumerate to get the (index, char) pair directly and check whether the character is upper case:
def capital_indexes(string):
    # string = list(string)  # not needed: a string is already iterable
    # print(string)
    output = []
    for i, ch in enumerate(string):  # get index, char
        if ch.isupper():
            output.append(i)
    return output

print(capital_indexes('TesT'))  # [0, 3]
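If you prefer, the same idea fits in a single list comprehension (a variant sketch, not part of the original answer):

def capital_indexes(string):
    return [i for i, ch in enumerate(string) if ch.isupper()]

print(capital_indexes('TesT'))  # [0, 3]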
Each call with a number that contains any of the digits 2, 4, or 8 should output "Clap"; otherwise "No Clap". The function signature is def clapping(number). For example, calling:

clapping(772)

Output:

"Clap"
I wrote the program below, but something seems to be wrong. Can someone help me find the problem?
import re

def clapping(number):
    return "Clap" if re.findall("[248]+", number) else "No Clap"

print(clapping(779))
The regexp functions require a string to search, so you have to convert the number to a string with str(number).
There's also no need to use findall(). You only need to know whether the regexp matches at all; you don't need a list of all the matches. Similarly, you don't need the + quantifier, since matching a single character is enough.
import re

def clapping(number):
    return "Clap" if re.search("[248]", str(number)) else "No Clap"
In Python 3, we can use re.compile(), nltk.tokenize() and TextBlob.words() to tokenize a given text. I think there may be other methods too, but I am unaware of them.
Which of these methods or other unmentioned methods tokenizes a given text the fastest?
Thank you in advance.
After calculating the difference in the timestamps between the start and end of each tokenize function, I have come to the following observations:
1) Regex operation is the fastest. The code is as follows:
import re

WORD = re.compile(r'\w+')

def regTokenize(text):
    words = WORD.findall(text)
    return words
The time taken for tokenizing 100,000 simple, one-lined strings is 0.843757 seconds.
2) NLTK word_tokenize(text) is second. The code is as follows:
import nltk

def nltkTokenize(text):
    words = nltk.word_tokenize(text)
    return words
The time taken for tokenizing 100,000 simple, one-lined strings is 18.869182 seconds.
3) TextBlob.words is the slowest. The code is as follows:
from textblob import TextBlob as tb

def blobTokenize(text):
    words = tb(text).words
    return words
The time taken for tokenizing 100,000 simple, one-lined strings is 34.310102 seconds.
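For reference, here is a minimal sketch of the kind of timing harness described above; the corpus and measurement details of the original test are not shown, so the ones used here are assumptions:

import time

def time_tokenizer(tokenize, lines):
    # Measure how long it takes to tokenize every line once.
    start = time.time()
    for line in lines:
        tokenize(line)
    return time.time() - start

# Hypothetical corpus: 100,000 simple one-line strings.
lines = ["this is a simple one lined string"] * 100_000
print(time_tokenizer(regTokenize, lines))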
The regex approach is extremely fast. However, NLTK also tokenizes punctuation characters, so it returns a larger list. TextBlob is almost twice as slow as NLTK, but it stores only the words from the tokenized list.
If anybody else was wondering the same thing, here is the answer.
I have Python code for counting word frequencies in a text file. The problem with the program is that it takes full stops into account, which alters the count. For counting words I've used a sorted list of words. I tried to remove the full stops using
words = open(f, 'r').read().lower().split()
uniqueword = sorted(set(words))
uniqueword = uniqueword.replace(".","")
but I get this error:
AttributeError: 'list' object has no attribute 'replace'
Any help would be appreciated :)
You can process the words before you make the set, using a list comprehension:
words = [word.replace(".", "") for word in words]
You could also remove them after (uniquewords = [word.replace...]), but then you will reintroduce duplicates.
Note that if you want to count these words, a Counter may be more useful:
from collections import Counter
counts = Counter(words)
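As a small illustration (the word list here is made up, not from the original file):

from collections import Counter

words = ["the", "cat", "sat", "on", "the", "mat"]  # hypothetical cleaned word list
counts = Counter(words)
print(counts["the"])          # 2
print(counts.most_common(1))  # [('the', 2)]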
You might be better off with
words = re.findall(r'\w+', open(f, 'r').read().lower())
which will grab all the strings composed of one or more “word characters” and will ignore punctuation and whitespace.