I am new to Python. I would like to convert each character of the word "cook" to its ASCII value and then add up the running totals. For example, for the word "cook" the total should be 99 + 210 + 321 + 428 = 1058. Below is my code:
import nltk
s="cook"
sum=0
for c in s:
    x=ord(c)
    sum=sum+x
    print(sum)
Output :
99
210
321
428
I want the total (1058). What do I have to add?
This appears to be the formula that you want:
x, total = 0, 0
for c in 'cook':
    x += ord(c)
    total += x
print(total)
It produces the number you want:
1058
Alternative: using numpy
>>> from numpy import sum, cumsum
>>> sum(cumsum([ord(c) for c in 'cook']))
1058
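If you would rather not depend on numpy, the same running-total idea can be sketched with the standard library. The function name cumulative_ord_total is made up for this example:

```python
from itertools import accumulate

def cumulative_ord_total(word):
    """Sum the running totals of the characters' code points."""
    # For 'cook' the running totals are 99, 210, 321, 428.
    running_totals = accumulate(ord(c) for c in word)
    return sum(running_totals)

print(cumulative_ord_total('cook'))  # 1058
```

accumulate() plays the role of cumsum here: it yields each partial sum in turn, and sum() adds them up.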
How can I plot data from a list against its indices using a for loop?
(I need to write something because my post is mostly code and I have to add some more details...)
This is the code sample:
import pandas as pd
import os
import matplotlib.pyplot as plt
import numpy as np
path = r'D:\Experiments\20210924_SureliteOPO_beampointing\all_log'
filesnames = os.listdir(path)
filesnames = [f for f in filesnames if (f.startswith("2") and f.lower().endswith(".csv"))]
dfs = list() # a list of dataframes
for csvfile in filesnames:
    fpath = path + '/' + csvfile
    df = pd.read_csv(fpath, skiprows=26, skipfooter=5)
    dfs.append(df)
print(dfs)
The output looks like this:
[ Powermeter1 Start
0 0.001864
1 0.001756
2 0.001818
3 0.001837
4 0.001932
.. ...
95 0.001697
96 0.001950
97 0.001871
98 0.001757
99 0.001849
[100 rows x 1 columns], Powermeter1 Start
0 0.001771
1 0.001863
2 0.001796
3 0.001885
4 0.001746
.. ...
95 0.001827
96 0.001678
97 0.001813
98 0.001776
99 0.001637
[100 rows x 1 columns]], ...
...
...
Regards,
Karl
Found it:
for i in range(len(dfs)):
    plt.plot(dfs[i].index, dfs[i])
enumerate() is designed for when you need to loop over an iterable's elements with access to their indices:
for i, elem in enumerate(dfs):
    ...
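As a rough illustration of the enumerate() pattern, here is a sketch that uses plain lists as hypothetical stand-ins for the DataFrames in dfs (the values are copied from the output above; the names readings and label_points are made up for this example):

```python
# Hypothetical stand-ins for the DataFrames in dfs: plain lists of readings.
readings = [
    [0.001864, 0.001756, 0.001818],
    [0.001771, 0.001863, 0.001796],
]

def label_points(datasets):
    """Return (dataset number, row index, value) for every reading."""
    points = []
    for i, data in enumerate(datasets):       # i counts the datasets
        for row, value in enumerate(data):    # row is the index to plot against
            points.append((i, row, value))
    return points

for point in label_points(readings):
    print(point)
```

With the real DataFrames, the body of the outer loop would presumably be a single plt.plot(elem.index, elem) call, so each DataFrame is plotted against its own index.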
I am creating spaCy regular-expression matches to find numbers and extract them into a pandas DataFrame.
Question: pandas picks up the numbers but overwrites the value instead of appending. How can I solve this?
(original code credit: yarongon)
from __future__ import unicode_literals
import spacy
import re
import pandas as pd
from datetime import date
nlp = spacy.load('en_core_web_sm', disable=['parser', 'tagger', 'ner'])
doc = nlp("This is a sample number: 11. This is second sample number: 1145.")
NUM_PATTERN = re.compile(r"\d+")
for match in re.finditer(NUM_PATTERN, doc.text):
    start, end = match.span()
    Number = doc.char_span(start, end)
    print Number
pandas_attributes = [Number,]
df = pd.DataFrame(pandas_attributes,
                  columns=['Number'])
print df
Output:
11
1145
Number
0 1145
Expected output:
Number
0 11
1 1145
Edit 1:
I am now trying to match multiple patterns against a single text.
from __future__ import unicode_literals
import spacy
import re
import pandas as pd
from datetime import date
nlp = spacy.load('en_core_web_sm', disable=['parser', 'tagger', 'ner'])
doc = nlp("This is a sample-number: 11. This is second sample number: 1145.")
NUM_PATTERN = re.compile(r"\d+")
HYPH_PATTERN = re.compile(r'\w+(?:-)\w+')
for match in re.finditer(NUM_PATTERN, doc.text):
    start, end = match.span()
    Number = doc.char_span(start, end)
    print Number
for match in re.finditer(HYPH_PATTERN, doc.text):
    start, end = match.span()
    Hyph_word = doc.char_span(start, end)
    print Hyph_word
pandas_attributes = [Number,Hyph_word]
df = pd.DataFrame(pandas_attributes,
                  columns=['Number','Hyphenword'])
print df
Current output:
11
1145
sample-number
AssertionError: 2 columns passed, passed data had 3 columns
Expected output:
Number Hyphen_word
11 sample-number
1145
Edit 2: output
Number Hyphenword
0 (11) (1145)
1 (sample, -, number) None
Expected output:
Number Hyphenword
0 11 sample-number
1 1145 None
You need to append the values to a list inside the loop:
L = []
for match in re.finditer(NUM_PATTERN, doc.text):
    start, end = match.span()
    L.append(doc.char_span(start, end))
and then use the DataFrame constructor:
df = pd.DataFrame(L,columns=['Number'])
You can also append tuples with multiple values:
Sample:
L = []
for x in range(3):
    Number = x + 1
    Val = x + 4
    L.append((Number, Val))
print (L)
[(1, 4), (2, 5), (3, 6)]
df = pd.DataFrame(L,columns=['Number', 'Val'])
print (df)
Number Val
0 1 4
1 2 5
2 3 6
I believe you can use a nested append, building one list per pattern:
PATTERNS = [NUM_PATTERN, HYPH_PATTERN]
pandas_attributes = []
for pat in PATTERNS:
    L = []
    for match in re.finditer(pat, doc.text):
        start, end = match.span()
        L.append(doc.char_span(start, end))
    pandas_attributes.append(L)
df = pd.DataFrame(pandas_attributes,
                  index=['Number','Hyphenword']).T
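The "edit 2" output in the question suggests that spaCy spans get unpacked into their tokens when passed to the DataFrame constructor. One possible way around that, sketched here with plain regex matches instead of spans (so spaCy is left out on purpose), is to collect the matched strings per pattern and pad the shorter column with None. The helper name match_rows is made up for this example:

```python
import re
from itertools import zip_longest

text = "This is a sample-number: 11. This is second sample number: 1145."
NUM_PATTERN = re.compile(r"\d+")
HYPH_PATTERN = re.compile(r"\w+-\w+")

def match_rows(text, patterns):
    """Return one row per match position, padding short columns with None."""
    # One list of matched strings per pattern, i.e. one list per column.
    columns = [[m.group() for m in pat.finditer(text)] for pat in patterns]
    # zip_longest turns columns into rows; its fill value defaults to None.
    return list(zip_longest(*columns))

rows = match_rows(text, [NUM_PATTERN, HYPH_PATTERN])
print(rows)  # [('11', 'sample-number'), ('1145', None)]
```

The resulting list of tuples has the same shape as the tuple sample above, so it should be usable as pd.DataFrame(rows, columns=['Number', 'Hyphenword']).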
I have written code to approximate the exponential function by a partial sum of its series. It should iterate up to the (N-1)th term and report the iteration number, sum, absolute error, and relative error at each step.
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
import math
N = input ("Please enter the integer term at which you want to truncate your summation")
x = input ("Please enter a number for which you want to run the exponential summation e^{x}")
function= math.exp(x)
exp_sum = 0.0
abs_err = 0.0
rel_err = 0.0
for n in range (0, N):
    factorial = math.factorial(n)
    power = x**n
    nth_term = power/factorial
    exp_sum = exp_sum + nth_term
    abs_err = abs(function - exp_sum)
    rel_err = abs(abs_err)/abs(function)
    print "The exponential function which has %d-term expansion, returns the approximated sum to be %.16f." % (n, exp_sum)
    print "This approximated sum has an absolute error to be %.25f" % abs_err
    print "and a relative error to be %.25f" % rel_err
Right now it looks silly to print the values at each iteration, and the output is only readable for the first few iterations. My plan is to get the output as a table with proper column headings (iteration, sum, absolute error, relative error) in the terminal when I run the .py file.
I would also like to save the output to a .txt file. If anyone knows how to do that in Python, I would very much appreciate the help. Thanks.
You might use a pretty_table() function in order to pretty print tabular data, like this:
def pretty_table(rows, column_count, column_spacing=4):
    aligned_columns = []
    for column in range(column_count):
        column_data = list(map(lambda row: row[column], rows))
        aligned_columns.append((max(map(len, column_data)) + column_spacing, column_data))
    for row in range(len(rows)):
        aligned_row = map(lambda x: (x[0], x[1][row]), aligned_columns)
        yield ''.join(map(lambda x: x[1] + ' ' * (x[0] - len(x[1])), aligned_row))
This little function, given a list of rows and the number of columns, will yield pretty-formatted table data, line by line. You can even adjust the spacing between columns if you wish.
In your particular code, you may do the following:
# At first, contains just the header columns.
rows = [['Term', 'Exponential sum', 'Absolute error', 'Relative error']]
for n in range (0, N):
    factorial = math.factorial(n)
    power = x**n
    nth_term = power/factorial
    exp_sum = exp_sum + nth_term
    abs_err = abs(function - exp_sum)
    rel_err = abs(abs_err)/abs(function)
    rows.append((str(n), str(exp_sum), str(abs_err), str(rel_err)))
for line in pretty_table(rows, 4):
    print(line)
For an input of N = 10, x = 5, this code outputs:
Term    Exponential sum    Absolute error    Relative error
0       1.0                147.413159103     0.993262053001
1       6.0                142.413159103     0.959572318005
2       18.5               129.913159103     0.875347980517
3       39.3333333333      109.079825769     0.734974084703
4       65.375             83.0381591026     0.559506714935
5       91.4166666667      56.9964924359     0.384039345167
6       113.118055556      35.295103547      0.237816537027
7       128.619047619      19.7941114835     0.13337167407
8       138.307167659      10.1059914438     0.0680936347218
9       143.68945657       4.72370253291     0.0318280573062
If you want to redirect it into a file, do this instead of the last for loop:
with open('my_file.txt', 'w') as output:
    for line in pretty_table(rows, 4):
        output.write(line + '\n')
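As an alternative to padding strings by hand, here is a sketch under Python 3 that leans on str.format width and precision specifiers. The function names exp_series_rows and format_table, the column widths, and the 10-digit precision are choices made for this example, not part of the code above:

```python
import math

def exp_series_rows(x, n_terms):
    """Return (term, partial sum, absolute error, relative error) tuples."""
    exact = math.exp(x)
    partial_sum = 0.0
    rows = []
    for n in range(n_terms):
        partial_sum += x ** n / math.factorial(n)
        abs_err = abs(exact - partial_sum)
        rows.append((n, partial_sum, abs_err, abs_err / exact))
    return rows

def format_table(rows):
    """Return the header line and one fixed-width line per row."""
    lines = ['{:>4}  {:>16}  {:>16}  {:>16}'.format(
        'Term', 'Exponential sum', 'Absolute error', 'Relative error')]
    for n, partial_sum, abs_err, rel_err in rows:
        # Right-align every column and print floats with 10 decimal places.
        lines.append('{:>4d}  {:>16.10f}  {:>16.10f}  {:>16.10f}'.format(
            n, partial_sum, abs_err, rel_err))
    return lines

for line in format_table(exp_series_rows(5, 10)):
    print(line)
```

Because format_table() returns plain strings, the same lines can be written to a file by calling output.write(line + '\n') inside a with open(...) block.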
I am trying to clean up some data. For the first-name variable, I would like to 1) assign a missing value (NaN) to entries that have only one character, 2) assign a missing value if an entry contains only two characters AND one of them is a symbol (e.g. "." or "?"), and 3) convert "wm" to the string "william".
I tried the following, along with other variations, but none of them seems to work:
import pandas as pd
from pandas import DataFrame, Series
import numpy as np
import re
def CleanUp():
    data = pd.read_csv("C:\sample.csv")
    frame2 = DataFrame(data)
    frame2.columns = ["First Name", "Ethnicity"]
    # Convert weird values to missing value
    for Name in frame2["First_Name"]:
        if len(Name) == 1:
            Name == np.nan
        if (len(Name) == 2) and (Name.str.contain(".|?|:", na=False)):
            Name == np.nan
        if Name == "wm":
            Name == "william"
    print frame2["First_Name"]
You're looking for df.replace.
First, make up some data:
np.random.seed(3)
n=6
df = pd.DataFrame({'Name' : np.random.choice(['wm','bob','harry','chickens'], size=n),
                   'timeStamp' : np.random.randint(1000, size=n)})
print df
       Name  timeStamp
0     harry        256
1        wm        789
2       bob        659
3  chickens        714
4        wm        875
5        wm        681
run the replace:
df.Name = df.Name.replace('wm','william')
print df
       Name  timeStamp
0     harry        256
1   william        789
2       bob        659
3  chickens        714
4   william        875
5   william        681
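The replace call covers rule 3 only. Note also that in the question's loop, Name == np.nan is a comparison rather than an assignment, so nothing is ever written back to the frame. A possible way to handle all three rules is a small per-value function, sketched below in plain Python. The name clean_name and the SYMBOLS set are assumptions for this example (the symbols are the ones that appear in the question's code):

```python
import math

SYMBOLS = set(".?:")  # assumed "symbol" characters, taken from the question

def clean_name(name):
    """Apply the three clean-up rules from the question to a single value."""
    if not isinstance(name, str):
        return name                  # leave existing missing values untouched
    if len(name) == 1:
        return math.nan              # rule 1: single character
    if len(name) == 2 and any(ch in SYMBOLS for ch in name):
        return math.nan              # rule 2: two characters, one a symbol
    if name == "wm":
        return "william"             # rule 3: expand the abbreviation
    return name

names = ["wm", "j", "a.", "bob", "??"]
print([clean_name(n) for n in names])  # ['william', nan, nan, 'bob', nan]
```

With pandas, this could presumably be applied to the column with something like frame2["First Name"] = frame2["First Name"].map(clean_name), which assigns the cleaned values back instead of discarding them.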