Using ArcDesktop 10.1 & Python 2.7:
I am working on a script that searches for values within 13 fields and, based on what it finds in those fields, concatenates a string and writes the result to an existing (empty) field.
It uses a search cursor to read the 13 fields, then uses the result in an update cursor to concatenate the string.
I am having trouble writing the result to the field with setValue (line 40 of the code below: urow.setValue(commentsField, easementType)). The error message is very unhelpful (RuntimeError: ERROR 999999: Error executing function.)
I am not sure how to correctly get the value set in the field desired. Any help would be greatly appreciated!
import arcpy, os, math
from itertools import izip
arcpy.env.workspace = "C:\\Users\\mdelgado\\Desktop\\WorkinDog.gdb"
#These are my variables
fc = "EASEMENTS"
commentsField = "Comments"
typeFields = ["ABANDONED", "ACCESS", "AERIAL", "BLANKET", "COMM", "DRAIN", "ELEC", "GEN_UTIL", "LANDSCAPE", "PARKING", "PIPELINE", "SAN_SEWR", "SIDEWALK", "SPECIAL", "STM_SEWR", "WATER"]
fieldNames = ["ABANDONED", "ACCESS", "AERIAL", "BLANKET", "COMMUNICATION", "DRAINAGE", "ELECTRIC", "GENERAL UTILITY", "LANDSCAPE", "PARKING", "PIPELINE", "SANITATION SEWER", "SIDEWALK", "SPECIAL", "STORM SEWER", "WATER"]
fieldValues = []
easementType = ""
#This is my search cursor
scursor = arcpy.SearchCursor(fc)
srow = scursor.next()
for field in typeFields:
    srowValue = (srow.getValue(field))
    fieldValues.append(srowValue)
    srow = scursor.next()
print fieldValues
#This is my update cursor
ucursor = arcpy.UpdateCursor(fc)
for urow in ucursor:
    #This is where I begin the loop to concatenate the comment field
    for (value, name) in izip(fieldValues, fieldNames):
        print str(value) + " " + name
        #This is where I check each field to find out which types the easement is
        if value == 1:
            easementType = easementType + name + ", "
    #This is where I format the final concatenated string
    easementType = easementType[:-2]
    print easementType
    #This is where the field is updated with the final string using the cursor
    urow.setValue(commentsField, easementType)
    ucursor.updateRow(urow)
    urow = cursor.next()
del urow
del ucursor
del srow
del scursor
The uninformative 999999 error is one of the worst.
I suggest a couple modifications to your approach that may make it simpler to troubleshoot. First, use the da Cursors -- they are faster, and the syntax is a little simpler.
Second, you don't need a separate Search and Update -- the Update can "search" other fields in the same row in addition to updating fields. (The current code, assuming it was working correctly, would be putting the same fieldValues into every row the UpdateCursor affected.)
import arcpy

fc = "EASEMENTS"  # feature class from the question

# Human-readable labels, in the same order as the type fields below
fieldNames = ["ABANDONED", "ACCESS", "AERIAL", "BLANKET", "COMMUNICATION", "DRAINAGE",
              "ELECTRIC", "GENERAL UTILITY", "LANDSCAPE", "PARKING", "PIPELINE", "SANITATION SEWER",
              "SIDEWALK", "SPECIAL", "STORM SEWER", "WATER"]
# The type fields, followed by the field that receives the result
cursorFields = ["ABANDONED", "ACCESS", "AERIAL", "BLANKET", "COMM", "DRAIN",
                "ELEC", "GEN_UTIL", "LANDSCAPE", "PARKING", "PIPELINE", "SAN_SEWR",
                "SIDEWALK", "SPECIAL", "STM_SEWR", "WATER", "Comments"]

with arcpy.da.UpdateCursor(fc, cursorFields) as cursor:
    for row in cursor:
        easementType = ""
        # Check every type field; the last entry in the row is Comments
        for x in range(len(fieldNames)):
            if row[x] == 1:
                easementType += fieldNames[x] + ", "
        # Trim the trailing ", "
        easementType = easementType[:-2]
        print easementType
        row[-1] = easementType
        cursor.updateRow(row)
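The string-building step can be tried without ArcGIS. The following is a minimal sketch, assuming each row is a sequence of flag values in the same order as the labels; build_comment is a hypothetical helper name, and the sample flags are made up. It uses ", ".join instead of trimming the trailing ", ".

```python
def build_comment(flags, labels):
    """Join the labels whose matching flag equals 1."""
    return ", ".join(label for flag, label in zip(flags, labels) if flag == 1)


labels = ["ABANDONED", "ACCESS", "AERIAL", "BLANKET"]
flags = [0, 1, None, 1]  # made-up row values, not real data

print(build_comment(flags, labels))
```

Inside the cursor loop, the same idea would be row[-1] = build_comment(row[:-1], fieldNames), assuming Comments is the last field in the cursor's field list.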
I am trying to read a text file and collect addresses from it. Here's an example of one of the entries in the text file:
Electrical Vendor Contact: John Smith Phone #: 123-456-7890
Address: 1234 ADDRESS ROAD Ship To:
Suite 123 ,
Nowhere, CA United States 12345
Phone: 234-567-8901 E-Mail: john.smith@gmail.com
Fax: 345-678-9012 Web Address: www.electricalvendor.com
Acct. No: 123456 Monthly Due Date: Days Until Due
Tax ID: Fed 1099 Exempt Discount On Assets Only
G/L Liab. Override:
G/L Default Exp:
Comments:
APPROVED FOR ELECTRICAL THINGS
I cannot wrap my head around how to search for and store the address for each of these entries when the number of lines in the address varies. Currently, I have a generator that reads each line of the file. The get_addrs() method then looks for markers such as the Address: and Ship keywords to signal when an address needs to be stored. After that, I use a regular expression to search for zip codes in the line following a line with the Address: keyword. I think I've figured out how to successfully save the second line for all addresses using that method. However, in a few addresses there is a suite number or other piece of information that makes the address three lines instead of two. I'm not sure how to account for this. I tried expanding my save_previous() method to three lines, but I can't get it quite right. Here's the code with which I was able to save all of the two-line addresses:
import re


class GetAddress():
    def __init__(self):
        self.line1 = []
        self.line2 = []
        self.s_line1 = []
        self.addr_index = 0
        self.ship_index = 0
        self.no_ship = False
        self.addr_here = False
        self.prev_line = []
        self.us_zip = ''

    # Check if there is a shipping address.
    def set_no_ship(self, line):
        try:
            self.no_ship = line.index(',') == len(line) - 1
        except ValueError:
            pass

    # Save two lines at a time to see whether or not the previous
    # line contains 'Address:' and 'Ship'.
    def save_previous(self, line):
        self.prev_line += [line]
        if len(self.prev_line) > 2:
            del self.prev_line[0]

    def get_addrs(self, line):
        self.addr_here = 'Address:' in line and 'Ship' in line
        self.po_box = False
        self.no_ship = False
        self.addr_index = 0
        self.ship_index = 0
        self.zip1_index = 0
        self.set_no_ship(line)
        self.save_previous(line)
        # Check if 'Address:' and 'Ship' are in the previous line.
        self.prev_addr = (
            'Address:' in self.prev_line[0]
            and 'Ship' in self.prev_line[0])
        if self.addr_here:
            self.po_box = 'Box' in line or 'BOX' in line
            self.addr_index = line.index('Address:') + 1
            self.ship_index = line.index('Ship')
            # Get the contents of the line between 'Address:' and
            # 'Ship' if both words are present in this line.
            if self.addr_index != self.ship_index:
                self.line1 += [' '.join(line[self.addr_index:self.ship_index])]
            else:
                self.line1 += ['']
        if len(self.prev_line) > 1 and self.prev_addr:
            self.po_box = 'Box' in line or 'BOX' in line
            self.us_zip = re.search(r'(\d{5}(\-\d{4})?)', ' '.join(line))
            if self.us_zip and not self.po_box:
                self.zip1_index = line.index(self.us_zip.group(1))
            if self.no_ship:
                self.line2 += [' '.join(line[:line.index(',')])]
            elif self.zip1_index and not self.no_ship:
                self.line2 += [' '.join(line[:self.zip1_index + 1])]
            elif len(self.line1) > 0 and not self.line1[-1]:
                self.line2 += ['']


# Create a generator to read each line of the file.
def read_gen(infile):
    with open(infile, 'r') as file:
        for line in file:
            yield line.split()


infile = 'Vendor List.txt'
info = GetAddress()
for i, line in enumerate(read_gen(infile)):
    info.get_addrs(line)
I am still a beginner in Python so I'm sure a lot of my code may be redundant or unnecessary. I'd love some feedback as to how I might make this simpler and shorter while capturing both two and three line addresses.
I also posted this question to Reddit, and u/Binary101010 pointed out that the text file is fixed width, so it may be possible to slice each line in a way that selects only the necessary address information. Using this idea I added some functionality to the generator function, and I was able to produce the desired effect with the following code:
infile = 'Vendor List.txt'


# Create a generator with differing modes to read the specified lines of the file.
def read_gen(infile, mode=0, start=0, end=0, rows=None):
    with open(infile, 'r') as file:
        for i, line in enumerate(file):
            # Set end to correct value if no argument is given.
            if end == 0:
                end = len(line)
            # Mode 0 gives all lines of the file
            if mode == 0:
                yield line[start:end]
            # Mode 1 gives specific lines from the file using the rows keyword
            # argument. Make sure rows is formatted as [start_row, end_row].
            # rows list should only ever be length 2.
            elif mode == 1:
                # Return the current line if the index falls between the
                # specified rows.
                if rows and rows[0] <= i < rows[1]:
                    yield line[start:end]


class GetAddress:
    def __init__(self):
        # set_addresses() reads the module-level infile.
        self.address_indices = list()
        self.phone_indices = list()
        self.addresses = list()
        self.count = 0

    def get(self, i, line):
        # Search for appropriate substrings and set indices accordingly.
        if 'Address:' in line[18:26]:
            self.address_indices += [i]
        if 'Phone:' in line[18:24]:
            self.phone_indices += [i]
        # Add address to list if both necessary indices have been collected.
        if i in self.phone_indices:
            self.set_addresses()

    def set_addresses(self):
        self.address = list()
        start = self.address_indices[self.count]
        end = self.phone_indices[self.count]
        # Create a generator that only yields substrings for rows between given
        # indices.
        self.generator = read_gen(
            infile,
            mode=1,
            start=40,
            end=91,
            rows=[start, end])
        # Collect each line of the address from the generator and remove
        # unnecessary spaces.
        for element in range(start, end):
            self.address += [next(self.generator).strip()]
        # This document has a header on each page and a portion of that is
        # collected in the address substring. Search for the header substring
        # and remove the corresponding elements from self.address.
        if len(self.address) > 3 and not self.address[-1]:
            self.address = self.address[:self.address.index('header text')]
        self.addresses += [self.address]
        self.count += 1


info = GetAddress()
for i, line in enumerate(read_gen(infile)):
    info.get(i, line)
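The same fixed-width idea can also be done in a single pass, without reopening the file for each address. This is only a sketch: the column offsets (18 for the label, 40 to 91 for the address text) are taken from the code above, while extract_addresses, make_line, and the sample lines are hypothetical and built just for illustration. It does not handle the page-header case mentioned above.

```python
def extract_addresses(lines, label_start=18, addr_start=40, addr_end=91):
    """Collect the address column between each 'Address:' and 'Phone:' label."""
    addresses = []
    current = None  # lines of the address being collected, or None
    for line in lines:
        label = line[label_start:addr_start]
        if label.startswith('Address:'):
            current = []
        elif label.startswith('Phone:'):
            if current is not None:
                # Drop empty slices so two- and three-line addresses both work.
                addresses.append([part for part in current if part])
            current = None
        if current is not None:
            current.append(line[addr_start:addr_end].strip())
    return addresses


def make_line(label, text):
    # Build a made-up fixed-width line: label at column 18, text at column 40.
    return ' ' * 18 + label.ljust(22) + text.ljust(51) + 'Ship To:'


sample = [
    make_line('Address:', '1234 ADDRESS ROAD'),
    make_line('', 'Suite 123'),
    make_line('', 'Nowhere, CA United States 12345'),
    make_line('Phone:', '234-567-8901'),
]
print(extract_addresses(sample))
```

With a real file, the lines argument could be the open file object itself, since iterating over it yields one line at a time.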
I would like to count the different universities from which the mail was sent, for which I used the following code:
fname = raw_input('Enter the file name: ')
try:
    fhan = open(fname)
except:
    print 'File cannot be opened:', fname
count = 0
sum = 0
for i in fhan:
    if i.startswith('From'):
        x=i.find('@')
        y=i.find(' ',x)
        str1=i[x+1:y].strip()
        print str1
        count=count+1
print count
The final output gives me the handles, but can I remove the repeated ones? If uct.ac.za has already been printed, it should not be printed and counted again.
link for file: www.py4inf.com/code/mbox-short.txt
You can append the handles to a list instead of printing them, and then convert that list to a set. A set has no repeated elements, so you will get a set of unique universities. Finally, you can iterate through the set and print the universities.
For the count you can use the len function, which returns the number of universities in the set.
This is the modified code:
fname = raw_input('Enter the file name: ')
try:
    fhan = open(fname)
except IOError:
    print 'File cannot be opened:', fname
    raise SystemExit(1)  # stop here; there is no file to read
universities = []
for i in fhan:
    if i.startswith('From'):
        x=i.find('@')
        y=i.find(' ',x)
        str1=i[x+1:y].strip()
        universities.append(str1)
# A set keeps each handle only once
universities = set(universities)
for i in universities:
    print i
print len(universities)
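The set can also be filled directly while reading, which skips the intermediate list. Below is a small sketch of that variant that runs on in-memory lines; unique_domains is a hypothetical helper name, and the sample lines are only written in the style of the mbox file, not copied from a verified source.

```python
def unique_domains(lines):
    """Return the set of mail domains found on 'From ' envelope lines."""
    domains = set()
    for line in lines:
        words = line.split()
        # Envelope lines look like: From user@host Sat Jan  5 09:14:16 2008
        if len(words) >= 2 and words[0] == 'From' and '@' in words[1]:
            domains.add(words[1].split('@', 1)[1])
    return domains


sample = [
    "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008",
    "From: stephen.marquard@uct.ac.za",
    "From louis@media.berkeley.edu Fri Jan  4 18:10:48 2008",
    "From david.horwitz@uct.ac.za Fri Jan  4 07:02:32 2008",
]
found = unique_domains(sample)
for domain in sorted(found):
    print(domain)
print(len(found))
```

Matching on the first word being exactly From (rather than startswith('From')) skips the From: header lines, which repeat the same address.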
I'm taking user input into a Textarea widget, then looping by line, and trying to split the three "words" (first name, last name, email) from each line into a list, which I'll then deal with later. When I use split() on the line, though, it always splits into characters, which I assume is part of the CharField def'n of the field, meaning that it's not a string and the split() method won't behave as I want it to. Edit: even the for construct is failing - it's analyzing each character, instead of each line.
What's the workaround for that?
class UserImportForm(forms.Form):
    importtext = forms.CharField(required=True,widget=forms.Textarea(attrs={'cols': 40, 'rows': 15}))

elif "UserImport" in request.POST:
    g = UserImportForm(request.POST, prefix='usrimp')
    rawtext = g['importtext'].value()
    if g.is_valid():
        newusers = []
        for lines in rawtext:
            row = lines.split(" ")
            if len(row) == 3 and validate_email(row[2]):
                newusers.append(row)
While this is likely not the best way to do it, here's what I ended up doing. Still welcoming better answers!
elif "UserImport" in request.POST:
    g = UserImportForm(request.POST, prefix='usrimp')
    if g.is_valid():
        rawtext = g.cleaned_data['importtext'].encode('utf8')
        rawtext = "".join(rawtext)
        rawtext = rawtext.split("\n")
        newusers = []
        for lines in rawtext:
            row = lines.split()
            if len(row) == 3:
                try:
                    validate_email(row[2])
                    newusers.append([row[0],row[1],row[2],"processmore"])
                except:
                    newusers.append([row[0],row[1],row[2],"Invalid email address"])
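The character-by-character behaviour comes from iterating over a string, which yields one character at a time; str.splitlines() yields lines instead and also copes with the \r\n line endings that textareas typically submit. The sketch below shows the parsing step on its own, outside Django. parse_users and looks_like_email are hypothetical names, and looks_like_email is only a stand-in for Django's validate_email, which raises ValidationError rather than returning a boolean.

```python
def looks_like_email(value):
    # Placeholder check for this sketch only; in the view, Django's
    # validate_email inside try/except ValidationError would go here.
    local, sep, domain = value.partition('@')
    return bool(local) and sep == '@' and '.' in domain


def parse_users(rawtext):
    """Split textarea content into [first, last, email, status] rows."""
    newusers = []
    for line in rawtext.splitlines():
        row = line.split()
        if len(row) != 3:
            continue  # skip blank or malformed lines
        if looks_like_email(row[2]):
            status = "processmore"
        else:
            status = "Invalid email address"
        newusers.append(row + [status])
    return newusers


sample = "Ada Lovelace ada@example.com\r\nAlan Turing not-an-email\r\n\r\nbad line\r\n"
for user in parse_users(sample):
    print(user)
```

In the view, this would be called as parse_users(g.cleaned_data['importtext']) after g.is_valid() succeeds.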
I am trying to manipulate the data via Python Pandas. However, I am not quite sure how to do this.
Imagine I have name data, and each name has a corresponding string length. I want to create a new name variable in which the current name is padded with "?" until it has 15 characters in total.
For example:
Mike Miller will be converted into mike#miller???? and
G I Joe will be converted into g#i#joe????????
It seems the line below: frame3["name_filled"] = frame3["name"] + filler*"???" is not right, but I am not sure how to iterate within a single line based on two other variables.
import pandas as pd
from pandas import DataFrame
import re
# Get csv file into data frame
data = pd.read_csv(r"C:\Users\KubiK\Desktop\OddNames_sampleData.csv")
frame = DataFrame(data)
frame.columns = ["name", "ethnicity"]
name = frame.name
ethnicity = frame.ethnicity
# Remove missing ethnicity data cases
index_missEthnic = frame.ethnicity.isnull()
index_missName = frame.name.isnull()
frame2 = frame.loc[~index_missEthnic, :]
frame3 = frame2.loc[~index_missName, :]
# Make all letters into lowercase
frame3.loc[:, "name"] = frame3["name"].str.lower()
frame3.loc[:, "ethnicity"] = frame3["ethnicity"].str.lower()
# Remove all non-alphabetical characters in Name
frame3.loc[:, "name"] = frame3["name"].str.replace(r'[^a-zA-Z\s\-]', '') # Retain space and hyphen
# Replace empty space as "#"
frame3.loc[:, "name"] = frame3["name"].str.replace('[\s]', '#')
# Compute the length of each name
frame3["name_length"] = frame3["name"].str.len()
nameLength = frame3.name_length
frame3["filler"] = 15 - nameLength
filler = frame3.filler
# Add "?" to fill spaces up to 15 characters
frame3["name_filled"] = frame3["name"] + filler*"???"
# Test outputs
print frame3
Use the vectorised str method pad:
In [2]:
df = pd.DataFrame({'a':['asdasd','Fred','Ginger']})
df
Out[2]:
        a
0  asdasd
1    Fred
2  Ginger
In [6]:
df.a.str.pad(side='right',width=15,fillchar='?')
Out[6]:
0 asdasd?????????
1 Fred???????????
2 Ginger?????????
Name: a, dtype: object
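To see what the whole clean-up does to a single value, here is a plain-Python sketch of the same steps (lowercase, strip non-letters, replace whitespace with "#", pad on the right with "?"). normalise_name is a hypothetical helper, and it uses str.ljust as the per-string counterpart of the vectorised pad call above.

```python
import re


def normalise_name(name, width=15, fillchar='?'):
    """Lowercase, keep letters/spaces/hyphens, join with '#', pad to width."""
    cleaned = re.sub(r'[^a-z\s\-]', '', name.lower())
    joined = re.sub(r'\s', '#', cleaned)
    # ljust leaves names longer than width unchanged
    return joined.ljust(width, fillchar)


print(normalise_name("Mike Miller"))
print(normalise_name("G I Joe"))
```

On the data frame from the question, the padding step itself would be frame3["name_filled"] = frame3["name"].str.pad(width=15, side="right", fillchar="?"), which makes the name_length and filler columns unnecessary.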
Is it possible to turn the title on line 1 into a list of items, one per word or symbol separated by a space, with a keyboard shortcut? I would like to select the title, hit a shortcut, and have the title become a list of items like below:
Tried saving the Key Binding file.
Nothing built in, but you can do it with a plugin.
import sublime
import sublime_plugin
import re


class SplitLineCommand(sublime_plugin.TextCommand):
    def run(self, edit, split_pattern=" "):
        view = self.view
        cursors = view.sel()
        if len(cursors) == 1:
            cursor = cursors[0]
            begin_offset = 0
            end_offset = 0
            if cursor.empty():
                region = view.line(cursor)
                content = view.substr(region)
                new_content = re.sub(split_pattern, "\n", content)
                view.replace(edit, region, new_content)
            else:
                region = cursor
                content = view.substr(region)
                new_content = ""
                if view.line(region).begin() != region.begin():
                    new_content = "\n"
                    begin_offset = 1
                new_content += re.sub(split_pattern, "\n", content)
                if view.line(region).end() != region.end():
                    new_content += "\n"
                    end_offset = - 1
                view.replace(edit, region, new_content)
            cursors.clear()
            cursors.add(sublime.Region(region.begin() + begin_offset, region.begin() + len(new_content) + end_offset))
            view.run_command("split_selection_into_lines")
You can then add the following in your key binding file.
[
{ "keys": ["f8"], "command": "split_line", "args": {"split_pattern": " "}}
]
Change the key to whatever you want, of course. You don't actually need the args entry if you are just splitting on a space, since split_pattern defaults to that. I only included it for completeness.
Edit:
I've updated the plugin so it now handles selections, though it does not handle multiple cursors at this point.
Edit 2
If it is not working, try opening the console and entering view.run_command("split_line"). This will run the command in whatever view you were in prior to switching to the console. This way you know if the command actually works. If it doesn't then there is a problem with the plugin. If it does, then there is a problem with the key binding.
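Because split_pattern is passed to re.sub, it is treated as a regular expression, so the pattern can be tried out in a plain Python session before wiring it into a key binding. The sketch below only mirrors the re.sub call from the plugin; split_text is a hypothetical helper and the sample title is made up.

```python
import re


def split_text(content, split_pattern=" "):
    """Replace every match of split_pattern with a newline."""
    return re.sub(split_pattern, "\n", content)


title = "Make  this title a list"  # note the double space after "Make"

# A single-space pattern turns the double space into an empty line.
print(split_text(title))

# " +" treats a run of spaces as one separator.
print(split_text(title, " +"))
```

If runs of spaces should count as one separator, the key binding's args could pass {"split_pattern": " +"} instead of a single space.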
I adapted the above code for my own use, so that it now respects whitespace. But I hard-coded tabs instead of spaces, so if you use spaces you might have to change it further. It also now assumes you have no text selected and instead have the cursor in the middle of the line to be changed to vertical spacing. I left intro/outro as arguments so you can also use it for [] or (), although maybe some more escaping is needed in that case for the regex.
Before:
fields = { 'Team1', 'Team2', 'Player1', 'Player2', 'Tab=Round', 'DateTime_UTC=DateTime', 'HasTime=TimeEntered', 'OverviewPage=Tournament', 'ShownName', 'Winner', 'Stream' },
After:
fields = {
	'Team1',
	'Team2',
	'Player1',
	'Player2',
	'Tab=Round',
	'DateTime_UTC=DateTime',
	'HasTime=TimeEntered',
	'OverviewPage=Tournament',
	'ShownName',
	'Winner',
	'Stream',
},
import sublime
import sublime_plugin
import re


class SplitLineCommand(sublime_plugin.TextCommand):
    def run(self, edit, sep=",", repl= "\n", intro="{", outro="}"):
        view = self.view
        # Match a separator plus trailing spaces, except at the end of the line
        find = re.escape(sep + ' ') + '*(?! *$| *\n)'
        intro_repl = intro + repl
        intro = intro + ' *'
        outro_repl_start = sep + repl
        outro_repl_end = outro
        outro = ',? *' + outro
        repl = sep + repl
        cursors = view.sel()
        if len(cursors) == 1:
            cursor = cursors[0]
            begin_offset = 0
            end_offset = 0
            # Only acts when nothing is selected (cursor somewhere in the line)
            if cursor.empty():
                region = view.line(cursor)
                content = view.substr(region)
                line_str = view.substr(view.line(view.sel()[0]))
                # Count leading whitespace characters (assumed to be tabs)
                tabs = len(line_str) - len(line_str.lstrip())
                intro_repl = intro_repl + '\t' * (tabs + 1)
                repl = repl + '\t' * (tabs + 1)
                outro_repl = outro_repl_start + ('\t' * tabs) + outro_repl_end
                content = re.sub(outro, outro_repl, content)
                content = re.sub(find, repl, content)
                content = re.sub(intro, intro_repl, content)
                view.replace(edit, region, content)
                cursors.clear()
                cursors.add(sublime.Region(region.begin() + begin_offset, region.begin() + len(content) + end_offset))
                view.run_command("split_selection_into_lines")
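The string transformation can be checked outside Sublime Text. The following sketch is not the plugin's regex approach; it reaches a similar result with str.partition and str.split, which sidesteps the escaping concern for [] or (). explode_fields is a hypothetical name, and it assumes one intro/outro pair per line and tab indentation.

```python
def explode_fields(line, sep=",", intro="{", outro="}", indent="\t"):
    """Put each item between intro and outro on its own indented line."""
    if intro not in line or outro not in line:
        return line  # nothing to split
    leading = line[:len(line) - len(line.lstrip())]
    head, _, rest = line.partition(intro)
    body, _, tail = rest.rpartition(outro)
    items = [item.strip() for item in body.split(sep) if item.strip()]
    out = [head + intro]
    out += [leading + indent + item + sep for item in items]
    out.append(leading + outro + tail)
    return "\n".join(out)


before = "fields = { 'Team1', 'Team2', 'Tab=Round' },"
print(explode_fields(before))
```

Inside a TextCommand, the result could replace the current line with view.replace(edit, region, explode_fields(content)), in the same way the plugin above replaces the region.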