Error using ElementTree to parse data from .config file - python-2.7

I'm trying to use ElementTree to get data from a .config file. The structure of this file looks like this, for example:
<userSettings>
    <AutotaskUpdateTicketEstimatedHours.My.MySettings>
        <setting name="Username" serializeAs="String">
            <value>AAA</value>
        </setting>
My code is this:
import os, sys
import xml.etree.ElementTree as ET
class Init():
    script_dir = os.path.dirname(__file__)
    rel_path = "app.config"
    abs_file_path = os.path.join(script_dir, rel_path)
    tree = ET.parse(abs_file_path)
    root = tree.getroot()
    sites = root.iter('userSettings')
    for site in sites:
        apps = site.findall('AutotaskUpdateTicketEstimatedHours.My.MySettings')
        for app in apps:
            print(''.join([site.get('Username'), app.get('value')]))
if __name__ == '__main__':
    handler = Init()
However, when I run this code I get:
Traceback (most recent call last):
File "/Users/AAAA/Documents/Aptana/AutotaskUpdateTicketEstimatedHours/Main.py", line 5, in <module>
class Init():
File "/Users/AAA/Documents/Aptana/AutotaskUpdateTicketEstimatedHours/Main.py", line 16, in Init
print(''.join([site.get('Username'), app.get('value')]))
TypeError: sequence item 0: expected string, NoneType found
What am I doing wrong that causes this error?
(My problem seems to be accessing the tree structure of my .config file correctly.)

You may change your code to:
print(''.join([app.get('name'), app.find('value').text]))
Here app must be an Element object for <setting>, so make sure the loop iterates over the <setting> elements. The get function returns an attribute value by name (e.g. name, serializeAs), and the find function returns a subelement (e.g. <value>).
Once you have <value>, you can read the data inside it with text.
Note that site has no attributes in your file, and 'Username' is the value of the name attribute rather than an attribute name itself, so site.get('Username') returns None.
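As a rough illustration of the get/find/text distinction described above, here is a minimal sketch that parses an inline copy of the config. The <configuration> wrapper and the closing tags are assumptions, since the question only shows a fragment of the file, and the settings dictionary is just a convenient name for this example.

```python
import xml.etree.ElementTree as ET

# Assumed complete version of the fragment from the question; the
# <configuration> root and the closing tags are guesses.
CONFIG = """<configuration>
  <userSettings>
    <AutotaskUpdateTicketEstimatedHours.My.MySettings>
      <setting name="Username" serializeAs="String">
        <value>AAA</value>
      </setting>
    </AutotaskUpdateTicketEstimatedHours.My.MySettings>
  </userSettings>
</configuration>"""

root = ET.fromstring(CONFIG)

settings = {}
# iter('setting') walks the whole tree, so the intermediate tag names
# do not need to be spelled out.
for setting in root.iter('setting'):
    value = setting.find('value')          # child element, or None
    # get() reads an attribute; text reads the content of an element.
    settings[setting.get('name')] = value.text if value is not None else None

print(settings)
```

With a real file, ET.parse(path).getroot() would take the place of ET.fromstring(CONFIG).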

Related

Bypass file as parameter with a string for lxml iterparse function using Python 2.7

I am iterating over an XML tree using the lxml.etree function iterparse().
This works ok with an input file
xml_source = "formatted_html_diff.xml"
context = ET.iterparse(xml_source, events=("start",))
event, root = context.next()
However, I would like to use a string containing the same content as the file.
I tried using
context = ET.iterparse(StringIO(result), events=("start",))
But this causes the following error:
Traceback (most recent call last):
File "c:/Users/pag/Documents/12_raw_handle/remove_from_xhtmlv02.py", line 96, in <module>
event, root = context.next()
File "src\lxml\iterparse.pxi", line 209, in lxml.etree.iterparse.__next__
TypeError: reading file objects must return bytes objects
Does anyone know how I could solve this error?
Thanks in advance.
Use BytesIO instead of StringIO. The following code works with both Python 2.7 and Python 3:
from lxml import etree
from io import BytesIO
xml = """
<root>
<a/>
<b/>
</root>"""
context = etree.iterparse(BytesIO(xml.encode("UTF-8")), events=("start",))
print(next(context))
print(next(context))
print(next(context))
Output:
('start', <Element root at 0x315dc10>)
('start', <Element a at 0x315dbc0>)
('start', <Element b at 0x315db98>)

Google Vision API 'TypeError: invalid file'

The following piece of code comes from Google's Vision API documentation; the only modification I've made is adding an argument parser at the bottom.
import argparse
import os
from google.cloud import vision
import io
def detect_text(path):
    """Detects text in the file."""
    client = vision.ImageAnnotatorClient()
    with io.open(path, 'rb') as image_file:
        content = image_file.read()
    image = vision.types.Image(content=content)
    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')
    for text in texts:
        print('\n"{}"'.format(text.description))
        vertices = (['({},{})'.format(vertex.x, vertex.y)
                     for vertex in text.bounding_poly.vertices])
        print('bounds: {}'.format(','.join(vertices)))
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str,
                help="path to input image")
args = vars(ap.parse_args())
detect_text(args)
If I run it from a terminal like below, I get this invalid file error:
PS C:\VisionTest> python visionTest.py --image C:\VisionTest\test.png
Traceback (most recent call last):
File "visionTest.py", line 31, in <module>
detect_text(args)
File "visionTest.py", line 10, in detect_text
with io.open(path, 'rb') as image_file:
TypeError: invalid file: {'image': 'C:\\VisionTest\\test.png'}
I've tried with various images and image types as well as running the code from different locations with no success.
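The traceback shows that io.open received the dictionary {'image': 'C:\\VisionTest\\test.png'} rather than a path string: vars(ap.parse_args()) returns a dict, so pass args['image'] to detect_text instead of args. If it still fails after that, check that the image is in the location you expect.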
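The "invalid file" message in the traceback prints a dictionary, which suggests the whole result of vars(ap.parse_args()) reached io.open. A minimal sketch of pulling the path string out first is below; parse_image_path is a hypothetical helper name, and the argument list is passed explicitly only so the snippet runs without a command line.

```python
import argparse


def parse_image_path(argv=None):
    """Return the --image value as a plain string (hypothetical helper)."""
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", type=str, required=True,
                    help="path to input image")
    # vars() turns the Namespace into a dict such as {'image': 'test.png'}.
    args = vars(ap.parse_args(argv))
    # Index the dict so that only the path string is handed on.
    return args["image"]


path = parse_image_path(["--image", "test.png"])
print(path)
```

In the script from the question, the equivalent change would be calling detect_text(args["image"]).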

Getting ParseError when parsing using xml.etree.ElementTree

I am trying to extract the <comment> tags (using xml.etree.ElementTree) from the XML, find the comment count numbers, and add all of the numbers. I am reading the file via a URL using the urllib package.
sample data: http://python-data.dr-chuck.net/comments_42.xml
For now, though, I am just trying to print the name and count.
import urllib
import xml.etree.ElementTree as ET
serviceurl = 'http://python-data.dr-chuck.net/comments_42.xml'
address = raw_input("Enter location: ")
url = serviceurl + urllib.urlencode({'sensor': 'false', 'address': address})
print ("Retrieving: ", url)
link = urllib.urlopen(url)
data = link.read()
print("Retrieved ", len(data), "characters")
tree = ET.fromstring(data)
tags = tree.findall('.//comment')
for tag in tags:
    Name = ''
    count = ''
    Name = tree.find('commentinfo').find('comments').find('comment').find('name').text
    count = tree.find('comments').find('comments').find('comment').find('count').number
    print Name, count
Unfortunately, I am not able to even parse the XML file into Python, because i am getting this error as follows:
Traceback (most recent call last):
File "ch13_parseXML_assignment.py", line 14, in <module>
tree = ET.fromstring(data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1300, in XML
parser.feed(text)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
self._raiseerror(v)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
raise err
xml.etree.ElementTree.ParseError: syntax error: line 1, column 49
I have read that in a similar situation the parser may not be accepting the XML file. Anticipating this, I put a try/except around tree = ET.fromstring(data) and was able to get past this line, but later it throws an error saying the tree variable is not defined. This defeats the purpose of the output I am expecting.
Can somebody please point me in a direction that helps me?
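One thing that may be worth checking first is what data actually contains (for example by printing its first 100 characters), because the code above appends the urlencode output directly to the .xml address, which gives a different URL from the sample one. Once the XML itself is in hand, the extraction can be sketched as below. The inline sample, including the names Alice and Bob and their counts, is invented for illustration, and its element layout is only assumed from the tags used in the question.

```python
import xml.etree.ElementTree as ET

# Invented sample; the real feed is assumed to have a similar layout.
SAMPLE = """<commentinfo>
  <comments>
    <comment><name>Alice</name><count>97</count></comment>
    <comment><name>Bob</name><count>3</count></comment>
  </comments>
</commentinfo>"""

tree = ET.fromstring(SAMPLE)

total = 0
# './/comment' matches every <comment> element below the root.
for comment in tree.findall('.//comment'):
    # Look up children relative to the current <comment>, not the root.
    name = comment.find('name').text
    count = int(comment.find('count').text)   # element content is in .text
    print("%s %d" % (name, count))
    total += count

print("Sum: %d" % total)
```

Searching relative to each comment element is what lets the loop print a different name and count on every pass.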

How to solve AttributeError in python active_directory?

Running the script below works for 60% of the entries from the MasterGroupList, but then it suddenly fails with the error below. Although my questions seem to be poor, you guys have been able to help me before. Any idea how I can avoid this error, or what is throwing off the script? The MasterGroupList looks like:
Groups Pulled from AD
SET00 POWERUSER
SET00 USERS
SEF00 CREATORS
SEF00 USERS
...another 300 entries...
Error:
Traceback (most recent call last):
File "C:\Users\ks185278\OneDrive - NCR Corporation\Active Directory Access Script\test.py", line 44, in <module>
print group.member
File "C:\Python27\lib\site-packages\active_directory.py", line 805, in __getattr__
raise AttributeError
AttributeError
Code:
from active_directory import *
import os
file = open("C:\Users\NAME\Active Directory Access Script\MasterGroupList.txt", "r")
fileAsList = file.readlines()
indexOfTitle = fileAsList.index("Groups Pulled from AD\n")
i = indexOfTitle + 1
while i <= len(fileAsList):
    fileLocation = 'C:\\AD Access\\%s\\%s.txt' % (fileAsList[i][:5], fileAsList[i][:fileAsList[i].find("\n")])
    #Creates the dir if it does not exist already
    if not os.path.isdir(os.path.dirname(fileLocation)):
        os.makedirs(os.path.dirname(fileLocation))
    fileGroup = open(fileLocation, "w+")
    #writes group members to the open file
    group = find_group(fileAsList[i][:fileAsList[i].find("\n")])
    print group.member
    for group_member in group.member: #this is line 44
        fileGroup.write(group_member.cn + "\n")
    fileGroup.close()
    i+=1
Disclaimer: I don't know python, but I know Active Directory fairly well.
If it's failing on this:
for group_member in group.member:
It could possibly mean that the group has no members.
Depending on how Python handles this, it could also mean that the group has only one member and group.member is a plain string rather than an array.
What does print group.member show?
The source code of active_directory.py is here: https://github.com/tjguk/active_directory/blob/master/active_directory.py
These are the relevant lines:
if name not in self._delegate_map:
    try:
        attr = getattr(self.com_object, name)
    except AttributeError:
        try:
            attr = self.com_object.Get(name)
        except:
            raise AttributeError
So it looks like it just can't find the attribute you're looking up, which in this case looks like the 'member' attribute.
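If the missing 'member' attribute simply means an empty group, as the first answer guesses, one way to keep the loop going is to treat the AttributeError as "no members". The sketch below makes that assumption explicit. FakeGroup is a hypothetical stand-in so the snippet runs without Active Directory, and members_of is an illustrative helper, not part of the active_directory module.

```python
class FakeGroup(object):
    """Hypothetical stand-in for a group object from active_directory."""

    def __init__(self, **attrs):
        self._attrs = attrs

    def __getattr__(self, name):
        # Imitates the quoted library code: unknown attribute -> AttributeError.
        try:
            return self.__dict__["_attrs"][name]
        except KeyError:
            raise AttributeError


def members_of(group):
    """Return the group's members as a list, empty if 'member' is missing."""
    # getattr with a default swallows the AttributeError raised by the object.
    members = getattr(group, "member", None)
    if members is None:
        return []
    # A single member may come back as one string rather than a sequence.
    if isinstance(members, (str, type(u""))):
        return [members]
    return list(members)


print(members_of(FakeGroup()))
print(members_of(FakeGroup(member="CN=only")))
print(members_of(FakeGroup(member=("CN=a", "CN=b"))))
```

In the original loop, iterating over members_of(group) instead of group.member would skip empty groups rather than stopping the script.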

lxml error while migrating a tool from Windows to Linux

I have developed a tool in Python 2.7 that takes an XSD file as input
and writes the processed data to a test file.
While processing the XSD file I use lxml, and I am unable to resolve this error:
AttributeError: 'Element' object has no attribute 'iterdescendants'
I don't know what is wrong with the lxml library.
Is there a Linux-compatible version of lxml for Python 2.7?
I import it in the file like this:
try:
from lxml import etree
except ImportError:
import xml.etree.ElementTree as etree
I import it in only one file and pass the element tree pointer to another file for processing;
it works in the file where it is declared and raises the error only in the other file.
The code that throws the error is:
for tdocNode in lincFileRootNode:
    rootNode = tdocNode.getroot()
    lchildren = rootNode.getchildren()
    for elt in lchildren:
        if 'complex' == elt.tag:
            if 'name' in elt.attrib:
                if 'element' == item.tag:
                    if 'type' in item.attrib:
                        if elt.attrib['name'] == item.attrib['type']:
                            for key in elt.iterdescendants(tag='element'):
                                bIsElemTypeSimple = false
                                bIsElemTypeSimple = process_elementtype(key, lincFileRootNode)
where lincFileRootNode is a list that contains the XSD file pointers to be processed.
The error thrown is:
Traceback (most recent call last):
File "run.py", line 1210, in <module>
iret = xsd2dic_main()
File "run.py", line 71, in xsd2dic_main
iRet = yxsdtodic()
File "run.py", line 352, in yxsdtodic
iret = process_xsdfile(sXsdPath)
File "run.py", line 485, in xsdfile
sRet = process_dic_elementtype(item,lincFileRootNode)
File "run.py", line 817, in process_dic_elementtype
for key in elt.iterdescendants(tag='element'):
AttributeError: 'Element' object has no attribute 'iterdescendants'
I tried both cases:
1: writing all the code in the same file
2: writing the code in different files
I still get the same error.
This is mostly a guess, but look into it.
You appear to be calling iterdescendants from lxml's implementation of the Element type. However, if lxml fails to import, you fall back on Python's built-in xml library instead, and its implementation of Element doesn't have an iterdescendants method of any kind. In other words, the two implementations have different public APIs. Add some print statements to see which library you're importing, and do some additional checking to see exactly what type elt is. If you want to be able to fall back on Python's built-in xml, you'll need to structure your code to accommodate the different APIs.
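A minimal sketch of that idea follows, assuming Element.iter (available in both lxml and the standard library from Python 2.7 on) is an acceptable substitute. USING_LXML and iter_descendants are illustrative names, and the small XML string is made up for the example.

```python
try:
    from lxml import etree
    USING_LXML = True
except ImportError:
    import xml.etree.ElementTree as etree
    USING_LXML = False

# Shows which implementation was actually imported on this machine.
print("using lxml: %s (%s)" % (USING_LXML, etree.__name__))


def iter_descendants(elt, tag):
    """Yield descendants of elt with the given tag, in document order."""
    # Element.iter() also yields elt itself when it matches, unlike
    # lxml's iterdescendants(), so the element is skipped explicitly.
    for node in elt.iter(tag):
        if node is not elt:
            yield node


root = etree.fromstring(
    "<complex name='t'><element/><group><element/></group></complex>")
found = [node.tag for node in iter_descendants(root, "element")]
print(found)
```

If the first line prints False on the Linux machine, lxml is not installed for that interpreter, which would explain why the fallback Element type is in use there.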