I have a source file with the following data:
#POST_START
#ID 2
#NAME John Doe
#STREET Big Street 1
#ZIP 1111111
#CITY LONDON
#POST_END
#POST_START
#ID 3
#NAME Jonathan Swift
#STREET Little Street 9
#ZIP 333333
#CITY PARIS
#POST_END
I need to transform this to the following structure
<persons>
<person>
<id>2</id>
<name>John Doe</name>
<address>
<street>Big Street</street>
<zip>111111</zip>
<city>LONDON</city>
</address>
</person>
<person>
<id>3</id>
<name>Jonathan Swift</name>
<address>
<street>Little Street</street>
<zip>333333</zip>
<city>PARIS</city>
</address>
</person>
</persons>
What is the best way to achieve this? I've tried and managed to get most of it right, but I can't seem to insert the address element and "push down" the street, zip and city to a child level.
Any help appreciated.
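The question doesn't say which tool is being used, so the following is only a minimal Python sketch of one possible approach, using the standard library's xml.etree.ElementTree. The helper name records_to_xml and the ADDRESS_FIELDS set are made up for illustration. It copies each value verbatim (so the house number stays in the street), and it creates the address wrapper the first time an address field is seen inside a record.

```python
import xml.etree.ElementTree as ET

# Fields that should be nested under <address>; everything else stays on <person>.
# (Hypothetical name, chosen for this sketch.)
ADDRESS_FIELDS = {"street", "zip", "city"}


def records_to_xml(lines):
    """Build a <persons> tree from '#KEY value' lines (hypothetical helper)."""
    persons = ET.Element("persons")
    person = address = None
    for raw in lines:
        line = raw.strip().lstrip("#")
        if not line:
            continue
        key, _, value = line.partition(" ")
        key = key.lower()
        if key == "post_start":
            # A new record starts: open a <person> and forget the old <address>.
            person = ET.SubElement(persons, "person")
            address = None
        elif key == "post_end":
            person = address = None
        elif person is not None:
            if key in ADDRESS_FIELDS:
                # Create the <address> wrapper lazily, on the first address field.
                if address is None:
                    address = ET.SubElement(person, "address")
                ET.SubElement(address, key).text = value
            else:
                ET.SubElement(person, key).text = value
    return persons


sample = """#POST_START
#ID 2
#NAME John Doe
#STREET Big Street 1
#ZIP 1111111
#CITY LONDON
#POST_END""".splitlines()

root = records_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```

The "push down" step is just a matter of choosing the parent: address fields are appended to the address element, all other fields to the person element.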
I am trying to extract ID values from an XML and save them to a CSV file. The XML looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<YourMembership_Response>
<Items>
<Item>
<ItemID></ItemID>
<ID>92304823A-2932</ID>
<WebsiteID>0987</WebsiteID>
<NamePrefix></NamePrefix>
<FirstName>John</FirstName>
<MiddleName></MiddleName>
<LastName>Smith</LastName>
<Suffix></Suffix>
<Nickname></Nickname>
<EmployerName>abc company</EmployerName>
<WorkTitle>manager</WorkTitle>
<Date>3/14/2013 2:12:39 PM</Date>
<Description>Removed from group by Administration.</Description>
</Item>
<Item>
<ItemID></ItemID>
<ID>92304823A-2932</ID>
<WebsiteID>0987</WebsiteID>
<NamePrefix></NamePrefix>
<FirstName>John</FirstName>
<MiddleName></MiddleName>
<LastName>Smith</LastName>
<Suffix></Suffix>
<Nickname></Nickname>
<EmployerName>abc company</EmployerName>
<WorkTitle>manager</WorkTitle>
<Date>3/14/2013 2:12:39 PM</Date>
<Description>Removed from group by Administration.</Description>
</Item>
I have been able to parse the API responses for ID with the following:
ID = tree.find('.//ID').text
print ID
which only gives me one ID, e.g. 92304823A-2932.
I want to be able to loop through the ID tag to extract all the IDs.
This is what I tried. I'm not sure what I'm doing wrong, but I don't even get an error message.
for node in tree.find('.//ID'):
ID = tree.find('.//ID').text
print ID
Secondly, I am not sure if I can write the IDs into a CSV within the same for loop.
At a high level my question is how do I loop through all the ID tags in the XML and then how do I write those IDs to a CSV?
Please let me know if my question does not make sense.
Thank you in advance.
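This code worked for me:
import csv

# Python 2 style; on Python 3 open with "w" and newline="" and drop the encode()
with open("output1.csv", "wb") as f:
    writer = csv.writer(f)
    # findall() returns every ID element, not just the first one
    for node in tree.findall('.//ID'):
        writer.writerow([node.text.encode('utf8').strip()])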
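For reference, here is a self-contained Python 3 sketch of the same idea. The sample XML is a trimmed stand-in for the API response, and the second ID and the name "Jane" are invented so that the two rows can be told apart. It writes to an in-memory buffer only to keep the example runnable anywhere.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed, partly invented sample data standing in for the real API response.
xml_text = """<YourMembership_Response>
  <Items>
    <Item><ID>92304823A-2932</ID><FirstName>John</FirstName></Item>
    <Item><ID>92304823A-2933</ID><FirstName>Jane</FirstName></Item>
  </Items>
</YourMembership_Response>"""

tree = ET.ElementTree(ET.fromstring(xml_text))

# findall() returns every match; find() returns only the first one.
ids = [node.text.strip() for node in tree.findall(".//ID") if node.text]

# Written to an in-memory buffer here; for a real file use
# open("output1.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
for value in ids:
    writer.writerow([value])  # one ID per row

print(buffer.getvalue())
```

The loop in the question iterated over the result of find(), which is a single element, so it walked that element's (nonexistent) children and never ran its body; findall() is the call that returns a list of all matches.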
I need to insert the line items from my XML into a Map or a flat XML in MuleSoft. I am planning to use XSLT, but I am getting only a single value instead of multiple line items. I'm not sure how the for-each function works for this. Any help would be appreciated.
Input
<?xml version="1.0" encoding="utf-8"?><XmlInterchange xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1" xmlns="http://www.edi.com.au/EnterpriseService/">
<InterchangeInfo>
<Date>2016-02-29T05:56:10.272+05:00</Date>
<XmlType>LightWeight</XmlType>
<Source></Source>
<Target></Target>
</InterchangeInfo>
<Payload>
<WhsDockets>
<WhsDocket>
<Identifier>
<Reference>2370519</Reference>
</Identifier>
<DocketDetail>
<WarehouseCode>ROC</WarehouseCode>
<CustomerReference>3340527</CustomerReference>
<Units>41</Units>
<Packages>0</Packages>
<Pallets>0</Pallets>
<Weight DimensionType="KG">720</Weight>
<Cubic DimensionType="M3">5.922</Cubic>
<TransportInsurance>0.0000</TransportInsurance>
<ShipperCODAmount>0.0000</ShipperCODAmount>
<CustomerOrderDetail>
<OrderType>ORD</OrderType>
<DateRequired>2015-09-02T00:00:00</DateRequired>
<Consignee AddressType="CEA">
<AddressLine1>Cnr Maroochydore and BroadmeadowRds</AddressLine1>
<CityOrSuburb>MAROOCHYDORE</CityOrSuburb>
<StateOrProvince>QLD</StateOrProvince>
<PostCode>4558</PostCode>
<CompanyName>Bunnings Maroochydore OLD Warehouse</CompanyName>
<CountryCode>AU</CountryCode>
<ContactName>The Import Manager</ContactName>
</Consignee>
</CustomerOrderDetail>
<CustomAttributes />
</DocketDetail>
<DocketLines>
<DocketLine>
<Product>E4342</Product>
<Description>R 3 5/3 6 175mm x 430mm x 1160mm</Description>
<QuantityFromClientOrder>5</QuantityFromClientOrder>
<QuantityActuallyOrdered>5</QuantityActuallyOrdered>
<ProductUQ>MST</ProductUQ>
<LineAttributes />
<LineNumber>1</LineNumber>
<Confirmation>
<Lines>
<Line>
<Quantity>25</Quantity>
<QuantityUQ>PAC</QuantityUQ>
</Line>
</Lines>
<Quantity>25</Quantity>
</Confirmation>
</DocketLine>
<DocketLine>
<Product>E2281</Product>
<Description>R 3 5 175mm x 580mm x 1160mm</Description>
<QuantityFromClientOrder>4</QuantityFromClientOrder>
<QuantityActuallyOrdered>4</QuantityActuallyOrdered>
<ProductUQ>MST</ProductUQ>
<LineAttributes />
<LineNumber>2</LineNumber>
<Confirmation>
<Lines>
<Line>
<Quantity>16</Quantity>
<QuantityUQ>PAC</QuantityUQ>
</Line>
</Lines>
<Quantity>16</Quantity>
</Confirmation>
</DocketLine>
</DocketLines>
</WhsDocket>
</WhsDockets>
</Payload></XmlInterchange>
I need to flatten the XML but output the line item details together with the reference number for each item.
Output
<?xml version="1.0" encoding="utf-8"?><Items>
<LineItem>
<Date/>
<Order>2370519</Order>
<Client>Bunnings Maroochydore OLD Warehouse</Client>
<Product>E2281</Product>
<Description>R 3 5 175mm x 580mm x 1160mm</Description>
<Quantity>4</Quantity>
<UOM>MST</UOM>
<Warehouse>ROC</Warehouse>
<Carrier>Deluxe</Carrier>
</LineItem>
</Items>
Have you looked at DataWeave to transform the current XML into the new XML?
https://docs.mulesoft.com/mule-user-guide/v/3.7/dataweave-examples#xml-basic
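Whichever tool ends up doing the transformation, the shape of the logic is the same: read the header values once per WhsDocket, then emit one LineItem per DocketLine. The Python sketch below illustrates that loop on a trimmed copy of the input. It is not DataWeave or XSLT, and the output element names simply follow the desired output in the question. Note that the input declares a default namespace, so every element name has to be looked up through a namespace mapping (the prefix "e" is arbitrary).

```python
import xml.etree.ElementTree as ET

# Prefix "e" is an arbitrary local alias for the document's default namespace.
NS = {"e": "http://www.edi.com.au/EnterpriseService/"}

# Trimmed copy of the input from the question.
xml_text = """<XmlInterchange xmlns="http://www.edi.com.au/EnterpriseService/">
  <Payload><WhsDockets><WhsDocket>
    <Identifier><Reference>2370519</Reference></Identifier>
    <DocketDetail>
      <WarehouseCode>ROC</WarehouseCode>
      <CustomerOrderDetail><Consignee>
        <CompanyName>Bunnings Maroochydore OLD Warehouse</CompanyName>
      </Consignee></CustomerOrderDetail>
    </DocketDetail>
    <DocketLines>
      <DocketLine><Product>E4342</Product>
        <QuantityActuallyOrdered>5</QuantityActuallyOrdered></DocketLine>
      <DocketLine><Product>E2281</Product>
        <QuantityActuallyOrdered>4</QuantityActuallyOrdered></DocketLine>
    </DocketLines>
  </WhsDocket></WhsDockets></Payload>
</XmlInterchange>"""

root = ET.fromstring(xml_text)
items = ET.Element("Items")

for docket in root.iterfind(".//e:WhsDocket", NS):
    # Header values are read once per docket and repeated on every line item.
    order = docket.findtext("e:Identifier/e:Reference", default="", namespaces=NS)
    client = docket.findtext(".//e:Consignee/e:CompanyName", default="", namespaces=NS)
    warehouse = docket.findtext("e:DocketDetail/e:WarehouseCode", default="", namespaces=NS)

    # One output element per DocketLine (the equivalent of a for-each).
    for line in docket.iterfind("e:DocketLines/e:DocketLine", NS):
        item = ET.SubElement(items, "LineItem")
        ET.SubElement(item, "Order").text = order
        ET.SubElement(item, "Client").text = client
        ET.SubElement(item, "Product").text = line.findtext("e:Product", namespaces=NS)
        ET.SubElement(item, "Quantity").text = line.findtext(
            "e:QuantityActuallyOrdered", namespaces=NS
        )
        ET.SubElement(item, "Warehouse").text = warehouse

print(ET.tostring(items, encoding="unicode"))
```

If an XSLT attempt returns only one value, two things worth checking against this sketch are whether the per-line fields are selected inside the loop over DocketLine, and whether the stylesheet maps the default namespace to a prefix.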
I have a requirement to find the duplicate elements in the input XML and sum their quantities into a single output record.
Input xml is:
<Input>
<A1>
<NAME>A</NAME>
<QTY>1</QTY>
</A1>
<A1>
<NAME>A</NAME>
<QTY>2</QTY>
</A1>
<A2>
<NAME>B</NAME>
<QTY>3</QTY>
</A2>
<A1>
<NAME>A</NAME>
<QTY>5</QTY>
</A1>
<A2>
<NAME>b</NAME>
<QTY>8</QTY>
</A2>
</Input>
The output should be as below:
<Input>
<A1>
<NAME>A</NAME>
<QTY>8</QTY>
</A1>
<A2>
<NAME>B</NAME>
<QTY>11</QTY>
</A2>
</Input>
If you want to sum several nodes of type number you can use the XPath sum() function. This adds all your QTY nodes:
sum(//QTY)
If you just want to add the nodes that are below A1 you can use:
sum(/Input/A1/QTY)
or
sum(//A1/QTY)
which will have the same result considering the source you provided.
You can select the first A1 with the same name using
//A1[1]
So, to obtain the result you want you could match A1[1] in a template and call sum(//A1/QTY) or sum(/Input/A1/QTY) inside it to obtain the sum. Then you repeat the process with A2.
You can achieve this with two recursive templates:
The sum expression here is evaluated for the node matched by *, which may be A1 or A2. The XPath expression compares that node's name, name(current()), with the name() of each child of Input (/Input/*), so it matches either all A1 or all A2 elements and adds up the QTY of each matching node.
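To make the grouping logic concrete, here is a small Python sketch of the same idea: keep the first element seen for each element name and add the QTY of every later element with that name to it. This is an illustration of the logic only, not the stylesheet referred to above, and it groups by element name (A1, A2) rather than by the NAME text, which is what the expected output implies.

```python
import xml.etree.ElementTree as ET

xml_text = """<Input>
  <A1><NAME>A</NAME><QTY>1</QTY></A1>
  <A1><NAME>A</NAME><QTY>2</QTY></A1>
  <A2><NAME>B</NAME><QTY>3</QTY></A2>
  <A1><NAME>A</NAME><QTY>5</QTY></A1>
  <A2><NAME>b</NAME><QTY>8</QTY></A2>
</Input>"""

source = ET.fromstring(xml_text)
result = ET.Element("Input")
seen = {}  # element name -> output element already created for that name

for child in source:
    qty = int(child.findtext("QTY"))
    if child.tag not in seen:
        # First occurrence: copy NAME and start the running total.
        out = ET.SubElement(result, child.tag)
        ET.SubElement(out, "NAME").text = child.findtext("NAME")
        ET.SubElement(out, "QTY").text = str(qty)
        seen[child.tag] = out
    else:
        # Later occurrence: add its QTY to the total on the first one.
        total = seen[child.tag].find("QTY")
        total.text = str(int(total.text) + qty)

print(ET.tostring(result, encoding="unicode"))
```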
I'm trying to write an XPath that will select certain nodes that contain a specific word.
In this case the word is, "Lockwood". The correct answer is 3. Both of these paths give me 3.
count(//*[contains(./*,'Lockwood')])
count(BusinessLetter/*[contains(../*,'Lockwood')])
But when I try to output the text of each specific node
//*[contains(./*,'Lockwood')][1]
//*[contains(./*,'Lockwood')][2]
//*[contains(./*,'Lockwood')][3]
Node 1 ends up containing all the text and nodes 2 and 3 are blank.
Can someone please tell me what's happening or what I'm doing wrong?
Thanks.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="XPathFunctions.xsl"?>
<BusinessLetter>
<Head>
<SendDate>November 29, 2005</SendDate>
<Recipient>
<Name Title="Mr.">
<FirstName>Joshua</FirstName>
<LastName>Lockwood</LastName>
</Name>
<Company>Lockwood &amp; Lockwood</Company>
<Address>
<Street>291 Broadway Ave.</Street>
<City>New York</City>
<State>NY</State>
<Zip>10007</Zip>
<Country>United States</Country>
</Address>
</Recipient>
</Head>
<Body>
<List>
<Heading>Along with this letter, I have enclosed the following items:</Heading>
<ListItem>two original, execution copies of the Webucator Master Services Agreement</ListItem>
<ListItem>two original, execution copies of the Webucator Premier Support for Developers Services Description between Lockwood &amp; Lockwood and Webucator, Inc.</ListItem>
</List>
<Para>Please sign and return all four original, execution copies to me at your earliest convenience. Upon receipt of the executed copies, we will immediately return a fully executed, original copy of both agreements to you.</Para>
<Para>Please send all four original, execution copies to my attention as follows:
<Person>
<Name>
<FirstName>Bill</FirstName>
<LastName>Smith</LastName>
</Name>
<Address>
<Company>Webucator, Inc.</Company>
<Street>4933 Jamesville Rd.</Street>
<City>Jamesville</City>
<State>NY</State>
<Zip>13078</Zip>
<Country>USA</Country>
</Address>
</Person>
</Para>
<Para>If you have any questions, feel free to call me at <Phone>800-555-1000 x123</Phone> or e-mail me at <Email>bsmith@webucator.com</Email>.</Para>
</Body>
<Foot>
<Closing>
<Name>
<FirstName>Bill</FirstName>
<LastName>Smith</LastName>
</Name>
<JobTitle>VP of Operations</JobTitle>
</Closing>
</Foot>
</BusinessLetter>
But when I try to output the text of
each specific node
//*[contains(./*,'Lockwood')][1]
//*[contains(./*,'Lockwood')][2]
//*[contains(./*,'Lockwood')][3]
Node 1 ends up containing all the text
and nodes 2 and 3 are blank
This is a FAQ.
//SomeExpression[1]
is not equivalent to
(//SomeExpression)[1]
The former selects all //SomeExpression nodes that are the first child of their parent.
The latter selects the first (in document order) of all //SomeExpression nodes in the whole document.
How does this apply to your problem?
//*[contains(./*,'Lockwood')][1]
This selects all elements whose first child element has a string value containing 'Lockwood' (in XPath 1.0, contains() converts the node-set ./* to the string value of its first node) and that are the first such child of their parent. All three matching elements are the first such child of their parents, so the result is that three elements are selected.
//*[contains(./*,'Lockwood')][2]
There is no element that has a child with string value containing the string 'Lockwood' and is the second such child of its parent. No nodes are selected.
//*[contains(./*,'Lockwood')][3]
There is no element that has a child with string value containing the string 'Lockwood' and is the third such child of its parent. No nodes are selected.
Solution:
Use:
(//*[contains(./*,'Lockwood')])[1]
(//*[contains(./*,'Lockwood')])[2]
(//*[contains(./*,'Lockwood')])[3]
Each of these selects exactly the Nth element (N = {1,2,3}) selected by //*[contains(./*,'Lockwood')], correspondingly: BusinessLetter, Recipient and Body.
Remember:
The [] operator has higher priority (precedence) than the // abbreviation.
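The difference can be reproduced in Python with xml.etree.ElementTree, whose limited XPath subset supports positional predicates but neither parentheses nor contains(). The sketch below therefore uses a simplified, made-up document and a plain element name instead of the contains() test; indexing the list returned by findall() plays the role of the parenthesised form.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up document: three LastName elements, each under its own parent.
xml_text = """<Letter>
  <Head><Name><LastName>Lockwood</LastName></Name></Head>
  <Body><Name><LastName>Smith</LastName></Name></Body>
  <Foot><Name><LastName>Jones</LastName></Name></Foot>
</Letter>"""

root = ET.fromstring(xml_text)

# Like //LastName[1]: the position is counted among each parent's LastName children,
# so every LastName that is the first one under its parent matches.
per_parent_first = root.findall(".//LastName[1]")

# Like //LastName[2]: no parent has a second LastName child, so nothing matches.
per_parent_second = root.findall(".//LastName[2]")

# Like (//LastName)[1] and (//LastName)[2]: collect all matches in document
# order first, then index the combined list.
all_matches = root.findall(".//LastName")
first_overall = all_matches[0]
second_overall = all_matches[1]

print(len(per_parent_first), len(per_parent_second))
print(first_overall.text, second_overall.text)
```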
I know that if I have an XML file like this:
<persons>
<class name="English">
<person name="Tarzan" id="050676"/>
<person name="Donald" id="070754"/>
<person name="Dolly" id="231256"/>
</class>
<class name="Math">
<person name="Winston" id="050677"/>
<person name="Donald" id="070754"/>
<person name="Fred" id="231257"/>
</class>
</persons>
I can define a key in an XSL file like this:
<xsl:key name="preg" match="person" use="@id"/>
where I'm using id as the key. However, Donald is listed twice, but is only in one place in preg.
Suppose I want him listed twice in preg. That is, I want to make the class name be part of the identifier. Basically, I want preg to have keys that are equivalent to ordered pairs: (class-name, id). How do I do that (using XSLT 1.0)?
Concatenate the keys? How about
use="concat(../@name, @id)"
This would serve to keep them separate in the index. You'd of course have to use the same key to retrieve them. To avoid any ambiguity I'd also include a delimiter that won't occur in either subkey, as in
use="concat(../@name, '|', @id)"
This is the recommended approach in Michael Kay's XSLT 2.0 reference.
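The role of the delimiter is easy to see outside XSLT. The Python sketch below builds an index over the same data with a concatenated key; composite_key is a hypothetical helper that mirrors concat(../@name, '|', @id). Without a separator, two different (class, id) pairs can collapse into the same string.

```python
import xml.etree.ElementTree as ET

xml_text = """<persons>
  <class name="English">
    <person name="Tarzan" id="050676"/>
    <person name="Donald" id="070754"/>
    <person name="Dolly" id="231256"/>
  </class>
  <class name="Math">
    <person name="Winston" id="050677"/>
    <person name="Donald" id="070754"/>
    <person name="Fred" id="231257"/>
  </class>
</persons>"""


def composite_key(class_name, person_id, sep="|"):
    """Hypothetical helper mirroring concat(../@name, '|', @id)."""
    return f"{class_name}{sep}{person_id}"


root = ET.fromstring(xml_text)
index = {}  # composite key -> list of matching <person> elements
for cls in root.findall("class"):
    for person in cls.findall("person"):
        key = composite_key(cls.get("name"), person.get("id"))
        index.setdefault(key, []).append(person)

# Donald is now reachable separately for each class.
print(sorted(k for k in index if k.endswith("070754")))

# Why the separator matters: without it, different pairs can produce the same key.
print(composite_key("ab", "c", sep="") == composite_key("a", "bc", sep=""))
print(composite_key("ab", "c") == composite_key("a", "bc"))
```

As in the XSLT version, a lookup has to build its key the same way the index did, with both parts in the same order and the same separator.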