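I need to simplify the data in an XML file so that I can read it as a single table, i.e. CSV. I found some Python 2.7 examples using ElementTree, but so far I have not been able to tailor them to work further down the tree rather than just collecting the highest-level elements. I want to repeat the higher-level element's values on each of the rows, followed by the rest.
I know I can, and should, RTFM, but sadly I need to solve this problem ASAP.
Maybe the linked XSD file will help?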
My data looks like this:

    <!-- moneymate (tm) xmlperfs application version 1.0.1.1 - copyright © 2000 moneymate limited. rights reserved. moneymate ® -->
    <!-- discrete perfs 180 periods monthly frequency -->
    <moneymate_xml_feed xmlns:xsi="http://www.w3.org/2001/xmlschema-instance" xsi:nonamespaceschemalocation="http://mmia2.moneymate.com/xml/moneymatecomplete.xsd" version="1.0" calccurrency="sek">
      <types>
        <type typecountry="se" typeid="85" typename="string" calctodate="2013-07-16">
          <companies>
            <company companyid="25000068" companyname="string"/>
            …
          <categories>
            <category categoryid="1101" categoryname="aktie -- asien">
              <funds>
                <fund fundid="6201" fundname="string" fundcurrency="gbp" fundcompanyid="25000068">
                  <performances>
                    <monthlyperfs>
                      <performancemonth perfendmonth="2006-05-31" perfmonth="-0.087670"/>
                      <performancemonth> … </performances>
                </fund>
              </funds>
            </category>
            <category categoryid="13" categoryname="räntefonder">
              <funds></funds>
            </category>
          </categories>
        </type>
      </types>
    </moneymate_xml_feed>

So I hope to see a table with the data for funds only, like this:
    fundid  fundname  fundcurrency  fundcompanyid  perfendmonth  perfmonth
    …       …         …             …              …             …
    etc.
And this should end up in a CSV file; I just did not want to break the formatting here.
Please also note that perfmonth is the key value; the code box did not wrap the line in the data example above.
I used lxml.
    import csv
    import lxml.etree

    x = u'''<!-- moneymate (tm) xmlperfs application version 1.0.1.1 - copyright 2000 moneymate limited. rights reserved. moneymate -->
    <!-- discrete perfs 180 periods monthly frequency -->
    <moneymate_xml_feed xmlns:xsi="http://www.w3.org/2001/xmlschema-instance" xsi:nonamespaceschemalocation="http://mmia2.moneymate.com/xml/moneymatecomplete.xsd" version="1.0" calccurrency="sek">
      <types>
        <type typecountry="se" typeid="85" typename="string" calctodate="2013-07-16">
          <companies>
            <company companyid="25000068" companyname="string"/>
            <categories>
              <category categoryid="1101" categoryname="aktie -- asien">
                <funds>
                  <fund fundid="6201" fundname="string" fundcurrency="gbp" fundcompanyid="25000068">
                    <performances>
                      <monthlyperfs>
                        <performancemonth perfendmonth="2006-05-31" perfmonth="-0.087670"/>
                      </monthlyperfs>
                    </performances>
                  </fund>
                </funds>
              </category>
              <category categoryid="13" categoryname="rntefonder">
                <funds></funds>
              </category>
            </categories>
          </companies>
        </type>
      </types>
    </moneymate_xml_feed>
    '''

    with open('output.csv', 'w') as f:
        writer = csv.writer(f)
        # Header row: the fund attributes, then the performance attributes.
        writer.writerow(('fundid', 'fundname', 'fundcurrency', 'fundcompanyid', 'perfendmonth', 'perfmonth'))
        root = lxml.etree.fromstring(x)
        # iter() walks the whole tree, so the nesting depth of <fund> does not matter.
        for fund in root.iter('fund'):
            # Only the first <performancemonth> below this fund is used.
            perf = fund.find('.//performancemonth')
            row = (fund.get('fundid'), fund.get('fundname'), fund.get('fundcurrency'),
                   fund.get('fundcompanyid'), perf.get('perfendmonth'), perf.get('perfmonth'))
            writer.writerow(row)

Note that the XML given in the question has a mismatched tag. You may need to fix that first.
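The question asks for the fund attributes to be repeated on every row, with one row per performancemonth. A minimal sketch of that variant follows. It is not the answer's own code: it assumes Python 3 and uses the standard library's xml.etree.ElementTree instead of lxml. The names SAMPLE, FUND_KEYS, PERF_KEYS, fund_rows and to_csv are made up for illustration, and the second performancemonth entry in the sample is invented so that the repetition is visible.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample shaped like the feed in the question.
# The second <performancemonth> is invented for illustration only.
SAMPLE = """<moneymate_xml_feed version="1.0" calccurrency="sek">
  <types>
    <type typecountry="se" typeid="85" typename="string" calctodate="2013-07-16">
      <categories>
        <category categoryid="1101" categoryname="aktie -- asien">
          <funds>
            <fund fundid="6201" fundname="string" fundcurrency="gbp" fundcompanyid="25000068">
              <performances>
                <monthlyperfs>
                  <performancemonth perfendmonth="2006-05-31" perfmonth="-0.087670"/>
                  <performancemonth perfendmonth="2006-06-30" perfmonth="0.012345"/>
                </monthlyperfs>
              </performances>
            </fund>
          </funds>
        </category>
        <category categoryid="13" categoryname="rntefonder">
          <funds></funds>
        </category>
      </categories>
    </type>
  </types>
</moneymate_xml_feed>"""

FUND_KEYS = ("fundid", "fundname", "fundcurrency", "fundcompanyid")
PERF_KEYS = ("perfendmonth", "perfmonth")


def fund_rows(xml_text):
    """Yield one tuple per performancemonth, prefixed with its fund's attributes."""
    root = ET.fromstring(xml_text)
    for fund in root.iter("fund"):
        fund_part = tuple(fund.get(key) for key in FUND_KEYS)
        # Inner loop: every month under this fund, not just the first one.
        for perf in fund.iter("performancemonth"):
            yield fund_part + tuple(perf.get(key) for key in PERF_KEYS)


def to_csv(xml_text):
    """Return CSV text: a header row followed by one row per performancemonth."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FUND_KEYS + PERF_KEYS)
    writer.writerows(fund_rows(xml_text))
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(SAMPLE))
```

Funds without any performancemonth children simply produce no rows here. To write a file instead of a string, the same writer calls should work on a handle opened with open('output.csv', 'w', newline='').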