How to parse an XML file with an encoding declaration in Python?
I have an XML file called xmltest.xml:

<?xml version="1.0" encoding="gbk"?>
<productmeta>
  <bands>1,2,3,4</bands>
  <imagename>testname.tif</imagename>
  <browsename>testname.jpg</browsename>
</productmeta>

and this Python dummy code:

import xml.etree.ElementTree as et
xmldoc = et.parse('xmltest.xml')

but it raises a ValueError:

ValueError: multi-byte encodings are not supported
I understand the error: it is raised because of the encoding declaration in the first line of the XML file. The file is actually UTF-8 encoded but carries that gbk declaration (I am not the creator of the XML files being analyzed). How can I get the parser to ignore such an encoding declaration when parsing an XML file like the one above?
One thing I tried that worked for me is to open the XML file as a file object and then call ElementTree.fromstring(), passing in the complete contents of the file.

Example:
>>> import xml.etree.ElementTree as et
>>> ef = et.parse('a.xml')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python34\lib\xml\etree\ElementTree.py", line 1187, in parse
    tree.parse(source, parser)
  File "C:\Python34\lib\xml\etree\ElementTree.py", line 598, in parse
    self._root = parser._parse_whole(source)
ValueError: multi-byte encodings are not supported
>>> with open('a.xml','r') as f:
...     ef = et.fromstring(f.read())
...
>>> ef
<Element 'productmeta' at 0x028DF180>

You can also create an XMLParser with the required encoding, which should let you parse files in that encoding regardless of the declaration. For example:
import xml.etree.ElementTree as et
xmlp = et.XMLParser(encoding="utf-8")
f = et.parse('a.xml', parser=xmlp)
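
For anyone who wants to try both approaches without an existing file, here is a minimal self-contained sketch. It assumes the file really is UTF-8 despite the gbk declaration, as described in the question. The helper names (parse_with_forced_encoding, parse_via_text) and the temporary file are made up for illustration; only XMLParser(encoding=...), parse() and fromstring() come from the standard library.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Sample from the question: the bytes are UTF-8, but the declaration says gbk.
SAMPLE = (
    '<?xml version="1.0" encoding="gbk"?>\n'
    '<productmeta>\n'
    '  <bands>1,2,3,4</bands>\n'
    '  <imagename>testname.tif</imagename>\n'
    '  <browsename>testname.jpg</browsename>\n'
    '</productmeta>\n'
).encode("utf-8")


def parse_with_forced_encoding(path, encoding="utf-8"):
    """Parse `path` with a parser whose encoding overrides the declaration."""
    parser = ET.XMLParser(encoding=encoding)
    return ET.parse(path, parser=parser).getroot()


def parse_via_text(path, encoding="utf-8"):
    """Decode the file ourselves, then hand the resulting str to fromstring()."""
    # Passing encoding explicitly avoids depending on the platform default.
    with open(path, "r", encoding=encoding) as f:
        return ET.fromstring(f.read())


# Write the sample to a temporary file (hypothetical stand-in for xmltest.xml).
fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "wb") as f:
    f.write(SAMPLE)

try:
    root_a = parse_with_forced_encoding(path)
    root_b = parse_via_text(path)
finally:
    os.remove(path)

print(root_a.tag, root_a.findtext("bands"))
print(root_b.tag, root_b.findtext("imagename"))
```

The second helper passes encoding="utf-8" to open() explicitly, whereas the session above relies on the platform default; on a system whose default is not UTF-8, that difference could matter for files with non-ASCII content.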