Hello,
I noticed a weird behavior in the latest release, 0.11.0, that is not present in 0.10.2. It shows up when parsing a simple XML file with the process_namespaces option.
XML File
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<MyXML xmlns="http://www.xml.org/schemas/Test">
    <Tag1>Text1</Tag1>
    <Tag2 attr2="en">Text2</Tag2>
    <Tag3>Text3</Tag3>
    <Tag4 attr4="en">Text4</Tag4>
</MyXML>
Parser
import json

import xmltodict


def parse_xml(filename, force_list=None):
    with open(filename) as file:
        # Collapse this namespace: mapping it to None drops the prefix
        namespaces = {
            "http://www.xml.org/schemas/Test": None,
        }
        res_dict = xmltodict.parse(
            file.read(),
            process_namespaces=True,
            namespaces=namespaces,
            force_list=force_list,
        )
    # Round-trip through JSON to turn the OrderedDict into plain dicts
    res_dict = json.loads(json.dumps(res_dict))
    return res_dict


res_dict = parse_xml("xml_path")
print(res_dict)
With release 0.10.2 you get the following result, as expected:
{u'MyXML': {u'Tag4': {u'@attr4': u'en', u'#text': u'Text4'}, u'Tag1': u'Text1', u'Tag2': {u'#text': u'Text2', u'@attr2': u'en'}, u'Tag3': u'Text3'}}
With 0.11.0 you get this instead:
{u'MyXML': {u'Tag4': {u'@attr4': u'en', u'#text': u'Text4'}, u'Tag1': u'Text1', u'Tag2': {u'@xmlns': {u'': u'http://www.xml.org/schemas/Test'}, u'#text': u'Text2', u'@attr2': u'en'}, u'Tag3': u'Text3'}}
An @xmlns attribute is added to the first tag in the file that has an attribute of its own, in this case Tag2. If we remove the attribute from Tag2 in the file, the @xmlns shows up on Tag4 instead.
Does that make sense?
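In the meantime, one possible stopgap is to drop any @xmlns keys from the parsed result after the fact. This is only a sketch: strip_xmlns is a hypothetical helper name, not part of xmltodict, and it assumes you never want the @xmlns entries in the output. The input below is the 0.11.0 result from above, written as a plain dict.

```python
def strip_xmlns(node):
    """Return a copy of a parsed structure with every '@xmlns' key removed.

    Hypothetical post-processing helper, not an xmltodict API.
    """
    if isinstance(node, dict):
        return {
            key: strip_xmlns(value)
            for key, value in node.items()
            if key != "@xmlns"
        }
    if isinstance(node, list):
        # force_list can produce lists of elements, so recurse into them too
        return [strip_xmlns(item) for item in node]
    return node


# Result reported for 0.11.0 (copied from the output above)
parsed_0_11 = {
    "MyXML": {
        "Tag4": {"@attr4": "en", "#text": "Text4"},
        "Tag1": "Text1",
        "Tag2": {
            "@xmlns": {"": "http://www.xml.org/schemas/Test"},
            "#text": "Text2",
            "@attr2": "en",
        },
        "Tag3": "Text3",
    }
}

cleaned = strip_xmlns(parsed_0_11)
print(cleaned)
```

After this, cleaned has the same content as the 0.10.2 result. It only hides the symptom, so the change in the parser itself would still need a look.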