Read XML block in Python

I have an XML file like the one below, which contains multiple XML documents. I want to fetch the <Sacd> content.

<?xml version="1.0" encoding="utf-8"?>
<Sacd>
    <Acdpktg> <Acdpktg/>
</Sacd>
<?xml version="1.0" encoding="utf-8"?>
<Sacd>
    <Acdpktg/>
</Sacd>
<?xml version="1.0" encoding="utf-8"?>
<Sacd>
    <AcdpktG>
        <Result Value="0"/>
        <Packet Value="Dnd"/>
        <Invoke Value="abc"/>
    </AcdpktG>
</Sacd>

How do I extract the value inside the <Sacd> tag?

1 Answer(s)

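Well, your XML is problematic in two respects. First, the file contains multiple XML documents. A well-formed XML document has a single root element, and the XML declaration may appear only at the very start, so the file has to be split into separate documents before parsing. Second, the first <Acdpktg> <Acdpktg/> tag pair is invalid because the opening tag is never closed; it should be <Acdpktg> </Acdpktg>.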
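To see which of the embedded documents is affected, a quick check along the following lines may help. This is only a sketch: it uses the standard library's xml.etree.ElementTree rather than lxml, the helper names split_documents and check_documents are made up for illustration, and it assumes the text '<?xml' never appears inside the data itself.

```python
import xml.etree.ElementTree as ET

# The XML from the question, unfixed.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<Sacd>
    <Acdpktg> <Acdpktg/>
</Sacd>
<?xml version="1.0" encoding="utf-8"?>
<Sacd>
    <Acdpktg/>
</Sacd>
<?xml version="1.0" encoding="utf-8"?>
<Sacd>
    <AcdpktG>
        <Result Value="0"/>
        <Packet Value="Dnd"/>
        <Invoke Value="abc"/>
    </AcdpktG>
</Sacd>
"""


def split_documents(text):
    """Split concatenated XML text into one string per '<?xml' declaration."""
    # Anything before the first declaration is dropped; each remaining
    # chunk gets its declaration prefix back.
    return ['<?xml' + chunk.rstrip() for chunk in text.split('<?xml')[1:]]


def check_documents(text):
    """Return a list of (index, error) pairs; error is None if parsing worked."""
    report = []
    for index, part in enumerate(split_documents(text)):
        try:
            ET.fromstring(part.encode('utf-8'))
            report.append((index, None))
        except ET.ParseError as exc:
            report.append((index, str(exc)))
    return report


for index, error in check_documents(SAMPLE):
    print(index, 'ok' if error is None else error)
```

With the XML from the question, only the first document should report a parse error; the other two parse cleanly.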

But once it’s all fixed, you can get your expected output. So:

from lxml import etree

big = """[your xml above, fixed]"""

# Split the text into separate XML documents, one per declaration.
smalls = big.replace('<?xml', 'xxx<?xml').split('xxx')[1:]

for small in smalls:
    # lxml rejects str input that carries an encoding declaration,
    # so either pass bytes or remove the declaration from each document.
    xml = small.encode('utf-8')
    doc = etree.XML(xml)
    for value in doc.xpath('.//AcdpktG//*/@Value'):
        print(value)

Output:

0
Dnd
abc

Or, a bit fancier output can be obtained by changing the inner for loop a bit:

    for value in doc.xpath('.//AcdpktG//*'):
        print(value.tag, value.xpath('./@Value')[0])

Output:

Result 0
Packet Dnd
Invoke abc
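The other route mentioned in the code comment, removing the XML declarations, can be sketched as well. The version below is an illustration rather than a definitive implementation: it swaps lxml for the standard library's xml.etree.ElementTree, wraps everything in a made-up <root> element so the text parses as a single document, and assumes the first <Acdpktg> pair has already been fixed. The function name sacd_values is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The XML from the question with the first <Acdpktg> pair fixed.
FIXED = """<?xml version="1.0" encoding="utf-8"?>
<Sacd>
    <Acdpktg> </Acdpktg>
</Sacd>
<?xml version="1.0" encoding="utf-8"?>
<Sacd>
    <Acdpktg/>
</Sacd>
<?xml version="1.0" encoding="utf-8"?>
<Sacd>
    <AcdpktG>
        <Result Value="0"/>
        <Packet Value="Dnd"/>
        <Invoke Value="abc"/>
    </AcdpktG>
</Sacd>
"""


def sacd_values(text):
    """Return one {tag: Value} dict per <Sacd> block in the text."""
    # Drop every XML declaration, then wrap what is left in a single
    # synthetic root element (the name 'root' is arbitrary).
    body = re.sub(r'<\?xml.*?\?>', '', text)
    root = ET.fromstring('<root>' + body + '</root>')

    results = []
    for sacd in root.findall('Sacd'):
        # Collect every descendant that carries a Value attribute.
        values = {
            el.tag: el.get('Value')
            for el in sacd.iter()
            if el.get('Value') is not None
        }
        results.append(values)
    return results


for values in sacd_values(FIXED):
    print(values)
```

For the fixed sample this gives two empty dicts followed by {'Result': '0', 'Packet': 'Dnd', 'Invoke': 'abc'}, which keeps the values grouped by the <Sacd> block they came from.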
Answered on July 17, 2020.