XSLT: Transform XML containing text to XML only

I have the following example XML file containing text elements:

<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
  <Body>
    <Zoo>
      <TotalAmountOfAnimals>18</TotalAmountOfAnimals>
      <Animals>&lt;Animal xmlns="zoo"&gt;
        &lt;HEADER&gt;
        &lt;Amount&gt;7&lt;/Amount&gt;
        &lt;/HEADER&gt;
        &lt;/Animal&gt;
      </Animals>
    </Zoo>
  </Body>
</Envelope>

I now want to find a generic way to transform the text part of the XML into proper XML without losing the surrounding structure. My current approach is the following:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" media-type="text/xml"></xsl:output>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"></xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Animals">
    <xsl:element name="Animals">
      <xsl:value-of select="@*|node()" disable-output-escaping="yes"></xsl:value-of>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>

The outcome fits, but it’s not generic as I have to give the specific name of the element (here: Animals) containing the text.

edit: The output for this case would then be:

<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
    <Body>
        <Zoo>
            <TotalAmountOfAnimals>18</TotalAmountOfAnimals>
            <Animals>
                <Animal xmlns="zoo">
                    <HEADER>
                        <Amount>7</Amount>
                    </HEADER>
                </Animal>
            </Animals>
        </Zoo>
    </Body>
</Envelope>
1 Answer(s)

To avoid having to specify the name in a template match pattern, you could simply match on any text node:

<xsl:template match="text()">
  <xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>

This, however, risks transforming e.g. <name>John &amp; Jane</name> into <name>John & Jane</name>, i.e. previously well-formed input ends up as output that is no longer well-formed. Therefore I wouldn’t recommend using disable-output-escaping on arbitrary text nodes.

It is hard to find a generic strategy for dealing with content that contains escaped markup. In XSLT 3 you might have some more power using parse-xml and xsl:try/catch:

<xsl:template match="text()">
  <xsl:try>
    <xsl:sequence select="parse-xml(.)"/>
    <xsl:catch>
      <xsl:value-of select="."/>
    </xsl:catch>
  </xsl:try>
</xsl:template>

but even then I am not sure at which stages you want to use it and at which you simply want to copy the input through to the output, or whether it would be too expensive to try parsing every text node as XML.
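
One way to limit both the cost and the risk might be to attempt the parse only for text nodes that look like escaped markup. The following is a minimal sketch, assuming an XSLT 3.0 processor that supports xsl:try. The "starts with < after optional whitespace" test is only a heuristic of mine, not something the question specifies, and I have swapped parse-xml for parse-xml-fragment so that text holding several top-level elements can still be parsed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

  <!-- XSLT 3.0 identity transform: copy every node that no template below matches -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Heuristic (an assumption): only text nodes whose first non-whitespace
       character is "<" are treated as candidates for escaped markup -->
  <xsl:template match="text()[matches(., '^\s*&lt;')]">
    <xsl:try>
      <!-- Parse the string value; the resulting nodes are copied to the output -->
      <xsl:sequence select="parse-xml-fragment(.)"/>
      <xsl:catch>
        <!-- Not well-formed after all: emit the text unchanged, normally escaped -->
        <xsl:value-of select="."/>
      </xsl:catch>
    </xsl:try>
  </xsl:template>

</xsl:stylesheet>
```

With this, text such as 18 or John &amp; Jane never reaches the parser and is copied through with normal escaping, so the output stays well-formed. Text that merely starts with < but is not well-formed falls into the xsl:catch branch and is also output as plain text. Whether the heuristic fits depends on your data.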

Answered on July 16, 2020.