Reading, updating and writing an XML document with PowerShell
I want to read, update and then write back to disk a Web.config file, which is an XML file. The way I’m doing it right now is:
    [xml]$config = Get-Content -Path $ConfigPath
    # Update...
    $config.Save($ConfigPath)
The problem is that this slightly messes up the original formatting of the config. I have some nodes, for example:
<add key="DEBUG_API_ABC" value=' { "A": { "B": "asd", "C": "qwe" } } '/>
And it makes it like:
<add key="DEBUG_API_ABC" value=" { &quot;..."/>
I would like to save it exactly as it is found, keeping the format, the text spacing, only injecting some values when updating. Is it possible?
Thanks 🙂
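The correct way to load an XML document (not only in PowerShell) is to let the XML parser load it and to avoid Get-Content, because Get-Content will happily mangle the file encoding if it gets a chance: it decodes the text with its own default encoding and ignores the encoding named in the XML declaration.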
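As a rough illustration of loading through the parser, here is a minimal sketch that also sets PreserveWhitespace, which keeps the whitespace between elements (indentation, blank lines) when the document is saved. The sample file and its contents are made up for the demo. This does not cover spacing inside a tag or the quote style of attributes.

```powershell
# Hypothetical sample file; in practice $ConfigPath would point at the real Web.config.
$ConfigPath = Join-Path ([System.IO.Path]::GetTempPath()) 'preserve-whitespace-demo.config'
$original = @'
<configuration>
  <appSettings>

    <add key="Mode" value="old" />
  </appSettings>
</configuration>
'@
[System.IO.File]::WriteAllText($ConfigPath, $original)

$config = New-Object System.Xml.XmlDocument
# Must be set before Load so that whitespace nodes between elements are kept;
# while it is still $true at Save time, the output is not re-indented.
$config.PreserveWhitespace = $true
$config.Load($ConfigPath)

# Update a single attribute and write the document back.
$node = $config.SelectSingleNode("//add[@key='Mode']")
$node.SetAttribute('value', 'new')
$config.Save($ConfigPath)

# Read the result back for inspection, then clean up the demo file.
$saved = [System.IO.File]::ReadAllText($ConfigPath)
Remove-Item -Path $ConfigPath
$saved
```

With PreserveWhitespace left at its default of $false, the blank line after <appSettings> would be dropped and the output re-indented on save.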
You seem to have an XML file with JSON data in an attribute, which is odd, but let’s work with what you have:
    $config = New-Object xml
    $config.Load($ConfigPath)

    # Find the <add> element and parse the JSON stored in its value attribute
    $debugApi = $config.SelectSingleNode("//add[@key='DEBUG_API_ABC']")
    $configData = $debugApi.GetAttribute("value") | ConvertFrom-Json

    # Change the data, serialize it back to JSON and store it in the attribute
    $configData.A.B = "new value"
    $configJson = $configData | ConvertTo-Json
    $debugApi.SetAttribute("value", $configJson)

    $config.Save($ConfigPath)
ConvertTo-Json will pretty-print its output by default, so while it will probably not maintain the "original" whitespace layout in the JSON, it will still result in a recognizable structure in the XML.
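If the pretty-printed layout is not wanted, a single-line form is an option. The sketch below uses the -Compress and -Depth parameters of ConvertTo-Json; the JSON string is the sample from the question, and the depth of 10 is an arbitrary choice for the demo. As far as I know, line breaks inside an attribute value are written as character references when the document is saved, so the compact form may be easier to read in the file.

```powershell
# Sample JSON taken from the question's attribute value.
$json = ' { "A": { "B": "asd", "C": "qwe" } } '

$data = $json | ConvertFrom-Json
$data.A.B = 'new value'

# Default behaviour: multi-line, indented output.
# -Depth is raised because the default depth is small and deeper
# structures would otherwise be cut off (10 is an arbitrary demo value).
$pretty = $data | ConvertTo-Json -Depth 10

# -Compress drops the indentation and line breaks.
$compact = $data | ConvertTo-Json -Depth 10 -Compress

$pretty
$compact
```

Either string can be passed to SetAttribute; the data is the same, only the layout differs.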
Regarding the question "Can I keep the " instead of &quot; in the attribute value?"
No, you can’t. Here’s why:
When it comes to serializing "special" characters (of which XML does not have many, but " and ' are two), XML DOM APIs are opinionated. For example, value='something " something' is valid XML and will cause the @value attribute of the node in question to get the string value something " something in RAM after parsing.
However, when that string is serialized again, value="something &quot; something" is 100% exactly the same thing. But in order to reproduce the original layout, a parser would need to remember that in the original file, that particular attribute had single quote delimiters.
This is a lot of extra work, it would slow down parsing, it would take more memory, and doing it would not make the end result more correct. So parsers generally don’t, and they use a default that’s just as correct but much easier to produce.
For example, the DOM API’s opinion on serialization could be "all attributes use double quotes as delimiters, therefore all double quotes in attribute values will be escaped", and that’s completely fine because it will retain data integrity and that’s all that matters.
It will also "normalize" all single-quoted attributes into double-quoted ones, making JSON in attributes harder to read. But part of that problem is that maybe storing JSON in XML is not the best choice to begin with, at least as long as you’re relying on human editors.
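The quote normalization described above can be seen in a small round trip. This is only a sketch with a made-up one-element document: it parses a single-quoted attribute, serializes it again, and checks that the value read back is unchanged.

```powershell
# A made-up element whose attribute uses single-quote delimiters.
$xmlText = @'
<add key="k" value='something " something' />
'@

$doc = New-Object System.Xml.XmlDocument
$doc.LoadXml($xmlText)

# The in-memory value contains a plain double quote.
$value = $doc.DocumentElement.GetAttribute('value')

# On serialization the attribute is written with double-quote delimiters,
# so the embedded double quote has to be escaped as &quot;.
$serialized = $doc.OuterXml

# Parsing the serialized form again yields the same string value.
$roundTrip = New-Object System.Xml.XmlDocument
$roundTrip.LoadXml($serialized)
$sameValue = $roundTrip.DocumentElement.GetAttribute('value') -eq $value

$value
$serialized
$sameValue
```

The text on disk changes, but the data any XML-aware consumer reads from the attribute does not.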