XML scrapping using nodeJs

Question

Home

XML scrapping using nodeJs

0

I have a very huge xml file that I got by exporting all the data from tally, I am trying to use web scraping to get elements out of my code using cheerio, but I am having trouble with the formatting or something similar. Reading it with fs.readFileSync() works fine and the console.log shows complete xml file but when I write the file using the fs.writeFileSync it makes it look like this:

And my web scrapping code outputs empty file:

const cheerio = require('cheerio'); const fs = require ('fs');    var xml = fs.readFileSync('Master.xml','utf8');               const htmlC = cheerio.load(xml);                      var list = [];              list = htmlC('ENVELOPE').find('BODY>TALLYMESSAGE>STOCKITEM>LANGUAGENAME.LIST>NAME.LIST>NAME').each(function (index, element) {                 list.push(htmlC(element).attr('data-prefix'));              })              console.log(list)              fs.writeFileSync("data.html",list,()=>{})

Judsontimpearlie Asked on July 17, 2020 in XML.

Share
Comment(0)

Add Comment

1 Answer(s)

Votes
Oldest

0

You might try checking to make sure that Cheerio isn’t decoding all the HTML entities. Change:

const htmlC = cheerio.load(xml);

to:

const htmlC = cheerio.load(xml, { decodeEntities: false });

Bennypeterjeanne Answered on July 17, 2020.

Share
Comment(0)

Add Comment

Your Answer

Answer 1

BuddyPress is a plugin for WordPress that enables you to create a social network or community website. It has all the...

Answer 2

I value you getting some margin to help me with this task. Without you, no part of this would have...

Answer 3

Try to define a Cohesive class, until and unless the methods are written relevant to the class and it defines...

Answer 4

Try to add exportAllData: true, as an other option, hope it helps :)

Answer 5

DataSet can read an XML, infer schema and create a tabular representation that's easy to manipulate: DataSet ip1 = new...

Answer 6

I created a class and used Xml Linq : using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Xml; using...

Answer 7

XDocument first = XDocument.Load(args[0]); XDocument second = XDocument.Load(args[1]); var result = new XElement( "ipaddresses", first.Root.Elements("ip") .Zip(second.Root.Elements("ip"), (f, s) => {...

Answer 8

Following your code for the header row, you could achieve this by an <xsl:apply-templates select="/report/order_actions/order_action[order_id = current()/order_id]" /> As well...

Answer 9

BuddyPress is a plugin for WordPress that enables you to create a social network or community website. It has all the...

Answer 10

I value you getting some margin to help me with this task. Without you, no part of this would have...

Answer 11

Try to define a Cohesive class, until and unless the methods are written relevant to the class and it defines...

Answer 12

Try to add exportAllData: true, as an other option, hope it helps :)

Answer 13

DataSet can read an XML, infer schema and create a tabular representation that's easy to manipulate: DataSet ip1 = new...

Answer 14

I created a class and used Xml Linq : using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Xml; using...

Answer 15

XDocument first = XDocument.Load(args[0]); XDocument second = XDocument.Load(args[1]); var result = new XElement( "ipaddresses", first.Root.Elements("ip") .Zip(second.Root.Elements("ip"), (f, s) => {...

Answer 16

Following your code for the header row, you could achieve this by an <xsl:apply-templates select="/report/order_actions/order_action[order_id = current()/order_id]" /> As well...

LATEST ANSWERS

XML scrapping using nodeJs

Your Answer

TOP USERS

HOT QUESTIONS

LATEST ANSWERS

XML scrapping using nodeJs

Your Answer

Tags Widget

TOP USERS

HOT QUESTIONS