
Nokogiri to Hash

nokogiri xml to hash

2 replies to this topic

#1 bardia_1

bardia_1

    Passenger

  • Members
  • 5 posts

Posted 09 August 2014 - 07:28 PM

I'm parsing data that I scrape from the web with Mechanize and Nokogiri. Nokogiri, together with XPath, does a good job of getting the data I want.

 

But here is my problem:

 

Once I have the data back, the result is a 'Nokogiri::XML::NodeSet', and I find that hard to iterate through when the XML is nested.

 

I need to convert this to a hash to make it easier to iterate through. I've looked at the following libraries:

 

'roxml'

'libxml_to_hash'

 

 

Neither of them worked out of the box.

 

Now I'm looking at 'RubyXL::Hash'.

 

 

In the meantime, any ideas or first-hand experience with this problem would be appreciated.

 

 

Why hasn't Nokogiri built a .to_hash method into its library?

 

 

 



#2 Ohm

Ohm

    Driver

  • Moderators
  • 398 posts
  • Location: Copenhagen

Posted 11 August 2014 - 05:35 AM

Rails has a built-in Hash.from_xml that you might be able to use:

xml = <<-XML
  <?xml version="1.0" encoding="UTF-8"?>
    <hash>
      <foo type="integer">1</foo>
      <bar type="integer">2</bar>
    </hash>
XML

hash = Hash.from_xml(xml)
# => {"hash"=>{"foo"=>1, "bar"=>2}}

Example taken from http://apidock.com/r.../from_xml/class


Blog: http://ohm.sh | Twitter: @madsohm

#3 bardia_1

bardia_1

    Passenger

  • Members
  • 5 posts

Posted 12 August 2014 - 03:47 AM

Doing this:

 

something.class

=> Nokogiri::XML::NodeSet

 

something.to_xml # this works

 

Hash.from_xml something.to_xml # errors out

 

output:

attempted adding second root element to document

Line: 55

Position: 4239

Last 80 unconsumed characters

 

 

I'd share a snippet of the output, but pretty-printing doesn't really change it:

 

pp something.to_xml

 

 

BTW: I did get a solution working. I chose to stay within Nokogiri and did a relative lookup from another XPath, which got me the data I was looking for. Now that I'm more comfortable with XPath and Nokogiri, I can tackle this faster next time, but it would still be nice to know a convenient way to convert the result into a hash data structure.





