javascript - How to automate selecting certain codes in an html? -


Hi, I have a question about automating the selection of content in HTML. If we save a webpage as HTML only, we get the HTML code along with stylesheets and JavaScript code. However, I only want to extract the HTML between <div class='post-content' itemprop='articlebody'> and </div>, and then create a new HTML file that contains only the extracted HTML. Is there a way to do that? Example code is below:

<html>
  <script src='.....'> </script>
  <style> ... </style>
  <div class='header-outer'>
    <div class='header-title'>
      <div class='post-content' itemprop='articlebody'>
        <p>content I want</p>
      </div>
    </div>
  </div>
  <div class='footer'> </div>
</html>

While I'm typing this, I'm thinking of JavaScript, since it seems to be able to manipulate HTML DOM elements. Is Ruby able to do that? Can I generate a new, clean HTML file that contains only the content between <div class='post-content' itemprop='articlebody'> and </div> using JavaScript or Ruby? However, I don't have a clue how to write the actual code.

So, does anyone have an idea how to do it? Thank you very much!

I'm not quite sure what you're asking, but I'll take a crack at it.

Can Ruby modify the DOM on a webpage?

Short answer: no. Browsers don't know how to run Ruby. They do know how to run JavaScript, so that's what is used for real-time DOM manipulation.

Can I generate new, clean HTML?

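Yes. At the end of the day, HTML is just a formatted string. If you want to download the source of a page and find what's in the <div class='post-content' itemprop='articlebody'> tag, there are a couple of ways to go about that. The best is probably the Nokogiri gem, which is a Ruby HTML parser. You'll be able to feed it a string (from a file or otherwise) that represents the old page and strip out what you want. Doing that would look something like this: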

require 'nokogiri'
require 'open-uri'

# Download and parse the page
page = Nokogiri::HTML(URI.open("https://googleblog.blogspot.com"))
# Take the first element with class "post-content" and read its text
text = page.css('.post-content')[0].text

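I believe that gives you the text you're looking for. Note that .text returns only the text with the tags stripped; to keep the markup, call .to_html on the element instead. More detailed instructions can be found in the Nokogiri documentation.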

