Scraping the web for information has always been a difficult task. Web browsers build a DOM from static HTML, and that HTML is not always complete or correct. Luckily, browsers do an incredible job of rendering a page from poorly written or even broken HTML. There are several libraries for various platforms that attempt to make it easy to extract information from static HTML. Unfortunately, these solutions are neither robust nor easy to use for the average web developer. To further complicate things, the web is evolving into a dynamic medium. Instead of learning a new model for extracting information from a web page, why not leverage our understanding of jQuery to reliably get information from a live, fully rendered DOM in an automated fashion?
- ISBN10 1449321860
- ISBN13 9781449321864
- Publish Date 8 March 2012
- Publish Status Cancelled
- Publish Country US
- Imprint O'Reilly Media, Inc, USA
- Format Paperback
- Pages 80
- Language English