Perl & LWP

By Sean M. Burke
Free
Book Description

Perl soared to popularity as a language for creating and managing web content, but with LWP (Library for WWW in Perl), Perl is equally adept at consuming information on the Web. LWP is a suite of modules for fetching and processing web pages. The Web is a vast data source that contains everything from stock prices to movie credits, and with LWP all that data is just a few lines of code away. Anything you do on the Web, whether it's buying or selling, reading or writing, uploading or downloading, from news to e-commerce, can be controlled with Perl and LWP. You can automate Web-based purchase orders as easily as you can set up a program to download MP3 files from a web site.

Perl & LWP covers:

  • Understanding LWP and its design
  • Fetching and analyzing URLs
  • Extracting information from HTML using regular expressions and tokens
  • Working with the structure of HTML documents using trees
  • Setting and inspecting HTTP headers and response codes
  • Managing cookies
  • Accessing information that requires authentication
  • Extracting links
  • Cooperating with proxy caches
  • Writing web spiders (also known as robots) in a safe fashion

Perl & LWP includes many step-by-step examples that show how to apply the various techniques. Programs to extract information from the web sites of BBC News, AltaVista, ABEBooks.com, and the Weather Underground, to name just a few, are explained in detail, so that you understand how and why they work.

Perl programmers who want to automate and mine the web can pick up this book and be immediately productive. Written by a contributor to LWP, and with a foreword by one of LWP's creators, Perl & LWP is the authoritative guide to this powerful and popular toolkit.



This edition is version 1.1 (2007), an online version released under a Creative Commons license on the author's website.
Table of Contents
  • Introduction to the 2007 online edition
  • Foreword
  • Copyright
  • Preface
  • Chapter 1. Introduction to Web Automation
  • Chapter 2. Web Basics
  • Chapter 3. The LWP Class Model
  • Chapter 4. URLs
  • Chapter 5. Forms
  • Chapter 6. Simple HTML Processing with Regular Expressions
  • Chapter 7. HTML Processing with Tokens
  • Chapter 8. Tokenizing Walkthrough
  • Chapter 9. HTML Processing with Trees
  • Chapter 10. Modifying HTML with Trees
  • Chapter 11. Cookies, Authentication, and Advanced Requests
  • Chapter 12. Spiders
  • Appendix A. LWP Modules
  • Appendix B. HTTP Status Codes
  • Appendix C. Common MIME Types
  • Appendix D. Language Tags
  • Appendix E. Common Content Encodings
  • Appendix F. ASCII Table
  • Appendix G. User's View of Object-Oriented Modules
  • Colophon