Web Scraping with Python: Collecting Data from the Modern Web

Learn web scraping and crawling techniques to access unlimited data from any web source in any format. With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands, or even millions, of web pages at once.

Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice.

- Learn how to parse complicated HTML pages
- Traverse multiple pages and sites
- Get a general overview of APIs and how they work
- Learn several methods for storing the data you scrape
- Download, read, and extract data from documents
- Use tools and techniques to clean badly formatted data
- Read and write natural languages
- Crawl through forms and logins
- Understand how to scrape JavaScript
- Learn image processing and text recognition