Python Crawling/Scraping Frameworks
BeautifulSoup
BeautifulSoup is a pure-Python HTML (and XML) parser that is popular with many Python developers. Its official website provides thorough documentation for reference.
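As a rough illustration of what parsing with BeautifulSoup looks like, here is a minimal sketch. It assumes the bs4 package is installed and uses Python's built-in "html.parser" backend. The sample HTML and the variable names are made up for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, used only for illustration.
html = """
<html><head><title>Sample page</title></head>
<body>
<a href="/first">First link</a>
<a href="/second">Second link</a>
</body></html>
"""

# Parse the markup with the standard-library backend (no extra parser needed).
soup = BeautifulSoup(html, "html.parser")

# Read the text of the <title> element.
title = soup.title.string

# Collect (link text, href) pairs for every <a> tag in document order.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(title)
print(links)
```

In a real crawler the HTML string would come from an HTTP response rather than a literal, and a faster backend such as lxml could be swapped in if it is available.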
Scrapy
Parsing support covers HTML, XML, CSV, and JavaScript. Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide ran
Ruya
It is targeted solely towards developers who want crawling functionality in their code.
mechanize
Very high-level browsing functionality (extremely simple form filling and submission).
