Fetching documents from the web with Python (scraping data with Python)

Source: unknown | Views: 125 | Posted: 2021-06-13 03:44

Example code (Python 2 syntax):

import urllib

# Fetch the page and print its HTML directly
doc = urllib.urlopen("http://www.python.org").read()
print doc

# Progress callback: prints whatever arguments it receives
def reporthook(*a):
    print a

Running the full program prints the following:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
..........................page content
</body></html>
(0, 8192, -1)
(1, 8192, -1)
(2, 8192, -1)

Here, urllib.urlopen returns a file-like object, which is why the page can be read with read().
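Because the example code uses Python 2 syntax, a rough Python 3 counterpart may help. The sketch below assumes urllib.request.urlopen as the replacement for urllib.urlopen; the helper names fetch_text and first_lines are hypothetical and not part of the original article. It shows the two file-like behaviours the sentence above relies on: read() and line-by-line iteration.

```python
import urllib.request


def fetch_text(url, encoding="utf-8"):
    """Hypothetical helper: return the body at `url` as text."""
    # urlopen returns a file-like response; `with` closes it afterwards.
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes in Python 3, not str
    # The encoding is an assumption; undecodable bytes are replaced.
    return raw.decode(encoding, errors="replace")


def first_lines(url, limit=3):
    """Hypothetical helper: return up to `limit` raw lines of the body."""
    lines = []
    with urllib.request.urlopen(url) as response:
        # Like a file opened in binary mode, the response yields lines.
        for line in response:
            if len(lines) >= limit:
                break
            lines.append(line)
    return lines
```

For instance, print(fetch_text("http://www.python.org")) would mirror the article's print doc line, provided the page is reachable and decodes with the assumed encoding.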

# Save the page to renren.html; reporthook is called once for each block read
urllib.urlretrieve("http://www.renren.com", 'renren.html', reporthook)

# Save the page to renren.html without a progress callback
urllib.urlretrieve("http://www.renren.com", 'renren.html')

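The program's output is the listing shown earlier, beginning with the DOCTYPE line: first the page's HTML, then one tuple per reporthook call. Each tuple holds the number of blocks transferred so far, the block size in bytes, and the total file size, which is -1 when the size is not known.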
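The reporthook mechanism can be sketched in Python 3 as well. The version below assumes urllib.request.urlretrieve, which takes the same three arguments as the Python 2 call; the function name download and the wording of the progress messages are hypothetical. It records every callback and prints a byte count when the total size is known.

```python
import urllib.request


def download(url, filename):
    """Hypothetical helper: save `url` to `filename`, return hook calls."""
    calls = []

    def reporthook(block_count, block_size, total_size):
        # Called once before the first block and once after each block.
        calls.append((block_count, block_size, total_size))
        if total_size >= 0:
            # The last block may be partial, so cap at the total size.
            done = min(block_count * block_size, total_size)
            print("%d of %d bytes" % (done, total_size))
        else:
            # total_size is -1 when no size was reported for the file.
            print("%d blocks read, total size unknown" % block_count)

    urllib.request.urlretrieve(url, filename, reporthook)
    return calls
```

Calling download("http://www.renren.com", "renren.html") would correspond to the first urlretrieve call in the article. Returning the recorded tuples is a design choice made here so the callback's behaviour can be inspected after the download finishes.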
