A Simple Python Web Crawler Example

Fetching the HTML source of the Baidu home page

from urllib import request

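# Page to download
url = 'http://www.baidu.com'
# Send a browser-like User-Agent instead of urllib's default one
headers = {'User-Agent':'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) '
                        'AppleWebKit/534.50 (KHTML, like Gecko) '
                        'Version/5.1 Safari/534.50'}

# Build the request with the custom headers
req = request.Request(url=url, headers=headers)
# The with-block closes the connection once the body has been read
with request.urlopen(req) as res:
    # read() returns bytes; decode them to text, assuming the page is UTF-8
    html = res.read().decode('utf-8')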

print(html)
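
The script above assumes the request succeeds and that the page is UTF-8. A slightly more defensive variant might look like the sketch below. The helper names (`build_request`, `decode_body`, `fetch_html`), the `DEFAULT_UA` string, and the 10-second timeout are all hypothetical choices made for illustration and are not part of the original example.

```python
from urllib import error, request

# Hypothetical User-Agent string, used only for this sketch
DEFAULT_UA = 'Mozilla/5.0 (compatible; example-crawler)'


def build_request(url, user_agent=DEFAULT_UA):
    """Create a Request object that carries a custom User-Agent header."""
    return request.Request(url=url, headers={'User-Agent': user_agent})


def decode_body(raw, charset=None):
    """Decode response bytes, falling back to UTF-8 if no usable charset is given."""
    try:
        return raw.decode(charset or 'utf-8', errors='replace')
    except LookupError:
        # The server declared an encoding name Python does not know
        return raw.decode('utf-8', errors='replace')


def fetch_html(url, timeout=10):
    """Return the page text, or None if the request fails."""
    req = build_request(url)
    try:
        with request.urlopen(req, timeout=timeout) as res:
            # Charset from the Content-Type header, or None if absent
            charset = res.headers.get_content_charset()
            return decode_body(res.read(), charset)
    except error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500
        print(f'HTTP error {exc.code} for {url}')
    except error.URLError as exc:
        # The server could not be reached (DNS failure, refused connection, ...)
        print(f'Could not reach {url}: {exc.reason}')
    return None
```

Calling `fetch_html('http://www.baidu.com')` should then return the page text, or print a short message and return `None` if the request fails. `HTTPError` is caught before `URLError` because it is a subclass of it.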
