
ass 3

Spider | Python | web crawler: a routine web-scraping exercise, fairly typical of bs4-based tasks.

# Part 1. Get information from HTML content.

import requests

# Get the Douban book list under a given tag. eg: 
def get_response_by_douban_tag(tag):
    # Append the tag to the base URL of Douban's tag pages.
    url = 'https://book.douban.com/tag/' + tag
    # Send a browser-like User-Agent header with the request.
    headers = {'User-Agent': "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537. "
                             "(KHTML, like Gecko) Chrome/62.0.3202.62 Safari/537.36"}
    resp = requests.get(url, headers=headers)
    return resp

tag = ''
resp = get_response_by_douban_tag(tag)

from bs4 import BeautifulSoup
import re

# Extract book information from the HTML text.
soup = BeautifulSoup(resp.text, 'html.parser')
items = soup.find_all(class_='subject-item')
for item in items:
    title_link = item.find('h2').find('a')
    title = title_link.get_text(strip=True)
    link = title_link.get('href')
    publish_info = item.find('div', attrs={'class': 'pub'}).get_text(strip=True)
    # Guard against an item that has no score span.
    score_tag = item.find('span', attrs={'class': 'rating_nums'})
    score = score_tag.get_text(strip=True) if score_tag else None
    comment_text = item.find('span', attrs={'class': 'pl'}).get_text(strip=True).strip('()')
    # Fall back to 0 when the comment text contains no digits.
    match = re.search(r'\d+', comment_text)
    comment_num = int(match.group()) if match else 0

    print('Book Title: ', title)
    print('Book Link: ', link)
    print('Publish Information: ', publish_info)
    print('Douban Score: ', score)
    print('Douban Comment Number: ', comment_num)

    print()

# Part 2. Get stock information from JSON content.

import requests
import json
import time

def crawl_finance_info(code):
    # Build the request URL for the given stock code.
    url = ('http://emweb.securities.eastmoney.com/PC_HSF10/NewFinanceAnalysis/ZYZBAjaxNew'
           '?type=0&code={code}').format(code=code)
    resp = requests.get(url)
    # Decode the JSON body into Python objects.
    data = resp.json()
    return data

code = 'SZ300059'
data = crawl_finance_info(code)

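# Extract stock finance information from the JSON content.
# The field meanings in the comments below are inferred from the key names.
for item in data['data']:
    print(item['REPORT_DATE_NAME'])
    print(item['NOTICE_DATE'])
    print(item['REPORT_DATE'])
    print(item['CURRENCY'])  # currency
    print(item['TOTALOPERATEREVE'])  # total operating revenue
    print(item['BPS'])  # net assets (book value) per share
    print(item['EPSJB'])  # basic earnings per share
    print(item['MGZBGJ'])  # capital reserve per share
    print()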

Documentation

In the first part, we extract the book list from the Douban website for a user-supplied tag. We first append the tag to the base URL and send a request to the Douban server, which returns the page's HTML content. We then use BeautifulSoup to extract each book's title, link, publishing information, score, and comment count. The tag can be set to any value the user chooses.
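The two small steps described above, composing the tag URL and pulling a number out of the comment text, can be sketched offline as follows. This is only a minimal sketch: `build_tag_url` and `parse_comment_count` are hypothetical helper names, not part of the original script, and the tag `小说` and the sample strings are illustrative assumptions rather than values taken from a live page.

```python
import re
from urllib.parse import quote

BASE_URL = 'https://book.douban.com/tag/'  # same base URL as in Part 1


def build_tag_url(tag):
    """Hypothetical helper: percent-encode the tag and append it to the base URL."""
    return BASE_URL + quote(tag, safe='')


def parse_comment_count(text):
    """Hypothetical helper: return the first integer in text, or None if there is none."""
    match = re.search(r'\d+', text)
    return int(match.group()) if match else None


# Illustrative inputs (assumed, not scraped from Douban).
print(build_tag_url('小说'))
print(parse_comment_count('(1234人评价)'))
print(parse_comment_count('(no ratings yet)'))
```

Returning `None` instead of raising lets the caller decide how to treat a book whose comment text carries no number.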

In the second part, we extract stock financial information from Eastmoney.com, a well-known stock website in China. Given a stock code, the script retrieves the financial indicators that the company has published in its reports. The request to the website's server returns JSON content, which we decode with the response's json() method before reading the fields of interest.
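The extraction step can be tried without a network connection by decoding a small hand-written payload. The sketch below assumes the response has the same shape as in Part 2 (a top-level `data` list of per-report dictionaries); the values in `SAMPLE` are made-up placeholders, and `extract_rows` is a hypothetical helper, not part of the original script.

```python
import json

# Made-up payload that mimics the shape used in Part 2; all values are placeholders.
SAMPLE = '''
{"data": [
  {"REPORT_DATE_NAME": "report A", "CURRENCY": "CNY", "BPS": 1.5, "EPSJB": 0.25},
  {"REPORT_DATE_NAME": "report B", "CURRENCY": "CNY", "BPS": null}
]}
'''

FIELDS = ['REPORT_DATE_NAME', 'CURRENCY', 'BPS', 'EPSJB']


def extract_rows(payload, fields=FIELDS):
    """Hypothetical helper: one dict per report, with None for any absent key."""
    return [{name: item.get(name) for name in fields}
            for item in payload.get('data') or []]


rows = extract_rows(json.loads(SAMPLE))
for row in rows:
    print(row)
```

Using `item.get(name)` rather than `item[name]` means a report that lacks one field yields `None` for it instead of stopping the loop with a `KeyError`.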
