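Introduction: This is my first crawler, so corrections are welcome if anything falls short. The content borrows from another developer's crawler write-up; I build on it by adding how to obtain the music playback URL and how to connect to a MySQL database.
Software used: Navicat for MySQL (cracked version) and Postman
Data crawled: lyrics, song, artist, playback URL, cover image, song duration, file size, and so on
Modules used by the crawler: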
import time
import pymysql
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
import json
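Crawler approach and problems:
1·Encryption of hash and mid
2·A workaround for the song playback URL and the song-detail request URL
3·Encoding of the retrieved data
4·Getting the localStorage and cookie values from the song page
5·Connecting to MySQL and storing the data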
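For item 5, storing one record might look roughly like the sketch below. It is only an illustration under stated assumptions: the table name `songs`, its column names, and the helper names `song_to_row` and `save_song` are hypothetical and do not come from the original crawler. The function accepts any connection object that behaves like a `pymysql` connection (a `cursor()` usable as a context manager, plus `commit()`).

```python
# Hypothetical table and column names, chosen only for illustration.
INSERT_SQL = (
    "INSERT INTO songs "
    "(song_hash, song_name, author_name, play_url, img, duration_ms, lyrics) "
    "VALUES (%s, %s, %s, %s, %s, %s, %s)"
)


def song_to_row(song):
    """Map a song dict to a tuple in the same order as the INSERT columns."""
    return (
        song["hash"],
        song["song_name"],
        song["author_name"],
        song["play_url"],
        song["img"],
        song["timelength"],
        song["lyrics"],
    )


def save_song(conn, song):
    """Insert one song through a pymysql-style connection and commit."""
    with conn.cursor() as cursor:
        # Values are passed as parameters so the driver escapes them.
        cursor.execute(INSERT_SQL, song_to_row(song))
    conn.commit()
```

A real connection could be created with `pymysql.connect(host=..., user=..., password=..., database=..., charset="utf8mb4")`; `utf8mb4` is suggested here so that Chinese song titles and lyrics are stored without encoding problems.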
Working through the problems, taking the URL of one song as an example:
URL:https://wwwapi.kugou.com/yy/index.php?r=play/getdata&callback=jQuery19108013258872165683_1631704461109&hash=BC4E172CF13BB79303203A48246D84E1&dfid=2C8TCD3wtvFq3A2P4h4Slbtf&appid=1014&mid=ee9a0573ca7b9cda6b916c684b10b6da&platid=4&album_id=38915273&_=1631704461111
Never mind for now where this URL comes from; first, let's analyze this GET request.
·Parameters involved:
1·hash: BC4E172CF13BB79303203A48246D84E1
2·dfid: 2C8TCD3wtvFq3A2P4h4Slbtf
3·appid: 1014
4·mid: ee9a0573ca7b9cda6b916c684b10b6da
5·platid: 4
6·_: 1631704461111
7·callback: jQuery19108013258872165683_1631704461109
8·album_id: 38915273
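The parameters above can be assembled into the request URL programmatically. The following is a minimal sketch, not the original author's code: the function name `build_play_url` is hypothetical, and it assumes that `_` is a millisecond timestamp used as a cache-buster (its value looks like one) and that `appid` and `platid` stay fixed at the values seen in the example.

```python
import time
from urllib.parse import urlencode

API_URL = "https://wwwapi.kugou.com/yy/index.php"


def build_play_url(song_hash, album_id, mid, dfid, callback=None, now_ms=None):
    """Build the play/getdata GET URL from the parameters listed above."""
    if now_ms is None:
        # Assumption: "_" is the current time in milliseconds.
        now_ms = int(time.time() * 1000)
    params = {"r": "play/getdata"}
    if callback:
        # With a callback the server wraps the JSON as callback({...});
        params["callback"] = callback
    params.update({
        "hash": song_hash,
        "dfid": dfid,
        "appid": "1014",
        "mid": mid,
        "platid": "4",
        "album_id": album_id,
        "_": str(now_ms),
    })
    # safe="/" keeps "play/getdata" readable instead of "play%2Fgetdata".
    return API_URL + "?" + urlencode(params, safe="/")
```

The resulting string can then be passed to `requests.get(url)`. Whether the server still answers when `callback` is omitted, or when `mid` and `dfid` are not the values generated by the real page, is not confirmed here and would need to be checked against the live API.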
Then open this request in Postman; the result is shown below.
JSON after decoding:
jQuery19108013258872165683_1631704461109({
  "status": 1,
  "err_code": 0,
  "data": {
    "hash": "BC4E172CF13BB79303203A48246D84E1",
    "timelength": 168000,
    "filesize": 2701932,
    "audio_name": "傅梦彤、安苏羽 - 潮汐 (Natural)",
    "have_album": 1,
    "album_name": "潮汐 (Natural)",
    "album_id": "38915273",
    "img": "http://imge.kugou.com/stdmusic/20201204/20201204164503970613.jpg",
    "have_mv": 1,
    "video_id": "4709291",
    "author_name": "傅梦彤、安苏羽",
    "song_name": "潮汐 (Natural)",
    "lyrics": "[id:$00000000]\r\n[ar:傅梦彤、安苏羽]\r\n[ti:潮汐 (Natural)]\r\n[by:]\r\n[hash:bc4e172cf13bb79303203a48246d84e1]\r\n[al:]\r\n[sign:]\r\n[qq:]\r\n[total:168829]\r\n[offset:0]\r\n[00:00.08]傅梦彤、安苏羽 - 潮汐 (Natural)\r\n[00:00.87]作词:安苏羽、舒心\r\n[00:01.12]混音:谢骁\r\n[00:21.12]当海面迎来汹涌的潮汐\r\n[00:23.55]我奔跑寻找昔日的足迹\r\n[00:26.18]夕阳下倒影迷人的美丽\r\n[00:28.71]可我却丢失故事和你\r\n[00:31.34]你说过向往大海的神秘\r\n[00:33.92]也憧憬我们遗失的过去\r\n[00:36.50]分享给大海秘密\r\n[00:39.74]蓝色的海底\r\n[00:42.27]远山的风景\r\n[00:45.16]我们的距离遥不可及\r\n[00:50.02]退守的爱情\r\n[00:52.70]还剩下回忆\r\n[00:55.03]疯狂地寻觅你的身影\r\n[01:00.69]残月忧郁\r\n[01:03.07]星夜静谧\r\n[01:05.60]潮落叹息\r\n[01:11.04]聆听山语\r\n[01:13.42]回荡不清\r\n[01:15.91]若即若离\r\n[01:23.05]当海面迎来汹涌的潮汐\r\n[01:25.53]我奔跑寻找昔日的足迹\r\n[01:28.11]夕阳下倒影迷人的美丽\r\n[01:30.65]可我却丢失故事和你\r\n[01:33.28]你说过向往大海的神秘\r\n[01:35.86]也憧憬我们遗失的过去\r\n[01:38.40]分享给大海秘密\r\n[01:41.59]蓝色的海底\r\n[01:44.27]远山的风景\r\n[01:46.90]我们的距离遥不可及\r\n[01:52.06]退守的爱情\r\n[01:54.59]还剩下回忆\r\n[01:57.12]疯狂地寻觅你的身影\r\n[02:02.59]残月忧郁\r\n[02:04.92]星夜静谧\r\n[02:07.55]潮落叹息\r\n[02:12.97]聆听山语\r\n[02:15.45]回荡不清\r\n[02:17.87]若即若离\r\n[02:23.24]残月忧郁\r\n[02:25.57]星夜静谧\r\n[02:28.15]潮落叹息\r\n[02:33.65]聆听山语\r\n[02:36.07]回荡不清\r\n[02:38.50]若即若离\r\n",
    "author_id": "968893",
    "privilege": 8,
    "privilege2": "1000",
    "play_url": "https://webfs.ali.kugou.com/202109151914/04b52e1a2976cec776bee89f7690b6ae/G226/M05/18/14/gocBAF87mjyAfkS4ACk6bFP5o1Q758.mp3",
    "authors": [
      {
        "author_id": "968893",
        "author_name": "傅梦彤",
        "is_publish": "1",
        "sizable_avatar": "http://singerimg.kugou.com/uploadpic/softhead/{size}/20210610/20210610031307109445.jpg",
        "avatar": "http://singerimg.kugou.com/uploadpic/softhead/400/20210610/20210610031307109445.jpg"
      },
      {
        "author_id": "87264",
        "author_name": "安苏羽",
        "is_publish": "1",
        "sizable_avatar": "http://singerimg.kugou.com/uploadpic/softhead/{size}/20190321/20190321201004866434.jpg",
        "avatar": "http://singerimg.kugou.com/uploadpic/softhead/400/20190321/20190321201004866434.jpg"
      }
    ],
    "is_free_part": 0,
    "bitrate": 128,
    "recommend_album_id": "38915273",
    "audio_id": "80133277",
    "has_privilege": true,
    "play_backup_url": "https://webfs.cloud.kugou.com/202109151914/b75e58fc8b2b687b5a4a65df5d319aa1/G226/M05/18/14/gocBAF87mjyAfkS4ACk6bFP5o1Q758.mp3"
  }
});
Er... time to clock off for now.
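The response shown above is JSONP: the JSON object is wrapped in a call to the `callback` name and followed by `);`, so it cannot be handed to `json.loads` directly. The sketch below shows one possible way to strip the wrapper and pick out the fields of interest. The helper names `parse_jsonp` and `extract_song` are hypothetical, and treating `timelength` as milliseconds is an assumption based on the value 168000 sitting close to the `[total:168829]` tag in the lyrics.

```python
import json
import re

# Matches  callbackName( ...payload... );  and captures the payload.
_JSONP_RE = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def parse_jsonp(text):
    """Strip a callback(...) wrapper if present, then decode the JSON."""
    match = _JSONP_RE.match(text)
    payload = match.group(1) if match else text
    # json.loads also turns \uXXXX escapes back into readable characters.
    return json.loads(payload)


def extract_song(response_text):
    """Pick the fields this crawler cares about out of the 'data' object."""
    data = parse_jsonp(response_text)["data"]
    return {
        "hash": data["hash"],
        "song_name": data["song_name"],
        "author_name": data["author_name"],
        "album_name": data.get("album_name"),
        "timelength": data["timelength"],  # assumed to be milliseconds
        "filesize": data["filesize"],
        "img": data["img"],
        "play_url": data["play_url"],
        "lyrics": data["lyrics"],
    }
```

Typical use would be `song = extract_song(requests.get(url).text)`. If the raw body shows escaped text such as `\u6f6e`, decoding through `json.loads` as above is usually enough to recover the Chinese characters.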