Crawling Baidu Translate with JS Reverse Engineering
Introduction
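Created: 2021-09-20 15:30, Monday
This article builds a Python crawler that queries Baidu Translate (百度翻译) directly. Doing so takes some JavaScript reverse engineering: we work out how the request parameters are generated, reproduce them, and send them to obtain the translation.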
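Part 1
Open Baidu Translate, type in a few words (anything you like), and open the Network tab of the developer tools.
The request carries this form data: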
from: en
to: zh
query: test
transtype: translang
simple_means_flag: 3
sign: 431039.159886
token: 8a137e841303b997efa7ed8d0881c83a
domain: common
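from and to clearly mean "from English to Chinese".
query is the word being looked up.
transtype is the translation type.
sign and token are probably encrypted parameters.
Part 2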
Now look up a different word:
from: en
to: zh
query: result
transtype: translang
simple_means_flag: 3
sign: 586451.791010
token: 8a137e841303b997efa7ed8d0881c83a
domain: common
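Apart from query itself, the only parameter that changes is sign. The token value can also be found in the page source under Elements.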
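The comparison of the two requests can also be done mechanically. The sketch below is only an illustration: the two dictionaries are the form data captured above, and changed_fields is a hypothetical helper name, not part of any Baidu API.

```python
# Form data captured from the two requests above.
first = {
    "from": "en", "to": "zh", "query": "test",
    "transtype": "translang", "simple_means_flag": "3",
    "sign": "431039.159886",
    "token": "8a137e841303b997efa7ed8d0881c83a",
    "domain": "common",
}
second = dict(first, query="result", sign="586451.791010")


def changed_fields(a, b):
    """Return the sorted names of the fields whose values differ."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


print(changed_fields(first, second))  # ['query', 'sign']
```

Since query is the input itself, sign is the only value that has to be computed for each request.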
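Search all loaded files for sign.
Many of the matches merely contain the string sign. The one that actually builds the parameter looks like this: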
sign: L(e)
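Set a breakpoint on the sign line and step into the call with the debugger.
Execution jumps into a function whose argument r is the word being looked up.
Copy the code of function e(r) into a local file (I saved it as sign.js):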
function e(r) {
    var o = r.match(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g);
    if (null === o) {
        var t = r.length;
        t > 30 && (r = "" + r.substr(0, 10) + r.substr(Math.floor(t / 2) - 5, 10) + r.substr(-10, 10))
    } else {
        for (var e = r.split(/[\uD800-\uDBFF][\uDC00-\uDFFF]/), C = 0, h = e.length, f = []; h > C; C++)
            "" !== e[C] && f.push.apply(f, a(e[C].split(""))),
            C !== h - 1 && f.push(o[C]);
        var g = f.length;
        g > 30 && (r = f.slice(0, 10).join("") + f.slice(Math.floor(g / 2) - 5, Math.floor(g / 2) + 5).join("") + f.slice(-10).join(""))
    }
    var u = void 0,
        l = "" + String.fromCharCode(103) + String.fromCharCode(116) + String.fromCharCode(107);
    u = null !== i ? i : (i = window[l] || "") || "";
    for (var d = u.split("."), m = Number(d[0]) || 0, s = Number(d[1]) || 0, S = [], c = 0, v = 0; v < r.length; v++) {
        var A = r.charCodeAt(v);
        128 > A ? S[c++] = A : (2048 > A ? S[c++] = A >> 6 | 192 : (55296 === (64512 & A) && v + 1 < r.length && 56320 === (64512 & r.charCodeAt(v + 1)) ? (A = 65536 + ((1023 & A) << 10) + (1023 & r.charCodeAt(++v)),
            S[c++] = A >> 18 | 240,
            S[c++] = A >> 12 & 63 | 128) : S[c++] = A >> 12 | 224,
            S[c++] = A >> 6 & 63 | 128),
            S[c++] = 63 & A | 128)
    }
    for (var p = m, F = "" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(97) + ("" + String.fromCharCode(94) + String.fromCharCode(43) + String.fromCharCode(54)), D = "" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(51) + ("" + String.fromCharCode(94) + String.fromCharCode(43) + String.fromCharCode(98)) + ("" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(102)), b = 0; b < S.length; b++)
        p += S[b],
        p = n(p, F);
    return p = n(p, D),
        p ^= s,
        0 > p && (p = (2147483647 & p) + 2147483648),
        p %= 1e6,
        p.toString() + "." + (p ^ m)
}
console.log(e('test')) // add this line to print the result
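The first thing e(r) does is shorten long input: anything over 30 characters is reduced to its first 10, middle 10, and last 10 characters before hashing. A minimal Python sketch of that step follows. The function name shorten is made up for illustration, and it assumes Python's code-point slicing is an acceptable stand-in for the JS branch that treats each surrogate pair as one unit.

```python
def shorten(text):
    """Keep the first 10, middle 10 and last 10 characters of long input."""
    length = len(text)
    if length <= 30:
        return text
    mid = length // 2
    return text[:10] + text[mid - 5:mid + 5] + text[-10:]


print(shorten("test"))             # short input is returned unchanged
print(shorten("0123456789" * 4))   # 40 characters are reduced to 30
```

Only these 30 characters feed into the sign, so two long queries that agree on those three slices would produce the same value.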
Run it with node:
node sign.js
It fails with the error: i is not defined
Back in the developer tools, stepping through the code shows that i = "320305.131321201".
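The String.fromCharCode calls in e(r) only hide short strings. Decoding them, sketched here in Python with a hypothetical helper from_char_codes, shows that the fallback for i is window["gtk"] and that the two strings passed to n are small operation tables.

```python
def from_char_codes(*codes):
    """Python counterpart of chained String.fromCharCode(...) calls."""
    return "".join(chr(code) for code in codes)


window_key = from_char_codes(103, 116, 107)
F = from_char_codes(43, 45, 97) + from_char_codes(94, 43, 54)
D = (from_char_codes(43, 45, 51) + from_char_codes(94, 43, 98)
     + from_char_codes(43, 45, 102))

print(window_key)  # gtk
print(F)           # +-a^+6
print(D)           # +-3^+b+-f
```

So the value seen in the debugger is what the page stores under the name gtk, which is why hard-coding it is enough to run the function outside the browser.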
Define i = "320305.131321201" directly at the top of the code:
var i = "320305.131321201";

function e(r) {
    var o = r.match(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g);
    if (null === o) {
        var t = r.length;
        t > 30 && (r = "" + r.substr(0, 10) + r.substr(Math.floor(t / 2) - 5, 10) + r.substr(-10, 10))
    } else {
        for (var e = r.split(/[\uD800-\uDBFF][\uDC00-\uDFFF]/), C = 0, h = e.length, f = []; h > C; C++)
            "" !== e[C] && f.push.apply(f, a(e[C].split(""))),
            C !== h - 1 && f.push(o[C]);
        var g = f.length;
        g > 30 && (r = f.slice(0, 10).join("") + f.slice(Math.floor(g / 2) - 5, Math.floor(g / 2) + 5).join("") + f.slice(-10).join(""))
    }
    var u = void 0,
        l = "" + String.fromCharCode(103) + String.fromCharCode(116) + String.fromCharCode(107);
    u = null !== i ? i : (i = window[l] || "") || "";
    for (var d = u.split("."), m = Number(d[0]) || 0, s = Number(d[1]) || 0, S = [], c = 0, v = 0; v < r.length; v++) {
        var A = r.charCodeAt(v);
        128 > A ? S[c++] = A : (2048 > A ? S[c++] = A >> 6 | 192 : (55296 === (64512 & A) && v + 1 < r.length && 56320 === (64512 & r.charCodeAt(v + 1)) ? (A = 65536 + ((1023 & A) << 10) + (1023 & r.charCodeAt(++v)),
            S[c++] = A >> 18 | 240,
            S[c++] = A >> 12 & 63 | 128) : S[c++] = A >> 12 | 224,
            S[c++] = A >> 6 & 63 | 128),
            S[c++] = 63 & A | 128)
    }
    for (var p = m, F = "" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(97) + ("" + String.fromCharCode(94) + String.fromCharCode(43) + String.fromCharCode(54)), D = "" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(51) + ("" + String.fromCharCode(94) + String.fromCharCode(43) + String.fromCharCode(98)) + ("" + String.fromCharCode(43) + String.fromCharCode(45) + String.fromCharCode(102)), b = 0; b < S.length; b++)
        p += S[b],
        p = n(p, F);
    return p = n(p, D),
        p ^= s,
        0 > p && (p = (2147483647 & p) + 2147483648),
        p %= 1e6,
        p.toString() + "." + (p ^ m)
}
console.log(e('test')) // add this line to print the result
Running it with node again produces another error: n is not defined
Go back to the developer tools, locate n, and copy its code into sign.js:
function n(r, o) {
    for (var t = 0; t < o.length - 2; t += 3) {
        var a = o.charAt(t + 2);
        a = a >= "a" ? a.charCodeAt(0) - 87 : Number(a),
        a = "+" === o.charAt(t + 1) ? r >>> a : r << a,
        r = "+" === o.charAt(t) ? r + a & 4294967295 : r ^ a
    }
    return r
}
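For readers who would rather not depend on node, the two functions can be ported to Python. What follows is a sketch, not the article's method: to_int32 and sign are names introduced here, the gtk value is the one observed in the debugger above, and the port assumes the input contains no lone surrogates. The main subtlety is that JavaScript bitwise operators work on 32-bit integers, so every intermediate result has to be wrapped by hand.

```python
GTK = "320305.131321201"  # value of i observed in the debugger above


def to_int32(x):
    """Wrap a Python int to signed 32 bits, as JS bitwise operators do."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def n(r, o):
    """Port of the JS helper n(r, o); o is read three characters at a time."""
    for t in range(0, len(o) - 2, 3):
        c = o[t + 2]
        a = ord(c) - 87 if c >= "a" else int(c)  # shift amount
        # "+" selects an unsigned right shift (>>>), anything else a left shift
        a = (r & 0xFFFFFFFF) >> a if o[t + 1] == "+" else to_int32(r << a)
        # "+" selects a wrapped addition, anything else an XOR
        r = to_int32(r + a) if o[t] == "+" else to_int32(r ^ a)
    return r


def sign(word, gtk=GTK):
    """Port of the JS function e(r)."""
    if len(word) > 30:
        mid = len(word) // 2
        word = word[:10] + word[mid - 5:mid + 5] + word[-10:]
    m_text, _, s_text = gtk.partition(".")
    m, s = int(m_text or 0), int(s_text or 0)
    p = m
    for byte in word.encode("utf-8"):  # the JS loop builds these bytes by hand
        p = n(p + byte, "+-a^+6")
    p = n(p, "+-3^+b+-f")
    p = to_int32(p ^ s)
    if p < 0:
        p = (p & 0x7FFFFFFF) + 0x80000000
    p %= 1_000_000
    return f"{p}.{p ^ m}"


print(sign("test"))  # 431039.159886, the sign captured for "test" above
```

If the page ever serves a different gtk value, the same function can be called as sign(word, gtk=new_value).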
The script now prints a sign, and it matches the sign captured earlier for the word test.
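There is also a quick way to cross-check a captured sign against the value of i. The return expression of e(r) is p.toString() + "." + (p ^ m), where m is the integer part of i, so the two halves of any valid sign differ by an XOR with m. The helper name below is hypothetical.

```python
def sign_is_consistent(sign, gtk):
    """Check that the second half of a sign equals the first half XOR m."""
    first, second = (int(part) for part in sign.split("."))
    m = int(gtk.split(".")[0])
    return (first ^ m) == second


gtk = "320305.131321201"
print(sign_is_consistent("431039.159886", gtk))  # True (sign for "test")
print(sign_is_consistent("586451.791010", gtk))  # True (sign for "result")
```

Both signs captured in Part 1 and Part 2 pass, which supports the hard-coded value of i being the one the page used for those requests.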
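Part 3
Write the Python code
Note: the code below calls node to run the JavaScript, so a Node.js environment is needed. Because it passes the word on the command line (node sign.js followed by the word), the last line of sign.js should print e(process.argv[2]) rather than e('test').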
import json
import os
import subprocess
import time

import requests


class BaiduFanyi:
    def __init__(self) -> None:
        # The endpoint URLs and the referer/origin values are missing from
        # this copy of the article and must be filled in before running.
        self.url = ''
        self.judge_type_url = ''
        self.result = ''
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
            'referer': '',
            'origin': '',
            'Cookie': 'your cookie',
        }
        self.data = {}

    # Call sign.js to get the sign
    def get_sign(self, word):
        # Passing an argument list avoids shell quoting problems with the word
        completed = subprocess.run(['node', 'sign.js', word],
                                   capture_output=True, text=True)
        return completed.stdout

    def get_response(self, url, data):
        response = requests.post(url, data=data, headers=self.headers).content.decode('utf8')
        return json.loads(response)

    def run(self):
        print('Welcome ━(*`∀´*)ノ亻!')
        while True:
            word = input('Enter a word or Chinese text to look up (q to quit): ')
            if word == 'q':
                break
            os.system('cls' if os.name == 'nt' else 'clear')
            # Detect the language of the input
            judge_lan = self.get_response(self.judge_type_url, {'query': word})
            lan_type = judge_lan['lan']
            tran_type = 'en' if lan_type == 'zh' else 'zh'
            # Only this tail of the request URL survives in the source
            self.url = f'{lan_type}&to={tran_type}'
            self.data = {
                'from': lan_type,
                'to': tran_type,
                'query': word,
                'transtype': 'translang',
                'simple_means_flag': 3,
                'sign': self.get_sign(word).replace('\n', ''),
                'token': '8a137e841303b997efa7ed8d0881c83a',
                'domain': 'common'
            }
            tran_data = self.get_response(self.url, self.data)
            try:
                self.result = " ".join(tran_data['dict_result']['simple_means']['word_means'])
                for i in range(22):
                    time.sleep(0.01)
                    # The progress character is lost in the source; '.' is used here
                    print('\r' + 'Translating' + '.' * i, end='')
                print()
                print('')
                print('Translation result:')
                print(self.result)
                print('⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡')
            except Exception:
                print('No result found!')


if __name__ == '__main__':
    test = BaiduFanyi()
    test.run()
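The script reads the translation from the nested path dict_result, simple_means, word_means and falls back to a "no result" message when that lookup fails. A small sketch of the same lookup as a standalone function is shown below. extract_meanings is a hypothetical name, and the sample response is made up for illustration; it only mirrors the keys the script uses, not a real response.

```python
def extract_meanings(tran_data):
    """Return the word meanings joined by spaces, or None if absent."""
    try:
        means = tran_data["dict_result"]["simple_means"]["word_means"]
    except (KeyError, TypeError):
        return None
    return " ".join(means)


# Made-up sample shaped like the keys used in the script above.
sample = {"dict_result": {"simple_means": {"word_means": ["测试", "试验"]}}}

print(extract_meanings(sample))  # 测试 试验
print(extract_meanings({}))      # None
```

Returning None instead of catching every exception keeps network or JSON errors visible rather than reporting them as "no result".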