C# Web Scraping: WYSIWYG Crawling with ChromeDriver

opendotnet

Problem

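While writing crawlers recently, I found that on many pages the content is visible in the browser but missing from the page source. This is what is usually called asynchronous loading. If we need that asynchronously loaded content, one option is to work out the rules behind it, assemble the right parameters, and issue further HTTP requests to fetch the data. That approach becomes hard to manage when the logic is complex. The other option is to let a real browser render the page, and this is where ChromeDriver comes in.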

Solution

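Let's look at an example: crawling Tencent Video to get the link for a TV series or film.

The video plays in the browser, but the page source, when viewed directly, contains no video address at all.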

Use ChromeOptions to simulate a real user's browser

 ChromeOptions options = new ChromeOptions();
// Ignore certificate errors so HTTPS warnings do not block the page
options.AddArguments("--test-type", "--ignore-certificate-errors");
// Present a mobile user agent to the site
options.AddArguments("user-agent=mozilla/5.0 (linux; u; android 2.3.3; en-us; sdk build/ gri34) applewebkit/533.1 (khtml, like gecko) version/4.0 mobile safari/533.1");
options.AddArgument("enable-automation");
// options.AddArgument("headless");
// options.AddArguments("--proxy-server=");
// IWebDriver driver = new ChromeDriver(System.Environment.CurrentDirectory, options);
//chromeDriverService System.Environment.CurrentDirectory System.Environment.CurrentDirectory
// First argument: the directory containing chromedriver.exe; last argument: the command timeout
using (IWebDriver driver = new OpenQA.Selenium.Chrome.ChromeDriver(@"C:\Users\Administrator\Downloads\chromedriver_win32", options, TimeSpan.FromSeconds(120)))
{
    // trylogin(driver);
    // Assigning Url navigates the browser to that address
    driver.Url = ";auto=0&vid=z0023uikqoj";
    //tenvideo_video_player_0
    // PageSource returns the DOM as rendered by the browser, including script-inserted content
    SetText(driver.PageSource);

    ////Thread.Sleep(200);
    ////try
    ////{
    ////    for (int a = 1; a < 2; a++)
    ////    {
    ////        SetText("\r\n第" + a.ToString() + "个");
    ////        driver.Navigate().GoToUrl(";imageType=oss&imageAddress=cbuimgsearch/eWXC7XHHPN1607529600000&spm=");
    ////        // Log in
    ////        if (driver.Url.Contains("login.1688.com"))
    ////        {
    ////            SetText("\r\n需要登录,开始尝试...");
    ////            trylogin(driver); // login attempt finished
    ////            // Try again
    ////            driver.Navigate().GoToUrl(";imageType=oss&imageAddress=cbuimgsearch/eWXC7XHHPN1607529600000&spm=");
    ////            if (driver.Url.Contains("login.1688.com"))
    ////            {
    ////                // Nothing more we can do, so exit
    ////                SetText("\r\n退出,换ip重试...");
    ////                return;
    ////            }
    ////        }
    ////        // The page only ever shows one hover panel at a time, so the hover content cannot
    ////        // all be displayed and then downloaded; it has to be obtained some other way
    ////        // var elements = document.getElementsByClassName('hover-container');
    ////        // Array.prototype.forEach.call(elements, function(element) {
    ////        //     element.style.display = "block";
    ////        //     console.log(element);
    ////        // });
    ////        IJavaScriptExecutor js = (IJavaScriptExecutor)driver;
    ////        var sss = js.ExecuteScript(" var elements = document.getElementsByClassName('hover-container'); Array.prototype.forEach.call(elements, function(element) { console.log(element); element.setAttribute(\"class\", \"测试title\"); element.style.display = \"block\"; console.log(element); });");
    ////        Thread.Sleep(500);
    ////        var responseModel = Write(driver.PageSource, Pagetypeenum.列表);
    ////        Thread.Sleep(500);
    ////        int i = 1;
    ////        foreach (var offer in responseModel?.data?.offerList ?? new List<OfferItemModel>())
    ////        {
    ////            driver.Navigate().GoToUrl(offer.information.detailUrl);
    ////            string responseDatadetail = driver.PageSource;
    ////            Write(driver.PageSource, Pagetypeenum.详情);
    ////            SetText("\r\n第" + a.ToString() + "-" + i.ToString() + "个");
    ////            Thread.Sleep(500);
    ////            i++;
    ////        }
    ////    }
    ////}
    ////catch (Exception ex)
    ////{
    ////    CloseChromeDriver(driver);
    ////    throw;
    ////}
}
// Thread thread = new Thread(go);
// thread.Start();
}
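Because the video element is inserted by script, reading PageSource immediately after navigation can race with the page's own loading. The following is a minimal sketch of one way to wait for the rendered element and read its src directly from the live DOM. It assumes the Selenium.WebDriver and Selenium.Support NuGet packages and a chromedriver that Selenium can locate; the class RenderedSrcSketch and the helper WaitForSrc are hypothetical names introduced here for illustration, and the element id is the one noted in the comment above. The page URL is taken from the command line because the address in the listing above is truncated.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

public static class RenderedSrcSketch
{
    // Hypothetical helper (not from the original post): polls the live DOM until
    // the element with the given id exists and has a non-empty src attribute.
    // Throws WebDriverTimeoutException if that does not happen within the timeout.
    public static string WaitForSrc(IWebDriver driver, string elementId, TimeSpan timeout)
    {
        var wait = new WebDriverWait(driver, timeout);
        // The element may be replaced while the page is still loading.
        wait.IgnoreExceptionTypes(typeof(StaleElementReferenceException));

        // Until keeps polling while the callback returns null.
        return wait.Until<string>(d =>
        {
            // FindElements returns an empty collection instead of throwing.
            var matches = d.FindElements(By.Id(elementId));
            if (matches.Count == 0)
            {
                return null;
            }
            string src = matches[0].GetAttribute("src");
            return string.IsNullOrEmpty(src) ? null : src;
        });
    }

    public static void Main(string[] args)
    {
        if (args.Length < 1)
        {
            Console.Error.WriteLine("usage: RenderedSrcSketch <page-url>");
            return;
        }

        var options = new ChromeOptions();
        options.AddArguments("--test-type", "--ignore-certificate-errors");

        using (IWebDriver driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl(args[0]);
            try
            {
                string src = WaitForSrc(driver, "tenvideo_video_player_0", TimeSpan.FromSeconds(30));
                Console.WriteLine(src);
            }
            catch (WebDriverTimeoutException)
            {
                Console.Error.WriteLine("The video element did not appear in time.");
            }
        }
    }
}
```

An explicit wait like this is generally more reliable than a fixed Thread.Sleep, since it returns as soon as the element is ready and fails with a clear exception when it never appears.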

Get the rendered page content with SetText(driver.PageSource);

 private void button2_Click(object sender, EventArgs e)
{
    // Path of the file that holds the saved page source
    string filePath = @"G:\conan\reptiles1688\bin\Debug\test.txt";
    using (FileStream fsRead = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    using (StreamReader reader = new StreamReader(fsRead, System.Text.Encoding.Default))
    {
        // ReadToEnd reads the whole stream; a single Stream.Read call may return fewer bytes than requested
        this.textBox1.Text = reader.ReadToEnd();
    }

    // Parse the saved HTML with HtmlAgilityPack
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(this.textBox1.Text);

    // GetElementbyId returns null when the element is absent, so guard before reading src
    HtmlNode node = doc.GetElementbyId("tenvideo_video_player_0");
    textBox1.Text = node?.Attributes["src"]?.Value ?? string.Empty;

    // var node = doc.DocumentNode.SelectNodes("//video[@id='tenvideo_video_player_0']//video");
    // textBox1.Text = (node[3].InnerHtml);
}

}

Parsing the rendered source this way gives us the video address we were after.
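The parsing step can also be tried on its own, without a browser or a saved file. The sketch below assumes the HtmlAgilityPack NuGet package; the class SrcExtractor, the helper ExtractSrc, and the sample markup with its example.com address are all made up for illustration and are not taken from the Tencent Video page.

```csharp
using System;
using HtmlAgilityPack;

public static class SrcExtractor
{
    // Hypothetical helper (not from the original post): returns the src attribute
    // of the element with the given id, or null if the element or attribute is missing.
    public static string ExtractSrc(string html, string elementId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // GetElementbyId returns null when no element carries that id.
        HtmlNode node = doc.GetElementbyId(elementId);
        if (node == null)
        {
            return null;
        }

        // GetAttributeValue falls back to the supplied default when src is absent.
        string src = node.GetAttributeValue("src", string.Empty);
        return src.Length == 0 ? null : src;
    }

    public static void Main()
    {
        // Made-up markup standing in for the rendered page source.
        string html =
            "<html><body>" +
            "<video id=\"tenvideo_video_player_0\" src=\"https://example.com/sample.mp4\"></video>" +
            "</body></html>";

        Console.WriteLine(ExtractSrc(html, "tenvideo_video_player_0") ?? "(no src found)");
        Console.WriteLine(ExtractSrc(html, "missing_id") ?? "(no src found)");
    }
}
```

Returning null for a missing element or attribute lets the caller decide how to react, rather than failing with a NullReferenceException when the page layout changes.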
