Controlling an already-open browser with selenium

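Today I got a requirement to drive an already-open browser and run automation scripts against it. The idea is to have webdriver attach to the browser through its remote debugging port. The unusual part is that the browser is a non-standard one based on Chromium, so the browser driver (chromedriver) has to be specified manually. I'm sharing this here in the hope that it helps anyone who hits the same pitfall later. Also, a question: can puppeteer do the same thing?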

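const webdriver = require('selenium-webdriver')
const chrome = require('selenium-webdriver/chrome')
const path = require('path')

async function main() {
    // Attach to a browser that is already running instead of launching a new one.
    // The browser must have been started with --remote-debugging-port=8800.
    const options = new chrome.Options()
    // options_ is an internal field; selenium-webdriver 4 also exposes options.debuggerAddress(...).
    options.options_["debuggerAddress"] = "127.0.0.1:8800";

    // Use a local chromedriver whose version matches the browser's Chromium version.
    const service = new chrome.ServiceBuilder(path.join(__dirname, './chromedriver.exe'))

    const driver = new webdriver.Builder()
        .forBrowser('chrome')
        .setChromeService(service)
        .setChromeOptions(options)
        .build()

    await driver.get('https://www.baidu.com');
}

// Report failures (for example, nothing listening on the debug port) instead of leaving an unhandled rejection.
main().catch((err) => console.error(err))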

spnt #1 • 6 years ago

puppeteer is a better option. One of my open-source projects is built with it: https://github.com/zuoyanart/sparender

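hyj1991 #2 • 6 years ago

@spnt That's an interesting approach, but how do you handle a CDN? A CDN can't filter out crawler requests and send them back to the origin, can it?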

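spnt #3 • 6 years ago

@hyj1991 I just checked Qiniu, and it doesn't have this feature. In other words, if the front-end code is deployed to Qiniu or a similar CDN, there is no way to render pages differently for crawlers.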