Browser Automation: Pyppeteer and Undetected ChromeDriver
- Author: Shelton Ma
Pyppeteer
1. Environment Setup: Install Chromium
# If the automatic Chromium download is blocked, copy chromium_588429.zip from /data
# and extract it to /home/{{user}}/.local/share/pyppeteer/local-chromium/588429/ (replace {{user}} with your username)
unzip chromium_588429.zip -d /home/{{user}}/.local/share/pyppeteer/local-chromium/588429/
# Verify installation
import pyppeteer
pyppeteer.executablePath()
# Output: '/home/{{user}}/.local/share/pyppeteer/local-chromium/588429/chrome-linux/chrome'
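The check above can also be done without importing pyppeteer. The following is a minimal sketch that rebuilds the path shown in the output and tests whether the binary is there. The helper name `expected_chromium_path` is hypothetical, and the directory layout is taken from the path above rather than from pyppeteer itself.

```python
from pathlib import Path


def expected_chromium_path(home: Path, revision: str = "588429") -> Path:
    """Build the Linux path where the manually extracted Chromium should live.

    Hypothetical helper: the layout mirrors the path printed by
    pyppeteer.executablePath() in the step above.
    """
    return (home / ".local" / "share" / "pyppeteer" / "local-chromium"
            / revision / "chrome-linux" / "chrome")


if __name__ == "__main__":
    path = expected_chromium_path(Path.home())
    # is_file() is False when the zip was extracted into the wrong directory
    print(path, "found" if path.is_file() else "missing")
```

If the script reports "missing", the zip was most likely extracted one level too high or too low.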
2. Configuration
- Headless Detection: Headless Chrome detection and anti-detection
- Additional Resources:
Undetected ChromeDriver
1. Environment Setup: Custom Environment
# Download a specific chromedriver version
wget https://chromedriver.storage.googleapis.com/112.0.5615.49/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
# No directory needs to be specified for chromedriver; by default it is used from:
# ~/.local/share/undetected_chromedriver/chromedriver
# Download the browser
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
# `dpkg -i google-chrome-stable_current_amd64.deb` would install it as /usr/bin/google-chrome
# Extract it into a custom directory
dpkg -x google-chrome-stable_current_amd64.deb ~/.local/share/undetected_chromedriver/
# Chrome path:
# ~/.local/share/undetected_chromedriver/opt/google/chrome/google-chrome
# Check the versions
~/.local/share/undetected_chromedriver/opt/google/chrome/google-chrome --version
~/.local/share/undetected_chromedriver/chromedriver --version
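The two version checks matter because the chromedriver and Chrome major versions need to agree. A small sketch of that comparison follows. The function names are hypothetical, and the sample strings in the usage note are illustrative of typical `--version` output, not captured from a real run.

```python
import re


def major_version(version_output: str) -> int:
    """Extract the major version from a `--version` output line."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """Return True when browser and driver share the same major version."""
    return major_version(chrome_output) == major_version(driver_output)
```

For example, `versions_match("Google Chrome 112.0.5615.49", "ChromeDriver 112.0.5615.49")` returns `True`, while a 113 browser paired with a 112 driver returns `False`.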
2. Code Example
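import os
import undetected_chromedriver as uc

# Use the options class that ships with undetected_chromedriver
options = uc.ChromeOptions()
options.add_argument('--no-sandbox')
# get_jp_intranet_proxies() is the author's own helper (not shown here); it should return the proxy address
options.add_argument('--proxy-server={}'.format(get_jp_intranet_proxies()))
options.add_argument('--ignore-ssl-errors=true')
# Python does not expand '~' in paths, so expand it explicitly
browser_path = os.path.expanduser('~/.local/share/undetected_chromedriver/opt/google/chrome/google-chrome')
driver = uc.Chrome(options=options, browser_executable_path=browser_path, headless=True)
driver.get(url)  # url: the page to load, defined elsewhere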
3. Error Handling
Issue: Chromedriver Deletion
Symptom:
[Errno 2] No such file or directory: '~/.local/share/undetected_chromedriver/undetected/chromedriver' -> '~/.local/share/undetected_chromedriver/undetected_chromedriver'
Cause: uc.Chrome() downloads the latest chromedriver and deletes it during driver.quit() (see __del__ in undetected_chromedriver/patcher.py). Multi-process usage may cause conflicts.
Solution: Use a fixed driver_executable_path to avoid dynamic downloads.
Download a Matching Chromedriver:
cd ~/.local/share/undetected_chromedriver
wget https://chromedriver.storage.googleapis.com/112.0.5615.49/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
# Keep versioned files for coexistence
cp chromedriver chromedriver_112
Specify Driver and Browser Paths:
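import os
# Python does not expand '~' in paths, so expand it explicitly
browser_path = os.path.expanduser('~/.local/share/undetected_chromedriver/opt/google/chrome/google-chrome')
driver_path = os.path.expanduser('~/.local/share/undetected_chromedriver/chromedriver_112')
# version_main=112 matches the major version of the fixed driver
driver = uc.Chrome(options=options, driver_executable_path=driver_path, browser_executable_path=browser_path, headless=True, version_main=112)
driver.get(url)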
What Is Undetected ChromeDriver?
Key Mechanism: undetected_chromedriver/patcher.py patches the chromedriver binary through its self.patch_exe() method, so that the driver no longer exposes the markers that detection scripts look for.
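To make the idea of binary patching concrete, here is a toy sketch that rewrites a marker inside a byte string while keeping the length unchanged. It is not the code from patcher.py. The marker pattern (`cdc_` followed by 22 alphanumeric characters) and the function name `patch_bytes` are assumptions chosen for illustration.

```python
import random
import re
import string

# Assumed marker shape for illustration only: "cdc_" plus 22 alphanumerics
MARKER = re.compile(rb"cdc_[a-zA-Z0-9]{22}")


def patch_bytes(blob: bytes) -> bytes:
    """Replace every marker with random letters of the same length.

    Keeping the length identical leaves all other offsets in the
    binary untouched, which is what makes in-place patching safe.
    """
    def replacement(match: "re.Match[bytes]") -> bytes:
        letters = random.choices(string.ascii_letters, k=len(match.group(0)))
        return "".join(letters).encode("ascii")

    return MARKER.sub(replacement, blob)
```

A real patcher would read the executable in binary mode, apply a substitution like this, and write the result back to the same file.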