TL;DR
- Selenium WebDriver automates real browsers (Chrome, Firefox, Safari, Edge) for testing web applications
- Start with Python — simpler syntax, faster feedback loop for beginners
- Master locators (ID, CSS, XPath) and explicit waits before writing complex tests
- Use Page Object Model from day one — refactoring later is painful
Best for: QA engineers starting browser automation, developers writing E2E tests
Skip if: You only need API testing (use Postman) or already know Playwright/Cypress
Read time: 15 minutes
Your first Selenium test will probably fail. Not because Selenium is broken, but because web automation has timing issues you haven’t encountered before. Elements load asynchronously. Buttons become clickable after JavaScript executes. Forms validate on blur.
This tutorial teaches you Selenium the right way — handling these real-world challenges from the start.
What is Selenium WebDriver?
Selenium WebDriver is a browser automation tool that controls real browsers programmatically. Unlike tools that simulate browsers, Selenium drives actual Chrome, Firefox, Safari, and Edge instances.
Core components:
- WebDriver API — language bindings (Python, Java, C#, JavaScript, Ruby)
- Browser drivers — ChromeDriver, GeckoDriver, SafariDriver
- Selenium Grid — distributed test execution across machines
Selenium is free, open-source, and maintained by a large community. It’s the foundation that tools like Appium (mobile) and Selenide (Java wrapper) build upon.
Environment Setup
Python Setup
# Install Selenium
pip install selenium
# Install WebDriver Manager (auto-downloads drivers; optional on Selenium 4.6+, which bundles Selenium Manager)
pip install webdriver-manager
# test_first.py
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
# Automatic driver management
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get("https://example.com")
print(f"Page title: {driver.title}")
driver.quit()
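One thing the script above leaves open is cleanup: if `driver.get()` raises, `driver.quit()` never runs and the browser process lingers. Here is a minimal sketch of one way to guarantee cleanup. The helper name `managed_driver` is hypothetical, and it accepts any zero-argument factory (for example `lambda: webdriver.Chrome()`). `FakeDriver` is only a stand-in so the sketch runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always call quit() afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on success and on exceptions, so no browser is left behind
        driver.quit()


class FakeDriver:
    """Stand-in for a real WebDriver; records whether quit() was called."""

    def __init__(self):
        self.title = "Example Domain"
        self.current_url = None
        self.quit_called = False

    def get(self, url):
        self.current_url = url

    def quit(self):
        self.quit_called = True


if __name__ == "__main__":
    with managed_driver(FakeDriver) as driver:
        driver.get("https://example.com")
        print(f"Page title: {driver.title}")
    print(f"quit() called: {driver.quit_called}")
```

With a real browser, the same pattern would presumably be used as `with managed_driver(lambda: webdriver.Chrome()) as driver:`.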
Java Setup
<!-- pom.xml -->
<dependencies>
    <dependency>
        <groupId>org.seleniumhq.selenium</groupId>
        <artifactId>selenium-java</artifactId>
        <version>4.18.1</version>
    </dependency>
    <dependency>
        <groupId>io.github.bonigarcia</groupId>
        <artifactId>webdrivermanager</artifactId>
        <version>5.7.0</version>
    </dependency>
</dependencies>
// FirstTest.java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import io.github.bonigarcia.wdm.WebDriverManager;
public class FirstTest {
    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        WebDriver driver = new ChromeDriver();
        driver.get("https://example.com");
        System.out.println("Page title: " + driver.getTitle());
        driver.quit();
    }
}
Locator Strategies
Locators find elements on the page. Choose the right strategy for maintainable tests.
Priority Order (Best to Worst)
| Strategy | Example | When to Use |
|---|---|---|
| ID | #login-button | Unique IDs (best option) |
| Name | [name="email"] | Form inputs |
| CSS Selector | .btn.primary | Most elements |
| Link Text | Sign In | Links with stable text |
| XPath | //div[@class='card'] | Complex relationships |
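The priority order in the table can be sketched as a small helper. This is an illustration only and not part of Selenium: `choose_locator` and its attribute dictionary are hypothetical. The strategy strings it returns are the values Selenium's `By` constants hold (`By.ID` is `"id"`, `By.CSS_SELECTOR` is `"css selector"`, and so on).

```python
def choose_locator(tag, attrs, link_text=None):
    """Return a (strategy, value) tuple following the table's priority order."""
    if attrs.get("id"):
        return ("id", attrs["id"])
    if attrs.get("name"):
        return ("name", attrs["name"])
    if attrs.get("data-testid"):
        return ("css selector", f"[data-testid='{attrs['data-testid']}']")
    if attrs.get("class"):
        # Join multiple classes into a compound selector such as button.btn.primary
        classes = ".".join(attrs["class"].split())
        return ("css selector", f"{tag}.{classes}")
    if tag == "a" and link_text:
        return ("link text", link_text)
    # Last resort: a tag-only XPath, which is rarely specific enough on its own
    return ("xpath", f"//{tag}")


if __name__ == "__main__":
    print(choose_locator("button", {"id": "login-button", "class": "btn primary"}))
    print(choose_locator("button", {"class": "btn primary"}))
```

The returned tuple can be unpacked straight into a lookup, as in `driver.find_element(*choose_locator("input", {"name": "email"}))`.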
Python Examples
from selenium.webdriver.common.by import By
# By ID (fastest, most reliable)
driver.find_element(By.ID, "username")
# By CSS Selector (flexible, readable)
driver.find_element(By.CSS_SELECTOR, "button.submit-btn")
driver.find_element(By.CSS_SELECTOR, "[data-testid='login']")
# By XPath (when CSS fails)
driver.find_element(By.XPATH, "//button[contains(text(), 'Submit')]")
driver.find_element(By.XPATH, "//div[@class='form']//input[@type='email']")
# By Link Text
driver.find_element(By.LINK_TEXT, "Forgot Password?")
driver.find_element(By.PARTIAL_LINK_TEXT, "Forgot")
Java Examples
import org.openqa.selenium.By;
// By ID
driver.findElement(By.id("username"));
// By CSS Selector
driver.findElement(By.cssSelector("button.submit-btn"));
driver.findElement(By.cssSelector("[data-testid='login']"));
// By XPath
driver.findElement(By.xpath("//button[contains(text(), 'Submit')]"));
// By Link Text
driver.findElement(By.linkText("Forgot Password?"));
CSS vs XPath Decision
Use CSS when:
- Selecting by class, ID, or attribute
- Element has data-testid or similar test attribute
- Performance matters (CSS is faster)
Use XPath when:
- Finding element by text content
- Navigating to parent elements
- Complex conditions (and, or, contains)
# CSS: cleaner for attributes
driver.find_element(By.CSS_SELECTOR, "[data-testid='submit-btn']")
# XPath: necessary for text matching
driver.find_element(By.XPATH, "//button[text()='Submit Order']")
# XPath: parent navigation (CSS can't do this)
driver.find_element(By.XPATH, "//span[text()='Error']/parent::div")
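Text matching in XPath gets awkward when the text itself contains quotes, because XPath 1.0 string literals have no escape character. Below is a minimal sketch of the usual `concat()` workaround. The helper names `xpath_literal` and `button_with_text` are hypothetical and not part of Selenium.

```python
def xpath_literal(text):
    """Quote text as an XPath 1.0 string literal."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types present: split on single quotes and rejoin the
    # pieces inside concat(), with each single quote wrapped in double quotes
    parts = text.split("'")
    pieces = []
    for i, part in enumerate(parts):
        if part:
            pieces.append(f"'{part}'")
        if i < len(parts) - 1:
            pieces.append('"\'"')
    return "concat(" + ", ".join(pieces) + ")"


def button_with_text(text):
    """Build an XPath that matches a button by its exact text."""
    return f"//button[text()={xpath_literal(text)}]"


if __name__ == "__main__":
    print(button_with_text("Submit Order"))
    print(button_with_text("Don't save"))
```

The result would be passed to `driver.find_element(By.XPATH, button_with_text("Don't save"))`.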
Waits: The Most Important Concept
Most flaky Selenium tests fail because of timing: elements aren’t ready when your code tries to interact with them.
Types of Waits
| Type | How It Works | When to Use |
|---|---|---|
| Implicit | Polls DOM for N seconds | Never (global, hides real issues) |
| Explicit | Waits for specific condition | Always (precise, readable) |
| Fluent | Explicit + custom polling | Slow-loading elements |
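To make the table concrete, here is a rough sketch of the polling loop behind an explicit wait. It is a simplified illustration, not Selenium's actual `WebDriverWait` implementation. `wait_until`, its parameters, and the `FakePage` stand-in are hypothetical names chosen for this example.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Stand-in for a page whose element appears on the third DOM check."""

    def __init__(self):
        self.checks = 0

    def find_banner(self):
        self.checks += 1
        return "banner" if self.checks >= 3 else None


if __name__ == "__main__":
    page = FakePage()
    element = wait_until(page.find_banner, timeout=2.0, poll_interval=0.05)
    print(element, "found after", page.checks, "checks")
```

In these terms, a fluent wait is the same loop with a non-default `poll_interval`, and the wait returns as soon as the condition holds instead of always sleeping for the full timeout.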
Explicit Waits (Python)
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
wait = WebDriverWait(driver, 10) # Max 10 seconds
# Wait for element to be clickable
button = wait.until(EC.element_to_be_clickable((By.ID, "submit")))
button.click()
# Wait for element to be visible
message = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, "success")))
# Wait for element to disappear
wait.until(EC.invisibility_of_element_located((By.ID, "loading")))
# Wait for text to appear
wait.until(EC.text_to_be_present_in_element((By.ID, "status"), "Complete"))
# Custom condition
wait.until(lambda d: len(d.find_elements(By.CLASS_NAME, "item")) > 5)
Explicit Waits (Java)
import org.openqa.selenium.support.ui.WebDriverWait;
import org.openqa.selenium.support.ui.ExpectedConditions;
import java.time.Duration;
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
// Wait for element to be clickable
WebElement button = wait.until(
    ExpectedConditions.elementToBeClickable(By.id("submit"))
);
button.click();
// Wait for visibility
WebElement message = wait.until(
    ExpectedConditions.visibilityOfElementLocated(By.className("success"))
);
// Wait for element to disappear
wait.until(ExpectedConditions.invisibilityOfElementLocated(By.id("loading")));
Common Wait Conditions
from selenium.webdriver.support import expected_conditions as EC
# Element state
EC.presence_of_element_located((By.ID, "elem")) # In DOM
EC.visibility_of_element_located((By.ID, "elem")) # Visible
EC.element_to_be_clickable((By.ID, "elem")) # Clickable
EC.invisibility_of_element_located((By.ID, "elem")) # Hidden/gone
# Page state
EC.title_contains("Dashboard")
EC.url_contains("/dashboard")
EC.alert_is_present()
# Multiple elements
EC.presence_of_all_elements_located((By.CLASS_NAME, "item"))
Complete Test Example
Login Test (Python)
# tests/test_login.py
import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
class TestLogin:
    def setup_method(self):
        self.driver = webdriver.Chrome(
            service=Service(ChromeDriverManager().install())
        )
        self.driver.implicitly_wait(0)  # Disable implicit waits
        self.wait = WebDriverWait(self.driver, 10)

    def teardown_method(self):
        self.driver.quit()

    def test_successful_login(self):
        self.driver.get("https://example.com/login")
        # Fill login form
        email_input = self.wait.until(
            EC.visibility_of_element_located((By.ID, "email"))
        )
        email_input.send_keys("user@example.com")
        password_input = self.driver.find_element(By.ID, "password")
        password_input.send_keys("password123")
        # Submit form
        submit_btn = self.driver.find_element(By.CSS_SELECTOR, "[type='submit']")
        submit_btn.click()
        # Verify redirect to dashboard
        self.wait.until(EC.url_contains("/dashboard"))
        welcome_message = self.wait.until(
            EC.visibility_of_element_located((By.CLASS_NAME, "welcome"))
        )
        assert "Welcome" in welcome_message.text

    def test_invalid_credentials(self):
        self.driver.get("https://example.com/login")
        self.wait.until(
            EC.visibility_of_element_located((By.ID, "email"))
        ).send_keys("invalid@example.com")
        self.driver.find_element(By.ID, "password").send_keys("wrongpass")
        self.driver.find_element(By.CSS_SELECTOR, "[type='submit']").click()
        error_message = self.wait.until(
            EC.visibility_of_element_located((By.CLASS_NAME, "error"))
        )
        assert "Invalid credentials" in error_message.text
Login Test (Java)
// src/test/java/LoginTest.java
import org.junit.jupiter.api.*;
import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.*;
import io.github.bonigarcia.wdm.WebDriverManager;
import java.time.Duration;
import static org.junit.jupiter.api.Assertions.*;
class LoginTest {
    private WebDriver driver;
    private WebDriverWait wait;

    @BeforeEach
    void setup() {
        WebDriverManager.chromedriver().setup();
        driver = new ChromeDriver();
        wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    }

    @AfterEach
    void teardown() {
        driver.quit();
    }

    @Test
    void testSuccessfulLogin() {
        driver.get("https://example.com/login");
        WebElement emailInput = wait.until(
            ExpectedConditions.visibilityOfElementLocated(By.id("email"))
        );
        emailInput.sendKeys("user@example.com");
        driver.findElement(By.id("password")).sendKeys("password123");
        driver.findElement(By.cssSelector("[type='submit']")).click();
        wait.until(ExpectedConditions.urlContains("/dashboard"));
        WebElement welcomeMessage = wait.until(
            ExpectedConditions.visibilityOfElementLocated(By.className("welcome"))
        );
        assertTrue(welcomeMessage.getText().contains("Welcome"));
    }
}
Page Object Model
Page Object Model (POM) separates page structure from test logic. Each page becomes a class with elements and actions.
Python Page Object
# pages/login_page.py
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
class LoginPage:
    URL = "https://example.com/login"
    # Locators
    EMAIL_INPUT = (By.ID, "email")
    PASSWORD_INPUT = (By.ID, "password")
    SUBMIT_BUTTON = (By.CSS_SELECTOR, "[type='submit']")
    ERROR_MESSAGE = (By.CLASS_NAME, "error")

    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    def open(self):
        self.driver.get(self.URL)
        self.wait.until(EC.visibility_of_element_located(self.EMAIL_INPUT))
        return self

    def login(self, email: str, password: str):
        self.driver.find_element(*self.EMAIL_INPUT).send_keys(email)
        self.driver.find_element(*self.PASSWORD_INPUT).send_keys(password)
        self.driver.find_element(*self.SUBMIT_BUTTON).click()

    def get_error_message(self) -> str:
        error = self.wait.until(
            EC.visibility_of_element_located(self.ERROR_MESSAGE)
        )
        return error.text
# pages/dashboard_page.py
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
class DashboardPage:
    WELCOME_MESSAGE = (By.CLASS_NAME, "welcome")
    USER_MENU = (By.ID, "user-menu")
    LOGOUT_LINK = (By.LINK_TEXT, "Logout")

    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    def is_loaded(self) -> bool:
        self.wait.until(EC.url_contains("/dashboard"))
        return True

    def get_welcome_text(self) -> str:
        message = self.wait.until(
            EC.visibility_of_element_located(self.WELCOME_MESSAGE)
        )
        return message.text

    def logout(self):
        self.driver.find_element(*self.USER_MENU).click()
        self.wait.until(
            EC.element_to_be_clickable(self.LOGOUT_LINK)
        ).click()
# tests/test_login_pom.py
import pytest
from pages.login_page import LoginPage
from pages.dashboard_page import DashboardPage
class TestLoginPOM:
    def test_successful_login(self, driver):
        login_page = LoginPage(driver)
        login_page.open()
        login_page.login("user@example.com", "password123")
        dashboard = DashboardPage(driver)
        assert dashboard.is_loaded()
        assert "Welcome" in dashboard.get_welcome_text()

    def test_invalid_login(self, driver):
        login_page = LoginPage(driver)
        login_page.open()
        login_page.login("invalid@example.com", "wrongpass")
        assert "Invalid credentials" in login_page.get_error_message()
Java Page Object
// pages/LoginPage.java
import org.openqa.selenium.*;
import org.openqa.selenium.support.ui.*;
import java.time.Duration;
public class LoginPage {
    private final WebDriver driver;
    private final WebDriverWait wait;
    private static final String URL = "https://example.com/login";
    // Locators
    private final By emailInput = By.id("email");
    private final By passwordInput = By.id("password");
    private final By submitButton = By.cssSelector("[type='submit']");
    private final By errorMessage = By.className("error");

    public LoginPage(WebDriver driver) {
        this.driver = driver;
        this.wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    }

    public LoginPage open() {
        driver.get(URL);
        wait.until(ExpectedConditions.visibilityOfElementLocated(emailInput));
        return this;
    }

    public void login(String email, String password) {
        driver.findElement(emailInput).sendKeys(email);
        driver.findElement(passwordInput).sendKeys(password);
        driver.findElement(submitButton).click();
    }

    public String getErrorMessage() {
        return wait.until(
            ExpectedConditions.visibilityOfElementLocated(errorMessage)
        ).getText();
    }
}
Handling Common Scenarios
Dropdowns
from selenium.webdriver.support.ui import Select
# Standard HTML select
dropdown = Select(driver.find_element(By.ID, "country"))
dropdown.select_by_visible_text("United States")
dropdown.select_by_value("us")
dropdown.select_by_index(1)
# Custom dropdown (React, Vue, etc.)
driver.find_element(By.CSS_SELECTOR, ".dropdown-trigger").click()
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".dropdown-menu")))
driver.find_element(By.XPATH, "//li[text()='United States']").click()
Alerts
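# Wait for the alert, read its text, then accept it
alert = wait.until(EC.alert_is_present())
alert_text = alert.text
alert.accept()
# Or dismiss it instead (an alert can be accepted or dismissed only once)
# alert.dismiss()
# Type into a prompt() dialog, then confirm
prompt = wait.until(EC.alert_is_present())
prompt.send_keys("My input")
prompt.accept()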
Frames and Windows
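# Switch to iframe by its name or id attribute
driver.switch_to.frame("iframe-name")
# ...or by WebElement (an alternative to the line above, not a second step)
driver.switch_to.frame(driver.find_element(By.ID, "my-iframe"))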
# Switch back to main content
driver.switch_to.default_content()
# Handle new window/tab
original_window = driver.current_window_handle
driver.find_element(By.LINK_TEXT, "Open New Tab").click()
wait.until(EC.number_of_windows_to_be(2))
for handle in driver.window_handles:
    if handle != original_window:
        driver.switch_to.window(handle)
        break
# Do something in new window
driver.close()
driver.switch_to.window(original_window)
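The loop above assumes exactly two windows are open. When more tabs may already exist, comparing the handle lists from before and after the click is a sturdier way to find the new one. A small sketch, where `find_new_window` is a hypothetical helper that works on plain lists of handle strings:

```python
def find_new_window(handles_before, handles_after):
    """Return the single handle that appears only in handles_after."""
    known = set(handles_before)
    new_handles = [h for h in handles_after if h not in known]
    if len(new_handles) != 1:
        raise ValueError(
            f"Expected exactly one new window, found {len(new_handles)}"
        )
    return new_handles[0]


if __name__ == "__main__":
    before = ["main", "docs"]
    after = ["main", "docs", "popup"]
    print(find_new_window(before, after))
```

In a test this would presumably read: capture `before = driver.window_handles`, click the link, then call `driver.switch_to.window(find_new_window(before, driver.window_handles))`.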
Screenshots
# Screenshot of the visible viewport (not the full scrollable page)
driver.save_screenshot("screenshot.png")
# Element screenshot
element = driver.find_element(By.ID, "chart")
element.screenshot("chart.png")
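# Screenshot on failure (pytest, in conftest.py)
import pytest
from datetime import datetime
from selenium import webdriver

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    # Attach each phase's report to the test item so fixtures can read it
    outcome = yield
    report = outcome.get_result()
    setattr(item, f"rep_{report.when}", report)

@pytest.fixture
def driver(request):
    d = webdriver.Chrome()
    yield d
    report = getattr(request.node, "rep_call", None)
    if report is not None and report.failed:
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        d.save_screenshot(f"failure_{request.node.name}_{stamp}.png")
    d.quit()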
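Failure screenshots are easier to collect in CI when they land in one directory with predictable, filesystem-safe names. ISO timestamps contain colons, which Windows does not allow in filenames, and pytest test IDs contain `/` and `::`. A small sketch, assuming a `screenshots/` directory (the path the workflow below uploads). `screenshot_path` is a hypothetical helper, not a pytest or Selenium API.

```python
import re
from datetime import datetime
from pathlib import Path


def screenshot_path(test_id, now=None, directory="screenshots"):
    """Build a filesystem-safe path for a failure screenshot."""
    now = now or datetime.now()
    # Replace runs of anything outside letters, digits, dot, dash, underscore
    safe_name = re.sub(r"[^A-Za-z0-9._-]+", "_", test_id).strip("_")
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return Path(directory) / f"{safe_name}_{stamp}.png"


if __name__ == "__main__":
    print(screenshot_path("tests/test_login.py::TestLogin::test_invalid_credentials"))
```

Inside a fixture, one plausible use is `path = screenshot_path(request.node.nodeid)`, then `path.parent.mkdir(exist_ok=True)` and `driver.save_screenshot(str(path))`.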
Headless Mode and CI/CD
Headless Browser
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("--headless=new")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--window-size=1920,1080")
driver = webdriver.Chrome(options=options)
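Hard-coding headless mode makes local debugging harder, since you can no longer watch the browser. One option is to add the CI-only flags only when a `CI` environment variable is set, which GitHub Actions does by default. A minimal sketch, where `chrome_arguments` is a hypothetical helper and the flag list is taken from the snippet above:

```python
import os


def chrome_arguments(env=None):
    """Return Chrome command-line flags, adding CI-only flags when CI is set."""
    env = os.environ if env is None else env
    args = ["--window-size=1920,1080"]
    if env.get("CI", "").lower() in ("1", "true"):
        # CI runners and containers have no display and limited /dev/shm
        args += ["--headless=new", "--no-sandbox", "--disable-dev-shm-usage"]
    return args


if __name__ == "__main__":
    print(chrome_arguments({"CI": "true"}))
    print(chrome_arguments({}))
```

Each flag would then be applied with `options.add_argument(arg)` in a loop over `chrome_arguments()` before creating the driver.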
GitHub Actions Integration
# .github/workflows/selenium-tests.yml
name: Selenium Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install selenium pytest webdriver-manager
      - name: Run tests
        run: pytest tests/ -v --tb=short
      - name: Upload screenshots
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: screenshots
          path: screenshots/
AI-Assisted Selenium Development
AI tools can accelerate Selenium test development when used correctly.
What AI does well:
- Generating locator strategies from HTML snippets
- Converting manual test steps to Selenium code
- Writing Page Object boilerplate
- Explaining cryptic error messages
- Suggesting wait strategies for specific scenarios
What still needs humans:
- Deciding which tests to automate
- Debugging timing-related flakiness
- Choosing between locator strategies for ambiguous elements
- Understanding application-specific behavior
Useful prompt:
I have this HTML form:
<form id="login">
<input type="email" name="email" placeholder="Email">
<input type="password" name="password">
<button type="submit">Sign In</button>
</form>
Write a Selenium Python test that:
1. Fills in email and password
2. Submits the form
3. Waits for redirect to /dashboard
4. Verifies a welcome message appears
Use explicit waits and proper locators.
FAQ
Is Selenium hard to learn?
Selenium basics take 2-4 weeks to learn. The WebDriver API is straightforward — find_element(), click(), send_keys(). The challenge is mastering waits, choosing stable locators, and structuring tests with Page Object Model. Practice with a real application, not just tutorials.
Which language is best for Selenium?
Python for beginners — simple syntax, quick feedback, excellent documentation. Java for enterprise projects with existing Java infrastructure and CI/CD pipelines. JavaScript if your team already uses Node.js. All languages have mature Selenium support.
Is Selenium still relevant in 2026?
Yes. Selenium remains the industry standard with the largest community, most tutorials, and broadest browser support. While Playwright and Cypress offer better developer experience, Selenium integrates with more tools, supports more browsers, and has more enterprise adoption.
What is the difference between Selenium and Playwright?
Selenium controls browsers through the W3C WebDriver protocol. Playwright talks to Chromium over the Chrome DevTools Protocol and to its bundled Firefox and WebKit builds over its own protocols. Playwright has better auto-wait, built-in assertions, a trace viewer, and parallel execution. Selenium has broader real-browser support (including Safari itself rather than a WebKit build), a larger community, and more learning resources. For new projects with a Chromium focus, Playwright is often the better choice. For existing projects or Safari/legacy browser needs, Selenium remains solid.
When to Choose Selenium
Choose Selenium when:
- Team has existing Selenium expertise
- Need Safari or legacy browser support
- Corporate environment with Selenium infrastructure
- Want maximum community support and resources
- Using Appium for mobile (same API)
Consider alternatives when:
- Starting fresh with Chromium-only needs (Playwright)
- Component testing with JavaScript framework (Cypress)
- Prefer auto-wait and simpler API (Playwright)
- Need video recording and trace debugging (Playwright)
See Also
- Playwright Comprehensive Guide - Modern alternative with better developer experience
- Cypress Tutorial - JavaScript-focused E2E testing
- Selenium WebDriver 2025 - Deep dive into advanced Selenium features
- TestNG vs JUnit5 - Test framework comparison for Java Selenium projects
