What Is Selenium WebDriver?
Selenium WebDriver is a long-established, widely used browser automation tool. It provides a programming interface for controlling web browsers through the W3C WebDriver protocol.
Selenium Architecture
Test Code (Java/Python/JS/C#)
↓
WebDriver API
↓
Browser Driver (ChromeDriver, GeckoDriver)
↓
Browser (Chrome, Firefox, Safari, Edge)
Your test code calls the WebDriver API, which sends commands to the browser-specific driver, which controls the actual browser.
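To make the layering concrete, the sketch below builds the kind of HTTP requests a language binding sends to a browser driver. The endpoint paths and payload shapes follow the W3C WebDriver specification; the base URL (ChromeDriver's default port on localhost), the session id, and the function names are assumptions for illustration. Nothing is sent over the network here.

```python
import json

# Assumed address of a locally running driver (9515 is ChromeDriver's default port).
BASE_URL = "http://localhost:9515"


def new_session_request(browser_name):
    """Build the request that asks the driver to start a browser session."""
    body = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return ("POST", f"{BASE_URL}/session", json.dumps(body))


def navigate_request(session_id, url):
    """Build the request behind a call such as driver.get(url)."""
    return ("POST", f"{BASE_URL}/session/{session_id}/url", json.dumps({"url": url}))


def find_element_request(session_id, css_selector):
    """Build the request behind a CSS-selector element lookup."""
    body = {"using": "css selector", "value": css_selector}
    return ("POST", f"{BASE_URL}/session/{session_id}/element", json.dumps(body))


if __name__ == "__main__":
    # "abc123" is a made-up session id; a real one comes back from the new-session call.
    for request in (
        new_session_request("chrome"),
        navigate_request("abc123", "https://app.example.com/login"),
        find_element_request("abc123", "#email"),
    ):
        print(*request)
```

In practice the WebDriver API builds and sends these requests for you; the sketch only shows what travels between the test code and the driver.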
Setting Up Selenium
JavaScript (WebdriverIO)
npm init -y
npm install webdriverio @wdio/cli @wdio/mocha-framework
npx wdio config
Java (Maven)
<dependencies>
  <dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>4.18.0</version>
  </dependency>
  <dependency>
    <groupId>org.testng</groupId>
    <artifactId>testng</artifactId>
    <version>7.9.0</version>
  </dependency>
</dependencies>
Python
pip install selenium pytest
Writing Your First Test
JavaScript (WebdriverIO)
describe('Login Page', () => {
  it('should login with valid credentials', async () => {
    // Relative paths are resolved against baseUrl in wdio.conf.js
    await browser.url('/login');
    await $('#email').setValue('admin@test.com');
    await $('#password').setValue('secret123');
    await $('button[type="submit"]').click();
    // toHaveUrl compares the full URL, so match on the path fragment
    await expect(browser).toHaveUrl(expect.stringContaining('/dashboard'));
    await expect($('.welcome')).toHaveText('Welcome, Admin');
  });
});
Java
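import static org.testng.Assert.assertEquals;

import java.time.Duration;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.BeforeMethod;
import org.testng.annotations.Test;

public class LoginTest {
    WebDriver driver;

    @BeforeMethod
    public void setup() {
        driver = new ChromeDriver();
        driver.manage().window().maximize();
    }

    @Test
    public void testValidLogin() {
        driver.get("https://app.example.com/login");
        driver.findElement(By.id("email")).sendKeys("admin@test.com");
        driver.findElement(By.id("password")).sendKeys("secret123");
        driver.findElement(By.cssSelector("button[type='submit']")).click();

        // Wait for the redirect before reading the welcome message
        WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
        wait.until(ExpectedConditions.urlContains("/dashboard"));

        String welcome = driver.findElement(By.className("welcome")).getText();
        // TestNG argument order is (actual, expected)
        assertEquals(welcome, "Welcome, Admin");
    }

    @AfterMethod
    public void teardown() {
        // Guard against a failed setup leaving driver unset
        if (driver != null) {
            driver.quit();
        }
    }
}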
Python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


def test_valid_login():
    driver = webdriver.Chrome()
    try:
        driver.get("https://app.example.com/login")
        driver.find_element(By.ID, "email").send_keys("admin@test.com")
        driver.find_element(By.ID, "password").send_keys("secret123")
        driver.find_element(By.CSS_SELECTOR, "button[type='submit']").click()

        # Wait for the redirect before reading the welcome message
        wait = WebDriverWait(driver, 10)
        wait.until(EC.url_contains("/dashboard"))

        welcome = driver.find_element(By.CLASS_NAME, "welcome").text
        assert welcome == "Welcome, Admin"
    finally:
        # Close the browser even if an assertion above fails
        driver.quit()
Locator Strategies
| Strategy | Example (Java) | Reliability |
|---|---|---|
| By.id | By.id("email") | High (if unique) |
| By.cssSelector | By.cssSelector("[data-testid='email']") | High |
| By.xpath | By.xpath("//input[@name='email']") | Medium |
| By.name | By.name("email") | Medium |
| By.className | By.className("input-email") | Low |
| By.tagName | By.tagName("input") | Very low |
| By.linkText | By.linkText("Sign In") | Medium |
Best Locator Practices
- Prefer data-testid attributes: [data-testid="login-submit"]
- Use CSS selectors over XPath when possible: they tend to be faster and easier to read
- Avoid absolute XPath: /html/body/div[3]/form/input[2] breaks easily
- Avoid classes used for styling: .btn-primary may change during a redesign
- Use relative XPath when needed: //button[contains(text(), 'Submit')]
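The practices above can be sketched as two small helpers. Both function names are hypothetical and are not part of Selenium. The "css selector" strategy string is the value behind By.CSS_SELECTOR in the Python bindings, so the returned pair can be unpacked into find_element.

```python
def by_test_id(test_id):
    """Return a (strategy, selector) pair that targets a data-testid attribute."""
    if not test_id or '"' in test_id:
        raise ValueError("test_id must be non-empty and must not contain double quotes")
    return ("css selector", f'[data-testid="{test_id}"]')


def is_absolute_xpath(xpath):
    """Flag XPath anchored at the document root, such as /html/body/div[3]/..."""
    return xpath.startswith("/") and not xpath.startswith("//")


if __name__ == "__main__":
    print(by_test_id("login-submit"))
    print(is_absolute_xpath("/html/body/div[3]/form/input[2]"))
    print(is_absolute_xpath("//button[contains(text(), 'Submit')]"))
```

With a real driver, the first helper would be used as driver.find_element(*by_test_id("login-submit")). The second could run in a lint step to catch brittle locators before they reach the test suite.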
Waits
Implicit Wait
driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
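Sets a global timeout for all element lookups on this driver. It is simple, but it can mask timing issues, and the Selenium documentation warns against mixing implicit and explicit waits because the combined wait time becomes unpredictable.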
Explicit Wait (Recommended)
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));

// Wait for element to be visible
WebElement element = wait.until(
    ExpectedConditions.visibilityOfElementLocated(By.id("dashboard"))
);

// Wait for element to be clickable
wait.until(ExpectedConditions.elementToBeClickable(By.id("submit")));

// Wait for text to appear
wait.until(ExpectedConditions.textToBePresentInElementLocated(
    By.className("status"), "Complete"
));

// Wait for URL change
wait.until(ExpectedConditions.urlContains("/dashboard"));
Fluent Wait
// NoSuchElementException must be org.openqa.selenium.NoSuchElementException,
// not the java.util class of the same name
Wait<WebDriver> fluentWait = new FluentWait<>(driver)
    .withTimeout(Duration.ofSeconds(30))
    .pollingEvery(Duration.ofMillis(500))
    .ignoring(NoSuchElementException.class);

WebElement element = fluentWait.until(d -> d.findElement(By.id("dynamic-element")));
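Explicit and fluent waits share one idea: poll a condition until it returns something truthy, optionally swallow selected exceptions, and give up at a deadline. The pure-Python sketch below illustrates that loop. It is not Selenium's implementation, and the names wait_until and WaitTimeout are hypothetical.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy before the deadline."""


def wait_until(condition, timeout=10.0, poll_interval=0.5, ignored=()):
    """Call condition() repeatedly until it returns a truthy value.

    Exceptions listed in `ignored` count as "not ready yet", which mirrors
    what ignoring(NoSuchElementException.class) does for a fluent wait.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # Treat as "not ready" and keep polling
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


if __name__ == "__main__":
    attempts = {"count": 0}

    def element_appears_on_third_poll():
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise LookupError("element not in DOM yet")
        return "dynamic-element"

    found = wait_until(element_appears_on_third_poll, timeout=5,
                       poll_interval=0.01, ignored=(LookupError,))
    print(found, "after", attempts["count"], "polls")
```

The loop returns as soon as the condition holds, which is why a wait of this kind is usually faster and more reliable than a fixed sleep of the same maximum duration.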
Advanced Interactions
Actions API
Actions actions = new Actions(driver);
// Hover over element
actions.moveToElement(menuItem).perform();
// Drag and drop
actions.dragAndDrop(source, target).perform();
// Right-click
actions.contextClick(element).perform();
// Double-click
actions.doubleClick(element).perform();
// Keyboard shortcuts
actions.keyDown(Keys.CONTROL).click(link).keyUp(Keys.CONTROL).perform();
Handling Dropdowns
Select dropdown = new Select(driver.findElement(By.id("country")));
dropdown.selectByVisibleText("United States");
dropdown.selectByValue("US");
dropdown.selectByIndex(5);
Handling Alerts
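Alert alert = driver.switchTo().alert();
String alertText = alert.getText();
// For a prompt dialog, type before closing it
alert.sendKeys("input text");
// Close with exactly one of the calls below; the alert no longer exists afterwards
alert.accept();     // Click OK
// alert.dismiss(); // Click Cancel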
Handling Frames and Windows
// Switch to iframe
driver.switchTo().frame("frame-name");
driver.switchTo().frame(0);          // by index
driver.switchTo().defaultContent();  // back to main page

// Handle new window/tab
String originalWindow = driver.getWindowHandle();
// ... action that opens new window
for (String handle : driver.getWindowHandles()) {
    if (!handle.equals(originalWindow)) {
        driver.switchTo().window(handle);
        break;
    }
}
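The window-switching loop above reduces to picking the first handle that is not the original one. A minimal Python sketch of that selection step follows. The function name is hypothetical and the handle strings are made up.

```python
def find_new_window(original_handle, all_handles):
    """Return the first handle that differs from the original, or None if there is none."""
    for handle in all_handles:
        if handle != original_handle:
            return handle
    return None


if __name__ == "__main__":
    # Made-up handle values; real ones are opaque strings supplied by the driver.
    print(find_new_window("win-1", ["win-1", "win-2"]))
    print(find_new_window("win-1", ["win-1"]))
```

With the Python bindings, the inputs would come from driver.current_window_handle and driver.window_handles, and the result would be passed to driver.switch_to.window(...). Checking for None first avoids switching when no new window has opened yet.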
Screenshots
// FileUtils comes from Apache Commons IO (org.apache.commons.io.FileUtils), a separate dependency
File screenshot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
FileUtils.copyFile(screenshot, new File("screenshots/test-failure.png"));
JavaScript Execution
JavascriptExecutor js = (JavascriptExecutor) driver;
js.executeScript("window.scrollTo(0, document.body.scrollHeight)");
js.executeScript("arguments[0].click()", hiddenButton);
String title = (String) js.executeScript("return document.title");
Selenium Best Practices
- Always use explicit waits — never Thread.sleep()
- Quit the driver in teardown — prevent browser zombie processes
- Use the Page Object Model — separate test logic from page details
- Prefer CSS selectors — faster and more readable than XPath
- Run in headless mode for CI: options.addArguments("--headless")
- Set reasonable timeouts: 10-30 seconds for explicit waits
- Handle stale element exceptions — re-locate elements when the DOM changes
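As one illustration of the Page Object Model item above, here is a minimal sketch of a login page object. The locators mirror the login example earlier in this guide. The class names, the base_url parameter, and the RecordingDriver stand-in (which lets the sketch run without a browser) are assumptions for illustration. The strategy strings "id" and "css selector" are the values behind By.ID and By.CSS_SELECTOR in the Python bindings.

```python
class LoginPage:
    """Page object: keeps locators and page actions out of the test body."""

    EMAIL = ("id", "email")
    PASSWORD = ("id", "password")
    SUBMIT = ("css selector", "button[type='submit']")

    def __init__(self, driver, base_url):
        self.driver = driver
        self.base_url = base_url

    def open(self):
        self.driver.get(f"{self.base_url}/login")
        return self

    def login(self, email, password):
        self.driver.find_element(*self.EMAIL).send_keys(email)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.SUBMIT).click()


class RecordingElement:
    """Stand-in element that logs interactions instead of touching a page."""

    def __init__(self, log, locator):
        self.log = log
        self.locator = locator

    def send_keys(self, text):
        self.log.append(("send_keys", self.locator, text))

    def click(self):
        self.log.append(("click", self.locator))


class RecordingDriver:
    """Stand-in for a real WebDriver; it only records the calls it receives."""

    def __init__(self):
        self.log = []

    def get(self, url):
        self.log.append(("get", url))

    def find_element(self, by, value):
        return RecordingElement(self.log, (by, value))


if __name__ == "__main__":
    driver = RecordingDriver()
    LoginPage(driver, "https://app.example.com").open().login("admin@test.com", "secret123")
    for entry in driver.log:
        print(entry)
```

A test written against LoginPage reads as open-then-login, and a changed locator is updated in one place. Passing a real webdriver.Chrome() instance instead of the stand-in should work without changes to the page object, since it uses only get and find_element.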
Exercise: Build a Selenium Test Suite
Create a Selenium test suite for a web application:
- Set up a project with Selenium + your preferred language
- Write a BasePage class with common methods (navigate, wait, screenshot)
- Create page objects for Login, Dashboard, and Settings pages
- Write 5 test cases covering login, navigation, form submission, dropdown selection, and logout
- Add explicit waits for all dynamic elements
- Run tests in both headed and headless modes
Key Takeaways
- Selenium WebDriver controls browsers via the W3C WebDriver protocol
- Use explicit waits (WebDriverWait) instead of Thread.sleep() for reliable tests
- CSS selectors and data-testid attributes are the best locator strategies
- The Actions API handles complex interactions (hover, drag, keyboard)
- Always use Page Object Model for maintainable test code
- Selenium provides official bindings for Java, Python, JavaScript, C#, and Ruby; Kotlin can use the Java bindings