Browser & Automation MCP Servers (Playwright, Puppeteer for Agents)

Browser automation MCP servers — Playwright, Puppeteer, Selenium integrations that let AI agents browse the web, test UIs, and extract data.

18 min read
Updated February 25, 2026
By MCP Server Spot

Browser automation MCP servers give AI agents the ability to interact with the web -- navigating websites, filling forms, clicking buttons, extracting data, and testing user interfaces. Built on battle-tested frameworks like Playwright and Puppeteer, these servers transform AI assistants from text-only tools into web-capable agents that can accomplish tasks across the internet.

This guide covers everything about browser automation MCP servers: how they work, which options are available, how to set them up, and the workflows they enable for web scraping, testing, and autonomous agent tasks.

How Browser Automation MCP Servers Work

Browser automation MCP servers operate by running a real web browser (Chromium, Firefox, or WebKit) and exposing browser control operations as MCP tools. When an AI assistant needs to interact with a web page, it calls these tools to:

  1. Navigate to a URL
  2. Read the page content (as an accessibility tree or structured text)
  3. Interact with elements (click, type, select)
  4. Capture screenshots or page state
  5. Extract structured data from the page

The key innovation is how pages are represented to the AI. Rather than sending raw HTML (which is verbose and hard to reason about), browser MCP servers typically convert the page into an accessibility tree -- a simplified representation that mirrors how screen readers see the page, with semantic labels for interactive elements.

Page: https://example.com/login
[1] heading "Sign In"
[2] textbox "Email address"
[3] textbox "Password" (password)
[4] checkbox "Remember me"
[5] button "Sign In"
[6] link "Forgot password?"

The AI then references elements by their index numbers: "Click element [5]" or "Type into element [2]".
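To make the indexing concrete, here is a small, hypothetical sketch of how a server might store those nodes and resolve a reference like "element [5]" (the node list mirrors the login page above; a real server derives it from the browser's accessibility snapshot):

```python
# Hypothetical sketch: indexed accessibility nodes for the login page above.
nodes = [
    (1, "heading", "Sign In"),
    (2, "textbox", "Email address"),
    (3, "textbox", "Password"),
    (4, "checkbox", "Remember me"),
    (5, "button", "Sign In"),
    (6, "link", "Forgot password?"),
]

def render_tree(nodes):
    """Produce the compact listing that is sent to the model."""
    return "\n".join(f'[{i}] {role} "{name}"' for i, role, name in nodes)

def resolve(nodes, index):
    """Map an index from the model (e.g. 'click element [5]') back to a node."""
    for i, role, name in nodes:
        if i == index:
            return role, name
    raise KeyError(f"no element [{index}] on this page")
```

So "Click element [5]" resolves to the "Sign In" button before the server issues the real click.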

Playwright MCP Server

Playwright is the most popular browser automation framework for MCP, maintained by Microsoft. The Playwright MCP server provides comprehensive browser control.

Installation and Setup

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp-server"],
      "env": {
        "PLAYWRIGHT_HEADLESS": "true"
      }
    }
  }
}

For headed mode (visible browser window for debugging):

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp-server", "--headed"]
    }
  }
}

Available Tools

| Tool | Description |
| --- | --- |
| navigate | Navigate to a URL |
| screenshot | Take a screenshot of the current page |
| click | Click an element by index or selector |
| type | Type text into an input element |
| fill | Fill a form field (clears existing content first) |
| select | Select an option from a dropdown |
| hover | Hover over an element |
| scroll | Scroll the page or a specific element |
| get_text | Get the text content of the page or element |
| get_accessibility_tree | Get the accessibility tree of the page |
| evaluate | Execute JavaScript in the page context |
| wait_for | Wait for an element or condition |
| go_back | Navigate back in browser history |
| go_forward | Navigate forward |
| new_tab | Open a new browser tab |
| close_tab | Close the current tab |
| list_tabs | List all open tabs |

Playwright Configuration Options

| Option | Description | Default |
| --- | --- | --- |
| --headed | Show browser window | Headless |
| --browser chromium\|firefox\|webkit | Browser engine | Chromium |
| --viewport 1280x720 | Browser viewport size | 1280x720 |
| --user-data-dir /path | Persistent browser profile | Temporary |
| --no-sandbox | Disable sandbox (Linux containers) | Sandbox enabled |
| --allowed-domains example.com,*.test.com | Restrict navigation | All domains |

Example: Web Research Workflow

User: "Research the latest MCP server releases on GitHub"

Claude's workflow:
1. navigate("https://github.com/modelcontextprotocol/servers")
2. get_accessibility_tree() — understand page structure
3. click(releases_link) — navigate to releases
4. get_text() — extract release information
5. navigate("https://github.com/topics/mcp-server")
6. get_text() — find popular MCP server repositories
7. Compile findings into a structured summary

Example: Form Automation

User: "Fill out the contact form on our staging site
       with test data"

Claude's workflow:
1. navigate("https://staging.example.com/contact")
2. get_accessibility_tree() — identify form fields
3. fill([2], "Test User") — name field
4. fill([3], "test@example.com") — email field
5. fill([4], "This is a test submission") — message field
6. screenshot() — capture filled form for review
7. (Await user confirmation before submitting)
8. click([5]) — submit button
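Under the hood, the server drives the same operations exposed by Playwright's Python API. The form workflow above as a standalone sketch (the URL and selectors are placeholders for your staging site):

```python
def fill_contact_form(url="https://staging.example.com/contact"):
    # Placeholder URL and selectors; adapt them to your staging site.
    # Imported inside the function so the sketch reads without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.fill('input[name="name"]', "Test User")
        page.fill('input[name="email"]', "test@example.com")
        page.fill('textarea[name="message"]', "This is a test submission")
        page.screenshot(path="filled-form.png")  # capture for human review
        # Deliberately stops before page.click('button[type="submit"]'):
        # submission should wait for explicit confirmation.
        browser.close()
```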

Puppeteer MCP Server

Puppeteer, maintained by Google, provides Chrome/Chromium-specific browser automation:

Setup

{
  "mcpServers": {
    "puppeteer": {
      "command": "npx",
      "args": ["-y", "mcp-server-puppeteer"],
      "env": {
        "PUPPETEER_HEADLESS": "true"
      }
    }
  }
}

Key Tools

| Tool | Description |
| --- | --- |
| puppeteer_navigate | Navigate to a URL |
| puppeteer_screenshot | Take a screenshot |
| puppeteer_click | Click an element (CSS selector) |
| puppeteer_fill | Fill a form field |
| puppeteer_evaluate | Execute JavaScript |
| puppeteer_select | Select dropdown option |
| puppeteer_hover | Hover over element |

When to Choose Puppeteer Over Playwright

| Consideration | Playwright | Puppeteer |
| --- | --- | --- |
| Browser support | Chromium, Firefox, WebKit | Chrome/Chromium only |
| Multi-browser testing | Excellent | Not applicable |
| Resource usage | Higher (multi-engine) | Lower (single engine) |
| Chrome DevTools Protocol | Via bridge | Native support |
| Mobile emulation | Full emulation profiles | Basic emulation |
| API complexity | Higher (more features) | Simpler API |
| Best for | Cross-browser testing, complex workflows | Chrome-specific tasks, lighter usage |

Specialized Browser MCP Servers

Beyond the main frameworks, several specialized browser MCP servers exist for specific use cases:

Browserbase MCP Server

Browserbase provides cloud-hosted browser instances for scalable web automation:

{
  "mcpServers": {
    "browserbase": {
      "command": "npx",
      "args": ["-y", "mcp-server-browserbase"],
      "env": {
        "BROWSERBASE_API_KEY": "your_api_key",
        "BROWSERBASE_PROJECT_ID": "your_project_id"
      }
    }
  }
}

Advantages:

  • No local browser installation required
  • Scalable to many concurrent sessions
  • Built-in proxy and anti-detection features
  • Session recording and replay

Firecrawl MCP Server

Firecrawl specializes in web scraping and content extraction:

{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "mcp-server-firecrawl"],
      "env": {
        "FIRECRAWL_API_KEY": "your_api_key"
      }
    }
  }
}

Key Features:

  • Crawl entire websites with depth control
  • Convert pages to clean Markdown
  • Extract structured data with AI
  • Handle JavaScript-rendered content
  • Respect robots.txt and rate limits

Fetch/HTTP MCP Server

For simpler HTTP requests without full browser rendering:

{
  "mcpServers": {
    "fetch": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-fetch"]
    }
  }
}

Use Cases:

  • Fetching API responses
  • Reading static web pages
  • Downloading files
  • Checking URL availability

Browser Automation Comparison

| Feature | Playwright MCP | Puppeteer MCP | Browserbase | Firecrawl | Fetch |
| --- | --- | --- | --- | --- | --- |
| Type | Full browser | Full browser | Cloud browser | Web scraper | HTTP client |
| JavaScript | Full rendering | Full rendering | Full rendering | Full rendering | No rendering |
| Interaction | Full (click, type, etc.) | Full | Full | Limited | None |
| Screenshots | Yes | Yes | Yes | No | No |
| Multi-browser | Yes | Chrome only | Chrome | N/A | N/A |
| Concurrency | Local instances | Local instances | Cloud-scaled | API-scaled | Lightweight |
| Best for | Testing, complex automation | Chrome-specific tasks | Scale, cloud | Content extraction | Simple fetches |

Use Case: AI-Powered Web Testing

Browser MCP servers enable powerful testing workflows when combined with AI capabilities.

Exploratory Testing

The AI autonomously explores a web application, looking for bugs:

User: "Explore our e-commerce staging site and report any issues"

Claude's workflow:
1. navigate("https://staging.example.com")
2. get_accessibility_tree() — understand the homepage
3. Test navigation: click through main menu items
4. Test search: fill search box, verify results
5. Test product pages: click products, check images, prices
6. Test cart: add items, verify quantities, check totals
7. Test forms: fill contact form, validate error messages
8. screenshot() at each step — document findings
9. Compile a test report with issues found

Accessibility Testing

User: "Test our website for accessibility issues"

Claude's workflow:
1. navigate("https://example.com")
2. get_accessibility_tree() — analyze semantic structure
3. Inject the axe-core script, then evaluate("axe.run()") — run accessibility audit
4. Check for missing alt text, improper heading hierarchy,
   color contrast issues, keyboard navigation
5. Navigate through pages and test interactive elements
6. Generate an accessibility report with WCAG compliance status
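The axe audit step assumes axe-core has been loaded into the page first. A sketch of the inject-and-run sequence with Playwright's Python API (the CDN URL is an assumption; pin and self-host axe in real use):

```python
def audit_page(url="https://example.com"):
    # Imported inside the function so the sketch reads without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        # Assumption: loading axe-core from a CDN; self-host a pinned copy in real use.
        page.add_script_tag(url="https://cdn.jsdelivr.net/npm/axe-core/axe.min.js")
        # evaluate awaits the promise that axe.run() returns
        results = page.evaluate("() => axe.run()")
        browser.close()
        return results["violations"]
```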

Visual Regression Testing

User: "Compare the production and staging versions of our homepage"

Claude's workflow:
1. navigate("https://example.com") → screenshot("production.png")
2. navigate("https://staging.example.com") → screenshot("staging.png")
3. Compare screenshots and identify visual differences
4. Report layout shifts, missing elements, or style changes
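The comparison in step 3 reduces to a pixel-level diff. A minimal sketch of the core logic, operating on flat pixel sequences (in practice you would decode the two PNGs with an image library and pass in their pixel data):

```python
def diff_pixels(baseline, candidate):
    """Return the indices at which two equal-length pixel sequences differ.

    An empty result means the screenshots are pixel-identical; a long
    result suggests a layout shift or style change worth reporting.
    """
    if len(baseline) != len(candidate):
        raise ValueError("screenshots have different dimensions")
    return [i for i, (a, b) in enumerate(zip(baseline, candidate)) if a != b]
```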

Use Case: Web Data Extraction

Structured Data Extraction

User: "Extract all product listings from this category page"

Claude's workflow:
1. navigate(url)
2. get_accessibility_tree() — understand page structure
3. evaluate("document.querySelectorAll('.product-card')") — find products
4. For each product: extract name, price, rating, availability
5. Handle pagination: click "Next" and repeat
6. Return structured data as JSON/CSV
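Values scraped in step 4 usually need normalizing before they become clean JSON or CSV. A small helper for the price field (the "$1,299.99" format is an assumption about the target site):

```python
import re

def parse_price(raw):
    """Convert a scraped price string like '$1,299.99' to a float.

    Returns None when no numeric price is present (e.g. 'Out of stock').
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if not match:
        return None
    return float(match.group().replace(",", ""))
```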

Competitive Intelligence

User: "Check competitor pricing for these 5 products"

Claude's workflow:
1. For each competitor website:
   a. navigate(product_url)
   b. get_text() — extract pricing information
   c. screenshot() — capture for reference
2. Compile comparison table with prices across competitors
3. Highlight significant price differences

Use Case: AI Agent Web Workflows

Browser MCP servers are essential building blocks for AI agents that need to interact with the web.

Multi-Step Web Workflows

User: "Book a meeting room for tomorrow 2-3pm through our
       internal booking system"

Claude's workflow:
1. navigate("https://rooms.company.com")
2. Authenticate (using pre-saved session)
3. Select tomorrow's date
4. Find available rooms for 2-3pm
5. Select the best option
6. Fill booking details
7. screenshot() — confirm before submitting
8. Submit the booking (with user approval)

Form Filling and Submission

AI agents can handle repetitive form-filling tasks:

  • Expense report submissions
  • IT ticket creation across multiple portals
  • Data entry into web-based systems
  • Survey completion for testing purposes

Security Best Practices

Domain Restrictions

Restrict which domains the browser can access:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "-y", "@playwright/mcp-server",
        "--allowed-domains", "example.com,*.example.com,staging.example.com"
      ]
    }
  }
}

Authentication Safety

  • Never pass passwords through AI conversations
  • Use pre-authenticated browser profiles with saved sessions
  • Implement session timeout and re-authentication flows
  • Store cookies and auth tokens securely

A pre-authenticated profile is configured by pointing the server at a persistent user data directory:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "-y", "@playwright/mcp-server",
        "--user-data-dir", "/path/to/authenticated-profile"
      ]
    }
  }
}

Content Safety

  • Filter responses for sensitive data (credit card numbers, SSNs)
  • Block navigation to known malicious domains
  • Disable file downloads by default
  • Monitor and log all URLs accessed

Resource Isolation

  • Run browser instances in containers or sandboxes
  • Limit CPU and memory allocation per instance
  • Close idle browser sessions automatically
  • Use separate browser profiles for different security contexts

Performance Optimization

Headless Mode

Always use headless mode in production for better performance:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp-server", "--headless"]
    }
  }
}

Headless mode avoids rendering a visible window, which typically reduces memory and CPU usage substantially compared to headed mode.

Page Load Optimization

  • Block unnecessary resources (images, fonts, analytics) when only extracting text
  • Use wait_for with specific conditions rather than fixed timeouts
  • Reuse browser instances across multiple page navigations
  • Close tabs when done to free memory
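Resource blocking from the first bullet can be sketched with Playwright's request-routing API (the set of blocked resource types is a choice, not a requirement):

```python
# Resource types to skip when only the text content matters (a choice, tune as needed).
BLOCKED_TYPES = {"image", "font", "media", "stylesheet"}

def should_block(resource_type):
    """Decide whether a request is worth loading when only text is needed."""
    return resource_type in BLOCKED_TYPES

def open_text_only_page(browser):
    """Open a page that aborts requests for heavy resources before navigating."""
    page = browser.new_page()
    page.route(
        "**/*",
        lambda route: route.abort()
        if should_block(route.request.resource_type)
        else route.continue_(),
    )
    return page
```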

Caching Strategies

  • Cache frequently accessed pages to reduce browser invocations
  • Store extracted data to avoid re-scraping unchanged pages
  • Use conditional navigation (check if data changed before full scrape)
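A cache of extracted data can be as small as a dictionary with timestamps. A sketch (the 15-minute TTL is arbitrary; the clock is injectable so the behavior is testable):

```python
import time

class PageCache:
    """Tiny TTL cache keyed by URL, to avoid re-driving the browser."""

    def __init__(self, ttl_seconds=900, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable for testing
        self._store = {}

    def get(self, url):
        entry = self._store.get(url)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[url]  # expired, force a fresh scrape
            return None
        return value

    def put(self, url, value):
        self._store[url] = (self.clock(), value)
```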

Troubleshooting

Common Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| Browser fails to launch | Missing dependencies | Install Playwright browsers: npx playwright install |
| Page loads but elements not found | Content loaded dynamically | Use the wait_for tool to wait for elements |
| Screenshots are blank | Headless rendering issue | Try --headed mode for debugging |
| Timeout errors | Slow page load or network | Increase timeout settings |
| Memory errors | Too many browser instances | Close unused tabs and pages |
| Element not clickable | Overlapping elements | Use evaluate to scroll the element into view |

Debugging Tips

  1. Switch to headed mode to see what the browser is doing
  2. Take screenshots at each step to verify page state
  3. Use get_accessibility_tree to understand element indices
  4. Probe page state with evaluate (for example, evaluate("document.readyState"))
  5. Verify the page has fully loaded before interacting

Advanced Patterns

Multi-Tab Workflows

Browser MCP servers support multiple tabs for complex workflows:

User: "Compare the pricing pages of these three competitors"

Claude's workflow:
1. new_tab() → Tab 1
2. navigate("https://competitor1.com/pricing")
3. get_text() → extract pricing data
4. new_tab() → Tab 2
5. navigate("https://competitor2.com/pricing")
6. get_text() → extract pricing data
7. new_tab() → Tab 3
8. navigate("https://competitor3.com/pricing")
9. get_text() → extract pricing data
10. Compile comparison table from all three tabs

Persistent Browser Sessions

For workflows requiring authentication or state:

{
  "mcpServers": {
    "playwright-authenticated": {
      "command": "npx",
      "args": [
        "-y", "@playwright/mcp-server",
        "--user-data-dir", "/path/to/profile",
        "--headed"
      ]
    }
  }
}

Using a persistent user data directory means:

  • Login sessions persist between MCP server restarts
  • Cookies and local storage are preserved
  • Browser extensions remain installed
  • Form autofill data is available
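In Playwright's own Python API, the persistent-profile configuration above corresponds to a persistent context (a sketch; the profile path and URL are placeholders):

```python
def dump_dashboard(profile_dir="/path/to/profile"):
    # Imported inside the function so the sketch reads without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # launch_persistent_context reuses cookies, local storage, and sessions
        # stored in profile_dir across runs (the path is a placeholder).
        context = p.chromium.launch_persistent_context(profile_dir, headless=True)
        page = context.new_page()
        page.goto("https://app.example.com/dashboard")  # placeholder URL
        text = page.inner_text("body")
        context.close()
        return text
```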

Intercepting Network Requests

Advanced browser MCP servers can intercept and modify network requests:

Use Cases:
- Block analytics scripts for cleaner page content
- Mock API responses for testing
- Capture API calls to understand app behavior
- Modify request headers for authentication
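With Playwright, interception uses the same routing API; a sketch of mocking an API response for testing (the endpoint pattern and payload are placeholders):

```python
import json

# Placeholder payload for the mocked endpoint.
MOCK_PRICING = {"plans": [{"name": "Pro", "price": 49}]}

def mock_pricing_api(page):
    """Answer calls to the pricing endpoint with canned JSON instead of
    letting them reach the real backend."""
    page.route(
        "**/api/pricing",  # placeholder endpoint pattern
        lambda route: route.fulfill(
            status=200,
            content_type="application/json",
            body=json.dumps(MOCK_PRICING),
        ),
    )
```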

Integration with Other MCP Servers

Browser automation servers become most powerful when combined with other MCP servers:

Browser + Filesystem: Screenshot Documentation

User: "Take screenshots of all pages in our app for documentation"

Claude's workflow:
1. (Filesystem) Read sitemap.json or route configuration
2. For each page:
   a. (Browser) navigate(page_url)
   b. (Browser) wait_for(content_loaded)
   c. (Browser) screenshot(page_name.png)
3. (Filesystem) Write screenshots to docs/screenshots/
4. (Filesystem) Generate an index.md linking all screenshots

Browser + Database: Dynamic Content Testing

User: "Verify that the user dashboard displays correct data
       for test accounts"

Claude's workflow:
1. (Database) Query test account data
2. (Browser) Navigate to login page
3. (Browser) Log in as test user
4. (Browser) Navigate to dashboard
5. (Browser) Extract displayed values
6. Compare displayed values against database values
7. Report any discrepancies

Browser + GitHub: Visual Regression in CI

Automated visual regression workflow:
1. (GitHub) get_pull_request_files(pr) — identify changed components
2. (Browser) Navigate to production URL
3. (Browser) screenshot() — baseline screenshot
4. (Browser) Navigate to staging/preview URL
5. (Browser) screenshot() — new version screenshot
6. Compare screenshots, identify visual changes
7. (GitHub) create_pull_request_review() — report findings

Responsible Web Automation

Ethical Considerations

When using browser automation MCP servers for web interaction:

  1. Respect robots.txt: Check and follow site-specific automation policies
  2. Rate limiting: Do not overwhelm target websites with rapid requests
  3. Terms of service: Ensure your use case complies with the target site's ToS
  4. Data privacy: Be careful with personal data encountered during browsing
  5. Attribution: When extracting content, respect copyright and licensing

Legal Considerations

| Use Case | Generally Acceptable | Requires Caution |
| --- | --- | --- |
| Testing your own sites | Yes | N/A |
| Public data extraction | Usually | Check ToS |
| Price comparison | Depends on jurisdiction | Check competitor ToS |
| Academic research | Generally | IRB approval may be needed |
| Login to services you own | Yes | Use dedicated test accounts |
| Automated form submission | For your own services | Get permission for third-party sites |

Anti-Detection and Ethics

Some websites employ bot detection. Browser MCP servers can sometimes slip past it, so a few ground rules apply:

  • Do not bypass bot detection on sites where you are not authorized
  • Do use your own authenticated sessions for services you have legitimate access to
  • Do use browser automation for testing your own applications
  • Do not use browser automation for unauthorized scraping at scale

Building Custom Browser MCP Servers

For specialized browser automation needs:

import asyncio

from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent
from playwright.async_api import async_playwright

app = Server("custom-browser")

@app.list_tools()
async def list_tools():
    return [
        Tool(
            name="check_website_status",
            description="Check if a website is up and responding correctly",
            inputSchema={
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "URL to check"},
                    "expected_text": {
                        "type": "string",
                        "description": "Text expected on the page"
                    }
                },
                "required": ["url"]
            }
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "check_website_status":
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=True)
            page = await browser.new_page()

            try:
                response = await page.goto(
                    arguments["url"],
                    timeout=30000
                )
                # goto can return None (e.g. same-document navigation)
                status = response.status if response else "unknown"
                title = await page.title()
                text_found = True

                if "expected_text" in arguments:
                    content = await page.text_content("body") or ""
                    text_found = arguments["expected_text"] in content

                return [TextContent(
                    type="text",
                    text=f"Status: {status}\n"
                         f"Title: {title}\n"
                         f"Expected text found: {text_found}"
                )]
            except Exception as e:
                return [TextContent(
                    type="text",
                    text=f"Error: {e}"
                )]
            finally:
                await browser.close()
    raise ValueError(f"Unknown tool: {name}")

async def main():
    # Serve the tools over stdio so an MCP client can launch this script directly
    async with stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())

Monitoring and Health Check Workflows

Browser MCP servers enable powerful website monitoring use cases when combined with scheduling:

Uptime and Availability Monitoring

Scheduled every 15 minutes:

Agent workflow:
1. For each monitored URL:
   a. navigate(url)
   b. Check response status and page title
   c. Verify expected content is present
   d. Measure page load time
2. If any check fails:
   a. screenshot() — capture the error state
   b. Retry once after 30 seconds
   c. If still failing, trigger alert via Slack/email
3. Log results for trend analysis
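The pass/fail decision for each check is plain code once the browser has gathered the measurements. A sketch (the thresholds are assumptions to tune per site):

```python
def needs_alert(check, max_load_seconds=5.0):
    """Decide whether a single check result should trigger an alert.

    `check` is a dict the browser-driving agent fills in per URL:
    HTTP status, whether expected content was found, and load time.
    The 5-second threshold is an assumption to tune per site.
    """
    if check.get("status", 0) >= 400:
        return True
    if not check.get("content_ok", False):
        return True
    if check.get("load_seconds", 0.0) > max_load_seconds:
        return True
    return False
```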

Visual Monitoring for Content Changes

User: "Monitor our competitor's pricing page for changes"

Claude's workflow:
1. navigate(competitor_pricing_url)
2. get_text() — extract current pricing data
3. Compare against the last saved snapshot
4. If changes detected:
   a. screenshot() — capture new state
   b. Generate a diff report highlighting what changed
   c. Notify via Slack with the summary
5. Save current snapshot for future comparison
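The snapshot comparison in steps 3 and 4 needs only a stable fingerprint of the extracted text. A standard-library sketch (the snapshot path is a placeholder):

```python
import hashlib
from pathlib import Path

def text_fingerprint(text):
    """Hash extracted pricing text after collapsing whitespace, so trivial
    formatting changes do not count as 'changed'."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def has_changed(text, snapshot_file=Path("snapshots/pricing.sha256")):
    """Compare against the last saved fingerprint, then update the snapshot."""
    new = text_fingerprint(text)
    old = snapshot_file.read_text().strip() if snapshot_file.exists() else None
    snapshot_file.parent.mkdir(parents=True, exist_ok=True)
    snapshot_file.write_text(new)
    return old is not None and old != new
```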

SSL Certificate and Security Monitoring

User: "Check SSL certificates for all our domains"

Claude's workflow:
1. For each domain:
   a. navigate("https://domain.com")
   b. evaluate("location.protocol") to confirm the page is served over HTTPS
   c. Check for mixed content warnings
   d. Check which security headers are present
2. Generate a security report:
   - HTTPS status for each domain
   - Certificate expiration dates (gathered outside the browser; page JavaScript cannot read certificate details)
   - Security header compliance (HSTS, CSP, etc.)
   - Mixed content issues
   - Redirect chain analysis
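Certificate expiration is best checked outside the browser with a TLS-level tool, since page JavaScript cannot read certificate internals. A minimal sketch using Python's standard library (the hostname is a placeholder):

```python
import socket
import ssl
from datetime import datetime, timezone

def parse_cert_date(not_after):
    """Parse the certificate's 'notAfter' field, e.g. 'Jun  1 12:00:00 2026 GMT'."""
    return datetime.strptime(not_after, "%b %d %H:%M:%S %Y %Z").replace(
        tzinfo=timezone.utc
    )

def cert_days_remaining(hostname, port=443, timeout=10):
    """Connect over TLS and return the days until the certificate expires."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            not_after = tls.getpeercert()["notAfter"]
    return (parse_cert_date(not_after) - datetime.now(timezone.utc)).days
```

Called as, say, `cert_days_remaining("example.com")`, it feeds the expiration line of the security report.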

Browser Profile Management

Managing Multiple Authenticated Sessions

For workflows requiring access to multiple authenticated services:

{
  "mcpServers": {
    "playwright-internal": {
      "command": "npx",
      "args": [
        "-y", "@playwright/mcp-server",
        "--user-data-dir", "/profiles/internal-tools"
      ]
    },
    "playwright-external": {
      "command": "npx",
      "args": [
        "-y", "@playwright/mcp-server",
        "--user-data-dir", "/profiles/external-sites"
      ]
    }
  }
}

This separation ensures that internal authentication tokens are never exposed to external websites, and vice versa.

Cookie and Session Management Best Practices

| Practice | Description | Reason |
| --- | --- | --- |
| Use separate profiles per security context | Different user-data-dir paths | Prevents cookie leakage |
| Clear sessions periodically | Delete the profile and re-authenticate | Reduces stale-session risk |
| Never share profiles between users | Each user gets their own profile path | Maintains access isolation |
| Use headless for production | Only use headed mode for debugging | Reduces attack surface |
| Set session timeouts | Configure the browser to expire sessions | Limits the exposure window |

Scaling Browser Automation

Cloud-Based Scaling with Browserbase

For workflows requiring many concurrent browser sessions:

Architecture for scaled browser automation:

┌──────────────┐     ┌───────────────┐     ┌──────────────┐
│  AI Client   │────▶│  Browserbase  │────▶│  Cloud       │
│  (Claude)    │     │  MCP Server   │     │  Browsers    │
│              │     │               │     │  (100+ inst) │
└──────────────┘     └───────────────┘     └──────────────┘

Key advantages of cloud-based scaling:

  • No local resource constraints
  • Geographic distribution for location-specific testing
  • Built-in proxy rotation for scraping use cases
  • Session recording and replay for debugging
  • Automatic browser updates and patch management

Resource Management Guidelines

| Deployment Size | Concurrent Browsers | RAM Required | CPU Required |
| --- | --- | --- | --- |
| Single developer | 1-2 | 512 MB | 1 core |
| Small team | 3-5 | 2 GB | 2 cores |
| CI/CD pipeline | 5-10 | 4 GB | 4 cores |
| Production monitoring | 10-50 | 16 GB | 8 cores |
| Large-scale scraping | 50+ | Cloud (Browserbase) | Cloud |


Frequently Asked Questions

What is the Playwright MCP server?

The Playwright MCP server is a Model Context Protocol server that gives AI applications the ability to control web browsers through Playwright's automation framework. It exposes tools for navigating to URLs, clicking elements, filling forms, taking screenshots, extracting page content, and running automated test sequences. This enables AI agents to interact with web applications just as a human user would.

How do browser MCP servers differ from web scraping?

Browser MCP servers provide full browser automation — they render JavaScript, handle dynamic content, interact with SPAs, fill forms, and simulate user actions. Traditional web scraping typically only fetches and parses static HTML. Browser servers use real browser engines (Chromium, Firefox, WebKit) and can handle authentication flows, cookie management, and complex user interactions that simple HTTP requests cannot.

Can AI agents browse the internet through MCP?

Yes. Browser automation MCP servers give AI agents the ability to navigate websites, read page content, click links, fill out forms, and extract information. The AI receives structured representations of web pages (accessibility trees or simplified DOM) rather than raw HTML, making it easier to understand and interact with web content. However, you should implement guardrails to prevent accessing inappropriate or unauthorized content.

Is the Playwright MCP server safe to use?

The Playwright MCP server is safe when configured properly. It runs a real browser that can be sandboxed (headless mode, restricted network access, isolated profile). Key safety measures include: running in headless mode to prevent desktop interference, restricting navigation to allowed domains, disabling file downloads, and requiring user confirmation for sensitive actions like form submissions.

What is the difference between Playwright MCP and Puppeteer MCP?

Playwright MCP supports multiple browser engines (Chromium, Firefox, WebKit) and is maintained by Microsoft. Puppeteer MCP is specifically for Chrome/Chromium and is maintained by Google. Playwright generally offers more features (auto-waiting, better mobile emulation, parallel execution), while Puppeteer has a simpler API and lighter resource usage. Both provide similar core functionality for MCP-based browser automation.

Can I use browser MCP servers for automated testing?

Yes. Browser automation MCP servers are excellent for AI-assisted testing workflows. The AI can navigate your application, interact with UI elements, verify expected behavior, and report issues. Common patterns include exploratory testing (AI discovers and tests flows autonomously), regression testing (AI runs predefined scenarios), and accessibility testing (AI evaluates page accessibility).

How do browser MCP servers handle authentication?

Browser MCP servers handle authentication through several methods: (1) the AI fills in login forms with provided credentials, (2) pre-authenticated browser profiles with saved cookies/sessions, (3) injected authentication tokens or cookies before navigation, and (4) OAuth flows where the AI navigates the authorization process. For security, use pre-authenticated profiles rather than passing passwords through the AI.

What are the resource requirements for browser MCP servers?

Browser automation servers are more resource-intensive than most other MCP servers because they run actual browser instances. Expect each browser instance to use 100-300 MB of RAM. Headless mode uses fewer resources than headed mode. For servers that need multiple concurrent browser instances, allocate at least 512 MB RAM per instance, and close browser tabs/pages when not in use to free resources.
