
Python Script to Manage User Agents: A Complete Guide for Developers

Introduction

When developing web scraping, automation, or testing tools, one component you must handle carefully is the user agent string. It defines how your script identifies itself to a web server. Without proper management, your requests may be blocked, throttled, or flagged as unwanted bot traffic. This is why managing user agents efficiently matters.

In this article, we’ll explore why user agent management matters, how to use Python to rotate and customize user agents, and how to implement a script that automates this process effectively. By the end, you’ll be able to build your own Python-based user agent management system — ideal for developers, testers, and data analysts who work with HTTP requests or web automation.

What Is a User Agent?

A user agent is a string of text included in the HTTP request headers that identifies the client software — usually the browser, operating system, and sometimes device type.

Here’s a simple example:

User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0 Safari/537.36

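When you send requests with Python libraries such as requests or urllib, the default user agent openly identifies your script as an automated client. Selenium drives a real browser and so reports that browser’s string, though headless Chrome has historically announced itself as HeadlessChrome. For requests, the default looks like: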

python-requests/2.31.0

This tells any server that inspects the header that your traffic is automated, which can trigger rate limits or blocks. Managing your user agents, including rotating them where appropriate, makes your automation’s requests look more like ordinary browser traffic.
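
To see what a script announces by default, you can inspect the headers a client would send before any request is made. The sketch below uses the standard library’s urllib.request rather than requests so that it runs without extra packages or network access; the browser-style string is only an illustrative value.

```python
import urllib.request

# An opener carries the headers urllib attaches to every request by default.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)
print(default_headers["User-agent"])  # a "Python-urllib/<version>" string

# Illustrative browser-style value (an assumption, not a recommendation).
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/122.0 Safari/537.36"
)

# Building a Request object does not touch the network; it only prepares headers.
request = urllib.request.Request(
    "https://example.com", headers={"User-Agent": BROWSER_UA}
)

# urllib stores header names in the normalized form "User-agent".
print(request.get_header("User-agent"))
```

The same idea applies to requests: whatever you pass in headers replaces the library default for that call.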

Why Managing User Agents Matters

Proper user agent management provides several benefits, especially in scraping, testing, and automation.

  1. Avoid Blocking: Many websites reject or throttle requests that carry a default Python user agent, because the string openly identifies an automated client.
  2. Improve Data Accuracy: Servers often vary their responses by user agent, so a realistic string makes it more likely that you fetch the same content real users see.
  3. Enable Device-Specific Testing: You can simulate different browsers, devices, or platforms by changing user agents.
  4. Reduce Trackability: Rotating user agents makes automated requests harder to link together, although it does not hide other signals such as IP address or TLS fingerprint.
  5. Comply with Crawling Policies: Rules in robots.txt are grouped by user agent, so sending a consistent, honest identifier lets a site apply the rules it intends for your crawler.
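
Because robots.txt rules are keyed by user agent, the identifier you send determines which rules apply to you. The following is a minimal sketch using the standard library’s urllib.robotparser; the robots.txt content and the example-bot token are hypothetical, and the rules are parsed from a string so that no network access is needed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: example-bot
Disallow: /private/

User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The same URL can be allowed for one identifier and disallowed for another.
for agent in ("example-bot/1.0", "Mozilla/5.0"):
    for path in ("/private/data", "/admin/"):
        url = "https://example.com" + path
        print(agent, path, is_allowed(SAMPLE_ROBOTS_TXT, agent, url))
```

For a live site you would typically call set_url() and read() on the parser instead of parse(), then check each URL before fetching it.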

Popular Python Libraries for Managing User Agents

Python provides several libraries to simplify user agent management and rotation. Here are the most popular ones:

1. fake_useragent

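This library returns random, realistic browser user agent strings drawn from a dataset of real-world browsers. Depending on the version, the data ships with the package or is downloaded at runtime, so update the package periodically to keep the strings current.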

pip install fake-useragent

Example:

from fake_useragent import UserAgent
ua = UserAgent()
print(ua.random)

2. requests

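The requests library is a third-party package (installed with pip, not part of the standard library) that lets you set the user agent manually through the headers argument.

import requests

# Full browser-style string, matching the example shown earlier.
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0 Safari/537.36'}
# A timeout keeps the call from hanging indefinitely on an unresponsive server.
response = requests.get('https://example.com', headers=headers, timeout=10)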

3. selenium

For browser automation, Selenium enables setting custom user agents through browser options.

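from selenium import webdriver
options = webdriver.ChromeOptions()
# Chrome's --user-agent switch overrides the string the browser reports.
options.add_argument("--user-agent=Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)")
driver = webdriver.Chrome(options=options)
try:
    driver.get("https://example.com")
    print(driver.execute_script("return navigator.userAgent"))  # confirm the override
finally:
    driver.quit()  # always release the browser process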

4. random-user-agent or custom lists

You can create your own custom list of user agents and rotate them randomly to simulate different visitors.
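
A custom list needs nothing beyond the standard library. The sketch below shows two common selection strategies, random choice and round-robin; the strings in CUSTOM_AGENTS and the function names are illustrative assumptions, so substitute a list you maintain yourself.

```python
import itertools
import random

# Hypothetical pool; keep complete, current strings here in practice.
CUSTOM_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:123.0) Gecko/20100101 Firefox/123.0",
]


def random_agent(rng=random):
    """Pick one agent at random; pass a seeded random.Random for repeatable runs."""
    return rng.choice(CUSTOM_AGENTS)


def round_robin_agents():
    """Return an endless iterator that walks the pool in order."""
    return itertools.cycle(CUSTOM_AGENTS)


rotation = round_robin_agents()
for _ in range(4):
    # The fourth value wraps around to the first entry again.
    print(next(rotation))
print(random_agent())
```

Round-robin spreads requests evenly across the pool, while random choice avoids a predictable sequence; which one suits you depends on whether even coverage or unpredictability matters more.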

How to Create a Python Script to Manage User Agents

Let’s build a Python script that automatically rotates user agents for every request using the requests library and fake_useragent.

Step 1: Install Required Libraries

pip install requests fake-useragent

Step 2: Create a List of URLs or Tasks

Assume you’re scraping several pages or testing multiple endpoints:

urls = [
    "https://example.com",
    "https://httpbin.org/user-agent",
    "https://www.wikipedia.org"
]

Step 3: Implement User Agent Rotation

Use the UserAgent() object to get random user agents for each request.

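import time

import requests
from fake_useragent import UserAgent

ua = UserAgent()

# `urls` is the list defined in Step 2.
for url in urls:
    # Draw a fresh user agent string for every request.
    headers = {'User-Agent': ua.random}
    print(f"Using User Agent: {headers['User-Agent']}")
    response = requests.get(url, headers=headers, timeout=10)
    print(response.status_code, response.text[:120])
    time.sleep(2)  # pause between requests to avoid hammering the server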

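This script draws a new user agent string for each request, which removes the default python-requests signature. A different identity on every request is itself unusual for a real visitor, so treat per-request rotation as a starting point and tune it to the site you are working with.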

Step 4: Create a Fallback for Failures

Sometimes, fake_useragent might fail to fetch its database. In such cases, set a manual fallback list:

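import random

# Complete strings; truncated values are easy for servers to flag.
fallback_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Mobile Safari/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1"
]

def get_random_user_agent():
    try:
        return ua.random  # `ua` is the UserAgent instance from Step 3
    except Exception:  # a bare except would also swallow KeyboardInterrupt
        return random.choice(fallback_agents)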
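
The fallback above still assumes that importing fake_useragent and constructing UserAgent() both succeed. A more defensive variant, sketched below, guards those steps as well. The helper names and the two fallback strings are assumptions made for this example, and the code is written to run whether or not the package is installed.

```python
import random

# Illustrative fallback pool (assumed values; maintain your own list).
FALLBACK_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:123.0) Gecko/20100101 Firefox/123.0",
]


def build_generator():
    """Return a fake_useragent.UserAgent instance, or None if unavailable."""
    try:
        from fake_useragent import UserAgent
        return UserAgent()
    except Exception:  # missing package, or a data-loading error at construction
        return None


_generator = build_generator()


def get_random_user_agent(generator=_generator):
    """Prefer the library's random string; otherwise use the fallback pool."""
    if generator is not None:
        try:
            candidate = generator.random
            # Guard against empty or non-string results before using them.
            if isinstance(candidate, str) and candidate.strip():
                return candidate
        except Exception:
            pass  # fall through to the static list
    return random.choice(FALLBACK_AGENTS)


print(get_random_user_agent())
```

Passing the generator as a parameter also makes the function easy to exercise in tests, since you can hand it None or a stub object.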

Step 5: Add Logging

To analyze which user agents were used, log them for reference.

import logging

logging.basicConfig(filename="user_agents.log", level=logging.INFO)
# `headers` is the dictionary built for the current request.
logging.info("Using User Agent: %s", headers['User-Agent'])
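
Logging the URL and response status alongside the user agent makes it easier to spot which strings get blocked. The sketch below uses a dedicated logger and writes to an in-memory stream so it can be run as-is; the function names and the key=value message layout are assumptions, and for a log file you could attach a logging.FileHandler("user_agents.log") instead.

```python
import io
import logging


def make_agent_logger(stream=None, name="user_agents"):
    """Create (or reuse) a logger dedicated to user agent records."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    if not logger.handlers:  # avoid adding duplicate handlers on repeat calls
        handler = logging.StreamHandler(stream)  # stream=None means stderr
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def log_request(logger, url, user_agent, status=None):
    """Record one request as key=value pairs that are easy to search later."""
    logger.info("url=%s status=%s user_agent=%r", url, status, user_agent)


# Demonstration with an in-memory buffer instead of a file.
buffer = io.StringIO()
demo_logger = make_agent_logger(stream=buffer, name="user_agents_demo")
log_request(demo_logger, "https://example.com", "Mozilla/5.0 (X11; Linux x86_64)", 200)
print(buffer.getvalue().strip())
```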

Step 6: Combine Everything Into One Script

Here’s a complete example:

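import logging
import random
import time

import requests
from fake_useragent import UserAgent

# Complete strings used when fake_useragent cannot supply one.
fallback_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Mobile Safari/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1"
]

try:
    ua = UserAgent()
except Exception:
    ua = None  # fall back to the static list above

def get_user_agent():
    """Return a random user agent, falling back to the static list."""
    try:
        return ua.random
    except Exception:  # also covers ua being None
        return random.choice(fallback_agents)

def fetch_page(url):
    """Fetch one URL with a rotated user agent; return the body or None."""
    headers = {'User-Agent': get_user_agent()}
    logging.info("Using User Agent: %s", headers['User-Agent'])
    try:
        response = requests.get(url, headers=headers, timeout=10)
    except requests.RequestException as exc:
        logging.warning("Request to %s failed: %s", url, exc)
        return None
    print(f"Fetched {url} with status {response.status_code}")
    return response.text

if __name__ == "__main__":
    logging.basicConfig(filename="user_agents.log", level=logging.INFO)
    urls = ["https://httpbin.org/user-agent", "https://example.com"]
    for u in urls:
        fetch_page(u)
        time.sleep(1)  # brief pause between requests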

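This script can be adapted for scraping, load testing, or browser automation. Each request carries a realistic, randomly chosen user agent string (random draws can repeat, so the strings are not guaranteed to be unique), which reduces the risk of being blocked on the header alone.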

Managing User Agents with APIs or Databases

If you work on large-scale automation, you can manage user agents dynamically through APIs or databases.
You can store thousands of user agent strings in a local JSON or database file and fetch them randomly:

import json
import random

# user_agents.json is expected to hold a JSON array of user agent strings.
with open("user_agents.json", "r", encoding="utf-8") as f:
    agents = json.load(f)
agent = random.choice(agents)
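
With thousands of entries, it is worth cleaning the data as it loads so that a blank or malformed entry never reaches a request header. The sketch below is one way to do that; the load_agents helper and the ExampleBrowser strings are hypothetical, and the demo writes its own temporary file so it can run without an existing user_agents.json.

```python
import json
import random
import tempfile
from pathlib import Path


def load_agents(path):
    """Load a JSON array of strings, dropping blanks, non-strings, and duplicates."""
    with open(path, "r", encoding="utf-8") as f:
        raw = json.load(f)
    if not isinstance(raw, list):
        raise ValueError("expected a JSON array of user agent strings")
    seen = set()
    agents = []
    for item in raw:
        if not isinstance(item, str):
            continue  # skip numbers, nulls, nested objects
        value = item.strip()
        if value and value not in seen:
            seen.add(value)
            agents.append(value)
    if not agents:
        raise ValueError("no usable user agent strings found")
    return agents


# Demo data with a duplicate, a blank, and a non-string entry (placeholder values).
demo_entries = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "   ",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    42,
    "Mozilla/5.0 (Macintosh) ExampleBrowser/2.0",
]

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "user_agents.json"
    demo_path.write_text(json.dumps(demo_entries), encoding="utf-8")
    agents = load_agents(demo_path)

print(len(agents), "usable agents; picked:", random.choice(agents))
```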

You can also use API-based solutions such as:

  • UserAgents.io API — fetches fresh, verified user agent strings.
  • WhatIsMyBrowser API — provides categorized user agents by browser and platform.

Best Practices for Managing User Agents in Python

  1. Avoid aggressive rotation: Excessive changes can appear suspicious; moderate rotation mimics real browsing patterns.
  2. Combine with proxies: Rotate both IPs and user agents for better anonymity.
  3. Validate headers: Always verify that your user agent string is valid and not truncated.
  4. Respect robots.txt: Never use user agent spoofing to bypass website policies.
  5. Monitor responses: Watch for CAPTCHAs and for 403 or 429 status codes, and slow down or stop when they appear.
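
The first practice, moderate rotation, can be approximated by keeping one user agent per host for a number of requests before drawing a new one. The class below is a minimal sketch of that idea; StickyAgentPool, its parameters, and the default of 20 requests are assumptions for illustration rather than an established recipe.

```python
import random
from urllib.parse import urlsplit


class StickyAgentPool:
    """Keep one user agent per host for a fixed number of requests."""

    def __init__(self, agents, requests_per_agent=20, rng=None):
        if not agents:
            raise ValueError("agents must not be empty")
        if requests_per_agent < 1:
            raise ValueError("requests_per_agent must be at least 1")
        self._agents = list(agents)
        self._limit = requests_per_agent
        self._rng = rng or random.Random()
        self._state = {}  # host -> [current agent, times used]

    def agent_for(self, url):
        """Return the agent assigned to this URL's host, rotating when due."""
        host = urlsplit(url).netloc.lower()
        entry = self._state.get(host)
        if entry is None or entry[1] >= self._limit:
            # First visit to this host, or the current agent is used up.
            entry = [self._rng.choice(self._agents), 0]
            self._state[host] = entry
        entry[1] += 1
        return entry[0]


# Placeholder strings stand in for real user agents here.
pool = StickyAgentPool(["agent-a", "agent-b", "agent-c"], requests_per_agent=3)
for page in range(5):
    print(page, pool.agent_for(f"https://example.com/page/{page}"))
```

A new draw can land on the same string again; if that matters, exclude the current agent from the candidates before choosing.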

Troubleshooting Common Issues

  • Error fetching fake-useragent data: Use fallback lists or update the library.
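  • 403 Forbidden responses: Try a different user agent, add a delay between requests, or both.
  • Incorrect encoding or language: Set response.encoding when the detected charset is wrong, and send Accept-Language and Accept-Encoding headers that match the browser you are imitating.
  • SSL or timeout errors: Set an explicit timeout and add retries with error handling in your script.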
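
Several of these fixes can be combined in one retry loop that switches user agent and backs off after a refusal or a network error. The sketch below keeps the HTTP call abstract: send is a hypothetical callable taking (url, headers) and returning a status code, which you could implement with requests.get(..., timeout=...). Catching OSError is a deliberate choice here, since the built-in timeout and connection errors derive from it, as do the exceptions raised by requests.

```python
import time


def fetch_with_retries(url, send, next_agent, max_attempts=3,
                       base_delay=1.0, sleep=time.sleep):
    """Call send(url, headers) until it succeeds or attempts run out.

    Retries on 403/429 and on network errors, using a new user agent and
    an exponentially growing delay each time. Returns the last status code,
    or None if every attempt raised.
    """
    last_status = None
    for attempt in range(max_attempts):
        headers = {"User-Agent": next_agent()}
        try:
            last_status = send(url, headers)
        except OSError:
            last_status = None  # timeout, connection reset, SSL failure, ...
        else:
            if last_status not in (403, 429):
                return last_status
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x the base delay
    return last_status


# Demonstration with stubs instead of real HTTP calls.
responses = iter([403, 200])
agents = iter(["agent-a", "agent-b", "agent-c"])
status = fetch_with_retries(
    "https://example.com",
    send=lambda url, headers: next(responses),
    next_agent=lambda: next(agents),
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(status)
```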

Conclusion

Managing user agents in Python is a key skill for anyone working with automation, scraping, or testing tools. With a few lines of code, you can rotate user agents dynamically, simulate multiple browsers, and reduce blocking risks. By following best practices, respecting site policies, and logging each session’s behavior, your scripts can operate efficiently and safely.

As the web evolves, browsers are reducing the detail in the User-Agent header and moving it into User-Agent Client Hints (the Sec-CH-UA family of headers). Even so, the ability to manage and rotate user agents in Python remains useful for testing, research, and legitimate automation tasks. Done carefully, it yields more reliable results while staying within site policies.
