Google Play Scraper

A modern, fast, and robust Google Play Store scraper using httpx, selectolax, and pydantic. Designed to handle complex, nested JSON responses safely and efficiently.

Core Features

  • App Details: Fetch complete metadata (title, developer, installs, description, etc.).
  • Search & Suggestions: Search for apps and get real-time query suggestions.
  • Developer Apps: List all applications published by a specific developer.
  • Reviews: Retrieve pages of user reviews using Google Play's internal RPC API.
  • Collections & Categories: Browse popular collections (e.g. topselling_free) and standard categories.
  • Privacy & Safety: Advanced extraction of app permission requirements and data safety practices.

Installation

This project is built using uv. Add it to a uv-managed project with:

uv add google-play-scraper-httpx

Quick Start

The main entry point is the PlayScraper class. Use it as a context manager so that the underlying HTTP client session is closed safely when the block exits.

from google_play_scraper_httpx import PlayScraper

with PlayScraper(hl="en", gl="us") as scraper:
    # Get basic details
    app = scraper.get_app_details("com.whatsapp")
    print(f"{app.title} by {app.developer}")

    # Search for an app
    results = scraper.search("messenger")
    for res in results[:3]:
        print(f"Found: {res.title}")
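
The value of the with block is that the session is released even when a call fails partway through. The sketch below illustrates that general pattern only. SessionOwner and FakeClient are hypothetical stand-ins invented for this example; they are not PlayScraper's actual implementation.

```python
class FakeClient:
    """Hypothetical stand-in for an HTTP client; records whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class SessionOwner:
    """Hypothetical wrapper that owns a client and closes it on exit."""

    def __init__(self, client: FakeClient) -> None:
        self._client = client

    def __enter__(self) -> "SessionOwner":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and on exceptions alike.
        # Returning None lets any exception propagate to the caller.
        self._client.close()


client = FakeClient()
try:
    with SessionOwner(client):
        raise RuntimeError("simulated failure mid-scrape")
except RuntimeError:
    pass

print(client.closed)
```

The client is closed although the body raised, which is the guarantee the with statement provides.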

Why this scraper?

Unlike older scrapers that rely heavily on regular expressions to parse constantly changing HTML, this project:

  1. Prioritizes extracting the structured ds:5 and ds:4 JSON bundles embedded in Google Play pages.
  2. Interacts directly with Google Play's internal batchexecute RPC endpoints for complex queries (such as paginated reviews and data safety details).
  3. Validates all output strictly using Pydantic models, so malformed data surfaces as a validation error instead of passing through silently.
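
To make the first point concrete, here is a minimal sketch of pulling embedded JSON bundles out of a page and reading nested values defensively. It is not this project's code: it uses only the standard library instead of selectolax, and the names ScriptCollector, extract_datasets, and dig are hypothetical. It assumes each bundle appears in a script element as AF_initDataCallback({key: 'ds:N', ..., data: [...], ...}). The sample HTML and its array layout are made up for illustration; the real ds:5 layout is undocumented and can change.

```python
import json
from html.parser import HTMLParser
from typing import Any


class ScriptCollector(HTMLParser):
    """Collects the text content of every <script> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_script = False
        self.scripts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def extract_datasets(html: str) -> dict[str, Any]:
    """Map each 'ds:N' key to its decoded JSON payload (assumed page format)."""
    collector = ScriptCollector()
    collector.feed(html)
    decoder = json.JSONDecoder()
    datasets: dict[str, Any] = {}
    for script in collector.scripts:
        key_pos = script.find("key: '")
        data_pos = script.find("data:")
        if "AF_initDataCallback" not in script or key_pos == -1 or data_pos == -1:
            continue
        key_start = key_pos + len("key: '")
        key_end = script.find("'", key_start)
        if key_end == -1:
            continue
        start = data_pos + len("data:")
        # raw_decode does not skip leading whitespace, so advance past it.
        while start < len(script) and script[start].isspace():
            start += 1
        try:
            # Decode one JSON value and ignore the JavaScript that follows it.
            value, _ = decoder.raw_decode(script, start)
        except json.JSONDecodeError:
            continue
        datasets[script[key_start:key_end]] = value
    return datasets


def dig(node: Any, *path: int, default: Any = None) -> Any:
    """Follow list indices; return default instead of raising on a bad path."""
    for index in path:
        if not isinstance(node, list):
            return default
        try:
            node = node[index]
        except IndexError:
            return default
    return default if node is None else node


# Made-up sample page; the indices below are illustrative only.
SAMPLE_HTML = """
<html><body>
<script nonce="x">AF_initDataCallback({key: 'ds:5', hash: '1', data:[["Example App", null, [1, 2]]], sideChannel: {}});</script>
</body></html>
"""

datasets = extract_datasets(SAMPLE_HTML)
title = dig(datasets.get("ds:5"), 0, 0)
missing = dig(datasets.get("ds:5"), 0, 7, 2, default="n/a")
print(title, missing)
```

Decoding with raw_decode from the start of the payload avoids writing a regular expression that has to match the entire nested array, and dig turns a missing or reshaped branch into a default value rather than an IndexError. In a setup like the one described above, the values read this way would then be passed to a Pydantic model for validation.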