# Google Play Scraper
A modern, fast, and robust Google Play Store scraper using httpx, selectolax, and pydantic. Designed to handle complex, nested JSON responses safely and efficiently.
## Core Features
- App Details: Fetch complete metadata (title, developer, installs, description, etc.).
- Search & Suggestions: Search for apps and get real-time query suggestions.
- Developer Apps: List all applications published by a specific developer.
- Reviews: Retrieve pages of user reviews using Google Play's internal RPC API.
- Collections & Categories: Browse popular collections (e.g. `topselling_free`) and standard categories.
- Privacy & Safety: Advanced extraction of app permission requirements and data safety practices.
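The reviews feature above goes through Google Play's internal `batchexecute` RPC. As a rough illustration of how such a request body is assembled, here is a self-contained sketch; the endpoint URL, the `f.req` envelope shape, and the RPC id shown are assumptions based on publicly observed traffic, not this library's API.

```python
import json
from urllib.parse import urlencode

# Assumed batchexecute endpoint (publicly observed, not guaranteed stable).
BATCH_URL = "https://play.google.com/_/PlayStoreUi/data/batchexecute"

def build_batch_body(rpc_id: str, payload: list) -> str:
    """Wrap an RPC payload in the f.req envelope and form-encode it.

    The envelope [[["rpcid", inner_json, null, "generic"]]] is the shape
    observed in batchexecute traffic; the inner payload layout varies per RPC.
    """
    envelope = [[[rpc_id, json.dumps(payload), None, "generic"]]]
    return urlencode({"f.req": json.dumps(envelope)})

# Hypothetical reviews request for one app (RPC id and payload are assumptions).
body = build_batch_body("UsvDTd", [None, None, [2, None, [20]], ["com.whatsapp", 7]])
print(body[:60])
```

The server replies with a `)]}'`-prefixed JSON blob that must be stripped and re-parsed before validation.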
## Installation
This project is built using uv.
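A typical uv workflow looks like the following; the repository URL below is a placeholder, since the canonical location is not stated here.

```shell
# Clone the repository (URL is a placeholder) and sync dependencies with uv:
git clone https://github.com/your-org/google-play-scraper-httpx.git
cd google-play-scraper-httpx
uv sync

# Run commands inside the managed environment, e.g. the test suite:
uv run pytest
```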
## Quick Start
The main entry point is the PlayScraper class, which uses a context manager to handle the underlying HTTP client session safely.
```python
from google_play_scraper_httpx import PlayScraper

with PlayScraper(hl="en", gl="us") as scraper:
    # Get basic details
    app = scraper.get_app_details("com.whatsapp")
    print(f"{app.title} by {app.developer}")

    # Search for an app
    results = scraper.search("messenger")
    for res in results[:3]:
        print(f"Found: {res.title}")
```
## Why this scraper?
Unlike older scrapers that rely heavily on regular expressions to parse constantly changing HTML, this project:
- Prioritizes extracting the structured `ds:5` and `ds:4` JSON bundles embedded in Google Play pages.
- Interacts directly with Google Play's internal `batchexecute` RPC endpoints for complex queries (like endless reviews and data safety metrics).
- Validates all output strictly using Pydantic models, preventing silent data corruption.
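To make the first point concrete, here is a self-contained sketch of pulling a `ds:*` bundle out of a page. The `AF_initDataCallback` wrapper shown reflects publicly observed Play Store page source; the helper and the sample markup are illustrative, not this library's API (the real scraper parses full pages with selectolax and validates the result with Pydantic).

```python
import json
import re

# Trimmed-down stand-in for a Play Store page (assumed structure).
SAMPLE_HTML = """
<script nonce="x">AF_initDataCallback({key: 'ds:5', hash: '7',
data:[["WhatsApp Messenger", ["WhatsApp LLC"]]], sideChannel: {}});</script>
"""

def extract_ds(html: str, key: str):
    """Return the parsed `data` array for one AF_initDataCallback key, or None."""
    pattern = re.compile(
        r"AF_initDataCallback\(\{key: '" + re.escape(key) +
        r"'.*?data:(\[.*?\]), sideChannel", re.DOTALL)
    match = pattern.search(html)
    return json.loads(match.group(1)) if match else None

bundle = extract_ds(SAMPLE_HTML, "ds:5")
print(bundle[0][0])  # -> WhatsApp Messenger
```

Because the bundles are plain JSON arrays, this avoids regex-matching the surrounding HTML itself, which is what tends to break older scrapers when the page layout changes.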