Welcome to the pqg documentation

pqg (Pandas Query Generator) is a tool for generating synthetic pandas queries to help train machine learning models for query cost estimation and cardinality prediction.

Features

  • Generate complex pandas queries based on schema definitions

  • Support for selections, projections, joins, and aggregations

  • Configurable query complexity and operation probabilities

  • Built-in query execution and validation

  • Both CLI and library interfaces
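As a rough illustration of the four operation types listed above, the sketch below chains a selection, a projection, a join, and an aggregation in plain pandas. The `customers` and `orders` tables and their columns are hypothetical, invented for this example. The code is hand-written and is not pqg output, and it does not use the pqg API.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "country": ["US", "DE", "US"],
})
orders = pd.DataFrame({
    "order_id": [10, 11, 12, 13],
    "customer_id": [1, 1, 2, 3],
    "amount": [50.0, 150.0, 200.0, 75.0],
})

# Selection: keep only the rows that satisfy a predicate.
large_orders = orders[orders["amount"] >= 75.0]

# Projection: keep only a subset of columns.
projected = large_orders[["customer_id", "amount"]]

# Join: combine with another table on a shared key.
joined = projected.merge(customers, on="customer_id")

# Aggregation: group rows and summarize each group.
result = joined.groupby("country").agg({"amount": "sum"})
print(result)
```

Queries produced by a generator may combine these operations in a different order and at greater depth; the example only shows what each operation type looks like in pandas.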

Getting Started

Check out the Quickstart Guide or dive into the API Reference.