Command line interface

ehrql [--help] [--version] COMMAND_NAME ...
The command line interface for ehrQL, a query language for electronic health record (EHR) data.

COMMAND_NAME 🔗

Name of the sub-command to execute.

generate-dataset
Take a dataset definition file and output a dataset.

generate-measures
Take a measures definition file and output measures.

sandbox
Start the ehrQL sandbox environment.

dump-example-data
Dump example data for the ehrQL tutorial to the current directory.

dump-dataset-sql
Output the SQL that would be executed to fetch the results of the dataset definition.

create-dummy-tables
Generate dummy data for a dataset and write out tables as CSV.

assure
Experimental command for running assurance tests.

test-connection
Internal command for testing the database connection configuration.

-h, --help 🔗

show this help message and exit

--version 🔗

Show the exact version of ehrQL in use and then exit.

generate-dataset 🔗

ehrql generate-dataset DEFINITION_FILE [--help] [--output DATASET_FILE]
      [--dummy-data-file DUMMY_DATA_FILE] [--dummy-tables DUMMY_TABLES_PATH]
      [--dsn DSN] [--query-engine QUERY_ENGINE_CLASS] [--backend BACKEND_CLASS]
      [ -- ... PARAMETERS ...]
Take a dataset definition file and output a dataset.

ehrQL is designed so that exactly the same command can be used to output dummy data when run on your own computer and then output real data when run inside the secure environment as part of an OpenSAFELY pipeline.

DEFINITION_FILE 🔗

Path of the Python file where the dataset is defined.

-h, --help 🔗

show this help message and exit

--output DATASET_FILE 🔗

Path of the file where the dataset will be written (console by default).

The file extension determines the file format used. Supported formats are: .arrow, .csv, .csv.gz
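
To illustrate what choosing a format from the file extension involves, here is a minimal sketch. The helper name `output_format` is hypothetical and this is not ehrQL's own implementation; it only shows that a compound extension such as .csv.gz needs matching against the end of the file name rather than the last suffix alone.

```python
from pathlib import Path

# Extensions listed in the text above.
SUPPORTED_EXTENSIONS = (".csv.gz", ".arrow", ".csv")


def output_format(path):
    """Return the supported extension that `path` ends with.

    Hypothetical helper, not part of ehrQL. Matching on the end of the file
    name (rather than Path.suffix, which would give ".gz") lets the compound
    ".csv.gz" extension be recognised.
    """
    name = Path(path).name
    for extension in SUPPORTED_EXTENSIONS:
        if name.endswith(extension):
            return extension
    raise ValueError(f"Unsupported file extension: {name}")


print(output_format("output/dataset.csv.gz"))
print(output_format("output/dataset.arrow"))
```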

--dummy-data-file DUMMY_DATA_FILE 🔗

Path to a file to use as dummy output data.

This allows you to take complete control of the dummy data produced. ehrQL will ensure that the column names, types and categorical values match what they will be in the real data, but does no further validation.

Note that the dummy data file doesn't need to be of the same type as the final output file (e.g. you can use a .csv file here to produce a .arrow file).

This argument is ignored when running against real data.
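
A dummy data file can be written by hand or produced with a short script. The sketch below writes a small CSV using Python's standard csv module. The column names, values and file name are hypothetical; a real dummy data file must use the columns defined in your own dataset definition.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical columns and values, for illustration only.
fieldnames = ["patient_id", "age", "sex"]
rows = [
    {"patient_id": 1, "age": 34, "sex": "female"},
    {"patient_id": 2, "age": 71, "sex": "male"},
]

# Write to a temporary directory so the example leaves no files in your project.
dummy_file = Path(tempfile.mkdtemp()) / "dummy_dataset.csv"
with dummy_file.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

print(dummy_file.read_text())
```

The resulting path is what you would pass as DUMMY_DATA_FILE.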

--dummy-tables DUMMY_TABLES_PATH 🔗

Path to directory of CSV files (one per table) to use as dummy data tables (see create-dummy-tables).

This argument is ignored when running against real data.

PARAMETERS 🔗

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.
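
One way to read such parameters inside the definition file is Python's standard argparse module. The sketch below assumes that everything after the double-dash reaches the definition file as ordinary command-line arguments. The `--start-date` parameter and the `parse_parameters` helper are hypothetical names chosen for illustration.

```python
import argparse


def parse_parameters(argv=None):
    """Parse user-defined parameters (hypothetical helper).

    With argv=None, argparse reads from sys.argv, which is what a definition
    file would normally rely on. Passing a list makes the function testable.
    """
    parser = argparse.ArgumentParser()
    # "--start-date" is an illustrative parameter name, not an ehrQL option.
    parser.add_argument("--start-date", default="2020-01-01")
    return parser.parse_args(argv)


# Simulates a command line ending in: -- --start-date 2021-04-01
args = parse_parameters(["--start-date", "2021-04-01"])
print(args.start_date)
```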

Internal Arguments

You should not normally need to use these arguments: they are for the internal operation of ehrQL and the OpenSAFELY platform.

--dsn DSN 🔗

Data Source Name: URL of remote database, or path to data on disk (defaults to value of DATABASE_URL environment variable).
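
The fallback described here (explicit value first, then the DATABASE_URL environment variable) can be sketched as follows. The `resolve_dsn` helper and the example URLs are hypothetical; this is an illustration of the pattern, not ehrQL's code.

```python
import os


def resolve_dsn(cli_value=None, environ=None):
    """Return the explicit DSN if given, else DATABASE_URL, else None.

    Hypothetical helper. `environ` defaults to os.environ and is a parameter
    only so the behaviour can be demonstrated without touching the real
    environment.
    """
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    return environ.get("DATABASE_URL")


fake_environment = {"DATABASE_URL": "scheme://host/from-environment"}
print(resolve_dsn("scheme://host/from-argument", fake_environment))
print(resolve_dsn(None, fake_environment))
```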

--query-engine QUERY_ENGINE_CLASS 🔗

Dotted import path to Query Engine class, or one of: mssql, sqlite, csv, trino
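
A "dotted import path" names a module followed by a class, such as `package.module.ClassName`. The sketch below shows how such a path can be resolved with the standard importlib module. The `import_by_dotted_path` helper is hypothetical and is not how ehrQL necessarily does it; the short aliases (mssql, sqlite, csv, trino) are left out because the classes behind them are not given here.

```python
import importlib


def import_by_dotted_path(dotted_path):
    """Import and return the object named by "package.module.Name".

    Hypothetical helper for illustration only.
    """
    module_name, _, attribute = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"Not a dotted import path: {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Demonstrated with a standard library class rather than a real query engine.
cls = import_by_dotted_path("collections.OrderedDict")
print(cls.__name__)
```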

--backend BACKEND_CLASS 🔗

Dotted import path to Backend class, or one of: emis, tpp

generate-measures 🔗

ehrql generate-measures DEFINITION_FILE [--help] [--output OUTPUT_FILE]
      [--dummy-tables DUMMY_TABLES_PATH] [--dummy-data-file DUMMY_DATA_FILE]
      [--dsn DSN] [--query-engine QUERY_ENGINE_CLASS] [--backend BACKEND_CLASS]
      [ -- ... PARAMETERS ...]
Take a measures definition file and output measures.

DEFINITION_FILE 🔗

Path of the Python file where measures are defined.

-h, --help 🔗

show this help message and exit

--output OUTPUT_FILE 🔗

Path of the file where the measures will be written (console by default).

The file extension determines the file format used. Supported formats are: .arrow, .csv, .csv.gz

--dummy-tables DUMMY_TABLES_PATH 🔗

Path to directory of CSV files (one per table) to use as dummy data tables (see create-dummy-tables).

This argument is ignored when running against real data.

--dummy-data-file DUMMY_DATA_FILE 🔗

Path to a file to use as dummy output data.

This allows you to take complete control of the dummy data produced. ehrQL will ensure that the column names, types and categorical values match what they will be in the real data, but does no further validation.

Note that the dummy data file doesn't need to be of the same type as the final output file (e.g. you can use a .csv file here to produce a .arrow file).

This argument is ignored when running against real data.

PARAMETERS 🔗

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

Internal Arguments

You should not normally need to use these arguments: they are for the internal operation of ehrQL and the OpenSAFELY platform.

--dsn DSN 🔗

Data Source Name: URL of remote database, or path to data on disk (defaults to value of DATABASE_URL environment variable).

--query-engine QUERY_ENGINE_CLASS 🔗

Dotted import path to Query Engine class, or one of: mssql, sqlite, csv, trino

--backend BACKEND_CLASS 🔗

Dotted import path to Backend class, or one of: emis, tpp

sandbox 🔗

ehrql sandbox DUMMY_TABLES_PATH [--help]
Start the ehrQL sandbox environment.

DUMMY_TABLES_PATH 🔗

Path to directory of CSV files (one per table).

-h, --help 🔗

show this help message and exit

dump-example-data 🔗

ehrql dump-example-data [--help]
Dump example data for the ehrQL tutorial to the current directory.

-h, --help 🔗

show this help message and exit

dump-dataset-sql 🔗

ehrql dump-dataset-sql DEFINITION_FILE [--help] [--output OUTPUT_FILE]
      [--query-engine QUERY_ENGINE_CLASS] [--backend BACKEND_CLASS]
      [ -- ... PARAMETERS ...]
Output the SQL that would be executed to fetch the results of the dataset definition.

By default, this command will output SQL suitable for the SQLite database. To get the SQL as it would be run against the real data you will need to supply the appropriate --backend argument, for example --backend tpp.

Note that due to configuration differences this may not always exactly match what gets run against the real data.

DEFINITION_FILE 🔗

Path of the Python file where the dataset is defined.

-h, --help 🔗

show this help message and exit

--output OUTPUT_FILE 🔗

SQL output file (outputs to console by default).

--query-engine QUERY_ENGINE_CLASS 🔗

Dotted import path to Query Engine class, or one of: mssql, sqlite, csv, trino

--backend BACKEND_CLASS 🔗

Dotted import path to Backend class, or one of: emis, tpp

PARAMETERS 🔗

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

create-dummy-tables 🔗

ehrql create-dummy-tables DEFINITION_FILE DUMMY_TABLES_PATH [--help]
      [ -- ... PARAMETERS ...]
Generate dummy data for a dataset and write out tables as CSV.

This generates the same dummy data that generate-dataset would, but instead of using this to produce a dataset it writes the underlying data tables out as CSV (one file per table).

The directory containing these CSV files can then be used as the --dummy-tables argument to generate-dataset to produce the dataset.

The CSV files can be edited in any way you wish, giving you full control over the dummy data.
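
The two-step workflow described above can be sketched as a pair of command lines. The script below only builds and prints them, using the arguments documented on this page; the file and directory names are hypothetical.

```python
import shlex

definition_file = "analysis/dataset_definition.py"  # hypothetical path
dummy_tables_dir = "dummy_tables"  # hypothetical directory
output_file = "output/dataset.csv"  # hypothetical path

# Step 1: write the dummy data tables out as CSV files.
create_tables = ["ehrql", "create-dummy-tables", definition_file, dummy_tables_dir]

# Step 2 (after editing the CSV files as needed): build the dataset from them.
generate = [
    "ehrql", "generate-dataset", definition_file,
    "--dummy-tables", dummy_tables_dir,
    "--output", output_file,
]

for command in (create_tables, generate):
    print(shlex.join(command))
```

Where ehrQL is installed, each list could be passed to `subprocess.run(command, check=True)` instead of being printed.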

DEFINITION_FILE 🔗

Path of the Python file where the dataset is defined.

DUMMY_TABLES_PATH 🔗

Path to directory where CSV files (one per table) will be written.

-h, --help 🔗

show this help message and exit

PARAMETERS 🔗

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

assure 🔗

ehrql assure TEST_DATA_FILE [--help] [ -- ... PARAMETERS ...]
Experimental command for running assurance tests.

Note that this command is experimental and not yet intended for widespread use.

TEST_DATA_FILE 🔗

Path of the file where the test data is defined.

-h, --help 🔗

show this help message and exit

PARAMETERS 🔗

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

test-connection 🔗

ehrql test-connection [--help] [-b BACKEND_CLASS] [-u URL]
Internal command for testing the database connection configuration.

Note that this is an internal command and not intended for end users.

-h, --help 🔗

show this help message and exit

--backend, -b BACKEND_CLASS 🔗

Dotted import path to Backend class, or one of: emis, tpp

--url, -u URL 🔗

Database connection string.