Defog converts your natural language queries into SQL and other machine-readable code. This library allows you to easily integrate Defog into your Python application, and includes a CLI to help you get started.
We now recommend using defog-desktop instead of using this repo directly.
For a generic installation with Postgres or Redshift binaries, use
pip install --upgrade defog
For a Snowflake installation, use
pip install --upgrade 'defog[snowflake]'
For a MySQL installation, use
pip install --upgrade 'defog[mysql]'
For a BigQuery installation, use
pip install --upgrade 'defog[bigquery]'
For a Databricks installation, use
pip install --upgrade 'defog[databricks]'
For a SQLServer installation, use
pip install --upgrade 'defog[sqlserver]'
You can get your API key by going to https://defog.ai/signup and creating an account. If you fail to verify your email, you can email us at support(at)defog.ai.
You can either use our command line interface (CLI), which will take you through the setup step by step, or pass your credentials explicitly in Python to the Defog class. The CLI uses the Python API under the hood, and is just an interactive wrapper over it that does some extra validation on your behalf.
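If you prefer to skip the CLI, you can build a credentials dictionary in Python and hand it to the Defog class directly. The field names below and the keyword arguments in the commented-out constructor call are illustrative assumptions, not a guaranteed signature; check the library's docstrings for the exact parameters your database type expects:

```python
# Hypothetical Postgres credentials -- replace with your own values.
db_creds = {
    "host": "localhost",
    "port": 5432,
    "database": "mydb",
    "user": "me",
    "password": "secret",
}

# A minimal sanity check before handing the credentials over:
required = {"host", "port", "database", "user", "password"}
missing = required - db_creds.keys()
assert not missing, f"missing credential fields: {missing}"

# from defog import Defog
# defog = Defog(api_key="YOUR_API_KEY", db_type="postgres", db_creds=db_creds)
```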
To get started, you can run the following CLI command, which will prompt you for your Defog API key, database type, and the corresponding database credentials.
defog init
If this is your first time running defog init, we will write this information into a JSON config file stored at ~/.defog/connection.json. If we detect an existing file, we will ask whether you intend to re-initialize it. You can always delete the file and run defog init
all over again. Note that your credentials are never sent to Defog's servers.
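To see what defog init stored, you can inspect the config file directly. The key names below are illustrative guesses, not the file's guaranteed schema; this self-contained sketch writes a sample config to a temp directory rather than touching the real file:

```python
import json
import tempfile
from pathlib import Path

# The real file lives here after you run defog init:
real_path = Path.home() / ".defog" / "connection.json"

# For this example, write a sample config (illustrative keys) to a temp dir.
tmp = Path(tempfile.mkdtemp()) / "connection.json"
tmp.write_text(json.dumps({"api_key": "YOUR_API_KEY", "db_type": "postgres"}))

loaded = json.loads(tmp.read_text())
print(loaded["db_type"])  # prints: postgres
```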
Once you have set up the connection settings, we will ask you for the names of the tables you would like to register (space separated), generate the schema for each of them, upload the schema to Defog, and print out the filename of a CSV containing your metadata. If you do not wish to provide these at this point, you can exit the prompt by hitting ctrl+c
To include tables in Defog's index, you can run the following to generate descriptions for your tables and columns:
defog gen <table1> <table2> ...
This will generate a CSV file that is stored locally on your disk.
If you would like to edit the auto-generated column descriptions, just edit the CSV and run the following to update the schema with Defog:
defog update <csv_filename>
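You can also edit the metadata CSV programmatically before running defog update. The column headers in this sketch are assumptions; open the CSV that defog gen produced to see the actual headers it uses:

```python
import csv
import io

# Stand-in for the CSV that `defog gen` wrote to disk (headers assumed).
csv_text = """table_name,column_name,data_type,column_description
users,id,integer,
users,email,text,
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Fill in a placeholder for every column still missing a description.
for row in rows:
    if not row["column_description"]:
        row["column_description"] = (
            f"TODO: describe {row['table_name']}.{row['column_name']}"
        )

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=rows[0].keys())
writer.writeheader()
writer.writerows(rows)
```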
You can now run queries directly:
defog query "<your query>"
Happy querying!
You might notice that our model sometimes fails to take into account prior context from your own domain, e.g. converting certain fields into different types, joining certain tables, or how to perform string matching. To give the model a standard set of instructions attached to each query, you can pass us a glossary, which is a string blob of up to 1000 characters that gives our model more specific instructions. You can manage your glossary using the following commands:
defog glossary update <path/to/glossary.txt> # Update your glossary
defog glossary get # Get your current glossary
defog glossary delete # Delete your glossary
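Since the glossary is capped at 1000 characters (per the limit above), a simple pre-flight length check before running defog glossary update can save a round trip. The glossary text here is just an example:

```python
# Example glossary text -- domain-specific instructions for the model.
glossary = (
    "All revenue figures are stored in USD cents; divide by 100 for dollars. "
    "Match user names case-insensitively."
)

# The documented limit is 1000 characters.
assert len(glossary) <= 1000, f"glossary too long: {len(glossary)} chars"
print(f"glossary OK ({len(glossary)} characters)")
```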
In certain cases where the generated query needs to follow a complex pattern, you can provide examples to help our model generate queries matching your desired patterns. You can manage your golden queries using the following commands:
defog golden get <json|csv> # Get your golden queries in JSON or CSV format
defog golden add <path/to/golden_queries.json> # Add golden queries from a JSON or CSV file
defog golden delete <path/to/golden_queries.json|all> # Delete specific golden queries or all of them
Note that when adding golden queries, the JSON/CSV file provided needs to have the following keys/columns:
- prev_question (optional): the existing question in the database if we're replacing a golden question-query pair
- prev_sql (optional): the existing SQL in the database if we're replacing a golden question-query pair
- question: the new question
- sql: the new SQL
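A golden-queries file with the keys listed above can be assembled like this; the question and SQL are made-up examples, and since this is a brand-new entry the optional prev_question/prev_sql keys are omitted:

```python
import json

golden_queries = [
    {
        # prev_question / prev_sql are only needed when replacing an
        # existing pair, so they are left out for a new entry.
        "question": "How many users signed up last month?",
        "sql": (
            "SELECT COUNT(*) FROM users "
            "WHERE created_at >= date_trunc('month', now()) - interval '1 month' "
            "AND created_at < date_trunc('month', now());"
        ),
    }
]

# Serialize to the JSON format `defog golden add` expects.
payload = json.dumps(golden_queries, indent=2)
```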
You can deploy a defog server as a cloud function using the following command:
defog deploy <gcp|aws> [function_name] # Deploy to GCP or AWS, optionally specifying the function name
You can check your quota usage per month by running:
defog quota
Free-tier users have 1000 queries per month, while premium users have unlimited queries.
You can use the API from within Python as below:

from defog import Defog

# your credentials are never sent to our server, and queries always run locally
defog = Defog()  # credentials are loaded automatically if you have already initialized defog

question = "question asked by a user"

# run the chat version of the query
results = defog.run_query(question=question)
print(results)
For developers who want to test or add tests for this client, you can run:
pytest tests
Note that the test suite will move any existing .defog/connection.json file over to /tmp (if it exists), and move the original file back once the tests are done, to avoid messing with your original config.
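The move-aside/restore pattern described above can be sketched as follows; the paths here are stand-ins in a temp directory, not the suite's actual implementation:

```python
import tempfile
from pathlib import Path

# Stand-in for ~/.defog/connection.json, created in a temp dir for safety.
workdir = Path(tempfile.mkdtemp())
config = workdir / "connection.json"
backup = workdir / "connection.json.bak"
config.write_text('{"db_type": "postgres"}')

config.rename(backup)   # set the real config aside before the tests run
assert not config.exists()
# ... the tests would run here against a clean slate ...
backup.rename(config)   # restore the original config afterwards
```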
If submitting a PR, please use the black formatter to format your code. You can add it as a git pre-commit hook to your repo by running the commands below:
echo -e '#!/bin/sh\n#\n# Run linter before commit\nblack $(git rev-parse --show-toplevel)' > .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit