Contributing

Ways to Contribute

We're happy you're looking to contribute. We recommend you join the Slack channel to discuss how to get involved, or how to use Fugue. There are many ways to help this project.

  1. Use Fugue in your project - Having more users helps us come across more use cases and make a better framework. We're always happy to help you get started using Fugue for your company use case or personal project. Feel free to message us on Slack.

  2. Give us feedback/Post issues - If you have ideas for making Fugue better, or general questions about Fugue, we'd be happy to hear them. Hearing which parts are unclear helps us write better documentation. Posting issues helps us fix bugs and build new features.

  3. Make a blog post or presentation - Are you interested in presenting Fugue to your company, at a Meetup, or at a conference? We'd be happy to chat with you and share some resources we have.

  4. Write code - Is there an issue you want to take a stab at? We recommend touching base with us before you pick up an issue. Documentation is also a good way to help.

Project Structure

There are two main parts to the codebase:

  • fugue - This contains the core of Fugue, including the fundamental classes such as DataFrames, ExecutionEngine, and Extensions.
  • fugue_sql - Fugue SQL is a Domain Specific Language (DSL) for Fugue.

Then there are the backend-specific portions:

  • fugue_spark - Spark-specific code
  • fugue_dask - Dask-specific code
  • fugue_ray - Ray-specific code
  • fugue_duckdb - DuckDB-specific code
  • fugue_ibis - Ibis-specific code

There are two main parts to the tests:

  • fugue_test - Contains suites for testing (dataframe, execution engine). These unify the concepts of distributed computing and ensure consistent behavior across different execution engines (Pandas, Spark, Dask).
  • tests - Contains all tests for the repository

Lastly, there is documentation. Note that tutorials live in another repository.

Setting up the dev environment

There are three steps to setting up a development environment:

  1. Create a virtual environment with your choice of environment manager
  2. Install the requirements
  3. Install the pre-commit hooks

Creating an environment

Below are examples of how to create and activate an environment with virtualenv and conda.

Using virtualenv

python3 -m venv venv
. venv/bin/activate

Using conda

conda create --name fugue-dev
conda activate fugue-dev

Installing requirements

The Fugue repo has a Makefile that can be used to install the requirements with either pip or conda. Windows users who need to install make will find instructions in the notes for Windows users further down.

Pip install requirements

make setupinpip

Conda install requirements

make setupinconda

Manually install requirements

For Windows users who don't have the make command, you can use your package manager of choice. For pip:

pip3 install -r requirements.txt

For Anaconda users, first install pip in the newly created environment. If pip install is run before pip is installed there, the system-wide pip is used instead of the environment's.

conda install pip
pip install -r requirements.txt
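The system-wide pip caveat above can be checked before installing anything. The following is a minimal sketch, not part of the Fugue repo: pip_is_inside is a hypothetical helper name, and it assumes a POSIX-style shell. It relies on the CONDA_PREFIX variable that conda sets on activation (VIRTUAL_ENV plays the same role for a venv).

```shell
# Hypothetical helper: succeeds only if the first pip on PATH lives under
# the given environment prefix.
pip_is_inside() {
    env_prefix="$1"
    # An empty prefix would match every absolute path, so reject it.
    [ -n "$env_prefix" ] || return 1
    # Resolve the pip that the shell would actually run.
    pip_path="$(command -v pip)" || return 1
    case "$pip_path" in
        "$env_prefix"/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With a conda environment active, running pip_is_inside "$CONDA_PREFIX" || echo "pip is not from the active environment" prints the warning when pip still resolves to a copy outside the environment.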

Notes for Windows Users

Windows users will need to download the Microsoft C++ Build Tools, found here.

make is a GNU command that does not come with Windows. An installer can be downloaded here.

After installing it, add its bin directory to your PATH environment variable.
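To confirm that make is visible on PATH, and to see which install route applies if it is not, a small check can help. This is a sketch under assumptions: choose_install_command is a hypothetical name, and it expects a POSIX-style shell (on Windows, for example, a Bash shell rather than cmd.exe). It only prints a command taken from this guide and does not run it.

```shell
# Hypothetical helper: print the install command that fits this machine.
choose_install_command() {
    if command -v make >/dev/null 2>&1; then
        # make is on PATH, so the Makefile target can be used.
        echo "make setupinpip"
    else
        # No make available: fall back to the manual pip route.
        echo "pip3 install -r requirements.txt"
    fi
}
```

If the manual route is printed, remember that the pre-commit hooks still have to be installed separately.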

Installing pre-commit hooks

Fugue uses pre-commit hooks to check that code is ready to be committed. The make commands above install them.

If you installed the requirements manually, install the git hook scripts with:

pre-commit install
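Once installed, the hooks run automatically on each commit, but it can be useful to run them against the whole repository before a first contribution. A minimal sketch follows; run_all_hooks is a hypothetical helper name, while pre-commit run --all-files is a standard pre-commit command.

```shell
# Hypothetical helper: install the git hook scripts, then run every
# configured hook against all files in the repository.
run_all_hooks() {
    pre-commit install && pre-commit run --all-files
}
```

Call run_all_hooks from the repository root. A non-zero exit status means at least one hook failed or modified files.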

Running Tests

The Makefile has the following targets for testing:

make test      - All tests
make testcore  - All Fugue code not specific to any backend
make testspark - Only the Spark-specific tests
make testdask  - Only the Dask-specific tests
make testray   - Only the Ray-specific tests
make testduck  - Only the DuckDB-specific tests
make testibis  - Only the Ibis-specific tests
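When a change touches only some backends, it can be quicker to run a few of these targets in sequence instead of make test. The sketch below assumes a POSIX-style shell; run_targets is a hypothetical helper, not something the Makefile provides. It keeps going after a failure so that every failing target is reported at the end.

```shell
# Hypothetical helper: run each given make target in turn and report
# which ones failed, instead of stopping at the first failure.
run_targets() {
    failed=""
    for target in "$@"; do
        make "$target" || failed="$failed $target"
    done
    # Succeed only if no target failed.
    [ -z "$failed" ] || { echo "failed targets:$failed" >&2; return 1; }
}
```

For example, run_targets testcore testduck runs the core tests and then the DuckDB tests, and returns a non-zero status if either target fails.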