PyGrid is a peer-to-peer network of data owners and data scientists who can collectively train AI models using PySyft.
- Overview
- Architecture
- Getting started
- Try out the Tutorials
- Start Contributing
- High-level Architecture
- Disclaimer
- License
The PyGrid platform is composed of three components:
- PyGrid App - A Flask-based application used to manage, monitor, control, and route grid Nodes/Workers remotely.
- Grid Nodes - Server-based apps used to store data and manage access to it in a secure and private way.
- Grid Workers - Client-based apps that use different Syft-based libraries to perform federated learning (e.g. syft.js, KotlinSyft, SwiftSyft).
To boot the entire PyGrid platform locally, we will use Docker containers. To install Docker and its dependencies, just follow the Docker documentation.
The latest PyGrid Gateway and Node images are available on the Docker Hub.
- PyGrid - openmined/grid-gateway
- Grid Node - openmined/grid-node
Before starting the grid platform locally with Docker, we need to set up the domain names used by the bridge network. To reach these nodes from outside the container context, add the following entries to your /etc/hosts file:
127.0.0.1 gateway
127.0.0.1 bob
127.0.0.1 alice
127.0.0.1 bill
127.0.0.1 james
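As a quick sanity check, the entries above can be verified with a short script. This is a minimal sketch and not part of PyGrid: the function name `missing_hosts` and the `REQUIRED_HOSTS` tuple are hypothetical, and the host names are simply the ones listed above.

```python
# Hypothetical helper: report which grid host names are missing from a hosts file.
REQUIRED_HOSTS = ("gateway", "bob", "alice", "bill", "james")


def missing_hosts(hosts_text, required=REQUIRED_HOSTS):
    """Return the required names that have no entry in the given hosts-file text."""
    mapped = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        # Each entry has the form: <address> <name> [<alias> ...]
        _address, *names = entry.split()
        mapped.update(names)
    return [name for name in required if name not in mapped]
```

To use it, read your hosts file (for example `open("/etc/hosts").read()`), pass the text to `missing_hosts`, and add a `127.0.0.1 <name>` line for every name it returns.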
To set up and start the PyGrid platform, you just need to start the docker-compose process.
$ docker-compose up
It will download the latest OpenMined Docker images and start a grid platform with 1 gateway and 4 grid nodes.
PS: Feel free to increase or decrease the number of initial PyGrid nodes by editing the docker-compose.yml file.
$ docker build -t openmined/grid-node ./app/websocket/ # Build PyGrid node image
$ docker build -t openmined/grid-gateway ./gateway/ # Build gateway image
To start the PyGrid app manually, run:
python grid.py
You can pass arguments or use environment variables to set the gateway configuration.
Arguments
-h, --help show the help message and exit
-p [PORT], --port [PORT] port to run server on (default: 5000)
--host [HOST] the grid gateway host
--num_replicas the number of replicas to provide fault tolerance to model hosting
--start_local_db if this flag is used a SQLAlchemy DB URI is generated to use a local db
Environment Variables
- GRID_GATEWAY_PORT - Port to run server on.
- GRID_GATEWAY_HOST - The grid gateway host
- NUM_REPLICAS - Number of replicas to provide fault tolerance to model hosting
- DATABASE_URL - The gateway database URL
- SECRET_KEY - The secret key
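The relationship between the arguments and the environment variables can be illustrated with a small sketch. This is not the actual code in grid.py. It assumes that a command-line argument takes precedence over the matching environment variable, which the description above does not state, and the function name `parse_gateway_config` is hypothetical.

```python
import argparse
import os


def parse_gateway_config(argv=None, env=None):
    """Parse gateway settings; CLI arguments win over environment variables (assumed)."""
    env = os.environ if env is None else env
    replicas = env.get("NUM_REPLICAS")

    parser = argparse.ArgumentParser(description="PyGrid app configuration (sketch).")
    parser.add_argument("-p", "--port", type=int,
                        default=int(env.get("GRID_GATEWAY_PORT", 5000)),
                        help="port to run server on (default: 5000)")
    parser.add_argument("--host", default=env.get("GRID_GATEWAY_HOST"),
                        help="the grid gateway host")
    parser.add_argument("--num_replicas", type=int,
                        default=int(replicas) if replicas is not None else None,
                        help="number of replicas for fault-tolerant model hosting")
    parser.add_argument("--start_local_db", action="store_true",
                        help="generate a SQLAlchemy DB URI for a local db")
    return parser.parse_args(argv)
```

Under these assumptions, `parse_gateway_config(["--port", "7000"])` uses port 7000 even when GRID_GATEWAY_PORT is set, and falls back to the environment variable or to 5000 when `--port` is omitted.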
You can also start the PyGrid app by running the dev_server.sh script.
$ ./dev_server.sh
This script uses dev_server.conf.py as its configuration file, which includes some gunicorn preferences and environment variables. The file is pre-populated with the default environment variables. You can change them by editing the following property:
raw_env = [
'PORT=5000',
'SECRET_KEY=ineedtoputasecrethere',
'DATABASE_URL=sqlite:///databasegateway.db',
]
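Each `raw_env` entry is a string in KEY=VALUE form. Before starting the server, it can help to confirm that every entry is well formed. The sketch below is an illustration only and is not part of dev_server.conf.py; the helper name `raw_env_to_dict` is hypothetical.

```python
def raw_env_to_dict(raw_env):
    """Split KEY=VALUE strings into a dict, rejecting malformed entries."""
    parsed = {}
    for item in raw_env:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        parsed[key] = value
    return parsed


# The default entries shown above.
raw_env = [
    "PORT=5000",
    "SECRET_KEY=ineedtoputasecrethere",
    "DATABASE_URL=sqlite:///databasegateway.db",
]
settings = raw_env_to_dict(raw_env)
```

Note that all values remain strings, so a numeric setting such as the port still has to be converted by whatever code reads it.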
You can now deploy the PyGrid app and Grid Node Docker containers on Kubernetes, either to a local cluster (minikube) or to a remote cluster (GKE, EKS, AKS, etc.). The steps to set up the cluster can be found in ./k8s/Readme.md.
A comprehensive list of tutorials can be found here.
These tutorials cover how to create a PyGrid node and what operations you can perform.
The guide for contributors can be found here. It covers all that you need to know to start contributing code to PyGrid in an easy way.
Also join the rapidly growing community of 7300+ on Slack. The Slack community is very friendly and quick to answer questions about the use and development of PyGrid/PySyft!
We also have a GitHub Project page for a Federated Learning MVP here.
You can check PyGrid's official development and community roadmap here.
Do NOT use this code to protect data (private or otherwise) - at present it is very insecure.