Protocol Buffers

Alan Cleary edited this page Nov 24, 2021 · 2 revisions

Some microservices in this repository support gRPC and therefore use Protocol Buffers to define their service APIs and the data types they use. This page documents how Protocol Buffers should be organized and used in this repository.

Protocol Buffer Organization

Per the Repository Organization page, Protocol Buffers can be located in two places in the repository:

  1. The top-level proto/ directory
  2. A top-level proto/ directory inside a microservice directory

The proto/ directory inside a microservice directory should contain the Protocol Buffers that define its service API. All other Protocol Buffers, i.e. the data types that services actually use, should be located in the top-level proto/ directory so they can be independently versioned and shared among the microservices.

All Protocol Buffers should conform to the Protocol Buffer style guide. For instance, file names should be lower_snake_case.proto.

The top-level proto/ directory

The top-level proto/ directory contains .proto files (Protocol Buffers) that define structured data types that may be used by the microservices in this repository and the clients that interact with them. Each .proto file is individually versioned (i.e. MAJOR.MINOR.PATCH) to allow microservices to use old and new versions of various data types simultaneously. As such, each .proto file should be self-contained and should not import types from other .proto files, meaning related types should be defined within a single .proto file. See the Tagging Releases and Automated Builds page for more information about how Protocol Buffers and microservices should be versioned.

To support version differentiation across various languages, each .proto file should be placed in a subpath defined by its name and major version, and the package line in the file should use the same scheme but have the GitHub organization and repository as the prefix. For example, as long as the major version of the gene.proto file is 1 (e.g. 1.0.0, 1.2.3, or 1.16.7), its path in this repository should be:

root/
└── proto/
    └── gene/
        └── v1/
            └── gene.proto

Similarly, as long as the major version of the gene.proto file is 1, the package line in the file should be:

package legumeinfo.microservices.gene.v1;
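The naming scheme above can be sketched in a few lines of Python. This is only an illustration, not tooling that exists in this repository: the function names and the PACKAGE_PREFIX constant are hypothetical, and the prefix is taken from the package line shown above.

```python
from pathlib import PurePosixPath

# Assumption: the GitHub organization and repository, as in the package line above.
PACKAGE_PREFIX = "legumeinfo.microservices"


def major_version(version: str) -> int:
    """Return MAJOR from a MAJOR.MINOR.PATCH version string."""
    major, _minor, _patch = version.split(".")
    return int(major)


def data_type_location(name: str, version: str) -> tuple[str, str]:
    """Return (path relative to the repository root, package name) for a top-level type."""
    version_dir = f"v{major_version(version)}"
    path = PurePosixPath("proto") / name / version_dir / f"{name}.proto"
    package = f"{PACKAGE_PREFIX}.{name}.{version_dir}"
    return str(path), package


# Every 1.x.y release of gene.proto maps to the same path and package.
print(data_type_location("gene", "1.16.7"))
```

Because only the major version appears in the path and package, minor and patch releases do not move the file or change how generated code is imported.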

Microservice proto/ directories

Microservices that support gRPC define their APIs using Protocol Buffers. Such .proto files should be stored with their microservice (i.e. not in the top-level proto/ directory) and should use the service's name and major version in their subpath and package line. Additionally, the name portion of a microservice's .proto file subdirectory and package line should have _service appended to it to prevent collisions with .proto files defined in the top-level proto/ directory.

As with .proto files located in the top-level proto/ directory, microservice .proto files' package lines should also have the GitHub organization and repository as the prefix. For example, as long as the major version of the Micro-Synteny Search Python service is 1, the path to its microsyntenysearch.proto file in this repository should be:

root/
└── micro_synteny_search/
    └── proto/
        └── microsyntenysearch_service/
            └── v1/
                └── microsyntenysearch.proto

Similarly, as long as the major version of the Micro-Synteny Search service is 1, the package line in its .proto file should be:

package legumeinfo.microservices.microsyntenysearch_service.v1;
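Since the subpath and the package line must agree, the convention lends itself to an automated check. The following is a minimal sketch of such a check, assuming the layout described on this page; expected_package, declared_package, and PACKAGE_PREFIX are hypothetical names, not part of this repository.

```python
import re
from pathlib import PurePosixPath

# Assumption: the GitHub organization and repository used as the package prefix.
PACKAGE_PREFIX = "legumeinfo.microservices"


def expected_package(proto_path: str) -> str:
    """Derive the package from a path ending in proto/<name>/v<MAJOR>/<file>.proto."""
    parts = PurePosixPath(proto_path).parts
    if len(parts) < 4 or parts[-4] != "proto" or not re.fullmatch(r"v\d+", parts[-2]):
        raise ValueError(f"unexpected layout: {proto_path}")
    name, version_dir = parts[-3], parts[-2]
    return f"{PACKAGE_PREFIX}.{name}.{version_dir}"


def declared_package(proto_text: str) -> str:
    """Return the package named by the first package statement in a .proto file's text."""
    # A package statement in a .proto file is terminated by a semicolon.
    match = re.search(r"^\s*package\s+([\w.]+)\s*;", proto_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no package statement found")
    return match.group(1)


path = "micro_synteny_search/proto/microsyntenysearch_service/v1/microsyntenysearch.proto"
text = 'syntax = "proto3";\npackage legumeinfo.microservices.microsyntenysearch_service.v1;\n'
print(expected_package(path) == declared_package(text))
```

The same check applies to files in the top-level proto/ directory, because both locations share the proto/<name>/v<MAJOR>/ subpath.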

Using Protocol Buffers

The intent of the previously described Protocol Buffer organization is to maximize portability of the .proto files. Here we will describe how microservices within this repository should utilize top-level .proto files and .proto files of other microservices, and we will discuss how external applications may use these .proto files as well.

Within this repository

The microservices themselves should not symlink, copy, or directly include .proto files from the top-level proto/ directory or from other microservices. Instead, the git read-tree command should be used to create versioned copies of the .proto files a microservice depends on in its own proto/ directory. For example, major version 1 of the Macro-Synteny Blocks service depends on major version 1 of the Pairwise Macro-Synteny Blocks service and major version 1 of the Block data type. git read-tree can be used to copy these specific versions into the Macro-Synteny Blocks service's proto/ directory as follows:

$ git read-tree --prefix=macro_synteny_blocks/proto/pairwisemacrosynteny_service -u [email protected]:pairwise_macro_synteny_blocks/proto/pairwisemacrosyntenyblocks_service
$ git read-tree --prefix=macro_synteny_blocks/proto/block -u proto/[email protected]:proto/block 

Notice that this preserves the previously described subpaths of each .proto file. This is intentional: it preserves version differentiation across languages and ensures that dependencies resolve correctly when compiling with protoc.

This technique ensures that a microservice is built against the specific versions of the .proto files it depends on. It also allows old versions of .proto files to be removed from the repository but still remain available as dependency targets.
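When a microservice has several dependencies, the git read-tree invocations can be generated rather than typed by hand. The sketch below only assembles the argument list. The read_tree_command helper is hypothetical, and "<block-tag>" is a placeholder for the release tag of the version being depended on, not a real tag in this repository.

```python
import shlex


def read_tree_command(prefix: str, tag: str, source_path: str) -> list[str]:
    """Build a git read-tree call that reads source_path as of tag into prefix."""
    # --prefix places the tree under the given directory in the index;
    # -u also updates the working tree so the files appear on disk.
    return ["git", "read-tree", f"--prefix={prefix}", "-u", f"{tag}:{source_path}"]


# "<block-tag>" is a placeholder; substitute the tag of the release you depend on.
command = read_tree_command("macro_synteny_blocks/proto/block", "<block-tag>", "proto/block")
print(shlex.join(command))
```

With a real tag substituted, the list could be passed to subprocess.run(command, check=True) from the repository root.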

External applications

External applications may be implemented in a variety of different programming languages. As such, it is up to the developers of those applications to compile the Protocol Buffers into their desired language. However, we recommend that developers utilize the versioned tags of this repository when fetching the Protocol Buffers, rather than copying whatever is currently the HEAD of the main branch. Furthermore, it is advised that developers encapsulate the compiled Protocol Buffers in a library in order to simplify scoping and import paths.
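As one illustration of the compile step, a Python client might generate its code with the grpcio-tools package, which exposes protoc as the grpc_tools.protoc module. This is a sketch under that assumption, not a recommendation from this repository; the protoc_command helper and the "generated" output directory are hypothetical.

```python
import shlex
import sys


def protoc_command(proto_root: str, out_dir: str, proto_files: list[str]) -> list[str]:
    """Build a grpc_tools.protoc call that compiles proto_files to Python."""
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"--proto_path={proto_root}",    # root against which .proto paths are resolved
        f"--python_out={out_dir}",       # message classes
        f"--grpc_python_out={out_dir}",  # gRPC client and server stubs
        *proto_files,
    ]


# Assumes the Protocol Buffers were fetched into a local proto/ directory.
command = protoc_command(
    "proto",
    "generated",
    ["proto/microsyntenysearch_service/v1/microsyntenysearch.proto"],
)
print(shlex.join(command))
```

Running the command requires grpcio-tools to be installed and the output directory to exist; the sketch itself only prints the invocation.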

For example, if a JavaScript developer wishes to use Protocol Buffers from one or more microservices, we recommend using degit to copy the proto/ subdirectory of each microservice with a tag specific to that microservice. The following commands copy the Micro-Synteny Search, Macro-Synteny Blocks, and Chromosome Region microservice Protocol Buffers, each corresponding to a specific release tag:

$ degit legumeinfo/microservices/micro_synteny_search#[email protected]
$ degit legumeinfo/microservices/macro_synteny_blocks#macro_synteny_blocksv2.5.1
$ degit legumeinfo/microservices/chromosome_region#chromosome_regionv3.0.0

See the GCV microservice pseudo-package for an example of how this can be done programmatically, and how the generated code can be encapsulated in a library.
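As a rough idea of what fetching by tag could look like in a script, the sketch below builds one degit invocation per pinned microservice. It is not the approach used by the GCV pseudo-package; the PINS mapping, the helper names, and the vendor/ destination directory are hypothetical, and the two tags are copied from the commands above.

```python
# Assumption: the GitHub organization and repository hosting the microservices.
REPOSITORY = "legumeinfo/microservices"

# Hypothetical pin list: microservice directory -> release tag to fetch.
PINS = {
    "macro_synteny_blocks": "macro_synteny_blocksv2.5.1",
    "chromosome_region": "chromosome_regionv3.0.0",
}


def degit_source(microservice: str, tag: str) -> str:
    """Return a degit source of the form user/repo/subdirectory#ref."""
    return f"{REPOSITORY}/{microservice}#{tag}"


def degit_commands(pins: dict[str, str], dest_root: str = "vendor") -> list[list[str]]:
    """Build one degit invocation per pinned microservice."""
    return [
        ["degit", degit_source(name, tag), f"{dest_root}/{name}"]
        for name, tag in pins.items()
    ]


for command in degit_commands(PINS):
    print(" ".join(command))
```

Keeping the pins in one mapping makes the versions an application depends on explicit and easy to update.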