Skip to content

Graphene

William Silversmith edited this page Apr 10, 2020 · 46 revisions

Graphene is a format (graphene://) used by CloudVolume and our custom version of neuroglancer that takes a highly oversegmented volume and enables proofreading by linking segments together into larger objects in a graph. https://flywire.ai is a notable use of Graphene.

64-bit Label Structure

Image Credit: Sven Dorkenwald

Levels

The PyChunkGraph level is indicated, as above, by the first eight bits of the segment ID. The graph is assembled into a hierarchy of levels to make representing multiple resolutions of edits possible. Level numbering starts at 1 instead of 0 due to the PyChunkGraph's genesis in Julia.

Our largest two datasets have 10 and 12 levels respectively.

Level Representation
0 Not Used.
1 Watershed
2 Intra-chunk agglomeration
3 First inter-chunk linkages w/ fan-out from each chunk
4 Fan-out from level 3 graph.
...
N "Root IDs" aka complete aggregated objects

Meshing

In order to control meshing costs, we are adopting a variation of the sharded format for initial meshes. Meshes must be created for each level of the chunk graph in order to provide a granular substrate for creating dynamic meshes. The initial meshes are based on a single timepoint in the chunk graph server, the dynamic meshes that come after are generated based on each proofreading action and are uploaded as individual mesh files (i.e. not sharded) as rewriting shards one mesh at a time would incur the same file creation cost as a simpler individual format. (It might be possible to aggregate multiple stitching levels into a single shard for something like an 8x savings on dynamic file creation costs, but these costs will probably be low as it is proportional to the number of proofreading edits on the dataset.)

The meshes will be initially created using zmesh, compressed using draco, and stored as initial meshes. All dynamic meshes can be generated by stitching initial or previous dynamic meshes. In order to properly fuse meshes across chunk boundaries, it's necessary to extend one pixel into a neighboring chunk. This requires a complex remapping operation to ensure the extension is mapped to the same segment ids as the originating chunk.

Dynamic meshes will be created on-demand as proofreading edits generate meshing requests. These requests will be fulfilled by a Real Time Stitcher (RTS) that is always running and checking for new requests. The results of this operation will be written into the dynamic meshing section of the mesh directory structure.

Directory Structure

meshes
   |- info # contains sharding information
   |\
   | initial
   |   2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12   # levels of the chunk graph
   |    \
   |    `121951224-0ea.shard`, ... (made up example, each shard contains many meshes) 
   |
   |\
   | dynamic
   |   \
   |   2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12   # levels of the chunk graph
   |    \
   |    `737912791273`, ... (made up example, each file contains one mesh)

The Sharded Format

With the exception of the lowest level of the graph server (L2) which can mesh and generate a shard file directly, each additional level of meshing requires some level of aggregation into multiple shards. For this reason we adapt the sharded file name to be $GRAPHENE_CHUNK_ID-$SHARD_NUMBER.shard where the $GRAPHENE_CHUNK_ID is the x,y,z bits of the label extracted and represented in base 10 and the $SHARD_NUMBER is the standard shard number assigned by the Precomputed sharding scheme.

As accessing sharded files is slower due to the need for consulting indices, we would like to balance the number of labels in a shard file with the total number of shard files for a given chunk, level pair. A larger number of shards increases the number of cache misses in CloudVolume and Neuroglancer's LRU cache for the constant size index. Too few shards means that reading the minishard index will become taxing.

As a rule of thumb, we should aim to make it so that reading a single uncached mesh is as close to perceptually instantaneous as possible on a 10 Mbps home connection (1 Gbps is the most common connection speed in the Princeton Neuroscience Institute). This means that all three requests + decompression should add up to less than 200 msec, a standard yardstick in HCI (humans can perceive smaller increments but this is the approximate threshold of annoyance). The reason this is a valid design goal is we wish to make proofreading a seamless experience, and loading meshes is core to the proofreading workflow. Batch analysis may require even higher throughput, but is less sensitive to individual mesh latency.

  • At 10 Mbps (1,250 kiB/sec), it takes 200 msec to download 250 kB.
  • At 100 Mbps (12,500 kiB/sec), it takes 200 msec to download 2.5 MB.
  • At 1 Gbps (125,000 kiB/sec) it takes 200 msec to download 25 MB.

If ping is 12 msec, then 36 msec of RTT is built into the request. This leaves 164 msec (150 kiB). To download a small 100 kB mesh, this leaves 50 kiB for fetching indices. gzip decodes at around 18 MB/sec, so it should take about 5 msec to decode. It is difficult to find reliable data on draco, this blog post is the best I've found.

In order to keep the size of minishard indicies to a reasonable size, we will need to progressively increase the number of shards and the number of minishards as the levels of the chunk graph are ascended. Since it makes sense to treat each L2 chunk as a single shard, we start at 0 shard bits. Given an L2 test region for fly, it seems that a 256x256x512 area contained ~6k labels. With a fan out of 2^3 per a chunk level, and Z getting truncated at 7062 slices, I attempted to balance the caching chance with download size in the below chart.

For number of labels in a 256x256x512 L2 chunk:

Level	    Labels	S. Bits	MS.Bits	MS. kiB	Idx kiB	%Cached Bits Left (64 allotted)	
2	      6048      0	4	8.86	0.03125	100.0%	60	
3	     48384      0	6	17.72	0.125	100.0%	58	
4	    387072      0	9	17.72	1	100.0%	55	
5	   3096576      1	11	17.72	4	50.0%	52	
6	  24772608	2	12	35.44	8	25.0%	50	
7	  99090432	4	12	35.44	8	6.3%	48	Z starts being truncated here
8	 396361728	8	11	17.72	4	0.4%	45	
9	1585446912	10	10	35.44	2	0.1%	44	
10	6341787648	10	12	35.44	8	0.1%	42