Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement graph components and archetypes #7500

Open
wants to merge 163 commits into
base: main
Choose a base branch
from

Conversation

grtlr
Copy link
Collaborator

@grtlr grtlr commented Sep 24, 2024

This implements basic graph primitives in the Rerun data model. This is a first step towards visualizing node-link diagrams in Rerun (related issue: #6898).

In addition to the changes to the data model, this PR adds two example crates, node_link_graph and graph_view to the Rust examples that show how these primitives can be used.

Design Decisions

  • Nodes and edges are stored as components and can be batched. To have a single node per entity we can use Rerun’s [clamping mechanism](https://rerun.io/docs/concepts/batches#component-clamping).
  • GraphNodeId is modeled as u32 to improve performance when using petgraph strings for better user experience.
  • A node is unique identified by combining its GraphNodeId and its EntityPath.
  • Labels of the nodes can be set via the labels component and toggled via show_labels
  • Hierarchical graphs can be modeled through entity paths. For edges that cross entity boundaries we can insert dummy nodes to properly render subparts of the hierarchy.
  • Nodes and edges need to be logged to different entities, otherwise the selections will collide. We provider helper functions / conversions to link nodes that are stored in different entities.

Logging example

rec.set_time_sequence("frame", 2);
rec.log("living/objects", &GraphNodes::new(["table"]))?;
rec.log("living/areas", &GraphNodes::new(["area0", "area1", "area2"]))?;

rec.log(
    "living/edges",
    &GraphEdgesDirected::new([
        // Both source and target are in the same entity
        ("living/areas", "area0", "area1"),
        ("living/areas", "area0", "area2"),
        ("living/areas", "area1", "area2"),
    ]),
)?;

rec.log(
    "reachable",
    &GraphEdgesUndirected::new([
        // Source and target are in different entities.
        (("living/areas", "area1"), ("living/objects", "table")),
    ]),
)?;

TODOs

  • Get rid of the Default derive for GraphNodeId and GraphEdge in the flatbuffer definitions.
  • Improve ergonomics for generating graph edges during logging.
  • Ensure that logging works from Python and C++ too.
  • Fix remaining lints.

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • If have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@nikolausWest
Copy link
Member

In this design, are the node id's global?

@grtlr
Copy link
Collaborator Author

grtlr commented Sep 24, 2024

@nikolausWest Good point! So far I've made the assumption that all node IDs are global and can be referenced across entities to allow edges that connect nodes from different entities (similar to ClassId).

We could probably use a namespaced approach (e.g. based on entity path) by changing the way we gather the nodes from the entities that are currently being visualized. Then, we would have to store this entity information in the edges though.

However, it's probably simpler for the user to encode this information in the node IDs themselves. In the case of String IDs, this could be done by appending the entity path as a suffix. If we stick with IDs based on integers users could for example use factorizations.

Do you have a particular use case for namespaced node IDs in mind? How would you expect Rerun to behave in that case?

@nikolausWest
Copy link
Member

So far I've made the assumption that all node IDs are global and can be referenced across entities to allow edges that connect nodes from different entities (similar to ClassId).

Class Ids aren't actually global in the sense that to resolve a class id into e.g. a color, we walk up the graph to the first Annotation Context and use that to look up the value. That means these are actually scoped.

Do you have a particular use case for namespaced node IDs in mind? How would you expect Rerun to behave in that case?

I'm rather thinking about what happens if a user has multiple graphs. It would be a bit strange if edges started going between them because the user didn't correctly partition their node ids.

@nikolausWest
Copy link
Member

One option to consider here could be to have two types of edges. One that is within-entity edges (just a pair of node ids). The other could be between entity edges (destination entity, optional(source id), optional(destination id)). There might be several ways to model that but that would at least be the general idea

@grtlr
Copy link
Collaborator Author

grtlr commented Sep 25, 2024

@nikolausWest Brief update:

Edges now have two additional attributes source_entity and target_entity as you described above. That way we can:

  1. Make edges local to an entity by default to avoid collisions.
  2. Allow linking between nodes of different entities.

The following shows a new toy example of the new logging API, which I also streamlined: https://github.com/rerun-io/rerun/pull/7500/files#diff-d054c306b388fcc1e8daf9f0477735519df3eeb486030979c62478b5d43404dcR36-R66.

I've also improved the debug graph viewer to show lists of nodes, edges, and their corresponding entity path. I'm currently working on getting overrides in the UI to work so that we can color nodes and edges to more easily understand the data model. This also helps me better understand the visualizer concepts.

I have one outstanding design decision:

The current systems allows for edges to live in completely unrelated parts of the entity hierarchy. This means that when the user choses to visualize a certain sub-entity not all edges are retrieved by default and the user has to manually specify the additional entities that contain the edges using Entity Path Filters:

+ /doors/**                <- contains the global edges
+ /hallway/areas/**        <- the actual entity that the user wants to visualize

Since this can be confusing due to the lack of discoverability, I think we should pull in global edges from outside the current hierarchy and visualizing them as dummy nodes. In the current design this forces us to iterate over all edges starting from the root. Should we mitigate this by introducing a new GraphCrossEntityEdge component, similar to what you described above?

@nikolausWest
Copy link
Member

@grtlr I actually think having edges on different entities than nodes is the main reason to go with your proposal. That allows users to put different meta data on edges and nodes which is a very important feature. If you can think of a way to still keep it more local that might be worth while.

Since this can be confusing due to the lack of discoverability, I think we should pull in global edges from outside the current hierarchy and visualizing them as dummy nodes. In the current design this forces us to iterate over all edges starting from the root.

I'm not completely sure about this. Maybe @Wumpf has some thoughts?

@Wumpf
Copy link
Member

Wumpf commented Sep 26, 2024

Haven't been following the entire discussion, but there's a lot of issues with pulling data from outside of a viewer's query/entity-pathfilter:

  • changes the query mechanism from going through a higher level abstraction to direct store queries
    • we actually want to get to a place where we can predict the query of a view perfectly just from looking at how its configured, not knowing its specific type
    • we're quite far from that and there's existing violations of that rule, 3d transforms being the most prominent one
  • blueprint can't be applied to everything outside the path filter
    • blueprint overrides have no effect
    • blueprint defaults on the View have no effect
  • if we pull data outside of the path filter, how to stop certain data from being ingested?
    • control for that is blueprint
    • does this cause a huge query?
    • how do I display independent graphs in independent views?

@nikolausWest
Copy link
Member

I think it's pretty clear we shouldn't include data from outside the entity path query. I think one question that leaves though is how to handle edges that are within the included entities but refer to nodes that are not. Perhaps some kind of greyed out nodes could make sense there (maybe even an option to include or exclude those)

@grtlr
Copy link
Collaborator Author

grtlr commented Sep 26, 2024

Thank you @Wumpf for the clarification—that makes a lot of sense!

@nikolausWest I agree that we should show those as edges to some dummy nodes. In fact this is how I stumbled upon the above problem in the first place (undirected edges).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat-graph-view Everything related to the graph view
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants