MIT-SPARK/Ouroboros

Spark VLC

Installation

Run pip install from the top-level directory. Eventually, both packages should be installed by the same setup.py.

See the example in the examples directory for an idea of how to use the interface so far.

Example ROS Integration

ouroboros_ros is a valid ROS package that has an example of a node that generates "ground truth" loop closures. It queries the tf tree for the specified tf, and occasionally publishes a loop closure message where appropriate. You can try this example by building ouroboros_ros in your ROS workspace, running roslaunch ouroboros_ros ground_truth_lc_node.launch, and then playing a rosbag with the appropriate tfs. See ouroboros_ros/config/ground_truth_lc_node.yaml for configuration information.
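The README does not say which criterion the node uses to decide when a loop closure is "appropriate". As a minimal sketch, assuming a simple rule (the current position is within some radius of a position visited sufficiently long ago), the idea could look like the following. The function name, the pose format, and the `radius` and `min_time_gap` thresholds are all hypothetical and are not taken from ouroboros_ros.

```python
import math


def find_ground_truth_loop_closures(poses, radius=1.0, min_time_gap=30.0):
    """Return (i, j) index pairs where pose j revisits the position of an earlier pose i.

    poses: list of (timestamp_seconds, x, y, z) tuples, sorted by time.
    radius and min_time_gap are hypothetical thresholds, not values from the real node.
    """
    closures = []
    for j, (t_j, *p_j) in enumerate(poses):
        for i in range(j):
            t_i, *p_i = poses[i]
            if t_j - t_i < min_time_gap:
                break  # poses are time-sorted, so every later i is even more recent
            if math.dist(p_i, p_j) <= radius:
                closures.append((i, j))
                break  # report at most one closure per new pose
    return closures


# Positions as they might be read from a tf tree over one minute.
poses = [
    (0.0, 0.0, 0.0, 0.0),
    (20.0, 5.0, 0.0, 0.0),
    (40.0, 5.0, 5.0, 0.0),
    (60.0, 0.2, 0.1, 0.0),  # back near the starting position
]
closures = find_ground_truth_loop_closures(poses)
print(closures)
```

The time gap keeps consecutive poses from matching each other, which would otherwise flood the output with trivial closures.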

Interface Notes

The central feature of the library is the VlcDb database for visual loop closures (see vlc_db.py). This class provides an interface for storing, updating, and querying keyframe images, embeddings, keypoints, descriptors, and loop closures. The "VLC Server" will be built on top of this. Currently the interface to this database is:

Images

add_image adds a (keyframe) image to the database with associated metadata, and get_image retrieves a stored image. When you add an image, you get back a UUID that uniquely identifies that image. drop_image deletes an image from the store.

update_embedding, update_keypoints, and update_descriptors update the place embedding, image keypoints, and feature descriptors for a stored image, given the image uuid and embedding/keypoints/descriptors.

query_embeddings takes an array of embedding vectors and returns the top-k closest matches and distances for each query vector.
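To make the top-k lookup concrete, here is a minimal standalone sketch. It does not use the library: `query_embeddings_sketch`, its arguments, and its return format are hypothetical, and Euclidean distance is an assumption (the README does not state which metric is used).

```python
import heapq
import math


def query_embeddings_sketch(stored, queries, k=2):
    """For each query vector, return the k nearest stored embeddings.

    stored: dict mapping an image id to its embedding (list of floats).
    Returns one list per query of (image_id, distance) pairs, closest first.
    Euclidean distance is an assumption; the real metric may differ.
    """
    results = []
    for q in queries:
        # Score every stored embedding against this query.
        scored = ((math.dist(q, emb), image_id) for image_id, emb in stored.items())
        nearest = heapq.nsmallest(k, scored)
        results.append([(image_id, dist) for dist, image_id in nearest])
    return results


stored = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 3.0]}
matches = query_embeddings_sketch(stored, [[0.9, 0.0], [0.0, 2.0]], k=2)
for per_query in matches:
    print(per_query)
```

A brute-force scan like this is fine for small stores; a larger database would likely want a vectorized or indexed search instead.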

Images are inserted into the database with SparkImage datatype (in spark_image.py) and returned with the VlcImage datatype (in vlc_image.py).
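The add / update / get / drop flow described above can be illustrated with a toy in-memory table. This is a sketch under stated assumptions, not the VlcDb implementation: `ToyImageStore` and `ToyImageRow` are hypothetical names, and the method signatures are guesses at the general shape rather than the real ones (see vlc_db.py for those).

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ToyImageRow:
    """Hypothetical stand-in for a stored image "row"."""

    image: Any  # pixel data handed in by the caller
    metadata: Dict[str, Any]
    embedding: Optional[list] = None
    keypoints: Optional[list] = None
    descriptors: Optional[list] = None


class ToyImageStore:
    """Toy in-memory table keyed by UUID; not the real VlcDb."""

    def __init__(self):
        self._rows: Dict[str, ToyImageRow] = {}

    def add_image(self, image, **metadata) -> str:
        # The store, not the caller, chooses the identifier.
        image_id = str(uuid.uuid4())
        self._rows[image_id] = ToyImageRow(image=image, metadata=metadata)
        return image_id

    def get_image(self, image_id: str) -> ToyImageRow:
        return self._rows[image_id]

    def update_embedding(self, image_id: str, embedding) -> None:
        self._rows[image_id].embedding = embedding

    def update_keypoints(self, image_id: str, keypoints) -> None:
        self._rows[image_id].keypoints = keypoints

    def update_descriptors(self, image_id: str, descriptors) -> None:
        self._rows[image_id].descriptors = descriptors

    def drop_image(self, image_id: str) -> None:
        del self._rows[image_id]


store = ToyImageStore()
image_id = store.add_image([[0, 1], [2, 3]], session_id="s0", timestamp=12.5)
store.update_embedding(image_id, [0.1, 0.9])
row = store.get_image(image_id)
print(image_id, row.embedding, row.metadata)
```

The point of the sketch is the split between what goes in (raw pixel data plus metadata) and what comes out (a row that also carries whatever derived data has been attached so far).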

Loop Closures

add_lc adds a loop closure to the database and get_lc retrieves one. This functionality is probably most useful for multi-session / multi-robot operation.

Loop closures are inserted and returned with the SparkLoopClosure datatype (in spark_loop_closure.py).
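As a rough illustration of storing and fetching a loop closure, the sketch below uses a toy table. Everything in it is hypothetical: the README does not list the fields of SparkLoopClosure, so `ToyLoopClosure` and its `from_image_id`, `to_image_id`, and `translation` fields are placeholders, and the add_lc / get_lc signatures are assumptions.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ToyLoopClosure:
    """Hypothetical loop closure record; every field here is an assumption."""

    from_image_id: str
    to_image_id: str
    # Relative translation (x, y, z) between the two keyframes, as a placeholder.
    translation: Tuple[float, float, float]


class ToyLcTable:
    """Toy in-memory loop closure table keyed by UUID."""

    def __init__(self):
        self._rows: Dict[str, ToyLoopClosure] = {}

    def add_lc(self, loop_closure: ToyLoopClosure) -> str:
        lc_id = str(uuid.uuid4())
        self._rows[lc_id] = loop_closure
        return lc_id

    def get_lc(self, lc_id: str) -> ToyLoopClosure:
        return self._rows[lc_id]


table = ToyLcTable()
lc = ToyLoopClosure("image-from-session-a", "image-from-session-b", (0.5, 0.0, 0.0))
lc_id = table.add_lc(lc)
fetched = table.get_lc(lc_id)
print(fetched)
```

Because the record refers to images by identifier only, the two keyframes can come from different sessions or robots, which is where storing closures centrally would matter most.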

Sessions

A session is a single, continuous stream of images from a camera. Usually this maps to the normal notion of a session being one robot running for a certain amount of time. But for a robot with multiple cameras, each camera is a session. A centralized mapping system which consumes several distributed camera streams would have one session for each camera.

We provide add_session to create a new session and insert_session to insert a session generated by an external state manager.

Sessions are returned with the SparkSession type (in vlc_session_table.py).
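The difference between the two entry points is who chooses the session identifier. A minimal sketch, assuming that add_session generates an identifier and insert_session accepts one from outside: `ToySessionTable`, `ToySession`, the `name` field, the `get_session` helper, and the duplicate check are all hypothetical and may not match vlc_session_table.py.

```python
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass
class ToySession:
    """Hypothetical session row."""

    session_id: str
    name: str


class ToySessionTable:
    """Toy in-memory session table; not the real implementation."""

    def __init__(self):
        self._rows: Dict[str, ToySession] = {}

    def add_session(self, name: str) -> str:
        """Create a session and let the table pick its identifier."""
        session_id = str(uuid.uuid4())
        self._rows[session_id] = ToySession(session_id, name)
        return session_id

    def insert_session(self, session_id: str, name: str) -> None:
        """Register a session whose identifier was chosen elsewhere."""
        if session_id in self._rows:
            raise KeyError(f"session {session_id!r} already exists")
        self._rows[session_id] = ToySession(session_id, name)

    def get_session(self, session_id: str) -> ToySession:
        return self._rows[session_id]


# One robot with two cameras yields two sessions.
sessions = ToySessionTable()
left_id = sessions.add_session("robot1/left_camera")
sessions.insert_session("external-42", "robot1/right_camera")
print(sessions.get_session(left_id))
print(sessions.get_session("external-42"))
```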

Development Notes

Terms / Design choices to expand on:

  • "Database" architecture. Philosophically, the current design is a database with three tables: Images, LoopClosures, and Sessions. Since many of the things we want to store are very large chunks of data and we don't necessarily have complicated queries, we don't actually use a database and instead roll our own system for managing storage and querying. If we ever start doing more complex queries in the future, we should switch to SQLite.

  • VlcImageTable -- ...
  • LcTable -- ...
  • SessionTable -- ...

When you get an element from a table, you get back a class that represents a "row" in the database (e.g. VlcImage).

  • SparkImage -- wrapper for all pixel-level image data (e.g. r,g,b,d), but NOT METADATA
  • VlcImage -- a "row" of image data that we store

Dev questions:

  • Naming of SparkImage / SparkLoopClosure. These are currently "rows" in the database, but in general they define the interface for the data that is put into / queried from the database (even if the underlying storage were different).
  • Anything else to store for SparkLoopClosure?

TODO:

  • Make a clear interface to (optionally) stop storing the image once we have generated the embedding and keypoints etc.
    • There are many possible combinations of what info we keep, so probably don't want the database itself to support all of these. Let that be the job of a higher level of abstraction. But the database probably should support dropping the original image once we have the embedding etc.
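One possible shape for the "drop the original image" operation in the TODO above is sketched here. It is purely illustrative: no such interface exists yet according to this README, so `ToyRow` and `drop_pixels` are hypothetical names, and the guard that refuses to drop pixels before the derived data exists is just one policy among several.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToyRow:
    """Hypothetical image row holding raw pixels plus derived data."""

    image: Optional[Any]
    embedding: Optional[list] = None
    keypoints: Optional[list] = None


def drop_pixels(row: ToyRow) -> None:
    """Release the raw image, but only once the derived data exists."""
    if row.embedding is None or row.keypoints is None:
        raise ValueError("refusing to drop pixels before embedding and keypoints are set")
    row.image = None


row = ToyRow(image=[[0, 1], [2, 3]])
row.embedding = [0.1, 0.9]
row.keypoints = [(0, 1)]
drop_pixels(row)
print(row)
```

Keeping the policy in a small function outside the table matches the note above that the database itself should only support the basic drop, with richer retention rules left to a higher layer.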