Data preprocessing issues #165

kmatzen · 2024-08-19T18:10:10Z

I'm going to consolidate some data preprocessing problems here and then someone can break them out into separate issues as needed.

I'm attempting to go through the instructions as written and wanted to note some incompatibilities that have arisen.

scannet++ SfM reconstructions appear to lack some of the images that are listed in the selected images.
- Workaround: skip these images, then rewrite the selection array and the pair indices.
- I downloaded the default scannet++ data in the first week of August.
- Same issue as "scannetpp_pairs not compatible with scannetpp's colmap data" #159, although I found many scenes to be affected.
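
For anyone hitting the same problem, here is a minimal sketch of the skip-and-reindex workaround described above. It is not the repository's code: the function name `drop_missing_images` and the assumed layout (a list of selected image names plus an `(N, 2)` array of pair indices into that list) are hypothetical and may not match the actual preprocessed files.

```python
import numpy as np

def drop_missing_images(selected, pairs, available):
    """Drop selected images that are not available and re-index the pairs.

    selected:  sequence of image names (hypothetical layout)
    pairs:     (N, 2) integer array of indices into `selected`
    available: set of image names actually present in the reconstruction
    """
    selected = list(selected)
    pairs = np.asarray(pairs, dtype=np.int64).reshape(-1, 2)

    # True for every selected image that exists in the reconstruction.
    keep = np.array([name in available for name in selected], dtype=bool)

    # Map old index -> new index, with -1 marking a dropped image.
    remap = np.full(len(selected), -1, dtype=np.int64)
    remap[keep] = np.arange(keep.sum())

    # Re-index the pairs and discard any pair that touches a dropped image.
    new_pairs = remap[pairs]
    new_pairs = new_pairs[(new_pairs >= 0).all(axis=1)]

    new_selected = [name for name, k in zip(selected, keep) if k]
    return new_selected, new_pairs


selected = ["a", "b", "c", "d"]
pairs = [[0, 1], [1, 2], [2, 3], [0, 3]]
new_selected, new_pairs = drop_missing_images(selected, pairs, {"a", "c", "d"})
```

In this toy example image "b" is missing, so the two pairs that reference it are dropped and the remaining pairs are renumbered against the shortened selection.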

hm3d, gibson, and replica cad are missing some views

I'm extending the skip-and-retry logic to handle missing files:

          # load the view (and use the next one if this one's broken)
          for ii in range(view_index, view_index + 5):
              try:
                  image, depthmap, intrinsics, camera_pose = self._load_one_view(data_path, key, ii % 5, resolution, rng)
                  if np.isfinite(camera_pose).all():
                      break
              except Exception as exc:
                  # missing or unreadable view: report it and try the next one
                  print(exc)
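
One caveat with the loop above is that if all five views are broken, it falls through silently. A minimal standalone sketch of the same retry idea that fails loudly instead is given below. The helper `load_first_valid_view` and the `load_one_view(idx)` callable are hypothetical stand-ins for `self._load_one_view(...)`, and catching only `OSError` is an assumption about how missing files surface.

```python
import numpy as np

def load_first_valid_view(load_one_view, view_index, n_views=5):
    """Try views starting at view_index, wrapping around, until one loads.

    load_one_view(idx) is assumed to return
    (image, depthmap, intrinsics, camera_pose) or raise OSError.
    """
    last_error = None
    for ii in range(view_index, view_index + n_views):
        try:
            view = load_one_view(ii % n_views)
        except OSError as exc:  # e.g. FileNotFoundError for a missing view
            last_error = exc
            continue
        camera_pose = view[3]
        if np.isfinite(camera_pose).all():
            return view
    # Every candidate was missing or had a non-finite pose.
    raise RuntimeError(f"no valid view among {n_views} candidates") from last_error


def _fake_loader(idx):
    # Toy loader: view 2 is missing, view 3 has a NaN pose, the rest are fine.
    if idx == 2:
        raise FileNotFoundError(f"view {idx} missing")
    pose = np.eye(4)
    if idx == 3:
        pose[0, 3] = np.nan
    return (f"img{idx}", None, np.eye(3), pose)


view = load_first_valid_view(_fake_loader, view_index=2)
```

Here the call skips view 2 (missing) and view 3 (non-finite pose) and returns view 4.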

blendedmvs fails a check on the principal point

Error:

[rank6]:     assert min_margin_x > W/5, f'Bad principal point in view={info}'
[rank6]: AssertionError: Bad principal point in view=('data/blendedmvs_processed/000000000000000000000001', '00000036')

As a workaround I commented out both the horizontal and vertical principal point asserts. 1/5 the height and width seems like a heuristic?
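
As an alternative to commenting out the asserts, the check could be turned into a predicate so that bad views are logged or skipped. This is only a sketch: the `W/5` threshold comes from the assertion in the log above, but the definition of the margin as the distance from the principal point to the nearest image border is an assumption based on the variable name, and `principal_point_ok` is a hypothetical helper.

```python
import numpy as np

def principal_point_ok(intrinsics, width, height, min_fraction=0.2):
    """Return True if the principal point is far enough from every border.

    intrinsics is a 3x3 camera matrix with (cx, cy) at [0, 2] and [1, 2].
    min_fraction=0.2 mirrors the W/5 and H/5 thresholds in the assertion.
    """
    cx, cy = float(intrinsics[0, 2]), float(intrinsics[1, 2])
    # Assumed definition: distance to the nearest border along each axis.
    min_margin_x = min(cx, width - cx)
    min_margin_y = min(cy, height - cy)
    return min_margin_x > width * min_fraction and min_margin_y > height * min_fraction


K_centered = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
K_offset = np.array([[500.0, 0.0, 100.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
centered_ok = principal_point_ok(K_centered, width=640, height=480)
offset_ok = principal_point_ok(K_offset, width=640, height=480)
```

With a 640x480 image, a principal point at x=100 is closer than 128 pixels (640/5) to the left border, so the second camera fails the check while the centered one passes.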

Ran into this issue: preprocess_co3d.py can't work #162

find_scenes.py doesn't seem to use the same validation set size as what is advertised in the top-level README.

--- a/datasets_preprocess/habitat/find_scenes.py
+++ b/datasets_preprocess/habitat/find_scenes.py
@@ -49,8 +49,8 @@ def find_all_scenes(habitat_root, n_scenes=[100000]):
     print(f'from {len(list_scenes)} scenes in total')
 
     np.random.shuffle(list_scenes)
-    train_scenes = list_scenes[len(list_scenes)//10:]
-    val_scenes = list_scenes[:len(list_scenes)//10]
+    train_scenes = list_scenes[len(list_scenes)//1000:]
+    val_scenes = list_scenes[:len(list_scenes)//1000]
 
     def write_scene_list(scenes, n, fpath):
         sub_scenes = [os.path.join(scene, id) for scene, ids in scenes for id in ids]
@@ -65,7 +65,7 @@ def find_all_scenes(habitat_root, n_scenes=[100000]):
 
     for n in n_scenes:
         write_scene_list(train_scenes, n, os.path.join(habitat_root, f'Habitat_{n}_scenes_train.txt'))
-        write_scene_list(val_scenes, n//10, os.path.join(habitat_root, f'Habitat_{n//10}_scenes_val.txt'))
+        write_scene_list(val_scenes, n//1000, os.path.join(habitat_root, f'Habitat_{n//1000}_scenes_val.txt'))
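
To make the effect of the divisor easier to see, here is a small standalone sketch of the same split with the divisor exposed as a parameter. `split_scenes` and `val_divisor` are hypothetical names, not part of find_scenes.py, and the seeded generator is only there to make the example reproducible.

```python
import numpy as np

def split_scenes(list_scenes, val_divisor=10, seed=None):
    """Shuffle scenes and hold out len(list_scenes) // val_divisor for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(list_scenes))
    shuffled = [list_scenes[i] for i in order]
    n_val = len(shuffled) // val_divisor
    train_scenes = shuffled[n_val:]
    val_scenes = shuffled[:n_val]
    return train_scenes, val_scenes


scenes = [f"scene_{i:03d}" for i in range(25)]
train_10, val_10 = split_scenes(scenes, val_divisor=10, seed=0)
train_1000, val_1000 = split_scenes(scenes, val_divisor=1000, seed=0)
```

Note that integer division makes the validation set empty whenever there are fewer scenes than the divisor, which is what happens in the second call with 25 scenes and a divisor of 1000.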


yocabon · 2024-09-20T09:40:06Z

Hi,
Thanks for the issue; I apologize for not looking into this earlier.

It's also related to #159

About scannet++:
We were using one of the early releases (not v1), and it seems that they have updated the dataset since then (and are going to update it again?).
We will eventually update the released pairs to include the new scenes.

blendedmvs/find_scenes/co3d error: should now be addressed

yocabon added a commit that referenced this issue Sep 20, 2024

comment asserts, generate more subsets for habitat, #165

c9e9336

yocabon mentioned this issue Sep 20, 2024

scannetpp_pairs not compatible with scannetpp's colmap data #159

Closed

nam1410 mentioned this issue Oct 10, 2024

scannetpp error #183

Open
