
data format (Good Intro for Beginner) #244

Open
TianhaoFu opened this issue Dec 29, 2021 · 7 comments
Labels
good first issue Good for newcomers

Comments

@TianhaoFu

TianhaoFu commented Dec 29, 2021

Hi, thanks for your code.

While using your repo, I found that the batch_data keys are as follows:

dict_keys(['metadata', 'points', 'voxels', 'shape', 'num_points', 'num_voxels', 'coordinates', 'gt_boxes_and_cls', 'hm', 'anno_box', 'ind', 'mask', 'cat'])

Can you explain what each item means? Also, can you tell me how the voxel data is generated? Where is the corresponding code?


Also, when training my CenterPoint based on PointPillars, I found that the input data are

data["features"], data["num_voxels"], data["coors"]

One example of the input data is:

features: torch.Size([74222, 20, 5])
num_voxels: torch.Size([74222])
coors: torch.Size([74222, 4])

Can you tell me how the features are generated? Where is the corresponding code?
And what is the meaning of num_voxels?

thanks!
:)

@tianweiy
Owner

tianweiy commented Dec 30, 2021

Can you explain what each item means? Also, can you tell me how the voxel data is generated? Where is the corresponding code?

metadata: info for this frame (e.g. lidar path, frame id, etc.)
points: the lidar points, a list of N x (4+x) data
voxels: the voxelized points, N x (4+x)
shape: spatial shape of the voxel grid
num_points: how many points are inside each voxel
num_voxels: how many voxels are in each frame
coordinates: integer voxel coordinates of each voxel; the first channel indicates which frame this voxel belongs to across the whole batch
gt_boxes_and_cls: the training target (x, y, z, w, l, h, theta, and the object label)
hm: the center heatmap described in the paper
anno_box: the box map for training (it contains the target box info at each center location); it is generated here

anno_box[new_idx] = np.concatenate(

ind: indices of the object centers. These are used to extract the box parameter predictions from the BEV map (and to compute the loss against anno_box during training)

mask: we zero-pad the boxes (e.g. if example 1 has 10 real boxes and example 2 has 20, we pad both frames to 20 boxes for efficient batching). The mask indicates whether an entry is a real box or zero padding

cat: the category label of each anno_box entry
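To make the roles of ind and mask concrete, here is a minimal sketch of how center indices are typically used to read box predictions out of a flattened BEV map. All shapes and index values are made up for illustration; this is not the repo's actual loss code.

```python
import numpy as np

# Hypothetical shapes for illustration (not the repo's actual config values).
B, C, H, W = 2, 8, 4, 4      # batch size, box-regression channels, BEV grid
max_objs = 3                 # boxes are zero-padded to this count per frame

rng = np.random.default_rng(0)
pred = rng.standard_normal((B, C, H * W))   # flattened box-parameter map

# 'ind' is the flattened BEV index (y * W + x) of each object center;
# 'mask' marks which padded slots contain a real box.
ind = np.array([[5, 9, 0], [2, 0, 0]])                       # (B, max_objs)
mask = np.array([[1, 1, 0], [1, 0, 0]], dtype=bool)

# Pull the predicted box parameters out at each center location.
idx = np.broadcast_to(ind[:, None, :], (B, C, max_objs))     # (B, C, max_objs)
at_centers = np.take_along_axis(pred, idx, axis=2)           # (B, C, max_objs)
at_centers = at_centers.transpose(0, 2, 1)                   # (B, max_objs, C)

# Only the unpadded slots contribute to the regression loss against anno_box.
valid = at_centers[mask]                                     # (3, C)
```

The actual code does the same gather with torch, but the indexing logic is identical.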

Besides, can you tell me how the voxel data is generated?

It is implemented as dynamic voxelization:

def voxelization(points, pc_range, voxel_size):
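For intuition, a dynamic voxelization keeps every in-range point and assigns it an integer voxel coordinate, with no cap on points per voxel and no padding. The sketch below is an illustrative reimplementation under that definition, not the repo's actual function:

```python
import numpy as np

def dynamic_voxelize(points, pc_range, voxel_size):
    """Assign every in-range point to a voxel, without sampling or padding.

    points:     (N, 3+) array, first three columns are x, y, z
    pc_range:   (xmin, ymin, zmin, xmax, ymax, zmax)
    voxel_size: (dx, dy, dz)
    Returns the kept points and the integer voxel coordinate of each.
    """
    pc_range = np.asarray(pc_range, dtype=np.float32)
    voxel_size = np.asarray(voxel_size, dtype=np.float32)

    coords = ((points[:, :3] - pc_range[:3]) / voxel_size).astype(np.int32)
    grid = ((pc_range[3:] - pc_range[:3]) / voxel_size).astype(np.int32)

    # Drop points outside the range; everything else is kept, which is what
    # makes the scheme "dynamic": there is no cap on points per voxel.
    keep = np.all((coords >= 0) & (coords < grid), axis=1)
    return points[keep], coords[keep]

pts = np.array([[0.5, 0.5, 0.5], [0.6, 0.4, 0.5],
                [9.9, 9.9, 0.5], [-5.0, 0.0, 0.0]], dtype=np.float32)
kept, vcoords = dynamic_voxelize(pts, (0, 0, 0, 10, 10, 1), (1.0, 1.0, 1.0))
# the first two points land in the same voxel; the last point is out of range
```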

Also, when training my CenterPoint based on PointPillars, I found that the input data are

The features are generated here:

voxels, coordinates, num_points = self.voxel_generator.generate(

It is basically 74222 pillars; each pillar holds up to 20 lidar points (some are zero-padded), and each point has 5 features (x, y, z, reflectance, timestamp).

num_voxels in this input dict is set from num_points_in_voxel, i.e. the number of valid points per voxel (or pillar)
coors is the integer coordinate of each pillar
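A rough sketch of the pillar packing just described (illustrative only, not the repo's voxel_generator): points falling in the same voxel are grouped, and each group is zero-padded, or truncated, to a fixed number of points. In the real pipeline the batch index is additionally prepended to coors during collation, which is why coors ends up with 4 columns.

```python
import numpy as np

def build_pillars(points, coords, max_points=20):
    """Pack points that share a voxel coordinate into fixed-size pillars.

    points: (N, F) per-point features (e.g. x, y, z, reflectance, timestamp)
    coords: (N, 3) integer voxel coordinate of each point
    Returns features (P, max_points, F), counts (P,), and coors (P, 3).
    """
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    P, F = len(uniq), points.shape[1]
    features = np.zeros((P, max_points, F), dtype=points.dtype)
    counts = np.zeros(P, dtype=np.int64)
    for pt, pillar in zip(points, inverse):
        if counts[pillar] < max_points:   # points beyond the cap are dropped
            features[pillar, counts[pillar]] = pt
            counts[pillar] += 1
    return features, counts, uniq

pts = np.arange(20, dtype=np.float32).reshape(4, 5)   # 4 points, 5 features each
vox = np.array([[0, 0, 0], [0, 0, 0], [1, 1, 0], [2, 2, 0]])
features, num_points_in_voxel, coors = build_pillars(pts, vox)
# features: (3, 20, 5), num_points_in_voxel: [2, 1, 1], coors: (3, 3)
```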

Hopefully, this helps.

@tianweiy tianweiy changed the title data format data format (Good Intro for Beginner) Dec 30, 2021
@tianweiy tianweiy added the good first issue Good for newcomers label Dec 30, 2021
@tianweiy tianweiy pinned this issue Dec 30, 2021
@TianhaoFu
Author

Thanks for your reply. I carefully read the voxelization function:

def voxelization(points, pc_range, voxel_size): 

This function does not look like dynamic voxelization to me, because its implementation simply iterates over the point cloud, generates the voxel indices, and fills the points' feature data into a new voxel matrix according to those indices.
@tianweiy

@tianweiy
Owner

tianweiy commented Jan 2, 2022

Because its implementation simply iterates over the point cloud, generates the voxel indices, and fills the points' feature data into a new voxel matrix according to those indices.

Sorry, I don't get this. I think what you describe (and what I implemented) is dynamic voxelization. Do I misunderstand the concept?

@TianhaoFu
Author

I printed the input point cloud coordinate data and found it to be in this form:

tensor([[  0,   0, 277, 241],
        [  0,   0, 310, 283],
        [  0,   0, 260, 241],
        ...,
        [  3,   0, 193, 361],
        [  3,   0, 152, 112],
        [  3,   0, 383, 193]], device='cuda:0', dtype=torch.int32)

data = dict(
    features=voxels,
    num_voxels=num_points_in_voxel,
    coors=coordinates,
    batch_size=batch_size,
    input_shape=example["shape"][0],
)
Why are the coordinates in 4 dimensions and what does each dimension mean?


Also, I printed the input point cloud voxel data and found it to be in this form:

tensor([[[ -2.8630,   4.2008,  -1.9721,  23.0000,   0.1499],
         [ -2.9000,   4.2069,  -1.8522,  33.0000,   0.1001],
         [ -2.9082,   4.3142,  -1.8927,  28.0000,   0.0000],
         ...,
         [ -2.9082,   4.3697,  -1.9095,  27.0000,   0.0000],
         [ -2.9958,   4.3997,  -1.7918,  12.0000,   0.1001],
         [ -2.8851,   4.2570,  -1.9805,  24.0000,   0.3003]],

        [[  5.5198,  10.9735,  -2.3002,   3.0000,   0.2500],
         [  5.4946,  10.9447,  -2.2990,   3.0000,   0.1499],
         [  5.5790,  10.8439,  -2.2948,   3.0000,   0.0498],
         ...,
         [  5.4573,  10.9144,  -2.2963,   3.0000,   0.0498],
         [  5.4343,  10.9814,  -2.3001,   3.0000,   0.1499],
         [  5.5965,  10.8180,  -2.2944,   3.0000,   0.0000]],

        [[ -2.9384,   0.9971,  -1.6486,  20.0000,   0.0498],
         [ -2.9434,   0.8410,  -1.8094,   9.0000,   0.2500],
         [ -2.9163,   0.9642,  -1.8186,   5.0000,   0.1001],
         ...,
         [ -2.9446,   0.8919,  -1.6328,  13.0000,   0.1499],
         [ -2.9377,   0.8296,  -1.7143,  10.0000,   0.0498],
         [ -2.9425,   0.9046,  -1.8151,   7.0000,   0.4004]],

        ...,

        [[ 21.1160, -12.5320,   0.5717,  15.0000,   0.1001],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         ...,
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000]],

        [[-28.6991, -20.6613,  -0.8174,  22.0000,   0.1499],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         ...,
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000]],

        [[-12.4745,  25.5616,  -3.3252,   4.0000,   0.2500],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         ...,
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000],
         [  0.0000,   0.0000,   0.0000,   0.0000,   0.0000]]], device='cuda:0')

data = dict(
    features=voxels,
    num_voxels=num_points_in_voxel,
    coors=coordinates,
    batch_size=batch_size,
    input_shape=example["shape"][0],
)

By definition the voxel data is just a rearrangement of the point cloud data, so why are there negative numbers, and what does each dimension mean? @tianweiy

@TianhaoFu
Author

Because its implementation simply iterates over the point cloud, generates the voxel indices, and fills the points' feature data into a new voxel matrix according to those indices.

Sorry, I don't get this. I think what you describe (and what I implemented) is dynamic voxelization. Do I misunderstand the concept?

Sorry, I misunderstood; you are right. Your implementation is dynamic voxelization.

@yuedi-hhh

Hello, I'd like to ask: will the DynamicVoxelEncoder class in det3d/models/readers/dynamic_vixel_encoder.py and the Voxelization class in det3d/datasets/pipelines/process.py both be used in the same data flow? One is a reader and the other is a data preprocessing pipeline step, yet both perform voxelization. Is it correct to understand that when VoxelFeatureExtractorV3 is used as the reader, the Voxelization preprocessing step is required, and when DynamicVoxelEncoder is used as the reader, the Voxelization preprocessing step is not? I noticed that V3 does not voxelize the point cloud itself.

I still don't understand how this framework operates and would appreciate your help, thanks!
My understanding of how det3d works is this: decorators such as @READERS.register_module cause each file to be executed top to bottom during import, which registers the various classes into a registry, so that each class can then be built from the keywords in the config. Is this understanding correct? Besides reading the source code, where else can I learn about this mechanism? There is a lot of source code, and going through it class by class is too complicated to follow.
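The decorator-based registration described above can be sketched minimally like this (names and details are illustrative; Det3D's actual Registry has more machinery, but the mechanism is the same):

```python
# Minimal sketch of the decorator-based registry pattern (illustrative only).

class Registry:
    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, cls):
        # Runs at import time: merely importing the file that defines a
        # class adds it to the registry under its class name.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        # 'type' in the config selects the class; the rest become kwargs.
        cfg = dict(cfg)
        cls = self._modules[cfg.pop("type")]
        return cls(**cfg)

READERS = Registry("readers")

@READERS.register_module
class VoxelFeatureExtractorV3:
    def __init__(self, num_input_features=5):
        self.num_input_features = num_input_features

reader = READERS.build(dict(type="VoxelFeatureExtractorV3", num_input_features=5))
```

Importing the module that defines a class is what triggers registration, which is why the config only needs the class name as a string.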

@yuedi-hhh

@TianhaoFu Could you help resolve my questions?
