Import only import one source #1214

Closed
AbdelsalamHaa opened this issue Dec 1, 2023 · 3 comments

@AbdelsalamHaa

System:
Windows 11 with WSL
Python 3.8

## Issue
Importing the dataset imports only one source of the project, even when the project has several sources.

Should be

The import function should import the full project, not only one source.

Details
Here is my tree/config.yml file:
build_targets:
  project:
    parents: []
    stages:
    - hash: ''
      kind: ''
      name: root
      params: {}
      type: project
  task_113:
    parents: []
    stages:
    - hash: ''
      kind: ''
      name: root
      params: {}
      type: source
    - hash: ''
      kind: rename
      name: stage-1
      params:
        regex: '|frame_|task_113_'
      type: transform
    - hash: ''
      kind: random_split
      name: stage-2
      params:
        seed: null
        splits:
        - - train
          - 0.7
        - - test
          - 0.3
      type: transform
  task_117:
    parents: []
    stages:
    - hash: ''
      kind: ''
      name: root
      params: {}
      type: source
    - hash: ''
      kind: rename
      name: stage-1
      params:
        regex: '|frame_|task_117_'
      type: transform
    - hash: ''
      kind: random_split
      name: stage-2
      params:
        seed: null
        splits:
        - - train
          - 0.7
        - - test
          - 0.3
      type: transform
format_version: 2
sources:
  task_113:
    format: yolo
    hash: ''
    options: {}
    path: ''
    url: /home/iradar/aiv_pipeline/datasets/tuna/task_113
  task_117:
    format: yolo
    hash: ''
    options: {}
    path: ''
    url: /home/iradar/aiv_pipeline/datasets/tuna/task_117

However, only task_117 is imported.
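
One way to double-check how many sources the project declares is to read the `sources` section of the config. The following is a minimal sketch, assuming the config has already been parsed into a dict (for example with `yaml.safe_load` from PyYAML). `declared_sources` is a hypothetical helper for illustration and is not part of Datumaro.

```python
from typing import Dict


def declared_sources(config: dict) -> Dict[str, str]:
    """Map each source name in a parsed project config to its url.

    Hypothetical helper for illustration; not part of Datumaro.
    """
    sources = config.get("sources") or {}
    return {name: spec.get("url", "") for name, spec in sources.items()}


# Trimmed-down version of the project config, written as a parsed dict.
config = {
    "format_version": 2,
    "sources": {
        "task_113": {
            "format": "yolo",
            "url": "/home/iradar/aiv_pipeline/datasets/tuna/task_113",
        },
        "task_117": {
            "format": "yolo",
            "url": "/home/iradar/aiv_pipeline/datasets/tuna/task_117",
        },
    },
}

for name, url in declared_sources(config).items():
    print(name, url)
```

If this lists more sources than end up in the imported dataset, the import has dropped some of them.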

My code to import:

from datumaro import Dataset

dataset_path = '/home/iradar/aiv_pipeline/datasets/tuna'
dataset = Dataset.import_from(dataset_path, 'yolo')
@AbdelsalamHaa
Author

@vinnamkim
I would just like to add that with version 1.0.0 the import function worked well and everything seemed fine.

@vinnamkim vinnamkim added the BUG Something isn't working label Dec 4, 2023
@vinnamkim
Contributor

Hi @AbdelsalamHaa,
Sorry for the late response. It is taking some time to find the root cause. Until we fix it, you can merge your datasets manually in code. For example:

from datumaro import Dataset, HLOps

# Import each source directory as its own dataset.
dataset_1 = Dataset.import_from("/home/iradar/aiv_pipeline/datasets/tuna/task_113", 'yolo')
dataset_2 = Dataset.import_from("/home/iradar/aiv_pipeline/datasets/tuna/task_117", 'yolo')
# Merge them; HLOps.merge takes the datasets as separate positional arguments.
dataset = HLOps.merge(dataset_1, dataset_2)
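
For a project with more than two sources, the same workaround could be generalized along these lines. This is only a sketch under assumptions: it treats every non-hidden sub-directory of the project root as one source in the same format, and `find_source_dirs` and `import_and_merge` are hypothetical helper names, not Datumaro APIs. Only `Dataset.import_from` and `HLOps.merge` come from Datumaro.

```python
from pathlib import Path
from typing import List


def find_source_dirs(project_root: str) -> List[Path]:
    """Return the immediate sub-directories of the project root, sorted by name.

    Hidden directories (names starting with a dot) are skipped.
    Hypothetical helper for illustration.
    """
    root = Path(project_root)
    return sorted(
        p for p in root.iterdir() if p.is_dir() and not p.name.startswith(".")
    )


def import_and_merge(project_root: str, fmt: str):
    """Import every sub-directory as its own dataset and merge the results."""
    # Imported lazily so find_source_dirs() can be used without Datumaro installed.
    from datumaro import Dataset, HLOps

    datasets = [
        Dataset.import_from(str(source_dir), fmt)
        for source_dir in find_source_dirs(project_root)
    ]
    return HLOps.merge(*datasets)
```

With the layout from this issue, `import_and_merge("/home/iradar/aiv_pipeline/datasets/tuna", "yolo")` would import `task_113` and `task_117` separately and merge them.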

Thank you.

vinnamkim added a commit that referenced this issue Jan 16, 2024
…the project (#1243)

### Summary

- Ticket no. 127589
- Fix a bug found in #1214
- In the previous Datumaro version, we could import the nested datasets
in the given path. For example, if we call
`Dataset.import_from("./some_project")` for the following directory
structure
``` console
./some_project/
├── dataset_1
└── dataset_2
```
, the imported dataset includes both `dataset_1` and `dataset_2`.
- This is a common pattern for Datumaro projects, so we have to
fix it.

### How to test
Added an integration test for this scenario.

### Checklist
<!-- Put an 'x' in all the boxes that apply -->
- [ ] I have added unit tests to cover my changes.
- [x] I have added integration tests to cover my changes.
- [x] I have added the description of my changes into
[CHANGELOG](https://github.com/openvinotoolkit/datumaro/blob/develop/CHANGELOG.md).
- [ ] I have updated the
[documentation](https://github.com/openvinotoolkit/datumaro/tree/develop/docs)
accordingly

### License

- [x] I submit _my code changes_ under the same [MIT
License](https://github.com/openvinotoolkit/datumaro/blob/develop/LICENSE)
that covers the project.
  Feel free to contact the maintainers if that's a concern.
- [x] I have updated the license header for each file (see an example
below).

```python
# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT
```

---------

Signed-off-by: Kim, Vinnam <[email protected]>
@wonjuleee
Contributor

Thank you @vinnamkim for the kind guidance on Datumaro. I will close this issue.
