Fix files not found with find_files function (#1255)

### Summary

Unexpected behavior occurs when a dataset class calls `find_files()` in
`os_util.py` with `recursive=True`. The `walk()` function that
`find_files()` calls does not follow symbolic links, so images or labels
inside a symlinked directory are not found and an exception is thrown.

This issue affects the `VideoFrameImporter`, `AvaImporter`,
`RoboflowYoloBase`, `YoloLooseBase`, etc.
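
The underlying behavior can be reproduced with the standard library alone. The following is a minimal sketch, not Datumaro code: `list_files` and `demo` are hypothetical helpers, and the temporary layout only loosely mirrors the dataset described in this PR. It assumes a platform where `os.symlink` is permitted.

```python
import os
import os.path as osp
import tempfile


def list_files(root, followlinks):
    """Return sorted file paths under root, relative to root (hypothetical helper)."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root, topdown=True, followlinks=followlinks):
        for name in filenames:
            found.append(osp.relpath(osp.join(dirpath, name), root))
    return sorted(found)


def demo():
    """Build a tiny dataset whose images/ directory is a symlink and walk it both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(osp.join(tmp, "images", "train"))
        os.makedirs(osp.join(tmp, "dataset", "labels", "train"))
        open(osp.join(tmp, "images", "train", "a.jpg"), "w").close()
        open(osp.join(tmp, "dataset", "labels", "train", "a.txt"), "w").close()
        # dataset/images -> ../images, a relative link as in the layout from this PR
        os.symlink(osp.join("..", "images"), osp.join(tmp, "dataset", "images"))

        root = osp.join(tmp, "dataset")
        return list_files(root, followlinks=False), list_files(root, followlinks=True)


if __name__ == "__main__":
    without_links, with_links = demo()
    print("followlinks=False:", without_links)
    print("followlinks=True: ", with_links)
```

With the default `followlinks=False`, `os.walk` still lists the symlinked directory in `dirnames` but does not descend into it, which is why only the label file is reported in the first result.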


### How to test

Use a dataset with a structure like this:

```bash
images/
    test/
    train/
yolo-ultralytics/
    images -> ../images
    labels/
        test/
        train/
```

Then load the dataset:

```python
import datumaro as dm
ds = dm.Dataset.import_from("yolo-ultralytics", "yolo")
print(ds)
```

Without this fix, the following exception is raised:

```python
datumaro.components.errors.ItemImportError: Failed to import item ...
```


### Checklist
- [ ] I have added unit tests to cover my changes.
- [ ] I have added integration tests to cover my changes.
- [ ] I have added the description of my changes into
[CHANGELOG](https://github.com/openvinotoolkit/datumaro/blob/develop/CHANGELOG.md).
- [ ] I have updated the
[documentation](https://github.com/openvinotoolkit/datumaro/tree/develop/docs)
accordingly.

### License

- [x] I submit _my code changes_ under the same [MIT
License](https://github.com/openvinotoolkit/datumaro/blob/develop/LICENSE)
that covers the project.
  Feel free to contact the maintainers if that's a concern.
- [x] I have updated the license header for each file (see an example
below).

```python
# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT
```
imyhxy authored Feb 2, 2024
1 parent 76769b5 commit 5a06a9d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/datumaro/util/os_util.py
@@ -67,7 +67,7 @@ def walk(path, max_depth: Optional[int] = None, min_depth: Optional[int] = None)
         min_depth = DEFAULT_MIN_DEPTH
 
     baselevel = path.count(osp.sep)
-    for dirpath, dirnames, filenames in os.walk(path, topdown=True):
+    for dirpath, dirnames, filenames in os.walk(path, topdown=True, followlinks=True):
         curlevel = dirpath.count(osp.sep)
         if baselevel + min_depth > curlevel:
             continue
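
The changed loop measures depth by counting path separators in `dirpath`. Because `os.walk` builds `dirpath` by joining the link name onto its parent rather than resolving the link target, that depth check keeps working for symlinked directories once `followlinks=True` is set. The sketch below illustrates this under stated assumptions: `walk_min_depth` is a hypothetical, simplified stand-in that reproduces only the lines visible in the hunk (the real `walk()` also handles `max_depth` and default values not shown here), and it needs a platform where `os.symlink` is permitted.

```python
import os
import os.path as osp
import tempfile


def walk_min_depth(path, min_depth=0):
    """Simplified stand-in for the patched loop; only the min_depth check is modeled."""
    baselevel = path.count(osp.sep)
    for dirpath, dirnames, filenames in os.walk(path, topdown=True, followlinks=True):
        # dirpath uses the link's own path, so the separator count reflects
        # where the link sits in the tree, not where its target lives.
        curlevel = dirpath.count(osp.sep)
        if baselevel + min_depth > curlevel:
            continue
        yield dirpath, dirnames, filenames


def demo_depth():
    """Walk a directory containing a symlink and report directories at depth >= 1."""
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(osp.join(tmp, "images", "train"))
        os.makedirs(osp.join(tmp, "dataset"))
        os.symlink(osp.join("..", "images"), osp.join(tmp, "dataset", "images"))

        root = osp.join(tmp, "dataset")
        return sorted(
            osp.relpath(dirpath, root)
            for dirpath, _dirnames, _filenames in walk_min_depth(root, min_depth=1)
        )


if __name__ == "__main__":
    print(demo_depth())
```

One caveat from the Python documentation applies to any use of `followlinks=True`: `os.walk` does not track visited directories, so a link that points to one of its own parent directories can cause infinite recursion.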