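Hello, training on a custom dataset fails with this error: ValueError: Empty training data. Every step before training works, and I previously got training and deployment working with the original VOC dataset. This is a small 4-class project with a dataset built in VOC format; the earlier steps all work, but it fails as soon as training starts #26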

MintonLee · 2020-08-13T08:36:31Z

(tf115) F:\1_work\K210\k210-for-yolo\yolo-for-trash_detection-k210>make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10 ILR=0.001 DATASET=voc CLSNUM=4 IAA=False BATCH=8
python ./keras_train.py
--train_set voc
--class_num 4
--pre_ckpt ""
--model_def yolo_mobilev1
--depth_multiplier 0.75
--augmenter False
--image_size 224 320
--output_size 7 10 14 20
--batch_size 8
--rand_seed 3
--max_nrof_epochs 10
--init_learning_rate 0.001
--learning_rate_decay_factor 0
--obj_weight 1
--noobj_weight 1
--wh_weight 1
--obj_thresh 0.7
--iou_thresh 0.5
--vaildation_split 0.05
--log_dir log
--is_prune False
--prune_initial_sparsity 0.5
--prune_final_sparsity 0.9
--prune_end_epoch 5
--prune_frequency 100
2020-08-13 16:31:51.021329: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:

https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
https://github.com/tensorflow/addons
https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

2020-08-13 16:31:54.680652: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-08-13 16:31:54.693250: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-08-13 16:31:54.730848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.2
pciBusID: 0000:01:00.0
2020-08-13 16:31:54.736948: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-08-13 16:31:54.746169: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-08-13 16:31:54.753784: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-08-13 16:31:54.760818: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-08-13 16:31:54.767979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-08-13 16:31:54.774382: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-08-13 16:31:54.787883: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:31:54.792935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-08-13 16:31:55.371745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-13 16:31:55.376201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-08-13 16:31:55.379023: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-08-13 16:31:55.382793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4608 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
[ INFO ] data augment is False
WARNING:tensorflow:From F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\data\util\random_seed.py:58: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
2020-08-13 16:31:55.534760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.2
pciBusID: 0000:01:00.0
2020-08-13 16:31:55.541031: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-08-13 16:31:55.544708: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-08-13 16:31:55.547742: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-08-13 16:31:55.551895: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-08-13 16:31:55.555008: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-08-13 16:31:55.559177: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-08-13 16:31:55.562276: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:31:55.565958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-08-13 16:31:55.569220: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.2
pciBusID: 0000:01:00.0
2020-08-13 16:31:55.574059: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-08-13 16:31:55.578232: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-08-13 16:31:55.581370: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-08-13 16:31:55.585411: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-08-13 16:31:55.588457: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-08-13 16:31:55.592139: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-08-13 16:31:55.595837: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:31:55.599397: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-08-13 16:31:55.602502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-13 16:31:55.606356: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-08-13 16:31:55.608364: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-08-13 16:31:55.610806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4608 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
[ INFO ] data augment is False
WARNING:tensorflow:From F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1630: calling BaseResourceVariable.init (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Train on 21 steps
Epoch 1/10
2020-08-13 16:32:22.746351: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:32:24.054899: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2020-08-13 16:32:24.128920: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
1/21 [>.............................] - ETA: 2:31 - loss: 792.4858 - l1_loss: 131.6574 - l2_loss: 660.4179 - l1_p: 0.0000e+00 - l1_r: 0.0000e+00 - l2_p: 5.8514e-04 - l2_r: 0.2000
2020-08-13 16:32:26.777512: I tensorflow/core/profiler/lib/profiler_session.cc:205] Profiler session started.
2020-08-13 16:32:26.787342: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cupti64_100.dll
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.280863). Check your callbacks.
2/21 [=>............................] - ETA: 1:17 - loss: 817.4302 - l1_loss: 172.4517 - l2_loss: 644.5675 - l1_p: 0.0012 - l1_r: 0.2500 - l2_p: 0.0025 - l2_r: 0.5000
2020-08-13 16:32:27.322285: I tensorflow/core/platform/default/device_tracer.cc:588] Collecting 3504 kernel records, 316 memcpy records.
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.444144). Check your callbacks.
3/21 [===>..........................] - ETA: 52s - loss: 745.5250 - l1_loss: 159.0245 - l2_loss: 586.0892 - l1_p: 8.5985e-04 - l1_r: 0.1429 - l2_p: 0.0023 - l2_r: 0.3636
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.280863). Check your callbacks.
4/21 [====>.........................] - ETA: 37s - loss: 695.2993 - l1_loss: 148.6290 - l2_loss: 546.2584 - l1_p: 0.0015 - l1_r: 0.1538 - l2_p: 0.0022 - l2_r: 0.3200
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.117583). Check your callbacks.
20/21 [===========================>..] - ETA: 0s - loss: 364.4713 - l1_loss: 68.8265 - l2_loss: 295.2221 - l1_p: 0.0012 - l1_r: 0.0385 - l2_p: 0.0020 - l2_r: 0.0563
Traceback (most recent call last):
File "./keras_train.py", line 155, in <module>
args.prune_frequency)
File "./keras_train.py", line 99, in main
validation_data=h.test_dataset, validation_steps=int(h.test_epoch_step * h.validation_split))
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 727, in fit
use_multiprocessing=use_multiprocessing)
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 675, in fit
steps_name='steps_per_epoch')
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 440, in model_iteration
steps_name='validation_steps')
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 411, in model_iteration
aggregator.finalize()
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py", line 138, in finalize
raise ValueError('Empty training data.')
ValueError: Empty training data.
Makefile:35: recipe for target 'train' failed
make: *** [train] Error 1

MintonLee · 2020-08-13T08:37:52Z

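What is causing this? Does the dataset size need to change? Is it a path problem, or a batch-size problem? I have already reduced the batch size to 1 and still get the same error.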

xji-apex · 2020-09-14T03:43:38Z

I'm hitting the same problem. Could anyone point me in the right direction?

zhen8838 · 2020-09-14T07:47:14Z

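With a custom dataset it is easy to end up with too little data. In addition, my earlier code builds the test set by taking a portion of the training set directly, so there may not be enough data to fill the buffers in the tf.data pipeline, which triggers this error. Try reducing the buffer sizes of the shuffle, map, and similar operations where the input data pipeline is defined.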
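The suggestion above can be sketched as a small helper that never asks for a shuffle buffer larger than the dataset. This is only an illustration: `capped_buffer_size` and the numbers used are hypothetical and do not come from the repository.

```python
def capped_buffer_size(requested: int, num_samples: int) -> int:
    """Return a shuffle buffer size that is at most the number of samples.

    Hypothetical helper, not part of the repository.
    """
    if num_samples <= 0:
        raise ValueError("the dataset has no samples")
    # Never request more elements than exist, and never fewer than one.
    return max(1, min(requested, num_samples))


# Hypothetical numbers: a pipeline asking for 4000 elements with only 168 images.
print(capped_buffer_size(4000, 168))  # 168
```

The returned value would then be passed as the first argument of `shuffle(...)` in the input pipeline.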

uguisu · 2020-11-30T06:22:58Z

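I ran into the same problem. The program automatically splits the dataset into a training set and a test set, and the batch_size and vaildation_split parameters directly determine how large the two sets are. With the default settings, batch_size = 32 and vaildation_split = 0.05, the test set ends up too small and training fails partway through.

I solved it by adjusting the parameters, for example batch_size = 16 and vaildation_split = 0.1.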
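To see why these two parameters matter, here is a rough sketch of the arithmetic. The `int(test_epoch_step * validation_split)` expression is the one visible in the traceback (keras_train.py, line 99). Everything else is an assumption for illustration: that the steps per epoch are the image count floor-divided by the batch size, and that there are 200 images.

```python
def steps_per_epoch(num_images: int, batch_size: int) -> int:
    # Assumption: one step per full batch (not confirmed from the repository).
    return num_images // batch_size


def validation_steps(test_epoch_step: int, validation_split: float) -> int:
    # Same expression as the validation_steps argument shown in the traceback.
    return int(test_epoch_step * validation_split)


num_images = 200  # hypothetical dataset size

# Default-like settings: 200 // 32 = 6 steps, int(6 * 0.05) = 0 validation steps.
print(validation_steps(steps_per_epoch(num_images, 32), 0.05))  # 0

# Adjusted settings: 200 // 16 = 12 steps, int(12 * 0.1) = 1 validation step.
print(validation_steps(steps_per_epoch(num_images, 16), 0.1))  # 1
```

If the result is 0, Keras runs no validation batches at the end of the epoch, which would be consistent with the error being raised only after step 20/21.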

noobgrow · 2021-04-19T13:28:17Z

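In tools/utils.py, change the shuffle argument in the _create_dataset function as shown below, and in the train_fit function set the validation_step parameter to a fixed value. I have tested this and it fixes the problem! (Just set the buffer size to the size of the dataset!)

# tools/utils.py, inside _create_dataset (the rest of the function is unchanged)

# Shuffle buffer = number of samples in the split, instead of a fixed size.
dataset = (tf.data.Dataset.from_generator(gen, (tf.framework_ops.dtypes.string, tf.float32), ([], [None, 5])).
           shuffle(self.train_total_data if is_training else self.test_total_data, rand_seed).repeat().
           map(_parser_wrapper, tf.data.experimental.AUTOTUNE).
           batch(batch_size, True).prefetch(tf.data.experimental.AUTOTUNE))

return dataset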
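As an alternative to hard-coding the validation step count, the computed value could be clamped so it never drops to zero. This is a minimal sketch, not the repository's code: `safe_validation_steps` and its `minimum` parameter are hypothetical names.

```python
def safe_validation_steps(test_epoch_step: int,
                          validation_split: float,
                          minimum: int = 1) -> int:
    """Compute validation steps but never return fewer than `minimum`.

    Hypothetical helper; the repository computes
    int(test_epoch_step * validation_split) without a lower bound.
    """
    return max(minimum, int(test_epoch_step * validation_split))


print(safe_validation_steps(6, 0.05))   # 1 (would otherwise be 0)
print(safe_validation_steps(50, 0.05))  # 2 (unchanged for larger datasets)
```

The result would be passed as `validation_steps` in the `fit` call, so small datasets still run at least one validation batch.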

noobgrow mentioned this issue Apr 22, 2021

ValueError: Empty training data. vseasky/yolo-for-k210#1

Open
