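Hello, I hit this error when training on a custom dataset: ValueError: Empty training data. Every step before training works, and I had already trained and deployed with the original VOC dataset. I am building a small 4-class project with a dataset made in VOC format; the earlier steps all work, but as soon as I get to training... #26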
Comments
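What is causing this? Does the dataset size need to change? Is it a path problem, or a batch problem? I already changed the batch size to 1 and still get the same error.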
Same problem here. Can anyone give some pointers?
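With a custom dataset it is easy to end up with too little data. Also, my earlier code builds the test set by taking a portion of the training set directly, so with a small dataset it may fall short of what is needed.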
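I ran into the same problem. The program automatically splits the dataset into a training set and a test set. The parameter here [...] I solved the problem by adjusting that parameter, for example by changing the setting to [...]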
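A minimal sketch of why the split setting matters. The traceback in this issue shows `validation_steps=int(h.test_epoch_step * h.validation_split)`, and the command line passes `--vaildation_split 0.05`. Because `int()` truncates, a small `test_epoch_step` makes that product round down to 0 validation steps. The `test_epoch_step` values below are illustrative only (the real value is not printed in the log), and `safe_validation_steps` is a hypothetical helper, not part of the repository.

```python
def naive_validation_steps(test_epoch_step, validation_split):
    """Same expression as in the traceback: int() truncates toward zero."""
    return int(test_epoch_step * validation_split)


def safe_validation_steps(test_epoch_step, validation_split):
    """Hypothetical guard: never ask Keras for fewer than one validation step."""
    return max(1, naive_validation_steps(test_epoch_step, validation_split))


if __name__ == "__main__":
    split = 0.05  # value passed on the command line in this issue
    for test_epoch_step in (10, 19, 100):  # illustrative values only
        print(
            test_epoch_step,
            naive_validation_steps(test_epoch_step, split),
            safe_validation_steps(test_epoch_step, split),
        )
```

Clamping to one step only helps if the validation dataset can actually supply at least one batch; raising the split value, as the comment above describes, is the other way to keep the step count above zero.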
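In tools/utils.py, change the shuffle parameter in the _create_dataset function as shown below, and in the train_fit function set the validation_step parameter to a fixed value. I tested this and it fixes the problem!! (Just change the buffer size to the size of the dataset!!) def _create_dataset() dataset = (tf.data.Dataset.from_generator(gen, (tf.framework_ops.dtypes.string, tf.float32), ([], [None, 5])).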
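The snippet quoted in the comment above is cut off, so here is a rough sketch of what "set the shuffle buffer to the dataset size" could look like. It is not the repository's `_create_dataset`: the function names, the `annotation_list` argument, and its `(image_path, boxes)` layout are assumptions for illustration, and it uses `tf.string` rather than the `tf.framework_ops.dtypes.string` spelling quoted in the comment. TensorFlow is imported lazily so the small helper can be checked without it.

```python
def shuffle_buffer_size(num_samples):
    """Buffer size equal to the dataset size, but at least 1 (shuffle rejects 0)."""
    return max(1, num_samples)


def create_dataset_sketch(annotation_list):
    """Build a shuffled tf.data pipeline from (image_path, boxes) pairs.

    annotation_list is a hypothetical list of (str, array-like of shape [n, 5]).
    """
    import tensorflow as tf  # lazy import: only needed when the pipeline is built

    def gen():
        for image_path, boxes in annotation_list:
            yield image_path, boxes

    dataset = tf.data.Dataset.from_generator(
        gen,
        (tf.string, tf.float32),  # output types: path, box array
        ([], [None, 5]),          # output shapes: scalar path, variable number of boxes
    )
    # Shuffle over the whole dataset instead of a fixed, possibly larger buffer.
    return dataset.shuffle(buffer_size=shuffle_buffer_size(len(annotation_list)))
```

The second half of the suggestion, a fixed validation step count, amounts to passing a constant such as `validation_steps=1` to `fit` instead of the computed expression, provided the validation set yields at least that many batches.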
(tf115) F:\1_work\K210\k210-for-yolo\yolo-for-trash_detection-k210>make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10 ILR=0.001 DATASET=voc CLSNUM=4 IAA=False BATCH=8
python ./keras_train.py
--train_set voc
--class_num 4
--pre_ckpt ""
--model_def yolo_mobilev1
--depth_multiplier 0.75
--augmenter False
--image_size 224 320
--output_size 7 10 14 20
--batch_size 8
--rand_seed 3
--max_nrof_epochs 10
--init_learning_rate 0.001
--learning_rate_decay_factor 0
--obj_weight 1
--noobj_weight 1
--wh_weight 1
--obj_thresh 0.7
--iou_thresh 0.5
--vaildation_split 0.05
--log_dir log
--is_prune False
--prune_initial_sparsity 0.5
--prune_final_sparsity 0.9
--prune_end_epoch 5
--prune_frequency 100
2020-08-13 16:31:51.021329: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
If you depend on functionality not listed there, please file an issue.
2020-08-13 16:31:54.680652: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-08-13 16:31:54.693250: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-08-13 16:31:54.730848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.2
pciBusID: 0000:01:00.0
2020-08-13 16:31:54.736948: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-08-13 16:31:54.746169: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-08-13 16:31:54.753784: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-08-13 16:31:54.760818: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-08-13 16:31:54.767979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-08-13 16:31:54.774382: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-08-13 16:31:54.787883: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:31:54.792935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-08-13 16:31:55.371745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-13 16:31:55.376201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-08-13 16:31:55.379023: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-08-13 16:31:55.382793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4608 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
[ INFO ] data augment is False
WARNING:tensorflow:From F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\data\util\random_seed.py:58: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
2020-08-13 16:31:55.534760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.2
pciBusID: 0000:01:00.0
2020-08-13 16:31:55.541031: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-08-13 16:31:55.544708: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-08-13 16:31:55.547742: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-08-13 16:31:55.551895: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-08-13 16:31:55.555008: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-08-13 16:31:55.559177: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-08-13 16:31:55.562276: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:31:55.565958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-08-13 16:31:55.569220: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.2
pciBusID: 0000:01:00.0
2020-08-13 16:31:55.574059: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-08-13 16:31:55.578232: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-08-13 16:31:55.581370: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-08-13 16:31:55.585411: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-08-13 16:31:55.588457: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-08-13 16:31:55.592139: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-08-13 16:31:55.595837: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:31:55.599397: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-08-13 16:31:55.602502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-13 16:31:55.606356: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-08-13 16:31:55.608364: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-08-13 16:31:55.610806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4608 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
[ INFO ] data augment is False
WARNING:tensorflow:From F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1630: calling BaseResourceVariable.init (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Train on 21 steps
Epoch 1/10
2020-08-13 16:32:22.746351: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:32:24.054899: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2020-08-13 16:32:24.128920: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
1/21 [>.............................] - ETA: 2:31 - loss: 792.4858 - l1_loss: 131.6574 - l2_loss: 660.4179 - l1_p: 0.0000e+00 - l1_r: 0.0000e+00 - l2_p: 5.8514e-04 - l2_r: 0.2000
2020-08-13 16:32:26.777512: I tensorflow/core/profiler/lib/profiler_session.cc:205] Profiler session started.
2020-08-13 16:32:26.787342: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cupti64_100.dll
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.280863). Check your callbacks.
2/21 [=>............................] - ETA: 1:17 - loss: 817.4302 - l1_loss: 172.4517 - l2_loss: 644.5675 - l1_p: 0.0012 - l1_r: 0.2500 - l2_p: 0.0025 - l2_r: 0.5000
2020-08-13 16:32:27.322285: I tensorflow/core/platform/default/device_tracer.cc:588] Collecting 3504 kernel records, 316 memcpy records.
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.444144). Check your callbacks.
3/21 [===>..........................] - ETA: 52s - loss: 745.5250 - l1_loss: 159.0245 - l2_loss: 586.0892 - l1_p: 8.5985e-04 - l1_r: 0.1429 - l2_p: 0.0023 - l2_r: 0.3636
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.280863). Check your callbacks.
4/21 [====>.........................] - ETA: 37s - loss: 695.2993 - l1_loss: 148.6290 - l2_loss: 546.2584 - l1_p: 0.0015 - l1_r: 0.1538 - l2_p: 0.0022 - l2_r: 0.3200
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.117583). Check your callbacks.
20/21 [===========================>..] - ETA: 0s - loss: 364.4713 - l1_loss: 68.8265 - l2_loss: 295.2221 - l1_p: 0.0012 - l1_r: 0.0385 - l2_p: 0.0020 - l2_r: 0.0563
Traceback (most recent call last):
File "./keras_train.py", line 155, in <module>
args.prune_frequency)
File "./keras_train.py", line 99, in main
validation_data=h.test_dataset, validation_steps=int(h.test_epoch_step * h.validation_split))
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 727, in fit
use_multiprocessing=use_multiprocessing)
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 675, in fit
steps_name='steps_per_epoch')
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 440, in model_iteration
steps_name='validation_steps')
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 411, in model_iteration
aggregator.finalize()
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py", line 138, in finalize
raise ValueError('Empty training data.')
ValueError: Empty training data.
Makefile:35: recipe for target 'train' failed
make: *** [train] Error 1
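For readers following the traceback: the error is raised from the validation pass (`steps_name='validation_steps'`), after training itself reached step 20/21. The toy loop below is a simplified imitation of that behavior, not Keras source code. It shows two ways a validation pass can finish with nothing collected: zero requested steps, or a dataset that is exhausted before the first batch. The function name and the list-of-batches input are assumptions for illustration.

```python
def run_validation_loop(batches, steps):
    """Toy stand-in for a step-based validation pass.

    batches: any iterable of batches (here, plain lists).
    steps: how many batches the caller asks to evaluate.
    """
    results = []
    iterator = iter(batches)
    for _ in range(steps):
        try:
            batch = next(iterator)
        except StopIteration:
            break  # dataset ran out before the requested number of steps
        results.append(len(batch))  # placeholder for a per-batch metric
    if not results:
        # Mirrors the message seen in the traceback when nothing was aggregated.
        raise ValueError("Empty training data.")
    return results


if __name__ == "__main__":
    print(run_validation_loop([[1, 2], [3]], steps=2))
    for batches, steps in (([[1, 2]], 0), ([], 1)):
        try:
            run_validation_loop(batches, steps)
        except ValueError as err:
            print("steps=%d, batches=%d: %s" % (steps, len(batches), err))
```

Checking both quantities before calling `fit` (the computed validation step count and the number of batches the validation set can produce) narrows down which of the two cases applies to a given dataset.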