upgrade to numpy 2.0 and remove imgaug #13937

GreatV · 2024-10-02T03:20:51Z

This pull request includes significant changes to the augmentation pipeline, dependencies, and testing framework. The main updates involve replacing imgaug with albumentations, introducing a custom resize transformation, and adding comprehensive tests for the new augmentation behavior.

Augmentation Pipeline Updates:

Replaced imgaug with albumentations for image augmentation, and disabled the automatic update check in Albumentations (ppocr/data/imaug/iaa_augment.py).
Introduced ImgaugLikeResize, a custom resize transformation to mimic imgaug behavior with scaling (ppocr/data/imaug/iaa_augment.py).
Updated AugmenterBuilder to map common imgaug transformations to albumentations equivalents and handle custom augmenter arguments (ppocr/data/imaug/iaa_augment.py).
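
The mapping described above can be pictured with a small sketch. This is not the code from ppocr/data/imaug/iaa_augment.py: the names `IMGAUG_TO_ALBU`, `translate`, and `scaled_size` are hypothetical, the target names are assumptions about what such a mapping might contain, and only the standard library is used so the snippet runs without albumentations installed. It takes an imgaug-style `augmenter_args` list (the same shape as the `IaaAugment` entry in the training log of this PR) and illustrates two ideas: renaming transforms, and resizing by a sampled scale factor instead of to a fixed size.

```python
import random

# Hypothetical name mapping. HorizontalFlip and Affine are albumentations
# transform names; ImgaugLikeResize is the custom transform named in this PR.
IMGAUG_TO_ALBU = {
    "Fliplr": "HorizontalFlip",
    "Affine": "Affine",
    "Resize": "ImgaugLikeResize",
}


def translate(augmenter_args):
    """Turn imgaug-style entries into (target_name, kwargs) pairs."""
    translated = []
    for entry in augmenter_args:
        name = IMGAUG_TO_ALBU[entry["type"]]
        # Configs express ranges as lists; convert them to tuples.
        kwargs = {
            key: tuple(value) if isinstance(value, list) else value
            for key, value in entry.get("args", {}).items()
        }
        translated.append((name, kwargs))
    return translated


def scaled_size(height, width, scale_range, rng=random):
    """Sample one scale factor and apply it to both sides.

    Using a single factor for height and width is an assumption about
    the intended imgaug-like behaviour, not a statement about the PR.
    """
    scale = rng.uniform(*scale_range)
    return int(round(height * scale)), int(round(width * scale))


config = [
    {"type": "Fliplr", "args": {"p": 0.5}},
    {"type": "Affine", "args": {"rotate": [-10, 10]}},
    {"type": "Resize", "args": {"size": [0.5, 3]}},
]
pipeline = translate(config)
new_h, new_w = scaled_size(100, 200, (0.5, 3), random.Random(0))
```

Keeping the name translation separate from the size arithmetic makes each piece easy to test on its own, which is roughly what a builder plus a custom transform gives you.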

Dependency Updates:

Removed imgaug and added albumentations and albucore to the dependencies to support the new augmentation library (pyproject.toml).

Testing Enhancements:

Added a new test file, tests/test_iaa_augment.py, covering the new augmentation pipeline. It includes fixtures for sample images and polygons and test cases for several augmentation scenarios.
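
As a rough idea of what tests of this kind check, here is a standard-library-only sketch. It does not reproduce tests/test_iaa_augment.py: `sample_polygon`, `hflip_polygon`, and the `width - x` mirror convention are assumptions made for illustration (libraries differ on whether the mirror is `width - x` or `width - 1 - x`). The checks are property-style: a flip keeps the number of points and the y coordinates, and flipping twice returns the original polygon.

```python
def sample_polygon():
    """Stand-in for a fixture: one quadrilateral as (x, y) points."""
    return [(10.0, 20.0), (60.0, 20.0), (60.0, 40.0), (10.0, 40.0)]


def hflip_polygon(polygon, width):
    """Mirror a polygon horizontally inside an image of the given width."""
    return [(width - x, y) for x, y in polygon]


def test_flip_keeps_shape_and_rows():
    poly = sample_polygon()
    flipped = hflip_polygon(poly, width=100)
    # Same number of points, and y coordinates are untouched.
    assert len(flipped) == len(poly)
    assert [y for _, y in flipped] == [y for _, y in poly]


def test_double_flip_is_identity():
    poly = sample_polygon()
    assert hflip_polygon(hflip_polygon(poly, 100), 100) == poly


# Run directly so the sketch works without a test runner.
test_flip_keeps_shape_and_rows()
test_double_flip_is_identity()
```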

Training Log:

training log for ppocrv3

[2024/11/05 03:44:40] ppocr INFO: Architecture : 
[2024/11/05 03:44:40] ppocr INFO:     Backbone : 
[2024/11/05 03:44:40] ppocr INFO:         disable_se : True
[2024/11/05 03:44:40] ppocr INFO:         model_name : large
[2024/11/05 03:44:40] ppocr INFO:         name : MobileNetV3
[2024/11/05 03:44:40] ppocr INFO:         scale : 0.5
[2024/11/05 03:44:40] ppocr INFO:     Head : 
[2024/11/05 03:44:40] ppocr INFO:         k : 50
[2024/11/05 03:44:40] ppocr INFO:         name : DBHead
[2024/11/05 03:44:40] ppocr INFO:     Neck : 
[2024/11/05 03:44:40] ppocr INFO:         name : RSEFPN
[2024/11/05 03:44:40] ppocr INFO:         out_channels : 96
[2024/11/05 03:44:40] ppocr INFO:         shortcut : True
[2024/11/05 03:44:40] ppocr INFO:     Transform : None
[2024/11/05 03:44:40] ppocr INFO:     algorithm : DB
[2024/11/05 03:44:40] ppocr INFO:     model_type : det
[2024/11/05 03:44:40] ppocr INFO: Eval : 
[2024/11/05 03:44:40] ppocr INFO:     dataset : 
[2024/11/05 03:44:40] ppocr INFO:         data_dir : train_data/2024092502/det
[2024/11/05 03:44:40] ppocr INFO:         label_file_list : ['train_data/2024092502/det/val.txt']
[2024/11/05 03:44:40] ppocr INFO:         name : SimpleDataSet
[2024/11/05 03:44:40] ppocr INFO:         transforms : 
[2024/11/05 03:44:40] ppocr INFO:             DecodeImage : 
[2024/11/05 03:44:40] ppocr INFO:                 channel_first : False
[2024/11/05 03:44:40] ppocr INFO:                 img_mode : BGR
[2024/11/05 03:44:40] ppocr INFO:             DetLabelEncode : None
[2024/11/05 03:44:40] ppocr INFO:             DetResizeForTest : None
[2024/11/05 03:44:40] ppocr INFO:             NormalizeImage : 
[2024/11/05 03:44:40] ppocr INFO:                 mean : [0.485, 0.456, 0.406]
[2024/11/05 03:44:40] ppocr INFO:                 order : hwc
[2024/11/05 03:44:40] ppocr INFO:                 scale : 1./255.
[2024/11/05 03:44:40] ppocr INFO:                 std : [0.229, 0.224, 0.225]
[2024/11/05 03:44:40] ppocr INFO:             ToCHWImage : None
[2024/11/05 03:44:40] ppocr INFO:             KeepKeys : 
[2024/11/05 03:44:40] ppocr INFO:                 keep_keys : ['image', 'shape', 'polys', 'ignore_tags']
[2024/11/05 03:44:40] ppocr INFO:     loader : 
[2024/11/05 03:44:40] ppocr INFO:         batch_size_per_card : 1
[2024/11/05 03:44:40] ppocr INFO:         drop_last : False
[2024/11/05 03:44:40] ppocr INFO:         num_workers : 1
[2024/11/05 03:44:40] ppocr INFO:         shuffle : False
[2024/11/05 03:44:40] ppocr INFO: Global : 
[2024/11/05 03:44:40] ppocr INFO:     cal_metric_during_train : False
[2024/11/05 03:44:40] ppocr INFO:     checkpoints : None
[2024/11/05 03:44:40] ppocr INFO:     debug : False
[2024/11/05 03:44:40] ppocr INFO:     distributed : False
[2024/11/05 03:44:40] ppocr INFO:     epoch_num : 5
[2024/11/05 03:44:40] ppocr INFO:     eval_batch_step : [0, 50]
[2024/11/05 03:44:40] ppocr INFO:     infer_img : doc/imgs_en/img_10.jpg
[2024/11/05 03:44:40] ppocr INFO:     log_smooth_window : 20
[2024/11/05 03:44:40] ppocr INFO:     pretrained_model : pretrained_models/ch_PP-OCRv3_det_distill_train/student.pdparams
[2024/11/05 03:44:40] ppocr INFO:     print_batch_step : 10
[2024/11/05 03:44:40] ppocr INFO:     save_epoch_step : 300
[2024/11/05 03:44:40] ppocr INFO:     save_inference_dir : None
[2024/11/05 03:44:40] ppocr INFO:     save_model_dir : output/train/det
[2024/11/05 03:44:40] ppocr INFO:     save_res_path : ./checkpoints/det_db/predicts_db.txt
[2024/11/05 03:44:40] ppocr INFO:     use_gpu : True
[2024/11/05 03:44:40] ppocr INFO:     use_visualdl : False
[2024/11/05 03:44:40] ppocr INFO: Loss : 
[2024/11/05 03:44:40] ppocr INFO:     alpha : 5
[2024/11/05 03:44:40] ppocr INFO:     balance_loss : True
[2024/11/05 03:44:40] ppocr INFO:     beta : 10
[2024/11/05 03:44:40] ppocr INFO:     main_loss_type : DiceLoss
[2024/11/05 03:44:40] ppocr INFO:     name : DBLoss
[2024/11/05 03:44:40] ppocr INFO:     ohem_ratio : 3
[2024/11/05 03:44:40] ppocr INFO: Metric : 
[2024/11/05 03:44:40] ppocr INFO:     main_indicator : hmean
[2024/11/05 03:44:40] ppocr INFO:     name : DetMetric
[2024/11/05 03:44:40] ppocr INFO: Optimizer : 
[2024/11/05 03:44:40] ppocr INFO:     beta1 : 0.9
[2024/11/05 03:44:40] ppocr INFO:     beta2 : 0.999
[2024/11/05 03:44:40] ppocr INFO:     lr : 
[2024/11/05 03:44:40] ppocr INFO:         learning_rate : 0.0005
[2024/11/05 03:44:40] ppocr INFO:         name : Const
[2024/11/05 03:44:40] ppocr INFO:         warmup_epoch : 0
[2024/11/05 03:44:40] ppocr INFO:     name : Adam
[2024/11/05 03:44:40] ppocr INFO:     regularizer : 
[2024/11/05 03:44:40] ppocr INFO:         factor : 5e-05
[2024/11/05 03:44:40] ppocr INFO:         name : L2
[2024/11/05 03:44:40] ppocr INFO: PostProcess : 
[2024/11/05 03:44:40] ppocr INFO:     box_thresh : 0.6
[2024/11/05 03:44:40] ppocr INFO:     max_candidates : 1000
[2024/11/05 03:44:40] ppocr INFO:     name : DBPostProcess
[2024/11/05 03:44:40] ppocr INFO:     thresh : 0.3
[2024/11/05 03:44:40] ppocr INFO:     unclip_ratio : 1.5
[2024/11/05 03:44:40] ppocr INFO: Train : 
[2024/11/05 03:44:40] ppocr INFO:     dataset : 
[2024/11/05 03:44:40] ppocr INFO:         data_dir : train_data/2024092502/det
[2024/11/05 03:44:40] ppocr INFO:         label_file_list : ['train_data/2024092502/det/train.txt']
[2024/11/05 03:44:40] ppocr INFO:         name : SimpleDataSet
[2024/11/05 03:44:40] ppocr INFO:         ratio_list : [1.0]
[2024/11/05 03:44:40] ppocr INFO:         transforms : 
[2024/11/05 03:44:40] ppocr INFO:             DecodeImage : 
[2024/11/05 03:44:40] ppocr INFO:                 channel_first : False
[2024/11/05 03:44:40] ppocr INFO:                 img_mode : BGR
[2024/11/05 03:44:40] ppocr INFO:             DetLabelEncode : None
[2024/11/05 03:44:40] ppocr INFO:             IaaAugment : 
[2024/11/05 03:44:40] ppocr INFO:                 augmenter_args : 
[2024/11/05 03:44:40] ppocr INFO:                     args : 
[2024/11/05 03:44:40] ppocr INFO:                         p : 0.5
[2024/11/05 03:44:40] ppocr INFO:                     type : Fliplr
[2024/11/05 03:44:40] ppocr INFO:                     args : 
[2024/11/05 03:44:40] ppocr INFO:                         rotate : [-10, 10]
[2024/11/05 03:44:40] ppocr INFO:                     type : Affine
[2024/11/05 03:44:40] ppocr INFO:                     args : 
[2024/11/05 03:44:40] ppocr INFO:                         size : [0.5, 3]
[2024/11/05 03:44:40] ppocr INFO:                     type : Resize
[2024/11/05 03:44:40] ppocr INFO:             EastRandomCropData : 
[2024/11/05 03:44:40] ppocr INFO:                 keep_ratio : True
[2024/11/05 03:44:40] ppocr INFO:                 max_tries : 50
[2024/11/05 03:44:40] ppocr INFO:                 size : [960, 960]
[2024/11/05 03:44:40] ppocr INFO:             MakeBorderMap : 
[2024/11/05 03:44:40] ppocr INFO:                 shrink_ratio : 0.4
[2024/11/05 03:44:40] ppocr INFO:                 thresh_max : 0.7
[2024/11/05 03:44:40] ppocr INFO:                 thresh_min : 0.3
[2024/11/05 03:44:40] ppocr INFO:             MakeShrinkMap : 
[2024/11/05 03:44:40] ppocr INFO:                 min_text_size : 8
[2024/11/05 03:44:40] ppocr INFO:                 shrink_ratio : 0.4
[2024/11/05 03:44:40] ppocr INFO:             NormalizeImage : 
[2024/11/05 03:44:40] ppocr INFO:                 mean : [0.485, 0.456, 0.406]
[2024/11/05 03:44:40] ppocr INFO:                 order : hwc
[2024/11/05 03:44:40] ppocr INFO:                 scale : 1./255.
[2024/11/05 03:44:40] ppocr INFO:                 std : [0.229, 0.224, 0.225]
[2024/11/05 03:44:40] ppocr INFO:             ToCHWImage : None
[2024/11/05 03:44:40] ppocr INFO:             KeepKeys : 
[2024/11/05 03:44:40] ppocr INFO:                 keep_keys : ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask']
[2024/11/05 03:44:40] ppocr INFO:     loader : 
[2024/11/05 03:44:40] ppocr INFO:         batch_size_per_card : 16
[2024/11/05 03:44:40] ppocr INFO:         drop_last : False
[2024/11/05 03:44:40] ppocr INFO:         num_workers : 8
[2024/11/05 03:44:40] ppocr INFO:         shuffle : True
[2024/11/05 03:44:40] ppocr INFO: profiler_options : None
[2024/11/05 03:44:40] ppocr INFO: train with paddle 3.0.0-beta1 and device Place(gpu:0)
[2024/11/05 03:44:40] ppocr INFO: Initialize indexs of datasets:['train_data/2024092502/det/train.txt']
[2024/11/05 03:44:40] ppocr INFO: Initialize indexs of datasets:['train_data/2024092502/det/val.txt']
W1105 03:44:40.914129  7321 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.4, Runtime API Version: 12.3
W1105 03:44:40.917488  7321 gpu_resources.cc:164] device: 0, cuDNN Version: 9.0.
[2024/11/05 03:44:41] ppocr INFO: train dataloader has 52 iters
[2024/11/05 03:44:41] ppocr INFO: valid dataloader has 205 iters
[2024/11/05 03:44:41] ppocr INFO: load pretrain successful from pretrained_models/ch_PP-OCRv3_det_distill_train/student
[2024/11/05 03:44:41] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 50 iterations
[2024/11/05 03:44:49] ppocr INFO: epoch: [1/5], global_step: 10, lr: 0.000500, loss: 2.411226, loss_shrink_maps: 1.516755, loss_threshold_maps: 0.583720, loss_binary_maps: 0.303819, loss_cbn: 0.000000, avg_reader_cost: 0.17043 s, avg_batch_cost: 0.72321 s, avg_samples: 16.0, ips: 22.12364 samples/s, eta: 0:03:00, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:44:55] ppocr INFO: epoch: [1/5], global_step: 20, lr: 0.000500, loss: 2.155175, loss_shrink_maps: 1.363963, loss_threshold_maps: 0.538264, loss_binary_maps: 0.272408, loss_cbn: 0.000000, avg_reader_cost: 0.00060 s, avg_batch_cost: 0.42729 s, avg_samples: 16.0, ips: 37.44502 samples/s, eta: 0:02:18, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:00] ppocr INFO: epoch: [1/5], global_step: 30, lr: 0.000500, loss: 1.251230, loss_shrink_maps: 0.667089, loss_threshold_maps: 0.452958, loss_binary_maps: 0.133242, loss_cbn: 0.000000, avg_reader_cost: 0.00059 s, avg_batch_cost: 0.42648 s, avg_samples: 16.0, ips: 37.51665 samples/s, eta: 0:02:00, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:06] ppocr INFO: epoch: [1/5], global_step: 40, lr: 0.000500, loss: 1.175845, loss_shrink_maps: 0.593399, loss_threshold_maps: 0.443418, loss_binary_maps: 0.118228, loss_cbn: 0.000000, avg_reader_cost: 0.00054 s, avg_batch_cost: 0.42289 s, avg_samples: 16.0, ips: 37.83511 samples/s, eta: 0:01:49, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:11] ppocr INFO: epoch: [1/5], global_step: 50, lr: 0.000500, loss: 1.148533, loss_shrink_maps: 0.561819, loss_threshold_maps: 0.441667, loss_binary_maps: 0.111523, loss_cbn: 0.000000, avg_reader_cost: 0.00040 s, avg_batch_cost: 0.41491 s, avg_samples: 16.0, ips: 38.56294 samples/s, eta: 0:01:41, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 44.29it/s]
[2024/11/05 03:45:16] ppocr INFO: cur metric, precision: 0.8491379310344828, recall: 0.9516908212560387, hmean: 0.89749430523918, fps: 117.85659291562934
[2024/11/05 03:45:16] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:45:16] ppocr INFO: best metric, hmean: 0.89749430523918, is_float16: False, precision: 0.8491379310344828, recall: 0.9516908212560387, fps: 117.85659291562934, best_epoch: 1
[2024/11/05 03:45:17] ppocr INFO: epoch: [1/5], global_step: 52, lr: 0.000500, loss: 1.148533, loss_shrink_maps: 0.561819, loss_threshold_maps: 0.442872, loss_binary_maps: 0.111133, loss_cbn: 0.000000, avg_reader_cost: 0.00008 s, avg_batch_cost: 0.05085 s, avg_samples: 2.2, ips: 43.26385 samples/s, eta: 0:01:38, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:17] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:45:23] ppocr INFO: epoch: [2/5], global_step: 60, lr: 0.000500, loss: 1.101211, loss_shrink_maps: 0.559913, loss_threshold_maps: 0.442170, loss_binary_maps: 0.110761, loss_cbn: 0.000000, avg_reader_cost: 0.17911 s, avg_batch_cost: 0.53415 s, avg_samples: 12.8, ips: 23.96324 samples/s, eta: 0:01:39, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:29] ppocr INFO: epoch: [2/5], global_step: 70, lr: 0.000500, loss: 1.048014, loss_shrink_maps: 0.509727, loss_threshold_maps: 0.433972, loss_binary_maps: 0.102026, loss_cbn: 0.000000, avg_reader_cost: 0.00056 s, avg_batch_cost: 0.42238 s, avg_samples: 16.0, ips: 37.88042 samples/s, eta: 0:01:32, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:34] ppocr INFO: epoch: [2/5], global_step: 80, lr: 0.000500, loss: 1.025770, loss_shrink_maps: 0.502827, loss_threshold_maps: 0.406002, loss_binary_maps: 0.099933, loss_cbn: 0.000000, avg_reader_cost: 0.00056 s, avg_batch_cost: 0.42322 s, avg_samples: 16.0, ips: 37.80524 samples/s, eta: 0:01:26, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:45:40] ppocr INFO: epoch: [2/5], global_step: 90, lr: 0.000500, loss: 1.100100, loss_shrink_maps: 0.580924, loss_threshold_maps: 0.406092, loss_binary_maps: 0.116078, loss_cbn: 0.000000, avg_reader_cost: 0.00055 s, avg_batch_cost: 0.42280 s, avg_samples: 16.0, ips: 37.84322 samples/s, eta: 0:01:20, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:45:45] ppocr INFO: epoch: [2/5], global_step: 100, lr: 0.000500, loss: 1.043621, loss_shrink_maps: 0.524704, loss_threshold_maps: 0.409671, loss_binary_maps: 0.105025, loss_cbn: 0.000000, avg_reader_cost: 0.00039 s, avg_batch_cost: 0.41913 s, avg_samples: 16.0, ips: 38.17386 samples/s, eta: 0:01:14, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 46.56it/s]
[2024/11/05 03:45:50] ppocr INFO: cur metric, precision: 0.9384615384615385, recall: 0.8840579710144928, hmean: 0.9104477611940298, fps: 137.22437976049994
[2024/11/05 03:45:50] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:45:50] ppocr INFO: best metric, hmean: 0.9104477611940298, is_float16: False, precision: 0.9384615384615385, recall: 0.8840579710144928, fps: 137.22437976049994, best_epoch: 2
[2024/11/05 03:45:51] ppocr INFO: epoch: [2/5], global_step: 104, lr: 0.000500, loss: 0.993080, loss_shrink_maps: 0.487648, loss_threshold_maps: 0.409671, loss_binary_maps: 0.097196, loss_cbn: 0.000000, avg_reader_cost: 0.00017 s, avg_batch_cost: 0.13128 s, avg_samples: 5.4, ips: 41.13481 samples/s, eta: 0:01:12, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:45:52] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:45:57] ppocr INFO: epoch: [3/5], global_step: 110, lr: 0.000500, loss: 0.977438, loss_shrink_maps: 0.484734, loss_threshold_maps: 0.401960, loss_binary_maps: 0.096992, loss_cbn: 0.000000, avg_reader_cost: 0.19116 s, avg_batch_cost: 0.46093 s, avg_samples: 9.6, ips: 20.82758 samples/s, eta: 0:01:11, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:02] ppocr INFO: epoch: [3/5], global_step: 120, lr: 0.000500, loss: 0.949405, loss_shrink_maps: 0.474069, loss_threshold_maps: 0.390250, loss_binary_maps: 0.094968, loss_cbn: 0.000000, avg_reader_cost: 0.00054 s, avg_batch_cost: 0.42337 s, avg_samples: 16.0, ips: 37.79235 samples/s, eta: 0:01:06, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:08] ppocr INFO: epoch: [3/5], global_step: 130, lr: 0.000500, loss: 0.956546, loss_shrink_maps: 0.475362, loss_threshold_maps: 0.390981, loss_binary_maps: 0.095230, loss_cbn: 0.000000, avg_reader_cost: 0.00055 s, avg_batch_cost: 0.42715 s, avg_samples: 16.0, ips: 37.45749 samples/s, eta: 0:01:01, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:13] ppocr INFO: epoch: [3/5], global_step: 140, lr: 0.000500, loss: 0.938787, loss_shrink_maps: 0.457235, loss_threshold_maps: 0.389091, loss_binary_maps: 0.091687, loss_cbn: 0.000000, avg_reader_cost: 0.00053 s, avg_batch_cost: 0.42422 s, avg_samples: 16.0, ips: 37.71627 samples/s, eta: 0:00:56, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:19] ppocr INFO: epoch: [3/5], global_step: 150, lr: 0.000500, loss: 0.952552, loss_shrink_maps: 0.457240, loss_threshold_maps: 0.396839, loss_binary_maps: 0.091207, loss_cbn: 0.000000, avg_reader_cost: 0.00038 s, avg_batch_cost: 0.41964 s, avg_samples: 16.0, ips: 38.12752 samples/s, eta: 0:00:51, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 47.62it/s]
[2024/11/05 03:46:23] ppocr INFO: cur metric, precision: 0.9395348837209302, recall: 0.9758454106280193, hmean: 0.957345971563981, fps: 141.7962959542575
[2024/11/05 03:46:23] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:46:23] ppocr INFO: best metric, hmean: 0.957345971563981, is_float16: False, precision: 0.9395348837209302, recall: 0.9758454106280193, fps: 141.7962959542575, best_epoch: 3
[2024/11/05 03:46:26] ppocr INFO: epoch: [3/5], global_step: 156, lr: 0.000500, loss: 0.968939, loss_shrink_maps: 0.485452, loss_threshold_maps: 0.406191, loss_binary_maps: 0.096986, loss_cbn: 0.000000, avg_reader_cost: 0.00022 s, avg_batch_cost: 0.21455 s, avg_samples: 8.6, ips: 40.08343 samples/s, eta: 0:00:47, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:26] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:46:31] ppocr INFO: epoch: [4/5], global_step: 160, lr: 0.000500, loss: 0.952552, loss_shrink_maps: 0.483840, loss_threshold_maps: 0.395270, loss_binary_maps: 0.096986, loss_cbn: 0.000000, avg_reader_cost: 0.18570 s, avg_batch_cost: 0.37285 s, avg_samples: 6.4, ips: 17.16513 samples/s, eta: 0:00:47, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:36] ppocr INFO: epoch: [4/5], global_step: 170, lr: 0.000500, loss: 0.877584, loss_shrink_maps: 0.432885, loss_threshold_maps: 0.362672, loss_binary_maps: 0.086792, loss_cbn: 0.000000, avg_reader_cost: 0.00056 s, avg_batch_cost: 0.42367 s, avg_samples: 16.0, ips: 37.76539 samples/s, eta: 0:00:42, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:42] ppocr INFO: epoch: [4/5], global_step: 180, lr: 0.000500, loss: 0.859737, loss_shrink_maps: 0.415850, loss_threshold_maps: 0.360794, loss_binary_maps: 0.083092, loss_cbn: 0.000000, avg_reader_cost: 0.00054 s, avg_batch_cost: 0.42430 s, avg_samples: 16.0, ips: 37.70891 samples/s, eta: 0:00:37, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:47] ppocr INFO: epoch: [4/5], global_step: 190, lr: 0.000500, loss: 0.907607, loss_shrink_maps: 0.435804, loss_threshold_maps: 0.382122, loss_binary_maps: 0.087253, loss_cbn: 0.000000, avg_reader_cost: 0.00058 s, avg_batch_cost: 0.42434 s, avg_samples: 16.0, ips: 37.70542 samples/s, eta: 0:00:32, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:53] ppocr INFO: epoch: [4/5], global_step: 200, lr: 0.000500, loss: 0.914766, loss_shrink_maps: 0.435804, loss_threshold_maps: 0.393868, loss_binary_maps: 0.087253, loss_cbn: 0.000000, avg_reader_cost: 0.00041 s, avg_batch_cost: 0.42173 s, avg_samples: 16.0, ips: 37.93916 samples/s, eta: 0:00:27, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 48.34it/s]
[2024/11/05 03:46:57] ppocr INFO: cur metric, precision: 0.9615384615384616, recall: 0.966183574879227, hmean: 0.963855421686747, fps: 144.523065757204
[2024/11/05 03:46:57] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:46:57] ppocr INFO: best metric, hmean: 0.963855421686747, is_float16: False, precision: 0.9615384615384616, recall: 0.966183574879227, fps: 144.523065757204, best_epoch: 4
[2024/11/05 03:47:01] ppocr INFO: epoch: [4/5], global_step: 208, lr: 0.000500, loss: 0.905105, loss_shrink_maps: 0.428904, loss_threshold_maps: 0.391091, loss_binary_maps: 0.085814, loss_cbn: 0.000000, avg_reader_cost: 0.00040 s, avg_batch_cost: 0.29841 s, avg_samples: 11.8, ips: 39.54349 samples/s, eta: 0:00:23, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:01] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:47:04] ppocr INFO: epoch: [5/5], global_step: 210, lr: 0.000500, loss: 0.896861, loss_shrink_maps: 0.428904, loss_threshold_maps: 0.388912, loss_binary_maps: 0.085814, loss_cbn: 0.000000, avg_reader_cost: 0.19693 s, avg_batch_cost: 0.29798 s, avg_samples: 3.2, ips: 10.73897 samples/s, eta: 0:00:23, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:10] ppocr INFO: epoch: [5/5], global_step: 220, lr: 0.000500, loss: 0.892904, loss_shrink_maps: 0.421943, loss_threshold_maps: 0.388345, loss_binary_maps: 0.083732, loss_cbn: 0.000000, avg_reader_cost: 0.00061 s, avg_batch_cost: 0.42509 s, avg_samples: 16.0, ips: 37.63907 samples/s, eta: 0:00:18, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:15] ppocr INFO: epoch: [5/5], global_step: 230, lr: 0.000500, loss: 0.881252, loss_shrink_maps: 0.419586, loss_threshold_maps: 0.373073, loss_binary_maps: 0.083601, loss_cbn: 0.000000, avg_reader_cost: 0.00056 s, avg_batch_cost: 0.42396 s, avg_samples: 16.0, ips: 37.73919 samples/s, eta: 0:00:13, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:21] ppocr INFO: epoch: [5/5], global_step: 240, lr: 0.000500, loss: 0.828409, loss_shrink_maps: 0.400391, loss_threshold_maps: 0.355088, loss_binary_maps: 0.079995, loss_cbn: 0.000000, avg_reader_cost: 0.00055 s, avg_batch_cost: 0.42439 s, avg_samples: 16.0, ips: 37.70157 samples/s, eta: 0:00:09, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:27] ppocr INFO: epoch: [5/5], global_step: 250, lr: 0.000500, loss: 0.845202, loss_shrink_maps: 0.397721, loss_threshold_maps: 0.362133, loss_binary_maps: 0.079616, loss_cbn: 0.000000, avg_reader_cost: 0.00051 s, avg_batch_cost: 0.42148 s, avg_samples: 16.0, ips: 37.96146 samples/s, eta: 0:00:04, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 44.84it/s]
[2024/11/05 03:47:31] ppocr INFO: cur metric, precision: 0.9626168224299065, recall: 0.9951690821256038, hmean: 0.9786223277909739, fps: 135.49694607540866
[2024/11/05 03:47:31] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:47:31] ppocr INFO: best metric, hmean: 0.9786223277909739, is_float16: False, precision: 0.9626168224299065, recall: 0.9951690821256038, fps: 135.49694607540866, best_epoch: 5
[2024/11/05 03:47:36] ppocr INFO: epoch: [5/5], global_step: 260, lr: 0.000500, loss: 0.880989, loss_shrink_maps: 0.428197, loss_threshold_maps: 0.375561, loss_binary_maps: 0.085547, loss_cbn: 0.000000, avg_reader_cost: 0.00051 s, avg_batch_cost: 0.38292 s, avg_samples: 15.0, ips: 39.17274 samples/s, eta: 0:00:00, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:37] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:47:37] ppocr INFO: best metric, hmean: 0.9786223277909739, is_float16: False, precision: 0.9626168224299065, recall: 0.9951690821256038, fps: 135.49694607540866, best_epoch: 5
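
The `hmean` values in the log above are the harmonic mean of precision and recall, hmean = 2PR / (P + R). A quick check against the first and last evaluations, using numbers copied from the log (the function name `hmean` exists only in this sketch):

```python
import math


def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, logged hmean) from the epoch 1 and epoch 5 evaluations.
logged = [
    (0.8491379310344828, 0.9516908212560387, 0.89749430523918),
    (0.9626168224299065, 0.9951690821256038, 0.9786223277909739),
]
for p, r, expected in logged:
    assert math.isclose(hmean(p, r), expected, rel_tol=1e-9)
```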

training log for ppocrv4

train_2.log

Related PR:

replace imgaug with albumentations #13467

jzhang533

👍🏻 👍🏻 👍🏻
LGTM

BTW: we may want to remove the six library after this PR lands, since we no longer need to support Python 2.
Several code snippets still require six, but they should be easy to update.
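
For illustration only, these are the kinds of Python 3 replacements such a cleanup usually involves. Which six helpers the repository actually uses is not stated in this thread, so the three cases below are assumptions, and the function names are hypothetical.

```python
import io


def is_text(value):
    # six.string_types -> str
    return isinstance(value, str)


def read_header(raw, n=4):
    # six.BytesIO -> io.BytesIO
    buf = io.BytesIO(raw)
    return buf.read(n)


def sorted_items(mapping):
    # six.iteritems(mapping) -> mapping.items()
    return sorted(mapping.items())
```

Each replacement is part of the Python 3 standard library or a builtin, so no extra dependency is needed once six is dropped.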

mmglove · 2024-11-15T03:13:08Z

This PR causes a training error in PaddleOCR_det_r50_vd_pse_v2_0_bs8_fp16_DP_dynamic_N1C1:

To reproduce:
docker: iregistry.baidu-int.com/paddlecloud/base-images:paddlecloud-ubuntu18.04-gcc8.2-cuda11.8-cudnn8.6-nccl2.15.5
python3.10
cd PaddleOCR
export CUDA_VISIBLE_DEVICES=0;
bash test_tipc/prepare.sh test_tipc/configs/det_r50_vd_pse_v2_0/train_infer_python.txt benchmark_train;
python tools/train.py -c test_tipc/configs/det_r50_vd_pse_v2_0/det_r50_vd_pse.yml -o Global.print_batch_step=1 Train.loader.shuffle=false Global.use_gpu=True Global.save_model_dir=./test_tipc/output/det_r50_vd_pse_v2_0/benchmark_train/norm_train_gpus_0_autocast_amp Global.epoch_num=2 Train.loader.batch_size_per_card=8 Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True

GreatV · 2024-11-15T03:18:01Z

@mmglove OK, understood. I'll take a look later.

GreatV · 2024-11-16T02:37:32Z

@mmglove fixed in #14239

GreatV added 2 commits October 2, 2024 03:20

upgrade to numpy 2.0 and remove imgaug

da183c0

fix bug

e39d059

GreatV changed the title ~~upgrade to numpy 2.0 and remove imgaug~~ [WIP] upgrade to numpy 2.0 and remove imgaug Oct 2, 2024

GreatV added 2 commits October 2, 2024 03:46

fix bug

8055b53

fix bug

a37c00d

agzeroo mentioned this pull request Oct 13, 2024

numpy version issue #13991

Open

2 tasks

GreatV added 2 commits November 5, 2024 02:18

fix bug

b8b185c

Merge branch 'main' into upgrade_numpy_2.0_remove_imgaug

7c0a38f

GreatV changed the title ~~[WIP] upgrade to numpy 2.0 and remove imgaug~~ upgrade to numpy 2.0 and remove imgaug Nov 5, 2024

GreatV added 2 commits November 5, 2024 03:03

fix bug

a1259cf

add license

e9da75e

GreatV requested review from Liyulingyue and jzhang533 November 5, 2024 03:24

Liyulingyue approved these changes Nov 5, 2024

jzhang533 approved these changes Nov 5, 2024

GreatV merged commit 15fb82d into PaddlePaddle:main Nov 6, 2024
3 checks passed

GreatV deleted the upgrade_numpy_2.0_remove_imgaug branch November 6, 2024 04:09

github-actions bot locked as resolved and limited conversation to collaborators Nov 11, 2024

paddle-bot bot added the contributor label Nov 13, 2024

PaddlePaddle unlocked this conversation Nov 15, 2024

GreatV mentioned this pull request Nov 16, 2024

fix benchmark det_r50_vd_pse_v2_0 train error #14239

Merged
