upgrade to numpy 2.0 and remove imgaug #13937

Merged (8 commits) on Nov 6, 2024

Collaborator

@GreatV GreatV commented Oct 2, 2024

This pull request includes significant changes to the augmentation pipeline, dependencies, and testing framework. The main updates involve replacing imgaug with albumentations, introducing a custom resize transformation, and adding comprehensive tests for the new augmentation behavior.

Augmentation Pipeline Updates:

  • Replaced imgaug with albumentations for image augmentation, ensuring compatibility by preventing automatic updates in Albumentations (ppocr/data/imaug/iaa_augment.py).
  • Introduced ImgaugLikeResize, a custom resize transformation to mimic imgaug behavior with scaling (ppocr/data/imaug/iaa_augment.py).
  • Updated AugmenterBuilder to map common imgaug transformations to albumentations equivalents and handle custom augmenter arguments (ppocr/data/imaug/iaa_augment.py).
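The name mapping and the imgaug-style resize described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the PR: the `IMGAUG_TO_ALBU` table, `translate_augmenter`, and `imgaug_like_resize` are hypothetical names, and the sketch assumes that a `size` range such as `[0.5, 3]` means one scale factor sampled uniformly per image and applied to both height and width. It only computes the new image shape and the scaled polygon points; the actual pixel resampling is left out.

```python
import random

# Hypothetical mapping from imgaug augmenter names to albumentations names.
# The table in the PR may contain different or additional entries.
IMGAUG_TO_ALBU = {
    "Fliplr": "HorizontalFlip",
    "Flipud": "VerticalFlip",
    "Affine": "Affine",
}


def translate_augmenter(spec):
    """Map one {'type': ..., 'args': ...} config entry to a (name, args) pair."""
    name = spec["type"]
    args = dict(spec.get("args") or {})
    # Unknown names pass through unchanged so custom transforms can handle them.
    return IMGAUG_TO_ALBU.get(name, name), args


def imgaug_like_resize(height, width, polys, scale_range, rng=random):
    """Sample one scale factor and apply it to the image size and polygon points."""
    scale = rng.uniform(*scale_range)
    new_h = max(1, int(round(height * scale)))
    new_w = max(1, int(round(width * scale)))
    # Use the realized ratios so the points stay aligned with the rounded size.
    sx, sy = new_w / width, new_h / height
    new_polys = [[(x * sx, y * sy) for x, y in poly] for poly in polys]
    return (new_h, new_w), new_polys


if __name__ == "__main__":
    config = [
        {"type": "Fliplr", "args": {"p": 0.5}},
        {"type": "Affine", "args": {"rotate": [-10, 10]}},
        {"type": "Resize", "args": {"size": [0.5, 3]}},
    ]
    for entry in config:
        print(translate_augmenter(entry))
    box = [(10, 20), (50, 20), (50, 40), (10, 40)]
    print(imgaug_like_resize(100, 200, [box], (0.5, 3), random.Random(0)))
```

Scaling the polygons with the realized width and height ratios, rather than the sampled factor, keeps the labels consistent with the image after its size is rounded to whole pixels.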

Dependency Updates:

  • Removed imgaug and added albumentations and albucore to the dependencies to support the new augmentation library (pyproject.toml).

Testing Enhancements:

  • Added a new test file, tests/test_iaa_augment.py, with tests for the new augmentation pipeline, including fixtures for sample images and polygons and test cases for different augmentation scenarios.

Training Log:

training log for ppocrv3

[2024/11/05 03:44:40] ppocr INFO: Architecture : 
[2024/11/05 03:44:40] ppocr INFO:     Backbone : 
[2024/11/05 03:44:40] ppocr INFO:         disable_se : True
[2024/11/05 03:44:40] ppocr INFO:         model_name : large
[2024/11/05 03:44:40] ppocr INFO:         name : MobileNetV3
[2024/11/05 03:44:40] ppocr INFO:         scale : 0.5
[2024/11/05 03:44:40] ppocr INFO:     Head : 
[2024/11/05 03:44:40] ppocr INFO:         k : 50
[2024/11/05 03:44:40] ppocr INFO:         name : DBHead
[2024/11/05 03:44:40] ppocr INFO:     Neck : 
[2024/11/05 03:44:40] ppocr INFO:         name : RSEFPN
[2024/11/05 03:44:40] ppocr INFO:         out_channels : 96
[2024/11/05 03:44:40] ppocr INFO:         shortcut : True
[2024/11/05 03:44:40] ppocr INFO:     Transform : None
[2024/11/05 03:44:40] ppocr INFO:     algorithm : DB
[2024/11/05 03:44:40] ppocr INFO:     model_type : det
[2024/11/05 03:44:40] ppocr INFO: Eval : 
[2024/11/05 03:44:40] ppocr INFO:     dataset : 
[2024/11/05 03:44:40] ppocr INFO:         data_dir : train_data/2024092502/det
[2024/11/05 03:44:40] ppocr INFO:         label_file_list : ['train_data/2024092502/det/val.txt']
[2024/11/05 03:44:40] ppocr INFO:         name : SimpleDataSet
[2024/11/05 03:44:40] ppocr INFO:         transforms : 
[2024/11/05 03:44:40] ppocr INFO:             DecodeImage : 
[2024/11/05 03:44:40] ppocr INFO:                 channel_first : False
[2024/11/05 03:44:40] ppocr INFO:                 img_mode : BGR
[2024/11/05 03:44:40] ppocr INFO:             DetLabelEncode : None
[2024/11/05 03:44:40] ppocr INFO:             DetResizeForTest : None
[2024/11/05 03:44:40] ppocr INFO:             NormalizeImage : 
[2024/11/05 03:44:40] ppocr INFO:                 mean : [0.485, 0.456, 0.406]
[2024/11/05 03:44:40] ppocr INFO:                 order : hwc
[2024/11/05 03:44:40] ppocr INFO:                 scale : 1./255.
[2024/11/05 03:44:40] ppocr INFO:                 std : [0.229, 0.224, 0.225]
[2024/11/05 03:44:40] ppocr INFO:             ToCHWImage : None
[2024/11/05 03:44:40] ppocr INFO:             KeepKeys : 
[2024/11/05 03:44:40] ppocr INFO:                 keep_keys : ['image', 'shape', 'polys', 'ignore_tags']
[2024/11/05 03:44:40] ppocr INFO:     loader : 
[2024/11/05 03:44:40] ppocr INFO:         batch_size_per_card : 1
[2024/11/05 03:44:40] ppocr INFO:         drop_last : False
[2024/11/05 03:44:40] ppocr INFO:         num_workers : 1
[2024/11/05 03:44:40] ppocr INFO:         shuffle : False
[2024/11/05 03:44:40] ppocr INFO: Global : 
[2024/11/05 03:44:40] ppocr INFO:     cal_metric_during_train : False
[2024/11/05 03:44:40] ppocr INFO:     checkpoints : None
[2024/11/05 03:44:40] ppocr INFO:     debug : False
[2024/11/05 03:44:40] ppocr INFO:     distributed : False
[2024/11/05 03:44:40] ppocr INFO:     epoch_num : 5
[2024/11/05 03:44:40] ppocr INFO:     eval_batch_step : [0, 50]
[2024/11/05 03:44:40] ppocr INFO:     infer_img : doc/imgs_en/img_10.jpg
[2024/11/05 03:44:40] ppocr INFO:     log_smooth_window : 20
[2024/11/05 03:44:40] ppocr INFO:     pretrained_model : pretrained_models/ch_PP-OCRv3_det_distill_train/student.pdparams
[2024/11/05 03:44:40] ppocr INFO:     print_batch_step : 10
[2024/11/05 03:44:40] ppocr INFO:     save_epoch_step : 300
[2024/11/05 03:44:40] ppocr INFO:     save_inference_dir : None
[2024/11/05 03:44:40] ppocr INFO:     save_model_dir : output/train/det
[2024/11/05 03:44:40] ppocr INFO:     save_res_path : ./checkpoints/det_db/predicts_db.txt
[2024/11/05 03:44:40] ppocr INFO:     use_gpu : True
[2024/11/05 03:44:40] ppocr INFO:     use_visualdl : False
[2024/11/05 03:44:40] ppocr INFO: Loss : 
[2024/11/05 03:44:40] ppocr INFO:     alpha : 5
[2024/11/05 03:44:40] ppocr INFO:     balance_loss : True
[2024/11/05 03:44:40] ppocr INFO:     beta : 10
[2024/11/05 03:44:40] ppocr INFO:     main_loss_type : DiceLoss
[2024/11/05 03:44:40] ppocr INFO:     name : DBLoss
[2024/11/05 03:44:40] ppocr INFO:     ohem_ratio : 3
[2024/11/05 03:44:40] ppocr INFO: Metric : 
[2024/11/05 03:44:40] ppocr INFO:     main_indicator : hmean
[2024/11/05 03:44:40] ppocr INFO:     name : DetMetric
[2024/11/05 03:44:40] ppocr INFO: Optimizer : 
[2024/11/05 03:44:40] ppocr INFO:     beta1 : 0.9
[2024/11/05 03:44:40] ppocr INFO:     beta2 : 0.999
[2024/11/05 03:44:40] ppocr INFO:     lr : 
[2024/11/05 03:44:40] ppocr INFO:         learning_rate : 0.0005
[2024/11/05 03:44:40] ppocr INFO:         name : Const
[2024/11/05 03:44:40] ppocr INFO:         warmup_epoch : 0
[2024/11/05 03:44:40] ppocr INFO:     name : Adam
[2024/11/05 03:44:40] ppocr INFO:     regularizer : 
[2024/11/05 03:44:40] ppocr INFO:         factor : 5e-05
[2024/11/05 03:44:40] ppocr INFO:         name : L2
[2024/11/05 03:44:40] ppocr INFO: PostProcess : 
[2024/11/05 03:44:40] ppocr INFO:     box_thresh : 0.6
[2024/11/05 03:44:40] ppocr INFO:     max_candidates : 1000
[2024/11/05 03:44:40] ppocr INFO:     name : DBPostProcess
[2024/11/05 03:44:40] ppocr INFO:     thresh : 0.3
[2024/11/05 03:44:40] ppocr INFO:     unclip_ratio : 1.5
[2024/11/05 03:44:40] ppocr INFO: Train : 
[2024/11/05 03:44:40] ppocr INFO:     dataset : 
[2024/11/05 03:44:40] ppocr INFO:         data_dir : train_data/2024092502/det
[2024/11/05 03:44:40] ppocr INFO:         label_file_list : ['train_data/2024092502/det/train.txt']
[2024/11/05 03:44:40] ppocr INFO:         name : SimpleDataSet
[2024/11/05 03:44:40] ppocr INFO:         ratio_list : [1.0]
[2024/11/05 03:44:40] ppocr INFO:         transforms : 
[2024/11/05 03:44:40] ppocr INFO:             DecodeImage : 
[2024/11/05 03:44:40] ppocr INFO:                 channel_first : False
[2024/11/05 03:44:40] ppocr INFO:                 img_mode : BGR
[2024/11/05 03:44:40] ppocr INFO:             DetLabelEncode : None
[2024/11/05 03:44:40] ppocr INFO:             IaaAugment : 
[2024/11/05 03:44:40] ppocr INFO:                 augmenter_args : 
[2024/11/05 03:44:40] ppocr INFO:                     args : 
[2024/11/05 03:44:40] ppocr INFO:                         p : 0.5
[2024/11/05 03:44:40] ppocr INFO:                     type : Fliplr
[2024/11/05 03:44:40] ppocr INFO:                     args : 
[2024/11/05 03:44:40] ppocr INFO:                         rotate : [-10, 10]
[2024/11/05 03:44:40] ppocr INFO:                     type : Affine
[2024/11/05 03:44:40] ppocr INFO:                     args : 
[2024/11/05 03:44:40] ppocr INFO:                         size : [0.5, 3]
[2024/11/05 03:44:40] ppocr INFO:                     type : Resize
[2024/11/05 03:44:40] ppocr INFO:             EastRandomCropData : 
[2024/11/05 03:44:40] ppocr INFO:                 keep_ratio : True
[2024/11/05 03:44:40] ppocr INFO:                 max_tries : 50
[2024/11/05 03:44:40] ppocr INFO:                 size : [960, 960]
[2024/11/05 03:44:40] ppocr INFO:             MakeBorderMap : 
[2024/11/05 03:44:40] ppocr INFO:                 shrink_ratio : 0.4
[2024/11/05 03:44:40] ppocr INFO:                 thresh_max : 0.7
[2024/11/05 03:44:40] ppocr INFO:                 thresh_min : 0.3
[2024/11/05 03:44:40] ppocr INFO:             MakeShrinkMap : 
[2024/11/05 03:44:40] ppocr INFO:                 min_text_size : 8
[2024/11/05 03:44:40] ppocr INFO:                 shrink_ratio : 0.4
[2024/11/05 03:44:40] ppocr INFO:             NormalizeImage : 
[2024/11/05 03:44:40] ppocr INFO:                 mean : [0.485, 0.456, 0.406]
[2024/11/05 03:44:40] ppocr INFO:                 order : hwc
[2024/11/05 03:44:40] ppocr INFO:                 scale : 1./255.
[2024/11/05 03:44:40] ppocr INFO:                 std : [0.229, 0.224, 0.225]
[2024/11/05 03:44:40] ppocr INFO:             ToCHWImage : None
[2024/11/05 03:44:40] ppocr INFO:             KeepKeys : 
[2024/11/05 03:44:40] ppocr INFO:                 keep_keys : ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask']
[2024/11/05 03:44:40] ppocr INFO:     loader : 
[2024/11/05 03:44:40] ppocr INFO:         batch_size_per_card : 16
[2024/11/05 03:44:40] ppocr INFO:         drop_last : False
[2024/11/05 03:44:40] ppocr INFO:         num_workers : 8
[2024/11/05 03:44:40] ppocr INFO:         shuffle : True
[2024/11/05 03:44:40] ppocr INFO: profiler_options : None
[2024/11/05 03:44:40] ppocr INFO: train with paddle 3.0.0-beta1 and device Place(gpu:0)
[2024/11/05 03:44:40] ppocr INFO: Initialize indexs of datasets:['train_data/2024092502/det/train.txt']
[2024/11/05 03:44:40] ppocr INFO: Initialize indexs of datasets:['train_data/2024092502/det/val.txt']
W1105 03:44:40.914129  7321 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.4, Runtime API Version: 12.3
W1105 03:44:40.917488  7321 gpu_resources.cc:164] device: 0, cuDNN Version: 9.0.
[2024/11/05 03:44:41] ppocr INFO: train dataloader has 52 iters
[2024/11/05 03:44:41] ppocr INFO: valid dataloader has 205 iters
[2024/11/05 03:44:41] ppocr INFO: load pretrain successful from pretrained_models/ch_PP-OCRv3_det_distill_train/student
[2024/11/05 03:44:41] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 50 iterations
[2024/11/05 03:44:49] ppocr INFO: epoch: [1/5], global_step: 10, lr: 0.000500, loss: 2.411226, loss_shrink_maps: 1.516755, loss_threshold_maps: 0.583720, loss_binary_maps: 0.303819, loss_cbn: 0.000000, avg_reader_cost: 0.17043 s, avg_batch_cost: 0.72321 s, avg_samples: 16.0, ips: 22.12364 samples/s, eta: 0:03:00, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:44:55] ppocr INFO: epoch: [1/5], global_step: 20, lr: 0.000500, loss: 2.155175, loss_shrink_maps: 1.363963, loss_threshold_maps: 0.538264, loss_binary_maps: 0.272408, loss_cbn: 0.000000, avg_reader_cost: 0.00060 s, avg_batch_cost: 0.42729 s, avg_samples: 16.0, ips: 37.44502 samples/s, eta: 0:02:18, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:00] ppocr INFO: epoch: [1/5], global_step: 30, lr: 0.000500, loss: 1.251230, loss_shrink_maps: 0.667089, loss_threshold_maps: 0.452958, loss_binary_maps: 0.133242, loss_cbn: 0.000000, avg_reader_cost: 0.00059 s, avg_batch_cost: 0.42648 s, avg_samples: 16.0, ips: 37.51665 samples/s, eta: 0:02:00, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:06] ppocr INFO: epoch: [1/5], global_step: 40, lr: 0.000500, loss: 1.175845, loss_shrink_maps: 0.593399, loss_threshold_maps: 0.443418, loss_binary_maps: 0.118228, loss_cbn: 0.000000, avg_reader_cost: 0.00054 s, avg_batch_cost: 0.42289 s, avg_samples: 16.0, ips: 37.83511 samples/s, eta: 0:01:49, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:11] ppocr INFO: epoch: [1/5], global_step: 50, lr: 0.000500, loss: 1.148533, loss_shrink_maps: 0.561819, loss_threshold_maps: 0.441667, loss_binary_maps: 0.111523, loss_cbn: 0.000000, avg_reader_cost: 0.00040 s, avg_batch_cost: 0.41491 s, avg_samples: 16.0, ips: 38.56294 samples/s, eta: 0:01:41, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 44.29it/s]
[2024/11/05 03:45:16] ppocr INFO: cur metric, precision: 0.8491379310344828, recall: 0.9516908212560387, hmean: 0.89749430523918, fps: 117.85659291562934
[2024/11/05 03:45:16] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:45:16] ppocr INFO: best metric, hmean: 0.89749430523918, is_float16: False, precision: 0.8491379310344828, recall: 0.9516908212560387, fps: 117.85659291562934, best_epoch: 1
[2024/11/05 03:45:17] ppocr INFO: epoch: [1/5], global_step: 52, lr: 0.000500, loss: 1.148533, loss_shrink_maps: 0.561819, loss_threshold_maps: 0.442872, loss_binary_maps: 0.111133, loss_cbn: 0.000000, avg_reader_cost: 0.00008 s, avg_batch_cost: 0.05085 s, avg_samples: 2.2, ips: 43.26385 samples/s, eta: 0:01:38, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:17] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:45:23] ppocr INFO: epoch: [2/5], global_step: 60, lr: 0.000500, loss: 1.101211, loss_shrink_maps: 0.559913, loss_threshold_maps: 0.442170, loss_binary_maps: 0.110761, loss_cbn: 0.000000, avg_reader_cost: 0.17911 s, avg_batch_cost: 0.53415 s, avg_samples: 12.8, ips: 23.96324 samples/s, eta: 0:01:39, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:29] ppocr INFO: epoch: [2/5], global_step: 70, lr: 0.000500, loss: 1.048014, loss_shrink_maps: 0.509727, loss_threshold_maps: 0.433972, loss_binary_maps: 0.102026, loss_cbn: 0.000000, avg_reader_cost: 0.00056 s, avg_batch_cost: 0.42238 s, avg_samples: 16.0, ips: 37.88042 samples/s, eta: 0:01:32, max_mem_reserved: 12221 MB, max_mem_allocated: 10611 MB
[2024/11/05 03:45:34] ppocr INFO: epoch: [2/5], global_step: 80, lr: 0.000500, loss: 1.025770, loss_shrink_maps: 0.502827, loss_threshold_maps: 0.406002, loss_binary_maps: 0.099933, loss_cbn: 0.000000, avg_reader_cost: 0.00056 s, avg_batch_cost: 0.42322 s, avg_samples: 16.0, ips: 37.80524 samples/s, eta: 0:01:26, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:45:40] ppocr INFO: epoch: [2/5], global_step: 90, lr: 0.000500, loss: 1.100100, loss_shrink_maps: 0.580924, loss_threshold_maps: 0.406092, loss_binary_maps: 0.116078, loss_cbn: 0.000000, avg_reader_cost: 0.00055 s, avg_batch_cost: 0.42280 s, avg_samples: 16.0, ips: 37.84322 samples/s, eta: 0:01:20, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:45:45] ppocr INFO: epoch: [2/5], global_step: 100, lr: 0.000500, loss: 1.043621, loss_shrink_maps: 0.524704, loss_threshold_maps: 0.409671, loss_binary_maps: 0.105025, loss_cbn: 0.000000, avg_reader_cost: 0.00039 s, avg_batch_cost: 0.41913 s, avg_samples: 16.0, ips: 38.17386 samples/s, eta: 0:01:14, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 46.56it/s]
[2024/11/05 03:45:50] ppocr INFO: cur metric, precision: 0.9384615384615385, recall: 0.8840579710144928, hmean: 0.9104477611940298, fps: 137.22437976049994
[2024/11/05 03:45:50] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:45:50] ppocr INFO: best metric, hmean: 0.9104477611940298, is_float16: False, precision: 0.9384615384615385, recall: 0.8840579710144928, fps: 137.22437976049994, best_epoch: 2
[2024/11/05 03:45:51] ppocr INFO: epoch: [2/5], global_step: 104, lr: 0.000500, loss: 0.993080, loss_shrink_maps: 0.487648, loss_threshold_maps: 0.409671, loss_binary_maps: 0.097196, loss_cbn: 0.000000, avg_reader_cost: 0.00017 s, avg_batch_cost: 0.13128 s, avg_samples: 5.4, ips: 41.13481 samples/s, eta: 0:01:12, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:45:52] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:45:57] ppocr INFO: epoch: [3/5], global_step: 110, lr: 0.000500, loss: 0.977438, loss_shrink_maps: 0.484734, loss_threshold_maps: 0.401960, loss_binary_maps: 0.096992, loss_cbn: 0.000000, avg_reader_cost: 0.19116 s, avg_batch_cost: 0.46093 s, avg_samples: 9.6, ips: 20.82758 samples/s, eta: 0:01:11, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:02] ppocr INFO: epoch: [3/5], global_step: 120, lr: 0.000500, loss: 0.949405, loss_shrink_maps: 0.474069, loss_threshold_maps: 0.390250, loss_binary_maps: 0.094968, loss_cbn: 0.000000, avg_reader_cost: 0.00054 s, avg_batch_cost: 0.42337 s, avg_samples: 16.0, ips: 37.79235 samples/s, eta: 0:01:06, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:08] ppocr INFO: epoch: [3/5], global_step: 130, lr: 0.000500, loss: 0.956546, loss_shrink_maps: 0.475362, loss_threshold_maps: 0.390981, loss_binary_maps: 0.095230, loss_cbn: 0.000000, avg_reader_cost: 0.00055 s, avg_batch_cost: 0.42715 s, avg_samples: 16.0, ips: 37.45749 samples/s, eta: 0:01:01, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:13] ppocr INFO: epoch: [3/5], global_step: 140, lr: 0.000500, loss: 0.938787, loss_shrink_maps: 0.457235, loss_threshold_maps: 0.389091, loss_binary_maps: 0.091687, loss_cbn: 0.000000, avg_reader_cost: 0.00053 s, avg_batch_cost: 0.42422 s, avg_samples: 16.0, ips: 37.71627 samples/s, eta: 0:00:56, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:19] ppocr INFO: epoch: [3/5], global_step: 150, lr: 0.000500, loss: 0.952552, loss_shrink_maps: 0.457240, loss_threshold_maps: 0.396839, loss_binary_maps: 0.091207, loss_cbn: 0.000000, avg_reader_cost: 0.00038 s, avg_batch_cost: 0.41964 s, avg_samples: 16.0, ips: 38.12752 samples/s, eta: 0:00:51, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 47.62it/s]
[2024/11/05 03:46:23] ppocr INFO: cur metric, precision: 0.9395348837209302, recall: 0.9758454106280193, hmean: 0.957345971563981, fps: 141.7962959542575
[2024/11/05 03:46:23] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:46:23] ppocr INFO: best metric, hmean: 0.957345971563981, is_float16: False, precision: 0.9395348837209302, recall: 0.9758454106280193, fps: 141.7962959542575, best_epoch: 3
[2024/11/05 03:46:26] ppocr INFO: epoch: [3/5], global_step: 156, lr: 0.000500, loss: 0.968939, loss_shrink_maps: 0.485452, loss_threshold_maps: 0.406191, loss_binary_maps: 0.096986, loss_cbn: 0.000000, avg_reader_cost: 0.00022 s, avg_batch_cost: 0.21455 s, avg_samples: 8.6, ips: 40.08343 samples/s, eta: 0:00:47, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:26] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:46:31] ppocr INFO: epoch: [4/5], global_step: 160, lr: 0.000500, loss: 0.952552, loss_shrink_maps: 0.483840, loss_threshold_maps: 0.395270, loss_binary_maps: 0.096986, loss_cbn: 0.000000, avg_reader_cost: 0.18570 s, avg_batch_cost: 0.37285 s, avg_samples: 6.4, ips: 17.16513 samples/s, eta: 0:00:47, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:36] ppocr INFO: epoch: [4/5], global_step: 170, lr: 0.000500, loss: 0.877584, loss_shrink_maps: 0.432885, loss_threshold_maps: 0.362672, loss_binary_maps: 0.086792, loss_cbn: 0.000000, avg_reader_cost: 0.00056 s, avg_batch_cost: 0.42367 s, avg_samples: 16.0, ips: 37.76539 samples/s, eta: 0:00:42, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:42] ppocr INFO: epoch: [4/5], global_step: 180, lr: 0.000500, loss: 0.859737, loss_shrink_maps: 0.415850, loss_threshold_maps: 0.360794, loss_binary_maps: 0.083092, loss_cbn: 0.000000, avg_reader_cost: 0.00054 s, avg_batch_cost: 0.42430 s, avg_samples: 16.0, ips: 37.70891 samples/s, eta: 0:00:37, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:47] ppocr INFO: epoch: [4/5], global_step: 190, lr: 0.000500, loss: 0.907607, loss_shrink_maps: 0.435804, loss_threshold_maps: 0.382122, loss_binary_maps: 0.087253, loss_cbn: 0.000000, avg_reader_cost: 0.00058 s, avg_batch_cost: 0.42434 s, avg_samples: 16.0, ips: 37.70542 samples/s, eta: 0:00:32, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:46:53] ppocr INFO: epoch: [4/5], global_step: 200, lr: 0.000500, loss: 0.914766, loss_shrink_maps: 0.435804, loss_threshold_maps: 0.393868, loss_binary_maps: 0.087253, loss_cbn: 0.000000, avg_reader_cost: 0.00041 s, avg_batch_cost: 0.42173 s, avg_samples: 16.0, ips: 37.93916 samples/s, eta: 0:00:27, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 48.34it/s]
[2024/11/05 03:46:57] ppocr INFO: cur metric, precision: 0.9615384615384616, recall: 0.966183574879227, hmean: 0.963855421686747, fps: 144.523065757204
[2024/11/05 03:46:57] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:46:57] ppocr INFO: best metric, hmean: 0.963855421686747, is_float16: False, precision: 0.9615384615384616, recall: 0.966183574879227, fps: 144.523065757204, best_epoch: 4
[2024/11/05 03:47:01] ppocr INFO: epoch: [4/5], global_step: 208, lr: 0.000500, loss: 0.905105, loss_shrink_maps: 0.428904, loss_threshold_maps: 0.391091, loss_binary_maps: 0.085814, loss_cbn: 0.000000, avg_reader_cost: 0.00040 s, avg_batch_cost: 0.29841 s, avg_samples: 11.8, ips: 39.54349 samples/s, eta: 0:00:23, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:01] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:47:04] ppocr INFO: epoch: [5/5], global_step: 210, lr: 0.000500, loss: 0.896861, loss_shrink_maps: 0.428904, loss_threshold_maps: 0.388912, loss_binary_maps: 0.085814, loss_cbn: 0.000000, avg_reader_cost: 0.19693 s, avg_batch_cost: 0.29798 s, avg_samples: 3.2, ips: 10.73897 samples/s, eta: 0:00:23, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:10] ppocr INFO: epoch: [5/5], global_step: 220, lr: 0.000500, loss: 0.892904, loss_shrink_maps: 0.421943, loss_threshold_maps: 0.388345, loss_binary_maps: 0.083732, loss_cbn: 0.000000, avg_reader_cost: 0.00061 s, avg_batch_cost: 0.42509 s, avg_samples: 16.0, ips: 37.63907 samples/s, eta: 0:00:18, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:15] ppocr INFO: epoch: [5/5], global_step: 230, lr: 0.000500, loss: 0.881252, loss_shrink_maps: 0.419586, loss_threshold_maps: 0.373073, loss_binary_maps: 0.083601, loss_cbn: 0.000000, avg_reader_cost: 0.00056 s, avg_batch_cost: 0.42396 s, avg_samples: 16.0, ips: 37.73919 samples/s, eta: 0:00:13, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:21] ppocr INFO: epoch: [5/5], global_step: 240, lr: 0.000500, loss: 0.828409, loss_shrink_maps: 0.400391, loss_threshold_maps: 0.355088, loss_binary_maps: 0.079995, loss_cbn: 0.000000, avg_reader_cost: 0.00055 s, avg_batch_cost: 0.42439 s, avg_samples: 16.0, ips: 37.70157 samples/s, eta: 0:00:09, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:27] ppocr INFO: epoch: [5/5], global_step: 250, lr: 0.000500, loss: 0.845202, loss_shrink_maps: 0.397721, loss_threshold_maps: 0.362133, loss_binary_maps: 0.079616, loss_cbn: 0.000000, avg_reader_cost: 0.00051 s, avg_batch_cost: 0.42148 s, avg_samples: 16.0, ips: 37.96146 samples/s, eta: 0:00:04, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
eval model:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 205/205 [00:04<00:00, 44.84it/s]
[2024/11/05 03:47:31] ppocr INFO: cur metric, precision: 0.9626168224299065, recall: 0.9951690821256038, hmean: 0.9786223277909739, fps: 135.49694607540866
[2024/11/05 03:47:31] ppocr INFO: save best model is to output/train/det/best_accuracy
[2024/11/05 03:47:31] ppocr INFO: best metric, hmean: 0.9786223277909739, is_float16: False, precision: 0.9626168224299065, recall: 0.9951690821256038, fps: 135.49694607540866, best_epoch: 5
[2024/11/05 03:47:36] ppocr INFO: epoch: [5/5], global_step: 260, lr: 0.000500, loss: 0.880989, loss_shrink_maps: 0.428197, loss_threshold_maps: 0.375561, loss_binary_maps: 0.085547, loss_cbn: 0.000000, avg_reader_cost: 0.00051 s, avg_batch_cost: 0.38292 s, avg_samples: 15.0, ips: 39.17274 samples/s, eta: 0:00:00, max_mem_reserved: 12221 MB, max_mem_allocated: 10612 MB
[2024/11/05 03:47:37] ppocr INFO: save model in output/train/det/latest
[2024/11/05 03:47:37] ppocr INFO: best metric, hmean: 0.9786223277909739, is_float16: False, precision: 0.9626168224299065, recall: 0.9951690821256038, fps: 135.49694607540866, best_epoch: 5
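As a quick sanity check on the log above, the reported hmean values are consistent with the harmonic mean of precision and recall (the F1 score). The sketch below assumes that this is how the metric is defined; the `hmean` helper is illustrative and is not taken from the repository.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Values copied from the epoch-5 evaluation line of the log.
    precision = 0.9626168224299065
    recall = 0.9951690821256038
    logged = 0.9786223277909739
    value = hmean(precision, recall)
    print(value, abs(value - logged) < 1e-9)
```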

training log for ppocrv4

train_2.log

Related PR:

@GreatV GreatV changed the title upgrade to numpy 2.0 and remove imgaug [WIP] upgrade to numpy 2.0 and remove imgaug Oct 2, 2024
@agzeroo agzeroo mentioned this pull request Oct 13, 2024
2 tasks
@GreatV GreatV changed the title [WIP] upgrade to numpy 2.0 and remove imgaug upgrade to numpy 2.0 and remove imgaug Nov 5, 2024
Collaborator

@jzhang533 jzhang533 left a comment


👍🏻 👍🏻 👍🏻
LGTM

BTW: we may want to remove the six library after this PR lands, since we no longer need to support Python 2.
Several code snippets still require six, but they should be easy to update.
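For illustration, typical six idioms map directly onto Python 3 built-ins. Which six calls the repository actually uses is not stated in this thread, so the three shown below are assumptions chosen as common examples, and `describe` and `first_n` are hypothetical helpers.

```python
# The six forms are shown only in comments, so this snippet runs without six installed.


def describe(mapping):
    """Label each value as text or other, using Python 3 built-ins only."""
    out = []
    for key, value in mapping.items():  # was: six.iteritems(mapping)
        kind = "text" if isinstance(value, str) else "other"  # was: six.string_types
        out.append((key, kind))
    return out


def first_n(n):
    """Return the first n non-negative integers."""
    return list(range(n))  # was: six.moves.range(n)


if __name__ == "__main__":
    print(describe({"name": "det", "epochs": 5}))
    print(first_n(3))
```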

@GreatV GreatV merged commit 15fb82d into PaddlePaddle:main Nov 6, 2024
3 checks passed
@GreatV GreatV deleted the upgrade_numpy_2.0_remove_imgaug branch November 6, 2024 04:09
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 11, 2024
@PaddlePaddle PaddlePaddle unlocked this conversation Nov 15, 2024
@mmglove
Contributor

mmglove commented Nov 15, 2024

This PR causes training to fail for PaddleOCR_det_r50_vd_pse_v2_0_bs8_fp16_DP_dynamic_N1C1:
[image: screenshot of the error]
To reproduce:
docker: iregistry.baidu-int.com/paddlecloud/base-images:paddlecloud-ubuntu18.04-gcc8.2-cuda11.8-cudnn8.6-nccl2.15.5
python3.10
cd PaddleOCR
export CUDA_VISIBLE_DEVICES=0;
bash test_tipc/prepare.sh test_tipc/configs/det_r50_vd_pse_v2_0/train_infer_python.txt benchmark_train;
python tools/train.py -c test_tipc/configs/det_r50_vd_pse_v2_0/det_r50_vd_pse.yml -o Global.print_batch_step=1 Train.loader.shuffle=false Global.use_gpu=True Global.save_model_dir=./test_tipc/output/det_r50_vd_pse_v2_0/benchmark_train/norm_train_gpus_0_autocast_amp Global.epoch_num=2 Train.loader.batch_size_per_card=8 Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True
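The `-o` arguments in the command above are dotted `key=value` overrides applied on top of the YAML config. A minimal sketch of how such overrides could be merged into a nested dictionary follows; `parse_value` and `apply_overrides` are hypothetical helpers written for illustration, not the repository's implementation, and the value parsing is a simplifying assumption.

```python
import ast


def parse_value(text):
    """Turn an override string into a Python value where possible."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return ast.literal_eval(text)  # numbers, lists, quoted strings
    except (ValueError, SyntaxError):
        return text  # anything else (for example a path) stays a string


def apply_overrides(config, overrides):
    """Apply 'A.b.c=value' items to a nested dict, creating levels as needed."""
    for item in overrides:
        dotted, _, raw = item.partition("=")
        keys = dotted.split(".")
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = parse_value(raw)
    return config


if __name__ == "__main__":
    cfg = {"Global": {"epoch_num": 500}, "Train": {"loader": {"shuffle": True}}}
    apply_overrides(cfg, [
        "Global.epoch_num=2",
        "Train.loader.shuffle=false",
        "Global.scale_loss=1024.0",
    ])
    print(cfg)
```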

@GreatV
Collaborator Author

GreatV commented Nov 15, 2024

@mmglove OK, got it. I'll take a look later.

@GreatV
Collaborator Author

GreatV commented Nov 16, 2024

@mmglove fixed in #14239
