TaoTao-real/LeNet-5

LeNet-5 Convolutional Neural Network Computation Optimization

[Figure: LeNet-5]

Optimization methods used

  • Unrolling
  • SIMD
  • OpenMP
  • OpenCL

Environment

  • OS: macOS Mojave 10.14.6
  • CPU: 2.7 GHz Intel Core i7
  • GPU: Intel Iris Plus Graphics 655 1536MB
  • Compiler: g++, clang++
  • OpenCL: OpenCL 1.2

Hotspot analysis

[Figure: hotspot]

Unrolling

Unroll the inner loops to enlarge the basic blocks and create more opportunities for register renaming.
[Figure: unrolling]
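As a minimal sketch of the unrolling idea, the following compares a plain dot-product loop with a version unrolled by four. The function names `dot_rolled` and `dot_unrolled4` and the choice of a dot product are assumptions made for illustration; they are not taken from this repository's code. The four independent accumulators give each loop body a larger basic block and break the single dependency chain on `sum`.

```cpp
#include <cstddef>

// Baseline: one multiply-add per iteration, all feeding a single accumulator.
float dot_rolled(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Hypothetical unrolled variant: four multiply-adds per iteration,
// each with its own accumulator so they do not depend on one another.
float dot_unrolled4(const float* a, const float* b, std::size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    float sum = (s0 + s1) + (s2 + s3);
    // Remainder loop for lengths that are not a multiple of four.
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Because the unrolled version adds the products in a different order, its floating-point result can differ from the baseline in the last bits for general inputs.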

SIMD

Process data in batches to increase data-level parallelism.
[Figure: simd]
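A rough illustration of batch processing with SIMD, assuming SSE intrinsics on the Intel CPU listed under Environment. The helper `axpy_simd` is hypothetical and not part of this repository; it updates four floats per instruction and falls back to scalar code for the tail, or for the whole array when SSE is unavailable.

```cpp
#include <cstddef>
#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics (__m128, _mm_*_ps)
#endif

// Hypothetical helper: y[i] += w * x[i] for i in [0, n).
void axpy_simd(float w, const float* x, float* y, std::size_t n) {
    std::size_t i = 0;
#if defined(__SSE__)
    const __m128 vw = _mm_set1_ps(w);          // broadcast w to 4 lanes
    for (; i + 4 <= n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);       // unaligned load of 4 floats
        __m128 vy = _mm_loadu_ps(y + i);
        vy = _mm_add_ps(vy, _mm_mul_ps(vw, vx));
        _mm_storeu_ps(y + i, vy);              // store 4 results at once
    }
#endif
    // Scalar tail (or the full loop when SSE is not available).
    for (; i < n; ++i) {
        y[i] += w * x[i];
    }
}
```

Unaligned loads and stores are used here so the sketch makes no assumption about how the buffers are allocated.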

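OpenMP

Use the CPU's multiple cores to distribute loops with no data dependencies across threads that run concurrently.
OpenMP is poorly supported on macOS, which makes environment setup and compilation difficult.
Workaround: clang++ -Xpreprocessor -fopenmp -O3 *.cpp -o ../Release/openmp_cnn -lomp
[Figure: openmp]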
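To make the idea of a loop without data dependencies concrete, here is a small sketch using `#pragma omp parallel for` on a fully connected layer. The function `fc_forward` and its data layout (row-major weights, one row per output neuron) are assumptions for illustration, not this repository's actual code. Each output neuron writes only its own element of `out`, so iterations can run on different threads safely.

```cpp
#include <vector>

// Hypothetical fully connected layer:
//   out[o] = bias[o] + sum_i w[o * n_in + i] * in[i]
void fc_forward(const std::vector<float>& in,
                const std::vector<float>& w,
                const std::vector<float>& bias,
                std::vector<float>& out) {
    const int n_in  = static_cast<int>(in.size());
    const int n_out = static_cast<int>(out.size());
    // Iterations are independent: each one reads shared inputs
    // and writes a distinct out[o].
    #pragma omp parallel for
    for (int o = 0; o < n_out; ++o) {
        float acc = bias[o];
        for (int i = 0; i < n_in; ++i) {
            acc += w[o * n_in + i] * in[i];
        }
        out[o] = acc;
    }
}
```

If the file is compiled without OpenMP enabled, the pragma is ignored and the loop simply runs serially with the same result.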

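OpenCL

Forward: one thread is assigned to each output pixel.
[Figure: opencl1]
Backward: runs in the reverse order of forward. For the output layer, the loop over num_neuron_output_CNN is unrolled across threads.
[Figure: opencl2]
The computation is split into two kernel functions, which compute the weight and the bias respectively.
[Figure: opencl3]
[Figure: opencl4]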
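The one-thread-per-output-pixel mapping in the forward pass can be sketched in plain C++ that emulates the OpenCL execution model on the host. This is not an OpenCL kernel and not the repository's code: `conv_work_item` and `conv_forward` are hypothetical names, `gid` stands in for what `get_global_id(0)` would return inside a real kernel, and the convolution is assumed to be single-channel, stride 1, with no padding and no activation function.

```cpp
#include <cstddef>
#include <vector>

// Work done by one "thread": compute exactly one output pixel.
// gid plays the role of the global work-item index.
void conv_work_item(std::size_t gid,
                    const float* in, int in_w,
                    const float* kernel, int k,
                    float bias,
                    float* out, int out_w) {
    const int oy = static_cast<int>(gid) / out_w;  // output row
    const int ox = static_cast<int>(gid) % out_w;  // output column
    float acc = bias;
    for (int ky = 0; ky < k; ++ky) {
        for (int kx = 0; kx < k; ++kx) {
            acc += in[(oy + ky) * in_w + (ox + kx)] * kernel[ky * k + kx];
        }
    }
    out[gid] = acc;  // each work-item writes a distinct element
}

// Host-side stand-in for launching one work-item per output pixel.
std::vector<float> conv_forward(const std::vector<float>& in, int in_h, int in_w,
                                const std::vector<float>& kernel, int k,
                                float bias) {
    const int out_h = in_h - k + 1;
    const int out_w = in_w - k + 1;
    std::vector<float> out(static_cast<std::size_t>(out_h) * out_w);
    for (std::size_t gid = 0; gid < out.size(); ++gid) {
        conv_work_item(gid, in.data(), in_w, kernel.data(), k, bias,
                       out.data(), out_w);
    }
    return out;
}
```

In an actual OpenCL version the serial loop in `conv_forward` would be replaced by enqueueing a kernel whose global work size equals the number of output pixels.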

Optimization comparison

[Figure: forward]
[Figure: backward]
