Replies: 4 comments
-
Sorry but I'm not familiar with From your graph, it looks like
-
I think this is it: https://code.ur.gs/lupine/ordoor/src/branch/master/cmd/view-minimap/main.go#L172-L176
-
No problem, I can appreciate the desire for backwards compatibility! I call (minimap is emulating immediate-mode drawing, I haven't touched it for a while ^^). I can eager-load the sprites to take those specific calls out of the So cgo is still significant, but coming from... glBufferSubData ? Perhaps?
Thanks for taking the time to look at this! I don't expect you to debug my performance problems for me, so feel free to close this issue if you're certain you don't want instancing :).
-
Thank you for the additional information! cgo itself is not fast, but my guess is that the OpenGL driver itself is slow. Sorry I haven't been able to help so far. If I come up with a good idea, I'll let you know.
-
Hi,
https://code.ur.gs/lupine/ordoor is now running on my Pinebook Pro, and I'm investigating ways to improve performance. I think I've identified a possibility, but let me walk you through it.
The game is based on an isometric map. Each frame I draw approx. 200 "cells" in that map, each of which has 2-7 Z levels and 1-4 images, typically less than 64x64 pixels each - so perhaps 1,000 calls to screen.DrawImage, each with its own matrix to specify location on screen.
The output (prior to implementing bounds clipping, reducing resolution, and removing the ability to zoom out) looks like: https://code.ur.gs/lupine/ordoor/src/branch/master/doc/formats/img/chapter01_rendered_2018-09-08.png
All this results in just 4 draw-triangles calls per frame:
This seems pretty good to me, and this efficient automatic batching is a big part of why I switched to ebiten ^^. However, I peg a CPU core in doing this, and tend to get 10-30fps and 30-60 tps, which isn't fast enough :/. So where is the time going?
I attach a profile showing where it's spent: some of it in runtime.futex (#1073?), but even more of it inside cgo, presumably in these glDrawElements calls: https://github.com/hajimehoshi/ebiten/blob/master/internal/graphicsdriver/opengl/context_desktop.go#L496
So what can we do about it?
My engine is drawing a limited set of sprites multiple times: one view gave me 148 unique sprites, drawn a total of 840 times. The most popular sprite was drawn 118 times per frame. The mean is 5.6, the median is 1.
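To make that skew concrete, here is a minimal sketch of how a mean and median like these could be computed from per-sprite draw counts. The name drawStats and the sample counts are made up for illustration; they are not taken from ordoor or from the profile. For the real numbers, 840 draws over 148 sprites works out to roughly 5.7 per sprite.

```go
package main

import (
	"fmt"
	"sort"
)

// drawStats returns the mean and median of per-sprite draw counts.
// counts[i] is how many times sprite i was drawn in one frame.
// (Hypothetical helper, for illustration only.)
func drawStats(counts []int) (mean, median float64) {
	if len(counts) == 0 {
		return 0, 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]int(nil), counts...)
	sort.Ints(sorted)

	total := 0
	for _, c := range sorted {
		total += c
	}
	mean = float64(total) / float64(len(sorted))

	mid := len(sorted) / 2
	if len(sorted)%2 == 1 {
		median = float64(sorted[mid])
	} else {
		median = float64(sorted[mid-1]+sorted[mid]) / 2
	}
	return mean, median
}

func main() {
	// Made-up counts: most sprites drawn once, one drawn very often.
	counts := []int{1, 1, 1, 2, 118}
	mean, median := drawStats(counts)
	fmt.Printf("mean=%.1f median=%.1f\n", mean, median)
}
```

A mean well above the median, as here, means a handful of sprites account for most of the draws.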
Talking about this a bit on #gamedev on freenode, a suggestion made was to use glDrawElementsInstanced instead of glDrawElements: https://learnopengl.com/Advanced-OpenGL/Instancing & https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glDrawElementsInstanced.xhtml . This is available in OpenGL 3.1 and, as far as I understand it, means that the sprite drawn 118 times will become much cheaper, while the sprite drawn 1 time will be about as expensive.
Is this something you're familiar with? I don't know that it'll work, but it seems promising to me. Happy to dive into it more if you think it's worth exploring, or to talk about other ways to make this thing fast ^^.
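As a rough sketch of the CPU-side bookkeeping instancing would need, the per-frame draws could be grouped so that each unique sprite carries a list of per-instance offsets, which is the kind of data an instanced draw consumes. Everything here (drawCmd, instanceBatch, groupBySprite) is a hypothetical name for illustration; none of it is ebiten's API, and no OpenGL call is made.

```go
package main

import "fmt"

// drawCmd is a hypothetical record of one sprite draw: which sprite, and where.
type drawCmd struct {
	spriteID int
	x, y     float32
}

// instanceBatch holds every position at which one sprite is drawn in a frame.
// With instancing, each batch would map to a single instanced draw.
type instanceBatch struct {
	spriteID int
	offsets  [][2]float32
}

// groupBySprite collects draw commands into one batch per unique sprite,
// keeping batches in the order each sprite was first seen.
func groupBySprite(cmds []drawCmd) []instanceBatch {
	index := make(map[int]int) // spriteID -> position in batches
	var batches []instanceBatch
	for _, c := range cmds {
		i, ok := index[c.spriteID]
		if !ok {
			i = len(batches)
			index[c.spriteID] = i
			batches = append(batches, instanceBatch{spriteID: c.spriteID})
		}
		batches[i].offsets = append(batches[i].offsets, [2]float32{c.x, c.y})
	}
	return batches
}

func main() {
	// Made-up frame: sprite 7 drawn three times, sprite 9 once.
	cmds := []drawCmd{{7, 0, 0}, {9, 32, 16}, {7, 64, 32}, {7, 96, 48}}
	for _, b := range groupBySprite(cmds) {
		fmt.Printf("sprite %d: %d instance(s)\n", b.spriteID, len(b.offsets))
	}
}
```

One caveat with this sketch: grouping by sprite reorders the draws, so a real version would need some other way to keep back-to-front ordering on an isometric map.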