Need Help Optimizing Particle System

Started by Ashmoor, April 29, 2019, 01:54:21

Derron

Mky.mod (mojo reimplementation) offers both.

You might also try sdl.gl2sdlmax2d instead of brl.glgraphics or the DX variants.
It does a little bit of batching (collecting draw commands).


@GaborD
Brucey is currently working on implementing bgfx support. This should bring shaders etc. to BlitzMax.

Otherwise, I think Col/Davecamp has a shader module on GitHub.
I think it would need a pretty complete overhaul of the particle code above, though.


Bye
Ron

Ashmoor

@GaborD
"Do you need to stick to stock BMX due to project constraints or are you free to fully leverage 3D hardware?"

I am too inexperienced to switch to other things after about 16 months of development. I am self-taught and would like to release a game before moving on and learning complex new things. I've been working as a game designer and game artist for more than 15 years, but I've never programmed any games. This is my first commercial game. In short: yes, I would rather find a different way of handling the scenes than spend 3 months learning and implementing new stuff.

"If you think about it, particles are as parallelized as it gets, you call simple functions tons of times. They basically beg for a GPU implementation."

This makes a lot of sense, but I don't have the skills yet. I know close to nothing about OpenGL or shaders :(

@Derron
"Did you use only a single image or multiple images for rendering? Using only one can be used to simulate single-textures with no further texture switches."

I used single images.

I have run more tests. It may be my lack of experience or just plain ignorance, but I didn't know that a debug build is ~3 times slower than a release build. Here are my latest stats:

~ 5k basic particles in release mode (single image, no animation, greyscale)
~ 1.7k basic particles in debug mode
~ 5k barebone particles (all update stuff disabled except for death timer) in debug mode
~ 15k barebone particles in release mode

I've made a separate test for an extremely basic particle (just image and coords) and it stays at 60 fps:
~25k particles with a draw call and 2 randomize coords calls in debug mode
~8k particles with a complex draw call (set color, set scale, set rotation, set alpha, set blend and setting them back after use) and 1 randomize coords in debug
~75k particles with a draw call and 2 randomize coords calls in release mode

I also tried just processing the particles without drawing them, which leads to a limit of around 5.8k in release mode.

Based on this and your comments, I think pushing draw calls is a big limiting factor, especially for "complex" draw calls, but the main code could be improved as well. I may get it to around 10-12k particles in release. That would be enough for my project.
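
For the "complex" draw calls, one thing that might help is not setting and resetting color, alpha and blend for every single particle, but only when the wanted state really differs from the one already active. Below is a rough, language-neutral sketch in Python (not BlitzMax code). `RenderState`, `CountingRenderer` and `draw_sorted` are made-up names for illustration; the renderer only counts calls instead of talking to any graphics API. Note that this reduces state changes, not the number of draw calls, and that sorting changes the draw order, which can matter for overlapping blended particles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderState:
    color: tuple   # (r, g, b)
    alpha: float
    blend: str     # hypothetical blend-mode name, e.g. "alpha" or "light"


@dataclass
class Particle:
    x: float
    y: float
    state: RenderState


class CountingRenderer:
    """Stand-in for a real graphics API: it only counts what it is asked to do."""

    def __init__(self):
        self.state_changes = 0
        self.draws = 0
        self._current = None

    def apply(self, state):
        # Only touch the (expensive) state when it differs from the active one.
        if state != self._current:
            self._current = state
            self.state_changes += 1

    def draw(self, particle):
        self.apply(particle.state)
        self.draws += 1


def draw_sorted(renderer, particles):
    # Sort so particles with identical state are drawn back to back;
    # each distinct state is then applied once per frame.
    ordered = sorted(
        particles,
        key=lambda p: (p.state.blend, p.state.color, p.state.alpha),
    )
    for p in ordered:
        renderer.draw(p)
```

With 1000 particles alternating between two states, drawing them in their original order triggers a state change on every particle, while `draw_sorted` triggers only two.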

Derron

Debug mode puts some extra code around your stuff... This makes memory-intensive stuff a lot slower.


Draw call minimization would be doable with "groups": you render e.g. 20 particles that share movement, colors, rotation... (and are close neighbours, to keep the texture small) onto a single texture.
Such optimization can be left for now... seems your numbers are OK.
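
To make the "groups" idea a bit more concrete, here is a minimal sketch in Python (language-neutral, not BlitzMax code). It assumes the particles of a group are pre-rendered once onto one texture, so per frame only the group itself is updated and drawn. `ParticleGroup`, `make_groups` and `draw_calls_needed` are hypothetical names, and the actual texture baking and drawing are left out.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ParticleGroup:
    # Shared by every particle in the group.
    x: float
    y: float
    vx: float
    vy: float
    # Local offsets baked into the group's single texture; never touched per frame.
    offsets: list = field(default_factory=list)

    def update(self, dt):
        # One position update moves all particles in the group.
        self.x += self.vx * dt
        self.y += self.vy * dt


def make_groups(offsets, group_size, vx=0.0, vy=0.0):
    """Split a flat list of (dx, dy) offsets into groups of at most group_size."""
    return [
        ParticleGroup(0.0, 0.0, vx, vy, offsets[i:i + group_size])
        for i in range(0, len(offsets), group_size)
    ]


def draw_calls_needed(particle_count, group_size):
    # One draw call per group instead of one per particle.
    return math.ceil(particle_count / group_size)
```

With groups of 20, 10,000 particles would need 500 draws per frame instead of 10,000, at the cost of the particles inside a group no longer moving independently.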

Bye
Ron