(FontMachine) 9MB Font fails to load in DX9 mode

Started by Yellownakji, June 24, 2018, 20:51:09


Derron

Does it happen with NG too? NG uses a different GC (and defaults to a threaded build...).


@ bug closed
https://github.com/blitz-research/blitzmax/issues/8
and the link referred to in that issue is outdated and the backup can be found there:
http://mojolabs.nz/posts.php?topic=107624#bottom

It was closed without an associated "commit" ... so judge yourself. BTW col took part in the communication too.


bye
Ron

col

Yep,

On a unit that experiences the issue the GC is kicking in and freeing Globals. DX9 uses a little 'texture store' to hold onto its textures, as to the history of how that came into being I don't know, but the GC frees the contents of the store and frees the store itself. When a new TImage wants to add a new texture to the store you get an EAV. This happens only when Incbin'ning the large 9MB file, and the code then loads in a 'biggish' texture - all are Incbin'ned. The store is a Global TList of TAutoRelease instances which are home to the textures ( as well as the texture instance in the TImage ) - as to why this store exists I'm not actually sure as it seems a bit pointless to me.

The most annoying part is that the exact same exe will work on one unit but fail on another. They all have 8GB of RAM.

I'm very suspicious that the GC error is actually a side effect of something else.
Also when the EAV occurs the GC carries on allocating memory - so it also looks like allocation within the GC itself is happening in a separate thread.

The problem seems related to having a Global within a Type and creating an instance of that Global variable without creating an instance of the Type. When the Global is outside of the Type the issue doesn't seem to occur, not yet anyway ( remember that this could be a red-herring as to the real fault as that sort of code should be valid ).
https://github.com/davecamp

"When you observe the world through social media, you lose your faith in it."

Yellownakji

#17
Quote from: col on July 02, 2018, 09:36:40
Yep,

On a unit that experiences the issue the GC is kicking in and freeing Globals. DX9 uses a little 'texture store' to hold onto its textures, as to the history of how that came into being I don't know, but the GC frees the contents of the store and frees the store itself. When a new TImage wants to add a new texture to the store you get an EAV. This happens only when Incbin'ning the large 9MB file, and the code then loads in a 'biggish' texture - all are Incbin'ned. The store is a Global TList of TAutoRelease instances which are home to the textures ( as well as the texture instance in the TImage ) - as to why this store exists I'm not actually sure as it seems a bit pointless to me.

The most annoying part is that the exact same exe will work on one unit but fail on another. They all have 8GB of RAM.

I'm very suspicious that the GC error is actually a side effect of something else.
Also when the EAV occurs the GC carries on allocating memory - so it also looks like allocation within the GC itself is happening in a separate thread.

The problem seems related to having a Global within a Type and creating an instance of that Global variable without creating an instance of the Type. When the Global is outside of the Type the issue doesn't seem to occur, not yet anyway ( remember that this could be a red-herring as to the real fault as that sort of code should be valid ).

The application runs on all of my PCs: Windows 7 with a Pentium, Intel graphics and 4GB RAM; Windows Me with a Pentium 3, 512MB RAM and an Nvidia GeForce 6800; and Windows 10 with 64GB RAM and a GeForce 1080.

GCSuspend() lets me load in all my textures and the font from wherever I please, but that's not really what I want to do. I'd like to have garbage collection.
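
For reference, the workaround looks roughly like the sketch below. It reuses the fade.png from this thread; where exactly the collector is resumed is an assumption on my part (the D3D9 texture seems to be created on the first draw, not on load), and I can't say whether this avoids the EAV on every affected machine.

```blitzmax
SuperStrict

Incbin "fade.png"

SetGraphicsDriver D3D9Max2DDriver()
Graphics(1024, 768, 0, 60)

' Stop the collector from running while resources are loaded.
GCSuspend()

Global gfx_fade:TImage = LoadImage("incbin::fade.png", 0)

' Assumption: draw once while the GC is still suspended, so the
' driver creates its texture before collection is allowed again.
Cls
DrawImage(gfx_fade, 0, 0, 0)
Flip

' Hand control back to the collector for the rest of the program.
GCResume()

While Not KeyDown(KEY_ESCAPE)
	Cls
	DrawImage(gfx_fade, 0, 0, 0)
	Flip
Wend
```

This only hides the symptom during loading; anything created after GCResume() is exposed to the same problem.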

--

Has this issue been submitted to Brucey yet, Col? What a weird issue. Perhaps a rewrite is in order on my end. I should just have a "Graphics.BMX" source and not put my graphics in a type. I reckon that'd be better.

Derron

As asked right before: is it happening in NG _and_ vanilla or just one of them?
NG uses a different GC than vanilla.

So if it was a GC bug it should only happen in either NG _or_ vanilla.
If it was a bug in Blitzmax ("ref counting") then it should fail in both.
If it only fails in NG _or_ vanilla it might also be an issue with the used modules (vanilla vs maxmods/brl.mod|pub.mod vs bmx-ng/brl.mod|pub.mod).


@ red herring
If the issue with "globals" exists, then you should be able to recreate the bug without the fontmachine-code (just incbin a 9mb "random data"-block - or even better some "validateable/predictable" data block - like a 1.000.000 times repeated pattern like 012345789).
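
A small generator along these lines could produce such a block. It is only a sketch: the file name pattern.bin and the chunked writing are assumptions for illustration, and the pattern is the one given above, repeated 1.000.000 times (roughly 9MB).

```blitzmax
SuperStrict

' Hypothetical output name; Incbin this file in the test program afterwards.
Const OUT_FILE:String = "pattern.bin"
Const PATTERN:String = "012345789" ' the pattern as given above (9 bytes)

' Build a chunk of 1,000 repeats so the file is written in 1,000 larger writes.
Local chunk:String
For Local i:Int = 0 Until 1000
	chunk :+ PATTERN
Next

Local stream:TStream = WriteStream(OUT_FILE)
If Not stream Then RuntimeError "Could not create " + OUT_FILE

' 1,000 chunks x 1,000 repeats = 1,000,000 repeats of the pattern.
For Local i:Int = 0 Until 1000
	WriteString(stream, chunk)
Next
CloseStream(stream)

Print "Wrote " + FileSize(OUT_FILE) + " bytes to " + OUT_FILE
```

Because the content is predictable, the test program can read the incbin'ed data back and check it byte by byte.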

Also: if the global contains a custom type you could use its "Delete()"-method to print out when it gets GC'd. Similar to what we've done in the linked issue above.
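
Something like the following sketch shows the idea. TTracked and THolder are made-up names, and the loop that provokes collections is an assumption about how to trigger the GC, not code from the linked issue.

```blitzmax
SuperStrict

' Hypothetical type whose only job is to report when it is collected.
Type TTracked
	Method Delete()
		' Constant string only, to avoid allocating inside the finalizer.
		Print "TTracked instance was collected"
	End Method
End Type

' A Global inside a Type; no THolder instance is ever created.
Type THolder
	Global store:TTracked = New TTracked
End Type

' Create garbage and force collections. A correctly working GC must
' never print the message above while THolder.store is still set.
For Local i:Int = 0 Until 100
	Local junk:Byte[] = New Byte[1024 * 1024]
	GCCollect()
Next

If THolder.store Then Print "THolder.store is still referenced"
```

If the Delete() message shows up before the program ends, the global was collected while it was still reachable.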


bye
Ron

Yellownakji

Quote from: Derron on July 02, 2018, 17:24:05
As asked right before: is it happening in NG _and_ vanilla or just one of them?
NG uses a different GC than vanilla.

So if it was a GC bug it should only happen in either NG _or_ vanilla.
If it was a bug in Blitzmax ("ref counting") then it should fail in both.
If it only fails in NG _or_ vanilla it might also be an issue with the used modules (vanilla vs maxmods/brl.mod|pub.mod vs bmx-ng/brl.mod|pub.mod).


@ red herring
If the issue with "globals" exists, then you should be able to recreate the bug without the fontmachine-code (just incbin a 9mb "random data"-block - or even better some "validateable/predictable" data block - like a 1.000.000 times repeated pattern like 012345789).

Also: if the global contains a custom type you could use its "Delete()"-method to print out when it gets GC'd. Similar to what we've done in the linked issue above.


bye
Ron

You asked if it's in NG too, to which, as Col said, 'yup'. In terms of this bug, we're only testing with NG, at least I am. I'm not using vanilla BlitzMax at all.

--

Yes... The bug occurs with and without FontMachine. It has nothing to do with my font or FontMachine at this point, as it immediately trashes Sounds and Images too.

All I can say for certain is that this is a GC issue under DX9. To my knowledge the problem does not exist in GL, as the 'crashing' code works flawlessly when I switch the render mode to GL in my configuration file.

--

This is my framework:

Framework brl.D3D9Max2D
Import brl.glmax2d
Import brl.Random
Import brl.Timerdefault
Import BRL.RamStream
Import BRL.PolledInput
Import BRL.Stream
Import BRL.Retro
Import brl.PNGLoader
Import brl.jpgloader
Import brl.bmploader
Import brl.math
Import brl.oggloader
Import pub.Win32
Import pub.freejoy
Import blide.fontmachine



Yellownakji

However, Col may very well be right that something else could be affecting the GC in DX9 mode. There are a few possibilities.


col

The same code builds and runs perfectly ok with the legacy BMax.
Drop the same source into NG and it EAVs.

QuoteIf the issue with "globals" exists, then you should be able to recreate the bug without the fontmachine-code (just incbin a 9mb "random data"-block - or even better some "validateable/predictable" data block - like a 1.000.000 times repeated pattern like 012345789).
It can be recreated without using the fontmachine code module - I've got to the point of not importing and using anything fontmachine related at all other than Incbin'ing the external data - the font machine data isn't being used, but standard graphic images are. Tests today confirm that it is a bug in NG.

Quoteif the global contains a custom type you could use its "Delete()"
Indeed, this was the first thing I did to see if the Dx9 module itself is at fault. It turns out that the Dx9 module just has code that exposes the bug.

I'm now looking into making as small a code as possible to reproduce the issue.

col

I've narrowed this down to the basic example below. I took my time to peel away the game code systematically to verify that nothing was wrong with Yellownakji's code and arrived at this:

Resources:
The system.fmf font can be downloaded from: https://www.dropbox.com/s/9op2dqff4u0ubqn/system.fmf?dl=0
The fade.png image is attached to this post.

@Qube The website 'post message form' silently rejected me posting these 2 files in this reply? Maybe the file is too big @ just under 9MB? It seemed to be uploading with a progress output, but the post itself didn't appear.

If you put both files ( system.fmf, fade.png ) into the same folder as the source, then build and run in Debug ( I didn't bother trying a release build at this stage ), in 'NG only' you get an EAV as explained in the code. Legacy BMax runs fine.


SuperStrict

Incbin "system.fmf" ' comment out this line to remove the bug explained below
Incbin "fade.png"

SetGraphicsDriver D3D9Max2DDriver()
Graphics(1024, 768, 0, 60)

SetColor (255, 255, 255)
DrawText ("load....", 3, 3) ' creates 5 textures -> 'l' 'o' 'a' 'd' '.'

Global gfx_fade:TImage = LoadImage("incbin::fade.png", 0) ' load a TPixmap of size 1280x1024

While Not KeyDown(KEY_ESCAPE)
	Cls

	' CRASH HERE: the bug occurs during the next steps:

	' D3D9 code resizes the pixmap to 2048 x 1024 for power2 sizes.
	' At the point of creating a new TPixmap all previous created textures ( ie 'l' 'o' 'a' 'd' '.' ) get freed.
	' TMax2DImageFrame then creates the new texture at 2048x1024.
	' When it tries to store the texture via _d3d9graphics.AutoRelease an EAV occurs at
	' 'd3d9graphics.bmx -> TD3D9Graphics -> AutoRelease( unk:IUnknown_ ) -> _autoRelease.AddLast t'
	DrawImage(gfx_fade, 0, 0, 0)

	Flip
Wend


Yellownakji

#23
FADE.PNG is just a 1280x1024 solid white image.   Just for the record.

EDIT:  The source image has been included now.

Derron

Maybe raise an issue on github for this?


bye
Ron

Yellownakji

Quote from: Derron on July 05, 2018, 19:46:09
Maybe raise an issue on github for this?


bye
Ron

Col did some days ago. No response yet, but Col mentioned that Brucey tends to fix things without initially responding, so perhaps he saw it. I hope so, at least.

col

Brucey has implemented a fix for this problem.

It looks stable here and appears to work perfectly :)

Yellownakji

#27
So, it appears Brucey has fixed this issue, according to GitHub. However, I am unable to test it properly.

--

As a preface, I took Col's second piece of advice and re-downloaded ALL the modules and built them. All my modules are up to date now and I have the latest NG, I presume: BCC 0.93, BMK 3.21 mt-win32-x86. I backed up my older modules though; glad I did.

Basically, my BMK and BCC are stock; they came with the pre-compiled binaries commit.

The modules built fine in MaxIDE. They also built fine in Blide, but I'm using the MaxIDE ones since nobody here seems to use Blide. Figured it'd help with consistency.

However, there are some hiccups.

--

project output

C:/BlitzMax/mod/blide.mod/fontmachine.mod/fontmachine.debug.win32.x86.a(fontmachine.bmx.debug.win32.x86.o): In function `blide_fontmachine_DrawBitMapText':
C:/BlitzMax/mod/brl.mod/graphics.mod/graphics.debug.win32.x86.a(graphics.bmx.debug.win32.x86.o): In function `brl_graphics_CreateGraphics':
C:/BlitzMax/mod/brl.mod/graphics.mod/.bmx/graphics.bmx.debug.win32.x86.c:815: undefined reference to `bbExObject'
C:/BlitzMax/mod/brl.mod/graphics.mod/graphics.debug.win32.x86.a(graphics.bmx.debug.win32.x86.o): In function `brl_graphics_AttachGraphics':
C:/BlitzMax/mod/brl.mod/graphics.mod/.bmx/graphics.bmx.debug.win32.x86.c:908: undefined reference to `bbExObject'
C:/BlitzMax/mod/brl.mod/filesystem.mod/filesystem.debug.win32.x86.a(filesystem.bmx.debug.win32.x86.o): In function `brl_filesystem_CopyFile':
C:/BlitzMax/mod/brl.mod/filesystem.mod/.bmx/filesystem.bmx.debug.win32.x86.c:1746: undefined reference to `bbExObject'
C:/BlitzMax/mod/brl.mod/pngloader.mod/pngloader.debug.win32.x86.a(pngloader.bmx.debug.win32.x86.o): In function `brl_pngloader_LoadPixmapPNG':
C:/BlitzMax/mod/brl.mod/pngloader.mod/.bmx/pngloader.bmx.debug.win32.x86.c:767: undefined reference to `bbExObject'
C:/BlitzMax/mod/brl.mod/pngloader.mod/pngloader.debug.win32.x86.a(pngloader.bmx.debug.win32.x86.o):C:/BlitzMax/mod/brl.mod/pngloader.mod/.bmx/pngloader.bmx.debug.win32.x86.c:1154: more undefined references to `bbExObject' follow
collect2.exe: error: ld returned 1 exit status
Build Error: Failed to link


--

BCC output

Building bcc
[ 12%] Processing:base64.bmx
[ 13%] Processing:stringbuffer_common.bmx
[ 13%] Processing:stringbuffer_core.bmx
[ 15%] Processing:base.configmap.bmx
[ 15%] Processing:base.stringhelper.bmx
[ 16%] Processing:options.bmx
[ 17%] Processing:config.bmx
[ 18%] Processing:type.bmx
[ 18%] Processing:toker.bmx
[ 19%] Processing:parser.bmx
[ 20%] Processing:ctranslator.bmx
[ 20%] Processing:bcc.bmx
[ 68%] Compiling:stringbuffer_glue.c
[ 68%] Compiling:transform.c
[ 81%] Compiling:base64.bmx.release.win32.x86.c
[ 81%] Compiling:stringbuffer_common.bmx.release.win32.x86.c
[ 82%] Compiling:stringbuffer_core.bmx.release.win32.x86.c
[ 84%] Compiling:base.configmap.bmx.release.win32.x86.c
[ 84%] Compiling:base.stringhelper.bmx.release.win32.x86.c
[ 85%] Compiling:options.bmx.release.win32.x86.c
[ 86%] Compiling:config.bmx.release.win32.x86.c
[ 86%] Compiling:type.bmx.release.win32.x86.c
[ 87%] Compiling:toker.bmx.release.win32.x86.c
[ 88%] Compiling:parser.bmx.release.win32.x86.c
[ 88%] Compiling:ctranslator.bmx.release.win32.x86.c
[ 89%] Compiling:bcc.bmx.console.release.win32.x86.c
[100%] Linking:bcc.exe
C:/Users/YellowNakji/Desktop/bcc-master/.bmx/bcc.bmx.console.release.win32.x86.o:bcc.bmx.console.release.win32.x86.c:(.text+0x3ef): undefined reference to `bbExObject'
C:/Users/YellowNakji/Desktop/bcc-master/.bmx/parser.bmx.release.win32.x86.o:parser.bmx.release.win32.x86.c:(.text+0x1765c): undefined reference to `bbExObject'
C:/Users/YellowNakji/Desktop/bcc-master/.bmx/parser.bmx.release.win32.x86.o:parser.bmx.release.win32.x86.c:(.text+0x17cce): undefined reference to `bbExObject'
C:/Users/YellowNakji/Desktop/bcc-master/.bmx/type.bmx.release.win32.x86.o:type.bmx.release.win32.x86.c:(.text+0x14cf1): undefined reference to `bbExObject'
C:/Users/YellowNakji/Desktop/bcc-master/.bmx/type.bmx.release.win32.x86.o:type.bmx.release.win32.x86.c:(.text+0x14f52): undefined reference to `bbExObject'
C:/Users/YellowNakji/Desktop/bcc-master/.bmx/type.bmx.release.win32.x86.o:type.bmx.release.win32.x86.c:(.text+0x15703): more undefined references to `bbExObject' follow
collect2.exe: error: ld returned 1 exit status
Build Error: Failed to link C:/Users/YellowNakji/Desktop/bcc-master/bcc.exe
Process complete


--

Not sure what I did wrong. Are the latest module commits incompatible with my BCC? Do I need to update something else?

--

If I revert to my backed-up built modules, I can compile fine.

I compiled BCC and then went into MaxIDE. I made a small change (added a comment somewhere) and then pressed CTRL+D to build just D3D9Max2d.bmx; it compiled fine.

I then used the latest BCC plus the new D3D9Max2D I had compiled to build my application with SYSTEM.FMF.

--

I got an EAV when I included it but did not assign it to a variable. It ran when I included it and assigned it to a variable, like before. It also runs when I don't include it but load it externally, like before.

Not sure whether the patch was successful for me or not. On my end, things just seem wonky.

Sorry, ugh.

Derron

Do not just rebuild the DX module - the change might affect more stuff (the change is there to make sure that globals are not garbage-collected).

Once you have the newest BCC compiled with your _old_ brl/pub modules you can update brl/pub and recompile ALL of these modules (just rename old brl.mod and old pub.mod - so no "cached stuff" exists and it already compiles the modules needed).

The problem with the modules was/is, that there are changes in the modules which are not understandable by old BCC. So you need a newer BCC to compile them - means you need to compile BCC with old modules first.
Alternatively you can compile BCC with your vanilla BlitzMax and use this then.


bye
Ron

Yellownakji

Quote from: Derron on July 19, 2018, 21:37:23
Do not just rebuild the DX module - the change might affect more stuff (the change is there to make sure that globals are not garbage-collected).

Once you have the newest BCC compiled with your _old_ brl/pub modules you can update brl/pub and recompile ALL of these modules (just rename old brl.mod and old pub.mod - so no "cached stuff" exists and it already compiles the modules needed).

The problem with the modules was/is, that there are changes in the modules which are not understandable by old BCC. So you need a newer BCC to compile them - means you need to compile BCC with old modules first.
Alternatively you can compile BCC with your vanilla BlitzMax and use this then.


bye
Ron

I wasn't able to do that because:

[ 19%] Processing:hyperlink.bmx
[ 19%] Processing:scrollpanel.bmx
[ 20%] Processing:splitter.bmx
[ 20%] Processing:win32maxguiex.bmx
Compile Error: Syntax error - expecting identifier.
[C:/BlitzMax/mod/maxgui.mod/win32maxguiex.mod/win32maxguiex.bmx;4028;0]
Build Error: failed to compile (-1) C:/BlitzMax/mod/maxgui.mod/win32maxguiex.mod/win32maxguiex.bmx
Process complete


Error "expecting identifier"