readdir bug on Mac ? Memory leak ?

Started by Pingus, September 09, 2020, 03:57:44

Previous topic - Next topic

Pingus

The following code is stopping after around 2500 iteration on my Macmini in debug mode.
The copyfile function return 0 althought it ran fine for around 2500 loops.
In standard mode it stops at ~8000 iterations. The more demanding the code, the sooner it occur.
Within my game, it stops after around 200 iterations.
No crash on PC version (at least not after one hour of testing).

The issue comes from :          Local dir:Byte Ptr = ReadDir(CurrentDir())

In that example, copyfile do not work as intended, but without the copyfile, it would crash elsewhere (opening a stream by ex), so I guess that there is a memory leak somewhere, but related to readdir.

Could someone try to reproduce the issue on a Mac ? Just copy the code below and name it readdirtest.bmx so copyfile have something to eat.
Build and run in debug mode.


Framework SDL.gl2sdlmax2d

Import brl.jpgloader

Import brl.Random

Import brl.StandardIO



Global wWidth=1200

Global wHeight=900


Global testtimer=1000


Local g:TGraphics=Graphics (wWidth,wHeight,0,DesktopHertz())

Local crashcnt=0
Repeat

    Cls
   
    If testtimer >=  0
        testtimer=testtimer-16
        If testtimer < 0
            Local dir:Byte Ptr = ReadDir(CurrentDir())
            Local res=CopyFile ("readdirtest.bmx","anything.bmx")
            testtimer=16
            If res=0
                DebugStop()
            EndIf
        Print "counter before crash:"+crashcnt
        crashcnt:+1
        EndIf
    EndIf

    DrawText("rm:"+MouseX()+","+MouseY(),10,30)

    Flip()


    If KeyHit(KEY_ESCAPE) Or AppTerminate() Then End

Forever


iWasAdam

2 questions
1. why are you constantly copying a file?
2. why are you constantly reading a dir?

RE the dir:
it's returning a pointer - so the dir is using memory. Can't recall about NG and it's garbage collector. but I would have thought constantly getting a dir would not allow the collector time to do it's stuff.



Derron

#2
Might be because you truncated your code to a "working/failing sample" but:

Code (Blitzmax) Select

    If testtimer >=  0
        testtimer=testtimer-16
        If testtimer < 0
            Local dir:Byte Ptr = ReadDir(CurrentDir())
            Local res=CopyFile ("readdirtest.bmx","anything.bmx")
            testtimer=16
            If res=0
                DebugStop()
            EndIf
        Print "counter before crash:"+crashcnt
        crashcnt:+1
        EndIf
    EndIf

does fail "waiting for a second".

testtimer cycle #01= 1000  :: -16 :: not "< 0"
testtimer cycle #02= 984  :: -16 :: not "< 0"
testtimer cycle #11= 840  :: -16 :: not "< 0"
testtimer cycle #61= 40  :: -16 :: not "< 0"
testtimer cycle #62= 24  :: -16 :: not "< 0"
testtimer cycle #63= 8  :: -16 :: "< 0"  --> COPY, testtimer = 16
from here on: every second cycle
testtimer cycle #64= 16  :: -16 :: not "< 0"
testtimer cycle #65= 0  :: -16 :: "< 0"  --> COPY, testtimer = 16
testtimer cycle #66= 16  :: -16 :: not "< 0"
testtimer cycle #67= 0  :: -16 :: "< 0"  --> COPY, testtimer = 16
testtimer cycle #68= 16  :: -16 :: not "< 0"
testtimer cycle #69= 0  :: -16 :: "< 0"  --> COPY, testtimer = 16


dunno what your plan here was ... if you want it to happen "every second loop cycle", you could just do a
Code (BlitzMax) Select

local run:int = 0
while True
  run = 1 - run
  if run then copyFileStuff
wend



If you wanted an interval:
Code (BlitzMax) Select

local copyTime:int = 0
local copyInterval:int = 1000
local lastTime:int = Millisecs() 'attention to 27 days wrap-over, so better use a :long variant
while True
  copyTime :- (Millisecs() - lastTime)
  if copyTime < 0
    'either here: (to ensure no copy cycle is missed, but "copyInterval" is not guaranteed)
    copyTime :+ copyInterval
    'or this way: (to ensure it runs "copyInterval" later than the last doCopyStuff)
    'copyTime = Millisecs() + copyInterval

    doCopyStuff...
  endif
wend


So if "doCopyStuff" is run either every 1000 ms here, or every "1000ms after the last doCopyStuff"


bye
Ron

Pingus

@IwasAdam
Quote1. why are you constantly copying a file?
2. why are you constantly reading a dir?

It is of course a huge simplification of what my code does. I had a random bug occuring after half an hour to an hour of heavy testing. So I preferred to make a simple code that repeat the possible faulty lines in a much faster way.
You may be right about the garbage collector but that's quite odd if it is overwhelmed by a single readdir line. In the real code, the readdir is not run in loop like here, and indeed I do not need to use it in a loop. But if you are right and that is is 'just' a GC limitation, it is good to know that this function is so weak that it can not 'recover' even when it has seconds for that.

@Derron, yes the interval code is weak, but it does not matter here, it is just to mimic 'some time' between copy files.

As I said copyfile is not the problem, it is just 'where' the problem is visible after a while. If I remove the readdir, the copyfile can run indefinitly.

iWasAdam

The readdir will be allocating memory.

if you are constantly calling it, then more and more memory will be used until the app run out of menory and crashes or does something funky.

it looks like there is no way to deallocate the memory - unless the garbage collector does it. but the collector only triggers every now and then, so if you are eating more memory that the collector is being called then you will again run out of memory.

The best way to think about it is:
use getdir WHEN you need a new dir. this should ONLY be when you actually need it and not refreshing constantly.
This should give the collector time to do its stuff when it needs to ;)

Pingus

@IawsAdam

The following code (inherited from Ashmoor test code) crashed after 230 loops:

The main difference is that it does draw some stuff on the screen.

I tested it also with 1 second every readdir.
Crashes after 230 loops as well.

What amount of time is needed for the GC to recover a single function ?
How come that a 8gb Ram machine is overwhelmed by few 'readdir' ? Not millions, just 230 !?

Don't you think that there is something wrong here ?
I understand that Mac and PC version can have their differences, but here we have a code that can run indefinitly on a PC and crash after 230 seconds on a Mac... !



Framework SDL.gl2sdlmax2d

Import brl.jpgloader

Import brl.Random

Import brl.StandardIO



Global isFullScreen



Global vrWidth=1920

Global vrHeight=1080



Global wWidth=960

Global wHeight=600



Global fsWidth=DesktopWidth()

Global fsHeight=DesktopHeight()

Global robo1testtimer=100
Global robo2testtimer=100


Local g:TGraphics=Graphics (wWidth,wHeight,0,DesktopHertz())
Local dhz=DesktopHertz()


SetBlend AlphaBlend

SetVirtualResolution(vrWidth, vrHeight)

Local scr_x, scr_y, scr_dx, scr_dy, dx   
GetViewport(scr_x, scr_y, scr_dx, scr_dy)

Local mousehittest
Local crashcnt=0
Repeat


    Cls

        Local dir:Byte Ptr = ReadDir(CurrentDir())
            Print ("dir present ?"+dir)
            Local res=CopyFile ("readdirtest2.bmx","anything.bmx")
            If res=0
                Notify "is file there ?"
                DebugStop()
            EndIf
        Print "counter before crash:"+crashcnt
        crashcnt:+1


'draw something so we can check

    SetColor (150,50,20)
    DrawRect (0,0,700,500)
    '    DrawImage(bgImg,0,0)

        SetViewport(200,200,1000,500)
    SetColor (50,250,80)
    DrawRect (400,400,400,400)   
    '    DrawImage(bgImg2,0,0)
        SetViewport(0,0,1920,1080)


        SeedRnd(MilliSecs() / 1000)

        For Local i :Int = 0 To 100

                SetColor Rand(0,255), Rand(0,255), Rand(0,255)

                DrawText("Hey", Rand(0, vrWidth-50), Rand(0, vrHeight-20))

        Next

        SetColor (255,255,255)

       

        SetScale(2.0,2.0)

        DrawText("rm:"+MouseX()+","+MouseY(),10,30)

        DrawText(VirtualMouseX()+","+VirtualMouseY(),10,10)

        SetScale(1.0,1.0)

       

        Local x:Float, y:Float

        x=VirtualMouseX(); y=VirtualMouseY()

       

        DrawRect(x-5,y-5,10,10)



    Flip()




'end condition

    If KeyHit(KEY_ESCAPE) Or AppTerminate() Then End

Forever



Derron

#6
Which brl.mod version do you use?

On July 21 Brucey replaced these file functions so they use "brl.io" (which uses pub.physfs - an file system abstraction layer allowing to use stuff like zip files overlaying the real file system - perfect for "mod support" etc).

If you are already using it (open brl.mod/filesystem.mod/filesystem.bmx and check if _rootPath is looking like this:

Function _RootPath$( path$ )
If MaxIO.ioInitialized Then


If so, then MaxIO is used, if not then the old approach is used.
If it is MaxIO then maybe there is some issue to fix for Brucey. If not (old version) I think the issue is somewhere in your code or an OS Flaw (it uses the C-provided file stuff then).

Yet it should only happen if you "Import Brl.io" (and call MaxIO.Init()). So possibly not an issue for you.


If you are not using this current version of brl (and newer bcc, pub.mod ... and all this stuff) you might try it out in an extra blitzmax dir / setup ... maybe with brl.io this works ?

Code (BlitzMax) Select

Framework Brl.StandardIO
Import Brl.FileSystem
Import Brl.io

MaxIO.Init()




EDIT:


Local dir:Byte Ptr = ReadDir(CurrentDir())

Ok ... so where does it fail there - in readdir, or currentdir? Maybe store CurrentDir() in a variable before ... so you can see in which of both commands it fails.


Dunno how to debug the C stuff on mac os ... is there gdb? then you could compile with GDB information (checkbox in MaxIDE ) and rund the debug binary with "gdb -r mybinary".


bye
Ron

Pingus

@Derron,

Thanks for the hints.
I use the 'old' version. So far it shows quite stable except the issue mentioned above and I am a bit hesitant to switch to something new while I spent hundred hours testing my game with that 'old' version and as I have to send a build for publishing soon.

QuoteOk ... so where does it fail there - in readdir, or currentdir? Maybe store CurrentDir() in a variable before ... so you can see in which of both commands it fails.

It is in readdir(). For convenience, I used Currentdir() in my example but it crashes the same if I use a string with a existing path.
Note that the fail occur in copyfile in that example, not in readdir itself.
In my game, if I remove the copyfile line, the fail will occur 'elsewhere' but always in a file system function.
And it is not related to the directory checked by readdir as I tested on miscelaneous folders, with few or lot of files inside.


iWasAdam

I'm running a test with the original code (minus the copyfile)
current counter = 18000 and still going up
memory use is stable at 32.0mb, but slightly increasing (as it's 32.1mb now)

No crash or anything else... so I can't find anything to recommend at this stage.

Same NG version as yours and running in debug mode. I've got 64gb of ram though

Pingus

#9
@IwasAdam,
Thanks for the test.
Second version crash way faster however.
I wonder if it is linked to the use of larger screen or what.

[Edit] copyfile is indeed not needed, readdir return 0 after 230 loops on my Mac.

Derron

ReadDir just uses some std c function ... so if you call it "raw" it should fail too:


Framework Brl.StandardIO
Import Pub.StdC

local path:String = "mypath"
For local i:int = 0 to 1000
if i mod 10 = 0 then print "open #"+i

local thisPath:String = path
'FixPath(thisPath, True)
        opendir_( thisPath )
Next

print "Done."

(code untested)

btw opendir_ is from pub.mod/stdc.c
Code (BlitzMax) Select

void* opendir_( BBString *path ){
if( _bbusew ) {
BBChar *p=bbStringToWString( path );
void * res = _wopendir( p );
bbMemFree(p);
return res;
}
char *p=bbStringToCString( path );
void * res = opendir( p );
bbMemFree(p);
return res;
}

And this looks rather fine to me - maybe there is an OS limit ??


You might have seen now, that "ReadDir()" actually calls OPENdir_() ... and it returns a directory handle ... and you know there is a "CloseDir()" function which calls CLOSEdir_()


In other words: once you are done with a dir, you should close it ...

More details in:
https://github.com/bmx-ng/brl.mod/blob/master/filesystem.mod/filesystem.bmx



bye
Ron

Derron

Maybe my last post was too long ... so just append a "CloseDir()" in your initial code ... and let this run.


Framework SDL.gl2sdlmax2d

Import brl.jpgloader

Import brl.Random

Import brl.StandardIO



Global wWidth=1200

Global wHeight=900


Global testtimer=1000


Local g:TGraphics=Graphics (wWidth,wHeight,0,DesktopHertz())

Local crashcnt=0
Repeat

    Cls
   
    If testtimer >=  0
        testtimer=testtimer-16
        If testtimer < 0
            Local dir:Byte Ptr = ReadDir(CurrentDir())
            Local res=CopyFile ("readdirtest.bmx","anything.bmx")
            testtimer=16
            If res=0
                DebugStop()
            EndIf


            'RONNY HERE..... ADD THIS LINE
            CloseDir(dir)


        Print "counter before crash:"+crashcnt
        crashcnt:+1
        EndIf
    EndIf

    DrawText("rm:"+MouseX()+","+MouseY(),10,30)

    Flip()


    If KeyHit(KEY_ESCAPE) Or AppTerminate() Then End

Forever



bye
Ron

Pingus

Thanks Derron,

One more time you nailed it  8) !!

There seems to be indeed a limit on MACOS (~230) that is not 'requested' on PC, hence it never was an issue so far.

So I can keep my horrible readdir now :) (as long as I close them)

Still learning Blitzmax after all that years... ???


iWasAdam

this is from the NG help docs:
Quote
' readdir.bmx
SuperStrict

Local dir:Byte Ptr = ReadDir(CurrentDir())

If Not dir RuntimeError "failed to read current directory"

Repeat
   Local t:String = NextFile( dir )
   If t="" Exit
   If t="." Or t=".." Continue
   Print t   
Forever

CloseDir dir

So it is more a case RTFM - lol \o/

Derron

Brucey updated the docs so it is more clear.

Nonetheless I also mentioned in the discord chat, that "ReadDir()" returns a Byte Ptr ... while "ReadFile/ReadStream" return a TStream.
The difference is: the first one returns an unmanaged object, the last one a managed one. If the variable "local Result:TStream = ReadFile(myuri)" goes out of scope, the file is automatically closed (if you did not earlier).
Think we should consider adding a similar thing for "ReadDir()" (some TDirectoryHandle or so). Yet it would break backwards compatibility...


bye
Ron