Quick and cheap bytes data?

Started by lettersquash, March 12, 2020, 11:26:37

lettersquash

I'm in the early stages of writing an evolutionary simulation program, which needs to be fast but also has to hold large amounts of data in memory. I've not had much experience writing SmallBASIC yet, but the lack of a byte data type is an issue for memory size, since a lot of my variables would fit in 1 or 2 bytes. Is there a way to create a byte array or memory buffer? I'm pretty dumb on the internal workings of Windows (I'm only concerned with this running on Windows, by the way), so maybe I can do it with some system calls if there's not a native method. I realise I can do it with strings, encoding my bytes as chr(x), and in some cases that works well, but it takes a speed hit putting the chrs in and ASC-ing them out.
I'll have you know, I'm coding all the right commands, just not necessarily in the right order.

bplus

#1
Consider a string as your byte array or buffer: MID(bytebuff, i, 1) is the element at index i.

SmallBASIC doesn't have MID statement, but it's easy enough to make one so you can plug bytes in anywhere in the string array/buffer.

lettersquash

SB does have a MID(string, pos, length) function, I've been using it. It also has a handy REPLACE(string, pos, newstr, length) and a bunch of other string choppers. That all works fine, it's just slightly slower than other methods.

To store a byte (any number 0-255) as a character, the natural thing to do is stringvar=chr(number), but to use it as a number again you have to extract it with mid() and then convert it with asc().
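
For anyone reading later, the string-as-buffer idea can be sketched roughly like this. It is only an illustration: the names buf, n and v are made up, and the values are kept in the plain ASCII range because whether chr(0) or values above 127 survive the round trip is something to check on your own build.

```basic
' Sketch: a string used as a byte buffer, one character per value.
n = 8
buf = ""
for i = 1 to n
  buf = buf + chr(65)            ' fill every slot with the value 65
next

' Write: overwrite the character at position 3 with the value 100.
buf = replace(buf, 3, chr(100))

' Read: pull the character back out and convert it to a number.
v = asc(mid(buf, 3, 1))
? v
```

Every read costs a mid() plus an asc(), and every write a chr() plus a replace(), which is where the speed hit comes from.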

After posting this, I also realised another method would be to multiply several bytes (four of them, I guess) by powers of 256 and store them together as a single integer variable, which might prove quicker to extract (with some divisions, % and/or BAND).
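
A minimal sketch of that packing idea, assuming `\` (integer division) and BAND behave as in other BASICs. It packs only three bytes rather than four, as a precaution: the total then stays below 2^24, well clear of the sign bit whatever the integer width of the build. The names b0, b1, b2 and packed are just for illustration.

```basic
b0 = 200: b1 = 17: b2 = 255

' Pack: each byte is scaled by a power of 256.
packed = b0 + b1 * 256 + b2 * 65536

' Unpack: shift down with integer division, then mask off the low 8 bits.
? packed band 255
? (packed \ 256) band 255
? (packed \ 65536) band 255
```

If a fourth byte of 128 or more were packed into a 32-bit signed integer, the top bit would be set and the value would read as negative, so that is one thing to rule out if "bytes" come back with the wrong sign.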

Aurel [banned]

Quote: SmallBASIC doesn't have MID statement
B++, from where did you dig this statement?
(Y)

bplus

#4
Quote from: Aurel on March 12, 2020, 19:58:19
Quote: SmallBASIC doesn't have MID statement
B++, from where did you dig this statement?

Are you asking because we know SmallBASIC has a MID FUNCTION but does not have the classic BASIC: MID SUB that allows you to replace a string within a part of another?

Oh wait, are you addressing me? (I am still waiting on my other +, it's in the mail they tell me.)

lettersquash

 :o Well, I tried a little speed test squashing byte data into integers and it was both mind-boggling and didn't show great promise on the speed. Somehow I also kept getting negative 'bytes' out, so my maths was off.

I don't know anything about a "classic BASIC: MID SUB", bplus, but as I said, there's a function REPLACE() that does what you describe, replacing parts of a string with another given. ...I'm not sure what the 'length' argument does, since the replacement string is what it is - maybe it's a length through which to search.

bplus

Hi LS,

Actually REPLACE works better than Classic MID statement:

s = "Hi Aurel how many + do you have?"
s = replace(s, instr(s, "+"), "++++", 1)
? s


Test qb64 MID$ statement (NOT FUNCTION)
s$ = "Hi Aurel how many + do you have?"
MID$(s$, INSTR(s$, "+"), 1) = "++++" ' <<<< test MID$ statement  in QB64
PRINT s$

lettersquash

"Better" how, out of curiosity?

I see now how the last argument, length, works in REPLACE(). Fairly obvious now I tested it, it's just the number of characters of the original to remove, so that you can remove fewer or more than the length of the new text. If omitted, it defaults to the length of the new text for a straight swap, as it were. Handy, also, that it can be 0, then acting as an insertion of the new text. You may have known that already, I just thought it would be useful to share these tips for anyone reading this later, as the sb docs are a bit concise.
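
The three cases just described can be put side by side like this (a small sketch; the string "abcdef" and the replacement "XY" are arbitrary):

```basic
s = "abcdef"
? replace(s, 3, "XY")       ' length omitted: straight swap of 2 characters
? replace(s, 3, "XY", 0)    ' length 0: "XY" is inserted, nothing removed
? replace(s, 3, "XY", 4)    ' length 4: four original characters removed
```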

On the issue of storing byte-ish-size data in strings, it occurs to me there's another method that might suit my needs: just using printable characters from, say, ASCII 40 to 126 (127 is the non-printing DEL), giving 87 different states I can store without having to go chr(x) and asc(y) every time. If I actually needed these as numbers, I'd still have to add or subtract 40, but that's probably very quick, and if I'm just using these different states as cases to direct flow (with IF THEN or SELECT CASE), I don't even have to translate them into numbers, just check for the relevant character. IF x="J" should be pretty quick.

87 is a lot less than a full byte, but enough for what I need. I'll try it that way. My "biots" have "genes", and I'll save those as strings of characters.
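
Roughly what that might look like, as a sketch only. The gene string and the meaning of "J" and "K" are invented for the example; the fallback branch shows the subtract-40 conversion for when a number is wanted.

```basic
genes = "JK(J"                   ' hypothetical gene string, one state per character

for i = 1 to len(genes)
  g = mid(genes, i, 1)
  if g = "J" then
    ? "gene "; i; ": state J"    ' direct character test, no asc() needed
  elseif g = "K" then
    ? "gene "; i; ": state K"
  else
    ? "gene "; i; ": numeric value "; asc(g) - 40
  endif
next
```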

Another tip on the speed of using data in arrays: from my experiments, it's faster to have parallel arrays, like DIM x (1000), y (1000), c (1000)..., rather than a 2D array, DIM arr (1000, 2), for the same data. That's assuming you're not having to access the position in the row via a calculation, but it's still faster to use x(i) and y(i) than arr(i,0) and arr(i,1), say.
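
A rough way to check this on your own machine, assuming TICKS gives a usable elapsed-time reading. The sizes and the reps count are arbitrary, and the numbers will vary between builds, so treat it as a sketch rather than a benchmark.

```basic
n = 1000
reps = 200
dim x(n), y(n)                   ' parallel 1D arrays
dim arr(n, 1)                    ' one 2D array holding the same data

t0 = ticks
for r = 1 to reps
  for i = 0 to n
    x(i) = i
    y(i) = i
  next
next
? "parallel arrays: "; ticks - t0

t0 = ticks
for r = 1 to reps
  for i = 0 to n
    arr(i, 0) = i
    arr(i, 1) = i
  next
next
? "2D array:        "; ticks - t0
```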

bplus

x(i), y(i) is certainly straightforward. Just to fog the decision a little, you could also go
Dim O(nItems-1)
and use O(i).x, O(i).y for x and y position and piece of cake if you want to add other properties later O(i).property.

lettersquash

Oh, that's how you do that. I read about the dot notation but hadn't seen examples. Thanks, that might come in handy. If I remember right, that's a map when you do that, so it's stored as nested lists and prints out with square brackets around items if you say ? O.

bplus

#10
Well it's not exactly in a square bracket, it's interesting to check out:

nItems = 100
dim myStuff(nItems -1)
for i = 0 to nItems-1
  myStuff(i).x = rnd*xmax
  myStuff(i).y = rnd*ymax
next
myStuff(50).clr = 12
? myStuff(50)



The property names are listed as well as values inside curly brackets.

Oh here's the square brackets:

nItems = 10
dim myStuff(nItems -1)
for i = 0 to nItems-1
  myStuff(i).x = rnd*xmax
  myStuff(i).y = rnd*ymax
next
myStuff(5).clr = 12
? myStuff(5)
?
? "Oh here's the square brackets!!"
? myStuff


lettersquash

Aha, the plot thickens. Yes, that's more like a dictionary, or maybe it's a map. I'm a noob.  :))

What I was thinking of was more like this:

n=10
myStuff = () ' or []...they're interchangeable
for pairs = 1 to n
  myStuff << [int(rnd*xmax), int(rnd*ymax)] ' can be thought of as x,y, without the labels
next
myStuff(5) << [12] ' Again, lacks your nice labeling.
? myStuff(5)
?
? "Lots of square brackets!!"
? myStuff
?:? "You can do this:":? "? myStuff(5)(2)  ...which gives ";
? myStuff(5)(2)
? "But '? myStuff(4)(2)' will give out of range error."


What I don't get is what kind of data these results are, apart from "array". If you say
? myStuff(5)(2) + 42
you get 'Operator can't be used with array'. So then I thought, well, it's '[12]' isn't it, so what about
? mid(myStuff(5)(2), 2, 2)
But no, that gives a null string...and if you say +42, you get 42.

As you showed me earlier, you can do that with an "array" - we chopped the '[' and ']' and ';' out of a 2D array - just not this kind of "array" (which I thought was called a map!).

Anyway, I'll think of it as a tree, where you can keep adding branches anywhere they're needed. There are lots of different structures to choose from according to needs.

bplus

Hey cool, looks like you have the key to arrays within arrays within...

It's a bracket racket!

lettersquash

Quote from: bplus on March 15, 2020, 19:52:48
Hey cool, looks like you have the key to arrays within arrays within...

It's a bracket racket!
:)) Yeah, Elmar Vogt has some interesting bits on those maps (or trees, or nested arrays/lists...?) here: https://smallbasic.github.io/pages/vade_data.html
You - I mean, I - could get into a right mess with them in no time!

Yay, I've also realised why I was puzzling about what type the data was I was getting out, trying to add a number to the array [12], then, failing that, thinking it's a string and trying to strip the brackets off it - I forgot the reference needs a final (0) to refer to that element of the array (it's just a single-element array, which threw me). myStuff(5)(2)(0) gives 12, which I can then treat as normal in sB, as a number or a string according to what functions I plug it into.
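
Put together as a short sketch (the loop bounds and values are arbitrary; the indexing follows what is described above):

```basic
myStuff = []
for i = 0 to 5
  myStuff << [i, i * 2]          ' each element is itself a two-item array
next
myStuff(5) << [12]               ' third item of element 5 is the array [12]

? myStuff(5)(2)                  ' still an array, hence the earlier confusion
? myStuff(5)(2)(0)               ' the number inside it
? myStuff(5)(2)(0) + 42          ' arithmetic works on the innermost element
```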

bplus

I've never seen that bit from Elmar under DATA.