Panpharmacon for numbers: strings

A long time ago, at university, a colleague and I discussed how to store numbers and how to present them to an application’s users. Unfortunately, we never agreed.

A particular data source delivered its information as raw bytes. Sometimes a single byte simply represented a number, sometimes four bytes combined into one large integer, and sometimes the individual bits or nibbles of a byte were what mattered. I proposed storing each value in its interpreted form: as an int32, as a single byte, or with the high and low nibble each in its own byte. The counterargument was that you cannot instantly see the hex or binary representation of the number, and that this leads to redundant code wherever a conversion is required.

That’s why a new structure was created, which I’d like to illustrate with an abstract example. Take the number 42, which I would simply have stored as 42 in a byte.

The object lets me get the value as an integer, as the decimal string “42”, as the hex string “0x2a”, and even as the binary string “00101010”.
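The original class is not reproduced here, so the following is only a hypothetical sketch of what such an object might look like. The class name `EagerValue` and its method names are assumptions made for illustration; the point is that every representation is computed and stored at construction time.

```java
// Hypothetical reconstruction, not the original class:
// all representations are computed up front and kept as fields.
final class EagerValue {
    private final int value;       // the number itself, 0..255
    private final String decimal;  // e.g. "42"
    private final String hex;      // e.g. "0x2a"
    private final String binary;   // e.g. "00101010"

    EagerValue(int unsignedByte) {
        if (unsignedByte < 0 || unsignedByte > 0xFF) {
            throw new IllegalArgumentException("expected a value in 0..255");
        }
        this.value = unsignedByte;
        this.decimal = Integer.toString(unsignedByte);
        this.hex = "0x" + Integer.toHexString(unsignedByte);
        // Left-pad the binary digits to a full eight bits.
        this.binary = String.format("%8s", Integer.toBinaryString(unsignedByte))
                            .replace(' ', '0');
    }

    int asInt() { return value; }
    String asDecimal() { return decimal; }
    String asHex() { return hex; }
    String asBinary() { return binary; }
}
```

With this sketch, `new EagerValue(42)` holds three extra string objects for its whole lifetime, even if nobody ever asks for the hex or binary form.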

Instead of one single byte (technically four bytes in Java), more than 150 bytes were spent on each value that we obtained from the data source. In addition, every object converted its value into each representation, whether it was needed or not. That doesn’t sound like much considering the resources of today’s computers, which provide us with thousands of megabytes of RAM, so what’s the difference? Nevertheless, this object consumes more than 35 times the memory of the simple solution. In our case, about four of those objects were created per second and kept in memory. After one minute, 240 * 150 bytes are allocated, that is 36,000 bytes compared to 960 bytes (240 * 4). Using the application for about half an hour results in about one megabyte of memory that is used only for keeping the values.
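The arithmetic above can be checked with a few lines of Java. This is only a back-of-the-envelope sketch: the 150 and 4 bytes per value and the rate of four objects per second are the rough figures from the text, not measurements, and the class and method names are made up for this example.

```java
// Rough estimate based on the figures quoted in the text (not measured).
final class FootprintEstimate {
    static final int OBJECTS_PER_SECOND = 4;
    static final int BYTES_PER_RICH_OBJECT = 150; // object with all string forms
    static final int BYTES_PER_PLAIN_INT = 4;     // a plain Java int

    // Total bytes held after the given number of seconds.
    static long bytesAfter(int seconds, int bytesPerValue) {
        return (long) OBJECTS_PER_SECOND * seconds * bytesPerValue;
    }

    public static void main(String[] args) {
        System.out.println(bytesAfter(60, BYTES_PER_RICH_OBJECT));      // 36000
        System.out.println(bytesAfter(60, BYTES_PER_PLAIN_INT));        // 960
        System.out.println(bytesAfter(30 * 60, BYTES_PER_RICH_OBJECT)); // 1080000
    }
}
```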

This doesn’t sound very significant either, but at this point it is only one single data object. If the whole application were designed in a similar manner, it could crash because of memory leaks or simply because of excessive memory consumption.

And the moral of this story:

Think long and hard about which information you want to keep for the lifetime of the application and which data you only need temporarily (e.g. different representations of values for the UI). Even though you have plenty of resources at your disposal, use them sparingly. It is always considerably more expensive and more difficult to change data structures afterwards than to keep the memory footprint in mind when you design them.
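As a minimal sketch of that idea, assuming values that fit into a single unsigned byte: store only the byte and derive every other representation when it is actually requested. The class name `CompactValue` and its methods are hypothetical and only serve this example.

```java
// Hypothetical sketch: keep only the raw byte, convert on demand.
final class CompactValue {
    private final byte raw; // the only state kept for the object's lifetime

    CompactValue(int unsignedByte) {
        if (unsignedByte < 0 || unsignedByte > 0xFF) {
            throw new IllegalArgumentException("expected a value in 0..255");
        }
        this.raw = (byte) unsignedByte;
    }

    // Java bytes are signed, so mask to get the unsigned value back.
    int asInt() { return raw & 0xFF; }

    // The string forms are temporary: created per call, never stored.
    String asDecimal() { return Integer.toString(asInt()); }

    String asHex() { return "0x" + Integer.toHexString(asInt()); }

    String asBinary() {
        return String.format("%8s", Integer.toBinaryString(asInt()))
                     .replace(' ', '0');
    }

    int highNibble() { return (asInt() >> 4) & 0x0F; }

    int lowNibble() { return asInt() & 0x0F; }
}
```

The conversion code still lives in one place, which addresses the concern about redundant code, but the strings exist only for as long as the UI needs them. A wrapper object still carries some per-object overhead on the JVM, so for large numbers of values a plain `byte[]` with static helper methods would be leaner still.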
