This article was originally published in Italian (here). Feedback on content and translation is appreciated, and contributions are welcome. The original article should be considered the reference in case of updates.
In layman’s terms: it’s all about handling bits, bytes and stuff like that to understand how they work and which maths they follow. Yes, there can be kind of “a different maths”, so please bear with me for a while. I am not writing a rigorous treatise on binary mathematics, electronic circuits and other “computer stuff”: I leave this duty to specialized textbooks. I will try to capitalize on what I have learned over the years to explain in the clearest and most “practical” way the meaning of numbers and measurement units we usually come across when dealing with computers and similar contraptions.
I am not going into deep detail: that would require more precise explanations, not suited to the purpose of this post, which is meant to be a “hands-on” approach to the matter. Besides, IT is not exactly my job, so I prefer to avoid inaccuracies on things I think I have understood (but actually haven’t). Suggestions, additions and clarifications are therefore more than welcome.
Starting from basics
A classic IT humour one-liner:
There are only 10 types of people in the world: those who understand binary, and those who don’t.
Keep on reading if you didn’t get it.
Literally, we must keep in mind which “numeral system” a computer uses, as opposed to the one we “non-computers” use. We count in “base 10” (decimal) while computers count in “base 2” (binary). Counting in “base N” means using N symbols (digits) to write numbers (define a quantity) and building them up as a sum of powers of the base. Again, no panic: an example should clarify the matter, taking number 126 as “test subject” (I will only consider integer numbers for the sake of simplicity). In “base 10” we use digits from 1 to 9 plus 0 (that is 10 symbols) and number 126 will be decomposed this way:
126 = 1 × 10^2 + 2 × 10^1 + 6 × 10^0
In base 2 we will use only digits 0 and 1 and the same number will be decomposed this way:
126 = 1 × 2^6 + 1 × 2^5 + 1 × 2^4 + 1 × 2^3 + 1 × 2^2 + 1 × 2^1 + 0 × 2^0
Just like we did for base 10, the digits allowed by the base (0 and 1) are multiplied by the powers of the base itself, up to the exponent required to obtain the number. We had better become familiar with some powers of 2, as we will need them soon:
2^0 = 1, 2^1 = 2, 2^2 = 4, 2^3 = 8, 2^4 = 16, 2^5 = 32, 2^6 = 64, 2^7 = 128, 2^8 = 256, 2^9 = 512, 2^10 = 1024
Ultimately we can state that 126₁₀ = 1111110₂, where subscripts specify the numeral system. You are not supposed to learn how to convert numbers into different bases: it is a simple task that can be easily accomplished even from the Google search box, writing “convert <number> in binary” to immediately see the result. Keep the different representations in mind to understand the implications in the very next paragraphs.
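For those who prefer to see the decomposition run rather than take it on trust, here is a minimal Python sketch. The function names `decompose` and `as_sum_of_powers` are made up for this illustration; only `bin` and `int` are Python built-ins.

```python
def decompose(number, base):
    """Return the digits of a non-negative integer in the given base,
    most significant digit first."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        # The remainder is the next digit, starting from the rightmost one.
        number, remainder = divmod(number, base)
        digits.append(remainder)
    return digits[::-1]


def as_sum_of_powers(number, base):
    """Describe the number as a sum of digit×base^exponent terms."""
    digits = decompose(number, base)
    top = len(digits) - 1
    return " + ".join(f"{d}×{base}^{top - i}" for i, d in enumerate(digits))


print(as_sum_of_powers(126, 10))  # 1×10^2 + 2×10^1 + 6×10^0
print(as_sum_of_powers(126, 2))

# Python's built-ins do the same conversion in both directions.
print(bin(126))            # 0b1111110
print(int("1111110", 2))   # 126
```

The same loop works for any base of 2 or more, which is a nice way to convince yourself that base 10 is not special.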
Computers use base 2 because it is the one that best suits the physical nature of the circuits they are made of. I hope the more “hardcore computer scientists” do not get angry at what I am saying, but basically computers are a bunch of switches. As such, switches can be closed (state corresponding to 1) or open (state corresponding to 0). That is more than enough to represent and operate on the data we provide, which get ground down until they become long lines (strings) of 0s and 1s that are easy to manage. Long lines… but how long?
Enter the “bit”
Let’s now see the “measurement units” used by a computer. The bit (short for “binary digit”) is pretty much what it says, a single 0 or 1, and it is the smallest possible quantity we can deal with: one single open or closed switch. To represent data, a certain number of bits are put together, making a register.
Number 126 from the initial example would be represented more or less this way in one of the registers: 0 1 1 1 1 1 1 0. There is one more position than the number of “boxes” actually required, and the reason is simple: registers have a fixed length (8, 16, 32 or 64 positions) and cannot be “shortened” when the data stored in them require fewer positions. Unused positions (the leftmost ones) keep status (value) 0.
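A quick way to picture a fixed-length register is to pad the binary digits with zeros on the left. This is only a sketch: `to_register` is a hypothetical helper, and real registers are of course hardware, not strings.

```python
def to_register(value, width=8):
    """Return the bits of a non-negative integer as they would sit
    in a register that is `width` positions long."""
    if value < 0 or value >= 2 ** width:
        raise ValueError(f"{value} does not fit in {width} bits")
    # "0{width}b" means: binary digits, padded with zeros on the left.
    return format(value, f"0{width}b")


print(to_register(126))       # 01111110
print(to_register(126, 16))   # same number, more unused positions
```

Asking for a value that is too big (256 in 8 bits, for example) raises an error here, which mirrors the fact that a register cannot grow either.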
Again, in our example we have an 8-bit register; sounds familiar? Maybe we know nothing about electronics and binary systems, but we might still recall that the famous NES (Nintendo Entertainment System) was an “8-bit” console, or that the brand new PC we’d like to buy has a “64-bit” operating system (often pre-installed, grrr…). In other words, hardware and software handle registers of that specific length.
Ok, but what does all this mean? For example, on a system capable of N-bit audio or graphics, 2^N different notes or colours can be handled, each one corresponding to a specific bit sequence: 2 elements (0 and 1) on N positions create that many combinations. Music and art from Super Mario Bros. are well planted in the minds of us all, aren’t they?
A quick test with 2^3: there are 8 possible combinations, namely:
000, 001, 010, 011, 100, 101, 110, 111
If instead we deal with 2^64, the combinations are a few more (18446744073709551616, that is more than 18 quintillion, or 18 billion billions): have fun writing them all down! Currently 64 bit is the limit for available software and hardware.
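If you would rather let the computer write the combinations down, the sketch below does it for small N using `itertools.product` from the Python standard library. The helper name `bit_combinations` is just an illustration.

```python
from itertools import product


def bit_combinations(n):
    """List every sequence of n bits as a string, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]


combos = bit_combinations(3)
print(len(combos))   # 8
print(combos)

# Do not try to list these: just count them.
print(2 ** 64)       # 18446744073709551616
```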
A more “technical” implication involves RAM, where each combination is a “memory address”. In practice, programs use a bit string to “target” a specific cell in RAM. 32-bit operating systems, for example, can manage 2^32 = 4294967296 memory addresses (unless options to extend this capability are activated). Anyway, the idea is clear and simple: the more bits, the better.
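To put a size on those addresses, here is a small sketch. It assumes that each address points to exactly one byte, which is the usual arrangement but is an assumption not stated above; the function name is hypothetical.

```python
GIB = 2 ** 30  # one gibibyte, in bytes


def addressable_bytes(address_bits):
    """Number of distinct addresses, assuming one byte per address."""
    return 2 ** address_bits


print(addressable_bytes(32))         # 4294967296
print(addressable_bytes(32) / GIB)   # 4.0
```

Under that assumption, 32 bits can target 4 GiB of memory and not one byte more.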
A little archeology: my first PC (1987 more or less) sported CGA 4-bit graphics (16 colours, but few at a time!), while my first audio card (years and years later) was an 8-bit Creative Sound Blaster. After some more time I began using, on a slightly better machine, MS-DOS (version 6.x) and great-grandpa Windows 3.11, which were 16-bit operating systems. Today I tinker with Windows XP/7 and several GNU/Linux distributions, some of them 64-bit, with 16-bit audio and 24-bit graphics (16 million colours).
Little numbers, big numbers
Like any other measurement unit, the bit has its multiples, which help us deal with large quantities such as the 20-digit number cited above. The first multiple is the byte: 1 byte (B) = 8 bits (b). Mind the capitals! Multiples of the byte are used more often than those of the bit.
To make life more complicated (or easier, depending on your point of view) there are two classes of byte multiples: one is related to decimal units (powers of 10), the other one is related to powers of 2 (I told you we needed them!).
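Here is a first look at multiples from the “decimal” class:
1 kB (kilobyte) = 10^3 B = 1000 B
1 MB (megabyte) = 10^6 B = 1000 kB
1 GB (gigabyte) = 10^9 B = 1000 MB
1 TB (terabyte) = 10^12 B = 1000 GB
I have restricted the list to the most commonly used multiples.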
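Similarly, multiples defined by powers of 2 will be:
1 KiB (kibibyte) = 2^10 B = 1024 B
1 MiB (mebibyte) = 2^20 B = 1024 KiB
1 GiB (gibibyte) = 2^30 B = 1024 MiB
1 TiB (tebibyte) = 2^40 B = 1024 GiB
The Wikipedia page on Kibibyte also lists the larger multiples for both cases.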
The reason for the doubling is always the same: we humans are as familiar with base 10 as that evil bunch of circuits is with base 2. When a computer measures available resources (RAM or disk space, for example) it counts in the way that best fits its “nature” and then kindly shows the results in a “human readable” fashion. So we are not going to come across KiB, MiB and other binary multiples very often, although their existence is so important. RAM sticks, for instance, may come in 512 or 1024 MiB sizes (improperly referred to as MB) rather than in 500 or 1000 MB sizes. This is precisely due to hardware using binary multiples.
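The two families of multiples can be compared with a tiny converter. This is a sketch under the definitions given above (powers of 10 for kB, MB, GB, TB and powers of 2 for KiB, MiB, GiB, TiB); the `convert` function and the `UNITS` table are illustrative names, not part of any library.

```python
# Size of each unit, in bytes.
UNITS = {
    "B": 1,
    "kB": 10 ** 3, "MB": 10 ** 6, "GB": 10 ** 9, "TB": 10 ** 12,
    "KiB": 2 ** 10, "MiB": 2 ** 20, "GiB": 2 ** 30, "TiB": 2 ** 40,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity of data between any two units in UNITS."""
    return value * UNITS[from_unit] / UNITS[to_unit]


print(convert(512, "MiB", "MB"))   # 536.870912
print(convert(1, "TB", "TiB"))     # about 0.909
```

The second line shows why a disk sold as “1 TB” can be reported as roughly 0.9 of a unit by software that counts in powers of 2: nothing is missing, the yardstick is different.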
From theory to practice
After the long, hopefully useful, introduction, it is time to give a practical meaning to all the quantities cited above. What can a single byte do? It might represent, for example, a single text character.
Experiment: try to create an empty text file and check its size. Then write one letter in it (spaces, punctuation and newline also count), save and check again. Do you believe me now?
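The same experiment can be scripted. The sketch below writes some text to a temporary file and reads back its size; `size_after_writing` is a name invented for this example, and plain ASCII encoding is assumed so that one character really is one byte.

```python
import os
import tempfile


def size_after_writing(text):
    """Write `text` to a temporary file as ASCII and return the
    resulting file size in bytes."""
    # newline="" keeps "\n" as a single byte on every operating system.
    with tempfile.NamedTemporaryFile(
        "w", encoding="ascii", newline="", delete=False
    ) as handle:
        handle.write(text)
        path = handle.name
    try:
        return os.path.getsize(path)
    finally:
        os.remove(path)


print(size_after_writing(""))         # 0
print(size_after_writing("a"))        # 1
print(size_after_writing("hello\n"))  # 6
```

With other encodings the count can differ: in UTF-8, for example, an accented letter such as “è” takes two bytes.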
Starting from here, we can relate storage media capacities to the “real world”.
One more jump into the recent past: does anybody recall floppy disks, maybe the 5 1/4″ ones? The oldest ones I can remember could hold a whopping 360 kB of data, that is 360000 characters. Dante’s whole “Divina Commedia” takes up about 500 kB (plain text, downloaded from LiberLiber). So just picture the three books, physically small as they may be, squeezed into two of those old floppy disks. Luckily the gigantic 1.44 MB floppies came along!
When the first CDs were marketed, everybody was astonished by their inconceivable capacity of 650 MB, enough to hold a whole encyclopaedia. Or 1300 times the “Divina Commedia” :-). Today we deal with even bigger storage media, be they DVDs (4.7 / 8.5 GB), Blu-ray discs (> 50 GB), pen drives (currently up to 64 GB) or hard disks (2 TB and growing). How many times could the Divina Commedia be held in 1 TB?
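The closing question is a one-line division, sketched below with the figures quoted in the text (about 500 kB for the poem, decimal multiples for the media). The names are made up for the example.

```python
DIVINA_COMMEDIA_BYTES = 500 * 10 ** 3  # about 500 kB, as quoted in the text

MEDIA_BYTES = {
    "5 1/4 inch floppy (360 kB)": 360 * 10 ** 3,
    "CD (650 MB)": 650 * 10 ** 6,
    "DVD (4.7 GB)": 4700 * 10 ** 6,
    "hard disk (1 TB)": 10 ** 12,
}


def copies_that_fit(capacity_bytes, item_bytes=DIVINA_COMMEDIA_BYTES):
    """How many whole copies of an item fit in the given capacity."""
    return capacity_bytes // item_bytes


for name, capacity in MEDIA_BYTES.items():
    print(f"{name}: {copies_that_fit(capacity)} copies")
```

For 1 TB the answer is two million copies, while the old 360 kB floppy cannot hold even one.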
I hope the examples above make things a little clearer for friends who often ask me whether 1 MB or 1 GB “is little or a lot of space”, and also for those who insist on trying to mail some hundred MB of videos. Sending a postcard is one thing, sending a crate of books is quite another.
Let’s head back to the ever-less-mysterious bytes: the equivalence 1 B = 1 character is right but too restrictive. There are several file formats other than plain text. Time for another experiment!
This time we may consider a bitmap (BMP) picture, 100 × 100 pixels in size, that can be created by any simple image software (or with some help from Piet Mondrian). This one resembles the classic TV test card… 🙂
The BMP image format is kind of a “table”: each cell holds the value of the colour of a single pixel making up the picture. Such a table will therefore have 100 × 100 = 10000 cells for our test picture. If you save the file in 24, 4 or 1 bit (monochrome) colour format, each cell will contain the colour definition as a 24, 4 or 1 bit long string. Some simple maths will help to evaluate the final file size:
24 bit: 10000 × 3 B = 30000 B = 30 kB
4 bit: 10000 × 0.5 B = 5000 B = 5 kB
1 bit: 10000 × 0.125 B = 1250 B = 1.25 kB
The latter notations are not strictly correct but make calculations a little more straightforward. Actually the “half byte” exists, but we’ll simply pretend it doesn’t… For the very-very-very nitpicky among us, I will admit that the real size is slightly bigger, because files contain further information (a “header”) that states their type: the actual size of the monochrome picture, for example, is 1662 B against the 1250 B evaluated size. Science: it works, bitches!
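For the nitpickers, the gap between 1250 B and 1662 B can be accounted for. The sketch below relies on a few assumptions about how an uncompressed BMP is usually laid out: a 14-byte file header, the common 40-byte info header, a full colour table of 4 bytes per entry for depths of 8 bits or less, and pixel rows padded to a multiple of 4 bytes. Files written by other software may use a different header variant, so take it as an estimate. The function names are hypothetical.

```python
FILE_HEADER = 14  # bytes, BMP file header
INFO_HEADER = 40  # bytes, assuming the common 40-byte info header


def naive_bmp_size(width, height, bits_per_pixel):
    """Pixel data only: width × height × bits per pixel, in bytes."""
    return width * height * bits_per_pixel / 8


def bmp_file_size(width, height, bits_per_pixel):
    """Estimated size in bytes of an uncompressed BMP file."""
    # Each row of pixels is padded to a multiple of 4 bytes.
    row_bytes = (width * bits_per_pixel + 31) // 32 * 4
    # Low colour depths store a colour table, 4 bytes per entry
    # (assuming the full table is written).
    palette_bytes = 4 * 2 ** bits_per_pixel if bits_per_pixel <= 8 else 0
    return FILE_HEADER + INFO_HEADER + palette_bytes + row_bytes * height


for depth in (24, 4, 1):
    print(depth, naive_bmp_size(100, 100, depth), bmp_file_size(100, 100, depth))
```

For the 1-bit picture this gives 1662 B, matching the size measured above: 100 rows of 16 bytes each (13 bytes of pixels plus padding), 54 bytes of headers and an 8-byte colour table.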
How much for a kilo of bytes?
Bits and bytes are often the means by which the “quality” (and therefore the cost) of technology products or services is measured (and priced!). Let’s see what we spend our money on.
The easiest category is storage, which we have been discussing all along. When it comes to buying a hard disk, the purpose it will serve should guide us to the right size.
The “heaviest” files are certainly videos. Backups of DVDs or Blu-rays (even though they are rarely full to the brim) take up a lot of room and rapidly fill the available space. If that is our need, we are certainly going to buy a larger (> 2 TB) disk. The size of a video file in AVI, MPG or MKV format ranges between 1 and 2 GB (for a 2 hour movie) depending on compression quality.
Photos and music share approximately the same typical size: a good smartphone can take high resolution photos about 5 MB in size (usually in JPG format, a compressed file for which the calculations done before for BMP do not apply). The “average song” in MP3 (4 minutes @ 192 kbps) has about the same size. Depending on the number of photos and songs we want to be able to store, again we will pick the medium that best fits our needs. Ok, just one second, I am covering “kbps” in a moment…
Multimedia files are tightly bound to time: a song may last 5 minutes, a movie a couple of hours. These kinds of files are therefore read gradually, unlike a picture or a text, which has to be fully loaded when opened. How big the file “chunk” has to be to play 1 second of sound or video is stated by its so-called “bitrate”, written as kbps (kilobits per second, kb/s). The bitrate, in turn, indicates how much storage space is required to “describe” 1 second of sound or video. As with the colour depth for BMP, it is a measure of quality and it is the hint we need to evaluate the file size. For the MP3 file, the quick maths is the following:
192 kbit/s × 240 s = 46080 kbit = 5760 kB = 5.76 MB
And here is a little more than 5 MB, as promised. Ta-dah! 😉 With this simple trick, some time ago I was able to calculate the right bitrate to fit a whole discography on a single CD. So weird things like these do the trick even in real life, you see…
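Both directions of the trick can be sketched in a few lines: from bitrate and duration to file size, and from available space and playing time back to a bitrate. Constant bitrate and decimal multiples (1 kbit = 1000 bit) are assumed, and the 20 hours of music in the second example are a made-up figure, not the discography mentioned above.

```python
def audio_size_bytes(bitrate_kbps, seconds):
    """File size in bytes of a constant-bitrate stream."""
    return bitrate_kbps * 1000 * seconds / 8


def max_bitrate_kbps(capacity_bytes, total_seconds):
    """Highest constant bitrate (kbit/s) that fits the playing time
    into the given capacity."""
    return capacity_bytes * 8 / total_seconds / 1000


# The "average song": 4 minutes at 192 kbps.
print(audio_size_bytes(192, 4 * 60) / 10 ** 6)   # 5.76 (MB)

# Hypothetical discography: 20 hours of music on one 650 MB CD.
print(max_bitrate_kbps(650 * 10 ** 6, 20 * 3600))  # about 72 kbit/s
```

In practice one would round down to the nearest bitrate the encoder actually offers and leave a little room for file overhead.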
Connection speed (ADSL, LAN, Wi-Fi)
“Hello, I am from AnnoyPhone, I can offer you an ADSL 20 Mega and a Gigabit Wi-Fi N router to let you chat and facebook till you lose dignity and sense of reality, leech on eMule, blow up YouTube and surf the Web like there is no tomorrow, all at the same time, 25 hours a day. You want it, don’t you?”
Starting from the basic concept that the connection speed providers “sell” is (highly) theoretical (translated: you’ll never ever see it), the word “Mega”, smartly left a little vague, of course means Mbit/s (Mbps, megabits per second), not MB/s. I am not using formulas any more: I assume everyone who is reading has understood what I wrote up to this point.
A nice way to test our actual connection speed is visiting Speedtest or downloading a well known Linux distribution via torrent (my choice is Kubuntu :-)). It is also worth noting that even if we are able to receive data at high speed, on the other side of the wires there is a server (or file seeders, for torrent downloads) that might not be able to keep up with that speed. Personally I use an 8 Mbps connection, which is more than enough for all I need to do. Let’s see why.
Excluding peer-to-peer downloads, another “bandwidth hog” is video streaming (YouTube, Vimeo and so on): a Flash video at best quality for both audio and video requires about 6 Mbps (see here), so there is room for some more light web browsing through my connection.
Finally I will explain what happens on “our side” of the router. From the phone wires we get the connection speed we are paying our service provider for, but the router itself and the network (LAN or Wi-Fi) cards allow for much higher transmission speeds, very near to the declared values.
LAN (Ethernet cable) is capable of 100 or 1000 Mbps (the latter being named Gigabit), Wi-Fi “g” and current standard “n” reach 54 and 450 Mbps (peak speed) respectively. You are doing the usual maths, aren’t you?
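The “usual maths” here is a division by 8, since speeds are quoted in bits and files are measured in bytes. Below is a minimal sketch using the nominal speeds mentioned in this section; the function names are invented, and real-world throughput will be lower than these theoretical figures.

```python
def to_megabytes_per_second(megabits_per_second):
    """Convert Mbit/s into MB/s (1 byte = 8 bits)."""
    return megabits_per_second / 8


def download_seconds(size_mb, speed_mbps):
    """Ideal time needed to move `size_mb` megabytes at the given speed."""
    return size_mb / to_megabytes_per_second(speed_mbps)


NOMINAL_MBPS = {
    "ADSL 8 Mega": 8,
    "ADSL 20 Mega": 20,
    "Wi-Fi g": 54,
    "Ethernet": 100,
    "Wi-Fi n": 450,
    "Gigabit Ethernet": 1000,
}

for name, speed in NOMINAL_MBPS.items():
    print(f"{name}: {to_megabytes_per_second(speed)} MB/s")

# A 650 MB CD image over an 8 Mbps line, at best:
print(download_seconds(650, 8), "seconds")
```

So the “20 Mega” from the phone call is 2.5 MB/s at the very best, and an 8 Mbps line moves 1 MB every second.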
However big signal losses may be (especially for wireless), we should not worry about slow navigation. Even more so if we are sharing files between home PCs or network disks.
Just to clarify, I might lazily sit on my couch streaming a movie stored on my desktop PC through the laptop’s wireless connection. The top transfer speed I got is over 5 MB/s (yes, this time it’s megabytes), occasionally dropping to 2-3 MB/s. Let me put it like this: if copying the file takes less than its duration, then I can also stream it. Clear, isn’t it? Shall I explain? Some more formulas? I’d rather not, I have been writing too much.
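For those who do want one more formula, the rule of thumb above fits in a few lines. The movie in the example (1.5 GB, 2 hours) is hypothetical, picked from the typical sizes mentioned earlier, and `can_stream` is an illustrative name.

```python
def can_stream(file_size_mb, duration_s, transfer_mb_per_s):
    """Rule of thumb: streaming works if copying the whole file
    would take less time than playing it."""
    copy_time_s = file_size_mb / transfer_mb_per_s
    return copy_time_s < duration_s


# Hypothetical 1.5 GB movie lasting 2 hours.
print(can_stream(1500, 2 * 3600, 2))    # True: copying takes 750 s
print(can_stream(1500, 2 * 3600, 0.1))  # False: copying takes 15000 s
```

Put differently, that movie needs on average about 0.2 MB/s, so even the worst moments of a 2-3 MB/s wireless link leave plenty of margin.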
I hope someone will find my notes useful and clarifying rather than completely disrupting. However, I will still be here for a while, sorting KiB from kB (and fiddling with ). Just ask.