Computers are everywhere. We are surrounded by technology, whether be it on our phones / tablets, in our cars, or on our laptops. But what is a computer and how does it work?
In the beginning, computers started as common calculators, and they were only manipulating numbers back then.
It all started when humans began wondering whether a machine could be designed to help us with the thinking we do, such as solving equations. In order to do so, the machines needed to manipulate information.
Therefore, a computer is a machine that is able to perform the following tasks:
We tell the computer what to do through the input, which is performed by using the keyboard, mouse, microphone, camera, touch screen, etc. All the input is stored as information in the memory of the computer.
A computer’s processor takes information from memory, manipulates it, and then it sends the processed information back to the memory, in order to be stored.
The output is what we see, as the end users of the computer. It can be either by text, image, videos, etc. When two computers are connected via internet, the output of one can be passed as input to another, and so on.
A computer is made out of hardware pieces, and the most important ones are the following:
The motherboard connects all the computer parts together, either directly, or via cables.
The CPU is the brain of the computer, being responsible for processing commands from hardware and software.
The CPU clock speed of a processor is the number of instructions that can be processed each second. A CPU with a clock speed of 3.0 GHz can process 3.0 billion instructions (CPU cycles) per second.
CPUs with multiple cores (dual core, quad-core) means that the CPU can simultaneously manage two/four times the number of instructions every second.
CPU registers are a place to temporarily store commonly used data. Registers are the fastest possible access to the processor, but they are very limited in size.
Similar to the CPU registers, CPU cache is a place to temporarily store commonly used data, for faster retrieval. The data in cache is slower than accessing the registers, but faster than using RAM because it's a physical part of the processor.
RAM is the hardware which temporarily stores data. When data is used, it is first fetched from storage (hard drive) into RAM, because it is much quicker to run the data from the RAM. As an analogy, the RAM is similar to a desk in which all the files that you’re using are there, while all the files you have ever had are stored in a storage place far from you. In comparison with the hard drive, the contents in RAM are lost when the power is off, whereas on hard drive the data is persistent, hence the term of storage.
If we want to know how computers work on the inside, we need to talk about electric wires, circuits, and signals.
Inside the computer, we will find electric wires and circuits that carry all the information, and that uses electricity to do their job.
But how can they do so?
If you have a single wire with electricity flowing through it, the signal can be either on or off. This is great, because one wire can now represent the value one or zero, based on the electricity state.
This state of a wire is called a bit, and this is the smallest unit of information a computer can store.
In order to process the information received as input, and to provide output to the end-user, a computer needs the input signals. This is performed with the use of the electronic components, which come together to form circuits.
The “NOT” circuit takes an electrical signal and flips it. If the input is 1, the circuit sends a 0, otherwise it will send 1. Since the signal that comes in is NOT the same as the signal that comes out, the signal is called "NOT".
Other circuits take multiple signals and combine them.
The AND gate returns 1 only if both input signals are 1, otherwise the output will be 0.
|Input 1||Input 2||Result|
The OR gate returns 1 if either of the input signals is 1. The result will be 0 only if both signals are 0.
|Input 1||Input 2||Result|
The XOR gate returns 1 if only one of the two signals are 1. The result will be 0 if both signals are equal. (both 0 or 1)
|Input 1||Input 2||Result|
These gates perform simple calculations, but when put together in a circuit, they are able to perform more complicated calculations.
An Adder is a circuit board that adds two bits together. This circuit takes two bits, and adds them together to calculate the sum.
|Input 1||Input 2||Result|
Putting circuits side by side can be further used to calculate the sum of much larger numbers.
In all numbering systems, the formula to calculate a number is as follows:
We calculate the value of each digit, and then we add them together.
Not clear? No problem, let's see some examples in decimal and binary systems.
In the decimal system, we have 10 available value: zero to nine.
Like i previously said, each digit of a number have a different value, based on its position in that number.
Let’s take for example the number 1234:
Now, let’s display the position for each digit:
|103 = 1000||102 = 100||101 = 10||100 = 1|
And this is how we calculate the value based on the digits:
(1 * 103) + (2 * 102) + (3 * 101) + (4 * 100) = 1000 + 200 + 30 + 4 = 1234
In the binary system, we only have two possible values, zero and one.
Similar to decimal system, each position also carries a value, but instead of having 10 as base (100 = 1, 101 = 10, 102 = 100), we use the base two (20 = 1, 21 = 2, 22 = 4).
Let’s take for example the value 1010:
Now, let’s display the position for each digit:
|23 = 8||22 = 4||21 = 2||20 = 1|
To calculate the value, we do:
(1 * 23) + (0 * 22) + (1 * 21) + (0 * 20) = 8 + 2 = 10.
With this, any number can be represented with ones and zeros, assuming we use enough bits for the job.
|Decimal (base 10)||Binary (base 2)|
|1||3||103 = 1000||1||3||23 = 8|
|2||2||102 = 100||0||2||22 = 4|
|3||1||101 = 10||1||1||21 = 2|
|4||0||100 = 1||0||0||20 = 1|
With 8 bits, we can store numbers between 0 and 255.
That is because each bit represent a value, which is the power of 2 based on their position.
The highest value is set when all bits are set to 1, which is equal to 255 (128 + 64 + 32 + 16 + 8 + 4 + 2 + 1).
|27 = 128||26 = 64||25 = 32||24 = 16||23 = 8||22 = 4||21 = 2||20 = 1|
We call an octet a unit of information that consist of 8 bits. Most people use the word byte instead of octet, and that is not exactly correct because a byte is not always equal to 8 bits, but is guaranteed to contain at least 8 bits.
If this is not clear yet, let’s assume we want to calculate the following numbers:
Let’s apply the knowledge we got from school on how to add two numbers - by adding together two individual digits. If the sum of two digits is a two-digit number:
Let’s take them digit by digit and calculate their real value based on their position in the number.
Decimal 1234 = (1 * 1000) + (2 * 100) + (3 * 10) + (4 * 1)
Binary 1010 = (1 * 8) + (0 * 4) + (1 * 2) + (0 * 1) = 8 + 2 = 10 (decimal)
Note: In school, we learned that:
a / b = divisor (the number of times a is divided by b)
a % b = remainder (what remains after we divided the values – also called modulo)
if we want to divide ten items in groups of 3, we will have 3 full groups of 3 items each, and in the end we’re left with 1 remaining item.
In the decimal system, we cannot go higher than the digit 9, because we are in base 10.
When that happens, we take the remainder (13 % 10 = 3) and store 1 for the next digit operation.
In the binary system, we cannot go higher than the digit 1, because we are in base 2.
When that happens, we take the remainder (2 % 2 = 0) and store 1 for the next digit operation.
Therefore, in base 2, 1 + 1 = 10.
|Decimal||Binary||Incremental with 1||Logic|
|0||0000||0000 + 0001 = 0001||0 + 1 = 1|
|1||0001||0001 + 0001 = 0010||01 + 01 =
|2||0010||0010 + 0001 = 0011||10 + 01 = 11|
|3||0011||0011 + 0001 = 0100||11 + 01 =
|4||0100||0100 + 0001 = 0101||100 + 001 = 101|
The hexadecimal system is widely used in programming, as it provides a more human-friendly representation of binary-coded values. A hexadecimal digit uses a four-bit aggregation (called a nibble) to represent its values.
The hexadecimal digit uses the decimal numbers (0-9) and six extra symbols for the values for 10 and beyond, taken from the english alphabet (hexadecimal A = decimal 10, B = 11, … F=15).
To avoid confusion with the decimal system, we write the numbers with the letter h after the value (25h = 25 hexadecimal), or with 0x before the number (0x25). Similar to the other systems, the value is the sum of the values calculated by multiplying the digit value with the base to the power.
Let’s take for example the value ABCD:
To calculate the value, we do the following:
(A * 163) + (B * 162) + (C * 161) + (D * 160) = (10 * 163) + (11 * 162) + (12 * 161) + (13 * 160) = 43,981
Therefore, if we want to provide some values to the computer, it’s easier to write 0xCDEF, instead of writing 1100 1101 1110 1111.
Since 0xF is the value 1111, the value 0xFF will be a full byte (1111 1111).
The conversion between hex and binary is quite easy.
From hex to binary, we must change from base 16 to base 2, and since 16 is a power of 2, we simply convert each hex digit to the four binary bits that correspond to that value.
Let’s take for example 0xABCD.
With the help of the values table, we can directly transpose them to binary.
0xABCD = 1010 1011 1100 1101
From binary to hex, we first need to assure that the number of bits are a multiple of 4. If that’s not the case, we will concatenate the value with leading 0s to make it so, because leading 0s add no value to the number (same as with decimals, the value 42 being equal to 0042). Afterwards, we group 4 bits at a time and convert them to their hex correspondent.
Let’s take for example the sequence 10 1100.
First step is to assure we have a multiple of 4, and trailing zeros are not relevant (value 12 in decimal = 012 = 0012).
So, our bit sequence with trailing 0s will be 0010 1100.
The conversion between decimal and other bases is a bit more difficult than that. The easiest way would be to divide by the base repeatedly, and then read the remainder.
Given the value 245, converting from decimal to binary (dividing by 2) is as follows:
|245||1||245 / 2 = 122, remainder 1|
|122||0||122 / 2 = 61, remainder 0|
|61||1||61 / 2 = 30, remainder 1|
|30||0||30 / 2 = 15, remainder 0|
|15||1||15 / 2 = 7, remainder 1|
|7||1||7 / 2 = 3, remainder 1|
|3||1||3 / 2 = 1, remainder 1|
|1||1||1 / 2 = 0, remainder 1|
Let’s read the remainder now (from lowest to highest). The value is 1111 0101.
The number 1111 0101
= (1 * 27) + (1 * 26) + (1 * 25) + (1 * 24) + (0 * 23) + (1 * 22) + (0 * 21) + (1 * 20)
= 128 + 64 + 32 + 16 + 4 + 1
|245||5||245 / 16 = 15, remainder 5|
|15||15||15 / 16 = 0, remainder 15|
Let’s read the remainder now (from lowest to highest). The value is 0x15 0x5.
Since 15 = 0xF, the hex equivalent is 0xF5.
Let’s convert to binary so we can confirm this with the math from above.
Let’s take an example to understand what’s going on.
|0100 (4)||< < 1||1000 (8)|
|1000 (8)||< < 1||0000 (0)|
|1000 (8)||> > 1||0100 (4)|
|0100 (4)||> > 1||0010 (2)|
|0001 (2)||> > 1||0000 (0)|
The concept of bit masking is a bit more difficult to grasp, but I’ll try to explain it as simple as possible.
The bit masking is performed with the AND operator (&), and we will need to have a mask of bits that we apply to our data.
Given the binary sequence 0010 0011, let’s assume we want to get the value of the bit from the first position.
0b is just a way of saying that our number is in binary form. Not all programming languages support such feature, but we’ll use it here for making it more clear.
value = 0b0010'0011; bit = value & 0x1;
We take the whole number, and then we do a logical AND with the value 0x1.
Keep in mind that 0x1 = 0000 0001, as we can have trailing 0s.
Doing an AND on this and any other bit sequence will return the following:
This way, we managed to take only the bit value for the last bit. Let's try now with a mask that consists of multiple bits.
Assuming we want to take the first four bits (0xF):
We can see that the result is the same with what was set on those positions.
But what if we want to do a mask in the middle of a bit field? Let’s say we want to take the bits 2-3.
For this, we set 1 as many as bits we want to return (2 in our case, for bits from [2,3]) and we shift them to that location.
mask = 0b0000'0011 << 2;
We can see that our end result is 00 because the bit values in the number we applied the mask on were 00 as well. If any bits were there with the value of 1, the AND result would have also been 1.
Since we need two letters to represent a byte, we can split them in two parts: high order, and low order.
For the value 0xAC (1010 1100), 0xA (1010) is the high order part, whereas 0xC (1100) is the low order part).
If we want to take them separately and then put them together, we need to perform a set of logical operations.
The low part (0x0C) will be taken by doing an AND call between our number and the range we’re interested in (0xF for 4 bits).
num = 0xAC; // 1010 1100 low = num & 0xF; // 0000 1111
The high part (0x0A) will be taken by shifting the bits four positions to the right, overwriting the bits that are there. Since we shift them, what is left on the left side will be filled with 0’s, so we will remain only with the bits we’re interested in.
num = 0xAC; // 1010 1100 high = num >> 4; // 0000 1010
When combining together, we need to move the high bits to where they are supposed to be, so we have to shift them back on that position. The OR operation will set the bit to 1 if there’s any bit to 1 in that operation, and since we use it with bits that are on different positions, it’s a simple concatenation of bits for our final number.
low = 0xC; // 1100 high = 0xA; // 1010 num = (high << 4) | low;
|1||0||1||0||0||0||0||0||High << 4|
|1||0||1||0||1||1||0||0||High << 4 | low|
What happens if we want to increment a value, but it’s the highest possible value for that data type?
Will the number of bytes that we store in memory will increase? Of course not.
This is called an integer overflow. When we want to go over the highest possible value in a data type, we overflow the range we can use to represent that value. The value that will be set after incrementing with 1 over the highest possible value will be the lowest possible value for that data type.
The most common result will be to wrap around the maximum – and be set to 0 (lowest possible value).
255 + 1 will not be equal to 256 since that would require the bit 29 to be set to 1, and we only have 8 bits in total.
In this case, 0 is the lowest possible value, because we use all bits to represent a positive number, therefore we can represent the range 0-255.
But there are cases in which we need to represent negative values as well. But how can we do that, if we represent numbers as sequences of bits? One way of representing this is through making use of the leftmost bit as a flag (sign bit).
A signed variable is a variable that can hold both positive and negative values. On the same principle, an unsigned variable can only hold positive values.
For an 8 bit data type, by using 1 bit for the sign, we are left with 7 bits for the data. Therefore, this means that the range will be from -128 to +127.
The first bit is the sign, and 1 shows that we are a negative value. Therefore, the number will start with the – sign.
Having 0 everywhere means this is the smallest possible negative value.
In this case, the sign is 0, so we are dealing with a positive value.
As we’re used to by now, we calculate the decimal value in the same way – by powers of 2.
111 1111 will be our data (127) and the sign is +, so our maximum value will be +127.
An integer underflow occurs when we substract over the minimum value. Decreasing with 1 from the minimum value will cause the data to reach the highest maximum value that can be stored on those bits.
On a computer, the memory is organized into bytes. When we assign a value to a data type, we write that value at a memory location.
Pointers are used to store the address location of another variable located in memory. Based on the type it points to, the compiler will know how many bytes to read starting from that location and how to interpret those bytes.
When we want to write in memory a value with a size larger than a byte, we need to decide how those bytes will be stored.
Endianness refers to the order in which bytes are written in memory.
When reading memory or receiving transmitted data from a different computer system, it is required to verify the endianness, and if they differ, to process and translate data between one and another.
Before we talk about endianness types, we need to first define what is the most significant byte, and the least significant byte.
In the number 123456, 1 is the most significant digit, and 6 is the least significant digit.
That is because changing 1 will have the highest impact upon the number, whereas changing 6 will not have such a big impact on the overall number.
Under the same logic, assuming we have the data 0x1234, 0x12 is the most significant byte, and 0x34 is the least significant byte.
Big-endian is a format in which the most significant byte is stored or sent first (has the lowest address), and then we send the following bytes in decreasing significance order.
Assuming we want to store the value 0x12345678 in big endian, we will store them on an incremental memory address, as follows:
Little-endian is a format that reverses the order. We send or store the least significant byte first (lowest address) and the most significant byte last (highest address).
Until now, we talked about bits and bytes, but how does computers represent other types of information, such as text, video, or audio? Well, all of these things can be represented with numbers.
If we think of all the letters in the alphabet, we can represent each of them with a different number.
A = 1, B = 2, C = 3, etc.
|0000 0100||0000 0001||0001 0100||0000 0001|
With this, we can represent any word as a sequence of numbers. Every word we see is displayed through data.
But what about images, and videos?
Images and videos are made out of a big number of dots called pixels, and each pixel has a color, each of them being represented as numbers.
The most common system used to represent them is RGB (Red, Green, Blue), where each pixel is a combination of these 3 bytes, that goes from 0 - 255. A typical image has millions of pixels, and a video with 30 FPS (Frames per second) means that there are 30 frames drawn on the screen (rendered) each second, or in other words, 30 images per second.
Any sound is a series of vibration, and graphically, they are represented as a waveform. Any point on the waveform can be represented by a number.
With this, any sound can be broken down into a series of numbers. For a higher quality sound, you pick 32-bit audio over 8-bit audio, because more bits imply a higher range of numbers, so it is more precise.
When we want to create a variable to hold the value 1 for example, we need to specify to the computer where we want to store it, how much space (bytes) it should hold, and how to interpret the data that is at that location. But declaring all of that is really not as complicated as it looks like, the value 1 is an integer, so we need to use an integer type to store it, and the compiler will allocate enough space for an integer to be placed at that memory location.
Below, you’ll find a table with the standard types and their minimum size in bytes. I say minimal, because it’s not necessary to be of that size, but it’s the smallest possible size for each of the types – therefore these types can have a higher number of bits in different architectures or different programming languages.
These standard types are often called primitives, due to the fact that they are basic building blocks and they are integrated directly into the language.
|Type||Keyword||Example||Range||Minimum size in bytes|
|Boolean||bool||True||Limited to true/false||1|
|Character||char||a||-128 to +127||1|
|Small number||short||3||-32,768 to +32,767||2|
|Number||int||40000||-2,147,483,648 to +2,147,483,647||4|
|Long number||long||9999999999999||-263 to +(263 -1)||8|
|Decimal number||float||1,38||3.4E +/- 38 (7 digits)||4|
|double||1,400000000000000||1.7E +/- 308 (15 digits)||8|
So far, we talked about the hardware and the computer circuits. But how does the software interact with these?
Instructions that can be executed directly by a CPU are called machine language instructions, or machine code. That is possible because the CPU manufacturer has burned into the CPU what each machine instruction will do, and therefore, machine code is dependent on a specific CPU architecture.
Assembly language (‘asm’) is a low-level programming language that acts as the middle layer between the program statements (code) and the architecture’s machine code instructions.
The assembler is responsible for converting the assembly code into machine code, so that the CPU can execute the instructions.
The machine code is then stored in memory by the CPU, and then fetched sequentially and executed.
The software is written in high-level commands that are converted into simpler commands understood by the CPU (the hardware) through the process of compilation.
There are many programming languages, each with their own differences, and due to that they are also behaving differently. Based on the programming language itself, we can divide them into two categories: Compiled languages, and Interpreted languages.
A compiled language is a programming language which requires a compiler (translator that generate machine code from source code) to create an executable. Programs that are compiled tend to be faster than those written in interpreted languages, due to the effect of the optimizations made possible in the compiler.
Once the code has been compiled and the executable has been created, that executable is compatible only for the platform it was created for. That is due to many reasons, including:
An interpreted language is a programming language in which most of the instructions are executed directly, without the need for a compilation to machine-language instructions. The program is executed through an interpreter, which translates each statement as it is run.
Interpreted languages have some advantages, including:
Disadvantages for interpreted language include:
We talked about compiled languages, now let’s also talk about the process that converts high level language code into machine level language. This process is called compiling the code, and this is performed through the following steps: Preprocessing, Compilation, and Linking.
Preprocessing phase is responsible for replacing the include directive (the command which includes/import code from other files) with the actual contents, for replacing the macros (#defines in C++), and selecting the parts of the code based on compiler directives (specific code for specific platforms, for example).
Compilation phase is responsible for converting the source code into assembly code for that specific CPU architecture, and then converting it further to object code with the use of assembler. It also generates the object files (.o) for that platform. At this stage, the developer receives the usual compiler errors, like syntax errors.
So far we talked about executable (binary) files, which can be executed directly by the operating system. Based on our needs, we can also create different types of files, such as libraries.
Libraries are files that contain reusable code, which can be used in other libraries or executables. They are invoked directly by other code, once the library is loaded.
When linking, libraries come in two flavors: static and dynamic libraries.
Static libraries are used at compile-time (when the code is compiled to create the binary), and the code inside the libraries is copied into the final executable.
The advantage when using static libraries is that there is no performance or versioning problem, and that the code is optimized by the compiler.
A disadvantage is that we cannot make use of code reuse. If we have multiple applications that use the same static library, each of them will have its own copy of the code.
Dynamically linked libraries are not copied in the code, but instead the compiler gets a small stub which is used to load them by name when they are needed (at runtime).
If the library is not found when the application is loaded, the application will stop immediately with an error.
An advantage when using dynamic libraries is that we can benefit from code reuse, the executable is smaller, and the library can be updated without rebuilding the application, as long as the interface is the same.
We can also load or unload the library on demand in the process memory.
A disadvantage when using DLLs is that we have a performance drop due to loading the library in memory at runtime.
Modules are the equivalent of a dynamic library, but for interpred languages. Unlike the compiled languages, a module can act both as a module and as an executable.
Ok so we have a piece of code and we compiled it, and there it is – our executable. But what’s next? How do we execute our code? What happens when we attempt to run it ?
The Operating System reads and validates the integrity of the executable.
After the OS confirms that we have a valid executable, it looks into a special section for dynamic libraries and loads them.
An instance of execution is then created, which is called a process. The process is instantiated when the application is loaded into the computer’s memory and begins execution, and it is nothing more than a manager/wrapper, which encapsulates all the resources that are needed by our application.
A process is divided into four sections:
While the stack and the heap are related to the memory allocated statically or dynamically, the text and data fields are filled by data taken from the executable.
Each process contains one or more execution threads, and it uses them to execute code sequentially or in parallel.
A thread is a sequence of instructions that can be passed and processed by a CPU core. It contains the thread id (so we can differentiate between threads), the program counter / instruction pointer (the last executed instruction), a set of registers, and a stack. The stack is a data structure (we’ll learn more about it later) that we use to keep track of the functions that were called on that thread.
If the process contains only one thread of execution, it is called a single-threaded application, and it means that all instructions run sequentially.
If the process contains more than one thread of execution, it is called a multi-threaded application. In this case, each thread can be scheduled independently (can run at the same time as other threads from the same process), and will share the heap memory with all the other threads.
The stack is the memory allocated for a thread of execution. When calling a function, a block is reserved on the top of the stack for local variables and arguments / return value. When that function returns, the objects that were created on that stack are destroyed and the block can be reused for the next function call.
In comparison with the heap, the stack memory is faster. It is easier to allocate and deallocate memory from it (simply increment/decrement a pointer). Other than that, because the memory is reused very frequently, it tends to be mapped to the CPU’s cache. When there’s a lot of nested functions called, or an infinite recursive call, the stack runs out of memory (which is called a stack overflow) – and the application is terminated.
The heap is used for dynamic allocation. You can allocate and deallocate any block of any size, at any time. Data stored on the heap has to be deleted manually by the programmer – the system will not take care of that for you.
A memory leak is something that occurs when we allocate memory on the heap and forget to deallocate it. This is problematic, because we have a resource that is no longer used (or cannot be accessed anymore) that keeps the memory used until the process exits.
The heap is slower than the stack because it is a global resource, so it must be multi-thread safe, therefore each allocation and deallocation must be synchronized with other heap accesses in the application.
An example of platform is the operating system, so we can include windows-related code only if we’re building the binary for windows, for example.
A pointer references a location in memory.
CPU cache is a small storage space that keeps the most frequently used data in order to reduce the accesses to the main memory.
Except for languages that contain garbage collector (which automatically frees memory that is no longer referenced).
NULL refer to the absence of a value, which means that it is not the same as having the value 0.
A Complete binary tree is a binary tree in which all nodes except leaves have two children, and the last level has all keys as left as possible.