To effficiently read and write to databases, they use a data structure called B+ Tree
Instead of having 2 childern like a binary tree, this tree has m+ childern.
Each node doesn’t contain 1 value, but rather 2 values
Data is actually ONLY stored in the leaf nodes, rest of the nodes just help you get there
The leaf nodes are stored in some form of LinkedList, which are linked in a sorted order. This means that once you get access to a specific piece of information in a leaf node, but now want to get similar stuff, it’ll just be easier to find the next few elements.
If there are M nodes, this must mean that there are m-1 keys. For example, in the above picture we have m =3, but only 2 keys above it.
B+ trees also provide indexing
In an SQL database, data is stored inside tables. Tables are a way to organize data where each row contains information about a single primary key. A primary key uniquely identifies each record, where a record is a row.
If we go back to the phone numbers example, let's say we want to
uniquely identify each person by their phone number and store their name
in a table called People
. We must declare the structure of the table first before inserting any records. In SQL, it can be defined as the following:
CREATE TABLE People (
PhoneNumber int PRIMARY KEY,
Name varchar(100)
);
varchar, (sometimes pronounced var-car) is a data type for variable length strings. This can inlcude numbers, letters and special characters.
If we wanted to have another table which would associate each PhoneNumber
in our People
table with an address, indicating that each person must have an address, we can do so in a table called HOMES
. We want to ensure that no PhoneNumber
that does not exist in People
can be inserted into Homes
. This is an example of a FOREIGN KEY
constraint. This now means that each row in Homes
table is associated with a row in the People
table via the PhoneNumber
.