[DatabaseSystem] Database Introduction And Relational Model
Why Database System
A database is an organized collection of inter-related data that models some aspect of the real world. Databases are the core of most computer applications. Before starting the course material, the first question to answer is why we need a database system at all.
Flat File System
The most basic way to store data is to keep information in files. For example, we can store a batch of records as CSV. However, the application then has to parse the file every time a user wants to read or update a record.
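    import csv

    # Scan the whole file on every lookup; "artists.csv" is a hypothetical file name.
    with open("artists.csv", newline="") as f:
        for record in csv.reader(f):
            if record and record[0] == "Ice Cube":
                print(record)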
Flat files also provide no normalization, and nothing guarantees that keys referenced across different files stay consistent. The problems of the flat-file approach can be summarized with the following questions:
1. What if we want to create a new application using the same database?
2. What if two threads try to write to the same file at the same time?
3. What if the machine crashes while our program is updating a record?
4. What if we want to replicate the database to provide high availability?
The answer to the questions above is a database management system (DBMS).
DBMS
A DBMS is software that allows applications to store and analyze information in a database. A general-purpose DBMS is designed to support the definition, creation, querying, update, and administration of databases.
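As a minimal sketch of those operations, the snippet below uses SQLite through Python's standard `sqlite3` module. SQLite is only one possible DBMS, and the table name `artist`, the sample rows, and the new year value are assumptions made for illustration.

```python
import sqlite3


def dbms_demo():
    """Define, populate, query, and update a tiny in-memory database."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk

    # Definition: the schema is declared once, inside the DBMS.
    conn.execute("CREATE TABLE artist (name TEXT, year INTEGER, country TEXT)")

    # Creation: load some illustrative records.
    conn.executemany(
        "INSERT INTO artist VALUES (?, ?, ?)",
        [("personA", 1992, "USA"), ("personB", 1992, "USA"), ("personC", 1989, "USA")],
    )

    # Query: no file parsing in application code.
    query = "SELECT year FROM artist WHERE name = ?"
    before = conn.execute(query, ("personC",)).fetchall()

    # Update: change one record in place (1990 is an arbitrary value).
    conn.execute("UPDATE artist SET year = ? WHERE name = ?", (1990, "personC"))
    after = conn.execute(query, ("personC",)).fetchall()

    conn.close()
    return before, after
```

Compared with the flat-file loop, the application no longer knows how the records are laid out; it only asks the DBMS for them.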
Relational Model
A data model is a collection of concepts for describing the data in a database.
A schema is a description of a particular collection of data, using a given data model.
The popular data models include:
1. Most DBMS: Relational
2. NoSQL: Key-value, Graph, Document, Column-family
3. Machine Learning: Array/Matrix
4. Obsolete/Rare: Hierarchical, Network
The relational model defines the following three concepts:
1. Structure: The definition of relations and their contents
2. Integrity: Ensure the database’s contents satisfy constraints.
3. Manipulation: How to access and modify a database’s contents.
name | year | country |
---|---|---|
personA | 1992 | USA |
personB | 1992 | USA |
personC | 1989 | USA |
The table above, with its three columns (attributes), is a 3-ary relation composed of three tuples.
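To make the three concepts concrete, here is a rough sketch that models the relation above as a plain Python set of tuples. The names `Artist`, `satisfies_constraints`, and `select_by_year`, as well as the example constraint itself, are hypothetical and chosen only for illustration; a real DBMS does far more than this.

```python
from typing import NamedTuple


class Artist(NamedTuple):
    """Structure: the attributes that every tuple of the relation has."""
    name: str
    year: int
    country: str


# A relation is an unordered set of tuples; these rows mirror the table above.
artists = {
    Artist("personA", 1992, "USA"),
    Artist("personB", 1992, "USA"),
    Artist("personC", 1989, "USA"),
}


def satisfies_constraints(relation):
    """Integrity: an example constraint that every year is a positive integer."""
    return all(isinstance(t.year, int) and t.year > 0 for t in relation)


def select_by_year(relation, year):
    """Manipulation: keep only the tuples whose year matches."""
    return {t for t in relation if t.year == year}


arity = len(Artist._fields)   # number of attributes
cardinality = len(artists)    # number of tuples
```

Here `arity` corresponds to the "3" in "3-ary", and `cardinality` counts the tuples in the relation.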
Primary Key
A primary key uniquely identifies a single tuple. Some DBMSs automatically create an internal primary key if you don't define one.
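Both behaviours can be observed in SQLite, used here only as a convenient example through Python's `sqlite3` module. The table names and rows are assumptions for illustration. Choosing `name` as the key is also just an assumption; real schemas often use a dedicated id column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Declared primary key: the DBMS rejects a second tuple with the same key.
conn.execute("CREATE TABLE artist (name TEXT PRIMARY KEY, year INTEGER, country TEXT)")
conn.execute("INSERT INTO artist VALUES ('personA', 1992, 'USA')")


def insert_duplicate(conn):
    """Try to insert a tuple whose key already exists; return True if accepted."""
    try:
        conn.execute("INSERT INTO artist VALUES ('personA', 1989, 'USA')")
    except sqlite3.IntegrityError:
        return False
    return True


duplicate_accepted = insert_duplicate(conn)

# No declared key: SQLite still assigns each row an internal integer rowid,
# so even two identical tuples can be told apart.
conn.execute("CREATE TABLE artist_no_key (name TEXT, year INTEGER)")
conn.execute("INSERT INTO artist_no_key VALUES ('personA', 1992)")
conn.execute("INSERT INTO artist_no_key VALUES ('personA', 1992)")
rowids = [row[0] for row in conn.execute("SELECT rowid FROM artist_no_key ORDER BY rowid")]
```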
Foreign Key
A foreign key specifies that an attribute from one relation has to map to a tuple in another relation.
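A small sketch of this rule, again using SQLite via Python's `sqlite3` module: a hypothetical `album` relation refers to tuples of a hypothetical `artist` relation. SQLite leaves foreign-key enforcement off by default, so the example turns it on first. All table names, column names, and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE album (
        id INTEGER PRIMARY KEY,
        title TEXT,
        artist_id INTEGER NOT NULL REFERENCES artist(id)
    )
    """
)

conn.execute("INSERT INTO artist VALUES (1, 'personA')")
conn.execute("INSERT INTO album VALUES (10, 'albumX', 1)")  # artist 1 exists, so this is fine


def insert_orphan(conn):
    """Try to insert an album that points at a missing artist; return True if accepted."""
    try:
        conn.execute("INSERT INTO album VALUES (11, 'albumY', 99)")
    except sqlite3.IntegrityError:
        return False
    return True


orphan_accepted = insert_orphan(conn)
```

The second insert is rejected because no `artist` tuple has `id = 99`. Flat files cannot offer this kind of guarantee across files.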
Data Manipulation Languages (DML)
A DML defines how to store and retrieve information from a database.
Procedural: The query specifies the high-level strategy the DBMS should use to find the desired result.
Non-Procedural: The query specifies only what data is wanted, not how to find it.
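The difference can be sketched by answering the same question in two ways. The Python loop is only a stand-in for the procedural style, not an actual procedural query language. The SQL statement, run through SQLite as an example DBMS, is the non-procedural version. The function names and sample rows are assumptions for illustration.

```python
import sqlite3

rows = [("personA", 1992, "USA"), ("personB", 1992, "USA"), ("personC", 1989, "USA")]


def procedural(rows, year):
    """Spell out the strategy: visit every record and test it."""
    result = []
    for name, record_year, _country in rows:
        if record_year == year:
            result.append(name)
    return sorted(result)


def non_procedural(rows, year):
    """State only what is wanted; the DBMS decides how to find it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE artist (name TEXT, year INTEGER, country TEXT)")
    conn.executemany("INSERT INTO artist VALUES (?, ?, ?)", rows)
    found = conn.execute(
        "SELECT name FROM artist WHERE year = ? ORDER BY name", (year,)
    ).fetchall()
    conn.close()
    return [name for (name,) in found]
```

Both functions return the same names. With the declarative form, the DBMS is free to change its strategy, for example by using an index, without the query text changing.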
Conclusion
The relational model is independent of any query language implementation. SQL is not the same thing as the relational model; rather, SQL is the de facto standard query language built on it.
SQL is non-procedural: a query states only which data the user wants, instead of spelling out a for loop that fetches the records one by one.