Introduction to Relational Databases

As you continue building web applications with Express and the MVC architecture, you will soon encounter a fundamental limitation: data persistence. While your applications can process information and render dynamic views, they lack a reliable way to store and retrieve data beyond the current server session. This is where databases come in.

In this learning resource, you will learn about relational databases, understand why they are essential for web applications, and explore the differences between SQL, MySQL, and PostgreSQL. This foundation will prepare you for integrating database functionality into your Express applications.

What is a Database?

A database is an organized collection of structured information or data, typically stored electronically in a computer system. Databases are designed to efficiently store, retrieve, update, and manage data. They provide a way to persist information beyond the life of a single program execution.

Think of a database as a digital filing system where information is stored in an organized manner, making it easy to find specific pieces of data when needed. This organization allows for efficient data operations, even when dealing with large volumes of information.

Why Do We Need Databases?

In the context of web development, databases solve several critical problems:

Data Persistence: Information remains available even after server restarts or application crashes
Data Integrity: Rules can be enforced to ensure data remains accurate and consistent
Efficient Retrieval: Complex queries can be performed to find exactly the data you need
Concurrent Access: Multiple users can access and modify data simultaneously
Scalability: Well-designed databases can handle growing amounts of data and users

Without a database, your Express applications would lose all their data whenever the server restarts. User accounts, application state, and any information entered by users would disappear, severely limiting what your applications can do. Files and images are not stored in the database, but they can be referenced and managed through database records. We will cover file storage at a later point in the course.

Relational Databases

Relational databases organize data into tables with rows and columns, similar to spreadsheets. Each table represents a specific type of entity (like users, products, or orders), and the rows represent instances of that entity. The columns define the attributes or properties that each instance has.

What makes relational databases powerful is their ability to establish relationships between different tables. This allows you to model complex real-world scenarios where different entities interact with each other.

Key Concepts in Relational Databases

Let's explore some fundamental concepts that form the foundation of relational databases:

Tables, Rows, and Columns

A relational database stores data in tables (also called relations). Each table has a name and consists of rows and columns:

Tables: Represent entities or concepts (e.g., users, products, orders)
Rows: Represent individual records or instances (e.g., a specific user)
Columns: Represent attributes or properties (e.g., user_id, username, email)

Here is a simple example of what a "users" table might look like:


        | user_id | username  | email               | join_date  |
        |---------|-----------|---------------------|------------|
        | 1       | johndoe   | john@example.com    | 2023-01-15 |
        | 2       | janedoe   | jane@example.com    | 2023-02-20 |
        | 3       | bobsmith  | bob@example.com     | 2023-03-05 |

Primary Keys

A primary key is a column (or a combination of columns) that uniquely identifies each row in a table. In the example above, the "user_id" column serves as the primary key, ensuring that each user record can be uniquely identified.

Primary keys are essential because they:

Ensure each record is unique within a table
Provide a way to reference specific records
Enable relationships between different tables

Foreign Keys and Relationships

A foreign key is a column in one table that refers to the primary key of another table. This creates a relationship between the two tables, allowing you to model connections between different types of data. For example, if we have a "posts" table in a blog application, we might include a "user_id" column that references the "user_id" in the "users" table:


        | post_id | title           | content                   | user_id |
        |---------|-----------------|---------------------------|---------|
        | 1       | First Post      | This is my first post!    | 1       |
        | 2       | Hello World     | Hello to everyone!        | 2       |
        | 3       | Database Intro  | Databases are important...| 1       |

In this example, the "user_id" column in the "posts" table is a foreign key that references the "user_id" primary key in the "users" table. This relationship links each post to the user who created it, allowing you to easily retrieve all posts by a specific user or identify the author of any post. When designing tables, it is common to define the foreign key either immediately after the primary key in the table definition or at the end of the column list. This helps clarify the structure and relationships within your database schema.

Relationships in Action

When modeling data for a web application, you will frequently need to express connections between different entities. For instance, in an e-commerce application, you might have customers who place orders for products. These relationships can be modeled with foreign keys connecting tables for customers, orders, and products.

Introduction to SQL

SQL (Structured Query Language) is the standard language for interacting with relational databases. It provides a set of commands for creating, modifying, querying, and managing data in a relational database system.

SQL was developed in the 1970s by IBM researchers and has since become the universal language for relational database management. It is designed to be both human-readable and powerful enough to handle complex data operations.

Basic SQL Operations

SQL operations can be categorized into several types:

Data Definition Language (DDL): Commands for creating and modifying database structures
Data Manipulation Language (DML): Commands for inserting, updating, and querying data
Data Control Language (DCL): Commands for controlling access to data

Here are some examples of common SQL commands that demonstrate these types of operations:

Creating a Table

The following SQL command creates a simple "users" table:


        CREATE TABLE users (
            user_id INT PRIMARY KEY,
            username VARCHAR(50) NOT NULL,
            email VARCHAR(100) UNIQUE NOT NULL,
            join_date DATE
        );

This command defines a table with four columns, specifying the data type for each column and adding constraints like PRIMARY KEY, NOT NULL, and UNIQUE to ensure data integrity. In this case, the "user_id" column is the primary key, which uniquely identifies each user, so there's no need to add a separate UNIQUE constraint. The "username" and "email" columns are marked as NOT NULL, meaning they must have a value when inserting a new record. The "email" column is also marked as UNIQUE, ensuring that no two users can have the same email address.

Inserting Data

To add data to a table, you use the INSERT command:


        INSERT INTO users (user_id, username, email, join_date)
        VALUES (1, 'johndoe', 'john@example.com', '2023-01-15');

Querying Data

The SELECT command retrieves data from one or more tables:


        SELECT username, email
        FROM users
        WHERE join_date > '2023-01-01'
        ORDER BY username;

This query selects the username and email of all users who joined after January 1, 2023, and sorts the results alphabetically by username.

Comparing MySQL and PostgreSQL

While SQL is the standard language for relational databases, different database management systems (DBMS) implement SQL in slightly different ways, with varying features and optimizations. Two popular relational database systems are MySQL and PostgreSQL.

MySQL

MySQL was first released in 1995 and is now owned by Oracle Corporation. It is one of the most widely used database systems, particularly for web applications.

Key characteristics of MySQL include:

Ease of Use: Known for being relatively easy to set up and administer
Performance: Optimized for read-heavy operations, making it suitable for many web applications
Widespread Adoption: Used by many popular platforms and content management systems
Storage Engines: Supports multiple storage engines with different capabilities (InnoDB, MyISAM, etc.)

MySQL has traditionally been popular in the LAMP stack (Linux, Apache, MySQL, PHP/Python/Perl), which has been a common setup for web development for many years.

PostgreSQL

PostgreSQL (often called "Postgres") was initiated in 1986 as the POSTGRES project and has evolved into an advanced open-source database system.

Key characteristics of PostgreSQL include:

Advanced Features: Supports a wide array of data types and advanced features like table inheritance
SQL Compliance: More closely adheres to SQL standards compared to many other database systems
Extensibility: Can be extended with custom functions, operators, and data types
Concurrency and Transactions: Excellent support for concurrent operations and complex transactions

PostgreSQL is often chosen for applications that require complex queries, data integrity, and scalability.

Key Differences

While both MySQL and PostgreSQL are powerful database systems that support the SQL language, they differ in several important ways:

MySQL vs. PostgreSQL: Key Distinctions

SQL Dialect: Although both use SQL, there are subtle syntax differences. For example, PostgreSQL uses double quotes for identifiers, while MySQL traditionally uses backticks.

Data Types: PostgreSQL supports a wider range of built-in data types, including more precise numeric types, arrays, JSON, and custom types.

Concurrency Model: PostgreSQL uses MVCC (Multi-Version Concurrency Control) which handles concurrent operations differently than MySQL's approach.

Ecosystem: MySQL has traditionally been more widely used in web development, while PostgreSQL has been favored for applications requiring complex data operations and strict data integrity.

Despite these differences, both systems are excellent choices for web applications. For this course, we will use PostgreSQL because of its robust feature set, strong adherence to SQL standards, and powerful capabilities that will serve you well as you develop more complex applications.

Databases in the MVC Architecture

As you have been learning, the Model-View-Controller (MVC) architecture separates your application into three interconnected components: Model (responsible for data and business logic), View (handles the UI and presentation), and Controller (processes requests and coordinates between Model and View).

In this architecture, the database primarily interacts with the Model layer. The Model represents the data structures of your application and contains the logic for accessing and manipulating that data. When implementing a database in your Express application, you will create Model files that handle database operations. These files will contain functions for creating, reading, updating, and deleting data (often called CRUD operations).

Organizing Database Operations

One common approach is to group related database operations into a single object. For example, in a user management system, you might have a User model with methods like:


        // User model (simplified example)
        import db from '../database.js'; // Connection to the database
        
        // Create a User object with all user-related database operations
        const User = {
            // Get all users
            getAllUsers: async () => {
                const result = await db.query('SELECT * FROM users');
                return result.rows;
            },
            
            // Get user by ID
            getUserById: async (id) => {
                const result = await db.query('SELECT * FROM users WHERE user_id = $1', [id]);
                return result.rows[0];
            },
            
            // Create new user
            createUser: async (userData) => {
                const result = await db.query(
                    'INSERT INTO users (username, email, join_date) VALUES ($1, $2, $3) RETURNING *',
                    [userData.username, userData.email, new Date()]
                );
                return result.rows[0];
            },
            
            // Update user
            updateUser: async (id, userData) => {
                const result = await db.query(
                    'UPDATE users SET username = $1, email = $2 WHERE user_id = $3 RETURNING *',
                    [userData.username, userData.email, id]
                );
                return result.rows[0];
            },
            
            // Delete user
            deleteUser: async (id) => {
                await db.query('DELETE FROM users WHERE user_id = $1', [id]);
                return { success: true };
            }
        };
        
        export default User;

This approach creates a clear namespace for all user-related operations. Controllers would import and use this object like this:


        // In a controller file
        import User from '../models/user.js';
        
        // Route handler for getting a user profile
        const getUserProfile = async (req, res) => {
            const userId = req.params.id;
            const user = await User.getUserById(userId);
            // ...
        };

Alternative Approaches

However, it is not always necessary to group all database functions into a single object, especially when the operations deal with unrelated data types or simple utility functions. In such cases, it is common to define and export standalone functions directly. This approach improves modularity and keeps the codebase flexible:


        // Standalone functions for analytics events
        import db from '../database.js';

        // Log a page visit with arrow function syntax
        const logPageVisit = async (url, userId) => {
            const result = await db.query(
                'INSERT INTO page_visits (url, user_id, visited_at) VALUES ($1, $2, NOW()) RETURNING id',
                [url, userId]
            );
            return result.rows[0].id; // Return the ID of the new visit record
        };

        // Get visit statistics for a specific page
        const getPageVisitStats = async (url) => {
            const result = await db.query(
                `SELECT 
                    COUNT(*) as total_visits,
                    COUNT(DISTINCT user_id) as unique_visitors,
                    MAX(visited_at) as last_visit
                FROM page_visits 
                WHERE url = $1`,
                [url]
            );
            return result.rows[0];
        };

        // Export the functions individually
        export {
            logPageVisit,
            getPageVisitStats
        };

In this example, the analytics.js file exports individual functions rather than a unified object. This approach works well when the functions perform simple, focused operations, the functions might be used independently in different parts of the application, or there is no strong need to namespace the functions under a common object.

Controllers would import and use these functions like this:


        // In a controller file
        import { logPageVisit, getPageVisitStats } from '../db/analytics.js';
        
        // Middleware to log page visits
        const visitLogger = async (req, res, next) => {
            const userId = req.session.userId || null;
            await logPageVisit(req.originalUrl, userId);
            // ...
        };
        
        // Route handler for admin dashboard
        const getAnalyticsDashboard = async (req, res) => {
            const homepageStats = await getPageVisitStats('/');
            const aboutPageStats = await getPageVisitStats('/about');
            // ...
        };

Choosing the Right Approach

There is no one-size-fits-all rule for structuring your database code. For primary entities in your application (users, products, orders, etc.), the object approach provides clear organization. For utility functions or smaller features, individual function exports may be more appropriate. Choose the approach that best fits the specific needs of your application.

Looking Ahead

In future assignments, you will learn how to set up a database connection in your Express application and create models that interact with the database. This separation of concerns will make your code more organized, maintainable, and scalable.

The Power of Databases in Web Applications

Integrating a database into your Express application unlocks numerous possibilities that would be difficult or impossible to achieve otherwise. Understanding these capabilities helps illustrate why databases are such a fundamental part of modern web development.

User Accounts and Authentication

Databases allow you to store user information securely, making features like user registration, login, and personalized experiences possible. Without a database, user credentials would be lost when the server restarts, forcing users to re-register every time your application starts up.

Content Management

For applications that manage content such as blog posts, product listings, or user-generated content, databases provide a structured way to store, organize, and retrieve that content. This enables features like search, categorization, and content relationships.

Data Analysis and Reporting

With data stored in a structured format, you can run complex queries to analyze trends, generate reports, and gain insights from user behavior or application usage. This capability becomes increasingly valuable as your application grows and accumulates data over time.

Application State and History

Databases allow your application to maintain state across sessions and track history, enabling features like order history, activity logs, and data recovery. This persistence creates a much richer user experience and enables business-critical functionality.

Scaling Beyond a Single Server

As your application grows, you may need to run it on multiple servers to handle increased traffic. A database provides a central location for data that can be accessed by multiple server instances, enabling horizontal scaling of your application architecture.

Key Concepts Summary

Relational databases are a fundamental component of modern web applications. They provide a structured, efficient, and reliable way to store and retrieve data, enabling features that would otherwise be impossible to implement.

You have learned that databases solve the critical problem of data persistence in web applications, allowing information to survive server restarts and enabling features like user accounts, content management, and data analysis. Relational databases organize data into tables with established relationships, creating a flexible foundation for modeling real-world scenarios.

SQL serves as the universal language for database operations, providing standardized commands for creating structures and manipulating data. While MySQL and PostgreSQL both implement SQL, they each offer different strengths, with PostgreSQL providing more advanced features and stricter SQL compliance.

Finally, you have seen how databases integrate into the MVC architecture through the Model layer, where database operations are organized into logical modules that controllers can use to serve dynamic content to views.

What You've Learned

Databases provide persistence — They allow your application to store and retrieve data even after the server restarts.

Relational databases organize data in tables — Tables have rows (records) and columns (attributes), with relationships between tables established through keys.

SQL is the language for database operations — It provides commands for creating structures, manipulating data, and controlling access.

MySQL and PostgreSQL are popular database systems — They implement SQL with different feature sets and optimizations.

In MVC, databases interact with the Model layer — Models encapsulate the logic for database operations, keeping your code organized and maintainable.

Explore Further

Now that you understand the foundational concepts of relational databases, you might want to explore some of these topics independently:

Research different types of relationships in relational databases (one-to-one, one-to-many, many-to-many)
Explore database normalization concepts and why they matter for data organization
Look into NoSQL databases and understand when they might be used instead of relational databases
Investigate database indexing and how it improves query performance

As you continue in this course, you will learn how to set up a PostgreSQL database, connect it to your Express application, and perform CRUD operations to create fully featured web applications with persistent data storage.