# Database Learning

## Indexes

An index is a data structure (usually a B‑tree) that lets the database find rows faster than scanning the whole table. When you create an index on a column (or set of columns), the DB maintains a sorted copy of those values plus pointers to the corresponding rows.

## Views

A view is a virtual table basically a saved SELECT statement you can query like a table.You can query it just like a regular table, but the data stays in the underlying tables.

## Transactions

A transaction groups one or more SQL statements into a single unit of work. Either all of them succeed (COMMIT) or none of them do (ROLLBACK), which keeps the data consistent even if something goes wrong. Most databases support the ACID properties (Atomicity, Consistency, Isolation, Durability) for transactions

## ACID: Atomicity, Consistency, Isolation, Durability

ACID is the set of properties that guarantee reliable processing of database transactions:

- Atomicity ("All or Nothing") – a transaction is an all‑or‑nothing unit. If any part fails, the whole thing is rolled back.

- Consistency ("Valid State") – The database must always move from one valid state to another.
  Example: If a rule says "balance cannot be negative," the transaction must respect that. After execution, the database should still follow all rules and constraints

- Isolation ("No Interference") – Transactions should not affect each other while running.
  Example: If two people buy the last ticket at the same time, isolation ensures only one succeeds, preventing double booking.

- Durability ("It Stays Saved") – Once a transaction is committed, the changes are permanent—even if the system crashes.
  Example: After transferring money, the new balances remain stored safely in the database

```bash
-- Start the transaction
BEGIN TRANSACTION;

-- Step 1: Deduct $100 from Account A
UPDATE Accounts
SET balance = balance - 100
WHERE account_id = 'A';

-- Step 2: Add $100 to Account B
UPDATE Accounts
SET balance = balance + 100
WHERE account_id = 'B';

-- Commit or rollback
COMMIT;
ROLLBACK;
```

- Atomicity: Both updates (deduct from A, add to B) must succeed together. If one fails, ROLLBACK ensures neither change is saved.

- Consistency: The total money in the system remains the same. Rules like "balance cannot be negative" are enforced.

- Isolation: If two transfers happen at the same time, each runs independently. SQL ensures they don’t interfere (e.g., no double spending).

- Durability: Once COMMIT is executed, the changes are permanently stored—even if the server crashes right after.

## 📌 Why It Matters

- Without ACID, databases could:
- Lose money in transactions,
- End up with corrupted or incomplete data,
- Allow conflicting operations,
- Forget committed changes after a crash.

That’s why most SQL databases (MySQL, PostgreSQL, SQL Server, Oracle) are ACID-compliant by design

## Normalization

Normalization in SQL is the process of organizing tables to reduce redundancy and improve data integrity. The main types are 1NF, 2NF, 3NF (and beyond). Each step makes the database cleaner and more efficient.

- Normalization = breaking large, messy tables into smaller, logical ones.
- Goal: avoid duplicate data, prevent anomalies (update, insert, delete problems), and ensure consistency.
- Common forms: 1NF → 2NF → 3NF → BCNF (Boyce-Codd Normal Form).

## 1️⃣ First Normal Form (1NF) – No Repeating Groups

**_Rule: Each column must hold atomic (indivisible) values._**

## ❌ What NOT to Do in 1NF

- Don’t store multiple values in one column. Example: putting several phone numbers in a single field separated by commas.
- Don’t use repeating groups of columns. Example: phone1, phone2, phone3 columns in the same table.
- Don’t keep non-atomic (non-divisible) data. Example: storing full address in one column like "House 12, Road 5, Dhaka" instead of splitting into street, city, etc.

```bash
-- Bad design: multiple values in one column
CREATE TABLE Customers (
    customer_id INT PRIMARY KEY,
    name VARCHAR(50),
    phone_numbers VARCHAR(100) -- "12345, 67890"
);
```

## ✅ What TO Do in 1NF

- Each column should hold atomic values (single, indivisible).
- Each row should be unique.
- Break repeating groups into separate rows or tables.

```bash
-- Good design: atomic values, no repeating groups
CREATE TABLE Customers (
    customer_id INT PRIMARY KEY,
    name VARCHAR(50)
);

CREATE TABLE CustomerPhones (
    customer_id INT,
    phone_number VARCHAR(20),
    FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);
```

**Here**:

- Each phone number is stored in its own row.
- Data is atomic (one value per cell).
- No repeating groups or multi-valued columns.

## 🧩 Easy Analogy

- Think of 1NF like a school attendance sheet:
- ❌ Wrong: writing “Rahim, Karim, Salma” in one cell.
- ✅ Right: each student gets their own row.

## 2️⃣ Second Normal Form (2NF) – Remove Partial Dependencies

**Rule: Every non-key column must depend on the whole primary key**

**_❌ What NOT to Do in 2NF_**

- Don’t keep attributes that depend only on part of a composite primary key.
- This happens when a table has a composite key (two or more columns as the primary key), but some non-key columns depend only on one part of that key.

Example: A table tracking which student takes which course:

```bash

-- Bad design: partial dependency
CREATE TABLE StudentCourses (
    student_id INT,
    course_id INT,
    course_name VARCHAR(50), -- depends only on course_id
    PRIMARY KEY (student_id, course_id)
);
```

**Here**

- The primary key is (student_id, course_id).
- But course_name depends only on course_id, not on the whole key.
- This violates 2NF.

**_✅ What TO Do in 2NF_**

- Separate data into different tables so that every non-key column depends on the whole primary key.
- Create one table for courses, and another for the student-course relationship.

```bash
-- Courses table (course_id → course_name)
CREATE TABLE Courses (
    course_id INT PRIMARY KEY,
    course_name VARCHAR(50)
);

-- StudentCourses table (linking students to courses)
CREATE TABLE StudentCourses (
    student_id INT,
    course_id INT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (course_id) REFERENCES Courses(course_id)
);
```

- course_name is stored in the Courses table, where it depends only on course_id.
- The StudentCourses table only stores relationships, with no partial dependency.
- Every non-key attribute depends on the whole key.

## 3️⃣ Third Normal Form (3NF) – Remove Transitive Dependencies

**Rule: Non-key columns should not depend on other non-key columns.**

**_❌ What NOT to Do in 3NF_**

- Don’t keep attributes that depend on other non-key attributes.
- This is called a transitive dependency.
- Example: storing city based on zip_code inside the Students table.

```bash
-- Bad design: transitive dependency
CREATE TABLE Students (
    student_id INT PRIMARY KEY,
    name VARCHAR(50),
    zip_code VARCHAR(10),
    city VARCHAR(50) -- depends on zip_code, not directly on student_id
);
```

**Here:**

- student_id is the primary key.
- city does not depend on student_id directly; it depends on zip_code.
- That violates 3NF.

**_✅ What TO Do in 3NF_**

- Separate attributes that depend on other non-key attributes into their own table.
- Keep only direct dependencies on the primary key.

```bash
-- ZipCodes table (zip_code → city)
CREATE TABLE ZipCodes (
    zip_code VARCHAR(10) PRIMARY KEY,
    city VARCHAR(50)
);

-- Students table (student_id → name, zip_code)
CREATE TABLE Students (
    student_id INT PRIMARY KEY,
    name VARCHAR(50),
    zip_code VARCHAR(10),
    FOREIGN KEY (zip_code) REFERENCES ZipCodes(zip_code)
);
```

**Now:**

- city depends only on zip_code in the ZipCodes table.
- Students table has no transitive dependency — every column depends directly on student_id.
- ✅ This is proper 3NF.

## 1️⃣ One-to-One (1:1)

**Definition: Each record in Table A relates to exactly one record in Table B.**

**Example:**

- Table Students: student_id, name
- Table StudentPassports: passport_id, student_id
- Each student has one passport, and each passport belongs to one student.

```bash
CREATE TABLE Students (
    student_id INT PRIMARY KEY,
    name VARCHAR(50)
);

CREATE TABLE StudentPassports (
    passport_id INT PRIMARY KEY,
    student_id INT UNIQUE,
    FOREIGN KEY (student_id) REFERENCES Students(student_id)
);
```

## 2️⃣ One-to-Many (1:N)

**_Definition: One record in Table A can relate to many records in Table B._**

**Example:**

- Table Teachers: teacher_id, name
- Table Courses: course_id, teacher_id
- One teacher can teach many courses, but each course has one teacher.

```bash
CREATE TABLE Teachers (
    teacher_id INT PRIMARY KEY,
    name VARCHAR(50)
);

CREATE TABLE Courses (
    course_id INT PRIMARY KEY,
    course_name VARCHAR(50),
    teacher_id INT,
    FOREIGN KEY (teacher_id) REFERENCES Teachers(teacher_id)
);
```

## 3️⃣ Many-to-Many (M:N)

- In a many-to-many (M:N) relationship, you must use an intermediate (junction) table to connect the two entities. This is because relational databases don’t allow direct many-to-many links between two tables.

**Why Intermediate Table is Needed**

- A student can enroll in many courses.
- A course can have many students.
- If you try to store this directly, you’ll end up with duplicate or messy data.
- The junction table solves this by holding pairs of IDs.

```bash
-- Students table
CREATE TABLE Students (
    student_id INT PRIMARY KEY,
    student_name VARCHAR(50)
);

-- Courses table
CREATE TABLE Courses (
    course_id INT PRIMARY KEY,
    course_name VARCHAR(50)
);

-- Junction table (StudentCourses)
CREATE TABLE StudentCourses (
    student_id INT,
    course_id INT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES Students(student_id),
    FOREIGN KEY (course_id) REFERENCES Courses(course_id)
);
```

**🔎 How It Works**

- Students holds student info.
- Courses holds course info.
- StudentCourses links them together:
  > Each row = one student enrolled in one course.
  > Composite key (student_id, course_id) ensures no duplicates.

## 4️⃣ Self-Referencing (Recursive Relationship)

**Definition: A table relates to itself.**

- Example:
  > Table Employees: employee_id, name, manager_id
- An employee can be managed by another employee in the same table.

```bash
CREATE TABLE Employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(50),
    manager_id INT,
    FOREIGN KEY (manager_id) REFERENCES Employees(employee_id)
);
```