Data Structure Using C & C++

UNIT-I: Introduction to Data Structure and Array

Introduction to Data Structure and its Characteristics. Array: Representation of single and multidimensional arrays; Sparse arrays – Lower and Upper Triangular Matrices and Tridiagonal Matrices with Vector Representation.

UNIT-II: Stacks and Queues

Stacks and Queues: Introduction and Primitive Operations on Stack; Stack Application; Infix, Postfix, Prefix Expressions; Evaluation of Postfix Expression; Conversion between Prefix, Infix, and Postfix. Introduction and Primitive Operation on Queues, Deques, and Priority Queues.

UNIT-III: Lists

Lists: Introduction to Linked Lists; Sequential and Linked Lists, Operations such as Traversal, Insertion, Deletion, Searching; Two-Way Lists and Use of Headers.

UNIT-IV: Trees

Trees: Introduction and Terminology; Traversal of Binary Trees; Recursive Algorithms for Tree Operations such as Traversal, Insertion, Deletion; Binary Search Tree.

UNIT-V: B-Trees

B-Trees: Introduction, The Invention of B-Tree; Statement of the Problem; Indexing with Binary Search Trees; A Better Approach to Tree Indexes; B-Trees; Working up from the Bottom; Example for Creating a B-Tree.

UNIT-VI: Sorting and Searching Techniques

Sorting Techniques: Insertion Sort, Selection Sort, Merge Sort, Heap Sort. Searching Techniques: Linear Search, Binary Search, and Hashing.

UNIT-I: Introduction to Data Structure and its Characteristics Array

1. Introduction to Data Structures

A data structure is a way of organizing and storing data to enable efficient access and modification. It provides a framework for managing and utilizing data effectively in computational problems.

Characteristics of Data Structures:

Efficiency in storage and retrieval.
Flexibility in terms of size and access patterns.
Scalability for complex problems.

2. Representation of Single and Multidimensional Arrays

Arrays are a collection of elements of the same type stored in contiguous memory locations. They can be single-dimensional or multi-dimensional.

    Example: Single-dimensional array
    int arr[5] = {1, 2, 3, 4, 5};

    Example: Two-dimensional array
    int matrix[2][3] = {
        {1, 2, 3},
        {4, 5, 6}
    };

3. Sparse Arrays

A sparse array is an array in which most of the elements are zero or empty. Special representations are used to store only the non-zero elements efficiently.

    Example: Sparse matrix (2D array with mostly zeros)
    0  0  3
    0  4  0
    5  0  0

    Stored as:
    Row   Column   Value
    0     2        3
    1     1        4
    2     0        5

4. Lower and Upper Triangular Matrices

In a lower triangular matrix, all elements above the main diagonal are zero, while in an upper triangular matrix, all elements below the main diagonal are zero.

    Example: Lower triangular matrix
    1  0  0
    4  2  0
    5  3  3

    Example: Upper triangular matrix
    1  2  3
    0  4  5
    0  0  6

5. Tridiagonal Matrices with Vector Representation

A tridiagonal matrix is a square matrix in which non-zero elements are present only on the main diagonal, the diagonal above it, and the diagonal below it. It can be represented using three vectors.

    Example: Tridiagonal matrix
    2  3  0  0
    4  5  6  0
    0  7  8  9
    0  0  1  2

    Stored as:
    Diagonal below main: {4, 7, 1}
    Main diagonal: {2, 5, 8, 2}
    Diagonal above main: {3, 6, 9}

UNIT-II: Stacks and Queues

1. Introduction to Stacks

A stack is a linear data structure that follows the LIFO (Last In, First Out) principle. Elements are added and removed only from the top of the stack.

Primitive Operations:

Push: Add an element to the top of the stack.
Pop: Remove and return the top element of the stack.
Peek/Top: View the top element without removing it.

    Example:
    Stack: []
    Push(5): [5]
    Push(10): [5, 10]
    Pop(): [5] (returns 10)

2. Stack Applications

Stacks are used in various applications:

Expression evaluation: Evaluate mathematical expressions.
Backtracking: Solve problems like maze traversal or undo operations.
Function call management: Manage function calls in recursive programming.

3. Infix, Postfix, and Prefix Expressions

Infix: Operators are between operands (e.g., A + B).
Postfix: Operators follow operands (e.g., AB+).
Prefix: Operators precede operands (e.g., +AB).

    Example:
    Infix: (A + B) * C
    Prefix: * + A B C
    Postfix: A B + C *

4. Evaluation of Postfix Expressions

Postfix expressions are evaluated using a stack. Operands are pushed onto the stack, and operators are applied to the top two elements.

    Example:
    Postfix: 5 6 + 2 *
    Steps:
    Push(5), Push(6), Pop(6, 5), Push(11), Push(2), Pop(2, 11), Push(22)
    Result: 22

5. Conversion Between Prefix, Infix, and Postfix

Conversions involve changing the order of operators and operands based on the type of expression:

    Example:
    Infix to Postfix:
    Infix: A + B * C
    Postfix: A B C * +

    Infix to Prefix:
    Infix: A + B * C
    Prefix: + A * B C

6. Introduction to Queues

A queue is a linear data structure that follows the FIFO (First In, First Out) principle. Elements are added at the rear and removed from the front.

Primitive Operations:

Enqueue: Add an element to the rear of the queue.
Dequeue: Remove and return the front element of the queue.

    Example:
    Queue: []
    Enqueue(5): [5]
    Enqueue(10): [5, 10]
    Dequeue(): [10] (returns 5)

7. D-Queues and Priority Queues

Deque (Double-Ended Queue): A deque allows insertion and deletion from both ends.
Priority Queue: A queue where elements are dequeued based on their priority rather than their position.

    Example:
    Deque:
    PushFront(5): [5]
    PushBack(10): [5, 10]
    PopFront(): [10] (returns 5)

    Priority Queue:
    Enqueue(3, High): [3]
    Enqueue(1, Low): [3, 1]
    Dequeue(): [1] (returns 3, as it has higher priority)

UNIT-III: Lists

1. Introduction to Linked Lists

A linked list is a dynamic data structure consisting of nodes where each node contains data and a reference to the next node in the sequence. It provides flexibility in memory allocation and efficient insertion/deletion.

    Example:
    Node Structure:
    struct Node {
        int data;
        Node* next;
    };
    List: 10 → 20 → 30 → NULL

2. Sequential and Linked Lists

Sequential List: Also known as an array, it uses contiguous memory for elements.
Linked List: Uses nodes where memory is dynamically allocated.

    Example:
    Sequential List:
    int arr[3] = {10, 20, 30};

    Linked List:
    Head → [10|Next] → [20|Next] → [30|NULL]

3. Operations on Linked Lists

Linked lists support several operations:

Traversal: Accessing each node sequentially.
Insertion: Adding a new node at the beginning, end, or a specific position.
Deletion: Removing a node by its value or position.
Searching: Finding a node with a specific value.

    Example: Traversal
    Node* temp = head;
    while (temp != NULL) {
        cout << temp->data;
        temp = temp->next;
    }

    Example: Insertion at End
    Node* newNode = new Node();
    newNode->data = 40;
    newNode->next = NULL;
    temp->next = newNode;

4. Two-Way Lists (Doubly Linked Lists)

A doubly linked list is a linked list in which each node contains references to both its previous and next nodes, allowing traversal in both directions.

    Example:
    Node Structure:
    struct Node {
        int data;
        Node* prev;
        Node* next;
    };
    List: NULL ← 10 ⇔ 20 ⇔ 30 → NULL

5. Use of Headers

A header node is a special node at the beginning of a list that does not store data but provides a reference to the first node. It simplifies list operations by avoiding edge cases.

    Example:
    Header Node: [Header|Next] → [10|Next] → [20|Next] → [30|NULL]
    Benefits:
    - Simplifies insertion and deletion at the beginning of the list.
    - Provides consistent traversal.

UNIT-IV: Trees

1. Introduction and Terminology

A tree is a hierarchical data structure consisting of nodes. The topmost node is called the root, and each node may have child nodes.

Node: An element in the tree.
Edge: Connection between nodes.
Root: The topmost node of the tree.
Leaf: A node with no children.
Subtree: A tree formed by a node and its descendants.
Height: The length of the longest path from the root to a leaf.

    Example:
    Tree Structure:
        A (Root)
       / \
      B   C
     / \
    D   E
    Nodes: A, B, C, D, E
    Root: A
    Leaves: C, D, E

2. Traversal of Binary Trees

Traversal refers to visiting all nodes in a tree in a specific order. Common traversal methods include:

Inorder: Left subtree → Root → Right subtree
Preorder: Root → Left subtree → Right subtree
Postorder: Left subtree → Right subtree → Root

    Example:
    Binary Tree:
        A
       / \
      B   C
     / \
    D   E

    Inorder: D → B → E → A → C
    Preorder: A → B → D → E → C
    Postorder: D → E → B → C → A

3. Recursive Algorithms for Tree Operations

Recursive algorithms simplify tree operations such as traversal, insertion, and deletion by leveraging the self-referential nature of trees.

    Example: Inorder Traversal
    void inorder(Node* root) {
        if (root != NULL) {
            inorder(root->left);
            cout << root->data << " ";
            inorder(root->right);
        }
    }

    Example: Insertion
    Node* insert(Node* root, int key) {
        if (root == NULL) {
            root = new Node(key);
        } else if (key < root->data) {
            root->left = insert(root->left, key);
        } else {
            root->right = insert(root->right, key);
        }
        return root;
    }

4. Binary Search Tree (BST)

A Binary Search Tree (BST) is a binary tree where:

Each node's left subtree contains only nodes with values less than the node's value.
Each node's right subtree contains only nodes with values greater than the node's value.

    Example:
    BST:
        10
       /  \
      5    15
     / \   / \
    3   7 12  18

    Properties:
    - Inorder traversal of BST gives sorted order: 3 → 5 → 7 → 10 → 12 → 15 → 18

UNIT-V: B-Trees

1. Introduction

A B-Tree is a self-balancing search tree designed for systems that read and write large blocks of data. It maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time.

    Key Properties of B-Trees:
    - All leaf nodes are at the same level.
    - A node can have multiple keys and child pointers.
    - Ensures balanced height for efficient operations.

2. The Invention of B-Tree

The B-Tree was invented by Rudolf Bayer and Edward M. McCreight in 1972 as a solution to efficiently manage and index data stored on disk. It addressed the limitations of binary search trees in handling large datasets.

3. Statement of the Problem

The main problem with binary search trees (BSTs) is that they can become unbalanced, leading to inefficient operations. For instance:

Insertions or deletions may cause skewed trees, resulting in O(n) time complexity for operations.
BSTs do not efficiently utilize disk storage when managing large datasets.

4. Indexing with Binary Search Trees

Indexing with binary search trees uses the structure of BSTs to manage sorted data. However, unbalanced trees can degrade performance, especially with repeated insertions and deletions.

    Example:
    Insertions:
    Sequence: 10, 20, 30, 40
    Resulting BST:
        10
         \
          20
            \
             30
               \
                40
    Skewed structure → inefficient operations

5. A Better Approach to Tree Indexes

B-Trees offer a better approach by:

Allowing multiple keys in a node, reducing the height of the tree.
Balancing the tree dynamically during insertions and deletions.
Minimizing disk I/O operations, making it ideal for database indexing.

6. B-Trees

A B-Tree of order m has the following properties:

Each node has at most m children.
Each internal node (except the root) has at least ceil(m/2) children.
All leaf nodes are at the same level.

    Example:
    Order-3 B-Tree:
        [10 | 20]
       /   |   \
    [5]  [15]  [25 | 30]

7. Working Up from the Bottom

B-Trees are constructed by inserting elements and ensuring the tree remains balanced by splitting nodes and adjusting parent nodes.

    Example:
    Insert: 10, 20, 5, 15, 25, 30
    Step 1: Insert 10, 20 → [10 | 20]
    Step 2: Insert 5 → [5 | 10 | 20]
    Step 3: Split node → [10] with children [5] and [20]
    Step 4: Continue with subsequent insertions, adjusting as needed.

8. Example for Creating a B-Tree

To create a B-Tree:

Insert elements one by one.
Split nodes when they exceed the allowed number of keys.
Adjust parent nodes to maintain balance.

    Example:
    Order-3 B-Tree, Insert: 10, 20, 30, 40, 50
    1. Insert 10 → [10]
    2. Insert 20 → [10 | 20]
    3. Insert 30 → [10 | 20 | 30] (Node full, split)
       Result: [20] with children [10] and [30]
    4. Insert 40 → Adjust structure
    Final Tree:
        [20 | 40]
       /    |    \
    [10]  [30]  [50]

UNIT-VI: Sorting and Searching Techniques

1. Sorting Techniques

Sorting is the process of arranging elements in a specific order (ascending or descending). Common sorting techniques include:

Insertion Sort: Builds the sorted list one item at a time.
Selection Sort: Repeatedly selects the smallest element and places it in the correct position.
Merge Sort: Divides the array into halves, sorts them, and merges them back.
Heap Sort: Uses a heap data structure to sort elements efficiently.

2. Insertion Sort

Insertion Sort works by picking one element at a time and inserting it into its correct position in the sorted part of the array.

    Example:
    Array: [4, 3, 2, 1]
    Steps:
    1. Start with the first element (already sorted).
    2. Pick 3, compare with 4 → Insert before 4 → [3, 4, 2, 1]
    3. Pick 2 → [2, 3, 4, 1]
    4. Pick 1 → [1, 2, 3, 4]

3. Selection Sort

Selection Sort repeatedly finds the smallest element from the unsorted portion and places it at the beginning.

    Example:
    Array: [4, 3, 2, 1]
    Steps:
    1. Find the smallest element (1) → Swap with 4 → [1, 3, 2, 4]
    2. Find the next smallest (2) → Swap with 3 → [1, 2, 3, 4]

4. Merge Sort

Merge Sort is a divide-and-conquer algorithm that splits the array, sorts each half, and merges them.

    Example:
    Array: [4, 3, 2, 1]
    Steps:
    1. Split into halves → [4, 3] and [2, 1]
    2. Sort each half → [3, 4] and [1, 2]
    3. Merge → [1, 2, 3, 4]

5. Heap Sort

Heap Sort uses a binary heap to sort the elements. The maximum (or minimum) is repeatedly extracted and placed in sorted order.

    Example:
    Array: [4, 3, 2, 1]
    Steps:
    1. Build a max heap → [4, 3, 2, 1]
    2. Extract max (4) → Place at the end → [3, 2, 1, 4]
    3. Repeat until sorted → [1, 2, 3, 4]

6. Searching Techniques

Searching is the process of finding a specific element in a data set. Common techniques include:

Linear Search: Checks each element one by one.
Binary Search: Divides the array into halves and searches in the appropriate half (only works on sorted arrays).
Hashing: Maps keys to values using a hash function for fast lookups.

7. Linear Search

    Example:
    Array: [10, 20, 30, 40]
    Search for 30:
    Steps:
    1. Compare 30 with 10 → No
    2. Compare 30 with 20 → No
    3. Compare 30 with 30 → Yes (Found)

8. Binary Search

    Example:
    Sorted Array: [10, 20, 30, 40, 50]
    Search for 30:
    Steps:
    1. Compare with middle element (30) → Found
    Efficient for large sorted datasets.

9. Hashing

Hashing uses a hash function to compute an index into a table, from which the desired value can be found.

    Example:
    Keys: [101, 202, 303]
    Hash Function: key % 10
    Hash Table:
    Index: 1 → 101
    Index: 2 → 202
    Index: 3 → 303