Elements of Statistics
UNIT-I: Population, Sample and Data Condensation
Population, Sample, and Data Condensation: Definition and Scope of Statistics, Concept of Population and Sample with Illustration, Raw Data, Attributes and Variables, Classification, Frequency Distribution, Cumulative Frequency Distribution.
UNIT-II: Measures of Central Tendency
Measures of Central Tendency: Concept of Central Tendency, Requirements of a Good Measure of Central Tendency, Arithmetic Mean, Median, Mode, Harmonic Mean, Geometric Mean for Grouped and Ungrouped Data.
UNIT-III: Measures of Dispersion
Measures of Dispersion: Concept of Dispersion, Absolute and Relative Measures of Dispersion, Range, Variance, Standard Deviation, Coefficient of Variation.
UNIT-IV: Permutations and Combinations
Permutations and Combinations: Permutations of ‘n’ Dissimilar Objects Taken ‘r’ at a Time (with or without repetitions), \( nPr = \frac{n!}{(n-r)!} \) (without proof). Combinations of ‘r’ Objects Taken from ‘n’ Objects, \( nCr = \frac{n!}{r!(n-r)!} \) (without proof). Simple Examples and Applications.
UNIT-V: Sample Space, Events and Probability
Sample Space, Events, and Probability: Experiments and Random Experiments, Deterministic and Non-Deterministic Experiments, Definition of Sample Space, Discrete Sample Space, Events, Types of Events, Union and Intersections of Events, Mutually Exclusive Events, Complementary Event, Exhaustive Event. Classical Definition of Probability, Addition Theorem of Probability (without proof, up to three events), Conditional Probability, Independence of Two Events, Simple Numerical Problems.
UNIT-VI: Statistical Quality Control
Statistical Quality Control: Introduction, Control Limits, Specification Limits, Tolerance Limits, Process and Product Control. Control Charts for \( \bar{X} \) and \( R \), Control Charts for Number of Defectives (n-p Chart), Control Charts for Number of Defects (c-Chart).

UNIT-I: Population, Sample and Data Condensation

1. Definition and Scope of Statistics

Statistics is the science of collecting, organizing, analyzing, interpreting, and presenting data. It is widely used in fields like economics, medicine, engineering, and social sciences to make informed decisions.

    Example:
    Use in Medicine: Analyzing clinical trial data to evaluate treatment effectiveness.
    Use in Economics: Analyzing GDP growth rates to assess economic performance.
        

2. Concept of Population and Sample

Population: The complete set of items or individuals under study.
Sample: A subset of the population used for analysis to make inferences about the whole.

    Example:
    - Population: All students in a university.
    - Sample: A group of 100 students selected randomly from the university.
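
As a rough illustration of drawing such a sample, the sketch below picks 100 students at random without repeats. The population of 5,000 student ID numbers and the seed value are assumptions made only for this example.

```python
import random

# Hypothetical population: ID numbers for 5,000 enrolled students.
population = list(range(1, 5001))

rng = random.Random(42)                 # fixed seed so the draw is repeatable
sample = rng.sample(population, k=100)  # simple random sample, no student twice

print(len(sample), "students selected out of", len(population))
```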
        

3. Raw Data, Attributes, and Variables

Raw Data: Unprocessed data collected directly from observations or experiments.
Attributes: Qualitative characteristics of data (e.g., gender, color).
Variables: Quantitative characteristics that can vary (e.g., age, height).

    Example:
    Raw Data: {18, Male, 5.7ft; 20, Female, 5.4ft}
    Attributes: Gender (Male/Female)
    Variables: Age, Height
        

4. Classification

Classification is the process of organizing raw data into meaningful categories to facilitate analysis. It can be based on qualitative or quantitative criteria.

    Example:
    - Age Groups: Below 20, 20-40, Above 40.
    - Product Categories: Electronics, Apparel, Food.
        

5. Frequency Distribution

A frequency distribution organizes data into classes and shows the number of occurrences (frequency) in each class.

    Example:
    Test Scores:
    Range      Frequency
    0-20          5
    21-40         10
    41-60         15
    61-80         8
    81-100        2
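
One possible way to build such a table from raw scores is sketched below. The class limits come from the table above; the list of raw scores and the function name `frequency_distribution` are hypothetical and used only for illustration.

```python
# Class limits taken from the table above (inclusive on both ends).
classes = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100)]

# Hypothetical raw scores, for illustration only.
scores = [12, 35, 47, 58, 61, 20, 21, 99, 73, 40, 55, 8]

def frequency_distribution(values, class_limits):
    """Count how many values fall inside each inclusive class."""
    counts = {limits: 0 for limits in class_limits}
    for v in values:
        for low, high in class_limits:
            if low <= v <= high:
                counts[(low, high)] += 1
                break  # each value belongs to exactly one class
    return counts

freq = frequency_distribution(scores, classes)
for (low, high), f in freq.items():
    print(f"{low}-{high}: {f}")
```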
        

6. Cumulative Frequency Distribution

A cumulative frequency distribution shows the cumulative total of frequencies up to each class or category.

    Example:
    Test Scores:
    Range      Frequency     Cumulative Frequency
    0-20          5                 5
    21-40         10                15
    41-60         15                30
    61-80         8                 38
    81-100        2                 40
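
The cumulative column is a running total of the frequency column. A minimal sketch using the frequencies from the table above:

```python
from itertools import accumulate

frequencies = [5, 10, 15, 8, 2]              # frequency column from the table
cumulative = list(accumulate(frequencies))   # running totals

print(cumulative)  # [5, 15, 30, 38, 40]
```

The last cumulative value always equals the total number of observations (here 40).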
        

UNIT-II: Measures of Central Tendency

1. Concept of Central Tendency

Measures of central tendency describe a central or typical value for a dataset. They summarize data with a single representative value to analyze distributions effectively.

    Example:
    - Test Scores: {50, 60, 70, 80, 90}
    - Central Tendency: A single value representing the dataset, such as 70.
        

2. Requirements of a Good Measure of Central Tendency

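A good measure of central tendency should be rigidly defined, easy to understand and calculate, based on all observations, suitable for further algebraic treatment, stable under sampling fluctuations, and not unduly affected by extreme values.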

3. Arithmetic Mean

The arithmetic mean is the sum of all data points divided by the total number of data points. It is sensitive to extreme values.

    Formula: Mean = (Σx) / n
    Example:
    Data: {10, 20, 30, 40, 50}
    Mean = (10 + 20 + 30 + 40 + 50) / 5 = 30
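
The calculation can be reproduced in a few lines. The extra value 500 in the second dataset is an assumed outlier, added only to show how strongly one extreme value pulls the mean.

```python
from statistics import mean

data = [10, 20, 30, 40, 50]
manual_mean = sum(data) / len(data)   # (Σx) / n
print(manual_mean, mean(data))

# Hypothetical outlier to show sensitivity to extreme values.
with_outlier = data + [500]
print(round(mean(with_outlier), 2))   # 108.33
```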
        

4. Median

The median is the middle value of a dataset when arranged in ascending or descending order. Because it depends only on the position of values, it is largely unaffected by extreme values.

    Example:
    Data: {10, 20, 30, 40, 50}
    Median = 30 (middle value)
    For an even number of data points: Median = average of the two middle values.
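
A short sketch of both cases follows. The four-value dataset is an assumed example of the even case; `statistics.median` sorts the data internally.

```python
from statistics import median

odd_data = [10, 20, 30, 40, 50]
even_data = [40, 10, 30, 20]      # unsorted on purpose

print(median(odd_data))    # 30: the single middle value
print(median(even_data))   # 25.0: average of 20 and 30
```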
        

5. Mode

The mode is the value that appears most frequently in a dataset. A dataset can have one mode (unimodal), two modes (bimodal), or more (multimodal).

    Example:
    Data: {10, 20, 20, 30, 40}
    Mode = 20 (most frequent value)
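
The sketch below finds every mode of a dataset with `statistics.multimode` (available from Python 3.8). The bimodal dataset is an assumed example.

```python
from statistics import multimode

unimodal = [10, 20, 20, 30, 40]
bimodal = [1, 1, 2, 2, 3]      # hypothetical data with two modes

print(multimode(unimodal))  # [20]
print(multimode(bimodal))   # [1, 2]
```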
        

6. Harmonic Mean

The harmonic mean is calculated as the reciprocal of the average of the reciprocals of the data points. It is used for rates and ratios.

    Formula: Harmonic Mean = n / (Σ(1/x))
    Example:
    Data: {2, 3, 4}
    Harmonic Mean = 3 / ((1/2) + (1/3) + (1/4)) = 2.77
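
The same figure can be checked with exact fractions, which shows that 2.77 is the rounded value of 36/13.

```python
from fractions import Fraction
from statistics import harmonic_mean

data = [2, 3, 4]
exact = len(data) / sum(Fraction(1, x) for x in data)   # n / Σ(1/x)

print(exact)                          # 36/13
print(round(harmonic_mean(data), 2))  # 2.77
```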
        

7. Geometric Mean

The geometric mean is the nth root of the product of n data points. It is useful for growth rates.

    Formula: Geometric Mean = (Πx)^(1/n)
    Example:
    Data: {2, 8}
    Geometric Mean = √(2 × 8) = 4
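
A minimal sketch of the formula, compared with the standard-library function (both `math.prod` and `statistics.geometric_mean` need Python 3.8 or later):

```python
import math
from statistics import geometric_mean

data = [2, 8]
manual = math.prod(data) ** (1 / len(data))   # (Πx)^(1/n)

print(manual)                 # 4.0
print(geometric_mean(data))   # approximately 4.0 (computed via logarithms)
```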
        

8. Grouped and Ungrouped Data

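Central tendency measures can be applied to both grouped and ungrouped data. For ungrouped data they are computed from the individual values; for grouped data each class is represented by its midpoint and weighted by its frequency.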

    Example:
    Grouped Data:
    Class Interval    Frequency
    10-20                3
    20-30                5
    30-40                2
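    Mean = Σ(f × midpoint) / Σf = (3×15 + 5×25 + 2×35) / 10 = 24
    Median and Mode: both fall in the class 20-30 and are found by interpolation within that class.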
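
The sketch below works through the grouped table above. The interpolation formulas for the median and mode are the usual textbook ones and are not stated in these notes, so treat them as an assumption: median = L + ((N/2 - cf) / f) × h and mode = L + ((f1 - f0) / (2f1 - f0 - f2)) × h, where L is the lower limit and h the width of the relevant class.

```python
# (lower limit, upper limit, frequency) from the grouped table above.
classes = [(10, 20, 3), (20, 30, 5), (30, 40, 2)]
n = sum(f for _, _, f in classes)

# Mean: treat every observation as sitting at its class midpoint.
grouped_mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n

# Median: find the class holding the (n/2)-th observation, then interpolate.
cum = 0  # cumulative frequency before the current class
for lo, hi, f in classes:
    if cum + f >= n / 2:
        grouped_median = lo + (n / 2 - cum) / f * (hi - lo)
        break
    cum += f

# Mode: interpolate inside the class with the highest frequency.
i = max(range(len(classes)), key=lambda k: classes[k][2])
lo, hi, f1 = classes[i]
f0 = classes[i - 1][2] if i > 0 else 0                  # frequency before
f2 = classes[i + 1][2] if i < len(classes) - 1 else 0   # frequency after
grouped_mode = lo + (f1 - f0) / (2 * f1 - f0 - f2) * (hi - lo)

print(grouped_mean, grouped_median, grouped_mode)
```

For this small symmetric-looking table all three measures come out at 24.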
        

UNIT-III: Measures of Dispersion

1. Concept of Dispersion

Dispersion refers to the spread or variability of a dataset. It provides insights into how data points are distributed around a central value, highlighting the degree of consistency or variability.

    Example:
    Dataset A: {5, 5, 5, 5, 5} → Low dispersion
    Dataset B: {1, 5, 9, 13, 17} → High dispersion
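
To put numbers on "low" and "high", the sketch below computes the range and the population standard deviation (both defined later in this unit) for the two datasets.

```python
from statistics import pstdev

dataset_a = [5, 5, 5, 5, 5]
dataset_b = [1, 5, 9, 13, 17]

for name, data in (("A", dataset_a), ("B", dataset_b)):
    spread = max(data) - min(data)   # range
    sd = pstdev(data)                # population standard deviation
    print(f"Dataset {name}: range = {spread}, standard deviation = {sd:.2f}")
```

Dataset A has zero spread on both measures, while Dataset B has a range of 16 and a standard deviation of about 5.66.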
        

2. Absolute and Relative Measures of Dispersion

Absolute Measures: Expressed in the units of the data (e.g., range, standard deviation) or in squared units (variance).
Relative Measures: Expressed as ratios or percentages, enabling comparison across datasets (e.g., coefficient of variation).

3. Range

The range is the simplest measure of dispersion, calculated as the difference between the maximum and minimum values in a dataset.

    Formula: Range = Maximum Value - Minimum Value
    Example:
    Dataset: {4, 8, 15, 16, 23}
    Range = 23 - 4 = 19
        

4. Variance

Variance measures the average squared deviation of each data point from the mean. It indicates the data's variability.

    Formula: Variance (σ²) = Σ(Xi - X̄)² / N
    Example:
    Dataset: {2, 4, 6}
    Mean (X̄) = 4
    Variance = [(2-4)² + (4-4)² + (6-4)²] / 3 = 2.67
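
The formula above divides by N, which is the population variance. As a sketch, the code below reproduces it and, for comparison, shows the sample variance (divisor N - 1), which these notes do not cover.

```python
from statistics import pvariance, variance

data = [2, 4, 6]
mean_x = sum(data) / len(data)

# Population variance: average squared deviation from the mean.
manual = sum((x - mean_x) ** 2 for x in data) / len(data)

print(round(manual, 2))           # 2.67
print(round(pvariance(data), 2))  # 2.67 (same formula, divisor N)
print(variance(data))             # 4   (sample variance, divisor N - 1)
```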
        

5. Standard Deviation

The standard deviation is the square root of the variance. It provides a measure of dispersion in the same units as the data.

    Formula: Standard Deviation (σ) = √Variance
    Example:
    Dataset: {2, 4, 6}
    Variance = 2.67
    Standard Deviation = √2.67 = 1.63
        

6. Coefficient of Variation (CV)

The coefficient of variation is a relative measure of dispersion, calculated as the ratio of the standard deviation to the mean, expressed as a percentage.

    Formula: CV = (Standard Deviation / Mean) × 100%
    Example:
    Dataset: {2, 4, 6}
    Mean = 4, Standard Deviation = 1.63
    CV = (1.63 / 4) × 100% ≈ 40.8%
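
A short sketch of the same calculation at full precision. Carrying the unrounded standard deviation gives about 40.82%, marginally above the figure obtained by rounding the standard deviation to 1.63 first.

```python
from statistics import mean, pstdev

data = [2, 4, 6]
sd = pstdev(data)              # population standard deviation, about 1.633
cv = sd / mean(data) * 100     # coefficient of variation in percent

print(round(sd, 2), round(cv, 2))  # 1.63 40.82
```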
        

UNIT-IV: Permutations and Combinations

1. Permutations

A permutation is an arrangement of objects in a specific order. The number of permutations of ‘n’ dissimilar objects taken ‘r’ at a time is given by:

    Formula:
    nPr = n! / (n-r)!
    Where n = total objects, r = objects selected.
    Example:
    For n = 5, r = 2:
    5P2 = 5! / (5-2)! = 5 × 4 = 20
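
The count can be confirmed three ways: from the formula, with `math.perm` (Python 3.8+), and by listing every arrangement. The letters A to E are placeholder objects chosen for this sketch.

```python
import math
from itertools import permutations

n, r = 5, 2
by_formula = math.factorial(n) // math.factorial(n - r)   # n! / (n-r)!
by_library = math.perm(n, r)
by_listing = len(list(permutations("ABCDE", r)))          # AB, AC, ..., ED

print(by_formula, by_library, by_listing)  # 20 20 20
```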
        

With Repetition: When repetition is allowed, the formula is:

    Formula:
    n^r
    Example:
    For n = 3, r = 2:
    3^2 = 9 permutations.
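
Listing the arrangements makes the n^r rule concrete. The objects A, B, C are placeholders for this sketch.

```python
from itertools import product

objects = "ABC"
r = 2
arrangements = ["".join(p) for p in product(objects, repeat=r)]

print(arrangements)
print(len(arrangements), len(objects) ** r)  # 9 9
```

Arrangements such as AA and BB appear in the list because an object may be reused.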
        

2. Combinations

A combination is a selection of objects without regard to order. The number of combinations of ‘r’ objects taken from ‘n’ objects is given by:

    Formula:
    nCr = n! / (r!(n-r)!)
    Example:
    For n = 5, r = 2:
    5C2 = 5! / (2!(5-2)!) = (5 × 4) / (2 × 1) = 10
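
A minimal check of the formula using `math.comb` (Python 3.8+) and explicit enumeration. It also shows the link between the two counts: each selection of r objects can be ordered in r! ways, so nPr = nCr × r!.

```python
import math
from itertools import combinations

n, r = 5, 2
by_library = math.comb(n, r)
by_listing = len(list(combinations("ABCDE", r)))   # AB, AC, ..., DE

print(by_library, by_listing)                                   # 10 10
print(math.perm(n, r) == math.comb(n, r) * math.factorial(r))   # True
```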
        

3. Differences Between Permutations and Combinations

Permutations: Order matters.
Combinations: Order does not matter.

    Example:
    Objects: {A, B}
    Permutations: AB, BA (2 ways)
    Combinations: {A, B} (1 way)
        

4. Applications

Permutations and combinations have a wide range of applications in probability, statistics, and real-life problems.

    Examples:
    1. Determining seating arrangements (Permutations).
    2. Selecting a committee from a group of people (Combinations).
    3. Calculating probabilities in card games.
        

UNIT-V: Sample Space, Events and Probability

1. Experiments and Random Experiments

Experiment: A process or action that results in an outcome.
Random Experiment: An experiment whose outcome cannot be predicted with certainty in advance, even though the set of possible outcomes is known.

    Example:
    Experiment: Tossing a coin.
    Outcome: Heads or Tails (random).
        

2. Definition of Sample Space

The sample space is the set of all possible outcomes of a random experiment.

    Example:
    Tossing a coin: S = {Heads, Tails}
    Rolling a die: S = {1, 2, 3, 4, 5, 6}
        

3. Events and Types of Events

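An event is a subset of the sample space. Common types include mutually exclusive events (no outcomes in common), complementary events (A' contains every outcome of the sample space that is not in A), and exhaustive events (together they cover the whole sample space).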

    Example:
    Rolling a die:
    A = {1, 2, 3}, B = {4, 5, 6}
    Mutually Exclusive: A ∩ B = ∅
    Complement of A: A' = {4, 5, 6}
        

4. Classical Definition of Probability

When all outcomes in the sample space are equally likely, the probability of an event is the ratio of the number of favorable outcomes to the total number of outcomes.

    Formula:
    P(A) = Number of Favorable Outcomes / Total Number of Outcomes
    Example:
    Tossing a coin: P(Heads) = 1/2
    Rolling a die: P(Even) = 3/6 = 1/2
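
The classical definition amounts to counting. The helper name `probability` below is hypothetical, and the sketch assumes every outcome is equally likely.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Favorable outcomes / total outcomes, assuming equally likely outcomes."""
    favorable = event & sample_space   # ignore anything outside the sample space
    return Fraction(len(favorable), len(sample_space))

coin = {"Heads", "Tails"}
die = {1, 2, 3, 4, 5, 6}

p_heads = probability({"Heads"}, coin)
p_even = probability({2, 4, 6}, die)
print(p_heads, p_even)  # 1/2 1/2
```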
        

5. Addition Theorem of Probability (Without Proof)

For two events A and B:

    Formula:
    P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
    Example:
    A = Rolling an even number, B = Rolling a number > 3
    A = {2, 4, 6}, B = {4, 5, 6}, A ∩ B = {4, 6}
    P(A) = 3/6, P(B) = 3/6, P(A ∩ B) = 2/6
    P(A ∪ B) = 3/6 + 3/6 - 2/6 = 4/6 = 2/3
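
The theorem can be verified by counting outcomes directly. The sketch below uses a different illustrative set of events on one die roll (even number, multiple of 3, number greater than 4) and also checks the three-event form P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(A ∩ C) - P(B ∩ C) + P(A ∩ B ∩ C). The helper name `prob` is hypothetical.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even number
B = {3, 6}      # multiple of 3
C = {5, 6}      # number greater than 4

def prob(event):
    """Classical probability on a fair die."""
    return Fraction(len(event & die), len(die))

# Two events: right-hand side of the theorem versus a direct count of A ∪ B.
two_rhs = prob(A) + prob(B) - prob(A & B)
print(two_rhs, prob(A | B))        # 2/3 2/3

# Three events: inclusion-exclusion versus a direct count of A ∪ B ∪ C.
three_rhs = (prob(A) + prob(B) + prob(C)
             - prob(A & B) - prob(A & C) - prob(B & C)
             + prob(A & B & C))
print(three_rhs, prob(A | B | C))  # 5/6 5/6
```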
        

6. Conditional Probability

Conditional probability is the probability of an event A given that another event B has occurred.

    Formula:
    P(A|B) = P(A ∩ B) / P(B)
    Example:
    A = Rolling a 2, B = Rolling an even number
    P(A|B) = P(A ∩ B) / P(B) = (1/6) / (3/6) = 1/3
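
Conditioning on B is the same as shrinking the sample space to B. The sketch below computes P(A|B) from the formula and by counting inside B; the helper name `prob` is hypothetical.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}
A = {2}          # rolling a 2
B = {2, 4, 6}    # rolling an even number

def prob(event):
    """Classical probability on a fair die."""
    return Fraction(len(event & die), len(die))

by_formula = prob(A & B) / prob(B)             # P(A ∩ B) / P(B)
by_counting = Fraction(len(A & B), len(B))     # outcomes of A inside B

print(by_formula, by_counting)  # 1/3 1/3
```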
        

7. Independence of Two Events

Two events A and B are independent if the occurrence of one does not affect the probability of the other.

    Formula:
    P(A ∩ B) = P(A) × P(B)
    Example:
    Tossing two coins: 
    A = First coin shows Heads, B = Second coin shows Tails
    P(A ∩ B) = P(A) × P(B) = 1/2 × 1/2 = 1/4
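
Independence can be tested by enumerating the sample space and comparing P(A ∩ B) with P(A) × P(B). The event C (both coins show Heads) is an assumed extra example of a pair that fails the test.

```python
from fractions import Fraction
from itertools import product

sample_space = set(product("HT", repeat=2))   # ('H','H'), ('H','T'), ...

A = {o for o in sample_space if o[0] == "H"}  # first coin shows Heads
B = {o for o in sample_space if o[1] == "T"}  # second coin shows Tails
C = {("H", "H")}                              # both coins show Heads

def prob(event):
    """Classical probability over the four equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

a_b_independent = prob(A & B) == prob(A) * prob(B)
a_c_independent = prob(A & C) == prob(A) * prob(C)
print(a_b_independent, a_c_independent)  # True False
```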
        

UNIT-VI: Statistical Quality Control

1. Introduction to Statistical Quality Control

Statistical Quality Control (SQC) uses statistical methods to monitor and control a process to ensure that it operates at its full potential. It focuses on maintaining the quality of both processes and products.

    Example:
    Ensuring the production of defect-free components in manufacturing using quality control charts.
        

2. Control Limits, Specification Limits, and Tolerance Limits

    Example:
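    - Control Limits: Computed from process data (e.g., center line ± 3 standard deviations of the plotted statistic).
    - Specification Limits: Set by design or customer requirements (e.g., 10 ± 0.2 cm).
    - Tolerance Limits: The range within which individual items from the process naturally fall (e.g., process mean ± 3 process standard deviations).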
        

3. Process and Product Control

Process Control: Ensures that a process remains stable and consistent over time.
Product Control: Focuses on inspecting the final product to ensure quality standards are met.

    Example:
    - Process Control: Monitoring temperature during production.
    - Product Control: Inspecting finished items for defects.
        

4. Control Charts for X̄ and R

X̄-Chart: Plots the mean of each sample (subgroup) to monitor the process average over time.
R-Chart: Plots the range of each sample to monitor the variability of the process over time.

    Example:
    - X̄-Chart: Tracks average weight of items produced.
    - R-Chart: Tracks variation in weights of items produced.
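
A rough sketch of how the limits for the mean chart and the R-chart are usually computed from subgroup data. The weights below are hypothetical, and the constants A2, D3, D4 are the standard tabulated values for subgroups of size 5; neither appears in these notes.

```python
# Control-chart constants for subgroups of size n = 5 (standard SQC tables).
A2, D3, D4 = 0.577, 0.0, 2.114

# Hypothetical item weights (grams): four subgroups of five items each.
subgroups = [
    [50.1, 49.8, 50.3, 50.0, 49.9],
    [50.2, 50.1, 49.7, 50.0, 50.3],
    [49.9, 50.0, 50.1, 49.8, 50.2],
    [50.0, 50.4, 49.9, 50.1, 49.8],
]

means = [sum(g) / len(g) for g in subgroups]    # one point per subgroup
ranges = [max(g) - min(g) for g in subgroups]   # one point per subgroup

x_double_bar = sum(means) / len(means)    # center line of the mean chart
r_bar = sum(ranges) / len(ranges)         # center line of the R-chart

ucl_x = x_double_bar + A2 * r_bar         # mean chart limits
lcl_x = x_double_bar - A2 * r_bar
ucl_r = D4 * r_bar                        # R-chart limits
lcl_r = D3 * r_bar

# Subgroups whose mean falls outside the control limits (numbered from 1).
out_of_control = [i for i, m in enumerate(means, start=1)
                  if not lcl_x <= m <= ucl_x]

print(f"Mean chart: LCL={lcl_x:.3f}, CL={x_double_bar:.3f}, UCL={ucl_x:.3f}")
print(f"R chart:    LCL={lcl_r:.3f}, CL={r_bar:.3f}, UCL={ucl_r:.3f}")
print("Out-of-control subgroups:", out_of_control)
```

In practice many more subgroups (often 20 or more) are used before the limits are trusted.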
        

5. Control Charts for Number of Defectives (n-p Chart)

The n-p chart monitors the number of defective items in samples of constant size. It is used when each inspected item is classified as either defective or non-defective.

    Example:
    - Sample size: 100 items.
    - Defective items: Tracked over several batches.
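
A minimal sketch of the usual n-p chart limits, n·p̄ ± 3·√(n·p̄·(1 - p̄)), which follow from the binomial distribution. The formula is the standard textbook one rather than something stated in these notes, and the defective counts are hypothetical.

```python
from math import sqrt

n = 100   # constant sample size, as in the example above

# Hypothetical number of defective items found in eight batches.
defectives = [4, 6, 5, 3, 7, 5, 4, 6]

p_bar = sum(defectives) / (n * len(defectives))   # average fraction defective
center = n * p_bar                                # center line
sigma = sqrt(n * p_bar * (1 - p_bar))             # binomial standard deviation

ucl = center + 3 * sigma
lcl = max(0.0, center - 3 * sigma)   # a count cannot be negative

print(f"LCL={lcl:.2f}, CL={center:.2f}, UCL={ucl:.2f}")
```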
        

6. Control Charts for Number of Defects (c-Chart)

The c-chart monitors the count of defects in a sample of constant size. It is used when defects can occur multiple times on a single item.

    Example:
    - Product: A fabric roll.
    - Defects: Number of tears, stains, or misprints per roll.
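
A minimal sketch of the usual c-chart limits, c̄ ± 3·√c̄, which follow from modeling the defect count with a Poisson distribution. The formula is the standard textbook one rather than something stated in these notes, and the defect counts per roll are hypothetical.

```python
from math import sqrt

# Hypothetical number of defects counted on six fabric rolls.
defects_per_roll = [3, 5, 2, 4, 6, 4]

c_bar = sum(defects_per_roll) / len(defects_per_roll)   # center line
ucl = c_bar + 3 * sqrt(c_bar)
lcl = max(0.0, c_bar - 3 * sqrt(c_bar))   # a count cannot be negative

print(f"LCL={lcl:.1f}, CL={c_bar:.1f}, UCL={ucl:.1f}")  # LCL=0.0, CL=4.0, UCL=10.0
```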