Definition: A random variable is a numerical value associated with an
[[Experiment]] whose value can change from one replicate of the experiment to
another. Its distribution describes how likely the different values are.

A proper definition would need [[Probability space]] and [[Measurable function]]s.

For example, given a fair die roll, the number shown is a random variable $X$
with possible values

$$X \in \{1,2,3,4,5,6\}$$

A quantity $Y$ that can take any value in an interval,

$$Y \in [30, 260]$$

is another.
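
To make "can change from one replicate to another" concrete, here is a minimal Python sketch that repeats the die-roll experiment a few times. The name `roll_die` and the fixed seed are assumptions made for illustration only.

```python
import random

def roll_die(rng: random.Random) -> int:
    """One replicate of the experiment: roll a fair six-sided die."""
    return rng.randint(1, 6)  # randint includes both endpoints

rng = random.Random(0)  # fixed seed only so the run is repeatable
rolls = [roll_die(rng) for _ in range(10)]
print(rolls)  # ten observed values of X, each in {1, ..., 6}
```

Each entry of `rolls` is one observed value of $X$; the list generally contains different numbers, which is exactly what makes $X$ random.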

Two types: Discrete random variables and continuous random variables.

# [[Discrete]] random variable

A random variable is discrete if its set of possible values is finite or
countably infinite.

## Probability mass function

Also known as the pmf. Every discrete random variable has a corresponding pmf.
Denoted

$$p(x) = P(X = x)$$
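
As an illustration, the pmf of a fair six-sided die can be written as a small function. This is a sketch under the assumption of a fair die; `pmf_fair_die` is a made-up name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def pmf_fair_die(x: int) -> Fraction:
    """p(x) = P(X = x) for a fair six-sided die (assumed example)."""
    if x in range(1, 7):
        return Fraction(1, 6)
    return Fraction(0)  # values the die cannot show have probability 0

print(pmf_fair_die(3))  # 1/6
print(pmf_fair_die(7))  # 0
```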

## Table

Every discrete random variable also has a corresponding table.

|$X$|$x_1$|$x_2$|$...$|$x_n$|
|--|--|--|--|--|
|$p(x)$|$p(x_1)$|$p(x_2)$|$...$|$p(x_n)$|

where

$$p(x_i) \ge 0 \quad \forall i \in \{1,2,\dots,n\}$$
$$\sum_{i=1}^n p(x_i) = 1$$
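
The two conditions on the table can be checked mechanically. The sketch below assumes the table is stored as a Python `dict` mapping each value $x_i$ to $p(x_i)$; the helper `is_valid_pmf` and both example tables are hypothetical.

```python
import math

def is_valid_pmf(table: dict, tol: float = 1e-9) -> bool:
    """Check p(x_i) >= 0 for all i and that the p(x_i) sum to 1."""
    probs = list(table.values())
    non_negative = all(p >= 0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

# Hypothetical tables, not taken from the note.
loaded_die = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.5}
broken = {0: 0.7, 1: 0.5, 2: -0.2}  # sums to 1 but has a negative entry

print(is_valid_pmf(loaded_die))  # True
print(is_valid_pmf(broken))      # False
```

The tolerance is there because floating-point sums rarely hit 1 exactly.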

# [[Continuous]] random variable

The rest, roughly: the possible values fill a continuum, e.g. some interval on
the number line.

## Probability density function

Also known as the pdf. Every continuous random variable has a corresponding pdf.
Denoted $f(x)$ where

$$\int_a^b f(x) dx = P(a \le X \le b)$$

and

$$f(x) \ge 0 \quad \forall x$$
$$\int_{-\infty}^\infty f(x) dx = 1$$
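
The integral properties can be checked numerically for a concrete density. The sketch below uses a hypothetical density $f(x) = 2x$ on $[0, 1]$ (not from the note) and a plain midpoint rule; `integrate` is an illustrative helper, not a library function.

```python
def integrate(f, a: float, b: float, n: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f(x: float) -> float:
    """Hypothetical density: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

total = integrate(f, 0.0, 1.0)          # total probability, close to 1
p_half_to_one = integrate(f, 0.5, 1.0)  # P(0.5 <= X <= 1) = 1 - 0.25 = 0.75
print(total, p_half_to_one)
```

Since $f$ is zero outside $[0, 1]$, integrating over that interval is the same as integrating over the whole real line.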

# Cumulative distribution function

Also known as the cdf.

$$F(x) = P(X \le x)$$

For discrete random variables:

$$F(x) = \sum_{x_i \le x} p(x_i)$$
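
A minimal sketch of the discrete case: $F(x)$ adds up $p(x_i)$ over all possible values $x_i \le x$. The fair-die table is an assumed example and `cdf` is an illustrative name.

```python
from fractions import Fraction

# Assumed example: pmf table of a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x: float) -> Fraction:
    """F(x) = P(X <= x): sum p(x_i) over all values x_i <= x."""
    return sum((p for x_i, p in pmf.items() if x_i <= x), Fraction(0))

print(cdf(3))    # 1/2
print(cdf(3.5))  # 1/2, F is flat between possible values
print(cdf(0))    # 0
print(cdf(6))    # 1
```

Note that the argument of $F$ is a value on the number line, not an index into the table.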

And for continuous random variables:

$$F(x) = \int_{-\infty}^x f(y) dy$$

Here we see that

$$F'(x) = f(x)$$

for continuous random variables, at every $x$ where $f$ is continuous. This is
the fundamental theorem of calculus.
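
The relation $F'(x) = f(x)$ can be sanity-checked numerically. The sketch below uses the density $f(x) = 2e^{-2x}$ from the waiting-time example and its cdf $F(x) = 1 - e^{-2x}$ (obtained by integrating $f$ from 0 to $x$); the `derivative` helper is an illustrative central difference, not a library call.

```python
import math

def f(x: float) -> float:
    """pdf from the waiting-time example: 2 e^{-2x} for x > 0."""
    return 2.0 * math.exp(-2.0 * x) if x > 0 else 0.0

def F(x: float) -> float:
    """Its cdf, obtained by integrating f: 1 - e^{-2x} for x > 0."""
    return 1.0 - math.exp(-2.0 * x) if x > 0 else 0.0

def derivative(g, x: float, h: float = 1e-5) -> float:
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

for x in (0.5, 1.0, 2.0):
    # The last two columns should nearly agree.
    print(x, derivative(F, x), f(x))
```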

# Examples

## Waiting time (useful model)

Let $X$ be the waiting time between calls in a phone center. Assume $X$ is a
continuous random variable with pdf

$$f(x) = 2e^{-2x} \quad x \gt 0$$

What is $P(X \gt 3)$?

$$P(X \gt 3) = \int_3^\infty f(x) dx = \int_3^\infty 2e^{-2x}dx = \left[-e^{-2x}\right]_3^\infty = e^{-6} \approx 0.0025$$

Strictly speaking,

$$f(x) = \begin{cases} 2e^{-2x} & x \gt 0 \\ 0 & \text{otherwise} \end{cases}$$

but the 0-case is usually left implicit.
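
As a rough cross-check of $P(X \gt 3) = e^{-6}$, one can simulate waiting times and count how often they exceed 3. This is a sketch, assuming Python's standard `random.expovariate` with rate 2 as the sampler for this density; the sample size and seed are arbitrary choices.

```python
import math
import random

rate = 2.0       # the constant in f(x) = 2 e^{-2x}
threshold = 3.0

exact = math.exp(-rate * threshold)  # e^{-6}, from the integral above

rng = random.Random(1)  # fixed seed only so the run is repeatable
n = 200_000
hits = sum(rng.expovariate(rate) > threshold for _ in range(n))
estimate = hits / n     # relative frequency of waits longer than 3

print(f"exact    = {exact:.5f}")
print(f"estimate = {estimate:.5f}")
```

Because the event is rare (about 1 in 400), the estimate is noisy and only agrees with the exact value to a couple of significant digits at this sample size.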