Mirror of https://github.com/python/cpython.git (synced 2024-11-28 12:31:14 +08:00)
Add another example to the statistics docs (GH-107904)
parent 9b75ada6e4
commit 2b6dc2accc
Doc/library/kde_example.png
Binary file not shown. Size: 324 KiB
@@ -922,6 +922,10 @@ of applications in statistics.
:class:`NormalDist` Examples and Recipes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Classic probability problems
****************************

:class:`NormalDist` readily solves classic probability problems.

For example, given `historical data for SAT exams
@@ -947,6 +951,10 @@ Find the `quartiles <https://en.wikipedia.org/wiki/Quartile>`_ and `deciles
    >>> list(map(round, sat.quantiles(n=10)))
    [810, 896, 958, 1011, 1060, 1109, 1162, 1224, 1310]


Monte Carlo inputs for simulations
**********************************

To estimate the distribution for a model that isn't easy to solve
analytically, :class:`NormalDist` can generate input samples for a `Monte
Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
@@ -963,6 +971,9 @@ Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
    >>> quantiles(map(model, X, Y, Z))       # doctest: +SKIP
    [1.4591308524824727, 1.8035946855390597, 2.175091447274739]

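As a rough illustration of the same approach, the sketch below pushes two
uncertain inputs through a made-up model. The ``area()`` function, the input
distributions, and the seeds are assumptions chosen for this illustration;
they are not the values used in the example above.

```python
from statistics import NormalDist, quantiles

def area(width, height):
    # Hypothetical model, chosen only for illustration.
    return width * height

n = 10_000

# Assumed input distributions; the fixed seeds make the run repeatable.
W = NormalDist(10, 0.5).samples(n, seed=1)
H = NormalDist(4, 0.2).samples(n, seed=2)

# The quartiles of the simulated outputs summarize the model's distribution.
q1, q2, q3 = quantiles(map(area, W, H))
print(q1, q2, q3)
```

With these assumed inputs, the median should land near the product of the
two means, and the spread between the quartiles reflects the combined
uncertainty of both inputs.
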
Approximating binomial distributions
************************************

Normal distributions can be used to approximate `Binomial
distributions <https://mathworld.wolfram.com/BinomialDistribution.html>`_
when the sample size is large and when the probability of a successful
@@ -1000,6 +1011,10 @@ probability that the Python room will stay within its capacity limits?
    >>> mean(trial() <= k for i in range(10_000))
    0.8398

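The quality of the approximation can also be checked against an exact
binomial sum. The sketch below uses made-up values of ``n``, ``p``, and ``k``
(assumptions for illustration, not the figures from the conference example)
and applies a continuity correction of 0.5.

```python
from math import comb, sqrt
from statistics import NormalDist

# Hypothetical parameters: n trials, success probability p, threshold k.
n, p, k = 100, 0.5, 55

# Exact P(X <= k) from the binomial probability mass function.
exact = sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

# Normal approximation with matching mean and standard deviation,
# evaluated at k + 0.5 as a continuity correction.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(k + 0.5)

print(exact, approx)
```

For these values the two results should agree closely. The agreement tends
to worsen as ``p`` moves toward 0 or 1 or as ``n`` shrinks.
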
Naive Bayesian classifier
*************************

Normal distributions commonly arise in machine learning problems.

Wikipedia has a `nice example of a Naive Bayesian Classifier
@@ -1054,6 +1069,48 @@ The final prediction goes to the largest posterior. This is known as the
    'female'


Kernel density estimation
*************************

It is possible to estimate a continuous probability density function
from a fixed number of discrete samples.

The basic idea is to smooth the data using `a kernel function such as a
normal distribution, triangular distribution, or uniform distribution
<https://en.wikipedia.org/wiki/Kernel_(statistics)#Kernel_functions_in_common_use>`_.
The degree of smoothing is controlled by a single
parameter, ``h``, representing the variance of the kernel function.

.. testcode::

   import math
   from statistics import NormalDist

   def kde_normal(sample, h):
       "Create a continuous probability density function from a sample."
       # Smooth the sample with a normal distribution of variance h.
       kernel_h = NormalDist(0.0, math.sqrt(h)).pdf
       n = len(sample)
       def pdf(x):
           return sum(kernel_h(x - x_i) for x_i in sample) / n
       return pdf

`Wikipedia has an example
<https://en.wikipedia.org/wiki/Kernel_density_estimation#Example>`_
where we can use the ``kde_normal()`` recipe to generate and plot
a probability density function estimated from a small sample:

.. doctest::

   >>> sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
   >>> f_hat = kde_normal(sample, h=2.25)
   >>> xarr = [i/100 for i in range(-750, 1100)]
   >>> yarr = [f_hat(x) for x in xarr]

The points in ``xarr`` and ``yarr`` can be used to make a PDF plot:

.. image:: kde_example.png
   :alt: Scatter plot of the estimated probability density function.

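One quick sanity check on an estimate like this is that it integrates to
roughly one. The sketch below repeats the ``kde_normal()`` recipe so that it
is self-contained, then approximates the area under the curve with a Riemann
sum over the same grid. The ``step`` and ``area`` names are introduced here
for illustration only.

```python
import math
from statistics import NormalDist

def kde_normal(sample, h):
    "Create a continuous probability density function from a sample."
    # Smooth the sample with a normal distribution of variance h.
    kernel_h = NormalDist(0.0, math.sqrt(h)).pdf
    n = len(sample)
    def pdf(x):
        return sum(kernel_h(x - x_i) for x_i in sample) / n
    return pdf

sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
f_hat = kde_normal(sample, h=2.25)

# Evaluate the estimate on an evenly spaced grid from -7.5 to 11.0.
step = 0.01
xarr = [i * step for i in range(-750, 1100)]
yarr = [f_hat(x) for x in xarr]

# A Riemann sum approximates the area under the estimated density.
area = sum(yarr) * step
print(area)
```

The result should be slightly below one, because the grid cuts off a small
amount of probability mass in both tails.
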
..
   # These modelines must appear within the last ten lines of the file.
   kate: indent-width 3; remove-trailing-space on; replace-tabs on; encoding utf-8;