The Central Limit Theorem (CLT) states that if you draw sufficiently large independent random samples from a population with finite variance
(whatever the shape of its distribution), the distribution of the sample means approximates a normal distribution (a bell curve). As the sample size increases, the approximation improves and the sample means concentrate more tightly around the true population mean.
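The claim can be checked numerically. The sketch below is not part of the original script; it uses only the standard library, and the name `sample_mean_spread`, the seed, and the trial counts are illustrative choices. It draws integers uniformly from -40 to 39 (the same range the script later in this post uses) and compares the observed standard deviation of the sample means with the value the CLT predicts, sigma / sqrt(n).

```python
import math
import random
import statistics

LOW, HIGH = -40, 40          # integers LOW..HIGH-1, i.e. 80 equally likely values
K = HIGH - LOW
POP_SIGMA = math.sqrt((K * K - 1) / 12)   # std. dev. of a discrete uniform on K values


def sample_mean_spread(n, trials=2000, seed=0):
    """Return (observed, predicted) std. dev. of the mean of n draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.randrange(LOW, HIGH) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.stdev(means), POP_SIGMA / math.sqrt(n)


if __name__ == "__main__":
    for n in (25, 100, 400):
        observed, predicted = sample_mean_spread(n)
        print(f"n={n:4d}  observed={observed:.3f}  predicted={predicted:.3f}")
```

Quadrupling n should roughly halve both columns, which is the "concentrates around the mean" behaviour described above.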
Set up python3.14t from source, following Google's directives, on Arch Linux with the CachyOS kernel and COSMIC 1.0.10 preinstalled
Tune ~/.bashrc by adding the following lines to the bottom
export CFLAGS="-march=native -O3 -pipe -fno-plt"
export CXXFLAGS="-march=native -O3 -pipe -fno-plt"
export LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now"
$ source ~/.bashrc
======================
Install required packages
======================
$ sudo pacman -S base-devel openssl xz tk libffi libxcrypt-compat
$ tar -xf Python-3.14.4.tar.xz
$ cd Python-3.14.4
Run configure and build
$ ./configure \
--enable-optimizations \
--with-lto \
--with-computed-gotos \
--disable-gil \
--enable-loadable-sqlite-extensions
$ make -j10
$ sudo make altinstall
$ ls -l /usr/local/bin/python3.14t
-rwxr-xr-x 2 root root 36139208 Apr 23 11:07 /usr/local/bin/python3.14t
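Beyond checking that the binary exists, it is worth confirming that it really is a free-threaded build. A minimal sketch, assuming it is saved as `check_gil.py` and run as `python3.14t check_gil.py` (the file name and the helper `gil_status` are made up for this example): `Py_GIL_DISABLED` is the build-time config variable set by `--disable-gil`, and `sys._is_gil_enabled()` reports the runtime state on Python 3.13 and later.

```python
import sys
import sysconfig


def gil_status():
    """Report whether this interpreter was built free-threaded and whether the GIL is active."""
    built_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() only exists on Python 3.13+; on older versions the GIL is always on
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": built_free_threaded, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(sys.version)
    print(gil_status())
```

Importing a C extension that does not declare free-threading support can switch the GIL back on at runtime, so it may be useful to call `gil_status()` again after importing the packages the script needs.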
Set up a venv for Python in the folder CLTDEMO
$ python3.14t -m venv .env
$ source .env/bin/activate
$ pip install aqtinstall
$ pip install numpy matplotlib cxroots
Create a Python script for testing
❯ cat multithrCLT01.py
import numpy as np
import matplotlib.pyplot as plt
from concurrent.futures import ThreadPoolExecutor


def generate_means(size):
    """Return 10000 means of samples of `size` integers drawn uniformly from [-40, 40)."""
    # default_rng() with no seed pulls fresh entropy from the OS, so each
    # call (and therefore each worker thread) gets an independent generator
    rng = np.random.default_rng()
    return [np.mean(rng.integers(-40, 40, size)) for _ in range(10000)]


sample_sizes = [50, 100, 500, 700, 900, 1200, 1400, 1600]

# Use ThreadPoolExecutor to run the generation in parallel
with ThreadPoolExecutor() as executor:
    # map ensures the results stay in the same order as sample_sizes
    all_sample_means = list(executor.map(generate_means, sample_sizes))

# Plotting (Matplotlib is not thread-safe, so we keep it in the main thread)
fig, axes = plt.subplots(4, 2, figsize=(10, 10))
for ax, means, size in zip(axes.flatten(), all_sample_means, sample_sizes):
    ax.hist(means, bins=20, density=True, alpha=0.75,
            color='blue', edgecolor='black')
    ax.set_title(f"Sample size = {size}")
    ax.set_xlabel("Sample Mean")
    ax.set_ylabel("Density")
    ax.grid(True, linestyle='--', alpha=0.5)

plt.tight_layout()
plt.show()
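One property of the script above is that every run produces different numbers, because each call to `default_rng()` is seeded from OS entropy. If reproducible runs are wanted, one option (a sketch, not what the original script does) is to spawn one child seed per task from a single `SeedSequence`. The root seed 12345, the function names `generate_means_seeded` and `run`, and the vectorised draw are choices made for this example.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def generate_means_seeded(size, seed_seq, n_means=10000):
    """Return n_means sample means, each over `size` integers from [-40, 40)."""
    rng = np.random.default_rng(seed_seq)
    # Draw all samples at once as an (n_means, size) array, then average each row
    return rng.integers(-40, 40, size=(n_means, size)).mean(axis=1)


def run(sample_sizes, root_seed=12345):
    # spawn() yields statistically independent child seeds, one per task
    children = np.random.SeedSequence(root_seed).spawn(len(sample_sizes))
    with ThreadPoolExecutor() as executor:
        return list(executor.map(generate_means_seeded, sample_sizes, children))


if __name__ == "__main__":
    sizes = [50, 100, 500]
    first, second = run(sizes), run(sizes)
    # Two runs with the same root seed should agree element by element
    print(all(np.array_equal(a, b) for a, b in zip(first, second)))
```

Each task still owns its own generator, so no generator is shared between threads, and the results no longer depend on how the threads happen to be scheduled.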
Snapshot of the desktop prepared for testing Python, showing the window layout being switched from tiling to floating and back.
Significance in Statistics
The CLT is fundamental to inferential statistics because it justifies hypothesis tests and confidence intervals even when the underlying data is not normally distributed. Because sample means are approximately normal for large samples, researchers can quantify how far a sample mean is likely to lie from the true population mean.
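As a concrete illustration of the confidence-interval use case, the following sketch builds a normal-approximation 95% interval for a mean and then checks by simulation how often such intervals cover the true mean. It uses only the standard library; the helper names `mean_ci` and `coverage`, the seed, and the choice of the same -40 to 39 integer population (true mean -0.5) are assumptions made for the example.

```python
import math
import random
import statistics

Z95 = statistics.NormalDist().inv_cdf(0.975)   # about 1.96


def mean_ci(sample, z=Z95):
    """Normal-approximation confidence interval for the population mean."""
    m = statistics.fmean(sample)
    half_width = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return m - half_width, m + half_width


def coverage(n=100, trials=1000, true_mean=-0.5, seed=0):
    """Fraction of intervals, over repeated samples, that contain true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.randrange(-40, 40) for _ in range(n)]
        low, high = mean_ci(sample)
        hits += low <= true_mean <= high
    return hits / trials


if __name__ == "__main__":
    print(f"coverage of the nominal 95% interval: {coverage():.3f}")
```

If the normal approximation is good at this sample size, the printed coverage should land close to 0.95, even though the population itself is uniform rather than normal.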
REFERENCES
https://www.geeksforgeeks.org/python/python-central-limit-theorem/


