# Deriving the Cost Function for Multivariate Linear Regression

$$h_{\theta}(x)=\theta_1x_1+\theta_2x_2+\dots+\theta_nx_n=\sum_{i=1}^{n}\theta_ix_i=\theta^Tx$$
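As a small illustration of the hypothesis, the sum over features is just a dot product between the parameter vector and the feature vector. The sketch below assumes NumPy; the function name `hypothesis` and the sample numbers are made up for illustration.

```python
import numpy as np

def hypothesis(theta, x):
    """Return h_theta(x) = theta^T x for a single feature vector x."""
    return float(np.dot(theta, x))

# Illustrative values only (not from the original post).
theta = np.array([2.0, -1.0, 0.5])
x = np.array([1.0, 3.0, 4.0])

# 2.0*1.0 + (-1.0)*3.0 + 0.5*4.0 = 1.0
prediction = hypothesis(theta, x)
print(prediction)
```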

$$y^{(i)}=h_{\theta}(x^{(i)})+\epsilon^{(i)} \tag{1}$$

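$\epsilon^{(i)}$ is the error between the true value and the predicted value. We usually assume that the $\epsilon^{(i)}$ are independent and identically distributed, following a Gaussian distribution with mean 0 and variance $\sigma^2$.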

$$p(\epsilon^{(i)})=\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(\epsilon^{(i)})^2}{2\sigma^2}\right) \tag{2}$$

Substituting $\epsilon^{(i)}=y^{(i)}-\theta^Tx^{(i)}$ from (1) into (2) gives the conditional density of $y^{(i)}$:

$$p(y^{(i)}\mid x^{(i)};\theta)=\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(y^{(i)}-\theta^Tx^{(i)})^2}{2\sigma^2}\right)$$

$$L(\theta)=\prod_{i=1}^{m}\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(y^{(i)}-\theta^Tx^{(i)})^2}{2\sigma^2}\right)$$

$$
\begin{aligned}
\ln L(\theta)&=\ln\prod_{i=1}^{m}\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(y^{(i)}-\theta^Tx^{(i)})^2}{2\sigma^2}\right)\\
&=\sum_{i=1}^{m}\ln\left[\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(y^{(i)}-\theta^Tx^{(i)})^2}{2\sigma^2}\right)\right]\\
&=m\ln\frac{1}{\sqrt{2\pi}\sigma}-\frac{1}{\sigma^2}\cdot\frac{1}{2}\sum_{i=1}^{m}(y^{(i)}-\theta^Tx^{(i)})^2
\end{aligned}
$$
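The closed form of the log-likelihood can be checked numerically. The sketch below assumes NumPy and synthetic data; the function names, the random seed, and the values of `true_theta` and `sigma` are all hypothetical. It sums the log of each Gaussian density directly and compares the result with the simplified expression.

```python
import numpy as np

def log_likelihood_direct(theta, X, y, sigma):
    """Sum of ln p(y_i | x_i; theta), evaluated density by density."""
    residuals = y - X @ theta
    densities = np.exp(-residuals**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return float(np.sum(np.log(densities)))

def log_likelihood_closed_form(theta, X, y, sigma):
    """m * ln(1 / (sqrt(2 pi) sigma)) - (1 / sigma^2) * (1/2) * sum of squared residuals."""
    m = len(y)
    residuals = y - X @ theta
    constant = m * np.log(1.0 / (np.sqrt(2 * np.pi) * sigma))
    return float(constant - (1.0 / sigma**2) * 0.5 * np.sum(residuals**2))

# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_theta = np.array([1.5, -2.0, 0.7])
sigma = 0.5
y = X @ true_theta + rng.normal(scale=sigma, size=50)

theta = np.array([1.0, -1.0, 0.0])
direct = log_likelihood_direct(theta, X, y, sigma)
closed = log_likelihood_closed_form(theta, X, y, sigma)
print(np.isclose(direct, closed))
```

Both functions should agree up to floating-point error for any `theta`, since the second is only an algebraic rearrangement of the first.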

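The term $m\ln\frac{1}{\sqrt{2\pi}\sigma}$ does not depend on $\theta$, and $\frac{1}{\sigma^2}$ is a positive constant, so maximizing $\ln L(\theta)$ is equivalent to minimizing the least squares objective:

$$J(\theta)=\frac{1}{2}\sum_{i=1}^{m}(y^{(i)}-\theta^Tx^{(i)})^2$$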

Dividing by the number of samples $m$ rescales the cost without changing its minimizer, which gives the usual form of the cost function:

$$J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}(y^{(i)}-\theta^Tx^{(i)})^2$$
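A minimal sketch of this cost function, assuming NumPy and synthetic data (the name `cost`, the seed, and the parameter values are hypothetical). It uses `np.linalg.lstsq` to obtain the least squares solution and checks that moving away from it increases $J(\theta)$.

```python
import numpy as np

def cost(theta, X, y):
    """J(theta) = 1/(2m) * sum over samples of (y_i - theta^T x_i)^2."""
    m = len(y)
    residuals = y - X @ theta
    return float(np.sum(residuals**2) / (2 * m))

# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.5, -2.0, 0.7])
y = X @ true_theta + rng.normal(scale=0.3, size=200)

# Least squares solution, i.e. the minimizer of J.
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Any other parameter vector should give a cost at least as large.
perturbed = theta_hat + np.array([0.1, 0.0, -0.1])
print(cost(theta_hat, X, y) < cost(perturbed, X, y))
```

Because the factor $\frac{1}{2m}$ is a positive constant, the same `theta_hat` also minimizes the unscaled sum of squared residuals.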