In statistics, many estimators maximize or minimize a function:
1. Maximum Likelihood
2. Least Squares
3. Method of Moments
4. Posterior Mode
Suppose we want to find a minimum or maximum of a function f(x). Sometimes f(x) will be very complicated. Are there computer algorithms that can help? YES!
1. Newton's Method
2. Quasi-Newton
3. Nelder-Mead
R's built-in optim() already implements the last two; a quick sketch follows before we build Newton's method by hand.
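A quick sketch (our own toy example, not from the notes): minimise f(x) = (x - 2)^2, whose minimum is at x = 2.
f <- function(x){(x - 2)^2}
optim(par = 0, fn = f, method = "BFGS")$par        #quasi-Newton, returns approx. 2
optim(par = 0, fn = f, method = "Nelder-Mead")$par #derivative-free, also approx. 2
#optim() warns that Nelder-Mead is unreliable in one dimension;
#optimize() is the usual choice for 1-D problems.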
g(x) = 3x^5 - 4x^4 + 6x^3 + 4x - 4
x = seq(-2, 2, 0.01)
g = 3*x^5 - 4*x^4 + 6*x^3 + 4*x - 4
plot(x, g, type = 'l')
To find the tangent, evaluate the first derivative of g(x). The function is g(x) = 3x^5 - 4x^4 + 6x^3 + 4x - 4 (1)
- The first derivative is g'(x) = 15x^4 - 16x^3 + 18x^2 + 4 (2)
Each Newton-Raphson step follows the tangent line down to the x-axis: x_{n+1} = x_n - g(x_n)/g'(x_n). A one-step check of this update appears below.
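Checking one update by hand from x0 = 1.4 (this matches the first iterate printed by Code 2 further below):
x0 = 1.4
g0 = 3*x0^5 - 4*x0^4 + 6*x0^3 + 4*x0 - 4 #g(1.4)
dg0 = 15*x0^4 - 16*x0^3 + 18*x0^2 + 4    #g'(1.4)
x0 - g0/dg0                              #1.044673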
g = 3*x^5 - 4*x^4 + 6*x^3 + 4*x - 4 #g evaluated on the grid x above
g1 = 15*x^4 - 16*x^3 + 18*x^2 + 4   #g'(x) on the same grid
x0 = 1.4 #Starting value used by findroot below
a = 1000 #Maximum number of iterations
findroot = function(a){
g = function(t){3*t^5 - 4*t^4 + 6*t^3 + 4*t - 4}
g1 = function(t){15*t^4 - 16*t^3 + 18*t^2 + 4}
x = rep(NA, a+1) #A vector to store the sequence of iterates
x[1] = x0 #Initialise at the starting value
for (i in 1:a) {
x[i+1] = x[i] - g(x[i])/g1(x[i]) #Newton-Raphson update
if(abs(g(x[i+1]))<=1E-10){break} #Stop once |g| is tiny
}
if(i == a) return('a is too small')
return(x[i+1])
}
# findroot(1000)
#First define function
g<-function(x){3*x^5 - 4*x^4 + 6*x^3 + 4*x - 4}
#Then define first derivative
dg<-function(x){15*x^4 - 16*x^3 + 18*x^2 + 4}
#Code 1 Newton-Raphson using "for" loop
nralgo1<-function(x0,func,deri){
epsilon<-1E-10 #Set Tolerance Level
MaxIter<-500 #Maximum Number of Iterations
x<-rep(NA,MaxIter+1) #A vector to store sequence of x
x[1]<-x0 # Initialise
for(n in 1:MaxIter){
x[n+1]<-x[n]-(func(x[n])/deri(x[n])) #Update Step
if(abs(func(x[n+1]))<=epsilon){break} #Stopping Rule
}
if(n==MaxIter){warning('Maximum number of iterations reached without convergence')}
return(x[n+1]) #Return value of x
}
nralgo1(1.4,g,dg)
## [1] 0.6618123
#CODE 2: Newton-Raphson using "while"
nralgo2<-function(x0,func,deri){
epsilon<-1E-10 #Tolerance Level
x<-x0 #Initialise
while(abs(func(x))>epsilon){
x<-x-(func(x)/deri(x)) #Update step
print(x) #Print each iterate to show the path to convergence
}
return(x)
}
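The while version has no cap on iterations, so it can loop forever if the sequence diverges. A guarded variant (ours, not from the notes) adds the same MaxIter safeguard as Code 1:
nralgo2b<-function(x0,func,deri){
epsilon<-1E-10 #Tolerance Level
MaxIter<-500 #Maximum Number of Iterations
n<-0
x<-x0 #Initialise
while(abs(func(x))>epsilon && n<MaxIter){
x<-x-(func(x)/deri(x)) #Update step
n<-n+1
}
if(n==MaxIter){warning('Maximum number of iterations reached without convergence')}
return(x)
}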
#With all functions defined we can now do optimisation:
x0<-1.4
#Use Code 1
root1=nralgo1(x0,g,dg)
print(paste('Root using Code 1:',round(root1,3)))
## [1] "Root using Code 1: 0.662"
print(paste('Function Evaluated at this Root',round(g(root1),3)))
## [1] "Function Evaluated at this Root 0"
#Using Code 2
root2<-nralgo2(x0,g,dg)
## [1] 1.044673
## [1] 0.7873304
## [1] 0.6768874
## [1] 0.6620379
## [1] 0.6618123
## [1] 0.6618123
print(paste('Root using Code 2:',round(root2,3)))
## [1] "Root using Code 2: 0.662"
print(paste('Function Evaluated at this Root',round(g(root2),3)))
## [1] "Function Evaluated at this Root 0"
Next example: a cubic, g(x) = x^3 - 2x^2 - 11x + 12
Try two different starting values: x0 = 2.35287527 and x0 = 2.35284172.
g = function(x){x^3 - 2*x^2 - 11*x + 12}
dg = function(x){ 3*x^2 - 4*x - 11}
x0 = 2.35287527
nralgo1(x0,g,dg)
## [1] 4
x0 = 2.35284172
nralgo1(x0,g,dg)
## [1] -3
The two results are very different: starting values that agree to four decimal places converge to two different roots, 4 and -3.
So Newton-Raphson has two possible outcomes, convergence and non-convergence, and which root (if any) it finds depends on the starting value; always check that the code has actually converged.
Different algorithms also have different convergence rates, and a fast-converging algorithm is not necessarily the most robust; the sketch below illustrates the trade-off.
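A rough illustration (our sketch, not from the notes): count the iterations Newton-Raphson and plain bisection need to locate the root of the cubic at x = 4, using the g and dg defined just above. Newton converges quadratically; bisection only linearly, but it never diverges on a sign-changing bracket.
newton_iters = function(x0,func,deri,eps=1E-10){
n<-0; x<-x0
while(abs(func(x))>eps && n<500){x<-x-func(x)/deri(x); n<-n+1}
n
}
bisect_iters = function(lo,hi,func,eps=1E-10){
n<-0
while(hi-lo>eps){
mid<-(lo+hi)/2
if(func(lo)*func(mid)<=0){hi<-mid}else{lo<-mid} #Keep the half with a sign change
n<-n+1
}
n
}
newton_iters(3,g,dg) #A handful of iterations
bisect_iters(3,5,g)  #Around 35 iterations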
If an extremum of f(x) sits exactly where the first derivative is zero, f'(x) = 0,
then the optimisation problem turns into a root-finding problem: find the root of f'(x) = 0,
using the second derivative f''(x) in the Newton-Raphson update x_{n+1} = x_n - f'(x_n)/f''(x_n). A minimal sketch follows.
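A minimal sketch, reusing nralgo1 and the cubic's derivative dg from above; the second derivative d2g and the starting value 3 are our additions:
d2g<-function(x){6*x - 4} #Second derivative of the cubic
xstar<-nralgo1(3,dg,d2g)  #Root of dg, i.e. a critical point of g
xstar                     #Approx. 2.694
d2g(xstar)>0              #TRUE, so this critical point is a local minimum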
The extremum found by Newton iteration is only a local one; only in certain cases (for example, when f is globally convex or concave) will it be the global extremum.
Exercise: take the likelihood function of the brisk-walking (step-count) data, treat the variance as a constant, and ask whether Newton iteration can find the maximum of the likelihood function. A hedged sketch with simulated stand-in data follows.
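A sketch under stated assumptions: we do not have the course data, so we simulate a stand-in sample and maximise the Normal log-likelihood for the mean with the variance fixed at 1, by running nralgo1 on the score function:
set.seed(1)
y<-rnorm(50,mean=3,sd=1)       #Stand-in for the step-count data (assumption)
score<-function(mu){sum(y-mu)} #First derivative of the log-likelihood in mu
hess<-function(mu){-length(y)} #Second derivative (constant and negative, so a maximum)
muhat<-nralgo1(0,score,hess)   #Root of the score = maximum likelihood estimate
c(muhat,mean(y))               #Newton-Raphson recovers the sample mean exactly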