Q: scipy.optimize.minimize: compute Hessian and gradient together

The scipy.optimize.minimize function is essentially the equivalent of MATLAB's fminunc for finding local minima of functions.

In scipy, functions for the gradient and Hessian are separate.

res = minimize(rosen, x0, method='Newton-CG',
               jac=rosen_der, hess=rosen_hess,
               options={'xtol': 1e-30, 'disp': True})

However, I have a function whose Hessian and gradient share quite a few computations, and I'd like to compute them together for efficiency. With fminunc, the objective function can be written to return multiple values, i.e.:

function [ q, grad, Hessian ] = rosen(x)

Is there a good way to pass in a function to scipy.optimize.minimize that can compute these elements together?
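(A note on this: for the value/gradient pair, though not the Hessian, scipy.optimize.minimize does support this pattern directly: passing jac=True tells it that the objective callable returns a (value, gradient) tuple. A minimal sketch of that built-in option, using an illustrative toy quadratic in place of the Rosenbrock function:)

import numpy as np
from scipy.optimize import minimize

def f_and_grad(x):
    # value and gradient computed in one pass
    fx = x[0]**2 + x[1]**2
    grad = np.array([2.0 * x[0], 2.0 * x[1]])
    return fx, grad

# jac=True: minimize unpacks the (value, gradient) tuple itself;
# the Hessian must still be a separate callable.
res = minimize(f_and_grad, np.array([1.0, 2.0]), method='Newton-CG',
               jac=True, hess=lambda x: np.array([[2.0, 0.0], [0.0, 2.0]]))

The Hessian still has to be supplied as a separate callable, which is the gap the answer below works around.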

Answer 1:

You could go for a caching solution, but with two caveats: first, NumPy arrays are not hashable, and second, you only need to cache a few values, depending on how much the algorithm jumps back and forth over x. If the algorithm only moves from one point to the next, you can cache just the last computed point, with your f_jac and f_hes being simple lambda interfaces to a longer function that computes both:

import numpy as np

# Example: f(x, y) = x**2 + y**2, with x, y the 1st and 2nd elements of x below.
def f(x):
    return x[0]**2 + x[1]**2

def f_jac_hess(x):
    # If x is the last point we evaluated at, return the cached pair.
    if np.array_equal(x, f_jac_hess.lastx):
        print('fetch cached value')
        return f_jac_hess.lastf
    print('new elaboration')
    # Gradient and Hessian of f; in a real problem these would share work.
    res = np.array([2.0 * x[0], 2.0 * x[1]]), np.array([[2.0, 0.0], [0.0, 2.0]])

    # Store a copy of x so in-place changes by the caller cannot corrupt the cache.
    f_jac_hess.lastx = np.array(x, dtype=float)
    f_jac_hess.lastf = res

    return res

# Seed the cache with NaNs, which never compare equal to any point.
f_jac_hess.lastx = np.full(2, np.nan)

f_jac = lambda x: f_jac_hess(x)[0]
f_hes = lambda x: f_jac_hess(x)[1]

Now a second call at the same point fetches the cached value:

>>> f_jac([3, 2])
new elaboration
array([6., 4.])
>>> f_hes([3, 2])
fetch cached value
array([[2., 0.],
       [0., 2.]])

You then call it as:

from scipy.optimize import minimize

res = minimize(f, np.array([1.0, 2.0]), method='Newton-CG',
               jac=f_jac, hess=f_hes,
               options={'xtol': 1e-30, 'disp': True})
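If the optimizer revisits more than just the most recent point, a variant of the same idea (a sketch with illustrative names, not part of the original answer) is to key a functools.lru_cache on tuple(x), which sidesteps the hashability issue mentioned above:

import numpy as np
from functools import lru_cache

@lru_cache(maxsize=4)  # keep the last few evaluated points
def _jac_hess_cached(x_key):
    x = np.asarray(x_key)
    # Same gradient/Hessian pair as above; real problems would share work here.
    return np.array([2.0 * x[0], 2.0 * x[1]]), np.array([[2.0, 0.0], [0.0, 2.0]])

# tuple(x) is hashable, so it can serve as the cache key.
f_jac2 = lambda x: _jac_hess_cached(tuple(x))[0]
f_hes2 = lambda x: _jac_hess_cached(tuple(x))[1]

These drop in for f_jac and f_hes in the minimize call unchanged.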
