How do I fix my Ornstein-Uhlenbeck parameter MLE in Python?
I am trying to fit time series data to an Ornstein-Uhlenbeck process. Here is my code so far:
# source for computation: https://arxiv.org/pdf/1411.5062.pdf
import math
from math import sqrt, exp, log  # exp(n) == e^n, log(n) == ln(n)
import scipy.optimize as so
import numpy as np

def __compute_log_likelihood(params, *args):
    '''
    Compute the average Log Likelihood, this function will by minimized by scipy.
    Find in (2.2) in linked paper

    returns: the average log likelihood from given parameters
    '''
    # functions passed into scipy's minimize() needs accept one parameter, a tuple of
    # of values that we adjust to minimize the value we return.
    # optionally, *args can be passed, which are values we don't change, but still want
    # to use in our function (e.g. the measured heights in our sample or the value Pi)

    theta, mu, sigma = params
    X, dt = args
    n = len(X)

    sigma_tilde_squared = sigma ** 2 * (1 - exp(-2 * mu * dt)) / 2 * mu
    summation_term = 0

    for i in range(1, len(X)):
        summation_term += (X[i] - X[i - 1] * exp(-mu * dt) - theta * (1 - exp(-mu * dt))) ** 2

    summation_term = -summation_term / (2 * n * sigma_tilde_squared)

    log_likelihood = (-log(2 * math.pi) / 2) + (-log(sqrt(sigma_tilde_squared))) + summation_term

    return -log_likelihood
    # since we want to maximize this total log likelihood, we need to minimize the
    # negation of the this value (scipy doesn't support maximize)

def estimate_coefficients_MLE(X, dt):
    '''
    Estimates Ornstein-Uhlenbeck coefficients (θ, µ, σ) of the given array
    using the Maximum Likelihood Estimation method

    input: X - array-like data to be fit as an OU process
    returns: θ, µ, σ, Total Log Likelihood
    '''
    bounds = ((0, None), (None, None), (0, None))  # theta > 0, mu ∈ ℝ, sigma > 0
    mu_init = np.mean(X)
    result = so.minimize(__compute_log_likelihood, (1e-6, 1e-6, 1e-6), args=(X, dt), bounds=bounds)
    theta, mu, sigma = result.x
    max_log_likelihood = -result.fun  # undo negation from __compute_log_likelihood
    return theta, mu, sigma, max_log_likelihood
However, when I simulate an OU process with the following:
# simulate Ornstein-Uhlenbeck Process
import numpy as np
import matplotlib.pyplot as plt

t_0 = 0  # define model parameters
t_end = 2
length = 1000
theta = 1.1
mu = 0
sigma = 0.3

t = np.linspace(t_0,t_end,length)  # define time axis
dt = np.mean(np.diff(t))

y = np.zeros(length)
y0 = np.random.normal(loc=0.0,scale=1.0)  # initial condition

drift = lambda y,t: theta*(mu-y)  # define drift term, google to learn about lambda
diffusion = lambda y,t: sigma  # define diffusion term
noise = np.random.normal(loc=0.0,scale=1.0,size=length)*np.sqrt(dt)  #define noise process

# solve SDE
for i in range(1,length):
    y[i] = y[i-1] + drift(y[i-1],i*dt)*dt + diffusion(y[i-1],i*dt)*noise[i]

plt.plot(t,y)
plt.show()
and then fit the data (stored in y) with my function:
theta, mu, sigma, max_ll = estimate_coefficients_MLE(y, 1/len(y))
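I either get "ValueError: math domain error" or my coefficients are far off. I would appreciate it if someone could point me in the right direction; resources on this topic are scarce online.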
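Adding to Japser's answer: to resolve the division by zero, we can set a very small value (e.g. 1x10^-5) as the lower bound for mu and sigma. To see the algorithm in practice, see this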
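The lower-bound idea can be sketched roughly as follows. This is a minimal sketch, not the original poster's final code: the names `neg_avg_log_likelihood` and `fit_ou`, the `eps` default, the starting guess, and the choice of `L-BFGS-B` are all assumptions made for illustration. It assumes the convention implied by the question's likelihood (theta is the long-run mean, mu the mean-reversion speed, sigma the volatility), so only mu and sigma are bounded below. It also uses the standard OU transition variance sigma^2 * (1 - exp(-2*mu*dt)) / (2*mu), with the parentheses around `2 * mu`, and averages over the `len(X) - 1` increments.

```python
import numpy as np
from scipy.optimize import minimize


def neg_avg_log_likelihood(params, X, dt):
    """Negative average log-likelihood of an OU path observed every dt.

    Assumed model: dX = mu * (theta - X) dt + sigma dB.
    """
    theta, mu, sigma = params
    n = len(X) - 1                          # number of increments
    decay = np.exp(-mu * dt)
    # Variance of X[i] given X[i-1]; note the parentheses around (2 * mu).
    sigma_tilde_sq = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * mu)
    resid = X[1:] - X[:-1] * decay - theta * (1.0 - decay)
    log_lik = (-0.5 * np.log(2.0 * np.pi)
               - 0.5 * np.log(sigma_tilde_sq)
               - np.sum(resid ** 2) / (2.0 * n * sigma_tilde_sq))
    return -log_lik


def fit_ou(X, dt, eps=1e-5):
    """Hypothetical helper: bounded MLE for (theta, mu, sigma)."""
    X = np.asarray(X, dtype=float)
    # theta is unbounded; mu and sigma are kept at least eps away from zero
    # so that sigma_tilde_sq stays strictly positive.
    bounds = ((None, None), (eps, None), (eps, None))
    # Assumed starting point: sample mean, unit speed, and a rough
    # volatility guess from the increments.
    sigma0 = max(np.std(np.diff(X)) / np.sqrt(dt), eps)
    x0 = (np.mean(X), 1.0, sigma0)
    res = minimize(neg_avg_log_likelihood, x0, args=(X, dt),
                   method="L-BFGS-B", bounds=bounds)
    theta, mu, sigma = res.x
    return theta, mu, sigma, -res.fun       # undo the negation
```

When calling something like `fit_ou(y, dt)`, `dt` should be the actual spacing of the simulated time axis (the `dt` computed in the simulation) rather than `1/len(y)`. Note also that the simulation in the question uses theta for the speed and mu for the mean, the opposite of the estimator's convention.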
That is because sigma_tilde_squared == 0. You can add 0.01 when computing it so that it never equals 0.
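One possible way to guard the variance term is sketched below. Note that it swaps the suggested "add 0.01" for a small floor: with a small time step, sigma_tilde_squared itself may be well below 0.01, so adding a constant that large could bias the fit. The helper name `safe_sigma_tilde_squared` and the `floor` default are hypothetical, chosen only for illustration.

```python
import math


def safe_sigma_tilde_squared(mu, sigma, dt, floor=1e-12):
    """Return sigma^2 * (1 - exp(-2*mu*dt)) / (2*mu), never below `floor`."""
    if abs(mu) < 1e-12:
        # Limit of the expression as mu -> 0 (avoids dividing by zero).
        value = sigma ** 2 * dt
    else:
        # -expm1(-x) equals 1 - exp(-x) and is more accurate for small x.
        value = sigma ** 2 * (-math.expm1(-2.0 * mu * dt)) / (2.0 * mu)
    # Keep the result strictly positive so log() and division stay defined.
    return max(value, floor)
```

With this guard, `math.log(safe_sigma_tilde_squared(mu, sigma, dt))` is defined even when the optimizer tries sigma = 0 or mu = 0.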