Following this tutorial, I created the following churn.py file:
import numpy as np
from scipy import optimize, stats

# Durations of subscriptions that are still alive (right-censored)
censored = np.array([419,513, ... ,316,14])
# Durations of completed subscriptions (fully observed)
uncensored = np.array([389,123,340, ... ,56,31])

# Negative log-likelihood of a Lomax model with right-censored data:
# completed subscriptions contribute the log-pdf, alive ones the log-survival function
def log_likelihood_lomax(args):
    shape, scale = args
    val = (stats.lomax.logpdf(uncensored, shape, loc=0, scale=scale).sum()
           + stats.lomax.logsf(censored, shape, loc=0, scale=scale).sum())
    return -val

# Minimize the negative log-likelihood, keeping both parameters positive
res_lomax = optimize.minimize(log_likelihood_lomax, [1, 1], bounds=((0.001, 1000000), (0.001, 1000000)))
print("lomax shape", res_lomax.x[0], ", scale=", res_lomax.x[1])
print("lomax mean", stats.lomax.mean(res_lomax.x[0], scale=res_lomax.x[1]))
print("lomax median", stats.lomax.median(res_lomax.x[0], scale=res_lomax.x[1]))
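Note: the ... in the censored and uncensored arrays stands in for values omitted for confidentiality. The actual script contains the real values.
When I run this script with python3 churn.py, I get the following results: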
lomax shape 0.36948878639375643 , scale= 1440.4384891101636
lomax mean inf
lomax median 7961.447172364986
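I know that the value returned for the median is not correct.
But most importantly, I don't understand why the lomax mean returns inf.
Is there something wrong with my script?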
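One way to sanity-check the reported numbers is to evaluate the Lomax closed forms directly. The following is a minimal sketch, assuming the standard Lomax (Pareto type II) parameterization with loc = 0, where the mean is scale / (shape - 1) only when shape > 1 and is infinite otherwise, and the median is scale * (2^(1/shape) - 1). The helper names lomax_mean and lomax_median are hypothetical, written for illustration; they are not SciPy functions.

```python
import math

def lomax_mean(shape, scale):
    """Closed-form Lomax mean (loc = 0); finite only when shape > 1."""
    if shape <= 1:
        return math.inf
    return scale / (shape - 1)

def lomax_median(shape, scale):
    """Closed-form Lomax median (loc = 0)."""
    return scale * (2 ** (1 / shape) - 1)

# Fitted values copied from the output printed above.
shape = 0.36948878639375643
scale = 1440.4384891101636

mean = lomax_mean(shape, scale)
median = lomax_median(shape, scale)
print("closed-form mean", mean)
print("closed-form median", median)
```

Under these assumptions, a fitted shape of about 0.369 is below 1, so an infinite mean follows from the fitted parameters themselves rather than from a numerical failure, and the closed-form median should agree with the value SciPy printed. If that holds, the open question is less about the mean and median calls and more about whether the fitted shape and scale are plausible for the data.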
慕村225694