How do I set the bounds of an exponential distribution for use in a randomized search?

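I am trying to find the best parameter values for a classification problem using scikit-learn. One way I found to do this is RandomizedSearchCV(). While setting up the dictionary of parameters I want my classifier to use, I ran into a problem: I want to use an exponential distribution between 2^-15 and 2^15 for the C and gamma parameters.

I did some research and found that scipy.stats.expon might solve my problem. However, I don't know how to set the bounds I'm looking for.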

    from scipy.stats import expon, randint
    from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold
    from sklearn.svm import SVC

    scoring = {
        'accuracy': 'accuracy',
        'precision_macro': 'precision_macro',
        'recall_macro': 'recall_macro',
        'f1_macro': 'f1_macro'}

    param_distributions = {
        'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
        'C': expon(),  # this is the line where I need to set the distribution
        'gamma': expon(),  # and here as well
        'degree': randint(2, 7),
        'coef0': [0],
        'probability': [True]}

    cv = StratifiedKFold(n_splits=4)

    rdm = RandomizedSearchCV(
        estimator=SVC(),
        param_distributions=param_distributions,
        n_iter=10,
        scoring=scoring,
        n_jobs=-1,
        iid=False,  # removed in scikit-learn 0.24; drop this argument on newer versions
        cv=cv,
        refit='accuracy',
        random_state=787870)

    # X, y: the feature matrix and labels (not shown in the question)
    rdm_results = rdm.fit(X, y)

How should I approach this? Is there a simple way to get the distribution I want?

缥缈止盈


1 Answer

慕婉清6462132

You can first generate random floats from an exponential distribution with numpy.random.exponential, then min-max scale them with sklearn.preprocessing.minmax_scale, as follows:

    import numpy as np
    from sklearn.preprocessing import minmax_scale

    # define the number of parameters to generate
    number_of_params = 500

    # generate random floats from an exponential distribution
    x = np.random.exponential(scale=1.0, size=number_of_params)

    # min-max scale the samples into the range [2**-15, 2**15]
    x = minmax_scale(x, feature_range=(2**-15, 2**15), axis=0, copy=True)
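To connect this back to the search itself, here is a minimal sketch of how such values might be handed to the parameter dictionary. It rests on a few assumptions that are not in the original post: the scaled samples are passed as a plain list (scikit-learn picks uniformly from list-valued entries), and ParameterSampler is used only to preview what RandomizedSearchCV would draw. Because a linear min-max rescaling tends to place most values far above 1, the sketch also shows scipy.stats.loguniform for gamma as a swapped-in alternative that spreads draws evenly across powers of two. That is a different distribution from the exponential one asked about, so treat it as an option rather than the answer's method.

```python
import numpy as np
from scipy.stats import loguniform  # available in SciPy 1.4 and later
from sklearn.model_selection import ParameterSampler
from sklearn.preprocessing import minmax_scale

LOW, HIGH = 2**-15, 2**15  # bounds requested in the question

# Exponential samples rescaled linearly into [LOW, HIGH], as in the answer.
rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=500)
scaled = minmax_scale(samples, feature_range=(LOW, HIGH))

param_distributions = {
    # List-valued entry: the search picks one of these values uniformly.
    "C": scaled.tolist(),
    # Alternative (assumption, not from the answer): log-uniform on [LOW, HIGH].
    "gamma": loguniform(LOW, HIGH),
}

# Preview the candidates a randomized search would evaluate.
draws = list(ParameterSampler(param_distributions, n_iter=20, random_state=0))
for d in draws[:3]:
    print(d)
```

The same dictionary, with the remaining SVC entries from the question added, can be passed as param_distributions to RandomizedSearchCV.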
