Is there a simple way to extend an existing activation function? My custom softmax function returns: An operation has `None` for gradient

I want to try making softmax faster by using only the top k values in the vector.

To do this, I tried to implement a custom function for tensorflow to use in a model:

def softmax_top_k(logits, k=10):
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    softmax = tf.nn.softmax(values)
    logits_shape = tf.shape(logits)
    return_value = tf.sparse_to_dense(indices, logits_shape, softmax)
    return_value = tf.convert_to_tensor(return_value, dtype=logits.dtype, name=logits.name)
    return return_value
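
For reference, the result this function is aiming for can be sketched in plain Python for a single vector. This is only an illustration of the intended output, not TensorFlow code; the name `softmax_top_k_reference` and the sample values are hypothetical and not part of the original question.

```python
import math

def softmax_top_k_reference(logits, k):
    """Softmax over the k largest entries; every other position gets 0."""
    # Indices of the k largest logits (ties are broken by position).
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    out = [0.0] * len(logits)
    for i in top:
        out[i] = exps[i] / total
    return out

# Hypothetical sample input: only the two largest logits receive probability mass.
probs = softmax_top_k_reference([2.0, 1.0, 0.1, -1.0], k=2)
```

Here `probs` sums to 1 over the two selected positions, and the remaining positions stay at exactly 0.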

I'm using Fashion-MNIST to test whether this attempt works:

import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# normalize the data
train_images = train_images / 255.0
test_images = test_images / 255.0

# split the training data into train and validate arrays (will be used later)
train_images, train_images_validate, train_labels, train_labels_validate = train_test_split(
    train_images, train_labels, test_size=0.2, random_state=133742,
)

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=softmax_top_k)
])

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

model.fit(
    train_images, train_labels,
    epochs=10,
    validation_data=(train_images_validate, train_labels_validate),
)

model_without_cnn.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

model_without_cnn.fit(
    train_images, train_labels,
    epochs=10,
    validation_data=(train_images_validate, train_labels_validate),
)

But an error occurs during execution:

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable).

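I found this: (How to make a custom activation function), which explains how to implement a completely custom activation function for tensorflow. But since my function only uses and extends softmax, I assumed the gradient should still be the same.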

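This is my first week coding with python and tensorflow, so I don't yet have a good overview of all the internal implementations.

Is there a simpler way to extend softmax into a new function, rather than implementing it from scratch?

Thanks in advance!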

陪伴而非守候


1 Answer

慕妹3242003

Instead of using a sparse tensor to build the tensor that is all zeros except for the softmaxed top-K values, use tf.scatter_nd:

import tensorflow as tf

def softmax_top_k(logits, k=10):
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    softmax = tf.nn.softmax(values)
    logits_shape = tf.shape(logits)
    # Assuming that logits is 2D
    rows = tf.tile(tf.expand_dims(tf.range(logits_shape[0]), 1), [1, k])
    scatter_idx = tf.stack([rows, indices], axis=-1)
    return tf.scatter_nd(scatter_idx, softmax, logits_shape)

EDIT: Here is a slightly more complex version for tensors with any number of dimensions. The code still requires the number of dimensions to be known at graph construction time, though.

import tensorflow as tf

def softmax_top_k(logits, k=10):
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    softmax = tf.nn.softmax(values)
    # Make nd indices
    logits_shape = tf.shape(logits)
    dims = [tf.range(logits_shape[i]) for i in range(logits_shape.shape.num_elements() - 1)]
    grid = tf.meshgrid(*dims, tf.range(k), indexing='ij')
    scatter_idx = tf.stack(grid[:-1] + [indices], axis=-1)
    return tf.scatter_nd(scatter_idx, softmax, logits_shape)
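
To make the index construction in the 2D version easier to follow, here is a rough plain-Python sketch of what `scatter_idx` holds and what the scatter step does with it. It only mimics the behaviour on nested lists and is not TensorFlow code; the helper names `build_scatter_idx` and `scatter_2d` and the sample values are hypothetical.

```python
def build_scatter_idx(indices):
    """Pair each top-k column index with its row number,
    mirroring tf.stack([rows, indices], axis=-1) in the 2D case."""
    return [[(r, c) for c in row] for r, row in enumerate(indices)]

def scatter_2d(scatter_idx, updates, shape):
    """Write updates into a zero matrix at the given (row, col) positions,
    mirroring tf.scatter_nd for a 2D output."""
    out = [[0.0] * shape[1] for _ in range(shape[0])]
    for idx_row, upd_row in zip(scatter_idx, updates):
        for (r, c), v in zip(idx_row, upd_row):
            # tf.scatter_nd accumulates duplicate indices; top-k indices
            # are unique within a row, so each cell is written once here.
            out[r][c] += v
    return out

# Hypothetical sample: batch of 2 rows, 4 classes, k = 2.
indices = [[2, 0], [1, 3]]              # column positions of the top-k logits
softmax = [[0.7, 0.3], [0.6, 0.4]]      # softmax over the top-k values
idx = build_scatter_idx(indices)
dense = scatter_2d(idx, softmax, (2, 4))
```

With these sample values, `idx` is `[[(0, 2), (0, 0)], [(1, 1), (1, 3)]]` and `dense` is `[[0.3, 0.0, 0.7, 0.0], [0.0, 0.6, 0.0, 0.4]]`: each softmax value lands in the column its logit came from, and every other entry stays zero.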
