kMeans Clustering using TF2.0, ValueError:

I am trying to implement simple k-means clustering with TensorFlow 2.0. I expect @tf.function to use AutoGraph to convert the function, which contains for loops.


Please let me know what causes the ValueError.


tf_kmeans.py


import tensorflow as tf
import numpy as np
from typeguard import typechecked
from typing import Union

@tf.function
def train_kmeans(X: Union[tf.Tensor, np.ndarray],
        k: Union[int, tf.Tensor],
        n_iter: Union[int, tf.Tensor] = 10) -> (tf.Tensor, tf.Tensor):

    X = tf.convert_to_tensor(X)
    X = tf.cast(X, tf.float32)
    assert len(tf.shape(X)) == 2, "Training data X must be represented as 2D array only"
    m = tf.shape(X)[0]

    k = tf.convert_to_tensor(k, dtype=tf.int64)

    random_select = tf.random.shuffle(X)
    init_centroids = random_select[:k]

    centroids = tf.Variable(init_centroids)
    clusters = tf.zeros([m, ], dtype=tf.int64)
    clusters = tf.Variable(clusters)
    for _ in tf.range(n_iter):
        squared_diffs = tf.square(X[None, :, :] - centroids[:, None, :])
        euclidean_dists = tf.reduce_sum(squared_diffs, axis=-1) ** 0.5

        clusters.assign(tf.argmin(euclidean_dists, axis=0))

        selector = tf.range(k)[:, None] == clusters[None, :]

        for c in tf.range(k):
            select = selector[c]
            points = X[select]
            mean_points = tf.reduce_mean(points, axis=0)
            centroids[c].assign(mean_points)

    centroids = tf.convert_to_tensor(centroids)
    return centroids, clusters

The following code is used to call the function:


tf_means_test.py


import tensorflow as tf
import numpy as np

X = np.array([[ 2., 10.],
    [ 2.,  5.],
    [ 8.,  4.],
    [ 5.,  8.],
    [ 7.,  5.],
    [ 6.,  4.],
    [ 1.,  2.],
    [ 4.,  9.]])
k = 3

import tf_kmeans
centroids, clusters = tf_kmeans.train_kmeans(X, k)

print(centroids)
print(clusters)

If the tf.function decorator is removed, the code works as expected, because AutoGraph is not run in that case.
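The difference between the two modes can be reproduced in a much smaller sketch. The function `make_variable` below is hypothetical and not part of tf_kmeans.py; it only mirrors the pattern of creating a tf.Variable inside the function body. The sketch assumes TensorFlow 2.x, and the exact wording of the error message varies between releases, so it is printed rather than quoted here.

```python
import tensorflow as tf

def make_variable():
    # Hypothetical stand-in for train_kmeans: a new tf.Variable
    # is created every time the function body runs.
    v = tf.Variable(1.0)
    return v + 1.0

# Eager execution: creating a variable on each call is allowed.
eager_result = float(make_variable())

# Graph execution: the same body wrapped in tf.function is rejected,
# because the variable would be created again on a later trace.
graph_fn = tf.function(make_variable)
try:
    graph_fn()
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```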

Thanks in advance.


互换的青春
168 views, 1 answer

达令说

I assume you want a tf.Variable instance only so that you can use assign. However, when using tf.function, you should always provide variables from outside and use built-in TensorFlow data structures inside. For example, your code with minimal changes and without tf.Variable objects would be:

import tensorflow as tf
import numpy as np
from typeguard import typechecked
from typing import Union

@tf.function
def train_kmeans(X: Union[tf.Tensor, np.ndarray],
        k: Union[int, tf.Tensor],
        n_iter: Union[int, tf.Tensor] = 10) -> (tf.Tensor, tf.Tensor):

    X = tf.convert_to_tensor(X)
    X = tf.cast(X, tf.float32)
    # Required as an int later
    num_centers = k
    assert len(tf.shape(X)) == 2, "Training data X must be represented as 2D array only"
    m = tf.shape(X)[0]

    k = tf.convert_to_tensor(k, dtype=tf.int64)

    random_select = tf.random.shuffle(X)
    init_centroids = random_select[:k]

    # Plain tensors instead of tf.Variable objects
    centroids = init_centroids
    clusters = tf.zeros([m, ], dtype=tf.int64)
    for _ in tf.range(n_iter):
        squared_diffs = tf.square(X[None, :, :] - centroids[:, None, :])
        euclidean_dists = tf.reduce_sum(squared_diffs, axis=-1) ** 0.5

        clusters = tf.argmin(euclidean_dists, axis=0)

        selector = tf.range(k)[:, None] == clusters[None, :]

        # TF data structure; element_shape assumes 2 features per point, as in the test data
        new_centroids = tf.TensorArray(tf.float32, num_centers, element_shape=[1, 2])
        for c in range(k):
            select = selector[c]
            points = X[select]
            centroid = tf.reduce_mean(points, axis=0)
            centroid = tf.reshape(centroid, [1, 2])
            # write() returns the updated TensorArray, so keep the result
            new_centroids = new_centroids.write(tf.cast(c, tf.int32), centroid)

        centroids = new_centroids.concat()
        centroids = tf.reshape(centroids, [num_centers, 2])

    return centroids, clusters
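The other half of the advice above, providing variables from outside, can be illustrated with a small sketch. The function `assign_nearest` and the toy data are assumptions made for illustration and do not come from the original code: the caller creates the tf.Variable once, and the decorated function only calls assign on it.

```python
import tensorflow as tf

@tf.function
def assign_nearest(X, centroids, clusters):
    # Hypothetical helper: no variable is created here, the caller's
    # variable is only updated in place.
    # Squared distances between every centroid and every point, shape [k, m].
    squared_dists = tf.reduce_sum(
        tf.square(X[None, :, :] - centroids[:, None, :]), axis=-1)
    clusters.assign(tf.argmin(squared_dists, axis=0))

# Toy data (assumed): two points near the origin, one near (10, 10).
X = tf.constant([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
centroids = tf.constant([[0.0, 0.0], [10.0, 10.0]])

# The variable is created once, outside the tf.function.
clusters = tf.Variable(tf.zeros([3], dtype=tf.int64))

assign_nearest(X, centroids, clusters)
print(clusters.numpy())  # [0 0 1]
```

Because the variable lives outside the traced function, repeated calls reuse it and nothing is created during tracing.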