Error when trying to tokenize text in Keras?


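I'm very new to Keras and deep learning. I'm following an online guide and trying to tokenize my text so that I can access its "shape" and use it as "input_shape" when I create the layers of my neural network. Here is my code so far: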

df = pd.read_csv(pathname, encoding="ISO-8859-1")
df = df[['content_cleaned', 'meaningful']]
df = df.sample(frac=1)

# Transposed columns into numpy arrays
X = np.asarray(df[['content_cleaned']])
y = np.asarray(df[['meaningful']])

# Split into training and testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=21)

# Create tokenizer
tokenizer = Tokenizer(num_words=100)  # No row has more than 100 words.

# Tokenize the predictors (text)
X_train = np.concatenate(tokenizer.sequences_to_matrix(int(X_train), mode="binary"))
X_test = np.concatenate(tokenizer.sequences_to_matrix(int(X_test), mode="binary"))

# Convert the labels to binary
encoder = LabelBinarizer()
encoder.fit(y_train)
y_train = encoder.transform(y_train)
y_test = encoder.transform(y_test)

The error is raised on this line:

X_train = tokenizer.sequences_to_matrix(int(X_train), mode="binary")

The error message is:

TypeError: only length-1 arrays can be converted to Python scalars

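Can anyone spot my mistake and suggest a fix? I'm new to this and haven't been able to solve it myself.

I'd like to be able to call "X_train.shape" so that I can pass it as input_shape when creating the network layers.

Any help would be great!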

慕婉清6462132

169 views, 1 answer

UYOU

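You are trying to convert a NumPy array to a Python integer, which is not possible and produces this error (the error has nothing to do with Keras). What you actually want is to change the dtype of that NumPy array to int. Try X_train.astype(np.int32) instead of int(X_train).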
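To make this concrete, here is a minimal sketch, assuming made-up sample sentences in place of the real content_cleaned column. It first reproduces the TypeError and shows astype working on a numeric array. One caveat: if the column holds raw text, as the question suggests, astype(np.int32) would fail too (with a ValueError), so the text has to be encoded as numbers first. In Keras that is usually done with tokenizer.fit_on_texts(...) followed by tokenizer.texts_to_matrix(..., mode="binary"). The helper binary_bag_of_words below is a hypothetical, simplified NumPy stand-in for that step, not Keras code. Note also that num_words limits the vocabulary size, not the number of words per row.

```python
from collections import Counter

import numpy as np

# int() only accepts an array holding exactly one element, hence the TypeError.
try:
    int(np.asarray([[1], [2], [3]]))
    int_failed = False
except TypeError:
    int_failed = True

# astype converts the dtype element-wise, which works for numeric data.
numeric = np.asarray([[1.0], [2.0]]).astype(np.int32)

# Made-up stand-in for np.asarray(df[['content_cleaned']]): a 2-D array of strings.
X_train = np.asarray([["good film"], ["bad film"], ["good good plot"]])

# Raw text cannot be cast to integers, so astype alone is not enough here.
try:
    X_train.astype(np.int32)
    text_cast_failed = False
except ValueError:
    text_cast_failed = True


def binary_bag_of_words(texts, num_words):
    """Hypothetical helper: one row per text, 1 where a vocabulary word occurs."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Most frequent words get the lowest indices; column 0 stays unused.
    ranked = counts.most_common(num_words - 1)
    vocab = {word: index for index, (word, _) in enumerate(ranked, start=1)}
    matrix = np.zeros((len(texts), num_words), dtype=np.int32)
    for row, text in enumerate(texts):
        for word in text.lower().split():
            if word in vocab:
                matrix[row, vocab[word]] = 1
    return matrix


# ravel() flattens the (n, 1) column into a 1-D sequence of strings.
X_train_encoded = binary_bag_of_words(X_train.ravel(), num_words=100)
print(X_train_encoded.shape)  # (3, 100)

# The per-sample shape is what a first layer's input_shape expects.
input_shape = (X_train_encoded.shape[1],)
```

With the encoded matrix in hand, X_train_encoded.shape is available, and its second dimension (here 100, the value of num_words) is what would go into input_shape.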
