How do I create a function to remove special characters in Python for feature engineering?

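I want to create a function that removes special characters from a pandas DataFrame, but that also accepts an argument specifying characters to keep.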


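import pandas as pd

df = pd.DataFrame({'col': ['Dining Room', 'Pre-War', 'Laundry in Building', '&Lobby']})

def strip_characters(c, req_char=''):
    # Special characters that are replaced with a space
    spec_chars = ["!",'"',"#","%","&","'","(",")","*","+",",","-",".","/",":",";","<","=",">","?","@","[","\\","]","^","_","`","{","|","}","~","–"]

    # list.remove() returns None and raises ValueError on a missing item,
    # so build a filtered list that skips the characters to keep
    to_strip = [ch for ch in spec_chars if ch not in req_char]

    for char in to_strip:
        # apply() passes one string at a time, so use plain str.replace
        c = c.replace(char, ' ')

    return c

# Extra arguments go through args=; here a comma is passed so it is retained
df['col'] = df['col'].apply(strip_characters, args=(',',))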


慕田峪7331174
1 answer

一只斗牛犬

Try this:

import pandas as pd

df = pd.DataFrame({'col': ['Dining Room', 'Pre-War', 'Laundry in Building', '&Lobby']})

# [^...] is a negated character class: it matches any character NOT listed inside.
# r"[^\w+]" matches anything that is not a word character (\w) or a literal "+".
# To keep more characters, list them inside the brackets, e.g. r"[^\w+,]" also keeps commas.
# (Inside brackets "|" is a literal pipe, not alternation, so it is not needed.)
# regex=True is required on pandas 2.0 and later, where the default is regex=False.
df['col'].str.replace(r"[^\w+]", " ", regex=True)

Output:

0            Dining Room
1                Pre War
2    Laundry in Building
3                  Lobby
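The regex approach above can be wrapped in a function that takes the characters to keep as an argument, which is what the question asks for. The following is only a sketch under some assumptions: the name `strip_special` and its `keep` parameter are made up for illustration, whitespace is left untouched, and `\w` is treated as "not special" (so underscores survive, unlike in the question's list).

```python
import re

import pandas as pd


def strip_special(series, keep=''):
    """Replace special characters in a string Series with a space.

    `keep` is a string of characters that should NOT be replaced.
    (Hypothetical helper; the name and signature are illustrative.)
    """
    # re.escape makes characters such as "-" or "]" safe inside the class
    pattern = r"[^\w\s" + re.escape(keep) + "]"
    # regex=True is passed explicitly because pandas 2.0+ defaults to regex=False
    return series.str.replace(pattern, " ", regex=True)


df = pd.DataFrame({'col': ['Dining Room', 'Pre-War', 'Laundry in Building',
                           '&Lobby', 'Elevator, Doorman']})

# Keep commas, replace every other special character with a space
df['clean'] = strip_special(df['col'], keep=',')
print(df['clean'].tolist())
```

Because the function operates on the whole Series through `.str.replace`, it is called directly on the column rather than through `apply`.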