如何用`hostnametld`替换字符串中的所有URL？

首页课程实战体系课手记专栏慕课教程

例如：

http://stackoverflow.com/questions/ask => stackoverflowcom

以下工作，但不适用于在httpsurl 之外的角落情况。

import re

from urllib.parse import urlparse

def convert_urls_to_hostnames(s):

try:

new_s = re.sub("http\S+", lambda match: urlparse(match.group()).hostname.replace('.','') if match.group() else urlparse(match.group()).hostname, s)

return new_s

except Exception as e:

print(e)

return s

这主要是有效的。

s = "Ask questions here: http://stackoverflow.com/questions/ask"

print(convert_urls_to_hostnames(s))

正确返回： Ask questions here: stackoverflowcom

但是，如果http*s在 url 之外的字符串中的任何位置找到它，它就会失败，如下所示：

s = "Urls may start with http or https like so: http://stackoverflow.com/questions/ask and https://example.com/questions/"

print(convert_urls_to_hostnames(s))

这将返回：'NoneType' object has no attribute 'replace'。

预期收益： Urls may start with http or https like so: stackoverflowcom and examplecom

料青山看我应如是

浏览 207回答 1

随时随地看视频慕课网APP