Word segmentation in SQL Server

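I need the following behavior: given a product name, run a fuzzy query that returns the products with similar names.

The more precise the match, the higher it should rank.

My original approach was: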



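-- Returns the cumulative prefixes of @strSource, cut at each occurrence of
-- @strSplitStr: 'a b c' with ' ' yields 'a', 'a b', 'a b c'.
alter function f_splitIncrease
(
    @strSource nvarchar(2000),
    @strSplitStr nvarchar(100)
)
returns @tempTable table(id int identity primary key, one nvarchar(1000))
as
begin
    declare @tempStr nvarchar(1000);
    declare @startIndex int;
    set @startIndex = 1;
    -- Append a trailing separator so the full string is emitted as the last prefix.
    set @strSource = @strSource + @strSplitStr;
    while(@startIndex <> 0)
    begin
        -- Find the next separator after the previous one; 0 means none is left.
        set @startIndex = charindex(@strSplitStr, @strSource, @startIndex + 1);
        if(@startIndex <> 0)
        begin
            -- Everything to the left of this separator is the next prefix.
            set @tempStr = left(@strSource, @startIndex - 1);
            if(@tempStr <> '')
            begin
                insert into @tempTable values(@tempStr);
            end
        end
    end
    return
end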

Calling select * from dbo.f_splitIncrease('the office seasons', ' ')

returns

the

the office

the office seasons

that is, three rows.

I then search with these rows, using the following code:

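alter proc Proc_Product_Related
(
    @name nvarchar(2000),
    @splitStr nvarchar(100) = ' '
)
as
begin
    -- Result buffer; the identity column records insertion order.
    declare @tempTable table(
        id int identity(1,1) primary key,
        ProductID int,
        [Name] nvarchar(255),
        ProductNo nvarchar(50),
        MemberPrice money,
        ThumbnailImg nvarchar(255),
        ProductImg nvarchar(255)
    )
    -- Fall back to a single space when the caller passes NULL.
    if(@splitStr is null)
    begin
        set @splitStr = ' ';
    end
    begin transaction
    -- Match every product against every prefix returned by f_splitIncrease.
    -- DISTINCT removes the duplicates but leaves the row order undefined.
    insert into @tempTable (ProductID, [Name], ProductNo, MemberPrice, ThumbnailImg, ProductImg)
        select distinct p.ProductID, p.[Name], p.ProductNo, p.MemberPrice, p.ThumbnailImg, p.ProductImg
        from product p
        cross join dbo.f_splitIncrease(@name, @splitStr) f
        where p.[Name] like f.one + '%'
    if(@@error > 0)
    begin
        rollback transaction
        return  -- stop here so the transaction is not ended a second time below
    end
    select * from @tempTable
    if @@error > 0
        rollback transaction
    else
        commit transaction
end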

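As a result, the most precise matches end up at the bottom.

However, a fuzzy search for "the office" already covers "the office seasons", so the same record appears more than once.

I want to remove the duplicates, but if I use the distinct keyword, the result set no longer keeps the most precise matches at the bottom, which defeats the original purpose.

How should I handle this?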

侃侃无极


5 answers

一只萌萌小番薯

Try applying DISTINCT first and then sorting on the product name column.
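One way to get both deduplication and ranking is sketched below. Sorting on the name alone would not reflect how precise a match is, so this sketch swaps in a different ordering: it groups by product and sorts on the length of the longest prefix each product matched. That reading of "most precise" is an assumption. The @product and @prefix table variables and their sample rows are hypothetical stand-ins for the question's product table and for the output of dbo.f_splitIncrease.

```sql
-- Hypothetical sample data standing in for the product table.
declare @product table (ProductID int primary key, [Name] nvarchar(255));
insert into @product values
    (1, N'the office seasons 1-3'),
    (2, N'the office mug'),
    (3, N'the big bang theory');

-- The prefixes dbo.f_splitIncrease returns for 'the office seasons'.
declare @prefix table (one nvarchar(1000));
insert into @prefix values (N'the'), (N'the office'), (N'the office seasons');

-- GROUP BY collapses the duplicate rows; MAX(LEN(...)) keeps the longest
-- prefix each product matched, which is then used as the ranking key.
select p.ProductID, p.[Name], max(len(f.one)) as MatchLength
from @product p
join @prefix f on p.[Name] like f.one + N'%'
group by p.ProductID, p.[Name]
order by MatchLength desc, p.[Name];
```

Each product comes back once, with the longest match first. In the stored procedure, @prefix would be replaced by dbo.f_splitIncrease(@name, @splitStr), and the sort direction flipped to asc if the most precise rows should stay at the bottom. Note that %, _ and [ inside the search text act as LIKE wildcards unless they are escaped.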

达令说

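What does "most precise" mean? This is really a question of intelligence: only a person's understanding of a word or character can decide where it should rank. Since you have already implemented it in code, be aware that the overhead is fairly high. When you search on Baidu, the more precise results do not necessarily come first either. That is just my view.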

慕桂英3389331

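"Most precise" here is relative. For example, if I enter 中华人民共和国, a fuzzy search for 中华人民 is surely more precise than a fuzzy search for 中. The key is how to segment the text so that the split is reasonable; after that comes the validity of the returned records.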
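To illustrate how the choice of segmentation changes the ranking, here is a rough sketch that splits the search text into individual words instead of cumulative prefixes and ranks each product by how many distinct words it contains. It assumes SQL Server 2016 or later (database compatibility level 130) for the built-in STRING_SPLIT function; the @product table variable and its rows are hypothetical sample data.

```sql
-- Hypothetical sample data standing in for the product table.
declare @product table (ProductID int primary key, [Name] nvarchar(255));
insert into @product values
    (1, N'the office seasons 1-3'),
    (2, N'the office mug'),
    (3, N'the big bang theory'),
    (4, N'office chair');

declare @name nvarchar(2000) = N'the office seasons';

-- Split the search text on spaces, drop empty tokens and repeated words,
-- then count how many of the words appear as whole words in each name.
select p.ProductID, p.[Name], count(*) as MatchedWords
from @product p
join (
    select distinct value as word
    from string_split(@name, N' ')
    where value <> N''
) w
  -- Pad both sides with spaces so only whole words match.
  on N' ' + p.[Name] + N' ' like N'% ' + w.word + N' %'
group by p.ProductID, p.[Name]
order by MatchedWords desc, p.[Name];
```

Unlike the prefix approach, this also finds names that do not start with the search text, such as 'office chair'. It only works for text that separates words with spaces; a string like 中华人民共和国 has no separators, so it would need a dictionary-based segmenter or a similar tool rather than a simple split.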
