
Performance: Sorting Slice vs Sorting Type (of Slice) with Sort implementation

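I'm working through some code challenges and found that a custom sort (an implementation of sort.Interface) runs faster than sorting the raw slice of structs directly. Why is that? Does converting the slice to a named type do some magic (such as turning it into a slice of pointers to structs)?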


I wrote some code to test my hypothesis:


package sortingexample

import (
    "sort"
    "testing"
)

// Point is the example struct we are going to sort.
type Point struct {
    X, Y int
}


// --- Struct / Raw Data
var TestCases = []Point{
    {10, 3},
    {10, 4},
    {10, 35},
    {10, 5},
    {10, 51},
    {10, 25},
    {10, 59},
    {10, 15},
    {10, 22},
    {10, 91},
}


// Example One - Sorting Slice Directly
// For some reason this is the slowest way to sort it.
func SortSlice(points []Point) {
    sort.Slice(points, func(i, j int) bool {
        return points[i].Y < points[j].Y
    })
}

func BenchmarkSlice(b *testing.B) {
    tmp := make([]Point, len(TestCases))
    for i := 0; i < b.N; i++ {
        copy(tmp, TestCases)
        SortSlice(tmp)
    }
}


// Example Two - Sorting via a sort.Interface Implementation
// Much faster.
type Points []Point

// Sort interface implementation
func (p Points) Less(i, j int) bool { return p[i].Y < p[j].Y }
func (p Points) Len() int           { return len(p) }
func (p Points) Swap(i, j int)      { p[i], p[j] = p[j], p[i] }

func SortStruct(points []Point) {
    sort.Sort(Points(points))
}

func BenchmarkStruct(b *testing.B) {
    tmp := make([]Point, len(TestCases))
    for i := 0; i < b.N; i++ {
        copy(tmp, TestCases)
        SortStruct(tmp)
    }
}


// --- Pointers
var TestCasesPoints = []*Point{
    &Point{10, 3},
    &Point{10, 4},
    &Point{10, 35},
    &Point{10, 5},
    &Point{10, 51},
    &Point{10, 25},
    &Point{10, 59},
    &Point{10, 15},
    &Point{10, 22},
    &Point{10, 91},
}


// Example Three - Sorting Slice of Pointers
func SortSlicePointers(points []*Point) {
    sort.Slice(points, func(i, j int) bool {
        return points[i].Y < points[j].Y
    })
}

func BenchmarkSlicePointers(b *testing.B) {
    tmp := make([]*Point, len(TestCasesPoints))
    for i := 0; i < b.N; i++ {
        copy(tmp, TestCasesPoints)
        SortSlicePointers(tmp)
    }
}



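It makes sense that sorting a slice of pointers is faster, but why is the custom sort implementation faster? Are there any resources I can read on this?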


小怪兽爱吃肉
2 answers

九州编程

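The general-purpose sort.Slice() and sort.SliceStable() functions work on any slice. You have to pass your slice as an interface{} value, and the implementation has to use reflection to read its length and to swap its elements. In contrast, when you implement sort.Interface yourself, your methods have access to the static type of your slice, so you can provide Len, Less and Swap without any reflection, and that is what makes it faster. So if performance is critical, provide the sort.Interface implementation yourself. If the slices are small or performance is not important, you can use the more convenient sort.Slice() function.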

蝴蝶刀刀

Adding the output of a run with allocation stats; the interface/struct approach looks better there as well.

❯ go version
go version go1.17.1 darwin/amd64
❯ go test -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: github.com/timescale/promscale/pkg/api/parser/json/test
cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
BenchmarkSlice-12                    3533616    319.6 ns/op    88 B/op    3 allocs/op
BenchmarkStruct-12                   9157018    126.0 ns/op    24 B/op    1 allocs/op
BenchmarkSlicePointers-12            6643446    167.1 ns/op    56 B/op    2 allocs/op
BenchmarkStructOfSlicePointers-12    9004021    124.1 ns/op    24 B/op    1 allocs/op
PASS
ok      github.com/timescale/promscale/pkg/api/parser/json/test 5.425s