Java Math.abs(int) optimization: why is this code 6 times slower?

As you know, Math.abs(Integer.MIN_VALUE) == Integer.MIN_VALUE. To keep a negative value from leaking out, I implemented the following safeAbs method in my project:

    public static int safeAbs(int i) {
        i = Math.abs(i);
        return i < 0 ? 0 : i;
    }
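For readers who have not run into this edge case: negating Integer.MIN_VALUE overflows in two's-complement arithmetic and wraps back to the same value, so Math.abs returns a negative number for exactly that one input. The following is a minimal sketch of that behavior; the class name AbsOverflowDemo is made up for illustration, and safeAbs is copied from the method above.

```java
class AbsOverflowDemo {
    // Same logic as the safeAbs above: clamp the single negative result to 0.
    static int safeAbs(int i) {
        i = Math.abs(i);
        return i < 0 ? 0 : i;
    }

    public static void main(String[] args) {
        // -(-2^31) does not fit in an int, so the negation wraps around.
        System.out.println(Math.abs(Integer.MIN_VALUE) == Integer.MIN_VALUE); // true
        // One step above MIN_VALUE the result is representable again.
        System.out.println(Math.abs(Integer.MIN_VALUE + 1)); // 2147483647
        System.out.println(safeAbs(Integer.MIN_VALUE)); // 0
        System.out.println(safeAbs(-42)); // 42
    }
}
```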

I compared its performance against the following version:

    public static int safeAbs(int i) {
        return i == Integer.MIN_VALUE ? 0 : Math.abs(i);
    }
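Before comparing speed, it may be worth confirming that the two variants really return the same results. The sketch below does that for a handful of boundary values. The names safeAbsSlow and safeAbsFast follow the labels used in the compiled-code listings in this thread; the class name and the agreeOn helper are hypothetical.

```java
class SafeAbsEquivalence {
    // First variant: take the absolute value, then clamp a negative result.
    static int safeAbsSlow(int i) {
        i = Math.abs(i);
        return i < 0 ? 0 : i;
    }

    // Second variant: test for the problematic input up front.
    static int safeAbsFast(int i) {
        return i == Integer.MIN_VALUE ? 0 : Math.abs(i);
    }

    // Hypothetical helper: true if both variants agree on every sample.
    static boolean agreeOn(int... samples) {
        for (int s : samples) {
            if (safeAbsSlow(s) != safeAbsFast(s)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOn(
                Integer.MIN_VALUE, Integer.MIN_VALUE + 1, -1, 0, 1, Integer.MAX_VALUE));
    }
}
```

Since the results are identical, any timing difference has to come from how the JIT compiles each shape of the code, not from the work being done.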

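The first one is almost 6 times slower than the second (the second performs almost the same as "pure" Math.abs(int)). As far as I can tell there is no significant difference in the bytecode, so I guess the difference lies in the "assembly" code produced by the JIT: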


"Slow" version:


    0x00007f0149119720: mov     %eax,0xfffffffffffec000(%rsp)
    0x00007f0149119727: push    %rbp
    0x00007f0149119728: sub     $0x20,%rsp
    0x00007f014911972c: test    %esi,%esi
    0x00007f014911972e: jl      0x7f0149119734
    0x00007f0149119730: mov     %esi,%eax
    0x00007f0149119732: jmp     0x7f014911973c
    0x00007f0149119734: neg     %esi
    0x00007f0149119736: test    %esi,%esi
    0x00007f0149119738: jl      0x7f0149119748
    0x00007f014911973a: mov     %esi,%eax
    0x00007f014911973c: add     $0x20,%rsp
    0x00007f0149119740: pop     %rbp
    0x00007f0149119741: test    %eax,0x1772e8b9(%rip)  ;   {poll_return}
    0x00007f0149119747: retq
    0x00007f0149119748: mov     %esi,(%rsp)
    0x00007f014911974b: mov     $0xffffff65,%esi
    0x00007f0149119750: nop
    0x00007f0149119753: callq   0x7f01490051a0    ; OopMap{off=56}
                                                  ;*ifge
                                                  ; - math.FastAbs::safeAbsSlow@6 (line 16)
                                                  ;   {runtime_call}
    0x00007f0149119758: callq   0x7f015f521d20    ;   {runtime_call}


RISEBY
1 answer

Smart猫小萌

The native code generated for the safeAbsSlow and safeAbsFast methods differs.

safeAbsSlow (C2, level 4):

    0x0000023d12ec4b14: add     eax,ecx
    0x0000023d12ec4b16: inc     ebx
    0x0000023d12ec4b18: cmp     ebx,989680h
    0x0000023d12ec4b1e: jnl     23d12ec4b4eh ; jump if `ebx` was not less than `10_000_000`
    0x0000023d12ec4b20: mov     ecx,dword ptr [r9+rbx*4+10h]
    0x0000023d12ec4b25: test    ecx,ecx
    0x0000023d12ec4b27: jnl     23d12ec4b14h ; jump if `ecx` was not less than `0`
    0x0000023d12ec4b29: neg     ecx
    0x0000023d12ec4b2b: test    ecx,ecx
    0x0000023d12ec4b2d: jnl     23d12ec4b14h ; jump if `ecx` was not less than `0`

safeAbsFast (C2, level 4):

    0x000001d89e8a4b20: mov     ecx,dword ptr [r9+rdi*4+10h]
    0x000001d89e8a4b25: cmp     ecx,80000000h
    0x000001d89e8a4b2b: je      1d89e8a4b66h ; jump if `ecx` was equal to `80000000h` (`Integer.MIN_VALUE`)
    0x000001d89e8a4b2d: mov     r11d,ecx
    0x000001d89e8a4b30: neg     r11d
    0x000001d89e8a4b33: test    ecx,ecx
    0x000001d89e8a4b35: cmovl   ecx,r11d
    0x000001d89e8a4b39: add     eax,ecx
    0x000001d89e8a4b3b: inc     edi
    0x000001d89e8a4b3d: cmp     edi,989680h
    0x000001d89e8a4b43: jl      1d89e8a4b20h ; jump if `edi` was less than `10_000_000`

As the listings show, safeAbsSlow contains more conditional jumps than safeAbsFast. The main reason is that the Math.abs implementation inlined into safeAbsFast uses a conditional move (cmovl) and has no conditional jump at all:

    0x000001d89e8a4b2d: mov     r11d,ecx
    0x000001d89e8a4b30: neg     r11d
    0x000001d89e8a4b33: test    ecx,ecx
    0x000001d89e8a4b35: cmovl   ecx,r11d

In safeAbsFast the only data-dependent jump is the comparison against Integer.MIN_VALUE, which is almost never taken and is therefore easy to predict. In safeAbsSlow the jump depends on the sign of each element. When the dataset has positive and negative values scattered throughout the array, that jump is hard to predict, so the slow version suffers far more branch misses than the fast one. Here are the corresponding statistics collected with the Linux perf profiler:

    Benchmark                  Mode  Cnt          Score         Error  Units
    safeAbsFast                avgt   10    9611659.726 ± 1429082.431  ns/op
    safeAbsFast:branch-misses  avgt            2869.853                 #/op
    safeAbsFast:branches       avgt        12492918.020                 #/op
    safeAbsFast:cycles         avgt        28212203.936                 #/op
    safeAbsFast:instructions   avgt        92352048.153                 #/op
    safeAbsSlow                avgt   10   44524180.366 ± 6324887.086  ns/op
    safeAbsSlow:branch-misses  avgt         5006493.144                 #/op
    safeAbsSlow:branches       avgt        17496069.911                 #/op
    safeAbsSlow:cycles         avgt       126413171.674                 #/op
    safeAbsSlow:instructions   avgt        67549877.558                 #/op

By contrast, these are the results for a sorted dataset:

    Benchmark                  Mode  Cnt          Score         Error  Units
    safeAbsFast                avgt   10    9026800.584 ±  528992.157  ns/op
    safeAbsFast:branch-misses  avgt            2785.463                 #/op
    safeAbsFast:branches       avgt        12474751.905                 #/op
    safeAbsFast:cycles         avgt        27379727.603                 #/op
    safeAbsFast:instructions   avgt        92418075.715                 #/op
    safeAbsSlow                avgt   10    6981828.374 ± 2375480.834  ns/op
    safeAbsSlow:branch-misses  avgt            2801.022                 #/op
    safeAbsSlow:branches       avgt        17496585.992                 #/op
    safeAbsSlow:cycles         avgt        19478382.113                 #/op
    safeAbsSlow:instructions   avgt        67589946.278                 #/op

When the dataset is sorted, the formerly slow version becomes the faster one: the costly branch misses drop to about the same level as in safeAbsFast (roughly 2,800 per operation instead of about 5 million), and safeAbsSlow executes fewer instructions overall.

Environment:

    openjdk version "12-internal" 2019-03-19
    OpenJDK Runtime Environment (slowdebug build 12-internal+0-adhoc.jdk12)
    OpenJDK 64-Bit Server VM (slowdebug build 12-internal+0-adhoc.jdk12, mixed mode)
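The shuffled-versus-sorted comparison above can be set up along the following lines. This is only a sketch of the two datasets and the summing loop, not the benchmark that produced the numbers: the class name, the mixedSigns generator, the seed and the array size are all assumptions, and it deliberately prints no timings, since meaningful measurements need a proper harness with warm-up.

```java
import java.util.Arrays;
import java.util.Random;

class SafeAbsDatasets {
    static int safeAbsSlow(int i) {
        i = Math.abs(i);
        return i < 0 ? 0 : i;
    }

    static int safeAbsFast(int i) {
        return i == Integer.MIN_VALUE ? 0 : Math.abs(i);
    }

    // Sums the slow variant over the array, mirroring the loop in the listings.
    static int sumSlow(int[] data) {
        int sum = 0;
        for (int v : data) {
            sum += safeAbsSlow(v);
        }
        return sum;
    }

    // Same loop using the fast variant.
    static int sumFast(int[] data) {
        int sum = 0;
        for (int v : data) {
            sum += safeAbsFast(v);
        }
        return sum;
    }

    // Hypothetical generator: random ints of both signs, reproducible via the seed.
    static int[] mixedSigns(int size, long seed) {
        Random rnd = new Random(seed);
        int[] data = new int[size];
        for (int k = 0; k < size; k++) {
            data[k] = rnd.nextInt();
        }
        return data;
    }

    public static void main(String[] args) {
        int[] shuffled = mixedSigns(1_000_000, 42L);
        int[] sorted = shuffled.clone();
        Arrays.sort(sorted); // all negatives first, then all non-negatives

        // Both variants do the same work; only the branch pattern differs.
        System.out.println(sumSlow(shuffled) == sumFast(shuffled));
        // int addition wraps and is order-independent, so sorting keeps the sum.
        System.out.println(sumSlow(sorted) == sumSlow(shuffled));
    }
}
```

In the sorted array the sign test in safeAbsSlow changes direction only once, at the boundary between negative and non-negative values, which is why the predictor copes so well with it.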