Java微基準效能測試:數字轉字串方式哪家強?JMH來幫忙

語言: CN / TW / HK

目前各大網際網路公司都開始注重程式碼質量,在我司,單元測試已經在進行全面推廣和覆蓋中,這次,我們通過一起實際的例子,聊一聊另一種非常重要的測試,也就是微基準效能測試。

Java中數字轉字串相信大家都有做過,四種常用的轉換方式,究竟用哪種最優呢?本次通過對

  • Integer.toString(a)

  • String.valueOf(a)

  • a + ""

  • "" + a

四種數字轉字串的方式進行效能探究和分析,使大家對效能測試有正確的認識,逐步瞭解和掌握JMH微基準測試。

常規的測試方式

提到效能測試,如果是沒有經驗的小夥伴,通常會寫出以下程式碼,企圖通過迴圈來放大每次的執行速度偏差,程式碼如下:

public class CommonTest {
public static void main(String[] args) {
int num = 1000000;
int a = 123456789;
long start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = a + "";
}
long end = System.currentTimeMillis();
System.out.println("a+\"\" = " + (end - start));




start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = "" + a;
}
end = System.currentTimeMillis();
System.out.println("\"\"+a = " + (end - start));




start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = String.valueOf(a);
}
end = System.currentTimeMillis();
System.out.println("String.valueOf(a) = " +(end-start));




start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = Integer.toString(a);
}
end = System.currentTimeMillis();
System.out.println("Integer.toString(a) = " +(end-start));
}
}

本機環境為MacBook,處理器:2.6 GHz 六核Intel Core i7,記憶體:16 GB 2667 MHz DDR4

1000000資料測試結果:

a+"" = 56
""+a = 30
String.valueOf(a) = 31
Integer.toString(a) = 29

耗時結果為 a+"" > “”+a , String.valueOf(a) > Integer.toString(a),

多次測試,結果顯示Integer.toString(a)與String.valueOf(a)消耗時間相似,String.valueOf(a)大於Integer.toString(a),a+""始終大於""+a,且"" + a 消耗時長最小。

500000000資料測試結果:

a+"" = 9527
""+a = 9358
String.valueOf(a) = 13620
Integer.toString(a) = 13501

耗時結果為String.valueOf(a) > Integer.toString(a) > a+"" > “”+a ,但是a+““和”“+a的差距明顯縮小了。

如果以上結果你認為是偶然性偏差,那麼下面我再來做一個操作,可能會再次改變你的認知。

public class CommonTest {
public static void main(String[] args) {
int num = 500000000;
int a = 123456789;
// 本次程式碼的改變是,將“” + a 和 a + “”的測試順序進行了改變,先測試“” + a
long start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = "" + a;


}
long end = System.currentTimeMillis();
System.out.println("\"\"+a = " + (end - start));


start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = a + "";
}
end = System.currentTimeMillis();
System.out.println("a+\"\" = " + (end - start));


start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = String.valueOf(a);
}
end = System.currentTimeMillis();
System.out.println("String.valueOf(a) = " +(end-start));


start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = Integer.toString(a);
}
end = System.currentTimeMillis();
System.out.println("Integer.toString(a) = " +(end-start));
}
}

大家可以自行測試一下,這裡我可以直接告訴大家測試結果,a+"" 和 “”+a的測試結果出現了反轉,由之前的a+"" 始終大於 “”+a變為了a+"" 始終小於 “”+a。

在不同資料量,甚至測試順序的變化,都導致測試結果發生了差異,究竟哪個結果更可信?我們該如何正確的進行效能測試?

結果探究

在介紹正確的效能測試方法之前,我們可以先對以上測試結果的偏差原因進行分析。

點開String.valueOf(int i)原始碼不難看出:

String.valueOf(int i)其實是呼叫的Integer.toString(i),所以String.valueOf(int i)呼叫時間大於Integer.toString(i)比較正常,兩者時間應該非常相似。

為了驗證字串相加的編譯結果,下面給出探究過程:

測試程式碼:

package com.bestqiang.commontest;


public class CommonTest {
public static void main(String[] args) {
int a = 1;
String str = "hello" + a;


String str2 = "hello2" + 1;


String str3 = a + "hello3";


String str4 = 1 + "hello4";
}
}

首先看 str2, 我們把其他程式碼註釋掉,只留下 String str2 = "hello2" + 1;

使用javap解析可得:

"C:\Program Files\Java\jdk1.8.0_221\bin\javap.exe" -v com.bestqiang.commontest.CommonTest
Classfile /D:/idea-space2/learning-technology/target/test-classes/com/bestqiang/commontest/CommonTest.class
Last modified 2021-8-19; size 481 bytes
MD5 checksum 2d6ee54fb564dc0a7fc51ab0a617cfcc
Compiled from "CommonTest.java"
public class com.bestqiang.commontest.CommonTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #4.#20 // java/lang/Object."<init>":()V
#2 = String #21 // hello21
#3 = Class #22 // com/bestqiang/commontest/CommonTest
#4 = Class #23 // java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
#7 = Utf8 Code
#8 = Utf8 LineNumberTable
#9 = Utf8 LocalVariableTable
#10 = Utf8 this
#11 = Utf8 Lcom/bestqiang/commontest/CommonTest;
#12 = Utf8 main
#13 = Utf8 ([Ljava/lang/String;)V
#14 = Utf8 args
#15 = Utf8 [Ljava/lang/String;
#16 = Utf8 str2
#17 = Utf8 Ljava/lang/String;
#18 = Utf8 SourceFile
#19 = Utf8 CommonTest.java
#20 = NameAndType #5:#6 // "<init>":()V
#21 = Utf8 hello21
#22 = Utf8 com/bestqiang/commontest/CommonTest
#23 = Utf8 java/lang/Object
{
public com.bestqiang.commontest.CommonTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 9: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/bestqiang/commontest/CommonTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=2, args_size=1
0: ldc #2 // String hello21
2: astore_1
3: return
LineNumberTable:
line 14: 0
line 19: 3
LocalVariableTable:
Start Length Slot Name Signature
0 4 0 args [Ljava/lang/String;
3 1 1 str2 Ljava/lang/String;
}
SourceFile: "CommonTest.java"
Process finished with exit code 0

解析結果顯示,編譯器直接優化為了hello21,沒有那種StringBuilder追加的情況發生。

接著,我們把其他程式碼註釋掉,只留下 String str = "hello" + a;

使用javap解析可得:

Classfile /D:/idea-space2/learning-technology/target/test-classes/com/bestqiang/commontest/CommonTest.class
Last modified 2021-8-19; size 705 bytes
MD5 checksum 69bc22cd50230bd882b407a17dcff463
Compiled from "CommonTest.java"
public class com.bestqiang.commontest.CommonTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #9.#27 // java/lang/Object."<init>":()V
#2 = Class #28 // java/lang/StringBuilder
#3 = Methodref #2.#27 // java/lang/StringBuilder."<init>":()V
#4 = String #29 // hello
#5 = Methodref #2.#30 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#6 = Methodref #2.#31 // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
#7 = Methodref #2.#32 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#8 = Class #33 // com/bestqiang/commontest/CommonTest
#9 = Class #34 // java/lang/Object
#10 = Utf8 <init>
#11 = Utf8 ()V
#12 = Utf8 Code


#13 = Utf8 LineNumberTable
#14 = Utf8 LocalVariableTable
#15 = Utf8 this
#16 = Utf8 Lcom/bestqiang/commontest/CommonTest;
#17 = Utf8 main
#18 = Utf8 ([Ljava/lang/String;)V
#19 = Utf8 args
#20 = Utf8 [Ljava/lang/String;
#21 = Utf8 a
#22 = Utf8 I
#23 = Utf8 str
#24 = Utf8 Ljava/lang/String;
#25 = Utf8 SourceFile
#26 = Utf8 CommonTest.java
#27 = NameAndType #10:#11 // "<init>":()V
#28 = Utf8 java/lang/StringBuilder
#29 = Utf8 hello
#30 = NameAndType #35:#36 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#31 = NameAndType #35:#37 // append:(I)Ljava/lang/StringBuilder;
#32 = NameAndType #38:#39 // toString:()Ljava/lang/String;
#33 = Utf8 com/bestqiang/commontest/CommonTest
#34 = Utf8 java/lang/Object
#35 = Utf8 append
#36 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#37 = Utf8 (I)Ljava/lang/StringBuilder;
#38 = Utf8 toString
#39 = Utf8 ()Ljava/lang/String;
{
public com.bestqiang.commontest.CommonTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 9: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/bestqiang/commontest/CommonTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=3, args_size=1
0: iconst_1
1: istore_1
2: new #2 // class java/lang/StringBuilder
5: dup
6: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
9: ldc #4 // String hello
11: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
14: iload_1
15: invokevirtual #6 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
18: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
21: astore_2
22: return
LineNumberTable:
line 11: 0
line 12: 2
line 19: 22
LocalVariableTable:
Start Length Slot Name Signature
0 23 0 args [Ljava/lang/String;
2 21 1 a I
22 1 2 str Ljava/lang/String;
}
SourceFile: "CommonTest.java"
Process finished with exit code 0

如上圖所示,其實a + “hello”內部是用stringBuilder進行追加操作的。

a + “hello3”僅僅是順序發生了變化,底層實現是否是相同的呢?我們把其他程式碼註釋掉,只留下 String str3 = a + “hello3”;

使用javap解析可得:

Classfile /D:/idea-space2/learning-technology/target/test-classes/com/bestqiang/commontest/CommonTest.class
Last modified 2021-8-19; size 707 bytes
MD5 checksum b767896bc82bc01ac0153354ac6e5886
Compiled from "CommonTest.java"
public class com.bestqiang.commontest.CommonTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #9.#27 // java/lang/Object."<init>":()V
#2 = Class #28 // java/lang/StringBuilder
#3 = Methodref #2.#27 // java/lang/StringBuilder."<init>":()V
#4 = Methodref #2.#29 // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
#5 = String #30 // hello3
#6 = Methodref #2.#31 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#7 = Methodref #2.#32 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#8 = Class #33 // com/bestqiang/commontest/CommonTest
#9 = Class #34 // java/lang/Object
#10 = Utf8 <init>
#11 = Utf8 ()V
#12 = Utf8 Code
#13 = Utf8 LineNumberTable
#14 = Utf8 LocalVariableTable
#15 = Utf8 this
#16 = Utf8 Lcom/bestqiang/commontest/CommonTest;
#17 = Utf8 main
#18 = Utf8 ([Ljava/lang/String;)V
#19 = Utf8 args
#20 = Utf8 [Ljava/lang/String;
#21 = Utf8 a
#22 = Utf8 I
#23 = Utf8 str3
#24 = Utf8 Ljava/lang/String;
#25 = Utf8 SourceFile
#26 = Utf8 CommonTest.java
#27 = NameAndType #10:#11 // "<init>":()V
#28 = Utf8 java/lang/StringBuilder
#29 = NameAndType #35:#36 // append:(I)Ljava/lang/StringBuilder;
#30 = Utf8 hello3
#31 = NameAndType #35:#37 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#32 = NameAndType #38:#39 // toString:()Ljava/lang/String;
#33 = Utf8 com/bestqiang/commontest/CommonTest
#34 = Utf8 java/lang/Object
#35 = Utf8 append
#36 = Utf8 (I)Ljava/lang/StringBuilder;
#37 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#38 = Utf8 toString
#39 = Utf8 ()Ljava/lang/String;
{
public com.bestqiang.commontest.CommonTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 9: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/bestqiang/commontest/CommonTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=3, args_size=1
0: iconst_1
1: istore_1
2: new #2 // class java/lang/StringBuilder
5: dup
6: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
9: iload_1
10: invokevirtual #4 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
13: ldc #5 // String hello3
15: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
18: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
21: astore_2
22: return
LineNumberTable:
line 11: 0
line 16: 2
line 19: 22
LocalVariableTable:
Start Length Slot Name Signature
0 23 0 args [Ljava/lang/String;
2 21 1 a I
22 1 2 str3 Ljava/lang/String;
}
SourceFile: "CommonTest.java"

很明顯,同樣是使用StringBuilder進行的追加操作。

二者底層實現上並無什麼不同,

目前有兩個疑問產生了,一個是測試迴圈資料量提升後為何導致“”+a和a+“”的效能反超Integer.toString(a)?另一個是為何“”+a和a+“”的底層實現相同,但是測試順序不同,導致結果也不同,資料量大後這個差異就縮小了?

其實,結果之所以有這些差異,原因就是JVM未進行預熱,且JIT編譯器會對我們所寫的熱點程式碼進行優化,具體方式有迴圈展開(loop unrolling)、OSR(On-Stack Replacement)、方法內聯等,由於篇幅問題,不再這裡詳細的一一介紹,後續會附上總的梳理腦圖,感興趣的小夥伴可以單獨再去深入瞭解。也就是說,你所寫的程式碼!=真正執行的程式碼,我們的JVM會對你寫的程式碼進行隱形優化,導致你的測試結果發生偏差。

簡單驗證JIT編譯優化:

先簡單驗證一下,驗證方法很簡單,我們先把把OSR給關閉,具體方法就是在執行JVM時使用JVM引數:-XX:-UseOnStackReplacement,接下來,我們再使用不同資料量級資料進行測試。

1000000資料測試結果:

""+a = 116
a+"" = 74
String.valueOf(a) = 43
Integer.toString(a) = 68

可以看到,”“+a和a + ”“的耗時已經超過了String.valueOf(a) 和 Integer.toString(a)。

500000000資料測試結果:

""+a = 37581
a+"" = 36621
String.valueOf(a) = 20173
Integer.toString(a) = 19934

可以看到,”“+a和a + ”“的耗時依然是超過了String.valueOf(a) 和 Integer.toString(a)的。

簡單驗證JVM預熱影響:

驗證這個很好辦,我們在進行真正測試之前,關閉OSR,再加上一段測試程式碼先跑一段時間即可,如:

      for (int i=0; i<100000000; i++){
result = "" + a;
}

新增程式碼後,使用 1000000資料測試結果:

""+a = 71
a+"" = 70
String.valueOf(a) = 41
Integer.toString(a) = 40

可以看到”“ + a 與 a + ”“耗時的差異已經消失了,且始終大於String.valueOf(a)與Integer.toString(a)。

開啟OSR後,使用 500000000資料測試結果:

""+a = 11390
a+"" = 11220
String.valueOf(a) = 14500
Integer.toString(a) = 14423

可以看到,”“ + a 與 a + ”“效能再次反超Integer.toString(a)。

由此我們可以得出結論:String.valueOf(a)內部呼叫Integer.toString(a)在原始碼中可以看出。a+""與“”+a在javap解析後內部均使用StringBuilder進行相加,時間耗時測試也無特殊差異,另外對於非變數的相加如1+“hello”,使用javap分析可以看出直接優化為“1hello”放入常量池。關鍵在於Integer.toString(a)與“”+a在不同資料量迴圈的速度差異問題,考慮到低資料量與高數量結果迥然不同,所以懷疑是熱點資料觸發OSR編譯導致的結果差異,考慮到JVM採用分層編譯,主要為解釋執行+JIT即時編譯,JIT檢測熱點資料的方法為基於呼叫計數器(Invocation Counter)和回邊計數器(Back Edge Counter)的熱點探測,這裡測試為直接用-XX:-UseOnStackReplacement直接關閉OSR,測試結果為無論資料量大還是小Integer.toString(a)耗時均小於""+a,初步得出結論為OSR對熱點資料優化為機器指令後“”+a效率大於Integer.toString(a),否則Integer.toString(a)效率大於“”+a。

我們僅僅為了測試一個Java數字轉字串的效能,就需要關注這麼多影響變數,探究這麼多原理,那麼,有沒有一個工具,能夠把這些都考慮進去,自動化統計得到真實環境執行時的相對準確的效能測試結果呢?微基準測試神器JMH正可以滿足我們的需求。

使用JMH

官方介紹:JMH 是一個 Java 工具,用於構建、執行和分析用 Java 和其他面向 JVM 的語言編寫的 nano/micro/milli/macro 基準測試。

JMH全稱為Java Microbenchmark Harness,是一個專為Java做基準測試的工具,因為JVM層面在編譯期、載入、執行時對程式碼做很多優化,同樣的一段程式碼,在整個程式執行時不一定會真正生效,所以對Java做微基準測試是非常困難的,需要對JVM的原理非常瞭解,所以,JMH由JVM團隊親自進行開發,而且開發者給我們提供了38個例子,供我們進行學習,避坑。

我們如果想要測試一段程式碼片段的效能,就比如以上的例子:int型別轉字串這種程式碼粒度,往往不是介面維度的,需要較高精度才能準確測試,我們的一般的效能測試工具,如JMeter,往往測試的為介面維度,且其框架本身的執行成本也會使得測試精度下降,測試精度是無法達到微基準測試的要求的。JMH提供了不同的測試模式,並且幫我們自動進行測試預熱、JVM多環境隔離、控制方法內聯,並且提供多種引數物件,如黑洞物件、實驗狀態物件幫我們更好的進行微基準測試和避坑。

Talk is cheap, show me the code!讓我們看看如何使用JMH進行上面我們的int型別轉字串的效能測試吧!

@State(Scope.Thread)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class JHMTest {
private int a = 123456;


@Benchmark
public String work1() {
return "" + a;
}


@Benchmark
public String work2() {
return a + "";
}


@Benchmark
public String work3() {
return Integer.toString(a);
}


@Benchmark
public String work4() {
return String.valueOf(a);
}


public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(JHMTest.class.getSimpleName())
.forks(1)
.build();
new Runner(opt).run();
}
}

測試結果為:

# JMH version: 1.21
# VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work1


# Run progress: 0.00% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration 1: 29.389 ns/op
# Warmup Iteration 2: 20.454 ns/op
# Warmup Iteration 3: 16.060 ns/op
# Warmup Iteration 4: 16.014 ns/op
# Warmup Iteration 5: 15.806 ns/op
Iteration 1: 15.982 ns/op
Iteration 2: 15.901 ns/op
Iteration 3: 16.003 ns/op
Iteration 4: 15.865 ns/op
Iteration 5: 15.929 ns/op


Result "bestqiang.test.jmh.JHMTest.work1":
15.936 ±(99.9%) 0.219 ns/op [Average]
(min, avg, max) = (15.865, 15.936, 16.003), stdev = 0.057
CI (99.9%): [15.717, 16.155] (assumes normal distribution)


# JMH version: 1.21
# VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work2
# Run progress: 25.00% complete, ETA 00:00:32
# Fork: 1 of 1
# Warmup Iteration 1: 24.678 ns/op
# Warmup Iteration 2: 19.042 ns/op
# Warmup Iteration 3: 15.775 ns/op
# Warmup Iteration 4: 15.769 ns/op
# Warmup Iteration 5: 15.847 ns/op
Iteration 1: 15.915 ns/op
Iteration 2: 15.923 ns/op
Iteration 3: 15.836 ns/op
Iteration 4: 15.898 ns/op
Iteration 5: 15.853 ns/op


Result "bestqiang.test.jmh.JHMTest.work2"
15.885 ±(99.9%) 0.149 ns/op [Average]
(min, avg, max) = (15.836, 15.885, 15.923), stdev = 0.039
CI (99.9%): [15.736, 16.034] (assumes normal distribution)


# JMH version: 1.21
# VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work3
# Run progress: 50.00% complete, ETA 00:00:21
# Fork: 1 of 1
# Warmup Iteration 1: 34.360 ns/op
# Warmup Iteration 2: 26.414 ns/op
# Warmup Iteration 3: 25.597 ns/op
# Warmup Iteration 4: 25.619 ns/op
# Warmup Iteration 5: 25.743 ns/op
Iteration 1: 25.723 ns/op
Iteration 2: 25.671 ns/op
Iteration 3: 25.862 ns/op
Iteration 4: 25.863 ns/op
Iteration 5: 25.774 ns/op


Result "bestqiang.test.jmh.JHMTest.work3":
25.779 ±(99.9%) 0.327 ns/op [Average]
(min, avg, max) = (25.671, 25.779, 25.863), stdev = 0.085
CI (99.9%): [25.452, 26.106] (assumes normal distribution)


# JMH version: 1.21
# VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work4
# Run progress: 75.00% complete, ETA 00:00:10


# Fork: 1 of 1
# Warmup Iteration 1: 34.378 ns/op
# Warmup Iteration 2: 26.560 ns/op
# Warmup Iteration 3: 25.823 ns/op
# Warmup Iteration 4: 25.633 ns/op
# Warmup Iteration 5: 26.084 ns/op
Iteration 1: 25.934 ns/op
Iteration 2: 26.429 ns/op
Iteration 3: 30.559 ns/op
Iteration 4: 28.399 ns/op
Iteration 5: 26.415 ns/op


Result "bestqiang.test.jmh.JHMTest.work4":
27.547 ±(99.9%) 7.439 ns/op [Average]
(min, avg, max) = (25.934, 27.547, 30.559), stdev = 1.932
CI (99.9%): [20.108, 34.987] (assumes normal distribution)


# Run complete. Total time: 00:00:42
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.


Benchmark Mode Cnt Score Error Units
JHMTest.work1 avgt 5 15.936 ± 0.219 ns/op
JHMTest.work2 avgt 5 15.885 ± 0.149 ns/op
JHMTest.work3 avgt 5 25.779 ± 0.327 ns/op
JHMTest.work4 avgt 5 27.547 ± 7.439 ns/op

@Benchmark代表了開啟基準測試的方法,avgt代表平均耗時,@Fork代表了我們開啟的JVM環境,比如1就代表我們為每個進行基準測試的方法都單獨開啟一個JVM環境,我們幾個進行基準測試的方法可以進行環境隔離避免互相影響,@Warmup代表我們的預熱引數,@Measurement代表我們的測試引數,可以看到,基準測試的執行結果和我們使用常規測試的最終結果是一致的。

編寫一個合格的基準測試,遠不止這麼簡單,上面的用例看似簡單,但是蘊含了一些坑點,比如變數a,如果定義為final,你會發現,最終結果會截然不同,這裡我直接貼出最終測試結果:

Benchmark      Mode  Cnt   Score   Error  Units
JHMTest.work1 avgt 5 3.166 ± 0.184 ns/op
JHMTest.work2 avgt 5 3.167 ± 0.103 ns/op
JHMTest.work3 avgt 5 26.091 ± 1.276 ns/op
JHMTest.work4 avgt 5 25.424 ± 1.306 ns/op

可以明顯的看到,“”+a和a+“”的平均耗時降低為了3 ns/op。這其實就是常量摺疊(Constant folding)起了作用,它是一個在編譯時期簡化常數的一個過程,這個優化導致我們拿到了錯誤的結果。但是我們也可以看到work3和work4並未受到太大影響,這種優化我們是難以準確預測的,所以我們需要按照標準格式來寫我們的測試用例,否則一旦JVM進行優化,我們就會拿到異常測試結果。又比如,上面的例子中,我們在每次計算完畢後就將結果返回,這就是為了避免”死碼消除“,如果一塊程式碼我們一直沒有用到,那麼可能它就會被編譯器優化掉,我們將計算結果返回可以避免這種情況,又或者,我們可以使用Blackholes來將我們的計算值進行消耗,避免”死碼消除“的情況發生,Blackholes因為本身也有一定的計算邏輯,為避免對測試結果造成影響,這裡使用了直接將計算結果返回的方式。

我們在進行微基準測試時,並沒有再使用迴圈,那麼使用迴圈後會有什麼後果呢?JMH的官方案例 JMHSample_11_Loops 中特別提到:don't overuse loops, rely on JMH to get the measurement right. 也就是說,不要使用迴圈,依賴JMH去獲取正確的結果,因為迴圈是在是坑點太多,優化點太多,如果想要寫出一個安全的迴圈也是非常的繁瑣的,所以我們日常使用中應該儘量避免。如果實在想要使用迴圈的話,官方提供了一個例子 JMHSample_34_SafeLooping 用來寫正確的迴圈,應用在我們本次的測試用例上面,程式碼可以這麼寫,供大家參考:

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;


import java.util.concurrent.TimeUnit;


@State(Scope.Thread)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class JHMTest {


@Param({"1000000"})
int size;


static String work1(int x) {
return "" + x;
}


static String work2(int x) {
return x + "";
}


static String work3(int x) {
return Integer.toString(x);
}


static String work4(int x) {
return String.valueOf(x);
}


@Benchmark
public void test1() {
for (int i = 0; i < size; i ++) {
work1(i);
}
}


@Benchmark
public void test2() {
for (int i = 0; i < size; i ++) {
work2(i);
}
}


@Benchmark
public void test3() {
for (int i = 0; i < size; i ++) {
work3(i);
}
}


@Benchmark
public void test4() {
for (int i = 0; i < size; i ++) {
work4(i);
}
}


@CompilerControl(CompilerControl.Mode.DONT_INLINE)
public static void sink(String v) {
// IT IS VERY IMPORTANT TO MATCH THE SIGNATURE TO AVOID AUTOBOXING.
// The method intentionally dBenchmarkProcessoroes nothing.
}


public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(JHMTest.class.getSimpleName())
.forks(3)
.build();


new Runner(opt).run();
}
}

CompilerControl.Mode.INLINE 這個註解的意思就是強制跳過內聯的意思,避免內聯優化。

關於JMH的使用例子,一共有38個,這裡我們不再一一介紹,大家可以去檢視 https://github.com/openjdk/jmh 官方程式碼,這些例子可以幫助大家更好的使用JMH。關於這38個例子和涉及的知識點,我進行了腦圖梳理總結,可以幫助大家更好的學習和理解:

以上,本次的分享就結束了,關於微基準效能測試,你學到了嗎?