android 開發——疑難雜症ANR簡單介紹與解析
一、ANR介紹
ANR-application not response,應用無響應,應用開發者一般是關注自己的APP程序有沒有出現,系統開發者會關注當前系統執行起來後整體上所有的APP程序有沒有出現ANR,從這句話可以知道,只有應用程序的主執行緒(UI執行緒)超時可能導致ANR,我們從系統整體角度來闡述問題,順便也能包含應用出現的ANR bug。
從現象來看,ANR出現時,會彈出一個應用無響應的彈框,當然也可能不彈,一是因為現在的裝置廠商覺得這彈框難看,給去掉了,二是有些ANR本身就只有log記錄,不會出現彈框。一直聽到的就是不能在主執行緒做耗時操作,否則可能造成ANR,這裡首先要明確:主執行緒做了大量耗時操作後,還得有事件產生,按鍵也好、點選觸控的touch事件也好,這樣才會出現,整體上可以歸成幾類:
- 1、應用程序主執行緒導致,這裡主要表現的就是activity-5s超時(會彈框),broadcast-10s超過,service-20s超時,三類超時後,再產生事件,這個事件無法傳遞給焦點視窗就會上報ANR
- 2、CPU耗盡
- 3、記憶體耗盡
簡單來說,ANR就是產生的事件長時間沒機會執行,Google就設計了這樣一種提示使用者的互動流程,防止使用者以為裝置手機壞成板磚了,如果沒有這樣的提醒,使用者可能會說,這什麼破手機、破系統,進而扣電池砸手機,有個彈框就能把問題聚焦到具體的某個APP上。因此CPU或者記憶體耗盡,它顯然也會導致ANR的出現,恰好此時的APP上某個控制元件點選事件他就是無法執行(CPU功耗高,UI執行緒一直搶不到執行的時間片)而超時。記憶體耗盡也是同樣的邏輯。可以看到,出現ANR的程序中,有時候是自身程序裡導致的,也有是受了池魚之災,其他程序佔用CPU太多,導致自己完全無法執行而出現。
二、ANR怎麼入手分析
上面說了ANR是怎麼一回事,以及ANR為什麼會產生,接著我們來分析,出現ANR應該怎麼檢視,怎麼追蹤問題是哪裡產生。出現ANR時(怎麼讓裝置出現ANR日誌,最簡單的辦法是shell命令:adb shell kill 3 pid)," data/anr/ “目錄下會產生類似"anr_2022-03-29-15-45-53-142"這樣命名的檔案(老版本時ANR產生的日誌叫” data/anr/ traces.txt",老版本只會記錄最近一次ANR發生時的系統堆疊,高版本改了,預設可以儲存64個檔案,每發生一次ANR,就會生成一個檔案),因此在產生ANR時取出日誌,我們可以人為的製造一次ANR,很容易-activity的oncreate()週期裡來個死迴圈,然後跳轉這個activity後,點選螢幕,OK-anr_2022-03-29-15-45-53-142檔案產生,下面來看如何分析ANR日誌:
2.1、Traces檔案詳細分析
----- pid 30675 at 2022-03-29 15:45:53 ----- Cmd line: com.android.test
開頭就是發生ANR的程序pid、時間、以及程序名,從開頭的資訊,系統開發人員就能把這個bug分給具體某個應用負責人了,當然有些時候發生ANR的程序可能沒問題,是被別人給耽誤了,所以接著往下看
``` Build fingerprint: ‘//*:10/QKQ1.210910.001/75:userdebug/release-keys’ ABI: ‘arm64’ Build type: optimized Zygote loaded classes=9126 post zygote classes=318 Dumping registered class loaders
0 dalvik.system.PathClassLoader: [], parent #1
1 java.lang.BootClassLoader: [], no parent
2 dalvik.system.PathClassLoader: [/system/framework/tcmclient.jar], parent #0
3 dalvik.system.PathClassLoader: [], parent #0
4 dalvik.system.PathClassLoader: [/data/app/com.android.test-o0YNE5m6e2qP5cEIcS4y8A==/base.apk:/data/app/com.android.test-o0YNE5m6e2qP5cEIcS4y8A==/base.apk!classes2.dex], parent #1
Done dumping class loaders Intern table: 33006 strong; 438 weak JNI: CheckJNI is on; globals=618 (plus 36 weak) Libraries: /data/app/com.android.test-o0YNE5m6e2qP5cEIcS4y8A==/lib/arm64/libSRS.so libandroid.so libcompiler_rt.so libjavacore.so libjavacrypto.so libjnigraphics.so libmedia_jni.so libopenjdk.so libqti_performance.so libsoundpool.so libwebviewchromium_loader.so (11) Heap: 24% free, 1468KB/1948KB; 40045 objects ```
這是發生ANR的應用程序一些資訊,如果是自己的應用看著會覺得很熟悉,接著來
Dumping cumulative Gc timings
Start Dumping histograms for 1 iterations for concurrent copying
ProcessMarkStack: Sum: 12.241ms 99% C.I. 12.241ms-12.241ms Avg: 12.241ms Max: 12.241ms
ScanImmuneSpaces: Sum: 2.466ms 99% C.I. 2.466ms-2.466ms Avg: 2.466ms Max: 2.466ms
VisitConcurrentRoots: Sum: 1.220ms 99% C.I. 1.220ms-1.220ms Avg: 1.220ms Max: 1.220ms
ClearFromSpace: Sum: 850us 99% C.I. 850us-850us Avg: 850us Max: 850us
ThreadListFlip: Sum: 700us 99% C.I. 700us-700us Avg: 700us Max: 700us
FlipThreadRoots: Sum: 389us 99% C.I. 389us-389us Avg: 389us Max: 389us
ScanCardsForSpace: Sum: 358us 99% C.I. 358us-358us Avg: 358us Max: 358us
InitializePhase: Sum: 293us 99% C.I. 293us-293us Avg: 293us Max: 293us
GrayAllDirtyImmuneObjects: Sum: 174us 99% C.I. 174us-174us Avg: 174us Max: 174us
FlipOtherThreads: Sum: 142us 99% C.I. 142us-142us Avg: 142us Max: 142us
SweepSystemWeaks: Sum: 131us 99% C.I. 131us-131us Avg: 131us Max: 131us
EnqueueFinalizerReferences: Sum: 75us 99% C.I. 75us-75us Avg: 75us Max: 75us
VisitNonThreadRoots: Sum: 65us 99% C.I. 65us-65us Avg: 65us Max: 65us
ResumeRunnableThreads: Sum: 46us 99% C.I. 46us-46us Avg: 46us Max: 46us
(Paused)GrayAllNewlyDirtyImmuneObjects: Sum: 41us 99% C.I. 41us-41us Avg: 41us Max: 41us
MarkZygoteLargeObjects: Sum: 28us 99% C.I. 28us-28us Avg: 28us Max: 28us
CopyingPhase: Sum: 24us 99% C.I. 24us-24us Avg: 24us Max: 24us
RecordFree: Sum: 19us 99% C.I. 19us-19us Avg: 19us Max: 19us
ForwardSoftReferences: Sum: 17us 99% C.I. 17us-17us Avg: 17us Max: 17us
ProcessReferences: Sum: 16us 99% C.I. 3us-13us Avg: 8us Max: 13us
ReclaimPhase: Sum: 13us 99% C.I. 13us-13us Avg: 13us Max: 13us
EmptyRBMarkBitStack: Sum: 10us 99% C.I. 10us-10us Avg: 10us Max: 10us
(Paused)FlipCallback: Sum: 7us 99% C.I. 7us-7us Avg: 7us Max: 7us
(Paused)SetFromSpace: Sum: 6us 99% C.I. 6us-6us Avg: 6us Max: 6us
ResumeOtherThreads: Sum: 4us 99% C.I. 4us-4us Avg: 4us Max: 4us
(Paused)ClearCards: Sum: 2us 99% C.I. 250ns-2000ns Avg: 95ns Max: 2000ns
UnBindBitmaps: Sum: 1us 99% C.I. 1us-1us Avg: 1us Max: 1us
Done Dumping histograms
concurrent copying paused: Sum: 764us 99% C.I. 764us-764us Avg: 764us Max: 764us
concurrent copying freed-bytes: Avg: 49KB Max: 49KB Min: 49KB
Freed-bytes histogram: 0:1
concurrent copying total time: 19.399ms mean time: 19.399ms
concurrent copying freed: 262 objects with total size 49KB
concurrent copying throughput: 13789.5/s / 2586KB/s per cpu-time: 5032000/s / 4914KB/s
Average major GC reclaim bytes ratio 0.679411 over 1 GC cycles
Average major GC copied live bytes ratio 0.765451 over 4 major GCs
Cumulative bytes moved 5264536
Cumulative objects moved 117659
Peak regions allocated 226 (56MB) / 384 (96MB)
Start Dumping histograms for 1 iterations for young concurrent copying
ScanImmuneSpaces: Sum: 7.517ms 99% C.I. 7.517ms-7.517ms Avg: 7.517ms Max: 7.517ms
ProcessMarkStack: Sum: 7.348ms 99% C.I. 7.348ms-7.348ms Avg: 7.348ms Max: 7.348ms
VisitConcurrentRoots: Sum: 3.267ms 99% C.I. 3.267ms-3.267ms Avg: 3.267ms Max: 3.267ms
ClearFromSpace: Sum: 3.048ms 99% C.I. 3.048ms-3.048ms Avg: 3.048ms Max: 3.048ms
InitializePhase: Sum: 2.244ms 99% C.I. 2.244ms-2.244ms Avg: 2.244ms Max: 2.244ms
ScanCardsForSpace: Sum: 1.260ms 99% C.I. 1.260ms-1.260ms Avg: 1.260ms Max: 1.260ms
SweepSystemWeaks: Sum: 925us 99% C.I. 925us-925us Avg: 925us Max: 925us
GrayAllDirtyImmuneObjects: Sum: 713us 99% C.I. 713us-713us Avg: 713us Max: 713us
FlipOtherThreads: Sum: 324us 99% C.I. 324us-324us Avg: 324us Max: 324us
VisitNonThreadRoots: Sum: 254us 99% C.I. 254us-254us Avg: 254us Max: 254us
EnqueueFinalizerReferences: Sum: 204us 99% C.I. 204us-204us Avg: 204us Max: 204us
RecordFree: Sum: 126us 99% C.I. 1us-125us Avg: 63us Max: 125us
SweepArray: Sum: 110us 99% C.I. 110us-110us Avg: 110us Max: 110us
ThreadListFlip: Sum: 108us 99% C.I. 108us-108us Avg: 108us Max: 108us
(Paused)SetFromSpace: Sum: 77us 99% C.I. 77us-77us Avg: 77us Max: 77us
(Paused)ClearCards: Sum: 73us 99% C.I. 1us-26us Avg: 3.476us Max: 26us
CopyingPhase: Sum: 69us 99% C.I. 69us-69us Avg: 69us Max: 69us
(Paused)GrayAllNewlyDirtyImmuneObjects: Sum: 62us 99% C.I. 62us-62us Avg: 62us Max: 62us
ResumeRunnableThreads: Sum: 40us 99% C.I. 40us-40us Avg: 40us Max: 40us
ReclaimPhase: Sum: 33us 99% C.I. 33us-33us Avg: 33us Max: 33us
MarkZygoteLargeObjects: Sum: 31us 99% C.I. 31us-31us Avg: 31us Max: 31us
ProcessReferences: Sum: 28us 99% C.I. 5us-23us Avg: 14us Max: 23us
EmptyRBMarkBitStack: Sum: 26us 99% C.I. 26us-26us Avg: 26us Max: 26us
FreeList: Sum: 21us 99% C.I. 21us-21us Avg: 21us Max: 21us
ForwardSoftReferences: Sum: 20us 99% C.I. 20us-20us Avg: 20us Max: 20us
(Paused)FlipCallback: Sum: 8us 99% C.I. 8us-8us Avg: 8us Max: 8us
SwapBitmaps: Sum: 6us 99% C.I. 6us-6us Avg: 6us Max: 6us
FlipThreadRoots: Sum: 5us 99% C.I. 5us-5us Avg: 5us Max: 5us
ResumeOtherThreads: Sum: 3us 99% C.I. 3us-3us Avg: 3us Max: 3us
Done Dumping histograms
young concurrent copying paused: Sum: 314us 99% C.I. 314us-314us Avg: 314us Max: 314us
young concurrent copying freed-bytes: Avg: 3068KB Max: 3068KB Min: 3068KB
Freed-bytes histogram: 2880:1
young concurrent copying total time: 27.979ms mean time: 27.979ms
young concurrent copying freed: 6163 objects with total size 3068KB
young concurrent copying throughput: 228259/s / 110MB/s per cpu-time: 116376888/s / 110MB/s
Average minor GC reclaim bytes ratio 1.45298 over 1 GC cycles
Average minor GC copied live bytes ratio 0.0708724 over 2 minor GCs
Cumulative bytes moved 498176
Cumulative objects moved 4041
Peak regions allocated 226 (56MB) / 384 (96MB)
Total time spent in GC: 47.378ms
Mean GC size throughput: 63MB/s
Mean GC object throughput: 135590 objects/s
Total number of allocations 46469
Total bytes allocated 4565KB
Total bytes freed 3097KB
Free memory 480KB
Free memory until GC 480KB
Free memory until OOME 94MB
Total memory 1948KB
Max memory 96MB
Zygote space size 1508KB
Total mutator paused time: 1.078ms
Total time waiting for GC to complete: 16.563us
Total GC count: 2
Total GC time: 47.378ms
Total blocking GC count: 0
Total blocking GC time: 0
Native bytes total: 6988048 registered: 86288
Total native bytes at last GC: 6758872
/system/framework/oat/arm64/android.hidl.base-V1.0-java.odex: quicken
/system/framework/oat/arm64/android.hidl.manager-V1.0-java.odex: quicken
Current JIT code cache size: 3320B
Current JIT data cache size: 2264B
Current JIT mini-debug-info size: 8024B
Current JIT capacity: 64KB
Current number of JIT JNI stub entries: 0
Current number of JIT code cache entries: 9
Total number of JIT compilations: 5
Total number of JIT compilations for on stack replacement: 1
Total number of JIT code cache collections: 0
Memory used for stack maps: Avg: 243B Max: 480B Min: 40B
Memory used for compiled code: Avg: 953B Max: 2332B Min: 92B
Memory used for profiling info: Avg: 751B Max: 7776B Min: 24B
Start Dumping histograms for 18 iterations for JIT timings
Compiling: Sum: 25.341ms 99% C.I. 0.870ms-10.530ms Avg: 2.815ms Max: 10.701ms
TrimMaps: Sum: 2.896ms 99% C.I. 66us-887us Avg: 321.777us Max: 887us
Generate JIT debug info: Sum: 960us 99% C.I. 52us-247.750us Avg: 106.666us Max: 248us
Done Dumping histograms
Memory used for compilation: Avg: 110KB Max: 319KB Min: 17KB
ProfileSaver total_bytes_written=0
ProfileSaver total_number_of_writes=0
ProfileSaver total_number_of_code_cache_queries=0
ProfileSaver total_number_of_skipped_writes=0
ProfileSaver total_number_of_failed_writes=0
ProfileSaver total_ms_of_sleep=5000
ProfileSaver total_ms_of_work=0
ProfileSaver max_number_profile_entries_cached=0
ProfileSaver total_number_of_hot_spikes=0
ProfileSaver total_number_of_wake_ups=0
大家應該瞭解過Android虛擬機器的邏輯記憶體以及GC回收機制,能看懂GC young、old等,接下來就是重點,關鍵字 “main”
DALVIK THREADS (13):
“main” prio=5 tid=1 Runnable
| group=“main” sCount=0 dsCount=0 flags=0 obj=0x72b31ec8 self=0x7dacf71c00
| sysTid=30675 nice=0 cgrp=default sched=0/0 handle=0x7dae4d9ed0
| state=R schedstat=( 63858980563 752954346 3029 ) utm=6370 stm=15 core=2 HZ=100
| stack=0x7ff146a000-0x7ff146c000 stackSize=8192KB
| held mutexes= “mutator lock”(shared held)
TraceBegin:
at com.android.test.MainActivity.onCreate(MainActivity.java:97)
at android.app.Activity.performCreate(Activity.java:7845)
at android.app.Activity.performCreate(Activity.java:7834)
at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1307)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3246)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3410)
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83)
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
at android.app.ActivityThreadH . h a n d l e M e s s a g e ( A c t i v i t y T h r e a d . j a v a : 2017 ) a t a n d r o i d . o s . H a n d l e r . d i s p a t c h M e s s a g e ( H a n d l e r . j a v a : 107 ) a t a n d r o i d . o s . L o o p e r . l o o p ( L o o p e r . j a v a : 214 ) a t a n d r o i d . a p p . A c t i v i t y T h r e a d . m a i n ( A c t i v i t y T h r e a d . j a v a : 7478 ) a t j a v a . l a n g . r e f l e c t . M e t h o d . i n v o k e ( N a t i v e m e t h o d ) a t c o m . a n d r o i d . i n t e r n a l . o s . R u n t i m e I n i t H.handleMessage(ActivityThread.java:2017) at android.os.Handler.dispatchMessage(Handler.java:107) at android.os.Looper.loop(Looper.java:214) at android.app.ActivityThread.main(ActivityThread.java:7478) at java.lang.reflect.Method.invoke(Native method) at com.android.internal.os.RuntimeInitH.handleMessage(ActivityThread.java:2017)atandroid.os.Handler.dispatchMessage(Handler.java:107)atandroid.os.Looper.loop(Looper.java:214)atandroid.app.ActivityThread.main(ActivityThread.java:7478)atjava.lang.reflect.Method.invoke(Nativemethod)atcom.android.internal.os.RuntimeInitMethodAndArgsCaller.run(RuntimeInit.java:492)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:935)
TraceEnd:
用一個表格來解釋這些引數含義
至於TraceBegin-end這裡的執行時異常,就沒必要說了,開頭為了觸發ANR寫的死迴圈就在異常這裡,很容易就能看出來。接下來看Binder執行緒
“Binder:30675_1” prio=5 tid=13 Native
| group=“main” sCount=1 dsCount=0 flags=1 obj=0x12c802d8 self=0x7d1bf43800
| sysTid=30694 nice=0 cgrp=default sched=0/0 handle=0x7cc3972d50
| state=S schedstat=( 2909010 9923125 18 ) utm=0 stm=0 core=4 HZ=100
| stack=0x7cc387c000-0x7cc387e000 stackSize=991KB
| held mutexes=
kernel: (couldn’t read /proc/self/task/30694/stack)
native: #00 pc 00000000000d10b4 /apex/com.android.runtime/lib64/bionic/libc.so (__ioctl+4)
native: #01 pc 000000000008b6b0 /apex/com.android.runtime/lib64/bionic/libc.so (ioctl+132)
native: #02 pc 0000000000058d24 /system/lib64/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+244)
native: #03 pc 0000000000058f00 /system/lib64/libbinder.so (android::IPCThreadState::getAndExecuteCommand()+24)
native: #04 pc 00000000000596d8 /system/lib64/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+64)
native: #05 pc 000000000007f978 /system/lib64/libbinder.so (android::PoolThread::threadLoop()+24)
native: #06 pc 00000000000135f0 /system/lib64/libutils.so (android::Thread::_threadLoop(void*)+328)
native: #07 pc 00000000000c5bd8 /system/lib64/libandroid_runtime.so (android::AndroidRuntime::javaThreadShell(void*)+140)
native: #08 pc 00000000000e68c0 /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+36)
native: #09 pc 0000000000084b6c /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64)
TraceBegin:
(no managed stack frames)
TraceEnd:
Binder執行緒是程序中執行緒池用來處理binder請求的執行緒,後面還有很多,看日誌輸出的格式沒有差異,我們先把他們列舉出來:
- Signal Catcher --signal訊號,負責接收處理kernel傳送的各種訊號,Signal_quit(3)、6是非法地址越界、11是空指標好像,記不太清了可以百度signal訊號能查到,traces檔案就是由這個執行緒輸出的,能看到執行緒狀態是runnable
- Jit thread pool worker thread 0 --這是Java虛擬機器的程式碼即時編譯器?
- ADB-JDWP Connection Control Thread --支援虛擬機器除錯的執行緒
- HeapTaskDaemon --堆疊
- ReferenceQueueDaemon --引用佇列
- FinalizerDaemon --釋放物件前呼叫finalize
- FinalizerWatchdogDaemon --電子狗?
- Binder執行緒 --預設的話有三個binder執行緒
- Profile Saver --AS裡的Profile
- RenderThread --開啟硬體加速後,就會出現操作GPU的渲染執行緒 然後就完了,最後的目錄我猜測它的意思是:列印traces日誌是在哪個目錄下程式碼執行的(明顯它指的就是AMS在原始碼的路徑),暫時先這麼理解,哪天發現有錯誤,回頭來改正
----- end 30675 ----- frameworks/base/services/core/java/com/android/server
作為一個資深的程式猿,應該要對執行緒的各種狀態都瞭解,因為Android的程式碼分為Java端和native端(也就是c++),因此,執行緒也有這兩端的差異,本質含義差不多,但是名稱叫法有點出入(也就是執行緒狀態的標誌),記一遍就好,常量狀態一般是用大寫來命名,大寫我就不認識單詞了,全換成小寫,理解意思就好:
能看到c++側,執行緒狀態更多更細化一些,Traces日誌中的執行緒狀態是以c++端來賦值的,通過這個檔案很容易看出ANR出問題的地方在:com.android.test.MainActivity.onCreate(MainActivity.java:97),翻程式碼就能找到97行來修改,當然實際開發中可能會遇到非常複雜,難以分辨出來的ANR。
2.2、cpu、記憶體使用率分析
上面我們說,ANR也可能是由於CPU佔比太高,主執行緒一直搶不到時間片導致的,其實平時在做單元測試或者功能測試的時候,隨時關注程序的CPU佔比也是一個開發的好習慣,有三個shell命令要記住:
1、adb shell top --檢視當前系統中排名前20的程序資訊,以CPU佔比排序(後邊加 -p 程序號可以檢視某個程序的詳細資訊),top命令可以檢視系統中所有正在執行程序的CPU資訊和meminfo,-p 程序號能夠看到這一個程序巨集觀上的詳細資訊,可以把這些資訊抓下來(adb shell top -b >D:***.txt) 2、adb shell dumpsys cpuinfo,dumpsys命令可以提供非常多的服務,總共好像有200+, 3、adb shell dumpsys meminfo
top命令可以從整體上來看系統當前的CPU和記憶體資訊,也可以看某個程序的CPU與記憶體,能夠看出來這個程序是否正常,但是具體哪一行程式碼導致的是無法定位的,解釋下使用率,CPU使用率粗略的認為:一段時間T減去CPU空閒時間t後與這段時間的比例值,CPU使用率= (T - t)/T,在使用adb shell top 後 能看到有這樣一行資訊:
User 13%, System 5%, IOW 0%, IRQ 0% User 85 + Nice 0 + Sys 37 + Idle 509 + IOW 0 + IRQ 0 + SIRQ 0 = 631
User 處於使用者態的執行時間,不包含優先值為負程序 Nice 優先值為負的程序所佔用的CPU時間 Sys 處於核心態的執行時間 Idle 除IO等待時間以外的其它等待時間 IOW IO等待時間 IRQ 硬中斷時間 SIRQ 軟中斷時間
PID PR CPU% S #THR VSS RSS PCY UID Name
0 13% S 56 423416K 88160K fg u0_a92 com.tmall.wireless
1 2% R 1 1232K 536K root top
0 1% S 46 341712K 40872K fg u0_a90 com.wandoujia.phoenix2.usbproxy
1 1% S 31 319976K 33284K fg u0_a74 com.android.Chinpower
0 1% S 32 67320K 20552K fg system /system/bin/surfaceflinger
0 1% S 112 445876K 80304K fg system system_server
…
0 0% S 1 0K 0K root watchdog/0
1 0% S 1 0K 0K root khelper
1 0% S 1 0K 0K root suspend_sys_syn
1 0% S 1 0K 0K root suspend
PID 程序id PR 優先順序 CPU% 當前瞬時CPU佔用率 S 程序狀態:D=不可中斷的睡眠狀態, R=執行, S=睡眠, T=跟蹤/停止, Z=殭屍程序 #THR 程式當前所用的執行緒數 VSS Virtual Set Size 虛擬耗用記憶體(包含共享庫佔用的記憶體) RSS Resident Set Size 實際使用實體記憶體(包含共享庫佔用的記憶體) PCY 排程策略優先順序,SP_BACKGROUND/SP_FOREGROUND UID 程序所有者的使用者id Name 程序的名稱
CPU使用率的計算是在ProcessStats類中實現的。如果在某兩個時刻T1和T2(T1 < T2)進行取樣記錄,CPU使用率的整個演算法可以歸納為以下幾個公式:
userTime = (user2 + nice2) – (user1 + nice1)
systemTime = system2 - system1
idleTime = idle2 - idle1
iowaitTime = iowait2 - iowait1
irqTime = irq2 - irq1
softirqTime = softirq2 - softirq1
TotalTime = userTime + systemTime + idleTime + iowaitTime + irqTime + softirqTime
有了以上資料就可以計算具體的使用率了,例如使用者態CPU使用率為: userCpuUsage = userTime / TotalTime 依此類推可以計算其他型別的使用率。而整個時間段內CPU使用率為: CpuUsage = (TotalTime – idleTime) / TotalTime 以上計算的是整個系統的CPU使用率,對於指定程序的使用率是通過讀取該程序的“/proc/程序號/stat”檔案計算的,而對於指定程序的指定執行緒的使用率是通過讀取該執行緒的“/proc/程序號/task/執行緒號/stat”檔案計算的。程序和執行緒的CPU使用率只包含該程序或執行緒的總使用率、使用者態使用率和核心態使用率。 AMS在執行appNotResponding函式過程中,共輸出了兩個時間段的CPU使用率,通常情況下在ANR發生時間點之前和之後各有一段。兩段使用率都是兩次呼叫ProcessStats物件的update函式,每次呼叫update函式時會保留上一時間點的取樣資料,並記錄當前時間點的取樣資料。然後再呼叫ProcessStats物件的printCurrentState函式,按照上述公式歸納的演算法計算CPU使用率,並輸出最終的結果。 而對於記憶體相較於ANR的影響,由top命令觀察到的程序記憶體變化已然夠用。
三、Traces檔案是怎麼產生的
當我們拿到一個ANR的Traces檔案後,已經知道如何來分析ANR,不過假使能夠知道ANR檔案的來龍去脈如何產生如何儲存,必然更好:例如我現在就知道高版本上Traces檔案被改名了、如果作為系統開發者,可以優化日誌系統、當然最大的好處就是能夠對ANR理解深刻等等。 不得不再次提起這句話:ANR的本質是在當前焦點視窗主執行緒在做耗時操作,此時又產生了一個事件(按鍵或者Touch事件),此時發現上一次事件執行超時,開始走ANR邏輯。很顯然,要從事件這裡入手,Android系統的事件由硬體產生,驅動節點"dev/input",處理事件的系統服務是InputManagerService,IMS服務執行在SystemServer程序,由startOtherServices函式負責啟動,IMS的start函式中會呼叫jni函式nativeStart,在native層裡主要邏輯都是由InputManager類完成,這個類中會啟動兩個執行緒:
- InputReaderThread–負責開啟input事件驅動節點,接收硬體事件;
- InputDispatchThread–負責將InputReaderThread執行緒接收到的事件分發給具體的視窗
IMS會和WMS通訊以確定哪個window是當前焦點視窗便於傳遞事件,是用LocalSocket通訊 ,InputDispatchThread分發事件的過程中,會記錄事件的分發時間,如果發現超時,就開起ANR的流程,因此說IMS中是ANR的起點,InputDispatcher呼叫notifyANR(InputMonitor是IMS和WMS通訊的中介軟體),當InputReader從EventHub拿到事件後,從dispatchMotionLocked開始分發
```
/frameworks\native\services\inputflinger\InputDispatcher.cpp/
int32_t InputDispatcher::findTouchedWindowTargetsLocked(nsecs_t currentTime,
const MotionEntry entry, std::vector
CommandEntry* commandEntry = postCommandLocked(
& InputDispatcher::doNotifyANRLockedInterruptible/*通知其他地方發生了ANR*/);
commandEntry->inputApplicationHandle = applicationHandle;
commandEntry->inputChannel = windowHandle != nullptr ?
getInputChannelLocked(windowHandle->getToken()) : nullptr;
commandEntry->reason = reason;
} void InputDispatcher::doNotifyANRLockedInterruptible( CommandEntry commandEntry) { mLock.unlock(); /mPolicy是什麼? 它是jni一路傳遞下來的PhoneWindoManager物件,在WMS建立時設定,介面是IMS的 WindowManagerCallbacks提供,因此最終回撥的是PhoneWindowManager的notifyANR PhoneWindowManager是windowManagerPolicy介面的實現類 */ nsecs_t newTimeout = mPolicy->notifyANR( commandEntry->inputApplicationHandle, commandEntry->inputChannel ? commandEntry->inputChannel->getToken() : nullptr, commandEntry->reason); mLock.lock(); resumeAfterTargetsNotReadyTimeoutLocked(newTimeout, commandEntry->inputChannel); } ```
注意: mPolicy->notifyANR後(jni機制呼叫),流程走到IMS的notifyANR,
//com_android_server_input_InputManagerService.cpp
nsecs_t NativeInputManager::notifyANR(const sp<InputApplicationHandle>& inputApplicationHandle,
const sp<IBinder>& token, const std::string& reason) {
jlong newTimeout = env->CallLongMethod(mServiceObj,
gServiceClassInfo.notifyANR, tokenObj,
reasonObj);
if (checkAndClearExceptionFromCallback(env, "notifyANR")) {
newTimeout = 0; // abort dispatch
} else {
assert(newTimeout >= 0);
}
return newTimeout;
}
//InputManagerService.java
private long notifyANR(IBinder token, String reason) {
return mWindowManagerCallbacks.notifyANR(
token, reason);
}
//InputManagerCallback.java
@Override
public long notifyANR(IBinder token, String reason) {
//輸入事件log列印,不同情況列印不一樣。關鍵字Input event dispatching timed out
synchronized (mService.mGlobalLock) {
Slog.i(TAG_WM, "Input event dispatching timed out "
+ ...);
}
...
} else if (windowState != null) {
// Notify the activity manager about the timeout and let it decide whether
// to abort dispatching or keep waiting.
long timeout = mService.mAmInternal.inputDispatchingTimedOut(
windowState.mSession.mPid, aboveSystem, reason);
if (timeout >= 0) {
// The activity manager declined to abort dispatching.
// Wait a bit longer and timeout again later.
return timeout * 1000000L; // nanoseconds
}
}
return 0; // abort dispatching
}
//ActivityManagerService.java
@VisibleForTesting
public final class LocalService extends ActivityManagerInternal {
@Override
public long inputDispatchingTimedOut(int pid, boolean aboveSystem, String reason) {
synchronized (ActivityManagerService.this) {
return ActivityManagerService.this.inputDispatchingTimedOut(
pid, aboveSystem, reason);
}
}
}
long inputDispatchingTimedOut(int pid, final boolean aboveSystem, String reason) {
synchronized (this) {
timeout = proc != null ? proc.getInputDispatchingTimeout() : KEY_DISPATCHING_TIMEOUT_MS;
}
if (inputDispatchingTimedOut(proc, null, null, null, null, aboveSystem, reason)) {
return -1;
}
return timeout;
}
boolean inputDispatchingTimedOut(ProcessRecord proc, String activityShortComponentName,
ApplicationInfo aInfo, String parentShortComponentName,
WindowProcessController parentProcess, boolean aboveSystem, String reason) {
if (proc != null) {
proc.appNotResponding(activityShortComponentName, aInfo,
parentShortComponentName, parentProcess, aboveSystem, annotation);
}
return true;
}
流程很明確,最後呼叫到appNotResponding函式(frameworks/base/services/core/java/com/android/server/am/ProcessRecord.java),
``` void appNotResponding(String activityShortComponentName, ApplicationInfo aInfo, String parentShortComponentName, WindowProcessController parentProcess, boolean aboveSystem, String annotation) { mWindowProcessController.appEarlyNotResponding(annotation, () -> kill("anr", true)); long anrTime = SystemClock.uptimeMillis(); if (isMonitorCpuUsage()) { mService.updateCpuStatsNow(); } synchronized (mService) { // PowerManager.reboot() can block for a long time, so ignore ANRs while shutting down. if (mService.mAtmInternal.isShuttingDown()) { Slog.i(TAG, "During shutdown skipping ANR: " + this + " " + annotation); return; } else if (isNotResponding()) { Slog.i(TAG, "Skipping duplicate ANR: " + this + " " + annotation); return; } else if (isCrashing()) { Slog.i(TAG, "Crashing app skipping ANR: " + this + " " + annotation); return; } else if (killedByAm) { Slog.i(TAG, "App already killed by AM skipping ANR: " + this + " " + annotation); return; } else if (killed) { Slog.i(TAG, "Skipping died app ANR: " + this + " " + annotation); return; } // In case we come through here for the same app before completing // this one, mark as anring now so we will bail out. setNotResponding(true);
// Log the ANR to the event log.
EventLog.writeEvent(EventLogTags.AM_ANR, userId, pid, processName, info.flags,
annotation);
// Dump thread traces as quickly as we can, starting with "interesting" processes.
firstPids.add(pid);
// Don't dump other PIDs if it's a background ANR
if (!isSilentAnr()) {
int parentPid = pid;
if (parentProcess != null && parentProcess.getPid() > 0) {
parentPid = parentProcess.getPid();
}
if (parentPid != pid) firstPids.add(parentPid);
if (MY_PID != pid && MY_PID != parentPid) firstPids.add(MY_PID);
for (int i = getLruProcessList().size() - 1; i >= 0; i--) {
ProcessRecord r = getLruProcessList().get(i);
if (r != null && r.thread != null) {
int myPid = r.pid;
if (myPid > 0 && myPid != pid && myPid != parentPid && myPid != MY_PID) {
if (r.isPersistent()) {
firstPids.add(myPid);
if (DEBUG_ANR) Slog.i(TAG, "Adding persistent proc: " + r);
} else if (r.treatLikeActivity) {
firstPids.add(myPid);
if (DEBUG_ANR) Slog.i(TAG, "Adding likely IME: " + r);
} else {
lastPids.put(myPid, Boolean.TRUE);
if (DEBUG_ANR) Slog.i(TAG, "Adding ANR proc: " + r);
}
}
}
}
}
}
// Log the ANR to the main log.
StringBuilder info = new StringBuilder();
info.setLength(0);
info.append("ANR in ").append(processName);
if (activityShortComponentName != null) {
info.append(" (").append(activityShortComponentName).append(")");
}
info.append("\n");
info.append("PID: ").append(pid).append("\n");
if (annotation != null) {
info.append("Reason: ").append(annotation).append("\n");
}
if (parentShortComponentName != null
&& parentShortComponentName.equals(activityShortComponentName)) {
info.append("Parent: ").append(parentShortComponentName).append("\n");
}
ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);
ArrayList<Integer> nativePids = null;
// don't dump native PIDs for background ANRs unless it is the process of interest
String[] nativeProc = null;
if (isSilentAnr()) {
for (int i = 0; i < NATIVE_STACKS_OF_INTEREST.length; i++) {
if (NATIVE_STACKS_OF_INTEREST[i].equals(processName)) {
nativeProc = new String[] { processName };
break;
}
}
int[] pid = nativeProc == null ? null : Process.getPidsForCommands(nativeProc);
if(pid != null){
nativePids = new ArrayList<>(pid.length);
for (int i : pid) {
nativePids.add(i);
}
}
} else {
nativePids = Watchdog.getInstance().getInterestingNativePids();
}
//Step1、生成Traces檔案
File tracesFile = ActivityManagerService.dumpStackTraces(firstPids,
(isSilentAnr()) ? null : processCpuTracker, (isSilentAnr()) ? null : lastPids,
nativePids);
String cpuInfo = null;
if (isMonitorCpuUsage()) {
mService.updateCpuStatsNow();
synchronized (mService.mProcessCpuTracker) {
cpuInfo = mService.mProcessCpuTracker.printCurrentState(anrTime);
}
info.append(processCpuTracker.printCurrentLoad());
info.append(cpuInfo);
}
info.append(processCpuTracker.printCurrentState(anrTime));
Slog.e(TAG, info.toString());
if (tracesFile == null) {
// There is no trace file, so dump (only) the alleged culprit's threads to the log
Process.sendSignal(pid, Process.SIGNAL_QUIT);
}
StatsLog.write(StatsLog.ANR_OCCURRED, uid, processName,
activityShortComponentName == null ? "unknown": activityShortComponentName,
annotation,
(this.info != null) ? (this.info.isInstantApp()
? StatsLog.ANROCCURRED__IS_INSTANT_APP__TRUE
: StatsLog.ANROCCURRED__IS_INSTANT_APP__FALSE)
: StatsLog.ANROCCURRED__IS_INSTANT_APP__UNAVAILABLE,
isInterestingToUserLocked()
? StatsLog.ANROCCURRED__FOREGROUND_STATE__FOREGROUND
: StatsLog.ANROCCURRED__FOREGROUND_STATE__BACKGROUND,
getProcessClassEnum(),
(this.info != null) ? this.info.packageName : "");
final ProcessRecord parentPr = parentProcess != null
? (ProcessRecord) parentProcess.mOwner : null;
mService.addErrorToDropBox("anr", this, processName, activityShortComponentName,
parentShortComponentName, parentPr, annotation, cpuInfo, tracesFile, null);
if (mWindowProcessController.appNotResponding(info.toString(), () -> kill("anr", true),
() -> {
synchronized (mService) {
mService.mServices.scheduleServiceTimeoutLocked(this);
}
})) {
return;
}
synchronized (mService) {
// mBatteryStatsService can be null if the AMS is constructed with injector only. This
// will only happen in tests.
if (mService.mBatteryStatsService != null) {
mService.mBatteryStatsService.noteProcessAnr(processName, uid);
}
if (isSilentAnr() && !isDebugging()) {
kill("bg anr", true);
return;
}
// Set the app's notResponding state, and look up the errorReportReceiver
makeAppNotRespondingLocked(activityShortComponentName,
annotation != null ? "ANR " + annotation : "ANR", info.toString());
// mUiHandler can be null if the AMS is constructed with injector only. This will only
// happen in tests.
if (mService.mUiHandler != null) {
// Bring up the infamous App Not Responding dialog
Message msg = Message.obtain();
msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;
msg.obj = new AppNotRespondingDialog.Data(this, aInfo, aboveSystem);
mService.mUiHandler.sendMessage(msg);
}
}
} ```
ActivityManagerService.dumpStackTraces,再調回到AMS裡,由dumpStackTraces函式來建立並往檔案中寫入內容(data/anr/traces_**.txt),至此ANR分析完成。
以上就是Android開發中常會遇到的ANR介紹以及分析;Android開發中需要疑難核心知識點。需要深入的掌握;更多Android技術參考>可以> 前往
傳送直達↓↓↓ :link.juejin.cn/?target=htt…
文末
ANR 的全稱是 Application No Responding,即應用程式無響應,具體是一些特定的 Message (Key Dispatch、Broadcast、Service) 在應用的UI執行緒(主執行緒)沒有在規定的時間內處理完,進而觸發 ANR 異常。
ANR 由訊息處理機制保證,Android 在系統層實現了一套精密的機制來發現 ANR,核心原理是訊息排程和超時處理。ANR 機制主體實現在系統層,所有與 ANR 相關的訊息,都會經過系統程序 system_server排程,具體是 ActivityManagerService服務,然後派發到應用程序完成對訊息的實際處理,同時,系統程序設計了不同的超時限制來跟蹤訊息的處理。 一旦應用程式處理訊息不當,超時限制就起作用了,它收集一些系統狀態,譬如 CPU/IO 使用情況、程序函式呼叫棧 CallStack,(有些平臺比如 MTK,還會列印相應的 Message 出來供除錯分析),最後報告使用者有程序無響應了( ANR 對話方塊)。
ANR問題,相信是每位開發日常都會遇到的問題。如何更多問題可評論區或私信。
- Android效能優化——記憶體洩漏的根本原因
- Android ViewStub的使用方法——邊走邊看邊學
- Android進階——sdk開發和apk開發有什麼區別?
- Android開發——RXBinding防抖機制與案件分析
- Android效能啟動優化——IO優化進階
- Android適配【入坑指南 解決痛點】
- android 開發——疑難雜症ANR簡單介紹與解析
- Android外掛化框架—— Atlas
- 一名合格的音影片工程師,技能樹狀分佈是怎樣形成的?
- Android核心技術—核心(Linux) 的IO棧
- Android前沿技術—— Jetpack Compose
- Android開發資料結構與演算法——ArrayList原始碼講解
- Flutter中如何構建顯式動畫 【教學】
- Android記憶體抖動(主要原因分析 6個優化小技巧)
- Android車載多媒體開發——MediaSession框架
- 車機空調系統開發(HVAC),溫暖一整個冬天!
- 大廠為什麼在招聘安卓架構師時,為啥都需要熟悉 framework 經驗?
- 一個擴充套件性極強的 Flutter MVVM 實用框架,完善你的技術棧
- 2021年,跨端是否已成趨勢?Android 開發還有必要學 Flutter 嗎?
- Jetpack的MVVM通訊 - LiveData的原理分析