Learning ByteCodeDL
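ByteCodeDL is another static analysis tool for Java bytecode. It combines two tools, soot-fact-generator and Souffle, into a declarative static analyzer.
For the difference between declarative and imperative programming, see https://www.aqee.net/post/imperative-vs-declarative/
soot-fact-generator produces the facts, i.e. the data set, that Souffle consumes.
Souffle then runs queries over those facts using the rules we give it (files with the .dl suffix).
ByteCodeDL ships with a number of ready-made rules, such as call graph construction with CHA (class hierarchy analysis), PTA (pointer analysis), and P/Taint (taint analysis). You can also write your own .dl files for whatever analysis you need. The rule files live at https://github.com/BytecodeDL/ByteCodeDL/tree/main/logic
For the theory behind Datalog-based program analysis, read the slides by Yue Li and Tian Tan first: https://pascal-group.bitbucket.io/lectures/Datalog.pdf
This post only walks through the documentation to give readers a quick introduction to ByteCodeDL.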
Environment
Install Souffle, see https://souffle-lang.github.io/install
sudo wget https://souffle-lang.github.io/ppa/souffle-key.public -O /usr/share/keyrings/souffle-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/souffle-archive-keyring.gpg] https://souffle-lang.github.io/ppa/ubuntu/ stable main" | sudo tee /etc/apt/sources.list.d/souffle.list
sudo apt update
sudo apt install souffle
Then download the prebuilt soot-fact-generator.jar published by BytecodeDL.
souffle demo
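Take the official example: https://souffle-lang.github.io/simple
Start with an edge.facts file holding two edges, 1 -> 2 and 2 -> 3 (one pair per line),
and an example.dl like this: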
.decl edge(x:number, y:number)
.input edge

.decl path(x:number, y:number)
.output path

path(x, y) :- edge(x, y).
path(x, y) :- path(x, z), edge(z, y).
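The two .decl directives declare the relations edge and path. .input and .output mark them as the input and the output relation, so Souffle reads edge.facts from disk and writes the result set to path.csv.
path(x, y) :- edge(x, y).
means: if there is an edge x->y, then there is a path x->y.
path(x, y) :- path(x, z), edge(z, y).
means: if there is a path from x to z and an edge from z to y, then we can infer a path from x to y.
Let's run the query with Souffle and look at the result.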
ubuntu@ubuntu:~$ cat edge.facts
1 2
2 3
ubuntu@ubuntu:~$ souffle -F. -D. example.dl
ubuntu@ubuntu:~$ cat path.csv
1 2
2 3
1 3
It outputs three paths.
That is about the simplest demo possible. soot-fact-generator, in turn, is the tool that generates the facts.
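To make the two path rules a bit more concrete, here is a small Java sketch that applies them repeatedly until no new pair shows up. The class name PathClosure and the method paths are made up for this post, and the loop is a naive fixpoint iteration, which is only an illustration of what the rules mean and not a description of how Souffle evaluates them internally.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: a naive fixpoint over the two path rules.
class PathClosure {
    // Each pair (x, y) is stored as a two-element list so that equals/hashCode compare by value.
    static Set<List<Integer>> paths(Set<List<Integer>> edges) {
        // Rule 1: path(x, y) :- edge(x, y).
        Set<List<Integer>> path = new LinkedHashSet<>(edges);
        boolean changed = true;
        while (changed) {
            // Rule 2: path(x, y) :- path(x, z), edge(z, y).
            Set<List<Integer>> derived = new LinkedHashSet<>();
            for (List<Integer> p : path) {
                for (List<Integer> e : edges) {
                    if (p.get(1).equals(e.get(0))) {
                        derived.add(Arrays.asList(p.get(0), e.get(1)));
                    }
                }
            }
            // addAll returns true only if at least one new pair was added.
            changed = path.addAll(derived);
        }
        return path;
    }

    public static void main(String[] args) {
        Set<List<Integer>> edges = new LinkedHashSet<>();
        edges.add(Arrays.asList(1, 2));
        edges.add(Arrays.asList(2, 3));
        for (List<Integer> p : paths(edges)) {
            System.out.println(p.get(0) + "\t" + p.get(1));
        }
    }
}
```

With the two edges from edge.facts this derives the same three pairs as path.csv. The declarative version says nothing about the loop; Souffle works out the iteration on its own.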
soot-fact-generator
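The soot-fact-generator that ByteCodeDL provides comes from another static analysis framework, doop: https://bitbucket.org/yanniss/doop/src/master/generators/
doop itself uses Souffle for Java pointer and taint analysis, and it ships its own analysis rules: https://bitbucket.org/yanniss/doop/src/master/souffle-logic/
ByteCodeDL extracted doop's generator and uses doop's code to generate the facts,
then implements its functionality with rules of its own.
In this section we use https://github.com/BytecodeDL/Benchmark to generate a facts data set.
Download https://github.com/BytecodeDL/Benchmark and build it with mvn package.
Run soot-fact-generator: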
ubuntu@ubuntu:~$ java -jar soot-fact-generator.jar -i Benchmark-1.0-SNAPSHOT.jar -l /usr/lib/jvm/java-1.8.0-openjdk-amd64/jre/lib/rt.jar --generate-jimple --allow-phantom --full -d out
No logs directory set, using: out/logs
Logging initialized, using directory: out/logs
WARNING: 'file.encoding' property missing or not UTF8, please pass: -Dfile.encoding=UTF-8
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
WARNING: SSA not enabled, generating Jimple instead of Shimple
Preprocessing application: Benchmark-1.0-SNAPSHOT.jar
Preprocessing platform library: /usr/lib/jvm/java-1.8.0-openjdk-amd64/jre/lib/rt.jar
Adding archive: Benchmark-1.0-SNAPSHOT.jar
Adding archive for resolving: /usr/lib/jvm/java-1.8.0-openjdk-amd64/jre/lib/rt.jar
Classes in input (application) jar(s): 85
Total classes in Scene: 3695
Retrieved all bodies (time: 11)
Fact generation cores: 16
WARNING: some classes were not resolved, consider using thorough fact generation or adding them manually via --also-resolve: [sun.util.locale.provider.HostLocaleProviderAdapterImpl, java.lang.annotation.Inherited]
Found 74 phantom references. Rerun with '--report-phantoms' for more details.
Total classes (application, dependencies and SDK) to generate Jimple for: 3695
Soot: hierarchy_dirs set.
Methods without active bodies encountered (and reset): 0
The facts files are generated in the out directory:
ubuntu@ubuntu:~$ find out/*.facts
out/Activity.facts
out/ActualParam.facts
out/AndroidApplication.facts
out/AndroidCallbackMethodName.facts
out/AndroidEntryPoint.facts
out/AndroidId.facts
out/AndroidIncludeXML.facts
out/AnnotationElement.facts
out/ApplicationClass.facts
out/ApplicationPackage.facts
out/ArrayAllocationConstSize.facts
out/ArrayAllocation.facts
out/ArrayInitialValueFromConst.facts
out/ArrayInitialValueFromLocal.facts
out/ArrayInsnIndex.facts
out/ArrayNumIndex.facts
out/ArrayType.facts
out/AssignBinop.facts
out/AssignCast.facts
out/AssignCastNull.facts
out/AssignCastNumConstant.facts
out/AssignHeapAllocation.facts
out/AssignInstanceOf.facts
out/AssignLocal.facts
out/AssignNull.facts
out/AssignNumConstant.facts
out/AssignOperFromConstant.facts
out/AssignOperFrom.facts
out/AssignPhantomInvoke.facts
out/AssignReturnValue.facts
out/AssignUnop.facts
out/BootstrapParam.facts
out/BreakpointStmt.facts
out/BroadcastReceiver.facts
out/Class-Artifact.facts
out/ClassHeap.facts
out/ClassModifier.facts
out/ClassType.facts
out/ComponentType.facts
out/ContentProvider.facts
out/DexInstructionAddressMap.facts
out/DirectSuperclass.facts
out/DirectSuperinterface.facts
out/DummyIfVar.facts
out/DynamicMethodInvocation.facts
out/DynamicMethodInvocation-ParamType.facts
out/EmptyArray.facts
out/EnterMonitor.facts
out/ExceptionHandler.facts
out/ExceptionHandler-FormalParam.facts
out/ExceptionHandler-Previous.facts
out/ExitMonitor.facts
out/Field-Annotation.facts
out/Field.facts
out/FieldInitialValue.facts
out/Field-Modifier.facts
out/FormalParam.facts
out/GenericField.facts
out/GenericType-ErasedType.facts
out/GenericTypeParameters.facts
out/Goto.facts
out/IfConstant.facts
out/If.facts
out/IfVar.facts
out/InterfaceType.facts
out/LayoutControl.facts
out/LoadArrayIndex.facts
out/LoadInstanceField.facts
out/LoadStaticField.facts
out/LookupSwitch-Default.facts
out/LookupSwitch.facts
out/LookupSwitch-Target.facts
out/Method-Annotation.facts
out/Method-DeclaresException.facts
out/Method.facts
out/MethodHandleConstant.facts
out/MethodInvocation-Line.facts
out/Method-Modifier.facts
out/MethodTypeConstant.facts
out/MethodTypeConstantParam.facts
out/NativeLibEntryPoint.facts
out/NativeMethodId.facts
out/NativeMethodTypeCandidate.facts
out/NativeNameCandidate.facts
out/NativeReturnVar.facts
out/NativeXRef.facts
out/NormalHeap.facts
out/NumConstantRaw.facts
out/OperatorAt.facts
out/Param-Annotation.facts
out/PhantomBasedMethod.facts
out/PhantomMethod.facts
out/PhantomType.facts
out/PolymorphicInvocation.facts
out/Properties.facts
out/Return.facts
out/ReturnVoid.facts
out/SensitiveLayoutControl.facts
out/Service.facts
out/SpecialMethodInvocation.facts
out/StatementType.facts
out/StaticMethodInvocation.facts
out/StoreArrayIndex.facts
out/StoreInstanceField.facts
out/StoreStaticField.facts
out/StringConstant.facts
out/StringRaw.facts
out/SuperMethodInvocation.facts
out/TableSwitch-Default.facts
out/TableSwitch.facts
out/TableSwitch-Target.facts
out/ThisVar.facts
out/Throw.facts
out/ThrowNull.facts
out/Type-Annotation.facts
out/Type-SimpleName.facts
out/UnsupportedInstruction.facts
out/Var-DeclaringMethod.facts
out/Var-SimpleName.facts
out/Var-Type.facts
out/VirtualMethodInvocation.facts
out/XMLNodeAttribute.facts
out/XMLNodeData.facts
out/XMLNode.facts
Each facts file corresponds to a different relation. Take Method.facts:
ubuntu@ubuntu:~$ cat out/Method.facts|head -10
<java.net.ProxySelector: void <init>()> <init> java.net.ProxySelector void ()V 0
<java.lang.invoke.MethodHandleImpl$CountingWrapper: void <init>(java.lang.invoke.MethodHandle,java.lang.invoke.LambdaForm,java.util.function.Function,java.util.function.Function,int)> <init> java.lang.invoke.MethodHandle,java.lang.invoke.LambdaForm,java.util.function.Function,java.util.function.Function,int java.lang.invoke.MethodHandleImpl$CountingWrapper void (Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/LambdaForm;Ljava/util/function/Function;Ljava/util/function/Function;I)V 5
<sun.text.normalizer.UBiDiProps$IsAcceptable: void <init>(sun.text.normalizer.UBiDiProps)> <init> sun.text.normalizer.UBiDiProps sun.text.normalizer.UBiDiProps$IsAcceptable void (Lsun/text/normalizer/UBiDiProps;)V 1
<java.lang.UNIXProcess$Platform: java.lang.UNIXProcess$Platform[] values()> values java.lang.UNIXProcess$Platform java.lang.UNIXProcess$Platform[] ()[Ljava/lang/UNIXProcess$Platform; 0
<sun.invoke.util.VerifyAccess: void <init>()> <init> sun.invoke.util.VerifyAccess void ()V 0
<java.util.WeakHashMap$KeySpliterator: void <init>(java.util.WeakHashMap,int,int,int,int)> <init> java.util.WeakHashMap,int,int,int,int java.util.WeakHashMap$KeySpliterator void (Ljava/util/WeakHashMap;IIII)V 5
<java.util.stream.Tripwire: void <init>()> <init> java.util.stream.Tripwire void ()V 0
<java.util.BitSet: int wordIndex(int)> wordIndex int java.util.BitSet int (I)I 1
<sun.invoke.util.VerifyAccess: boolean isMemberAccessible(java.lang.Class,java.lang.Class,int,java.lang.Class,int)> isMemberAccessible java.lang.Class,java.lang.Class,int,java.lang.Class,int sun.invoke.util.VerifyAccess boolean (Ljava/lang/Class;Ljava/lang/Class;ILjava/lang/Class;I)Z 5
<java.net.ProxySelector: java.net.ProxySelector getDefault()> getDefault java.net.ProxySelector java.net.ProxySelector ()Ljava/net/ProxySelector; 0
Facts files use \t as the column separator by default. Pull out one row:
<sun.invoke.util.VerifyAccess: boolean isMemberAccessible(java.lang.Class,java.lang.Class,int,java.lang.Class,int)> isMemberAccessible java.lang.Class,java.lang.Class,int,java.lang.Class,int sun.invoke.util.VerifyAccess boolean (Ljava/lang/Class;Ljava/lang/Class;ILjava/lang/Class;I)Z 5
This row describes sun.invoke.util.VerifyAccess#isMemberAccessible, and each \t-separated column corresponds to a different attribute of the method.
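As a rough illustration of how one such row splits into columns, here is a Java sketch. The class MethodFact and its field names are my own labels, read off the sample row (signature, simple name, parameter types, declaring class, return type, descriptor, arity); they are an assumption and not an official schema.

```java
// Hypothetical holder for one row of Method.facts; field names are my reading of the sample row.
class MethodFact {
    static final String SAMPLE =
        "<sun.invoke.util.VerifyAccess: boolean isMemberAccessible(java.lang.Class,java.lang.Class,int,java.lang.Class,int)>"
        + "\tisMemberAccessible"
        + "\tjava.lang.Class,java.lang.Class,int,java.lang.Class,int"
        + "\tsun.invoke.util.VerifyAccess"
        + "\tboolean"
        + "\t(Ljava/lang/Class;Ljava/lang/Class;ILjava/lang/Class;I)Z"
        + "\t5";

    final String method, simpleName, params, declaringClass, returnType, descriptor;
    final int arity;

    private MethodFact(String[] c) {
        method = c[0];
        simpleName = c[1];
        params = c[2];
        declaringClass = c[3];
        returnType = c[4];
        descriptor = c[5];
        arity = Integer.parseInt(c[6]);
    }

    // Split on tabs; the -1 limit keeps empty columns (assumed to occur for methods without parameters).
    static MethodFact parse(String line) {
        String[] cols = line.split("\t", -1);
        if (cols.length != 7) {
            throw new IllegalArgumentException("expected 7 columns, got " + cols.length);
        }
        return new MethodFact(cols);
    }

    public static void main(String[] args) {
        MethodFact f = parse(SAMPLE);
        System.out.println(f.declaringClass + "#" + f.simpleName + " arity=" + f.arity);
    }
}
```

In practice you never parse these files by hand, since Souffle loads them through .input. The sketch is only meant to show what each column carries.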
Another example is MethodInvocation-Line.facts:
ubuntu@ubuntu:~$ cat out/MethodInvocation-Line.facts |head -10
<sun.invoke.util.VerifyAccess: void <init>()>/java.lang.Object.<init>/0 38
<java.net.ProxySelector: void <init>()>/java.lang.Object.<init>/0 60
<java.util.stream.Tripwire: void <init>()>/java.lang.Object.<init>/0 55
<java.util.WeakHashMap$KeySpliterator: void <init>(java.util.WeakHashMap,int,int,int,int)>/java.util.WeakHashMap$WeakHashMapSpliterator.<init>/0 1102
<sun.text.normalizer.UBiDiProps$IsAcceptable: void <init>(sun.text.normalizer.UBiDiProps)>/java.lang.Object.<init>/0 107
<java.lang.UNIXProcess$Platform: java.lang.UNIXProcess$Platform[] values()>/java.lang.Object.clone/0 81
<java.net.ProxySelector: java.net.ProxySelector getDefault()>/java.lang.System.getSecurityManager/0 92
<java.lang.invoke.MethodHandleImpl$CountingWrapper: void <init>(java.lang.invoke.MethodHandle,java.lang.invoke.LambdaForm,java.util.function.Function,java.util.function.Function,int)>/java.lang.invoke.MethodHandle.type/0 810
<java.lang.invoke.MethodHandleImpl$CountingWrapper: void <init>(java.lang.invoke.MethodHandle,java.lang.invoke.LambdaForm,java.util.function.Function,java.util.function.Function,int)>/java.lang.invoke.DelegatingMethodHandle.<init>/0 810
<java.net.ProxySelector: java.net.ProxySelector getDefault()>/java.lang.SecurityManager.checkPermission/0 94
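It records the source line number of each method call.
With these facts files we have what a static analysis needs as input.
Writing rules
soot-fact-generator.jar gives us all the facts we need.
Let's build up the static analysis step by step.
1 Implementing the class hierarchy
Start with the basic class hierarchy: we need a type hierarchy graph to find the subclasses or superclasses of a class, or to decide whether two classes are related by inheritance.
The generator has already produced the facts for this: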
ubuntu@ubuntu:~$ cat out/DirectSuperclass.facts |head -10
java.lang.UNIXProcess$Platform java.lang.Enum
sun.text.normalizer.UBiDiProps$IsAcceptable java.lang.Object
java.net.ProxySelector java.lang.Object
java.lang.invoke.MethodHandleImpl$CountingWrapper java.lang.invoke.DelegatingMethodHandle
java.util.WeakHashMap$KeySpliterator java.util.WeakHashMap$WeakHashMapSpliterator
java.util.BitSet java.lang.Object
sun.invoke.util.VerifyAccess java.lang.Object
java.util.stream.Tripwire java.lang.Object
sun.text.normalizer.ReplaceableString java.lang.Object
java.net.StandardSocketOptions java.lang.Object
This shows the concrete inheritance relations. extends corresponds to DirectSuperclass.facts, and implements corresponds to DirectSuperinterface.facts.
.type Class <: symbol

.decl DirectSuperclass(child:Class, parent:Class)
.input DirectSuperclass

.decl DirectSuperinterface(child:Class, parent:Class)
.input DirectSuperinterface
Compute the subclass relation recursively:
.type Class <: symbol

.decl ClassModifier(mod:symbol, class:Class)
.input ClassModifier

.decl ClassType(class:Class)
.input ClassType

.decl InterfaceType(interface:Class)
.input InterfaceType

.decl DirectSuperclass(child:Class, parent:Class)
.input DirectSuperclass

.decl DirectSuperinterface(child:Class, parent:Class)
.input DirectSuperinterface

.decl SubClass(subclass:Class, class:Class)
.output SubClass

SubClass(subclass, class) :- DirectSuperclass(subclass, class).
SubClass(subclass, class) :- DirectSuperinterface(subclass, class).
SubClass(subclass, class) :-
    (
        DirectSuperclass(subclass, tmp);
        DirectSuperinterface(subclass, tmp)
    ),
    SubClass(tmp, class).
Run it and look at the output:
ubuntu@ubuntu:~$ souffle -F out/ -D . example.dl ; cat SubClass.csv |head -n 10
java.lang.UNIXProcess$Platform java.lang.Enum
java.lang.UNIXProcess$Platform java.lang.Object
java.lang.UNIXProcess$Platform java.io.Serializable
java.lang.UNIXProcess$Platform java.lang.Comparable
java.lang.Enum java.lang.Object
java.lang.Enum java.io.Serializable
java.lang.Enum java.lang.Comparable
sun.text.normalizer.UBiDiProps$IsAcceptable java.lang.Object
sun.text.normalizer.UBiDiProps$IsAcceptable sun.text.normalizer.ICUBinary$Authenticate
java.net.ProxySelector java.lang.Object
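For comparison, the same transitive walk can be written imperatively. The Java sketch below merges DirectSuperclass and DirectSuperinterface into a single child-to-parents map and collects every ancestor of a class. The names SubClassClosure, sample and superTypes are hypothetical, and the sample map only holds the handful of JDK relations visible in the output above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: all transitive supertypes of a class, like the SubClass relation.
class SubClassClosure {
    // child -> direct parents (superclass and superinterfaces merged into one list)
    static Map<String, List<String>> sample() {
        Map<String, List<String>> direct = new HashMap<>();
        direct.put("java.lang.UNIXProcess$Platform", Arrays.asList("java.lang.Enum"));
        direct.put("java.lang.Enum",
            Arrays.asList("java.lang.Object", "java.io.Serializable", "java.lang.Comparable"));
        return direct;
    }

    static Set<String> superTypes(String cls, Map<String, List<String>> direct) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(cls);
        while (!work.isEmpty()) {
            String cur = work.pop();
            for (String parent : direct.getOrDefault(cur, Collections.<String>emptyList())) {
                // Only follow a parent the first time it is seen, so shared ancestors are visited once.
                if (result.add(parent)) {
                    work.push(parent);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String parent : superTypes("java.lang.UNIXProcess$Platform", sample())) {
            System.out.println("java.lang.UNIXProcess$Platform\t" + parent);
        }
    }
}
```

The Datalog version computes the relation for every class at once, while this sketch answers the question for a single class per call.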
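2 Implementing the method call graph
For static calls and special calls the target is already fixed at compile time, whereas a virtual call is resolved at run time from the actual type of the receiver object. Determining the receiver's run-time type is therefore the key to call graph construction.
With the CHA algorithm,
the run-time type of the receiver is assumed to be any non-abstract subclass of its declared type. So we need a Dispatch relation to perform method dispatch.
Here is the rule straight from the ByteCodeDL docs: https://github.com/BytecodeDL/ByteCodeDL/blob/main/docs/utils.md#method-dispatch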
Dispatch(simplename, descriptor, class, method) :-
    MethodInfo(method, simplename, _, class, _, descriptor, _),
    !MethodModifier("abstract", method).

Dispatch(simplename, descriptor, class, method) :-
    !MethodInfo(_, simplename, _, class, _, descriptor, _),
    DirectSuperclass(class, superclass),
    Dispatch(simplename, descriptor, superclass, method),
    !MethodModifier("abstract", method).
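The first Dispatch rule says: if class declares a method with a matching signature (same simple name and descriptor) that is not abstract, that method is the result. The second says: if the current class does not declare such a method, look in its superclass for a non-abstract method with the matching signature.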
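The two Dispatch rules amount to walking up the superclass chain. Here is a minimal Java sketch of that walk. Everything in it (DispatchSketch, the Animal/Dog/Puppy hierarchy, the method id format) is invented for illustration and is not ByteCodeDL code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the Dispatch relation over a toy class hierarchy.
class DispatchSketch {
    final Map<String, String> superclass = new HashMap<>();  // class -> direct superclass
    final Map<String, String> declared = new HashMap<>();    // class|name|descriptor -> method id
    final Set<String> abstractMethods = new HashSet<>();     // methods with the "abstract" modifier

    static String key(String cls, String name, String desc) {
        return cls + "|" + name + "|" + desc;
    }

    void declare(String cls, String name, String desc, boolean isAbstract) {
        String method = "<" + cls + ": " + name + desc + ">";
        declared.put(key(cls, name, desc), method);
        if (isAbstract) {
            abstractMethods.add(method);
        }
    }

    // Returns the method a call resolves to for receiver class cls, or null if there is none.
    String dispatch(String name, String desc, String cls) {
        for (String cur = cls; cur != null; cur = superclass.get(cur)) {
            String m = declared.get(key(cur, name, desc));
            if (m != null) {
                // Rule 1 applies only to non-abstract methods. Rule 2 requires that the class
                // does not declare the method at all, so an abstract declaration ends the search.
                return abstractMethods.contains(m) ? null : m;
            }
        }
        return null;
    }

    static DispatchSketch example() {
        DispatchSketch d = new DispatchSketch();
        d.superclass.put("Dog", "Animal");
        d.superclass.put("Puppy", "Dog");
        d.declare("Animal", "speak", "()V", true);   // abstract in Animal
        d.declare("Dog", "speak", "()V", false);     // concrete in Dog, inherited by Puppy
        return d;
    }

    public static void main(String[] args) {
        DispatchSketch d = example();
        System.out.println(d.dispatch("speak", "()V", "Puppy"));
        System.out.println(d.dispatch("speak", "()V", "Animal"));
    }
}
```

Puppy does not declare speak itself, so the lookup falls through to Dog, while Animal yields nothing because its only declaration is abstract.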
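With Dispatch in place we can build the CHA call graph. For the code, see https://github.com/BytecodeDL/ByteCodeDL/blob/main/logic/cha.dl
There is also the RTA algorithm, which I won't cover here. ByteCodeDL implements it as well, see the docs.
3 Using CHA in practice
Different tasks call for finding specific classes, and this is where the CHA call graph comes in handy.
The official docs use ezchain from HFCTF 2022 to show CHA in practice. The challenge gives you a getter call, blocks the known chains, and leaves it to you to find a getter that leads to RCE.
That leads to the following code,
where SinkDesc declares the sinks we are after: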
#define MAXSTEP 5
#define CHAO 2

#include "../logic/cha.dl"

.decl NonParamPublicMethod(method:Method, class:Class)
.output NonParamPublicMethod

SinkDesc("exec", "java.lang.Runtime").
SinkDesc("<init>", "java.lang.ProcessBuilder").
SinkDesc("start", "java.lang.ProcessImpl").
SinkDesc("loadClass", "java.lang.ClassLoader").
SinkDesc("defineClass", "java.lang.ClassLoader").
SinkDesc("readObject", "java.io.ObjectInputStream").
SinkDesc("readExternal", "java.io.ObjectInputStream").

EntryMethod(method),
Reachable(method, 0),
NonParamPublicMethod(method, class) :-
    MethodInfo(method, simplename, _, class, _, _, arity),
    MethodModifier("public", method),
    contains("get", simplename),
    arity = 0.

.output SinkMethod
This finds a method whose entry is <java.security.SignedObject: java.lang.Object getObject()>.
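The entry rule in the query boils down to a simple filter over the method facts: public, zero parameters, and "get" somewhere in the simple name. A tiny Java sketch of that predicate follows. GetterEntryFilter and isEntry are made-up names, and I am assuming that Souffle's contains is a plain substring test, so it is worth checking against the Souffle docs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical restatement of the entry-method condition from the query.
class GetterEntryFilter {
    static boolean isEntry(String simpleName, int arity, Set<String> modifiers) {
        return modifiers.contains("public")      // MethodModifier("public", method)
            && simpleName.contains("get")        // contains("get", simplename), assumed to be a substring test
            && arity == 0;                       // arity = 0
    }

    public static void main(String[] args) {
        Set<String> pub = new HashSet<>(Arrays.asList("public"));
        System.out.println(isEntry("getObject", 0, pub));
        System.out.println(isEntry("target", 0, pub));
    }
}
```

Under that assumption a method like target() also passes, because "get" appears inside the name. If only real getters are wanted, a prefix check would be tighter, at the cost of possibly missing oddly named entry points.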
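The result can be imported into neo4j for visualization.
bash importOutput2Neo4j.sh neoImportCall.sh dbname
I won't demo that here.
A side note: compared with tabby, ByteCodeDL is faster because using Souffle lets it cut down the input, and more precise because it uses pointer analysis.
The drawbacks are just as obvious: the syntax is much nastier, writing custom rules will cost you all your hair, the documentation is thin, the rule library is incomplete, and the barrier to entry is far higher than tabby's.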
4 PTA/PTaint analysis
I can't explain the algorithms well myself, so just read the ByteCodeDL docs.
A simple pointer analysis example:
#include "inputDeclaration.dl"
#include "utils.dl"
#include "pt-noctx.dl"

// instantiate the component
.init cipt = ContextInsensitivePt

// seed Reachable with the entry method
cipt.Reachable(method) :-
    MethodInfo(method, simplename, _, _, _, descriptor, _),
    simplename = "main",
    descriptor = "([Ljava/lang/String;)V".

.output cipt.VarPointsTo
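This marks every method named main with the descriptor ([Ljava/lang/String;)V as reachable and outputs the points-to result set propagated from there.
Here is part of the result set: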
<com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/new com.bytecodedl.benchmark.demo.TaintDemo1/0 <com.bytecodedl.benchmark.demo.TaintDemo1: void <init>()>/@this
<com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/new com.bytecodedl.benchmark.demo.TaintDemo1/0 <com.bytecodedl.benchmark.demo.TaintDemo1: void <init>()>/this#_0
<com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/new com.bytecodedl.benchmark.demo.TaintDemo1/0 <com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/demo#_8
<com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/new com.bytecodedl.benchmark.demo.TaintDemo1/0 <com.bytecodedl.benchmark.demo.TaintDemo1: void test1(java.lang.String)>/@this
<com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/new com.bytecodedl.benchmark.demo.TaintDemo1/0 <com.bytecodedl.benchmark.demo.TaintDemo1: void test1(java.lang.String)>/this#_0
<com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/new com.bytecodedl.benchmark.demo.TaintDemo1/0 <com.bytecodedl.benchmark.demo.TaintDemo1: void Sink(java.lang.String)>/@this
<com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/new com.bytecodedl.benchmark.demo.TaintDemo1/0 <com.bytecodedl.benchmark.demo.TaintDemo1: void Sink(java.lang.String)>/this#_0
<com.bytecodedl.benchmark.demo.TaintDemo1: void Sink(java.lang.String)>/new java.lang.StringBuilder/0 <com.bytecodedl.benchmark.demo.TaintDemo1: void Sink(java.lang.String)>/builder#_19
<com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/new com.bytecodedl.benchmark.demo.TaintDemo1/0 <com.bytecodedl.benchmark.demo.TaintDemo1: java.lang.String Source()>/@this
<com.bytecodedl.benchmark.demo.TaintDemo1: void main(java.lang.String[])>/new com.bytecodedl.benchmark.demo.TaintDemo1/0 <com.bytecodedl.benchmark.demo.TaintDemo1: java.lang.String Source()>/this#_0
The corresponding code is:
package com.bytecodedl.benchmark.demo;

public class TaintDemo1 {
    public static void main(String[] args) {
        TaintDemo1 demo = new TaintDemo1();
        String name = demo.Source();
        demo.test1(name);
    }

    public void test1(String name){
        String sql = "select * from user where name='" + name + "'";
        Sink(sql);
    }

    public void Sink(String param){
        StringBuilder builder = new StringBuilder();
        builder.append(param);
    }

    public String Source(){
        return "tainted name";
    }
}
Looks right.
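To get a feel for why the TaintDemo1 object shows up under so many variables, here is a heavily simplified Java sketch of context-insensitive points-to propagation. It only models allocations and copy assignments (with receiver passing treated as a copy into @this), the variable names are shortened by hand, and none of it is taken from ByteCodeDL's pt-noctx.dl, so treat it as an assumption-laden toy and not the real algorithm.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy solver: propagates heap objects along copy assignments until nothing changes.
class PointsToSketch {
    // allocs: {heap, var} pairs; assigns: {from, to} pairs. Returns var -> heap objects.
    static Map<String, Set<String>> solve(List<String[]> allocs, List<String[]> assigns) {
        Map<String, Set<String>> pts = new HashMap<>();
        Deque<String> worklist = new ArrayDeque<>();
        for (String[] a : allocs) {
            // var = new T(): var points to the allocation site
            if (pts.computeIfAbsent(a[1], k -> new LinkedHashSet<>()).add(a[0])) {
                worklist.add(a[1]);
            }
        }
        while (!worklist.isEmpty()) {
            String from = worklist.poll();
            for (String[] as : assigns) {
                if (!as[0].equals(from)) {
                    continue;
                }
                // to = from: everything from points to, to points to as well
                Set<String> target = pts.computeIfAbsent(as[1], k -> new LinkedHashSet<>());
                if (target.addAll(pts.get(from))) {
                    worklist.add(as[1]);
                }
            }
        }
        return pts;
    }

    public static void main(String[] args) {
        List<String[]> allocs = new ArrayList<>();
        allocs.add(new String[]{"main/new TaintDemo1/0", "main/demo"});
        List<String[]> assigns = new ArrayList<>();
        assigns.add(new String[]{"main/demo", "Source/@this"});  // demo.Source()
        assigns.add(new String[]{"main/demo", "test1/@this"});   // demo.test1(name)
        assigns.add(new String[]{"test1/@this", "Sink/@this"});  // this.Sink(sql)
        for (Map.Entry<String, Set<String>> e : solve(allocs, assigns).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The single object allocated in main flows into the @this variable of every instance method called on it, which matches the shape of the result set above.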