Java Exception Handling in Android

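How does the Java virtual machine (JVM) handle exceptions on Android?

While main is running, an exception can be dealt with in one of two ways:

  1. A try-catch block catches the exception (expected or not), handles it, and lets execution continue, so the program does not crash
  2. Not every exception can be anticipated. An exception that nobody catches keeps propagating up the call stack. Once it escapes Thread.run() or the main thread's entry point, application code can no longer catch it, and the JVM has to deal with it

The JVM has a default mechanism for this case: it reports the uncaught exception, prints the exception information, and stops the program. This is what we see as a crash.

Java's Thread class declares an UncaughtExceptionHandler interface. When a thread terminates abruptly because of an uncaught exception, the runtime invokes this handler to process the exception: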

// The only callback of the UncaughtExceptionHandler interface
void uncaughtException(Thread t, Throwable e);
// Sets the exception handler of a single thread
Thread.setUncaughtExceptionHandler
// Sets the default exception handler for all threads
Thread.setDefaultUncaughtExceptionHandler
// Sets the exception pre-handler for all threads
Thread.setUncaughtExceptionPreHandler
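
To make the two public setters concrete, here is a minimal sketch in plain Java. The class name HandlerApiDemo and its helper are made up for this illustration. It starts two threads that both throw: one has no handler of its own and falls back to the default handler, the other has a per-thread handler, which takes precedence.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical demo class; not part of the Android sources. */
final class HandlerApiDemo {

    /** Starts the thread and waits until it has terminated. */
    private static void runAndJoin(Thread thread) {
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns which handler was invoked for each thread, in order. */
    static List<String> run() {
        List<String> calls = Collections.synchronizedList(new ArrayList<>());
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        try {
            // Fallback for every thread that has no handler of its own.
            Thread.setDefaultUncaughtExceptionHandler(
                    (t, e) -> calls.add("default:" + t.getName()));

            Thread plain = new Thread(() -> { throw new IllegalStateException("boom"); }, "plain");
            runAndJoin(plain);

            Thread custom = new Thread(() -> { throw new IllegalStateException("boom"); }, "custom");
            // A per-thread handler takes precedence over the default handler.
            custom.setUncaughtExceptionHandler((t, e) -> calls.add("thread:" + t.getName()));
            runAndJoin(custom);
        } finally {
            // Restore whatever was installed before the demo ran.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The pre-handler setter is left out on purpose: it is a hidden, Android-only API, so it cannot be called from ordinary Java code.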

When the JVM encounters an exception that a thread did not catch, it dispatches the exception to that thread through Thread's dispatchUncaughtException(e) method:

// art/runtime/thread.cc

void Thread::HandleUncaughtExceptions(ScopedObjectAccessAlreadyRunnable& soa) {
  ...
  // Call the Thread instance's dispatchUncaughtException(Throwable)
  tlsPtr_.jni_env->CallVoidMethod(peer.get(),
                                  WellKnownClasses::java_lang_Thread_dispatchUncaughtException,
                                  exception.get());
  ...
}

java_lang_Thread_dispatchUncaughtException is the cached method ID of Thread's dispatchUncaughtException method:

//art/runtime/well_known_classes.cc
java_lang_Thread_dispatchUncaughtException = CacheMethod(env, java_lang_Thread, false, "dispatchUncaughtException", "(Ljava/lang/Throwable;)V");

Thread's dispatchUncaughtException method looks like this:

public final void dispatchUncaughtException(Throwable e) {
    Thread.UncaughtExceptionHandler initialUeh =
            Thread.getUncaughtExceptionPreHandler();
    if (initialUeh != null) {
        try {
            initialUeh.uncaughtException(this, e);
        } catch (RuntimeException | Error ignored) {
            // Throwables thrown by the initial handler are ignored
        }
    }
    getUncaughtExceptionHandler().uncaughtException(this, e);
}

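Two UncaughtExceptionHandler instances take part here: the pre-handler and the regular handler. The core of the dispatch is simply calling the uncaughtException method that each of them implements.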
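
The two-step dispatch can be tried out in plain Java with a small stand-in. This is only a sketch under stated assumptions: the class DispatchSketch and its static preHandler field are hypothetical and merely model the hidden pre-handler slot, which ordinary code cannot reach. It shows that a pre-handler that itself throws does not stop the regular handler from running.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the dispatch logic; not the real Thread class. */
final class DispatchSketch {
    // Assumption: models the hidden, process-wide pre-handler slot.
    static Thread.UncaughtExceptionHandler preHandler;

    /** Runs the pre-handler first, then the thread's regular handler. */
    static void dispatch(Thread thread, Throwable error) {
        Thread.UncaughtExceptionHandler initial = preHandler;
        if (initial != null) {
            try {
                initial.uncaughtException(thread, error);
            } catch (RuntimeException | Error ignored) {
                // A failing pre-handler must not block the regular handler.
            }
        }
        thread.getUncaughtExceptionHandler().uncaughtException(thread, error);
    }

    /** Returns the order in which the two handlers ran. */
    static List<String> run() {
        List<String> calls = new ArrayList<>();
        try {
            preHandler = (t, e) -> {
                calls.add("pre");
                throw new IllegalStateException("swallowed by dispatch");
            };
            Thread thread = new Thread(() -> { });
            thread.setUncaughtExceptionHandler((t, e) -> calls.add("handler:" + e.getMessage()));
            dispatch(thread, new RuntimeException("boom"));
        } finally {
            preHandler = null;
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```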

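Android provides default implementations of both. Application processes on Android are forked from the Zygote process, and when a process is started through Zygote, the zygoteInit method calls RuntimeInit.commonInit: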

// frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
/**
 * The main function called when started through the zygote process...
 */
public static final Runnable zygoteInit(int targetSdkVersion, String[] argv, ClassLoader classLoader) {
    // ...
    RuntimeInit.commonInit();
    ZygoteInit.nativeZygoteInit();
    return RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
}

// frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
protected static final void commonInit() {
    ...
    /*
     * set handlers; these apply to all threads in the VM. Apps can replace
     * the default handler, but not the pre handler.
     */
    LoggingHandler loggingHandler = new LoggingHandler();
    Thread.setUncaughtExceptionPreHandler(loggingHandler);
    Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));
    ...
}

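commonInit instantiates two objects, LoggingHandler and KillApplicationHandler, both of which implement the Thread.UncaughtExceptionHandler interface:

  1. LoggingHandler prints the exception information, including the process name, pid, and Java stack trace
  • In the system process, the log starts with "*** FATAL EXCEPTION IN SYSTEM PROCESS: "
  • In an application process, the log starts with "FATAL EXCEPTION: "
  2. KillApplicationHandler checks that the log has been printed, notifies AMS that the application crashed, and kills the current process.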
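
The division of labour between the two handlers can be sketched in plain Java. Everything here is a simplified assumption rather than the framework code: the class names ending in "Sketch" are invented, the log line is shortened, and process termination is replaced by an injectable Runnable so the example can run anywhere.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the logging/kill split; these are not AOSP classes. */
final class CrashHandlersSketch {

    /** Logs the crash and remembers that it did so. */
    static final class LoggingSketch implements Thread.UncaughtExceptionHandler {
        final List<String> log = new ArrayList<>();
        volatile boolean triggered;

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            triggered = true;
            log.add("FATAL EXCEPTION: " + t.getName() + ": " + e);
        }
    }

    /** Makes sure the crash was logged, then always terminates. */
    static final class KillSketch implements Thread.UncaughtExceptionHandler {
        private final LoggingSketch logging;
        private final Runnable terminate;

        KillSketch(LoggingSketch logging, Runnable terminate) {
            this.logging = logging;
            this.terminate = terminate;
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                if (!logging.triggered) {
                    // The pre-handler did not run for some reason; log now.
                    logging.uncaughtException(t, e);
                }
                // A real handler would report the crash to the system here.
            } finally {
                terminate.run(); // stand-in for killing the process
            }
        }
    }

    /** Runs both steps in dispatch order and returns the recorded events. */
    static List<String> run() {
        LoggingSketch logging = new LoggingSketch();
        List<String> events = logging.log;
        KillSketch kill = new KillSketch(logging, () -> events.add("terminated"));
        Thread thread = new Thread(() -> { }, "worker");
        Throwable error = new IllegalStateException("boom");
        logging.uncaughtException(thread, error); // step 1: pre-handler
        kill.uncaughtException(thread, error);    // step 2: default handler
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Putting the termination step in a finally block reflects the idea that the process should go away even if logging or reporting fails.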

Note 1:

  • Android N and earlier have a single handler, UncaughtHandler
  • Android O and later split it into LoggingHandler and KillApplicationHandler. The uncaughtException callback of KillApplicationHandler is implemented as follows:
public void uncaughtException(Thread t, Throwable e) {
    try {
        ensureLogging(t, e);
        ...
        // Bring up crash dialog, wait for it to be dismissed
        ActivityManager.getService().handleApplicationCrash(
                mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
    } catch (Throwable t2) {
        ...
    } finally {
        // Try everything to make sure this process goes away.
        Process.killProcess(Process.myPid());
        System.exit(10);
    }
}

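Note 2:

  • Thread.setDefaultUncaughtExceptionHandler is a public API. An application can install its own UncaughtExceptionHandler to replace KillApplicationHandler, and so handle the exception with custom logic instead of the default crash behavior

  • Thread.setUncaughtExceptionPreHandler is a hidden API. Applications cannot call it directly, which guarantees that the exception log is still printed when an exception occurs. See the commit message in the change history of Thread.java: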

Add a new @hide API to set an additional UncaughtExceptionHandler that is called before dispatching to the regular handler. The framework uses this to enforce logging.

On Android O and later, an uncaught exception in any thread is first handled by the pre-handler obtained from getUncaughtExceptionPreHandler, and then by the thread's own handler obtained from getUncaughtExceptionHandler.

Thread's getUncaughtExceptionHandler method:

public UncaughtExceptionHandler getUncaughtExceptionHandler() {
    return uncaughtExceptionHandler != null ?
            uncaughtExceptionHandler : group;
}

If the thread has no handler of its own, the ThreadGroup it belongs to handles the exception (a ThreadGroup is a collection of threads and itself implements the UncaughtExceptionHandler interface):

public void uncaughtException(Thread t, Throwable e) {
    if (parent != null) {
        parent.uncaughtException(t, e);
    } else {
        Thread.UncaughtExceptionHandler ueh = Thread.getDefaultUncaughtExceptionHandler();
        if (ueh != null) {
            ueh.uncaughtException(t, e);
        } else if (!(e instanceof ThreadDeath)) {
            System.err.print("Exception in thread \"" + t.getName() + "\" ");
            e.printStackTrace(System.err);
        }
    }
}

ThreadGroup's uncaughtException callback obtains the default handler through getDefaultUncaughtExceptionHandler and lets it perform the final handling.
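
The fallback from thread to thread group to default handler can be observed with standard Java APIs. The sketch below uses a made-up class name, GroupFallbackDemo, and an anonymous ThreadGroup subclass that records the call before delegating upward with super.uncaughtException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical demo class for the ThreadGroup fallback path. */
final class GroupFallbackDemo {

    /** Returns the observed steps of the fallback chain, in order. */
    static List<String> run() {
        List<String> calls = Collections.synchronizedList(new ArrayList<>());
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        try {
            Thread.setDefaultUncaughtExceptionHandler((t, e) -> calls.add("default"));

            // A group that records the exception, then delegates to its parent.
            ThreadGroup group = new ThreadGroup("demo-group") {
                @Override
                public void uncaughtException(Thread t, Throwable e) {
                    calls.add("group");
                    super.uncaughtException(t, e);
                }
            };
            Thread worker = new Thread(group, () -> { throw new IllegalStateException("boom"); });

            // With no per-thread handler set, the thread reports its group as handler.
            calls.add("handlerIsGroup=" + (worker.getUncaughtExceptionHandler() == group));

            worker.start();
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```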

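To summarize, when the JVM encounters an uncaught exception:

  1. The pre-handler shared by all threads handles it first
  2. After that, the exception is passed to the current thread's own handler
  3. If the current thread has no handler set, the exception goes to the thread's ThreadGroup
  4. The thread group delegates to its parent group, and so on up the hierarchy
  5. Finally, the root thread group fetches the default handler shared by all threads and lets it handle the exception

The figure below summarizes this flow: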

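Note:

On Android, if we merely override the default handler with setDefaultUncaughtExceptionHandler and collect crash information in the callback, we must remember to kill the current process afterwards (let it die quickly):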

Process.killProcess(Process.myPid());
System.exit(10);

Otherwise the application hangs: it can no longer respond to UI input and is left in a state worse than a crash.
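
A commonly used alternative to calling the kill methods yourself is to remember the handler that was installed before yours and delegate to it once the crash data is collected. This is a different technique from the one described above, sketched here under assumptions: ChainingCrashHandler is a made-up name, and the demo passes in a fake previous handler so it can run on any JVM.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical handler that collects crash data and then delegates. */
final class ChainingCrashHandler implements Thread.UncaughtExceptionHandler {
    private final Thread.UncaughtExceptionHandler previous;
    private final List<String> reports;

    ChainingCrashHandler(Thread.UncaughtExceptionHandler previous, List<String> reports) {
        this.previous = previous;
        this.reports = reports;
    }

    /** Installs the handler as default, keeping the old default for delegation. */
    static ChainingCrashHandler install(List<String> reports) {
        ChainingCrashHandler handler =
                new ChainingCrashHandler(Thread.getDefaultUncaughtExceptionHandler(), reports);
        Thread.setDefaultUncaughtExceptionHandler(handler);
        return handler;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            reports.add(t.getName() + ": " + e); // collect crash information
        } finally {
            if (previous != null) {
                // Let the previously installed handler finish the job.
                previous.uncaughtException(t, e);
            }
        }
    }

    /** Exercises the handler directly with a fake previous handler. */
    static List<String> run() {
        List<String> events = new ArrayList<>();
        Thread.UncaughtExceptionHandler fakePrevious =
                (t, e) -> events.add("previous handler called");
        ChainingCrashHandler handler = new ChainingCrashHandler(fakePrevious, events);
        handler.uncaughtException(new Thread(() -> { }, "main"), new IllegalStateException("boom"));
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If the previous handler is null, nothing terminates the process in this sketch, so on Android an explicit kill would still be needed in that branch.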

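How the JVM responds to a Java crash

From the analysis above, we know that when a crash happens, the JVM passes the exception from the native layer to the Java layer through Thread::HandleUncaughtExceptions, where it is dispatched level by level.

So where is HandleUncaughtExceptions called? Searching the whole Android source tree turns up only one call site: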

void Thread::Destroy() {
  Thread* self = this;
  DCHECK_EQ(self, Thread::Current());
  ...
  if (tlsPtr_.opeer != nullptr) {
    ScopedObjectAccess soa(self);
    // We may need to call user-supplied managed code, do this before final clean-up.
    HandleUncaughtExceptions(soa);
    RemoveFromThreadGroup(soa);
  }
}

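And when is Destroy() called?

A search of the source finds the callers, as shown in the figure below:

That is not very intuitive, though. Here is a native stack trace printed after a crash (this kind of trace should look familiar):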

#23  pc 0000000000389c19  /system/lib/libart.so (art::Thread::HandleUncaughtExceptions(art::ScopedObjectAccessAlreadyRunnable&)+280)
#24 pc 0000000000389275 /system/lib/libart.so (art::Thread::Destroy()+1128)
#25 pc 00000000003982b1 /system/lib/libart.so (art::ThreadList::Unregister(art::Thread*)+104)
#26 pc 000000000037f209 /system/lib/libart.so (art::Thread::CreateCallback(void*)+1612)
#27 pc 0000000000048811 /system/lib/libc.so (__pthread_start(void*)+24)
#28 pc 000000000001b369 /system/lib/libc.so (__start_thread+32)

art::Thread::CreateCallback is defined as follows:

void* Thread::CreateCallback(void* arg) {
  Thread* self = reinterpret_cast<Thread*>(arg);
  Runtime* runtime = Runtime::Current();
  if (runtime == nullptr) {
    LOG(ERROR) << "Thread attaching to non-existent runtime: " << *self;
    return nullptr;
  }
  //...
  {
    ScopedObjectAccess soa(self);
    self->InitStringEntryPoints();
    //...
    runtime->GetRuntimeCallbacks()->ThreadStart(self);

    // Invoke the 'run' method of our java.lang.Thread.
    ObjPtr<mirror::Object> receiver = self->tlsPtr_.opeer;
    jmethodID mid = WellKnownClasses::java_lang_Thread_run;
    ScopedLocalRef<jobject> ref(soa.Env(), soa.AddLocalReference<jobject>(receiver));
    InvokeVirtualOrInterfaceWithJValues(soa, ref.get(), mid, nullptr);
  }
  // Detach and delete self.
  Runtime::Current()->GetThreadList()->Unregister(self);

  return nullptr;
}

When the Thread class on Android starts a thread through start(), it enters the JNI layer through the native function nativeCreate, which performs the actual thread creation:

public synchronized void start() {
    ...
    try {
        // Android-changed: Use Android specific nativeCreate() method to create/start thread.
        // start0();
        nativeCreate(this, stackSize, daemon);
        started = true;
    } finally {
        ...
    }
}

nativeCreate then calls art::Thread::CreateNativeThread:

// art/runtime/native/java_lang_Thread.cc
static void Thread_nativeCreate(JNIEnv* env, jclass, jobject java_thread, jlong stack_size,
                                jboolean daemon) {
  // There are sections in the zygote that forbid thread creation.
  Runtime* runtime = Runtime::Current();
  if (runtime->IsZygote() && runtime->IsZygoteNoThreadSection()) {
    jclass internal_error = env->FindClass("java/lang/InternalError");
    CHECK(internal_error != nullptr);
    env->ThrowNew(internal_error, "Cannot create threads in zygote");
    return;
  }

  Thread::CreateNativeThread(env, java_thread, stack_size, daemon == JNI_TRUE);
}

In art::Thread::CreateNativeThread, the JVM creates the native thread with pthread_create and passes CreateCallback as its start routine:

void Thread::CreateNativeThread(JNIEnv* env, jobject java_peer, size_t stack_size, bool is_daemon) {
  ...
  pthread_create_result = pthread_create(&new_pthread,
                                         &attr,
                                         Thread::CreateCallback,
                                         child_thread);
  ...
}
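
Seen from the Java side, the visible effect of this native path is that start() runs run() on a newly created thread, while calling run() directly is just an ordinary method call on the caller's thread. A minimal check, using a made-up class name StartVersusRun:

```java
/** Hypothetical demo class contrasting start() with a direct run() call. */
final class StartVersusRun {

    /** Returns {run() stayed on the caller, start() used the new thread}. */
    static boolean[] run() {
        Thread caller = Thread.currentThread();
        Thread[] executedOn = new Thread[2];

        Thread direct = new Thread(() -> executedOn[0] = Thread.currentThread());
        direct.run();    // ordinary method call: the body runs on the caller's thread

        Thread started = new Thread(() -> executedOn[1] = Thread.currentThread());
        started.start(); // creates a new thread of execution, which then calls run()
        try {
            started.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] { executedOn[0] == caller, executedOn[1] == started };
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println(result[0] + " " + result[1]);
    }
}
```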

Inside CreateCallback, a block enclosed in {} calls the run method of the Java-level Thread, which enters the thread's loop on the Java side:

void* Thread::CreateCallback(void* arg) {
  ...
  {
    ...
    // Invoke the 'run' method of our java.lang.Thread.
    ObjPtr<mirror::Object> receiver = self->tlsPtr_.opeer;
    jmethodID mid = WellKnownClasses::java_lang_Thread_run;
    ScopedLocalRef<jobject> ref(soa.Env(), soa.AddLocalReference<jobject>(receiver));
    InvokeVirtualOrInterfaceWithJValues(soa, ref.get(), mid, nullptr);
  }
  ...
}

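Under normal circumstances, for the main thread this run method enters an endless loop, namely Looper.loop() in the main function of ActivityThread.

Once an uncaught exception occurs on the main thread, execution leaves the loop and therefore this block. art::ThreadList::Unregister is called next, which calls art::Thread::Destroy, and finally HandleUncaughtExceptions dispatches the exception. This matches the stack trace shown above.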
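
The order of events can be modelled in a few lines of Java. This is a loose simulation, not ART code: the class ThreadLifecycleSketch, its methods, and the pendingException field are hypothetical stand-ins for the native functions named above, and a catch block plays the role of the exception unwinding out of run().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the thread lifecycle; not the real runtime. */
final class ThreadLifecycleSketch {
    private final List<String> trace = new ArrayList<>();
    private Throwable pendingException; // models the native pending-exception slot

    /** Models CreateCallback: run the Java code, then unregister the thread. */
    List<String> createCallback(Runnable javaRun, Thread.UncaughtExceptionHandler handler) {
        trace.add("run:enter");
        try {
            javaRun.run();
        } catch (Throwable t) {
            pendingException = t; // the exception escaped run() entirely
        }
        trace.add("run:exit");
        unregister(handler);
        return trace;
    }

    private void unregister(Thread.UncaughtExceptionHandler handler) {
        trace.add("unregister");
        destroy(handler);
    }

    private void destroy(Thread.UncaughtExceptionHandler handler) {
        trace.add("destroy");
        handleUncaughtExceptions(handler);
    }

    private void handleUncaughtExceptions(Thread.UncaughtExceptionHandler handler) {
        if (pendingException == null) {
            return; // the thread ended normally, nothing to dispatch
        }
        Throwable error = pendingException;
        pendingException = null;
        trace.add("dispatch:" + error.getMessage());
        handler.uncaughtException(Thread.currentThread(), error);
    }

    /** Runs an endless loop that throws on its third iteration. */
    static List<String> run() {
        ThreadLifecycleSketch sketch = new ThreadLifecycleSketch();
        Runnable loop = () -> {
            for (int message = 0; ; message++) { // stand-in for a message loop
                if (message == 2) {
                    throw new IllegalStateException("boom");
                }
            }
        };
        return sketch.createCallback(loop, (t, e) -> sketch.trace.add("handled"));
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```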


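This raises another question: why does an exception make the thread leave its loop?

To answer it we have to start from how bytecode instructions are executed. Take a simple example in which the crash() method triggers a division by zero: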

public class Crash {
    public static void crash() {
        int i = 10/0;
    }
}

The smali code of the crash method is:

.method public static crash()V
    .registers 1

    .line 5
    const/16 v0, 0xa

    div-int/lit8 v0, v0, 0x0

    .line 6
    .local v0, "i":I
    return-void
.end method

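How Java code executes on Dalvik and ART is not covered in detail here. The division is compiled into a div-int/lit8 instruction. When the virtual machine executes it, the interpreter decodes the instruction by its opcode (0xDB in the table below) and dispatches to the corresponding handler, which for this instruction is DIV_INT_LIT8: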

//art/libdexfile/dex/dex_instruction_list.h
...
V(0xD3, DIV_INT_LIT16, "div-int/lit16", k22s, kIndexNone, kContinue | kThrow, kDivide | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xD4, REM_INT_LIT16, "rem-int/lit16", k22s, kIndexNone, kContinue | kThrow, kRemainder | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xD5, AND_INT_LIT16, "and-int/lit16", k22s, kIndexNone, kContinue, kAnd | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xD6, OR_INT_LIT16, "or-int/lit16", k22s, kIndexNone, kContinue, kOr | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xD7, XOR_INT_LIT16, "xor-int/lit16", k22s, kIndexNone, kContinue, kXor | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xD8, ADD_INT_LIT8, "add-int/lit8", k22b, kIndexNone, kContinue, kAdd | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xD9, RSUB_INT_LIT8, "rsub-int/lit8", k22b, kIndexNone, kContinue, kSubtract | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xDA, MUL_INT_LIT8, "mul-int/lit8", k22b, kIndexNone, kContinue, kMultiply | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xDB, DIV_INT_LIT8, "div-int/lit8", k22b, kIndexNone, kContinue | kThrow, kDivide | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xDC, REM_INT_LIT8, "rem-int/lit8", k22b, kIndexNone, kContinue | kThrow, kRemainder | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xDD, AND_INT_LIT8, "and-int/lit8", k22b, kIndexNone, kContinue, kAnd | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xDE, OR_INT_LIT8, "or-int/lit8", k22b, kIndexNone, kContinue, kOr | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xDF, XOR_INT_LIT8, "xor-int/lit8", k22b, kIndexNone, kContinue, kXor | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xE0, SHL_INT_LIT8, "shl-int/lit8", k22b, kIndexNone, kContinue, kShl | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
V(0xE1, SHR_INT_LIT8, "shr-int/lit8", k22b, kIndexNone, kContinue, kShr | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
...
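
To illustrate dispatch by opcode number, the sketch below decodes a div-int/lit8 instruction from its two 16-bit code units. It assumes the 22b layout of the Dalvik bytecode format (opcode in the low byte of the first unit, destination register in its high byte, source register and signed literal in the second unit). The class OpcodeDecodeSketch is made up, and only the 0xDB opcode from the table is handled.

```java
/** Hypothetical decoder for a single opcode; not the ART interpreter. */
final class OpcodeDecodeSketch {
    static final int DIV_INT_LIT8 = 0xDB; // opcode value taken from the table above

    /** Decodes one 22b-format instruction given its two code units. */
    static String decode(int unit0, int unit1) {
        int opcode = unit0 & 0xFF;            // low byte of the first unit
        if (opcode != DIV_INT_LIT8) {
            return "unhandled opcode 0x" + Integer.toHexString(opcode);
        }
        int destination = (unit0 >> 8) & 0xFF; // vAA
        int source = unit1 & 0xFF;             // vBB
        int literal = (byte) (unit1 >> 8);     // #+CC, sign-extended
        return "div-int/lit8 v" + destination + ", v" + source + ", " + literal;
    }

    public static void main(String[] args) {
        // Code units for "div-int/lit8 v0, v0, 0x0" from the smali listing.
        System.out.println(decode(0x00DB, 0x0000));
    }
}
```

The lookup is a numeric comparison on the opcode byte; the mnemonic string in the table is only used for printing and tooling.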

DIV_INT_LIT8 in turn calls DoIntDivide:

// art/runtime/interpreter/interpreter_switch_impl-inl.h

ALWAYS_INLINE void DIV_INT_LIT8() REQUIRES_SHARED(Locks::mutator_lock_) {
  bool success = DoIntDivide(shadow_frame, inst->VRegA_22b(inst_data),
                             shadow_frame.GetVReg(inst->VRegB_22b()), inst->VRegC_22b());
  POSSIBLY_HANDLE_PENDING_EXCEPTION(!success, Next_2xx);
}

DoIntDivide is defined as follows:

// art/runtime/interpreter/interpreter_common.h
// Handles div-int, div-int/2addr, div-int/li16 and div-int/lit8 instructions.
// Returns true on success, otherwise throws a java.lang.ArithmeticException and return false.
static inline bool DoIntDivide(ShadowFrame& shadow_frame, size_t result_reg,
                               int32_t dividend, int32_t divisor)
    REQUIRES_SHARED(Locks::mutator_lock_) {
  constexpr int32_t kMinInt = std::numeric_limits<int32_t>::min();
  if (UNLIKELY(divisor == 0)) {
    ThrowArithmeticExceptionDivideByZero();
    return false;
  }
  ...
}
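
The shape of this check, where failure is signalled by a false return value plus a pending exception rather than by a thrown C++ exception, can be mimicked in Java. The class IntDivideSketch, its register array, and its pending field are assumptions made for illustration only.

```java
/** Hypothetical model of a frame with registers and a pending exception. */
final class IntDivideSketch {
    final int[] registers = new int[4];
    ArithmeticException pending; // models the thread's pending exception

    /** Stores dividend / divisor in the result register, or records an error. */
    boolean doIntDivide(int resultRegister, int dividend, int divisor) {
        if (divisor == 0) {
            // Record the exception instead of computing; the caller checks the result.
            pending = new ArithmeticException("divide by zero");
            return false;
        }
        // Java defines Integer.MIN_VALUE / -1 as Integer.MIN_VALUE, so no extra case is needed here.
        registers[resultRegister] = dividend / divisor;
        return true;
    }

    public static void main(String[] args) {
        IntDivideSketch frame = new IntDivideSketch();
        System.out.println(frame.doIntDivide(0, 10, 0) + " pending=" + frame.pending);
    }
}
```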

Here we can see that when the divisor is 0, ThrowArithmeticExceptionDivideByZero throws the divide-by-zero exception. Tracing further:

// art/runtime/common_throws.cc
ThrowArithmeticExceptionDivideByZero() -> ThrowException() ->
// art/runtime/thread.cc
Thread::Current()-> ThrowNewException(exception_descriptor, nullptr) -> ThrowNewWrappedException(exception_class_descriptor, msg) ->

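Once inside art::Thread::ThrowNewWrappedException, a lot happens, including capturing the current thread's stack. In the end the exception object is stored in the exception field of the large tlsPtr_ structure.

In other words, as the virtual machine executes instructions one by one and meets an invalid operation such as division by zero, it raises the corresponding exception, returns step by step to the point of execution, records the exception as pending, and leaves the current instruction loop (this explanation is admittedly rough). When nothing catches the exception, the Java-level thread loop ends and control returns to the art::Thread::CreateCallback callback, where the dispatch described above takes place. So it is not the case that the virtual machine hit an unknown operation or instruction and failed unpredictably. It knows the operation is invalid and cannot yield a meaningful result, so it deliberately throws an exception to the application layer.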
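
The effect is easy to reproduce at the Java level. In the sketch below (UnwindDemo is a made-up class name), a worker thread runs an endless loop that divides by a shrinking divisor. Nothing catches the ArithmeticException, so it unwinds out of the loop and out of run(), and the thread's handler receives it.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical demo: an uncaught exception ends an endless loop. */
final class UnwindDemo {

    static int crash(int divisor) {
        return 10 / divisor; // throws ArithmeticException when divisor is 0
    }

    /** Returns "<iterations>:<exception class>" once the worker has died. */
    static String run() {
        AtomicInteger iterations = new AtomicInteger();
        AtomicReference<Throwable> uncaught = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            for (int divisor = 2; ; divisor--) { // endless loop, like a message loop
                iterations.incrementAndGet();
                crash(divisor);                  // the third iteration divides by zero
            }
        });
        worker.setUncaughtExceptionHandler((t, e) -> uncaught.set(e));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return iterations.get() + ":" + uncaught.get().getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```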