How ANRs Are Triggered
🔗 Reposted from: https://www.cnblogs.com/huansky/p/14954020.html
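1. Overview
Every Android developer has run into an ANR at some point. Why do ANRs happen, and what does the system do once one occurs? This article walks through the answer in detail.
ANR (Application Not Responding) means that an application has stopped responding. Android requires certain events to be handled within a fixed time window; if an event gets no effective response before the deadline, or the response takes too long, an ANR results. Typically a dialog then pops up telling the user that xxx is not responding, and the user can choose to keep waiting or Force Close.
Which scenarios cause an ANR?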
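- Service Timeout: for example, a foreground service does not finish executing within 20s;
- BroadcastQueue Timeout: for example, a foreground broadcast does not finish executing within 10s;
- ContentProvider Timeout: a content provider's publish step times out after 10s;
- InputDispatching Timeout: input event dispatching (key and touch events) times out after 5s.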
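Triggering an ANR can be split into three steps: planting the bomb, defusing the bomb, and detonating the bomb.
Planting the bomb means sending a delayed message (the bomb);
Defusing the bomb means cancelling that delayed message, so that it never fires;
Detonating the bomb means the delay has elapsed and the delayed message is now being handled (the bomb has gone off).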
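
The three steps can be modelled without any Android classes. The following is a minimal plain-Java sketch driven by a fake clock; the class `TimeoutBombs` and its methods are invented for illustration and are not part of the framework, which uses `Handler` messages instead.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of plant / defuse / detonate (hypothetical, not framework code). */
final class TimeoutBombs {
    // token -> absolute deadline in milliseconds, in insertion order
    private final Map<String, Long> pending = new LinkedHashMap<>();

    /** Plant: schedule a delayed timeout for this token. */
    void plant(String token, long nowMs, long timeoutMs) {
        pending.put(token, nowMs + timeoutMs);
    }

    /** Defuse: the work finished, so cancel the delayed timeout. */
    boolean defuse(String token) {
        return pending.remove(token) != null;
    }

    /** Detonate: remove and return every token whose deadline has been reached. */
    List<String> advanceTo(long nowMs) {
        List<String> exploded = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMs) {
                exploded.add(entry.getKey());
                it.remove();
            }
        }
        return exploded;
    }
}
```

A token that is defused before its deadline never shows up in the result of `advanceTo`, which is the whole point of the pattern: finishing on time is what prevents the ANR.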
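2. Service
The original post shows a flowchart of the service startup process here.
A Service Timeout is triggered when AMS.MainHandler, running on the "ActivityManager" thread, receives the SERVICE_TIMEOUT_MSG message.
There are two kinds of Service:
- For foreground services, the timeout is SERVICE_TIMEOUT = 20s;
- For background services, the timeout is SERVICE_BACKGROUND_TIMEOUT = 200s.
The field ProcessRecord.execServicesFg determines whether the service counts as started in the foreground.
2.1 Planting the bomb
While the service's process attaches to the system_server process, realStartServiceLocked() is called, and this is where the bomb gets planted.
Let's start with realStartServiceLocked, one of the methods on the service startup path: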
// ActiveServices.java
private final void realStartServiceLocked(ServiceRecord r, ProcessRecord app, boolean execInFg) throws RemoteException {
...
// Send the delayed message (SERVICE_TIMEOUT_MSG)
bumpServiceExecutingLocked(r, execInFg, "create");
try {
...
// Eventually runs the service's onCreate() method
app.thread.scheduleCreateService(r, r.serviceInfo,
mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
app.repProcState);
} catch (DeadObjectException e) {
mAm.appDiedLocked(app);
throw e;
} finally {
...
}
}
private final void bumpServiceExecutingLocked(ServiceRecord r, boolean fg, String why) {
...
scheduleServiceTimeoutLocked(r.app);
}
void scheduleServiceTimeoutLocked(ProcessRecord proc) {
if (proc.executingServices.size() == 0 || proc.thread == null) {
return;
}
long now = SystemClock.uptimeMillis();
Message msg = mAm.mHandler.obtainMessage(
ActivityManagerService.SERVICE_TIMEOUT_MSG);
msg.obj = proc;
// If SERVICE_TIMEOUT_MSG has not been removed by the time the timeout expires, the service timeout flow runs
mAm.mHandler.sendMessageAtTime(msg,
proc.execServicesFg ? (now+SERVICE_TIMEOUT) : (now+ SERVICE_BACKGROUND_TIMEOUT));
}
In AS.realStartServiceLocked, the method that starts a service, a delayed timeout message is sent. The delay again depends on whether the service is foreground or background:
// How long we wait for a service to finish executing. 20s
static final int SERVICE_TIMEOUT = 20*1000;
// How long we wait for a service to finish executing. 200s
static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;
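
To make the arithmetic concrete, here is a small plain-Java sketch of how the delivery time of the timeout message is chosen. The two constants mirror the ones quoted above; the class `ServiceTimeouts` and the method `deadline` are hypothetical names used only for this example.

```java
/** Sketch of the deadline selection in scheduleServiceTimeoutLocked (names are illustrative). */
final class ServiceTimeouts {
    static final int SERVICE_TIMEOUT = 20 * 1000;
    static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;

    /** Absolute uptime in ms at which the timeout message would be delivered. */
    static long deadline(long nowUptimeMs, boolean execServicesFg) {
        return nowUptimeMs + (execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
    }
}
```

With `now = 1000`, a foreground start gets a deadline of 21000 ms and a background start gets 201000 ms.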
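2.2 Defusing the bomb
Calling AS.realStartServiceLocked() plants a bomb that explodes if startup does not finish before the timeout. So when does the fuse get pulled? After a chain of calls through Binder and other layers, execution reaches handleCreateService() on the target process's main thread, and the bomb is defused from there.
// ActivityThread.java (side note: ApplicationThread is its inner class)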
private void handleCreateService(CreateServiceData data) {
...
java.lang.ClassLoader cl = packageInfo.getClassLoader();
Service service = (Service) cl.loadClass(data.info.name).newInstance();
...
try {
// Create the ContextImpl object
ContextImpl context = ContextImpl.createAppContext(this, packageInfo);
context.setOuterContext(service);
// Create the Application object
Application app = packageInfo.makeApplication(false, mInstrumentation);
service.attach(context, this, data.info.name, data.token, app,
ActivityManagerNative.getDefault());
// Call the service's onCreate() method
service.onCreate();
// Tell AMS that execution is done; this is what defuses the bomb
ActivityManagerNative.getDefault().serviceDoneExecuting(
data.token, SERVICE_DONE_EXECUTING_ANON, 0, 0);
} catch (Exception e) {
...
}
}
This step creates the target service object and calls back its onCreate() method. Then, after several more calls, control returns to system_server to run serviceDoneExecuting.
// ActiveServices
private void serviceDoneExecutingLocked(ServiceRecord r, boolean inDestroying, boolean finishing) {
...
if (r.executeNesting <= 0) {
if (r.app != null) {
r.app.execServicesFg = false;
r.app.executingServices.remove(r);
if (r.app.executingServices.size() == 0) {
// No service is executing any more in the process that hosts this service
mAm.mHandler.removeMessages(ActivityManagerService.SERVICE_TIMEOUT_MSG, r.app);
...
}
...
}
// How long we wait for a service to finish executing.
static final int SERVICE_TIMEOUT = 20*1000;
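Once the service has finished starting, this method removes the service timeout message SERVICE_TIMEOUT_MSG; for a foreground service the window for doing so is 20s.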
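
Note that in the snippet above the message is removed only when `executingServices` becomes empty, so the timeout is tracked per process rather than per service. A simplified plain-Java sketch of that bookkeeping follows; `ProcessServices` is a made-up class, and details such as `executeNesting` are deliberately left out.

```java
import java.util.HashSet;
import java.util.Set;

/** Simplified per-process bookkeeping (hypothetical class, not framework code). */
final class ProcessServices {
    private final Set<String> executingServices = new HashSet<>();
    private boolean timeoutMessagePending = false;

    /** A service operation starts: remember it and arm the timeout. */
    void startExecuting(String service) {
        executingServices.add(service);
        timeoutMessagePending = true;
    }

    /** A service operation finished: disarm only if nothing else is executing. */
    void doneExecuting(String service) {
        executingServices.remove(service);
        if (executingServices.isEmpty()) {
            timeoutMessagePending = false;
        }
    }

    boolean isTimeoutMessagePending() {
        return timeoutMessagePending;
    }
}
```

If two services are executing in the same process, finishing the first one leaves the timeout armed; only finishing the last one clears it.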
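2.3 Detonating the bomb
The previous sections covered planting and defusing the bomb. If the bomb is defused before the countdown ends, it never gets a chance to go off. Things do not always go to plan, though: in some extreme cases the bomb cannot be defused in time and it explodes, and the result is an ANR in the app. Let's take a look at the scene of the explosion.
The system_server process has a handler thread; when the countdown ends, the SERVICE_TIMEOUT_MSG message is delivered to the handler on that thread: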
// ActivityManagerService.java ::MainHandler
final class MainHandler extends Handler {
public MainHandler(Looper looper) {
super(looper, null, true);
}
@Override
public void handleMessage(Message msg) {
switch (msg.what) {
...
case SERVICE_TIMEOUT_MSG: {
mServices.serviceTimeout((ProcessRecord)msg.obj);
} break;
}
}
Once the delay has elapsed, the message is handled. Here is the handling logic:
void serviceTimeout(ProcessRecord proc) {
String anrMessage = null;
synchronized(mAm) {
if (proc.executingServices.size() == 0 || proc.thread == null) {
return;
}
final long now = SystemClock.uptimeMillis();
final long maxTime = now -
(proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
ServiceRecord timeout = null;
long nextTime = 0;
for (int i=proc.executingServices.size()-1; i>=0; i--) {
// Get the services that are currently executing in this process
ServiceRecord sr = proc.executingServices.valueAt(i);
if (sr.executingStart < maxTime) {
timeout = sr;
break;
}
if (sr.executingStart > nextTime) {
nextTime = sr.executingStart;
}
}
if (timeout != null && mAm.mLruProcesses.contains(proc)) {
Slog.w(TAG, "Timeout executing service: " + timeout);
StringWriter sw = new StringWriter();
PrintWriter pw = new FastPrintWriter(sw, false, 1024);
pw.println(timeout);
timeout.dump(pw, " ");
pw.close();
mLastAnrDump = sw.toString();
mAm.mHandler.removeCallbacks(mLastAnrDumpClearer);
mAm.mHandler.postDelayed(mLastAnrDumpClearer, LAST_ANR_LIFETIME_DURATION_MSECS);
anrMessage = "executing service " + timeout.shortName;
}
}
if (anrMessage != null) {
// If some service has timed out, call appNotResponding
mAm.appNotResponding(proc, null, null, false, anrMessage);
}
}
Here anrMessage has the form "executing service [info of the ServiceRecord that timed out]".
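
The core of serviceTimeout is the scan for a service whose start time is older than `now - timeout`. A stripped-down plain-Java version of that scan is sketched below, assuming the start times are given as a simple array; `ServiceTimeoutScan` and `findTimedOut` are illustrative names only.

```java
/** Sketch of the scan in serviceTimeout (hypothetical helper, not framework code). */
final class ServiceTimeoutScan {
    /**
     * Returns the index of the first service, scanning from the end,
     * whose executingStart is strictly older than now - timeout, or -1 if none is.
     */
    static int findTimedOut(long[] executingStart, long now, long timeout) {
        final long maxTime = now - timeout;
        for (int i = executingStart.length - 1; i >= 0; i--) {
            if (executingStart[i] < maxTime) {
                return i;
            }
        }
        return -1;
    }
}
```

Because the comparison is strict, a service that started exactly `timeout` milliseconds ago is not yet treated as timed out in this sketch.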
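2.4 Foreground versus background services
The startup timeout is 20s for a foreground service and 200s for a background service. How does the system tell the two apart? Here is the core logic in ActiveServices: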
ComponentName startServiceLocked(...) {
final boolean callerFg;
if (caller != null) {
final ProcessRecord callerApp = mAm.getRecordForAppLocked(caller);
callerFg = callerApp.setSchedGroup != ProcessList.SCHED_GROUP_BACKGROUND;
} else {
callerFg = true;
}
...
ComponentName cmp = startServiceInnerLocked(smap, service, r, callerFg, addToStarting);
return cmp;
}
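During startService, the scheduling group of the calling process, callerApp, decides whether the service being started counts as foreground or background. If the caller's scheduling group is not ProcessList.SCHED_GROUP_BACKGROUND (the background group), the service is treated as a foreground service; otherwise it is a background service. The result is recorded in the ServiceRecord member createdFromFg.
Which processes belong to the SCHED_GROUP_BACKGROUND scheduling group? Scheduling groups fall roughly into TOP, foreground, and background. The algorithms behind process priority (Adj) and scheduling group (SCHED_GROUP) are fairly complex, but the mapping can be roughly understood as follows: a process with Adj equal to 0 belongs to the TOP group, a process with Adj equal to 100 or 200 belongs to the foreground group, and a process with Adj greater than 200 belongs to the background group. For what each Adj value means, see the table in the original post. Put simply, processes with Adj > 200 are essentially invisible to the user and mostly do background work, which is why background services get a longer timeout threshold. Background services also belong to the background scheduling group and so, compared with foreground services in the foreground scheduling group, are given fewer CPU time slices.
Strictly speaking, a foreground service here means a service started by a process in the foreground scheduling group. This differs from the usual "fg-service", which means a service that carries a foreground notification.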
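
The rough Adj-to-group mapping described above can be written down as a few lines of plain Java. This is only a sketch of the approximation given in the text, not the real scheduling algorithm. The enum, class, and method names are invented, and lumping every other Adj value at or below 200 into the foreground group is an assumption made to keep the function total.

```java
/** Rough Adj -> scheduling group mapping from the text (hypothetical names). */
final class SchedGroups {
    enum Group { TOP, FOREGROUND, BACKGROUND }

    static Group groupForAdj(int adj) {
        if (adj > 200) return Group.BACKGROUND;
        if (adj == 0) return Group.TOP;
        // The text names 100 and 200; other values <= 200 are lumped in here (assumption).
        return Group.FOREGROUND;
    }

    /** Mirrors callerFg: anything that is not in the background group counts as foreground. */
    static boolean callerFg(int callerAdj) {
        return groupForAdj(callerAdj) != Group.BACKGROUND;
    }

    /** Service start timeout in ms implied by the caller's group (20s vs 200s). */
    static long startTimeoutMs(int callerAdj) {
        return callerFg(callerAdj) ? 20_000L : 200_000L;
    }
}
```

Under this approximation a caller with Adj 900 yields a background start with the 200s window, while a caller with Adj 0 or 100 yields a foreground start with the 20s window.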
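One thing to watch out for: if the log shows Reason: executing service com.example.baidu/.AnrService, the service itself is not necessarily the slow part. For example, if a time-consuming operation runs right after the service is started, the service's onCreate or onStartCommand cannot run; once the timeout expires, an ANR is still reported.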
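3. BroadcastReceiver
A BroadcastReceiver Timeout is triggered when BroadcastQueue.BroadcastHandler, running on the "ActivityManager" thread, receives the BROADCAST_TIMEOUT_MSG message.
There are two broadcast queues to consider, the foreground queue and the background queue:
- For foreground broadcasts, the timeout is BROADCAST_FG_TIMEOUT = 10s;
- For background broadcasts, the timeout is BROADCAST_BG_TIMEOUT = 60s.
3.1 Planting the bomb
First, the logic for sending a broadcast: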
// ActivityManagerService.java
public final int broadcastIntent(IApplicationThread caller,
Intent intent, String resolvedType, IIntentReceiver resultTo,
int resultCode, String resultData, Bundle resultExtras,
String[] requiredPermissions, int appOp, Bundle bOptions,
boolean serialized, boolean sticky, int userId) {
enforceNotIsolatedCaller("broadcastIntent");
synchronized(this) {
// Verify that the broadcast is valid
intent = verifyBroadcastLocked(intent);
// Look up the process that is sending the broadcast
final ProcessRecord callerApp = getRecordForAppLocked(caller);
final int callingPid = Binder.getCallingPid();
final int callingUid = Binder.getCallingUid();
final long origId = Binder.clearCallingIdentity();
try {
return broadcastIntentLocked(callerApp,
callerApp != null ? callerApp.info.packageName : null,
intent, resolvedType, resultTo, resultCode, resultData, resultExtras,
requiredPermissions, appOp, bOptions, serialized, sticky,
callingPid, callingUid, callingUid, callingPid, userId);
} finally {
Binder.restoreCallingIdentity(origId);
}
}
}
broadcastIntent() takes two boolean parameters, serialized and sticky, which together decide whether this is a normal broadcast, an ordered broadcast, or a sticky broadcast:
Type | serialized | sticky |
---|---|---|
sendBroadcast | false | false |
sendOrderedBroadcast | true | false |
sendStickyBroadcast | false | true |
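
The table maps directly onto a small lookup. The plain-Java sketch below returns the sending API name for each flag combination; `BroadcastKinds` and `describe` are illustrative names. The case where both flags are true is not in the table; it is included here on the assumption that it corresponds to `Context.sendStickyOrderedBroadcast`.

```java
/** Lookup for the serialized/sticky table (hypothetical helper). */
final class BroadcastKinds {
    static String describe(boolean serialized, boolean sticky) {
        if (serialized && sticky) {
            // Not listed in the table; assumed to match sendStickyOrderedBroadcast.
            return "sendStickyOrderedBroadcast";
        }
        if (serialized) return "sendOrderedBroadcast";
        if (sticky) return "sendStickyBroadcast";
        return "sendBroadcast";
    }
}
```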
With sending covered, let's move on to receiving broadcasts.
Once a broadcast has been sent, it has to sit in a queue somewhere until it is processed.
// ActivityManagerService
public ActivityManagerService(Context systemContext, ActivityTaskManagerService atm) {
// ...... Three queues are created to hold the different kinds of broadcast
mFgBroadcastQueue = new BroadcastQueue(this, mHandler,
"foreground", foreConstants, false);
mBgBroadcastQueue = new BroadcastQueue(this, mHandler,
"background", backConstants, true);
mOffloadBroadcastQueue = new BroadcastQueue(this, mHandler,
"offload", offloadConstants, true);
mBroadcastQueues[0] = mFgBroadcastQueue;
mBroadcastQueues[1] = mBgBroadcastQueue;
mBroadcastQueues[2] = mOffloadBroadcastQueue;
}
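The AMS constructor sorts broadcasts into categories (foreground, background, and offload) and puts the three queues into one array. The handler passed in is MainHandler, AMS's main handler; it is passed only so that the queue can obtain its looper.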
BroadcastQueue(ActivityManagerService service, Handler handler,
String name, BroadcastConstants constants, boolean allowDelayBehindServices) {
mService = service;
// The broadcast handler is created on the looper of the handler passed in from AMS
mHandler = new BroadcastHandler(handler.getLooper());
mQueueName = name;
mDelayBehindServices = allowDelayBehindServices;
mConstants = constants;
mDispatcher = new BroadcastDispatcher(this, mConstants, mHandler, mService);
}
Next, the logic that handles broadcasts:
private final class BroadcastHandler extends Handler {
public BroadcastHandler(Looper looper) {
super(looper, null, true);
}
@Override
public void handleMessage(Message msg) {
switch (msg.what) {
case BROADCAST_INTENT_MSG: {
if (DEBUG_BROADCAST) Slog.v(
TAG_BROADCAST, "Received BROADCAST_INTENT_MSG ["
+ mQueueName + "]");
// Start processing broadcasts
processNextBroadcast(true);
} break;
case BROADCAST_TIMEOUT_MSG: {
synchronized (mService) {
broadcastTimeoutLocked(true);
}
} break;
}
}
}
As you can see, processNextBroadcast is the method called to handle broadcasts.
final void processNextBroadcast(boolean fromMsg) {
synchronized(mService) {
// part1: handle parallel broadcasts
while (mParallelBroadcasts.size() > 0) {
r = mParallelBroadcasts.remove(0);
r.dispatchTime = SystemClock.uptimeMillis();
r.dispatchClockTime = System.currentTimeMillis();
final int N = r.receivers.size();
for (int i=0; i<N; i++) {
Object target = r.receivers.get(i);
// Dispatch the broadcast to the registered receiver
deliverToRegisteredReceiverLocked(r, (BroadcastFilter)target, false);
}
addBroadcastToHistoryLocked(r); // Add the broadcast to the history stats
}
// part2: handle the current ordered broadcast
do {
if (mOrderedBroadcasts.size() == 0) {
mService.scheduleAppGcsLocked(); // No more broadcasts waiting to be processed
if (looped) {
mService.updateOomAdjLocked();
}
return;
}
r = mOrderedBroadcasts.get(0); // Take the first ordered (serial) broadcast
boolean forceReceive = false;
int numReceivers = (r.receivers != null) ? r.receivers.size() : 0;
if (mService.mProcessesReady && r.dispatchTime > 0) {
long now = SystemClock.uptimeMillis();
if ((numReceivers > 0) && (now > r.dispatchTime + (2*mTimeoutPeriod*numReceivers))) {
broadcastTimeoutLocked(false); // The broadcast has taken too long, so force-finish it
}
}
...
if (r.receivers == null || r.nextReceiver >= numReceivers
|| r.resultAbort || forceReceive) {
if (r.resultTo != null) {
// Handle the broadcast message; this ends up calling onReceive()
performReceiveLocked(r.callerApp, r.resultTo,
new Intent(r.intent), r.resultCode,
r.resultData, r.resultExtras, false, false, r.userId);
}
cancelBroadcastTimeoutLocked(); // Cancel the BROADCAST_TIMEOUT_MSG message
addBroadcastToHistoryLocked(r);
mOrderedBroadcasts.remove(0);
continue;
}
} while (r == null);
// part3: get the next receiver
r.receiverTime = SystemClock.uptimeMillis();
if (recIdx == 0) {
r.dispatchTime = r.receiverTime;
r.dispatchClockTime = System.currentTimeMillis();
}
if (!mPendingBroadcastTimeoutMessage) {
long timeoutTime = r.receiverTime + mTimeoutPeriod;
setBroadcastTimeoutLocked(timeoutTime); // Schedule the delayed broadcast timeout message
}
// part4: handle the next ordered broadcast
ProcessRecord app = mService.getProcessRecordLocked(targetProcess,
info.activityInfo.applicationInfo.uid, false);
if (app != null && app.thread != null) {
app.addPackage(info.activityInfo.packageName,
info.activityInfo.applicationInfo.versionCode, mService.mProcessStats);
processCurBroadcastLocked(r, app); // Handle the ordered (serial) broadcast
return;
...
}
// The process for this receiver is not running yet, so start it
if ((r.curApp=mService.startProcessLocked(targetProcess,
info.activityInfo.applicationInfo, true,
r.intent.getFlags() | Intent.FLAG_FROM_BACKGROUND,
"broadcast", r.curComponent,
(r.intent.getFlags()&Intent.FLAG_RECEIVER_BOOT_UPGRADE) != 0, false, false))
== null) {
...
return;
}
}
}
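When the broadcast timeout comes into play:
- First, part 3 calls setBroadcastTimeoutLocked(timeoutTime) to schedule the broadcast timeout message;
- Then part 2 acts according to how broadcast handling went:
  - If the receivers have taken too long, it calls broadcastTimeoutLocked(false), which detonates the bomb;
  - Once the broadcast has finished executing, it calls cancelBroadcastTimeoutLocked, which defuses the bomb.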
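
processNextBroadcast uses two different time checks: a per-receiver deadline set in part 3, and a whole-broadcast limit checked in part 2. The plain-Java sketch below reproduces just the arithmetic; `BroadcastDeadlines` and its method names are invented, and the `mProcessesReady` and `dispatchTime > 0` preconditions from the original code are omitted.

```java
/** Arithmetic behind the two broadcast time checks (hypothetical helper). */
final class BroadcastDeadlines {
    /** Part 3: the current receiver must finish by receiverTime + timeoutPeriod. */
    static long receiverDeadline(long receiverTime, long timeoutPeriod) {
        return receiverTime + timeoutPeriod;
    }

    /** Part 2: the whole broadcast is overdue after 2 * timeoutPeriod * numReceivers. */
    static boolean wholeBroadcastOverdue(long now, long dispatchTime,
                                         long timeoutPeriod, int numReceivers) {
        return numReceivers > 0
                && now > dispatchTime + 2 * timeoutPeriod * numReceivers;
    }
}
```

For a foreground broadcast with three receivers and a 10s period, each receiver gets 10s, while the broadcast as a whole is force-finished only after more than 60s have passed since dispatch.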
// BroadcastQueue
final void setBroadcastTimeoutLocked(long timeoutTime) {
if (! mPendingBroadcastTimeoutMessage) {
Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this);
mHandler.sendMessageAtTime(msg, timeoutTime);
mPendingBroadcastTimeoutMessage = true;
}
}
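This schedules the delayed message BROADCAST_TIMEOUT_MSG: if the broadcast has still not been handled once mTimeoutPeriod has passed from now, the broadcast timeout flow begins.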
// BroadcastConstants.java
private static final long DEFAULT_TIMEOUT = 10_000;
// Timeout period for this broadcast queue
public long TIMEOUT = DEFAULT_TIMEOUT;
// Unspecified fields retain their current value rather than revert to default, so the timeout is configurable
TIMEOUT = mParser.getLong(KEY_TIMEOUT, TIMEOUT);
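Looking at the concrete values, the default timeout is 10s.
3.2 Defusing the bomb
The broadcast timeout mechanism is much the same as the service one:
// Cancel the timeout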
final void cancelBroadcastTimeoutLocked() {
if (mPendingBroadcastTimeoutMessage) {
// Remove the message
mHandler.removeMessages(BROADCAST_TIMEOUT_MSG, this);
mPendingBroadcastTimeoutMessage = false;
}
}
This removes the broadcast timeout message BROADCAST_TIMEOUT_MSG, and with that the bomb is defused.
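
Both setBroadcastTimeoutLocked and cancelBroadcastTimeoutLocked are guarded by mPendingBroadcastTimeoutMessage, so each queue has at most one timeout message in flight. A minimal plain-Java model of that guard is sketched here; `PendingTimeout` is a made-up class that stores the scheduled time instead of posting a real message.

```java
/** Model of the single pending timeout per queue (hypothetical class). */
final class PendingTimeout {
    private boolean pending = false;
    private long scheduledAt = -1;

    /** Arms the timeout unless one is already pending; returns whether it was armed. */
    boolean set(long timeoutTime) {
        if (pending) return false;
        scheduledAt = timeoutTime;
        pending = true;
        return true;
    }

    /** Disarms the timeout if one is pending; returns whether anything was cancelled. */
    boolean cancel() {
        if (!pending) return false;
        pending = false;
        scheduledAt = -1;
        return true;
    }

    long scheduledAt() {
        return scheduledAt;
    }
}
```

A second `set` while one is pending is ignored, so the first deadline stays in force until it is cancelled or fires.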
3.3 Detonating the bomb
Now for the detonation logic. The handler implementation in BroadcastQueue was covered earlier, so let's go straight to the timeout handling:
//fromMsg = true
final void broadcastTimeoutLocked(boolean fromMsg) {
if (fromMsg) {
mPendingBroadcastTimeoutMessage = false;
}
if (mOrderedBroadcasts.size() == 0) {
return;
}
long now = SystemClock.uptimeMillis();
BroadcastRecord r = mOrderedBroadcasts.get(0);
if (fromMsg) {
if (mService.mDidDexOpt) {
mService.mDidDexOpt = false;
long timeoutTime = SystemClock.uptimeMillis() + mTimeoutPeriod;
setBroadcastTimeoutLocked(timeoutTime);
return;
}
if (!mService.mProcessesReady) {
return; // While the system is not yet ready, broadcast processing has no timeouts
}
long timeoutTime = r.receiverTime + mTimeoutPeriod;
if (timeoutTime > now) {
// If the receiver that is currently executing has not timed out, re-arm the broadcast timeout
setBroadcastTimeoutLocked(timeoutTime);
return;
}
}
BroadcastRecord br = mOrderedBroadcasts.get(0);
if (br.state == BroadcastRecord.WAITING_SERVICES) {
// The broadcast has been handled but is waiting for started services to finish. Having waited long enough, move on to the next broadcast.
br.curComponent = null;
br.state = BroadcastRecord.IDLE;
processNextBroadcast(false);
return;
}
r.receiverTime = now;
// Increment the ANR count of the current BroadcastRecord
r.anrCount++;
if (r.nextReceiver <= 0) {
return;
}
...
Object curReceiver = r.receivers.get(r.nextReceiver-1);
// Look up the app process
if (curReceiver instanceof BroadcastFilter) {
BroadcastFilter bf = (BroadcastFilter)curReceiver;
if (bf.receiverList.pid != 0
&& bf.receiverList.pid != ActivityManagerService.MY_PID) {
synchronized (mService.mPidsSelfLocked) {
app = mService.mPidsSelfLocked.get(
bf.receiverList.pid);
}
}
} else {
app = r.curApp;
}
if (app != null) {
anrMessage = "Broadcast of " + r.intent.toString();
}
if (mPendingBroadcast == r) {
mPendingBroadcast = null;
}
// Move on to the next broadcast receiver
finishReceiverLocked(r, r.resultCode, r.resultData,
r.resultExtras, r.resultAbort, false);
scheduleBroadcastsLocked();
if (anrMessage != null) {
// Post the ANR, carrying the ANR process info and the ANR message
mHandler.post(new AppNotResponding(app, anrMessage));
}
}
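
One detail in broadcastTimeoutLocked is easy to miss: when the timeout message fires, the receiver that is running now may have started later than the one the message was armed for, so the deadline is recomputed from r.receiverTime before an ANR is declared. The plain-Java sketch below isolates that decision; `BroadcastTimeoutDecision` and its enum are illustrative names, and the other early-return branches of the original method are not modelled.

```java
/** Decision taken when BROADCAST_TIMEOUT_MSG fires (hypothetical helper). */
final class BroadcastTimeoutDecision {
    enum Action { REARM, ANR }

    static Action onTimeoutMessage(long receiverTime, long timeoutPeriod, long now) {
        long timeoutTime = receiverTime + timeoutPeriod;
        // The current receiver still has time left: schedule the message again.
        if (timeoutTime > now) {
            return Action.REARM;
        }
        // The current receiver has used up its window.
        return Action.ANR;
    }
}
```

For example, with a 10s period and a message firing at 10000 ms, a receiver that started at 8000 ms is given until 18000 ms, whereas one that started at 0 ms is reported as an ANR.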