Cutting some Slack, for leaks and giggles

In this article I run the new LeakCanary toolkit against the Slack Android app. Read on to learn a bunch!

A new LeakCanary toolkit

In two months, I will give a talk at Droidcon SF: Cutting Edges: universal heap trimming with LeakCanary 3

At Square, we scaled our LeakCanary usage over the last nine years by running it on all UI tests on every pull request, uploading leaks detected in debug builds, and triaging leaks weekly. This works: we fixed thousands of leaks (in our apps, third-party libraries, and the Android Framework), and we're now finding fewer and fewer new leaks!

Unfortunately, we sometimes see the heap size grow over time without LeakCanary finding any issue. For example, constantly appending string logs to a collection would not trigger LeakCanary but would still lead to ANRs and OOMEs when the app eventually runs out of memory.

Inspired by the BLeak paper and the work of the Android Studio team, I built a new toolkit in LeakCanary that performs repeated heap dump diffs and detects objects with a constantly increasing number of outgoing edges (for example, a list that keeps growing).
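To make the idea of repeated heap dump diffs concrete, here is a minimal sketch. It is not LeakCanary's implementation: it assumes each heap dump has already been reduced to a map from a stable node identifier to that node's outgoing edge count. The `HeapSnapshot` alias and `findRepeatedlyGrowingNodes` are hypothetical names made up for this illustration.

```kotlin
// Hypothetical model: each heap dump is reduced to a map from a stable node
// identifier (for example, a path from a GC root) to its outgoing edge count.
typealias HeapSnapshot = Map<String, Int>

/**
 * Returns the nodes whose outgoing edge count strictly increased between
 * every pair of consecutive snapshots.
 */
fun findRepeatedlyGrowingNodes(snapshots: List<HeapSnapshot>): Set<String> {
  require(snapshots.size >= 2) { "Need at least two snapshots to diff" }
  // Start with every node of the first snapshot as a candidate...
  var candidates: Set<String> = snapshots.first().keys
  // ...and keep only the ones that grew in each consecutive diff.
  for ((previous, current) in snapshots.zipWithNext()) {
    candidates = candidates.filterTo(mutableSetOf()) { node ->
      val before = previous[node]
      val after = current[node]
      before != null && after != null && after > before
    }
  }
  return candidates
}

fun main() {
  val snapshots = listOf(
    mapOf("logs list" to 10, "cache" to 5, "threads" to 3),
    mapOf("logs list" to 20, "cache" to 5, "threads" to 4),
    mapOf("logs list" to 30, "cache" to 7, "threads" to 4),
  )
  // Only "logs list" grew in both diffs.
  println(findRepeatedlyGrowingNodes(snapshots)) // prints [logs list]
}
```

Requiring growth in every diff, rather than just between the first and last dump, is what filters out caches that fill up once and then stay stable.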

Come learn how this works; together, we can fix all the leaks!

I have two months to turn a prototype into a real tool! I just shipped a preview in LeakCanary 3.0 alpha 4. I have been testing it on Square apps, and now I want to see if it's useful for other apps, especially complex apps. Please try it out!

In the meantime, I can do the work myself with other apps. Let's pick an app I use a lot... Slack Android!

Installing Slack Android on an emulator

LeakCanary works with heap dumps. The new heap growth detection toolkit can run as a UI Automator test, invoking am dumpheap to dump the heap of another app. Unfortunately, on normal Android OS builds, this only works for apps that are debuggable or profileable by shell, which obviously isn't the case for the Slack Android production app. Fortunately, these restrictions don't apply to Android userdebug OS builds, and emulator images without the Play Store are userdebug builds.
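Before going further, it can be worth confirming that the connected device really is a userdebug build. Here is a small sketch that reads the ro.build.type system property through adb. It assumes adb is on the PATH and exactly one device or emulator is connected; `isUserdebugBuild` and `readBuildType` are hypothetical helper names, not part of LeakCanary.

```kotlin
import java.util.concurrent.TimeUnit

// Hypothetical helper: interprets the output of `getprop ro.build.type`.
fun isUserdebugBuild(getpropOutput: String): Boolean =
  getpropOutput.trim() == "userdebug"

// Runs adb on the host. Assumes adb is on the PATH and one device is connected.
fun readBuildType(): String {
  val process = ProcessBuilder("adb", "shell", "getprop", "ro.build.type")
    .redirectErrorStream(true)
    .start()
  val output = process.inputStream.bufferedReader().readText()
  process.waitFor(10, TimeUnit.SECONDS)
  return output
}

fun main() {
  val buildType = readBuildType()
  if (isUserdebugBuild(buildType)) {
    println("userdebug build: am dumpheap should work for any app")
  } else {
    println("Build type is '${buildType.trim()}': expect am dumpheap to be restricted")
  }
}
```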

Let's download the APKs from my phone:

$ adb shell pm path com.Slack

package:/data/app/com.Slack/base.apk
package:/data/app/com.Slack/split_config.arm64_v8a.apk
package:/data/app/com.Slack/split_config.xxhdpi.apk

$ adb pull /data/app/com.Slack/base.apk
$ adb pull /data/app/com.Slack/split_config.arm64_v8a.apk
$ adb pull /data/app/com.Slack/split_config.xxhdpi.apk

I can then create an emulator similar to my phone and install them:

$ adb install-multiple base.apk split_config.arm64_v8a.apk split_config.xxhdpi.apk

UI Automator test

First let's get our Gradle setup right:

dependencies {
  androidTestImplementation 'com.squareup.leakcanary:leakcanary-android-uiautomator:3.0-alpha-4'
  androidTestImplementation libs.assertjCore
  androidTestImplementation libs.junit
  androidTestImplementation libs.androidX.test.runner
}

android {
  defaultConfig {
    testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
  }
}

Here's what the test looks like, repeatedly switching back and forth between two workspaces:

  @Test
  fun switching_workspaces_repeatedly_should_not_grow_heap() {
    val heapDiff = detector.findRepeatedlyGrowingObjects {
      device.openWorkspaceDrawer()
      device.selectWorkspace("Android Study Group")
      device.openWorkspaceDrawer()
      device.selectWorkspace("droidcon")
    }

    assertThat(heapDiff.growingObjects).isEmpty()
  }

I can then run the test.

Here's the full test class with setup code and helper functions:

import androidx.test.platform.app.InstrumentationRegistry
import androidx.test.uiautomator.By
import androidx.test.uiautomator.UiDevice
import androidx.test.uiautomator.Until
import org.assertj.core.api.Assertions.assertThat
import org.junit.Before
import org.junit.Test
import shark.ObjectGrowthDetector
import shark.forAndroidHeap

class SlackTest {

  private val device = UiDevice.getInstance(InstrumentationRegistry.getInstrumentation())!!

  private val detector = ObjectGrowthDetector.forAndroidHeap().repeatingUiAutomatorScenario(
    dumpedAppPackageName = SLACK_PKG,
    maxHeapDumps = 10,
    scenarioLoopsPerDump = 10
  )

  @Before
  fun setUp() {
    device.restartSlack()
  }

  @Test
  fun switching_workspaces_repeatedly_should_not_grow_heap() {
    val heapDiff = detector.findRepeatedlyGrowingObjects {
      device.openWorkspaceDrawer()
      device.selectWorkspace("Android Study Group")
      device.openWorkspaceDrawer()
      device.selectWorkspace("droidcon")
    }

    assertThat(heapDiff.growingObjects).isEmpty()
  }

  private fun UiDevice.restartSlack() {
    executeShellCommand("am force-stop $SLACK_PKG")
    wait(Until.gone(By.pkg(SLACK_PKG)), 5_000)
    executeShellCommand("am start $SLACK_PKG")
    wait(Until.findObject(WORKSPACE_DRAWER_ICON_BUTTON), 5_000)
  }

  private fun UiDevice.openWorkspaceDrawer() {
    val teamAvatarButton = findObject(WORKSPACE_DRAWER_ICON_BUTTON)!!
    teamAvatarButton.click()
    wait(Until.findObject(WORKSPACE_NAME_ROW), 5_000)
  }

  private fun UiDevice.selectWorkspace(name: String) {
    val group = findObject(By.text(name))!!
    group.click()
    wait(Until.gone(WORKSPACE_NAME_ROW), 5_000)
  }

  companion object {
    const val SLACK_PKG = "com.Slack"
    val WORKSPACE_NAME_ROW = By.res(SLACK_PKG, "workspace_name")!!
    val WORKSPACE_DRAWER_ICON_BUTTON = By.res(SLACK_PKG, "team_avatar_button")!!
  }
}

If anyone wants to try running findRepeatedlyGrowingObjects() with Maestro, be my guest!

Results

I shared the results with the team at Slack. I want to showcase just one of the results, as it's interesting:

There was 1 failure:
1) switching_workspaces_repeatedly_should_not_grow_heap(SlackTest)
java.lang.AssertionError:
Expecting empty but was:<[
┬───
│ GcRoot(ThreadObject) (372 objects)
    Retained size: 289 KB
    Retained objects: 7609
    Children:
    372 objects (20 new): INSTANCE_FIELD Thread.blockerLock -> instance of java.lang.Object
    372 objects (20 new): INSTANCE_FIELD Thread.inheritedAccessControlContext -> instance of java.security.AccessControlContext
    371 objects (20 new): INSTANCE_FIELD Thread.lock -> instance of java.lang.Object
    201 objects (20 new): INSTANCE_FIELD Thread.target -> instance of slack.app.SlackAppProdImpl$$ExternalSyntheticLambda2
,

...

Here I see an increase of 20 threads between two heap dumps. I ran the scenario 10 times between heap dumps, and the scenario switches workspaces twice, so that's 20 workspace switches per heap dump: one new thread per workspace switch.
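The same arithmetic, written as a tiny helper in case you want to normalize growth numbers for your own scenarios. `newObjectsPerAction` is a hypothetical name; the inputs are the values from this test run.

```kotlin
// Hypothetical helper: how many new objects appear per user action,
// given the growth observed between two consecutive heap dumps.
fun newObjectsPerAction(
  newObjectsPerDump: Int,
  scenarioLoopsPerDump: Int,
  actionsPerLoop: Int
): Double {
  val actionsPerDump = scenarioLoopsPerDump * actionsPerLoop
  return newObjectsPerDump.toDouble() / actionsPerDump
}

fun main() {
  // 20 new threads, 10 scenario loops per dump, 2 workspace switches per loop.
  val perSwitch = newObjectsPerAction(
    newObjectsPerDump = 20,
    scenarioLoopsPerDump = 10,
    actionsPerLoop = 2
  )
  println(perSwitch) // prints 1.0
}
```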

Let's figure out what these new threads are. I can dump the heap from adb:

$ adb shell am dumpheap -g com.Slack /data/local/tmp/slack.hprof
$ adb pull /data/local/tmp/slack.hprof

Then I can write a Kotlin script that parses the heap dump, groups threads by name and counts them:

#!/usr/bin/env kotlin

@file:DependsOn("com.squareup.leakcanary:shark:3.0-alpha-2")

import java.io.File
import shark.HprofHeapGraph.Companion.openHeapGraph

val hprofFile = File("./slack.hprof")

val threadCounts = hprofFile.openHeapGraph().use { graph ->
  graph.findClassByName(Thread::class.java.name)!!.instances
    // group by thread name
    .groupingBy { threadInstance ->
      threadInstance[Thread::class.java.name, "name"]!!.value.readAsJavaString()
    }
    .eachCount()
    .toList()
    // sort by count
    .sortedBy { it.second }
}

println(threadCounts.joinToString("\n") {
  "\"${it.first}\": ${it.second}"
})

...
"ms-event-dispatcher-1": 2
"OkHttp TaskRunner": 2
"NewSqlTransactionMonitor": 4
"file-upload-manager": 201

So a new thread named file-upload-manager is created every time I switch workspaces, and it never goes away. Not to worry though, I'm told this will be fixed in the near future.
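To illustrate why threads like these pile up, here is a purely hypothetical sketch. This is not Slack's code and I'm not claiming it matches the actual cause. It only shows the general pattern: a component that starts a worker thread on creation and never stops it leaves one more live thread behind each time it's recreated. `HypotheticalUploadManager` and `liveThreadsNamed` are made-up names.

```kotlin
import java.util.concurrent.CountDownLatch

// Hypothetical stand-in for a component that starts a worker thread
// when it is created and never stops it.
class HypotheticalUploadManager {
  private val neverReleased = CountDownLatch(1)

  private val worker = Thread({ neverReleased.await() }, "file-upload-manager").apply {
    // Daemon only so that this demo JVM can exit; not a statement about any real app.
    isDaemon = true
    start()
  }
}

// Counts the live threads that have the given name.
fun liveThreadsNamed(name: String): Int =
  Thread.getAllStackTraces().keys.count { it.name == name }

fun main() {
  // Simulate 20 workspace switches, each recreating the component.
  repeat(20) { HypotheticalUploadManager() }
  // The manager instances are unreachable, but their threads are still alive.
  println(liveThreadsNamed("file-upload-manager"))
}
```

A running thread is a GC root, so everything its Runnable references stays in memory even after the object that started the thread is gone.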

I was really excited to show you how you can write a Kotlin script to analyze a heap dump, but in this case it would have been much easier to go with a thread dump:

$ adb shell ps -T | grep $SLACK_PID | awk '{print $10}' | sort | uniq -c | sort

...
   2 ms-event-dispat
   2 OkHttp TaskRunn 
   4 NewSqlTransacti
 201 file-upload-man
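If you'd rather stay in Kotlin, the same counting can be sketched as a function over the ps -T output. This assumes the column layout USER PID TID PPID VSZ RSS WCHAN ADDR S CMD, which may differ across Android versions, and the sample data in main() is made up. Unlike grep, it matches the PID column exactly, and it keeps thread names that contain spaces. Note that the kernel truncates thread names to 15 characters, which is why the names above look cut off.

```kotlin
// Counts thread names for one process from `ps -T` output.
// Assumes the column layout: USER PID TID PPID VSZ RSS WCHAN ADDR S CMD
fun countThreadNames(psOutput: String, pid: String): List<Pair<String, Int>> =
  psOutput.lineSequence()
    .map { line -> line.trim().split(Regex("\\s+")) }
    // Keep only rows for the requested process (column 2 is the PID).
    .filter { columns -> columns.size >= 10 && columns[1] == pid }
    // Column 10 onward is the thread name, which may contain spaces.
    .groupingBy { columns -> columns.drop(9).joinToString(" ") }
    .eachCount()
    .toList()
    // Sort by count, ascending.
    .sortedBy { it.second }

fun main() {
  // Made-up sample output, for illustration only.
  val sample = """
    USER PID TID PPID VSZ RSS WCHAN ADDR S CMD
    u0_a190 4242 4242 350 100 50 0 0 S com.Slack
    u0_a190 4242 4301 350 100 50 0 0 S file-upload-man
    u0_a190 4242 4302 350 100 50 0 0 S file-upload-man
    u0_a190 9999 9999 350 100 50 0 0 S other.app
  """.trimIndent()

  countThreadNames(sample, pid = "4242").forEach { (name, count) ->
    println("$count $name")
  }
}
```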

Conclusion

I hope this convinced you to try out the new heap growth detection toolkit in LeakCanary 3. You can use it with JVM unit tests, Espresso, UI Automator, and even directly from the command line. Let me know what you think!