Gunwant Jain

IPC between Termux and Other Android Apps using ZMQ


Preface

At FlowDrive, we do not have the luxury of travelling heavy. Everything has to be as fast as it can be, because every millisecond counts in a single loop. That is why we turned to ZeroMQ for handling all the networking between our services. ZeroMQ is battle-tested, extremely fast, and supports a wide variety of platforms.

ZMQ has a Java implementation, JeroMQ, which is a complete rewrite of ZMQ in Java. It would have been our first choice, but it does not support several transports because they lack Java implementations, and IPC is one of them.

But as explained above, we could not afford to have our messages go through the entire TCP stack. IPC is just simpler and much faster.

Fortunately, there exists JZMQ, the Java bindings for libzmq. I gotta mention that using it in our apps introduced more complexity, due to the lack of prebuilt packages for different architectures. We made it work though, by building it ourselves and cross-compiling it for whatever architectures we needed to support.

To make sure that the reader follows the context, I have to explain the environment FlowPilot works in.

Termux

Not everything we write is in Java. Some services have the liberty of being written in much slower languages like Python. We run those services inside Termux, an Android app which gives us a Unix-like userland on Android. One of the key pieces of software I worked on was an init system / process manager for FlowPilot -- FlowInit.

FlowInit is written in Python, and therefore has to start in Termux. FlowInit communicates with FlowPilot over Pub/Sub and Req/Rep. Some of these communications are sensitive, and making them over TCP would increase the attack surface substantially. This was the perfect time to utilise IPC.

IPC b/w Termux and Android

I mounted the Android directories in Termux-land and bound a ZMQ REP socket on a path shared between Android and Termux.

import time

import zmq

# Config (FlowInit's own settings object) is defined elsewhere in the project.


def wait_for_green_flag():
    """Waits for a ready signal from javaland to start FlowInit"""

    context = zmq.Context()
    socket = context.socket(zmq.REP)
    socket.bind("ipc:///data/data/com.termux/files/home/storage/shared/Documents/houston")

    while True:
        # Throttle the loop; recv_string() below blocks until a request arrives
        time.sleep(Config.FREQUENCY)

        # Wait on getting a flag, then send an ACK and initiate flowinit
        flag = socket.recv_string()
        if flag == "green_flag":
            socket.send_string("ACK")
            break

        # A REP socket must answer every request before it can receive again
        # (the "NACK" reply text is an arbitrary placeholder)
        socket.send_string("NACK")

But I was faced with this unusual error:

Traceback (most recent call last):
  File "/data/data/com.termux/files/home/dev/flowinit/venv/bin/flowinit", line 33, in <module>
    sys.exit(load_entry_point('flowinit==0.1.0', 'console_scripts', 'flowinit')())
  File "/data/data/com.termux/files/home/dev/flowinit/venv/lib/python3.10/site-packages/flowinit-0.1.0-py3.10.egg/flowinit/flowinit.py", line 132, in main
    wait_for_green_flag()
  File "/data/data/com.termux/files/home/dev/flowinit/venv/lib/python3.10/site-packages/flowinit-0.1.0-py3.10.egg/flowinit/flowinit.py", line 58, in wait_for_green_flag
    socket.bind(HOST)
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/zmq/sugar/socket.py", line 208, in bind
    super().bind(addr)
  File "zmq/backend/cython/socket.pyx", line 540, in zmq.backend.cython.socket.Socket.bind
  File "zmq/backend/cython/checkrc.pxd", line 28, in zmq.backend.cython.checkrc._check_rc
zmq.error.ZMQError: Invalid argument
make: *** [Makefile:30: run] Error 1

I cross-checked with the PyZMQ repository, but this error was not explicitly defined there.

Then I checked the libzmq codebase, where this error was again not explicitly defined.

Finally I went on to strace this:

# ...
unlinkat(AT_FDCWD,"/data/data/com.termux/files/home/storage/shared/Documents/houston", 0) = -1 ENOENT (No such file or directory)
socket(AF_UNIX, SOCK_STREAM, 0)         = 13
fcntl(13, F_SETFD, FD_CLOEXEC)          = 0
bind(13, {sa_family=AF_UNIX, sun_path="/data/data/com.termux/files/home/storage/shared/Documents/houston"}, 67) = -1 EINVAL (Invalid argument)
close(13)
# ...

The actual error was arising from the bind syscall. On to reading the man page of bind:

ERRORS
       EINVAL addrlen is wrong, or addr is not a valid address for this
              socket's domain.

But as the strace log clearly shows, the sun_path is well under the limit (sun_path holds 108 bytes on Linux). Clearly some other goof was at play here. I tried again after setting SELinux to permissive, but the error still came up.
To date I have not figured out what was wrong with the path I provided for the IPC socket. But all evidence points to something off (maybe for the sake of security) in how Android symlinks directories [1].
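As a sanity check on the length argument, here is a small sketch that measures the path from the strace log. It assumes the Linux layout of struct sockaddr_un described in unix(7): a 2-byte sa_family followed by a 108-byte sun_path. The constant and function names are my own, purely for illustration.

```python
# Assumption: Linux's struct sockaddr_un is a 2-byte sa_family followed by
# a 108-byte sun_path (see unix(7)).
SUN_PATH_MAX = 108
SA_FAMILY_SIZE = 2

# The path exactly as it appears in the strace log
PATH = "/data/data/com.termux/files/home/storage/shared/Documents/houston"


def fits_in_sun_path(path: str) -> bool:
    """Return True if the path plus its terminating NUL fits in sun_path."""
    return len(path.encode()) + 1 <= SUN_PATH_MAX


path_len = len(PATH.encode())
print(path_len, fits_in_sun_path(PATH))

# Compare this sum with the addrlen argument (67) that strace shows for bind()
print(SA_FAMILY_SIZE + path_len)
```

The sum lines up with the addrlen of 67 in the log, so the "addrlen is wrong" half of EINVAL does not look like the culprit.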

After some more research on Unix domain sockets, I came across this paper [2], which has since been removed from the clearnet, but the Wayback Machine has a copy.

I learned more about IPC on Linux, and came across Abstract Sockets.

Abstract Sockets

Whenever you want to create sockets for IPC, you have two options on Linux --

  • Filesystem namespace
    An address in this namespace is associated with a file on the filesystem. When the server binds to an address (pathname), a socket file is automatically created.
    The format for a Filesystem namespaced socket is simple: ipc:///path/to/some/file
  • Abstract namespace
    Abstract namespace addresses are neat. Addresses in this namespace are not associated with a file on the filesystem at all. Instead they live inside the kernel and are only listed in /proc/net/unix.

An Abstract socket address is distinguished from a Filesystem socket by setting sun_path[0] to a null byte \0.
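To make the null-byte convention concrete, here is a minimal sketch using Python's standard socket module rather than ZMQ. It is Linux-only, and the socket name is a hypothetical one I made up for the demo. Both ends live in one process just to keep the example short.

```python
import os
import socket


def abstract_roundtrip(name: str) -> bytes:
    """Bind, connect and send one message over an abstract Unix socket."""
    # A leading NUL byte selects the abstract namespace (Linux only)
    address = b"\0" + name.encode()

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(address)
        server.listen(1)
        # connect() completes once the connection sits in the listen backlog
        client.connect(address)
        conn, _ = server.accept()
        with conn:
            client.sendall(b"green_flag")
            return conn.recv(64)
    finally:
        client.close()
        server.close()


# Hypothetical socket name, made unique per process to avoid clashes
NAME = f"houston-demo-{os.getpid()}"
received = abstract_roundtrip(NAME)
print(received)
```

No socket file is created and nothing has to be unlinked afterwards; the name disappears once the last socket bound to it is closed.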

Since any form of thin virtualisation (chroot, proot, or whatever Termux does) only changes the filesystem view and still bind-mounts the /dev, /sys and /proc pseudo-filesystems into its userland, abstract namespace addresses sidestep the problem of needing a shared filesystem between Android and Termux by taking the filesystem out of the picture entirely.

We could just make an abstract IPC socket with the same name on both Termux and an Android app, and both would look up the socket by that name in the kernel, where it is listed in /proc/net/unix.

Who needs block-based filesystems when the kernel's synthetic filesystems are so versatile?

ZMQ and Abstract Sockets

The only thing remaining was to confirm whether ZMQ supports abstract sockets. Having perused the codebase several times earlier, I remembered that it did indeed [3]. We just need to prefix our paths with @ and ZMQ identifies them as abstract namespace addresses.
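A minimal sketch of what the @ prefix looks like with PyZMQ, assuming Linux and an installed pyzmq. The endpoint name is hypothetical, and a Python REQ socket stands in for the Java side here; this is not FlowPilot's actual code.

```python
import zmq

# Hypothetical endpoint; the leading "@" asks libzmq for an abstract address
ENDPOINT = "ipc://@flowinit-demo"


def green_flag_handshake(endpoint: str = ENDPOINT) -> str:
    """Run one REQ/REP exchange over an abstract IPC endpoint."""
    context = zmq.Context()
    rep = context.socket(zmq.REP)
    req = context.socket(zmq.REQ)
    # Fail with zmq.Again instead of hanging if a message never arrives
    rep.setsockopt(zmq.RCVTIMEO, 2000)
    req.setsockopt(zmq.RCVTIMEO, 2000)
    try:
        rep.bind(endpoint)
        req.connect(endpoint)

        # The "javaland" side would send this from the Android app
        req.send_string("green_flag")

        flag = rep.recv_string()
        rep.send_string("ACK" if flag == "green_flag" else "NACK")
        return req.recv_string()
    finally:
        rep.close(linger=0)
        req.close(linger=0)
        context.term()


reply = green_flag_handshake()
print(reply)
```

The only change from a filesystem endpoint is the address string, so the rest of the socket code can stay as it was.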

Final Implementation

Before implementing my hypothesis, I cross-checked the Android docs to see whether their security policies block this behaviour, and found that Android says it blocks app-level access to /proc/net/unix [4].

I quickly coded the Android app and pointed its sockets at new hosts in the abstract namespace.

It worked!

Live Run

[1] Termux normally symlinks the directories. It would be better to study how Android practices its security when symlinking (source).

[3] Source; man page, which I read after solving everything :(

[4] Android docs, which are kind of misleading because I did not face any issues in setting my PoC up.
