Building social features without betraying your users

Architecture · Android · Privacy

A practical journey from cloud-first to local-first social sync, and why the conventional approach was the wrong one.


When I started designing the social layer for a health tracking app, the obvious path was clear: Firebase Firestore, real-time listeners, cloud-synced state. It's what everyone does. It works. And it took about fifteen minutes of honest reflection to realise it was completely wrong for this product.

This is the story of how I ended up with a CRDT-based local-first architecture — and why I think the conventional cloud-first approach for social health features deserves far more scrutiny than it gets.

The problem with the obvious approach

The typical social feature in a health app works like this: when you complete a workout, finish a fast, or hit your step target, the app writes your status to a cloud database. Your friends' apps subscribe to that database and display your status in real time. Simple, proven, scalable.

But pause and ask: what exactly is sitting on that cloud database?

What "aggregated booleans" actually reveal: isActive=true, elapsedHours=14, completedToday=false, written at 7:04am. Repeat daily. That's a detailed behavioural profile of someone's health habits — when they eat, when they fast, how consistent they are — sitting on a third-party server indefinitely.

The instinct to call this "just metadata" or "aggregated data" is a rationalisation. A continuous stream of health-adjacent behavioural signals is a health profile, regardless of what you name the fields. And once that data is on someone else's infrastructure, you no longer control what happens to it.

What users actually think

The market research is unambiguous. Studies of health app reviews report that privacy complaints in negative reviews correlate with how much personal data collection an app's code contains. Competing apps in the health space explicitly market "your data stays on your device" as a feature, because users recognise it as one.

The regulatory direction is also one-way. Platform policies are tightening around health data, requiring developers to prove that any health-adjacent data accessed is essential to the app's primary function. "We use it for social features" is not a strong answer.

Cloud-first social

Health behaviour written to third-party servers continuously. Server operator can read behavioural patterns. Data persists indefinitely. User consent is buried in ToS.

Local-first social

Plaintext health data stays on device. It leaves only as ciphertext addressed to other group members, relayed through the server. Server sees coordination metadata only. User controls their data physically.

The architecture that emerged

The core insight is that social features require two fundamentally different things: coordination (who's in my group, how do I find them) and health data (what's my status right now). These can be decoupled. Coordination metadata can live on a server you control. Health data never has to leave the device in readable form.

The three layers

Signaling server — coordination only

Stores: group membership, display names, FCM tokens, invite codes, Ed25519 public keys, encrypted snapshots (ciphertext only, 7-day TTL)
Never stores: any health status, activity data, streak values, or plaintext health-adjacent data
Stack: Ktor 3.x on a $0 Oracle Cloud free-tier VM — sufficient for thousands of users
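
The storage boundary above can be made explicit in the server's data model. What follows is a minimal sketch in Java (the server here is Ktor, so a real version would more likely be Kotlin). The names `GroupMember`, `EncryptedSnapshot` and `SnapshotStore` are hypothetical, and the in-memory list stands in for whatever database the server actually uses. The point is that the only payload type the server can hold is an opaque byte array with an expiry.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Coordination metadata the server may hold. No health field exists on this type.
record GroupMember(String groupId, String memberId, String displayName,
                   String fcmToken, byte[] ed25519PublicKey) {}

// An opaque blob: the server stores and serves it but holds no key to decrypt it.
record EncryptedSnapshot(String groupId, String memberId, byte[] ciphertext, Instant storedAt) {
    static final Duration TTL = Duration.ofDays(7); // the 7-day TTL stated above

    boolean isExpired(Instant now) {
        return !now.isBefore(storedAt.plus(TTL));
    }
}

// Hypothetical in-memory store, for illustration only.
final class SnapshotStore {
    private final List<EncryptedSnapshot> snapshots = new ArrayList<>();

    void put(EncryptedSnapshot snapshot) {
        snapshots.add(snapshot);
    }

    // Drops expired blobs and returns how many were removed.
    int purgeExpired(Instant now) {
        int before = snapshots.size();
        snapshots.removeIf(s -> s.isExpired(now));
        return before - snapshots.size();
    }

    int size() {
        return snapshots.size();
    }
}
```

Because no type on the server side has a field for status, streaks or steps, adding one would be a visible schema change rather than a quiet extra key in a document.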

CRDT sync engine — health data stays encrypted

Type: Last-Write-Wins registers per member — correct for this use case since each member is the sole writer of their own record
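Transport: Encrypted deltas (AES-256-GCM, with the key derived through ECDH key agreement) posted to the server as opaque ciphertext, collected by the requester, decrypted locally
Keys: Ed25519 keypairs in Android Keystore, non-exportable and hardware-backed on devices with secure hardware. Ed25519 is a signature scheme, so the ECDH step needs a key-agreement key (X25519 or a NIST curve such as P-256) alongside it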
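
The transport encryption can be sketched with the standard JCA classes. This is a minimal illustration under stated assumptions, not the app's actual code: it uses software P-256 keys so that it runs on any JVM (on a device the private key would live in Android Keystore), it derives the AES key by hashing the ECDH secret with SHA-256 where a production design would normally use HKDF, and the class name `DeltaCrypto` is hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.spec.ECGenParameterSpec;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyAgreement;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class DeltaCrypto {
    private static final int IV_BYTES = 12;  // standard GCM nonce length
    private static final int TAG_BITS = 128; // GCM authentication tag length

    // Assumption: software P-256 keys, so the sketch runs outside Android.
    static KeyPair generateKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("EC");
            gen.initialize(new ECGenParameterSpec("secp256r1"));
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // ECDH shared secret -> SHA-256 -> 32-byte AES key (simplified KDF).
    private static SecretKeySpec deriveKey(PrivateKey mine, PublicKey theirs)
            throws GeneralSecurityException {
        KeyAgreement agreement = KeyAgreement.getInstance("ECDH");
        agreement.init(mine);
        agreement.doPhase(theirs, true);
        byte[] secret = agreement.generateSecret();
        return new SecretKeySpec(MessageDigest.getInstance("SHA-256").digest(secret), "AES");
    }

    // Output layout: 12-byte random IV, then ciphertext with the GCM tag appended.
    static byte[] encrypt(PrivateKey sender, PublicKey recipient, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(sender, recipient),
                    new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plaintext);
            byte[] out = Arrays.copyOf(iv, iv.length + body.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Throws IllegalStateException if the message was tampered with or the keys do not match.
    static byte[] decrypt(PrivateKey recipient, PublicKey sender, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(recipient, sender),
                    new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
            return cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both sides derive the same AES key from their own private key and the other side's public key, so the server relaying the bytes never holds anything that can decrypt them. GCM also authenticates the message, which means a modified delta fails to decrypt instead of merging as garbage.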

Local database — source of truth

member_state: merged CRDT state per member per group, LWW semantics
activity_feed: local-only feed entries, never sent to any server
challenge_history: append-only record of completed and expired challenges
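
The LWW semantics of member_state can be sketched as follows. This is an illustration under stated assumptions rather than the app's schema: an in-memory map stands in for the database table, `MemberState` and `MemberStateTable` are hypothetical names, and the example fields are borrowed from the status fields mentioned earlier. It also assumes each write carries a timestamp from its author's device.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// One member's replicated state. Only that member ever writes it.
record MemberState(String memberId, long updatedAtMillis, boolean isActive, int elapsedHours) {}

final class MemberStateTable {
    // Key is "groupId/memberId", mirroring one row per member per group.
    private final Map<String, MemberState> rows = new HashMap<>();

    // LWW merge: the incoming state wins only if it is strictly newer.
    // Returns true if the stored row changed.
    boolean merge(String groupId, MemberState incoming) {
        String key = groupId + "/" + incoming.memberId();
        MemberState current = rows.get(key);
        if (current != null && current.updatedAtMillis() >= incoming.updatedAtMillis()) {
            return false; // stale or duplicate delta, keep what we have
        }
        rows.put(key, incoming);
        return true;
    }

    Optional<MemberState> get(String groupId, String memberId) {
        return Optional.ofNullable(rows.get(groupId + "/" + memberId));
    }
}
```

Because each record has a single writer, two states with the same timestamp can be treated as the same write, so the merge needs no tie-breaker. Deltas can arrive late, twice, or out of order and the table still converges on the newest state.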

How a sync actually works

User pulls to refresh. App posts a sync request to the signaling server.

Server sends FCM data messages to all group members. Payload contains only a group ID and request ID — zero health content.

Each member's device receives the message silently. WorkManager schedules a background job that encrypts the member's current state under a key derived from the requester's public key and posts the ciphertext to the server.

Requester collects and decrypts deltas locally using the private key stored in hardware — it never leaves the device. Each delta is merged into the local CRDT via LWW semantics.

UI renders progressively as members respond. Non-responders show last-known cached state with a timestamp. Server deletes all deltas after serving.
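
The requester's side of the steps above can be sketched as a small state machine. This is a simplified illustration, not the app's code: `WakeUp`, `Delta` and `SyncRound` are hypothetical names, the deltas are assumed to be already decrypted, and the status is a plain string. It shows the wake-up payload carrying identifiers only, and the UI rows falling back to cached state for members who have not responded.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// What the FCM data message carries: identifiers only, no health content.
record WakeUp(String groupId, String requestId) {
    Map<String, String> toData() {
        return Map.of("groupId", groupId, "requestId", requestId);
    }
}

// One member's delta after local decryption.
record Delta(String memberId, long updatedAtMillis, String status) {}

final class SyncRound {
    private final Map<String, Delta> lastKnown = new LinkedHashMap<>(); // cached local state
    private final Set<String> responded = new HashSet<>();

    SyncRound(List<Delta> cached) {
        for (Delta d : cached) {
            lastKnown.put(d.memberId(), d);
        }
    }

    // Called once per delta as members respond; merges with LWW semantics.
    void onDelta(Delta d) {
        responded.add(d.memberId());
        Delta current = lastKnown.get(d.memberId());
        if (current == null || d.updatedAtMillis() > current.updatedAtMillis()) {
            lastKnown.put(d.memberId(), d);
        }
    }

    // One row per member: fresh for responders, cached with a timestamp otherwise.
    List<String> render(List<String> members) {
        List<String> rows = new ArrayList<>();
        for (String member : members) {
            Delta d = lastKnown.get(member);
            if (d == null) {
                rows.add(member + ": no data yet");
            } else if (responded.contains(member)) {
                rows.add(member + ": " + d.status());
            } else {
                rows.add(member + ": " + d.status() + " (cached, as of " + d.updatedAtMillis() + ")");
            }
        }
        return rows;
    }
}
```

Calling `render` after every `onDelta` gives the progressive rendering described above: each response upgrades one row from cached to fresh without waiting for the rest of the group.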

What the server actually knows

This is the most important property of the architecture. At no point does the signaling server receive plaintext health data. The encrypted snapshots it holds are AES-256-GCM ciphertext, and the keys that decrypt them are derived from private keys that live in device hardware. Without those keys, reading the snapshots is computationally infeasible.

Data residency

Group membership list: signaling server
Display names: signaling server
Activity status: device only
Step counts: device only
Any health metric: device only
Streak values: device only
Encrypted snapshots (ciphertext): server (unreadable)
Health data on third-party infrastructure: never

The honest trade-offs

This architecture is not free. It requires writing a small signaling server rather than using managed Firebase rules. The CRDT merge logic adds complexity that Firestore listeners would hide. The crypto layer — while using standard Android APIs — requires careful key lifecycle management.

There is one irreducible minimum of server dependency: you cannot wake a device whose app is not running without some form of push addressing service. FCM fills this role here, carrying only a wake-up signal with no health content. Eliminating it would require both devices to have the app open at the same time, an unacceptable UX constraint.

The scale ceiling also matters. With a hard cap of 50 members per group, vector clock overhead stays bounded at roughly 2KB of metadata, sync completes in under a second on any reasonable connection, and the signaling server handles thousands of users on infrastructure costing nothing per month.
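
The 2KB figure is consistent with simple arithmetic if each clock entry is about 40 bytes. The layout below (a 32-byte member ID plus an 8-byte counter) is an assumption made for illustration, not the app's actual wire format, and `MetadataBudget` is a hypothetical name.

```java
final class MetadataBudget {
    static final int MAX_MEMBERS = 50;     // hard cap per group, from the text
    static final int MEMBER_ID_BYTES = 32; // assumption: fixed-size member identifier
    static final int COUNTER_BYTES = 8;    // assumption: 64-bit counter or timestamp

    // Worst case: every member of a full group has one clock entry.
    static int worstCaseBytes() {
        return MAX_MEMBERS * (MEMBER_ID_BYTES + COUNTER_BYTES);
    }
}
```

With these sizes a full group costs 50 × 40 = 2000 bytes, and the cap on group size is what keeps that number from growing.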

Why this matters beyond health apps

The pattern generalises. Any social feature where the shared data is sensitive — financial status, location, relationship state, communication history — deserves this level of scrutiny. The default assumption that "we need a cloud database for social features" is worth questioning every time. The coordination layer and the data layer are separable, and treating them as one is an architectural decision with real privacy consequences.

Local-first is not a niche philosophy. It is, increasingly, what users expect from apps that handle anything personal. Building it in from the start is significantly easier than retrofitting it. The architecture described here adds perhaps two weeks to a social feature sprint. The cost of not building it — in user trust, in regulatory exposure, in the genuine harm of unnecessary data collection — is harder to quantify but considerably larger.

