What Makes a Good AI Companion App (And Why Privacy Isn't Optional)
Most of the conversation about AI assistants frames the use case around productivity. Summarize this email. Write this code. Research this topic. That framing makes sense in a business context, where the value of AI is easy to quantify: time saved, tasks completed, money not spent on contractors.
But it misses a significant portion of how people actually use AI chat in practice. And when you look at that other category of use cases — the personal, emotional, low-stakes-but-still-private ones — the entire question of where your words go takes on a different weight.
What people actually use AI chat for
There's a kind of conversation that happens at the edge of your day. After something frustrating happens at work and you're not quite ready to talk about it with your partner. When you're trying to figure out how you actually feel about a decision. When you're bored on a commute and want to think out loud about something that's been sitting in the back of your mind. When you just want to make dumb jokes with something that will play along.
These aren't productivity tasks. They're closer to what you'd do if you had a friend who was always available, never tired, and could meet you wherever you are emotionally without requiring reciprocity. The kind of conversational outlet that doesn't cost anything socially — you don't have to wonder if you're being too much, or worry about the other person's bandwidth.
"I use it to think out loud. Half the time I don't even need an answer — I just need to say the thing to something that will respond like it's listening."
This is what the "companion" framing is actually about. Not a simulated relationship, not a substitute for human connection — more like a reflective surface that can push back a little, ask a follow-up, offer a different framing. Low stakes. Available at 2am. Doesn't judge you for caring about something that, objectively, isn't that big a deal.
Why privacy matters specifically for this use case
When you're using AI for productivity tasks — writing a cover letter, debugging code, summarizing a document — there's some privacy sensitivity, but most people have made their peace with it. The information you share is relatively professional, relatively impersonal. You know it's going through a server somewhere. You've made the tradeoff.
The companion use case is different. The things you share with a companion — that you're dreading a conversation with your manager, that you feel stuck, that something embarrassed you today, that you're not sure how you feel about something — are not things you would generally share publicly. They're the texture of your interior life.
There's an asymmetry between what you share and what you expect to happen with it. When you vent to a friend, that information lives in their memory and nowhere else. When you vent to a cloud AI service, it may be stored on servers, reviewed for safety moderation, used in aggregate to improve models, or potentially accessible through legal processes or breaches. Most users don't think about this in the moment, because the interaction feels private — it's just you and the chat window.
This gap between the felt privacy and the actual privacy is worth taking seriously, especially as AI companions become more capable and more commonly used for exactly this kind of personal conversation.
Cloud AI vs. on-device AI: what the difference actually means
Most AI chat services — ChatGPT, Claude, Gemini, Perplexity — are cloud-based. Your message is encrypted and sent to a remote server, where a large language model processes it and generates a response, which is sent back to your device. The model running on that server is typically much more capable than anything that can run on a phone today.
On-device AI works differently. The model weights — the billions of numerical parameters that define how the model responds — are downloaded to your device and stored locally. When you send a message, it goes nowhere. The inference (the computation that generates a response) happens in RAM on your device's processor. The response appears on your screen without any network call.
Cloud AI
- Larger, more capable models
- Fast responses (server hardware)
- Always up to date
- Your words go to a server
- Subject to service terms and data policies
- Requires internet connection
On-device AI
- Smaller models (constrained by phone RAM)
- Slower on older hardware
- Works offline after download
- Your words never leave the device
- No server, no data policy exposure
- ~1.5–2 GB one-time download
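The RAM constraint and the download size in the list above follow directly from some simple arithmetic. Here's a rough sketch, assuming an illustrative 3-billion-parameter model (the parameter count is an assumption for the example, not PlusOne's actual model size):

```python
# Back-of-envelope memory footprint for a language model's weights at
# different numeric precisions. The 3-billion-parameter count is an
# illustrative assumption, not a claim about PlusOne's actual model.

def model_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Storage needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

params = 3e9  # assumed 3B-parameter model

print(f"fp32: {model_size_gb(params, 32):.1f} GB")  # 12.0 GB - far beyond phone RAM
print(f"fp16: {model_size_gb(params, 16):.1f} GB")  #  6.0 GB - still too large
print(f"int4: {model_size_gb(params, 4):.1f} GB")   #  1.5 GB - in the download range above
```

This is why on-device models are smaller and why quantization (covered below) matters: at full 32-bit precision, even a mid-sized model's weights alone would exceed any phone's memory.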
The honest version is that on-device models today produce good conversation but aren't as capable as the best cloud models. They may lose track of context in very long chats, occasionally give vaguer answers, or reason less precisely about complex topics. That's a real tradeoff.
The question is whether that tradeoff is worth it for the specific use case. For productivity tasks where accuracy really matters, probably not — use the best cloud model available. For casual conversation, emotional processing, low-stakes chat — the kind of interaction where "pretty good" is genuinely good enough — on-device is compelling precisely because the privacy it gains is worth the capability it gives up.
What "on-device" actually means technically
It's worth being specific, because "on-device AI" can be used loosely in marketing. Here's what it means in PlusOne's case: the model weights are stored in the app's private local storage on your iPhone. When you send a message, the text is passed to a local inference runtime (using Apple's Core ML and Neural Engine capabilities). The runtime runs forward passes through the model entirely in device RAM. No data leaves the device during inference. The response is generated locally and written to local storage.
The model itself is an open-source quantized language model — meaning the original floating-point weights have been compressed to use less memory, trading some precision for the ability to fit on phone hardware. The quantization happens before the model is distributed; you download the already-compressed version.
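The quantization idea can be shown in a few lines. This is a minimal sketch of symmetric 8-bit quantization, the simplest version of the general technique; the actual scheme used by any given open-source model may differ (for example, 4-bit group-wise quantization):

```python
# Minimal sketch of symmetric int8 weight quantization. Real model
# quantization schemes are more elaborate (per-group scales, 4-bit packing),
# but the core tradeoff - less memory, small rounding error - is the same.

def quantize(weights: list[float]) -> tuple[list[int], float]:
    """Map float weights to int8 values in [-127, 127] plus one shared scale."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights at inference time."""
    return [v * scale for v in q]

weights = [0.082, -0.441, 0.305, -0.017]  # toy example values
q, scale = quantize(weights)
restored = dequantize(q, scale)

# Each weight now needs 1 byte instead of 4, at the cost of a rounding
# error bounded by half the scale step:
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(max_error <= scale / 2)
```

In practice the precision loss shows up as slightly less sharp model behavior rather than visible errors, which is part of why quantized models work well for conversational use.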
After the initial model download, PlusOne makes no network requests during a conversation. You could turn on airplane mode and every feature that involves the AI would still work exactly the same way.
The tradeoffs PlusOne chose
Building around on-device inference means giving up things. The models are smaller. The initial setup requires a ~1.5–2 GB download. Performance varies by device — older iPhones are noticeably slower at inference than newer ones. There's no seamless sync across devices, because there's no server to sync through.
These were deliberate choices. The premise of PlusOne is that for casual, personal conversation — the companion use case — the privacy guarantee is worth more than the capability gap. If you're going to talk to an AI about your actual life, you should be able to do it knowing that conversation genuinely stays with you.
That's not a universal answer. It's a specific bet about a specific use case. But for the person who wants a conversational outlet that's genuinely private, not just marketed as private — on-device is the only architecture that can actually deliver that promise.
PlusOne is a private AI companion for iOS. Conversations run on your device using downloadable open-source models. Learn more about how it works.