If you work remotely, you probably have a VPN. Maybe your company mandates it. Maybe you set one up yourself because you take security seriously. Either way, you trust it to keep your data safe.
Here's the uncomfortable truth: your VPN does nothing to protect the contents of your meeting transcripts once they reach a cloud AI service.
A VPN encrypts the tunnel between your device and the internet. That's genuinely useful—it prevents your ISP, your coffee shop's Wi-Fi owner, or a man-in-the-middle attacker from snooping on your traffic. But the moment your audio arrives at Otter.ai's servers, Fireflies.ai's infrastructure, or Zoom's AI Companion backend, the VPN's job is done. Your words are now someone else's data.
And as a Wired investigation into remote work surveillance revealed, the tools marketed as productivity boosters often double as data collection pipelines.
The Security Illusion Remote Workers Live In
The typical remote work security stack looks something like this:
- VPN — encrypts network traffic
- Encrypted messaging — protects chat conversations
- Two-factor authentication — secures account access
- Full disk encryption — protects local files
All excellent. But notice what's missing: none of it protects your data after it arrives at a cloud service.
When you use a cloud-based AI transcription tool, here's the actual data flow:
1. Your microphone captures audio on your device
2. The app encrypts the audio and sends it to cloud servers (your VPN protects this step)
3. Cloud servers decrypt your audio
4. AI models process your raw speech
5. Your transcript is stored on cloud infrastructure
6. Your data may be used for model training, analytics, or shared with partners
Steps 3 through 6 happen entirely outside your control. Your VPN, your firewall, your endpoint security—none of it applies anymore. You've handed your words to a third party and hoped for the best.
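You can see the transit-only nature of that protection in what a typical upload call looks like on the client side. This is a hedged sketch, not any vendor's actual code: the endpoint URL is hypothetical, and `uploadAudio` is an illustrative helper name. The point is that TLS (and any VPN underneath it) encrypts the connection, not the recipient's storage.

```swift
import Foundation

// Sketch with a hypothetical endpoint: a typical audio upload.
// HTTPS encrypts this hop, but the server decrypts everything on
// arrival -- steps 3 through 6 happen with plaintext audio on
// infrastructure you don't control.
func uploadAudio(_ audioData: Data) {
    var request = URLRequest(url: URL(string: "https://api.example-transcriber.com/v1/audio")!)
    request.httpMethod = "POST"
    request.setValue("audio/m4a", forHTTPHeaderField: "Content-Type")

    // Your VPN and TLS protect only the bytes in transit.
    URLSession.shared.uploadTask(with: request, from: audioData) { _, _, _ in
        // Once the request lands, the provider holds the decrypted audio
        // under its own retention and training policies.
    }.resume()
}
```

Nothing in the client code constrains what the server does next; that's governed entirely by the provider's policies.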
What Cloud Transcription Services Actually Do With Your Audio
Let's look at what the major cloud transcription services disclose in their own policies.
Otter.ai's privacy policy states that they collect and process your audio recordings, transcripts, and associated metadata. They retain this data to "improve and develop" their services—language that typically means training AI models on your content.
Fireflies.ai's privacy policy similarly grants them rights to process your meeting data through their cloud infrastructure, with data retention policies that keep your information on their servers well after you've forgotten the meeting happened.
Zoom's privacy statement has drawn particular scrutiny since the launch of their AI Companion feature. As The Verge reported, Zoom updated its terms of service to allow AI training on customer content—a move that sparked widespread backlash from privacy advocates.
The Data Retention Problem
| Service | Audio Storage | Transcript Retention | Used for AI Training? |
|---|---|---|---|
| Otter.ai | Cloud servers | Until account deletion + retention period | Service improvement (likely) |
| Fireflies.ai | Cloud servers | Per retention policy | Service improvement |
| Zoom AI Companion | Zoom cloud | Per admin settings | Updated TOS allows it |
| Basil AI | Your device only | You control it | Never |
For remote workers discussing anything sensitive—client projects, financial data, personnel matters, product roadmaps—this is a significant exposure. Every meeting transcript sitting on a cloud server is a liability.
Remote Work Has Made This Problem Exponentially Worse
Before 2020, most sensitive meetings happened in conference rooms. There was no digital transcript. No audio file. The conversation existed only in the memories (and handwritten notes) of attendees.
Now, remote workers attend an average of 25 to 30 meetings per week, according to research from Harvard Business Review. Each meeting potentially generates a full audio recording and transcript. That's 100+ hours of sensitive conversation per month being funneled into cloud AI systems.
The scale is staggering. Consider what a single remote employee's meeting transcripts over one year might contain:
- Client strategy discussions and proprietary information
- Employee performance reviews and HR conversations
- Product launch timelines and competitive intelligence
- Financial forecasts and revenue figures
- Vendor negotiations and contract terms
- Casual comments that could be taken out of context
All of this, sitting on someone else's server, protected only by that company's security practices and good intentions. As we've explored in our article on how cloud services use your voice data for AI training, those good intentions have well-documented limits.
The Breach Risk Is Real and Growing
Cloud transcription services are high-value targets for attackers. Why? Because they contain concentrated, searchable records of private business conversations across thousands of organizations.
A breach of a major transcription service wouldn't just expose one company's data—it would expose every company that uses the service. This is the aggregation risk that security professionals worry about.
According to IBM's Cost of a Data Breach Report, the average cost of a data breach reached $4.88 million in 2024, with breaches involving AI-related systems trending even higher due to the volume and sensitivity of data involved.
> The question isn't whether cloud AI services will be breached. It's when. And when it happens, years of meeting transcripts—containing the most candid, unfiltered conversations your organization has—will be exposed.
Your VPN can't protect you from a breach on someone else's infrastructure. Neither can your endpoint security, your firewall, or your security awareness training. The only way to avoid this risk entirely is to ensure your meeting data never leaves your device in the first place.
Why "Enterprise-Grade Security" Isn't the Answer
Cloud transcription vendors love to tout their security certifications. SOC 2 Type II. ISO 27001. AES-256 encryption at rest. These are meaningful credentials—they indicate that a company takes security seriously.
But they don't change the fundamental architecture. Your data still lives on their servers. Their employees may still have access. Their AI models may still train on your content. And a sufficiently determined attacker can still reach it.
For professionals in regulated industries, this matters even more. HIPAA security requirements from HHS mandate strict controls over protected health information. A healthcare worker taking AI-transcribed meeting notes about patient care through a cloud service is creating a compliance liability, regardless of how many certifications that service has.
We've covered the specific risks for regulated professionals in detail in our piece on when AI transcription apps become surveillance tools—the dynamics are remarkably similar for remote workers concerned about their employers monitoring meeting content.
The On-Device Alternative: How It Actually Works
On-device AI transcription eliminates the cloud from the equation entirely. Here's how Basil AI handles meeting transcription:
- Audio capture — Your microphone records audio on your iPhone or Mac
- Local processing — Apple's on-device Speech Recognition API transcribes in real time, using the Apple Neural Engine
- Local storage — Transcripts are stored on your device, protected by iOS/macOS encryption
- Your export, your choice — Send to Apple Notes via iCloud, or keep it entirely on-device
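The local-processing step above can be sketched with Apple's Speech framework. This is an illustrative example, not Basil AI's actual implementation; `transcribeLocally` is a hypothetical helper name. The key line is `requiresOnDeviceRecognition = true`, which makes the request fail outright rather than fall back to Apple's servers if no local model is available.

```swift
import Speech

// Sketch: on-device-only transcription via Apple's Speech framework.
// With requiresOnDeviceRecognition set, audio never leaves the device.
func transcribeLocally(audioFile url: URL, completion: @escaping (String?) -> Void) {
    // Recognizer for the current locale; check that a local model exists.
    guard let recognizer = SFSpeechRecognizer(),
          recognizer.supportsOnDeviceRecognition else {
        completion(nil)  // no on-device model for this locale
        return
    }

    let request = SFSpeechURLRecognitionRequest(url: url)
    request.requiresOnDeviceRecognition = true  // never fall back to the cloud

    recognizer.recognitionTask(with: request) { result, error in
        if let result = result, result.isFinal {
            completion(result.bestTranscription.formattedString)
        } else if error != nil {
            completion(nil)
        }
    }
}
```

Because recognition runs locally, the same call works with no network connection at all, which is also why offline operation appears in the criteria list below.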
At no point does your audio or transcript touch a third-party server. There's no cloud processing, no retention policy to worry about, no terms of service granting usage rights to your content.
This is the architectural difference that matters. It's not about adding more encryption layers to a cloud pipeline. It's about removing the cloud from the pipeline entirely.
Apple has invested heavily in making on-device AI processing viable. As detailed in Apple's Machine Learning Research publications, their Neural Engine is specifically designed to run sophisticated AI models locally, with performance that rivals cloud processing for many tasks.
What Remote Workers Actually Need
The ideal meeting transcription solution for remote workers should meet these criteria:
- Zero cloud dependency — Processing happens entirely on-device
- Works offline — No internet connection required for transcription
- No account required — No user data to breach in the first place
- User-controlled data — Delete whenever you want, with nothing lingering on external servers
- Long recording capability — Full-day remote work sessions without gaps
- Integration-friendly — Export to tools you already use, on your terms
Basil AI was built specifically around these requirements. With 8-hour continuous recording, real-time on-device transcription, speaker identification, smart summaries, and Apple Notes integration, it delivers the productivity benefits of AI transcription without any of the privacy trade-offs.
The Bottom Line: Protect the Endpoint, Not Just the Pipe
Your VPN protects the pipe. That's valuable. Keep using it.
But for meeting transcription, the pipe isn't where your data is at risk. The risk is at the destination—the cloud servers where your audio is decrypted, processed, stored, and potentially used in ways you never intended.
The only way to eliminate that risk is to ensure your meeting data never has a destination beyond your own device. That's not paranoia. That's sound security architecture.
Remote work gave us flexibility and freedom. On-device AI ensures that freedom doesn't come at the cost of our privacy.