Building CCTV RTSP Live: A macOS App for Monitoring IP Cameras
The story of building a macOS app for RTSP camera monitoring from scratch - what started as a personal itch turned into a full-featured indie app.
Series: Building a CCTV App for macOS
- Building CCTV RTSP Live: A macOS App for Monitoring IP Cameras
- The Math Behind Cursor-Centered Zoom
- Building a Recording Engine with FFmpeg on macOS
- Fixing H265 Stream Freeze on Tuya Cameras
I built this app because I was tired of opening SmartLife on my phone just to check my cameras.
I have several IP cameras at home - different brands, but they all run on the Tuya platform and are managed through the SmartLife app. Remote access via the app is free, and it works fine on mobile. But on macOS, there’s no native client. The web version exists but it’s clunky and the performance isn’t great. Every time I wanted to glance at a feed on my Mac, I had to reach for my phone.
The other pain point: cloud storage. If you want to record footage, Tuya charges per camera per year. With more than a couple of cameras, that adds up fast. And you’re storing your footage on their servers, not yours.
Then I noticed something: all my cameras had an ONVIF setting in the SmartLife app. ONVIF is an open standard for IP cameras, and its streaming side is built on RTSP - so a camera that supports it can typically be reached through a standard RTSP stream URL, directly over the local network. No cloud required. No subscription. No SmartLife.
That was the unlock. I didn’t need to integrate with Tuya’s cloud or SmartLife’s API. I just needed an app that could open RTSP streams and show them all at once.
So I built it.
Why Not Just Use SmartLife?
SmartLife works fine on mobile. On macOS, there’s no native client. The web version is limited and noticeably slower. And even if you get it working, you’re still routing your video through Tuya’s cloud servers - which means latency and dependency on their uptime.
For local monitoring on your own network, ONVIF and RTSP give you a direct connection to the camera. Faster, cheaper, and fully under your control.
VLC can open RTSP streams, but it’s not designed for multi-camera monitoring. You’d need to open a separate window per camera, manage them manually, and there’s no persistent configuration.
I wanted something that felt like a real Mac app: native UI, persistent camera list, multi-camera grid, always available.
How It Started
The first version was embarrassingly minimal. A sidebar with camera names, a single view that played the selected stream. No persistence, no settings, no error handling. Just enough to prove the concept worked.
VLCKit handles RTSP playback well, so that part came together quickly. The hard part was everything around it - managing multiple streams simultaneously, handling reconnects gracefully, making the UI feel like a real Mac app rather than a prototype.
The Evolution
Multi-camera grid came next. The single-camera view was fine for checking one feed, but the whole point of having multiple cameras is seeing them all at once. I built a grid layout that scales from 1 to N cameras, with cells that resize dynamically as you add or remove cameras.
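One way to get a grid that scales from 1 to N cameras is to aim for a near-square layout. The sketch below is a minimal illustration rather than the app's actual layout code: `gridDimensions` and `cellFrame` are hypothetical names, and the ceil-of-square-root rule is an assumption about how the columns could be chosen.

```swift
import Foundation

/// Rows and columns for a near-square grid that holds `count` cells.
/// Assumption: columns = ceil(sqrt(count)), rows = just enough to fit.
func gridDimensions(for count: Int) -> (rows: Int, columns: Int) {
    guard count > 0 else { return (0, 0) }
    let columns = Int(Double(count).squareRoot().rounded(.up))
    let rows = (count + columns - 1) / columns   // integer ceiling division
    return (rows, columns)
}

/// Frame of the cell at `index`, filling `bounds` row by row
/// starting at the bounds origin. Every cell gets the same size.
func cellFrame(at index: Int, count: Int, in bounds: CGRect) -> CGRect {
    let (rows, columns) = gridDimensions(for: count)
    guard rows > 0, index >= 0, index < count else { return .zero }
    let width = bounds.width / CGFloat(columns)
    let height = bounds.height / CGFloat(rows)
    let row = index / columns
    let column = index % columns
    return CGRect(x: bounds.minX + CGFloat(column) * width,
                  y: bounds.minY + CGFloat(row) * height,
                  width: width,
                  height: height)
}
```

Recomputing every frame from the camera count alone keeps resizing simple: adding or removing a camera just means calling `cellFrame` again for each cell.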
Each cell runs its own VLC media player instance. This sounds expensive but in practice it’s fine - modern Macs handle several simultaneous streams without breaking a sweat, especially with hardware decoding enabled.
The dashboard came after I realized I wanted a persistent overview - something I could leave running on a secondary display. The dashboard is a separate window mode with a cleaner layout, no sidebar, just the grid. It supports a floating-window mode that stays on top of other apps.
Recording took the longest. VLCKit’s built-in recording doesn’t work reliably for live RTSP streams, so I ended up building a separate recording engine using FFmpeg as a subprocess. That decision cascaded into a bunch of interesting problems: codec detection, container compatibility, graceful shutdown. I wrote about that separately in Building a Recording Engine with FFmpeg on macOS.
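To make the subprocess idea concrete, here is a rough sketch of how an FFmpeg recorder could be assembled with Foundation's `Process`. It is not the engine from the app: `recordingArguments` and `makeRecorder` are hypothetical helpers, the stream URL and paths are placeholders, and audio handling is left out entirely. The flags are standard FFmpeg options (`-rtsp_transport`, `-c copy`, `-tag:v hvc1`).

```swift
import Foundation

/// Arguments that copy an RTSP stream into a file without re-encoding.
func recordingArguments(streamURL: String, outputPath: String, isHEVC: Bool) -> [String] {
    var args = [
        "-rtsp_transport", "tcp",   // input option, so it must come before -i
        "-i", streamURL,
        "-c", "copy",               // remux only, no transcoding
    ]
    if isHEVC {
        // QuickTime expects HEVC in MP4/MOV to carry the hvc1 tag.
        args += ["-tag:v", "hvc1"]
    }
    args.append(outputPath)
    return args
}

/// Configures, but does not launch, the FFmpeg subprocess.
func makeRecorder(ffmpegPath: String, arguments: [String]) -> Process {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: ffmpegPath)
    process.arguments = arguments
    return process
}

// Usage (not executed here; the ffmpeg path is an assumption):
// let recorder = makeRecorder(ffmpegPath: "/usr/local/bin/ffmpeg", arguments: args)
// try recorder.run()
// ...
// recorder.interrupt()       // sends SIGINT so ffmpeg can finalize the file
// recorder.waitUntilExit()
```

Stopping with `interrupt()` instead of `terminate()` is the graceful-shutdown part: FFmpeg gets the chance to write the container trailer, which a hard kill would skip.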
Substream support was a late addition that turned out to be more useful than I expected. Most cameras expose two streams: a main stream (high resolution, high bitrate) and a substream (lower resolution, lower bitrate). For the grid view, you don’t need full resolution per cell - the cells are too small to see the difference. Switching cells to substreams cuts bandwidth and CPU usage significantly.
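The selection rule described above can be sketched in a few lines. The `Camera` struct, `ViewingMode` enum, and `streamURL(for:mode:)` function are hypothetical names for illustration; the only logic taken from the text is "use the substream in the grid, fall back to the main stream when a camera has none".

```swift
import Foundation

struct Camera {
    let name: String
    let mainStreamURL: URL
    let substreamURL: URL?   // optional: not every camera exposes one
}

enum ViewingMode {
    case grid     // many small cells
    case single   // one camera, full size
}

/// Picks the stream to open for a camera in the given mode.
func streamURL(for camera: Camera, mode: ViewingMode) -> URL {
    switch mode {
    case .single:
        return camera.mainStreamURL
    case .grid:
        // Small cells do not benefit from full resolution.
        return camera.substreamURL ?? camera.mainStreamURL
    }
}
```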
Technical Challenges
A few problems that took longer than expected:
H265 compatibility. Some cameras output H265 streams with profiles that VideoToolbox doesn’t handle well. The result is a freeze-reconnect loop that looks like a network problem but is actually a decoder error. The fix was making hardware acceleration configurable per camera and changing the global default from videotoolbox to any. The full writeup is in Fixing H265 Stream Freeze on Tuya Cameras.
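A per-camera override on top of a global default could look roughly like this. It only models the settings lookup, not how the value is handed to the decoder; `HardwareDecoding`, `DecoderSettings`, the `disabled` case, and the camera-ID keys are all assumptions made for the sketch.

```swift
/// Hardware decoding choices. `any` and `videotoolbox` are the two values
/// mentioned in the text; `disabled` is an assumed extra option.
enum HardwareDecoding: String {
    case any
    case videotoolbox
    case disabled
}

struct DecoderSettings {
    /// Global default, used when a camera has no override.
    var globalDefault: HardwareDecoding = .any
    /// Overrides keyed by a hypothetical camera identifier.
    var perCamera: [String: HardwareDecoding] = [:]

    /// The effective mode for one camera: its override if set, else the default.
    func mode(forCamera id: String) -> HardwareDecoding {
        perCamera[id] ?? globalDefault
    }
}
```

With this shape, a camera that misbehaves under one decoder can be pinned to another without touching the rest.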
Cursor-centered zoom. Getting zoom to feel right - where the point under your cursor stays fixed as you scroll - requires careful coordinate space math. My first implementation worked at 1x and drifted at higher zoom levels. The bug was a coordinate space mismatch between the NSView bounds and VLC’s internal view space. The full writeup is in The Math Behind Cursor-Centered Zoom.
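The core of the math fits in one formula. Assuming a simple model where a content point p appears in the view at p * scale + offset, keeping the cursor c fixed while going from scale s to s' means newOffset = c - (c - offset) * (s' / s). The `ZoomState` type below is a hypothetical sketch of that model and ignores the NSView and VLC coordinate spaces entirely.

```swift
import Foundation

struct ZoomState {
    var scale: CGFloat = 1
    var offset: CGPoint = .zero   // view-space position of the content origin

    /// State after changing scale while the content under `cursor` stays put.
    func zoomed(to newScale: CGFloat, around cursor: CGPoint) -> ZoomState {
        let ratio = newScale / scale
        let newOffset = CGPoint(x: cursor.x - (cursor.x - offset.x) * ratio,
                                y: cursor.y - (cursor.y - offset.y) * ratio)
        return ZoomState(scale: newScale, offset: newOffset)
    }

    /// Content-space point currently shown at `viewPoint`.
    func contentPoint(at viewPoint: CGPoint) -> CGPoint {
        CGPoint(x: (viewPoint.x - offset.x) / scale,
                y: (viewPoint.y - offset.y) / scale)
    }
}
```

Because the ratio uses the current scale and the current offset, the formula holds at any zoom level. A version that implicitly assumes the offset is zero behaves correctly at 1x and drifts once you are zoomed in, which matches the symptom described above.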
Memory management with VLC instances. Each camera cell holds a VLC media player. If cells are created and destroyed frequently (resizing the grid, switching layouts), you accumulate player instances that aren’t properly released. I had to be explicit about teardown order - stop playback, detach the drawable, then release the player - to avoid leaks.
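The teardown order can be enforced in one place so no call site gets it wrong. The sketch below hides the player behind a hypothetical `StreamPlayer` protocol instead of calling VLCKit directly; in the real app the two methods would presumably map to stopping the VLC player and clearing its drawable.

```swift
/// Minimal abstraction over a media player (hypothetical, not a VLCKit type).
protocol StreamPlayer: AnyObject {
    func stop()
    func detachDrawable()
}

final class CameraCell {
    private var player: StreamPlayer?

    init(player: StreamPlayer) {
        self.player = player
    }

    /// Tears the player down in a fixed order. Safe to call more than once.
    func tearDown() {
        guard let player = player else { return }
        player.stop()             // 1. stop playback
        player.detachDrawable()   // 2. detach the view the player renders into
        self.player = nil         // 3. drop the reference so it can be released
    }

    deinit {
        tearDown()
    }
}
```

Making `tearDown()` idempotent means it can be called both explicitly when the grid changes and again from `deinit` without stopping a player twice.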
What It Looks Like Now
The current version has:
- Multi-camera grid with configurable layout
- Per-camera settings (stream URL, name, hardware acceleration, substream URL)
- Zoom and pan with cursor-centered zoom
- Recording via FFmpeg with H265/QuickTime compatibility
- Dashboard mode with floating window support
- Localization in English and Indonesian
- Reconnect logic with exponential backoff
It’s the app I wanted when I started. I use it every day.
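For the reconnect logic in the list above, exponential backoff boils down to doubling the wait after each failed attempt, up to a cap. The `ReconnectBackoff` type and its 1-second base and 30-second cap are assumptions for illustration, not the app's actual values.

```swift
import Foundation

struct ReconnectBackoff {
    let base: TimeInterval   // delay before the first retry
    let cap: TimeInterval    // upper bound for any delay
    private(set) var attempt = 0

    init(base: TimeInterval = 1, cap: TimeInterval = 30) {
        self.base = base
        self.cap = cap
    }

    /// Delay before the next retry: base * 2^attempt, limited to `cap`.
    mutating func nextDelay() -> TimeInterval {
        let delay = min(cap, base * pow(2, Double(attempt)))
        attempt += 1
        return delay
    }

    /// Call after a successful connection so the next failure starts small again.
    mutating func reset() {
        attempt = 0
    }
}
```

A camera that is rebooting gets retried quickly at first, while one that is unplugged settles into one attempt every `cap` seconds instead of hammering the network.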
What’s Next
There’s a lot I still want to build:
Local NVR. A proper local recording system with configurable retention, scheduled recording, and a searchable timeline - a complete replacement for Tuya’s cloud storage. Your footage stays on your machine, no subscription required.
ONVIF camera control. ONVIF isn’t just for streaming - it also covers PTZ (pan-tilt-zoom) control, event subscriptions, and device configuration. Tuya’s web interface already supports PTZ, but I want it native in the app. Adding write capabilities via ONVIF would make this a proper camera management tool, no SmartLife needed.
Motion-triggered recording. Always-on recording works but generates a lot of footage to review. The more useful mode is recording only when something happens - motion detected, a person enters the frame. This keeps storage manageable and makes reviewing footage actually practical.
Face recognition. Being able to recognize known faces vs. unknown faces opens up a lot of possibilities: automatic logging of who came and went, alerts for unrecognized visitors. This is more complex and raises privacy considerations worth thinking through carefully, but it’s the kind of feature that makes a local NVR genuinely more useful than a cloud one.
Timeline/playback. Once local recording is in place, reviewing footage should happen inside the app - scrub through a timeline, jump to a specific time, not hunt through files in Finder.
Push notifications. Alerts to iPhone or Mac when motion is detected, without having to keep the app open and visible.
What I’d Do Differently
I’d think harder about the data model earlier. Camera configuration started as a simple array of structs and grew organically as I added features. By the time I added substream support and per-camera hardware settings, the model was carrying a lot of optional fields that would have been cleaner as a proper versioned schema from the start.
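As a rough idea of what a versioned schema could look like with `Codable`: store a version number next to the cameras, and migrate old files on load. Every name here (`CameraV1`, `CameraV2`, `schemaVersion`, `loadCameras`) and both field sets are hypothetical; the app's real model is not shown in this post.

```swift
import Foundation

struct CameraV1: Codable {
    var name: String
    var streamURL: String
}

struct CameraV2: Codable {
    var name: String
    var streamURL: String
    var substreamURL: String?      // added in v2
    var hardwareDecoding: String   // added in v2
}

extension CameraV2 {
    /// Upgrades a v1 camera, filling the new fields with defaults.
    init(migrating old: CameraV1) {
        self.init(name: old.name, streamURL: old.streamURL,
                  substreamURL: nil, hardwareDecoding: "any")
    }
}

private struct VersionProbe: Decodable { let schemaVersion: Int }
private struct FileV1: Decodable { let cameras: [CameraV1] }
private struct FileV2: Decodable { let cameras: [CameraV2] }

enum ConfigError: Error { case unsupportedVersion(Int) }

/// Reads the version first, then decodes and migrates to the latest model.
func loadCameras(from data: Data) throws -> [CameraV2] {
    let decoder = JSONDecoder()
    let version = try decoder.decode(VersionProbe.self, from: data).schemaVersion
    switch version {
    case 1:
        return try decoder.decode(FileV1.self, from: data)
            .cameras.map(CameraV2.init(migrating:))
    case 2:
        return try decoder.decode(FileV2.self, from: data).cameras
    default:
        throw ConfigError.unsupportedVersion(version)
    }
}
```

The rest of the app only ever sees the latest model, so new features add a version and a migration instead of another optional field.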
I’d also set up UI testing earlier. For a media app with lots of async state (connecting, playing, recording, reconnecting), automated tests for state transitions would have caught several regressions before I shipped them.
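One way to make those state transitions testable is to write them down as a pure function, separate from the UI and the player. The states below come from the list in the paragraph above (plus an assumed `idle`), but which transitions are allowed is a guess made for the sketch.

```swift
enum StreamState {
    case idle, connecting, playing, recording, reconnecting
}

/// Whether a stream may move from `current` to `next`.
/// The transition table is an assumption, not the app's actual rules.
func canTransition(from current: StreamState, to next: StreamState) -> Bool {
    switch (current, next) {
    case (.idle, .connecting),
         (.connecting, .playing),
         (.connecting, .reconnecting),
         (.playing, .recording),
         (.playing, .reconnecting),
         (.recording, .playing),
         (.recording, .reconnecting),
         (.reconnecting, .connecting):
        return true
    case (.connecting, .idle), (.playing, .idle),
         (.recording, .idle), (.reconnecting, .idle):
        return true   // the user can always stop a stream
    default:
        return false
    }
}
```

With the rules in one function, a regression such as "recording starts before the stream is playing" becomes a one-line assertion instead of something found by clicking through the app.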
Closing
Building something you actually use is a different experience from building for a client or a spec. Every rough edge is something you personally hit. Every improvement is something you personally benefit from. It’s a good feedback loop.
If you have RTSP cameras and a Mac, give it a try.
Want to monitor RTSP cameras on your Mac?
I'm building a native macOS app for live RTSP streams with floating dashboards and recording. Reach out if you want early access or have a similar project in mind.