
Stop Shipping API Keys in Your iOS App: Build a Serverless AI Proxy Instead

Why compiling OpenAI or Gemini keys inside your mobile app binary is a security trap, and how to build an elegant serverless proxy using Cloudflare Workers for free.

March 20, 2026
iOS Development, SwiftUI, Cloudflare Workers, Serverless, Security, AI, OpenAI, Gemini, ASO, ASO Optimization

The Setup

I recently built a native iOS app called Lolos, a bilingual ATS resume and CV maker. To help users write professional summaries and optimize their job descriptions, I integrated pluggable AI engines powered by OpenAI and Gemini.

During the early stages of development, the integration was straightforward. I added the API keys directly to my source code, compiled the app, and watched the AI recommendations work seamlessly. It felt great, until I started preparing for the App Store release.

Then, the reality of mobile application security sank in.

Putting API keys directly inside a mobile app binary is a massive liability. A compiled iOS application is not a vault. Anyone with an intercepting proxy like Proxyman or Charles can read the key out of the app's outgoing requests, and anyone with a basic decompilation or string-dumping tool can pull it straight from the binary, often within minutes. If a bad actor retrieves your OpenAI or Gemini key, they can run up thousands of dollars on your credit card before you even notice.

This is the story of how I moved my API keys off the user’s device and built a secure, production-ready serverless proxy using Cloudflare Workers, all for free.

Why Obvious Fixes Were Wrong

Before jumping into building a custom backend, I explored a few common client-side mitigation strategies. However, they all turned out to be security by obscurity.

1. Storing Keys in Xcode Configuration (XCConfig) or Plist Files

Many tutorials suggest moving keys to a .xcconfig file or a gitignored Secrets.plist. While this prevents you from accidentally committing secrets to public GitHub repositories, it does absolutely nothing to protect the compiled app. The values still end up in the final app bundle (as a bundled plist, or as Info.plist entries populated from build settings), making them trivial to extract.
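To make "trivial to extract" concrete, here is a minimal sketch of an attacker's first step: scanning the raw bytes of an unpacked bundle for anything shaped like a key, much like running a string-dumping tool over it. The function name findKeyLikeStrings, the fake bundle contents, and the simplified sk- pattern are all assumptions for illustration, not a real key format.

```javascript
// Hypothetical illustration: scan raw bytes for key-shaped ASCII runs.
// The pattern is a simplified assumption, not OpenAI's actual key grammar.
const KEY_PATTERN = /sk-[A-Za-z0-9_-]{16,}/g;

function findKeyLikeStrings(bytes) {
  // latin1 maps every byte to exactly one character, so binary data survives.
  const text = Buffer.from(bytes).toString("latin1");
  return text.match(KEY_PATTERN) ?? [];
}

// Fake "bundle": plist-style XML surrounded by non-text bytes.
const fakeBundle = Buffer.concat([
  Buffer.from([0x00, 0xff, 0xfe]),
  Buffer.from("<key>OPENAI_API_KEY</key><string>sk-FAKEFAKEFAKEFAKE1234</string>"),
  Buffer.from([0x00, 0x01]),
]);

console.log(findKeyLikeStrings(fakeBundle)); // [ 'sk-FAKEFAKEFAKEFAKE1234' ]
```

Nothing here depends on how the plist got into the bundle. If the string is present in any file the app ships, a scan like this finds it.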

2. String Obfuscation (XOR Encoding)

Another popular piece of advice is to hide your key behind a custom XOR function or to split the string into multiple hardcoded chunks inside Swift. While this stops automated scanners that search for plain sk-... patterns, it takes a reverse engineer less than ten minutes to intercept the network requests or inspect runtime memory and grab the fully decoded string.
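A small sketch shows why XOR obfuscation only delays the inevitable. The scheme below (a repeating three-byte mask, the helper name xorBytes, and the fake key) is a hypothetical example. The point is that both the mask and the obfuscated bytes have to ship in the binary, so an attacker can repeat the decode step the app itself performs.

```javascript
// Hypothetical obfuscation scheme: XOR each byte with a repeating mask.
function xorBytes(data, mask) {
  return data.map((byte, i) => byte ^ mask[i % mask.length]);
}

const mask = [0x5a, 0x13, 0xc7]; // ships inside the binary
const secret = Array.from(Buffer.from("sk-FAKE-not-a-real-key"));
const obfuscated = xorBytes(secret, mask); // also ships inside the binary

// The app must run this step to use the key; an attacker can run it too.
const recovered = Buffer.from(xorBytes(obfuscated, mask)).toString("utf8");
console.log(recovered); // "sk-FAKE-not-a-real-key"
```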

3. Client-Side Encryption

Encrypting the keys with a public/private key pair on the device simply shifts the problem. Where do you store the decryption key? If the app needs to decrypt the key to use it, the decryption logic and the key must reside on the client, which can be reverse-engineered.

The conclusion was unavoidable. If a secret lives on the device, it is no longer a secret. The only way to secure a key is to never send it to the device in the first place.

The Serverless Proxy Blueprint

To keep the API keys secure, I needed an intermediary server. The iOS app would call this server, the server would attach the hidden API keys, make the actual request to OpenAI or Gemini, and pass the response back to the app.

For a solopreneur, setting up a traditional Linux VPS (like an EC2 instance or a DigitalOcean droplet) just to proxy API requests adds unnecessary overhead. You have to manage server security, worry about scaling, and pay a fixed monthly bill even if your app has zero active users.

This is where Cloudflare Workers shine. They are serverless JavaScript functions that run on Cloudflare’s global edge network. They are incredibly fast, scale automatically, and have a generous free tier of 100,000 requests per day.

Here is the exact architecture I built for Lolos:

┌───────────┐                 ┌───────────────────┐                 ┌─────────────┐
│  iOS App  │  ──(Token)──>   │ Cloudflare Worker │  ──(API Key)──> │  OpenAI/    │
│  (Lolos)  │  <──(JSON)───   │     AI Proxy      │  <──(JSON)───   │  Gemini     │
└───────────┘                 └───────────────────┘                 └─────────────┘

The Worker Code (lolos-worker/src/index.js)

The Worker code is extremely lightweight. It validates a shared app token, checks a simple per-IP rate limit, injects the real API key from Cloudflare’s secure vault, and forwards the payload to the respective AI provider.

const RATE_LIMIT_WINDOW_MS = 60000; // 1 minute
const RATE_LIMIT_MAX = 20;           // max requests per IP per window
const ipHits = new Map();            // ip -> { count, resetAt }

export default {
  async fetch(request, env) {
    // CORS preflight support
    if (request.method === "OPTIONS") {
      return new Response(null, { headers: corsHeaders() });
    }

    if (request.method !== "POST") {
      return json({ error: "Method not allowed" }, 405);
    }

    // 1. Shared Token Authentication
    const token = request.headers.get("X-Lolos-Token");
    if (!env.APP_TOKEN || token !== env.APP_TOKEN) {
      return json({ error: "Unauthorized" }, 401);
    }

    // 2. Per-IP Rate Limiting (Simple In-Memory Protection)
    const ip = request.headers.get("CF-Connecting-IP") || "unknown";
    if (isRateLimited(ip)) {
      return json({ error: "Rate limit exceeded. Try again shortly." }, 429);
    }

    const url = new URL(request.url);
    const path = url.pathname;

    try {
      if (path === "/openai") {
        return await proxyOpenAI(request, env);
      }
      if (path === "/gemini") {
        return await proxyGemini(request, env);
      }
      if (path === "/" || path === "/health") {
        return json({ ok: true, service: "lolos-ai-proxy" }, 200);
      }
      return json({ error: "Not found" }, 404);
    } catch (err) {
      return json({ error: "Proxy error", detail: String(err) }, 502);
    }
  },
};

// Proxies POST /openai directly to the official OpenAI API endpoint
async function proxyOpenAI(request, env) {
  if (!env.OPENAI_API_KEY) {
    return json({ error: "OpenAI not configured" }, 503);
  }
  const body = await request.text();
  const upstream = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${env.OPENAI_API_KEY}`,
    },
    body,
  });
  return passthrough(upstream);
}

// Proxies POST /gemini by mapping dynamic models to Google Generative Language API
async function proxyGemini(request, env) {
  if (!env.GEMINI_API_KEY) {
    return json({ error: "Gemini not configured" }, 503);
  }
  const incoming = await request.json();
  const model = incoming.model || "gemini-2.0-flash";
  const payload = incoming.payload ?? incoming;

  const endpoint = `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent?key=${env.GEMINI_API_KEY}`;
  const upstream = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return passthrough(upstream);
}

function passthrough(upstream) {
  const headers = new Headers(corsHeaders());
  headers.set("Content-Type", upstream.headers.get("Content-Type") || "application/json");
  return new Response(upstream.body, { status: upstream.status, headers });
}

function isRateLimited(ip) {
  const now = Date.now();
  const entry = ipHits.get(ip);
  if (!entry || now > entry.resetAt) {
    ipHits.set(ip, { count: 1, resetAt: now + RATE_LIMIT_WINDOW_MS });
    return false;
  }
  entry.count += 1;
  return entry.count > RATE_LIMIT_MAX;
}

function corsHeaders() {
  return {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type, X-Lolos-Token",
  };
}

function json(obj, status = 200) {
  return new Response(JSON.stringify(obj), {
    status,
    headers: { ...corsHeaders(), "Content-Type": "application/json" },
  });
}
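The fixed-window counter in isRateLimited is easier to reason about when the clock can be controlled. The sketch below repeats the same logic with the time source injected, so the window boundary can be crossed without waiting a full minute. The factory name createRateLimiter and its options are hypothetical and not part of the Workers API.

```javascript
// Sketch: the same fixed-window counter, with the clock injected for testing.
// createRateLimiter and its options are hypothetical names, not a Workers API.
function createRateLimiter({ windowMs = 60000, max = 20, now = Date.now } = {}) {
  const hits = new Map(); // ip -> { count, resetAt }

  return function isRateLimited(ip) {
    const t = now();
    const entry = hits.get(ip);
    if (!entry || t > entry.resetAt) {
      // First request in a fresh window: start counting from 1.
      hits.set(ip, { count: 1, resetAt: t + windowMs });
      return false;
    }
    entry.count += 1;
    return entry.count > max;
  };
}

// Simulated clock so the window boundary can be crossed instantly.
let fakeTime = 0;
const limited = createRateLimiter({ windowMs: 1000, max: 2, now: () => fakeTime });

console.log(limited("203.0.113.7")); // false (1st request)
console.log(limited("203.0.113.7")); // false (2nd request)
console.log(limited("203.0.113.7")); // true  (3rd request exceeds max of 2)
fakeTime = 1001;
console.log(limited("203.0.113.7")); // false (new window)
```

One consequence of a fixed window is that a client can send up to max requests at the end of one window and max more at the start of the next. For a simple abuse brake that trade-off is usually acceptable.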

SwiftUI Wiring (The Factory Pattern)

Moving the API keys to the backend meant the iOS app had to change how it communicates. Instead of initializing providers with direct API keys, I designed an elegant factory pattern that automatically selects the safest available environment.

1. Setting Up Dynamic Secrets

I configured the app’s Secrets.swift to read configuration keys dynamically from a gitignored plist.

import Foundation

enum Secrets {
    private static let dictionary: [String: Any] = {
        guard let path = Bundle.main.path(forResource: "Secrets", ofType: "plist"),
              let dict = NSDictionary(contentsOfFile: path) as? [String: Any] else {
            return [:]
        }
        return dict
    }()

    static var openAIKey: String { value(for: "OPENAI_API_KEY") }
    static var geminiKey: String { value(for: "GEMINI_API_KEY") }
    
    // Cloudflare Worker proxy (Production Mode)
    static var proxyURL: String { value(for: "LOLOS_PROXY_URL") }
    static var proxyToken: String { value(for: "LOLOS_PROXY_TOKEN") }

    static var hasProxy: Bool { 
        !proxyURL.isEmpty && proxyURL.hasPrefix("https://") 
    }
    
    static var hasValidOpenAI: Bool { !openAIKey.isEmpty && !openAIKey.contains("placeholder") }
    static var hasValidGemini: Bool { !geminiKey.isEmpty && !geminiKey.contains("placeholder") }

    private static func value(for key: String) -> String {
        (dictionary[key] as? String)?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
    }
}

2. The Pluggable Provider Pattern

I created an AIProvider protocol that handles generating summary requests. Then, I implemented the providers to accept optional proxy information:

final class OpenAIProvider: AIProvider {
    let name = "OpenAI"
    private let apiKey: String
    private let proxyURL: String?
    private let proxyToken: String?
    private let session: URLSession

    init(apiKey: String, proxyURL: String? = nil, proxyToken: String? = nil, session: URLSession = .shared) {
        self.apiKey = apiKey
        self.proxyURL = proxyURL
        self.proxyToken = proxyToken
        self.session = session
    }

    private var usesProxy: Bool { !(proxyURL ?? "").isEmpty }

    private var endpoint: URL {
        if let proxyURL, !proxyURL.isEmpty {
            return URL(string: "\(proxyURL)/openai")!
        }
        return URL(string: "https://api.openai.com/v1/chat/completions")!
    }

    func chat(system: String, user: String) async throws -> String {
        if !usesProxy {
            guard !apiKey.isEmpty else { throw AIError.missingAPIKey }
        }

        var req = URLRequest(url: endpoint)
        req.httpMethod = "POST"
        req.setValue("application/json", forHTTPHeaderField: "Content-Type")

        if usesProxy {
            if let proxyToken {
                req.setValue(proxyToken, forHTTPHeaderField: "X-Lolos-Token")
            }
        } else {
            req.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
        }

        // Configure OpenAI request payload and perform data task...
    }
}

3. Automatic Priority Routing

During application startup, the factory inspects the current configuration. If an HTTPS proxy URL is configured, it selects the Worker proxy. Otherwise, it falls back to direct dev keys, or to a no-op placeholder provider if no configuration exists.

enum AIProviderFactory {
    static func makeBestAvailable() -> AIProvider {
        // 1. Proxy Mode (Production) - App carries zero API keys
        if Secrets.hasProxy {
            return OpenAIProvider(
                apiKey: "",
                proxyURL: Secrets.proxyURL,
                proxyToken: Secrets.proxyToken
            )
        }
        
        // 2. Direct OpenAI (Dev / Local Testing)
        if Secrets.hasValidOpenAI {
            return OpenAIProvider(apiKey: Secrets.openAIKey)
        }
        
        // 3. Direct Gemini (Dev / Local Testing)
        if Secrets.hasValidGemini {
            return GeminiProvider(apiKey: Secrets.geminiKey)
        }
        
        // 4. Fallback (Prevents App Hanging)
        return NoopAIProvider()
    }
}

This factory structure gave me the best of both worlds. During local SwiftUI development and quick debugging iterations, I could put keys directly in Secrets.plist and call the APIs immediately. When shipping the release build to the App Store, I simply cleared the direct keys, provided LOLOS_PROXY_URL and LOLOS_PROXY_TOKEN, and compiled an application binary that carries no provider API keys.

The Lessons Learned

Security is often treated as a chore that slows down development, but taking a few hours to protect your assets pays dividends. By choosing a serverless proxy architecture:

  • API keys are safe: The production keys live as environment variables in Cloudflare’s secure vault. They never touch the App Store or the user’s filesystem.
  • Abuse control is cheap: An attacker can still extract the X-Lolos-Token from the app binary and call the proxy directly, but the per-IP rate limiter slows them down. Keep in mind that the in-memory counter is best-effort: it lives in a single Worker isolate, so it is not shared across Cloudflare locations and resets whenever the isolate is recycled. Even better, you can rotate the APP_TOKEN in the Cloudflare backend and release an app update without ever having to revoke or change your root OpenAI/Gemini keys.
  • Zero-cost maintenance: As long as traffic stays within Cloudflare’s free tier of 100,000 requests per day, the proxy costs nothing to run. It only executes when the app actually calls it, matching the serverless ethos perfectly.
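One practical wrinkle with token rotation is that users on the previous app version still send the old token until they update. A possible way to soften this, sketched below, is to accept both the current and the previous token for a grace period. APP_TOKEN_PREVIOUS is a hypothetical secret name that does not exist in the Worker shown earlier, and isAuthorized is an illustrative helper rather than the original authentication check.

```javascript
// Sketch: accept the current token plus an optional previous one during rotation.
// APP_TOKEN_PREVIOUS is a hypothetical secret name, not part of the original Worker.
function isAuthorized(token, env) {
  if (!token || !env.APP_TOKEN) return false;
  const accepted = [env.APP_TOKEN, env.APP_TOKEN_PREVIOUS].filter(Boolean);
  return accepted.includes(token);
}

const env = { APP_TOKEN: "token-v2", APP_TOKEN_PREVIOUS: "token-v1" };
console.log(isAuthorized("token-v2", env)); // true  (current app version)
console.log(isAuthorized("token-v1", env)); // true  (older app version, grace period)
console.log(isAuthorized("token-v0", env)); // false (unknown token)
console.log(isAuthorized(null, env));       // false (header missing)
```

Once most users have updated, removing the previous secret ends the grace period without another app release.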

If you are building an AI-powered iOS app, don’t ship your API keys. Spend an hour setting up a lightweight serverless proxy, and sleep easy at night knowing your credit cards are safe.
