Vol. 01 · No. 01 · The Apple edition · From kernel to SwiftUI

Built on Apple

Native on purpose. All the way down.

flexGrid doesn't bolt platform frameworks on; it's made of them. From the Mach kernel up through SwiftUI, every layer is the one Apple ships — which is why the app feels native, because it is. Here's the inventory, with the code references that back each claim.

The stack

Built on Apple, all the way down.

flexGrid doesn't bolt platform frameworks on; it's made of them. From the Mach kernel up through SwiftUI, every layer is the one Apple ships — which is why the app feels native, because it is.

Kernel
task_infothread_infomach_task_self_

Memory pressure + per-thread CPU, sampled at the Mach kernel.

Foundation · Concurrency
actorasync letTaskGroupOSSignposter

Swift 6 actor isolation. Vision dispatches fourteen detectors with async let and gathers them in one block.

Media engine
AVFoundationAVPlayerLooperCoreImageImageIOCoreVideoVideoToolbox

Every tile is a real AVQueuePlayer with AVPlayerLooper. Aspect ratios from EXIF without decoding pixels.

Intelligence
Vision (×14)FoundationModelsVTSuperResolutionScaler

Fourteen Vision detectors concurrently on every photo. AI captions on Apple Intelligence.

Integration
AppIntentsCoreSpotlightStoreKit 2AppKit (NSFilePromiseProvider)UniformTypeIdentifiers

Nine Shortcuts intents. Drag-out via the native API. Spotlight indexes your library on opt-in.

UI
SwiftUIPDFKitWebKit.glassEffect()

Every view is SwiftUI. Liquid Glass on macOS 26, calm fallback on macOS 15.

Ten takeaways

The marketing-grade summary, every line traceable to a file.

  • 01. Every tile is a real AVQueuePlayer with AVPlayerLooper — not a GIF-style hack.
  • 02. Fourteen Vision detectors run concurrently on every photo, on-device.
  • 03. Four image analyses share one CGContext render — letterbox, brightness, monochrome, composite.
  • 04. AI captions on Apple Intelligence — your library never leaves the Mac.
  • 05. MemoryMonitor pulls task_info from the Mach kernel — the same data Activity Monitor uses.
  • 06. Aspect ratios read from EXIF with zero pixel decoding.
  • 07. Drag a tile out and it's an NSFilePromiseProvider — Finder, Mail, OBS, all just work.
  • 08. Spotlight indexing (opt-in) puts your library into the system search bar.
  • 09. Nine App Intents, Siri-spoken phrases registered.
  • 10. StoreKit 2 non-consumable IAPs — no server, no receipt validation pipeline.
01. User interface

SwiftUI, AppKit bridges, UTI

SwiftUI

import SwiftUI

The whole UI, end-to-end.

Every view, every animation, every state holder. @Observable for state, NSViewRepresentable for the handful of AppKit bridges, .glassEffect for Liquid Glass on macOS 26 with a graceful fallback below. No Electron, no web wrapper.

DesignTokens.swift · LiquidGlassBackground.swift · FlexGridApp.swift

flexGlass() no-ops on macOS < 26 and respects Reduce Transparency

AppKit (bridges)

import AppKit

Real macOS, where SwiftUI needs a hand.

Drag a tile out and it's a real NSFilePromiseProvider — Finder, Mail, OBS, Discord, all just work because it's the native API. NSWorkspace sets your desktop wallpaper, reveals files, and reads the accessibility flags so the app respects Reduce Motion and Increase Contrast.

FilePromiseDragView.swift — invisible NSView with NSDraggingSource

NSWorkspace.shared.setDesktopImageURL · activateFileViewerSelecting

NSBitmapImageRep encodes the Vision foreground mask to PNG

UniformTypeIdentifiers

import UniformTypeIdentifiers

Files in. Files out. Correctly typed.

Drag-out promises serve the file's real UTType so receivers get the right content. Drag-in destinations only accept the types they can handle. The vocabulary is Apple's, not ours.

UTType(filenameExtension:)?.identifier ?? UTType.data.identifier

Drag-in via UTType.fileURL in the side pane reader

02. Media engine

AVFoundation, VideoToolbox, Core Image, Core Video, ImageIO, Core Graphics, PDFKit + WebKit

AVFoundation

import AVFoundation

Real players. True loops. No hacks.

Every tile is a real AVQueuePlayer with an AVPlayerLooper running across the loop boundary — the same primitives Apple uses for Live Photos. A PlayerPool warms a small set of these and recycles them across cells; URL swap is replaceCurrentItem / insert(after:nil), with mute-and-volume-reset before each swap so you never hear a click.

PlayerPool.swift — LRU keyed on (activePlayers, lastUsedTime, lruOrder)

PlayerContainerView.swift — AVPlayerLooper bound to fresh AVPlayerItem from a shared asset

BackgroundAudioPlayer.swift — AVAudioPlayer with security-scoped bookmarks

iCloudHelper.isKnownLocallyAvailable(url) gates recycling — files mid-download are never selected.

VideoToolbox

import VideoToolbox

Hardware decode, by way of Apple's pipeline.

Every video cell decodes through VideoToolbox under the covers — that's what AVPlayerLayer is for. The frame-processor primitives are also wired up for future neural-upscale and motion-blur work on Apple Silicon, gated at runtime so the path stays off when the hardware can't honour it.

VideoEffectsProcessor.swift:108-310 — VTFrameProcessor wiring

Runtime guard via VTSuperResolutionScalerConfiguration.isSupported

Frame-processor effects ship on macOS 26 + Apple Silicon — checked at runtime, off otherwise.

Core Image

import CoreImage

LUTs, palettes, and dominant-color tints.

Color-cube LUTs run through CIColorCubeWithColorSpace. CIAreaAverage gives the dominant color for a cell's adaptive padding. CIKMeans extracts palette swatches without ever painting an entire frame. One shared CIContext — sRGB working color space, GPU-rendered.

VideoFilterSystem.swift:302-324 · CIColorCubeWithColorSpace

MediaHelpers.swift:542 · sharedCIContext = CIContext(options: [.workingColorSpace: NSNull()])

MediaHelpers.swift:559 · CIFilter.areaAverage() · :808 · CIKMeans

Core Video

import CoreVideo

Zero-copy hand-offs between frameworks.

IOSurface-backed CVPixelBuffers let Vision hand a mask straight to Core Image without a CPU round-trip. VTFrameProcessor reads and writes the same buffer type. The fewer pixel copies, the smoother the wall stays.

VideoEffectsProcessor.swift:42-101 · CVPixelBufferCreate with kCVPixelBufferIOSurfacePropertiesKey

VisionAnalyzer.swift:390-407 · CVPixelBuffer → CGContext (DeviceGray) → NSBitmapImageRep PNG

ImageIO

import ImageIO

Aspect ratios with zero pixel decoding.

CGImageSourceCreateWithURL reads EXIF metadata without touching the pixels — width, height, all eight orientation cases. That's how a five-thousand-photo folder opens instantly. Subsampled thumbnails decode only the target resolution, never the full image.

MediaHelpers.swift:44-58 · imageAspectRatio(url:) reads only CGImageSourceCopyPropertiesAtIndex

VisionAnalyzer.swift:1367-1374 · CGImageSourceCreateThumbnailAtIndex with kCGImageSourceCreateThumbnailFromImageAlways

Core Graphics

import CoreGraphics

One render, four answers.

Letterbox detection, brightness classification, monochrome detection, and composite/collage detection all read from a single CGContext render — one RGBA buffer, four analyses, one GPU-to-CPU copy. The naïve version is four. The considered version is one.

VisionAnalyzer.swift:872-945 · analyzePixels(_:) drives all four passes

BT.601 luminance · letterbox bar scan with 5% minimum

PDFKit + WebKit

import PDFKit · WebKit

ePub and PDF, beside the grid.

The side-pane reader is the real PDFKit and the real WebKit — not a custom renderer. Night-mode invert, four fonts, three themes; the documents your Mac already knows how to open, where you'd want them.

PdfViewerView.swift · PDFView · PDFDocument

EpubReaderView.swift · WKWebView with broadcast overlay compositor

03. On-device intelligence

Vision, FoundationModels

Vision

import Vision

Fourteen detectors. On every photo. On-device.

The new Swift Vision struct API (no VN prefix) — faces, bodies, hands, poses, text, animals, scenes, aesthetics, attention saliency, objectness, barcodes, feature print, lens smudge, horizon. All fourteen kick off concurrently via async let against a single CGImage. A separate one-render block tacks on four pixel-domain analyses for free.

VisionAnalyzer.swift — actor with acquireSlot/releaseSlot concurrency limiter

Foreground mask via legacy VNGenerateForegroundInstanceMaskRequest

Dedicated page: /vision

DetectLensSmudgeRequest is macOS 26+. Foreground mask is Apple Silicon-recommended.

FoundationModels

import FoundationModels

Smart Captions, on-device, by Apple.

SystemLanguageModel.default writes a natural-language caption for each clip. A specialized .contentTagging variant returns lowercase tags. Sessions are cached by instruction hash so warm context stays warm. The model lives on your Mac; nothing about your library goes over the wire.

FoundationModelService.swift · SystemLanguageModel · LanguageModelSession

SmartCaptionEngine.swift:209-225 · prompt assembled from filename + duration + Vision tags + OCR

macOS 26 + Apple Intelligence enabled. Surfaces .available, .unavailable, .temporarilyUnavailable(.modelNotReady) states to the UI.

04. System integration

Mach kernel, Swift Concurrency, Core Spotlight

Mach kernel APIs

import Darwin · MachO

Reads the room — at the kernel level.

MemoryMonitor calls task_info for resident memory and walks task_threads / thread_info for per-thread CPU. The same data source Activity Monitor uses. Three pressure tiers — normal, elevated, high — with hysteresis so the engine doesn't oscillate. Crossing into high trims caches, sweeps orphaned players, and pauses pricey Vision passes.

MemoryMonitor.swift:105-258 · task_info(mach_task_self_, MACH_TASK_BASIC_INFO, …)

thread_info loop with TH_USAGE_SCALE; explicit vm_deallocate of the thread list

Swift Concurrency

import Foundation

Actors, not locks.

VisionAnalyzer, MetadataCache, FoundationModelService and ConcurrencyLimiter are real actors with strict isolation. Vision dispatches its fourteen detectors with async let and gathers them in one block. The main thread is for SwiftUI; everything heavy lives somewhere else.

actor VisionAnalyzer · async let detectors[14] · TaskGroup batches across files

OSSignposter + os.Logger across 22 files for tracing

Core Spotlight

import CoreSpotlight

Your library, in Spotlight.

Opt in and CSSearchableIndex.default() indexes every scanned item — filename, AI caption, scene labels, OCR text, aesthetics rating, semantic mood. Find a clip from anywhere on macOS; flexgrid://open?id=<itemID> routes the click straight back into the right folder.

SpotlightIndexer.swift:78,132 · CSSearchableItemAttributeSet populated from VisionTags + caption

FlexGridApp.swift:365-383 · URL scheme handler resolves to MediaSourceManager

Off by default. Three opt-ins: master enable, OCR text, faces flag.

05. Automation

App Intents

App Intents

import AppIntents

Nine Shortcuts and Siri actions.

Shuffle, Next Page, Previous Page, Toggle Playback, Set Grid Size, Open New Window, Toggle Blackout, Toggle Freeze, Switch Persona. Wire any of them to a Focus mode, a Stream Deck button, or a Siri phrase. FlexGridShortcutsProvider registers the spoken-phrase variants.

AppIntentsProvider.swift · nine AppIntent types + AppShortcutsProvider

SetGridSizeIntent: tileCount clamped 1…25 (default 4)

06. Commerce

StoreKit 2

StoreKit 2

import StoreKit

One-time purchase. No server.

Plus is a non-consumable StoreKit 2 IAP. Modern async/await transaction loop: Transaction.currentEntitlements on launch, Transaction.updates listened in a long-running Task. Restore is built in. No third-party backend, no receipt-validation server, nothing for you to maintain.

TierManager.swift:364-505 · product loader · entitlement scan · transaction listener

Verified-state-only insertion; unverified transactions throw and back out

Every tile is a real AVQueuePlayer with a true AVPlayerLooper. No seek-back hack.

PlayerPool.swift

The Vision deep-dive

Fourteen detectors. On every photo. On-device.

Vision earns its own page. We walk through every request type we use, the four-passes-share-one-render trick, the foreground-mask pipeline, and the ScanReadinessGate that keeps the surface honest until coverage is real.

In one sentence

Every scanned photo is handed to one VNImageRequestHandler that fans out to fourteen concurrent detectors; while those run, a single CGContext render of the same image feeds four pixel-domain analyses on a shared RGBA buffer.

07. What we're not claiming yet

A small honesty section.

A couple of frameworks live in the engineering notes but aren't in the shipping code. They go here, not on the matrix — because on the matrix means in the binary.

Roadmap

Metal (custom shaders)

Generative background shaders are described in Behind-the-Scenes but not yet in the shipping source. SwiftUI's compositor, Core Image, and VideoToolbox are doing the GPU work today.

Roadmap

Accelerate / AVAudioEngine (BeatDetector)

An FFT-driven beat detector that would let cells react to background audio is on the engineering whiteboard — not in the current build.

In closing

Native because it is.

None of this is incidental. flexGrid picks the right Apple framework for every job and uses it the way it was meant to be used. That's the whole engineering thesis.