
An Interpreter for Swift | Cocoanetics


Just a few days ago I released SwiftBash, a sandboxed bash interpreter written in pure Swift. At the end of the four-green-checkmarks post I promised the next instalment would be about something else: SwiftScript, the same idea but for Swift itself.

It’s exactly that. Real Swift syntax, walked by a tree-walking interpreter, with no LLVM, no codegen, and no Process/fork/exec. It is meant for the places where Swift as a compiled binary isn’t an option.

After this success with an AST for bash, I figured, let’s up the game and try the same with Swift Syntax. My Claude Opus has proven time and again that it has the required tenacity to make any stupid idea come true.

Let me just say it outright: I get very sad every time somebody insists that TypeScript is the future of agentic coding. So I kept sending my wish to the universe that Swift should also be in the race for that. Now, finally, I was able to manifest the missing piece: an interpreter for the language I love.

Why a Swift interpreter

Swift can already be a scripting language. #!/usr/bin/env swift works today, and the toolchain even reads command-line arguments and links dynamic modules along the way. That covers the case where you have the toolchain installed and you’re allowed to compile-and-run.

What it doesn’t cover:

  • iOS apps. App Sandbox forbids spawning processes and forbids running JIT-compiled code.
  • macOS sandboxes: same constraint.
  • Server-side hosts where you don’t ship the compiler for size or security reasons.

Swift is too beautiful a language to leave unavailable everywhere a compile-and-exec pipeline isn’t. The same intuition that drove SwiftBash applies here: take the language an LLM (or a human) already wants to write, and make it run inside the host process, in a deterministic sandbox, with no shell-out.

What it’s built on

The foundation is swift-syntax, Apple’s official, source-of-truth Swift parser, which also underpins the modern Swift compiler frontend. I’d already been using it for two earlier projects:

  • SwiftButler was an early experiment with reading Swift source and reasoning about it.
  • SwiftMCP’s macros lean heavily on AST-walking to expose Swift functions to MCP clients.

Once you trust swift-syntax to give you the AST, “interpret it” stops sounding ridiculous and starts sounding like an afternoon project. SwiftScript has matured well past proof-of-concept since then: an interpreter walks the AST, evaluates expressions to real Swift values, and at the leaves (function calls, property accesses, initialisers) actually invokes the real system functions.

The repo is about 30,000 lines of Swift today. Almost half of that is auto-generated bridge code. More on that in a moment.

Boxing and unboxing

The hard part is the seam between interpreted Swift values and real Swift values. Inside the interpreter, every value is a Value, a single enum that knows how to be every shape Swift cares about:

public indirect enum Value: @unchecked Sendable {
    case int(Int)
    case double(Double)
    case string(String)
    case bool(Bool)
    case void
    case function(Function)
    case range(lower: Int, upper: Int, closed: Bool)
    case array([Value])
    case optional(Value?)
    case tuple([Value], labels: [String?])
    case dict([DictEntry])
    case set([Value])
    /// Opaque carrier for a host-Swift value (CharacterSet, URL, Date,
    /// Data, …) that we don't model structurally.
    case opaque(typeName: String, value: Any)
    case structValue(typeName: String, fields: [StructField])
    case classInstance(ClassInstance)
    case enumValue(typeName: String, caseName: String, associatedValues: [Value])
}

Int, String, Double, Bool get their own cases because they’re so common. Everything Foundation hands us that we can’t decompose (URL, Date, CharacterSet, Data, Calendar, JSONEncoder) lives inside .opaque(typeName:, value: Any). The typeName is a string we use for runtime type checks; the value is the actual host instance held as Any.

Calling URL.absoluteString from inside a script means crossing that seam:

"var URL.absoluteString: String": .computed { receiver in
    let recv: URL = try unboxOpaque(receiver, as: URL.self, typeName: "URL")
    return .string(recv.absoluteString)
},

Three things to notice. First, the bridge is keyed by a string that’s a one-line summary of the Swift declaration: "var URL.absoluteString: String". The same shape works for init, func, static let, and so on. It’s greppable, it’s declarative, and it matches the Swift you’d write by hand. Second, the closure is the only piece of executable logic: receive a Value, return a Value. Third, the two helpers, unboxOpaque and boxOpaque, carry the entire seam between interpreter values and host values:

/// Wrap a host-Swift value of type `T` as a `Value.opaque`.
func boxOpaque<T>(_ value: T, typeName: String) -> Value {
    return .opaque(typeName: typeName, value: value)
}

/// Recover a host-Swift value from a `Value.opaque`. Verifies the boxed
/// `typeName` matches; a type-name mismatch throws rather than risking
/// a bad downcast.
func unboxOpaque<T>(_ value: Value, as: T.Type, typeName expectedName: String) throws -> T {
    guard case .opaque(let actualName, let any) = value else {
        throw RuntimeError.invalid("expected \(expectedName), got \(typeName(value))")
    }
    guard actualName == expectedName else {
        throw RuntimeError.invalid("expected \(expectedName), got \(actualName)")
    }
    guard let cast = any as? T else {
        throw RuntimeError.invalid("opaque value of type \(actualName) didn't cast")
    }
    return cast
}

A method with arguments looks the same in both directions:

"func URL.appendingPathComponent()": .method { receiver, args in
    let recv: URL = try unboxOpaque(receiver, as: URL.self, typeName: "URL")
    return boxOpaque(recv.appendingPathComponent(try unboxString(args[0])),
                     typeName: "URL")
},

unboxString(args[0]) pulls a String back out of Value.string; the call returns a real URL; boxOpaque packs it back as Value.opaque(typeName: "URL", …). The script never sees the URL instance directly, but every operation on it is the actual Foundation method, with all its real behaviour: its NSURL bridging, its .fileURL quirks, its percent-encoding rules. We’re not reimplementing Foundation; we’re routing through it.
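To make the round trip concrete, here is a minimal, self-contained sketch of the same idea. It is not SwiftScript's actual code: MiniValue, BridgeError, Bridge, makeBridges and absoluteString(appending:to:) are hypothetical names for illustration, and the enum is cut down to the two cases the example needs.

```swift
import Foundation

// Reduced stand-in for the interpreter's Value enum (hypothetical).
enum MiniValue {
    case string(String)
    case opaque(typeName: String, value: Any)
}

struct BridgeError: Error { let message: String }

func boxOpaque<T>(_ value: T, typeName: String) -> MiniValue {
    .opaque(typeName: typeName, value: value)
}

func unboxOpaque<T>(_ value: MiniValue, as type: T.Type, typeName expected: String) throws -> T {
    // Check the recorded type name first, then the dynamic cast.
    guard case .opaque(let actual, let payload) = value, actual == expected,
          let cast = payload as? T else {
        throw BridgeError(message: "expected \(expected)")
    }
    return cast
}

func unboxString(_ value: MiniValue) throws -> String {
    guard case .string(let s) = value else { throw BridgeError(message: "expected String") }
    return s
}

// Hypothetical bridge table: declaration summary -> closure over boxed values.
typealias Bridge = (_ receiver: MiniValue, _ args: [MiniValue]) throws -> MiniValue

func makeBridges() -> [String: Bridge] {
    [
        "var URL.absoluteString: String": { receiver, _ in
            let url = try unboxOpaque(receiver, as: URL.self, typeName: "URL")
            return .string(url.absoluteString)
        },
        "func URL.appendingPathComponent()": { receiver, args in
            let url = try unboxOpaque(receiver, as: URL.self, typeName: "URL")
            let component = try unboxString(args[0])
            return boxOpaque(url.appendingPathComponent(component), typeName: "URL")
        },
    ]
}

// Chains two bridged calls the way an interpreter would: box, call, call, unbox.
func absoluteString(appending component: String, to base: URL) throws -> String {
    let bridges = makeBridges()
    let boxed = boxOpaque(base, typeName: "URL")
    let appended = try bridges["func URL.appendingPathComponent()"]!(boxed, [.string(component)])
    let result = try bridges["var URL.absoluteString: String"]!(appended, [])
    return try unboxString(result)
}
```

The point of the sketch is that the only Foundation-specific code lives inside the two closures; everything around them deals in boxed values.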

The bridge generator

You don’t write 13,000 lines of those entries by hand. Or at least: I didn’t, after writing the first two by hand, swearing audibly, and writing BridgeGeneratorTool/main.swift instead.

It’s a 2,200-line command-line tool that:

  1. Takes a number of symbol graphs: Apple’s machine-readable JSON dump of a module’s public surface, which Xcode emits during DocC builds. Stdlib’s symbol graph and Foundation’s symbol graph give us every public type, every public function, every initialiser, every protocol conformance, every generic constraint.
  2. Walks those graphs, classifies each symbol by its shape (computed property, instance method, static method, init, failable init, throwing init, throwing async method with generics, …), and emits the appropriate .method / .computed / .staticMethod bridge entry for each.
  3. Writes two output files, one for stdlib and one for Foundation, each containing tens of hundreds of entries.
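The classification step can be pictured roughly like this. This is a hedged sketch, not the real generator: it decodes only the kind.identifier and pathComponents fields of a symbol graph, and the bridgeKey(for:) function and the key strings it produces are simplified assumptions (the real entries also carry type information).

```swift
import Foundation

// Minimal slice of a symbol graph: only the fields this sketch reads.
struct SymbolGraph: Decodable {
    struct Symbol: Decodable {
        struct Kind: Decodable { let identifier: String }
        let kind: Kind
        let pathComponents: [String]
    }
    let symbols: [Symbol]
}

// Hypothetical classifier: maps a symbol kind to a bridge-entry key prefix.
func bridgeKey(for symbol: SymbolGraph.Symbol) -> String? {
    let path = symbol.pathComponents.joined(separator: ".")
    switch symbol.kind.identifier {
    case "swift.property":    return "var \(path)"
    case "swift.method":      return "func \(path)"
    case "swift.type.method": return "static func \(path)"
    case "swift.init":        return "init \(path)"
    default:                  return nil // kinds this sketch does not bridge
    }
}

// Decodes a symbol graph and returns the keys of every classifiable symbol.
func bridgeKeys(fromJSON json: String) throws -> [String] {
    let graph = try JSONDecoder().decode(SymbolGraph.self, from: Data(json.utf8))
    return graph.symbols.compactMap(bridgeKey(for:))
}
```

Type declarations themselves (structs, classes) fall through to nil here; only their members produce entries.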

The generator handles a long tail of cases that would otherwise have eaten a month of debugging:

  • Optional-returning functions unbox their arguments, call, then box the result as .optional(boxOpaque(...)) if non-nil, .optional(nil) otherwise.
  • Failable initialisers (init?(string:)) get an explicit failure-path bridge: return .optional(nil) when the host call returns nil.
  • Generic functions with type constraints emit a generic check at call time. The interpreter has a built-in protocol-predicate table that decides whether a Value “is” Encodable/Comparable/Sequence/etc. without trying to actually conform it:
"Encodable":  { _ in true },          // ScriptCodable wraps any Value
"Hashable":   { _ in true },
"Comparable": { v in
    switch v { case .int, .double, .string: return true; default: return false }
},
"Sequence": { v in
    switch v {
    case .array, .set, .range, .string, .dict: return true
    default: return false
    }
},

Encodable returning true for everything sounds like a cheat. It isn’t; the next section explains why.
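Two of the cases above fit in a few lines each. The sketch below is an illustration under assumptions, not generated output: MiniValue is a cut-down value enum, bridgeIntFromString stands in for a failable-initialiser bridge (using the standard library's Int(_:) on a String), and satisfies(_:_:) is a hypothetical lookup into a predicate table.

```swift
// Cut-down value enum (hypothetical), just enough for this example.
enum MiniValue: Equatable {
    case int(Int)
    case string(String)
    indirect case optional(MiniValue?)
}

// Bridge for a failable initialiser: a nil host result becomes .optional(nil).
func bridgeIntFromString(_ args: [MiniValue]) -> MiniValue {
    guard case .string(let text)? = args.first, let parsed = Int(text) else {
        return .optional(nil)
    }
    return .optional(.int(parsed))
}

// Protocol check by predicate: no conformance is declared anywhere,
// the table simply answers "would this value satisfy the constraint?".
func satisfies(_ value: MiniValue, _ protocolName: String) -> Bool {
    let predicates: [String: (MiniValue) -> Bool] = [
        "Hashable": { _ in true },
        "Comparable": { v in
            switch v {
            case .int, .string: return true
            default: return false
            }
        },
    ]
    // Unknown protocol names are rejected rather than assumed to hold.
    return predicates[protocolName]?(value) ?? false
}
```

Rejecting unknown protocol names is a choice made for this sketch; the article does not say how the real table treats them.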

Codable round-trips through real Foundation

Script code does this all the time:

let user = User(name: "Bob", age: 42)
let data = try JSONEncoder().encode(user)
print(String(data: data, encoding: .utf8)!)

User is a struct defined inside the script. The interpreter has a Value.structValue(typeName: "User", fields: [...]) for it. JSONEncoder().encode(user) is a host call into real Foundation, and Foundation has no idea how to encode Value.

The trick is a thin Codable adapter: a ScriptCodable wrapper that conforms to Codable and walks the Value tree itself, asking the encoder for the right container kind at each step (single-value for primitives, keyed for structs/dicts, unkeyed for arrays). Encoding is the easy direction and needs no type context. Decoding is the harder direction (JSON’s {} could be any struct; null could be any optional), so the decoder reads the script type name and a back-reference to the interpreter from decoder.userInfo:

public init(from decoder: Decoder) throws {
    guard let interp = decoder.userInfo[.scriptInterpreter] as? Interpreter,
          let typeName = decoder.userInfo[.scriptTargetType] as? String
    else { … }
    self.value = try Self.decodeValue(
        from: decoder, typeName: typeName, interp: interp
    )
}

What this buys: the script is using the real JSONEncoder with the real strategies (.iso8601, .convertToSnakeCase, …). We don’t reimplement the format. We don’t have to. Every Date/URL/Data quirk is Foundation’s quirk, not ours.

The same wrapper handles PropertyListEncoder, custom Encoders the user writes, JSONDecoder from a network response: anything in the Codable ecosystem. One adapter, ~340 lines, extends to the entire serialisation surface of the standard library.
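The encoding half of such an adapter can be sketched in a few dozen lines. This is an illustration of the container-per-shape idea, not the real ScriptCodable: MiniValue, DynamicKey, ScriptEncodable and encodeJSON are hypothetical names, and only four value shapes are covered.

```swift
import Foundation

// Cut-down value tree (hypothetical).
enum MiniValue {
    case int(Int)
    case string(String)
    case array([MiniValue])
    case structValue(typeName: String, fields: [(name: String, value: MiniValue)])
}

// CodingKey that accepts any field name at runtime.
struct DynamicKey: CodingKey {
    var stringValue: String
    var intValue: Int? { nil }
    init(_ name: String) { stringValue = name }
    init?(stringValue: String) { self.stringValue = stringValue }
    init?(intValue: Int) { return nil }
}

// Walks the value tree and asks the encoder for the matching container kind.
struct ScriptEncodable: Encodable {
    let value: MiniValue

    func encode(to encoder: Encoder) throws {
        switch value {
        case .int(let i):
            var container = encoder.singleValueContainer()
            try container.encode(i)
        case .string(let s):
            var container = encoder.singleValueContainer()
            try container.encode(s)
        case .array(let items):
            var container = encoder.unkeyedContainer()
            for item in items { try container.encode(ScriptEncodable(value: item)) }
        case .structValue(_, let fields):
            var container = encoder.container(keyedBy: DynamicKey.self)
            for field in fields {
                try container.encode(ScriptEncodable(value: field.value),
                                     forKey: DynamicKey(field.name))
            }
        }
    }
}

// Runs the wrapper through the real JSONEncoder.
func encodeJSON(_ value: MiniValue) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys] // deterministic key order
    let data = try encoder.encode(ScriptEncodable(value: value))
    return String(decoding: data, as: UTF8.self)
}
```

Because the wrapper only talks to the Encoder protocol, the same type would work unchanged with PropertyListEncoder or a hand-written encoder.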

Mirror works too

Swift’s Mirror(reflecting: x) walks the structural shape of any value. Script code can do that on its own values:

"init Mirror(reflecting:)": .`init` { args in
    let box = MirrorBox(mirrored: args[0])
    return .opaque(typeName: "Mirror", value: box)
},
"var Mirror.children": .computed { recv in
    guard case .opaque(_, let any) = recv,
          let box = any as? MirrorBox else { … }
    return .array(MirrorModule.childrenOf(box.mirrored))
},

MirrorModule.childrenOf switches on the Value enum and returns a [Value] of (label: String?, value: Value) tuples: .struct returns its fields, .classInstance walks its property cells, .array enumerates its elements with nil labels, .dict returns key/value pairs. So generic dump helpers, debug printers, and data-driven serialisers (all the patterns that lean on Mirror.children) port directly into script code with the same surface.
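As a rough picture of that switch, here is a small sketch covering two shapes. It is an assumption-laden stand-in, not MirrorModule itself: MiniValue, Field, child(label:value:) and this childrenOf are hypothetical, and each child is modelled as a labelled two-element tuple value.

```swift
// Cut-down value enum (hypothetical).
indirect enum MiniValue: Equatable {
    case int(Int)
    case string(String)
    case optional(MiniValue?)
    case array([MiniValue])
    case tuple([MiniValue], labels: [String?])
    case structValue(typeName: String, fields: [Field])
}

struct Field: Equatable {
    let name: String
    let value: MiniValue
}

// Builds one (label: String?, value: Value) child as a tuple value.
func child(label: String?, value: MiniValue) -> MiniValue {
    let boxedLabel = MiniValue.optional(label.map { MiniValue.string($0) })
    return .tuple([boxedLabel, value], labels: ["label", "value"])
}

// Structural children: fields carry their names, array elements carry nil labels.
func childrenOf(_ value: MiniValue) -> [MiniValue] {
    switch value {
    case .structValue(_, let fields):
        return fields.map { child(label: $0.name, value: $0.value) }
    case .array(let items):
        return items.map { child(label: nil, value: $0) }
    default:
        return [] // leaves have no children in this sketch
    }
}
```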

KeyPaths are synthesised closures

people.map(\.age) in real Swift uses a KeyPath. We don’t model that. Instead:

func evaluate(keyPath: KeyPathExprSyntax) throws -> Value {
    var steps: [String] = []
    for component in keyPath.components {
        switch component.component {
        case .property(let prop):
            steps.append(prop.declName.baseName.text)
        ...
        }
    }
    // Synthesise `{ $0.steps[0].steps[1]... }` as a closure
    ...
}

\.age becomes a one-arg closure { $0.age }. \.address.city becomes { $0.address.city }. people.map(\.age) is then just people.map { $0.age }. The host signature map(_: (Element) -> T) accepts a Function value, and we run it under the interpreter the same way as any user-written closure. Subscript and optional-chaining components in keypaths are surfaced as runtime-unsupported errors rather than being silently mistranslated: they’re rare in script code and faking them would be worse than rejecting them.
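The second half of that function, turning the collected steps into a closure, might look something like the following. This is a sketch under assumptions: MiniValue models struct fields as a dictionary, and synthesiseAccessor and KeyPathError are hypothetical names, not SwiftScript API.

```swift
// Cut-down value enum (hypothetical); struct fields are a plain dictionary here.
enum MiniValue: Equatable {
    case int(Int)
    case string(String)
    case structValue(typeName: String, fields: [String: MiniValue])
}

struct KeyPathError: Error { let step: String }

// Turns ["address", "city"] into the equivalent of { $0.address.city }.
func synthesiseAccessor(steps: [String]) -> (MiniValue) throws -> MiniValue {
    return { root in
        var current = root
        for step in steps {
            // Each step must land on a struct that actually has that field.
            guard case .structValue(_, let fields) = current,
                  let next = fields[step] else {
                throw KeyPathError(step: step)
            }
            current = next
        }
        return current
    }
}
```

A missing field throws at call time instead of returning a default, in the same spirit as rejecting unsupported keypath components.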

Iteration, both directions

Two adapters bridge iteration between host-Swift and script-Swift:

ScriptSequence is a Sequence over any iterable Value: array, set, dict, string, range. Wrapping a script value gives host-Swift code something to pass to Array(_:), Set(_:), zip, prefix, and the rest of the stdlib’s algorithm surface. Bridge code that needs to walk a Value no longer has to switch on every shape.

AsyncStreamBox goes the other way: a reference-typed carrier for an asynchronous element source, surfaced into the interpreter as .opaque("AsyncStream", box). A registered builtin captures a host AsyncIterator’s .next() in a closure; the for-loop adapter in the interpreter drives it via try await stream.next(). That’s how URLSession.bytes(for:) becomes a script-side for await byte in stream { … } without any per-Foundation-API glue.
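The synchronous adapter is the easier of the two to picture. Below is a minimal sketch of a Sequence over a boxed value; MiniValue and MiniSequence are hypothetical stand-ins for Value and ScriptSequence, and only arrays, strings and integer ranges are handled.

```swift
// Cut-down value enum (hypothetical).
enum MiniValue: Equatable {
    case int(Int)
    case string(String)
    case array([MiniValue])
    case range(lower: Int, upper: Int, closed: Bool)
}

// One Sequence conformance so host code never has to switch on the shape.
struct MiniSequence: Sequence {
    let base: MiniValue

    func makeIterator() -> AnyIterator<MiniValue> {
        switch base {
        case .array(let items):
            var iterator = items.makeIterator()
            return AnyIterator { iterator.next() }
        case .string(let text):
            // Strings iterate as one-character strings in this sketch.
            var iterator = text.makeIterator()
            return AnyIterator { iterator.next().map { MiniValue.string(String($0)) } }
        case .range(let lower, let upper, let closed):
            // An inverted range is treated as empty instead of trapping.
            guard lower <= upper else { return AnyIterator { nil } }
            var iterator = closed ? AnyIterator((lower...upper).makeIterator())
                                  : AnyIterator((lower..<upper).makeIterator())
            return AnyIterator { iterator.next().map { MiniValue.int($0) } }
        case .int:
            return AnyIterator { nil } // not iterable: yields nothing
        }
    }
}
```

Once wrapped, the value works with Array(_:), prefix, zip and the rest of the standard algorithms, for example Array(MiniSequence(base: someValue).prefix(2)).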

Concurrency, with one honest cheat

The interpreter’s evaluation graph is fully async throws from top to bottom: every evaluate(...) call can suspend. So await in script code lands on Swift’s real concurrency runtime: bridged async leaves (URLSession.shared.data(...), the sleep builtin) genuinely suspend and resume.

The cheat is Task { … }. In real Swift, Task { closure } spawns a new concurrent task and returns a handle. In SwiftScript, Task { closure } runs the closure body inline and returns .void. Why: the interpreter mutates shared state (scopes, classDefs, the bridge table, …) that isn’t Sendable. Spawning real concurrent Swift tasks would race. Inline execution plus real await on leaves is the best of both worlds: script code calls bridged async APIs and gets real suspension, but the interpreter keeps its single-threaded mutation guarantee.

actor Foo { … } declarations are registered as classes for the same reason. The single-threaded runtime has nothing to isolate, so the await keyword on actor methods is a no-op at the expression level and method dispatch goes through the same path as classes.

Cross-platform classification, automatically

This one is my favourite engineering touch in the project, and the reason five checkmarks light up rather than three.

The Apple symbol graph for Foundation is huge. It includes a lot of Apple-only stuff: NSCoding, AppKit-bridged classes, things that simply don’t ship in swift-corelibs-foundation. A naive bridge generator would emit entries for these and the Linux/Windows/Android builds would fail to link.

Hand-curating an “Apple-only” list would be tedious and inevitably stale. Instead there’s a tiny companion tool, SCLSymbolExtractor, which parses the actual source tree of swift-corelibs-foundation with swift-syntax and emits a flat list of every public type member it declares:

Type.memberName                       (cross-platform member)
Type.memberName  UNAVAILABLE          (declared but @available(*, unavailable))
Type.                                 (cross-platform type marker; member name empty)
.topLevelFunc                         (cross-platform free function)

The resulting file (Resources/foundation-symbols-scl.txt, ~8000 lines) is consumed by the bridge generator, which then automatically wraps every Apple-only entry in #if canImport(Darwin). No hand-curated +Apple.swift companions; no merge conflicts when Linux’s Foundation gets a new method; the whole policy is a regenerate-from-source step.

The companion handles the gnarly cases too: @available(*, unavailable) declarations stay marked Apple-only because Linux declares the symbol but throws at runtime; entries the generator can’t classify (no owning type, no signature) are conservatively wrapped.
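The policy boils down to a set lookup. The sketch below assumes each line of the list holds a symbol name optionally followed by UNAVAILABLE (the parenthesised notes above are read as annotations for this article, not file content); crossPlatformSymbols(from:) and emit(entry:symbol:known:) are hypothetical names.

```swift
// Parses the flat list into the set of symbols usable on every platform.
// Lines flagged UNAVAILABLE are left out, so they end up treated as Apple-only.
func crossPlatformSymbols(from listing: String) -> Set<String> {
    var symbols = Set<String>()
    for line in listing.split(separator: "\n") {
        let parts = line.split(separator: " ", omittingEmptySubsequences: true)
        guard let symbol = parts.first else { continue }
        if parts.count > 1 && parts[1] == "UNAVAILABLE" { continue }
        symbols.insert(String(symbol))
    }
    return symbols
}

// Emits a bridge entry as-is when its symbol is known to be cross-platform,
// and wraps it otherwise. A nil symbol (unclassifiable entry) is wrapped too.
func emit(entry: String, symbol: String?, known: Set<String>) -> String {
    if let symbol = symbol, known.contains(symbol) { return entry }
    return "#if canImport(Darwin)\n\(entry)\n#endif"
}
```

Defaulting to "wrap" means a gap in the list can only hide an API on Linux, never break the build there.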

What it’s good for

The natural niche is the place where bash starts to creak: anything that wants real numbers, structured data, or a local function with named parameters:

struct Sample { let label: String; let values: [Double] }

func mean(_ xs: [Double]) -> Double {
    xs.reduce(0, +) / Double(xs.count)
}

let samples = [
    Sample(label: "alpha", values: [12.1, 13.4, 11.9]),
    Sample(label: "beta",  values: [9.5, 8.7, 10.2]),
]

for s in samples {
    print("\(s.label): \(mean(s.values))")
}

That’s a swift-script one-liner away from running. No compile step, no toolchain on the runtime host, no shell-out. The same source can be loaded by an iOS app and evaluated in-process, sandboxed.

The Examples/llm_probes/ folder in the repo is a collection of ten small programs an LLM might typically write: mean/stddev, primes, quadratic formula, Fibonacci, Simpson’s rule numerical integration, compound interest. All of them run unmodified.

The shebang case works:

#!/usr/bin/env swift-script
import Foundation

let nums = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
let sorted = nums.sorted()
print("sorted:", sorted)
print("mean:  ", String(format: "%.2f", Double(nums.reduce(0, +)) / Double(nums.count)))

chmod +x it and run it directly; the binary on $PATH is swift-script instead of swift. The script never gets compiled. It runs entirely inside swift-script’s process.

Where it falls short

Two big honest caveats up front:

  1. No type checking ahead of time. SwiftScript evaluates types when the call happens: when unboxOpaque(receiver, as: URL.self, typeName: "URL") runs and either gets back a URL or throws. There’s no compile pass to tell you “that’s an Int, you can’t pass it to URL.appendingPathComponent” before the script starts running. swift-syntax already gives us the structural information; the interpreter just doesn’t use it for type analysis yet. That’s the next obvious frontier.
  2. Class inheritance is approximated rather than walked through a real vtable. A subclass is registered with a recorded superclass name; method dispatch walks the chain at call time; override is checked structurally (declaring override on a method that doesn’t actually shadow a superclass method is a compile-error-equivalent runtime error). It supports the everyday patterns (override, store fields, share state via reference semantics) and even allows inheriting from bridged parents (a script class inheriting from URL or Date wraps a real native instance and falls through to the bridged surface for unmodelled members). What it doesn’t perfectly mirror is every corner of Swift’s class semantics: super chains across multiple levels behave correctly for normal calls, but exotic patterns (initialiser inheritance with required, dynamic dispatch through Self-typed return) are best-effort.

Neither limitation is fundamental: both are work, not impossibilities.
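For the second caveat, "walks the chain at call time" can be pictured as a plain loop over recorded superclass names. This is a sketch under assumptions, not the interpreter's dispatch code: ClassDef, DispatchError and resolve(method:on:in:) are hypothetical, and methods are modelled as parameterless closures returning a String.

```swift
// Hypothetical class record: a name, an optional superclass name, and methods.
struct ClassDef {
    let name: String
    let superclassName: String?
    let methods: [String: () -> String]
}

struct DispatchError: Error { let method: String }

// Looks a method up on the class itself first, then on each recorded superclass.
func resolve(method: String,
             on className: String,
             in classes: [String: ClassDef]) throws -> () -> String {
    var current: String? = className
    var visited = Set<String>() // guards against a malformed, cyclic chain
    while let name = current, visited.insert(name).inserted, let def = classes[name] {
        if let implementation = def.methods[method] { return implementation }
        current = def.superclassName
    }
    throw DispatchError(method: method)
}
```

An override wins simply because the subclass is consulted first; there is no table to build ahead of time, which is also why nothing is checked before the call happens.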

A few numbers

For anyone curious about the shape of the code:

Component                            LOC
Total Swift in repo                  ~30,000
Auto-generated stdlib bridges        ~1,100
Auto-generated Foundation bridges    ~13,500
Bridge generator tool                ~2,200
Interpreter (everything else)        ~13,000
Test suite                           69 test files

The bridge generator is the single highest-leverage piece of code in the project. Every new Foundation type Apple ships becomes available to script code by re-running the generator against the updated symbol graph; no per-type human work.

The Resources/ directory holds three lists that drive the generation: a 126-line allowlist of types we always include, a 200-line blocklist of types/members the auto-bridge can’t handle (and where a hand-rolled bridge in Modules/ takes over), and the 8000-line foundation-symbols-scl.txt cross-platform classifier described above.

Bitrig’s Compiler

I would be remiss if I didn’t tip my hat to Bitrig, who tackled this problem slightly differently. Instead of walking the AST they invented a compiler that compiles the Swift code into a kind of byte code first. Then the second step executes these commands in something similar to a virtual machine. But at the end it still needs to transition to the binary world. This approach optimizes for performance because it avoids having to navigate around the tree and boxing and unboxing values.

But premature optimization is the death of many a project, so I concentrated on making it work first. We can still worry about performance later. Bitrig is focussing on SwiftUI code that gets written on-device. My primary goal is to use Swift as a first-class scripting language, so performance is a lesser concern.

What I’d love to see

It would be wonderful if the official Swift project leaned into safe, embedded scripting as a first-class use case: a sanctioned interpreter mode, blessed bridges over Foundation and the standard library, and a clear answer to “I want to ship a Swift script that runs inside an iOS app’s sandbox without compiling code.”

Until then, SwiftScript is what I have. The repo is over here, the README has the install line, and the same five-checkmark CI as SwiftBash now keeps it honest on macOS, iOS, Linux, Windows, and Android. As usual, I’m very much interested in your thoughts on this and any of my other OSS projects.


Categories: Updates
