
Introducing SwiftBash | Cocoanetics


Every coding agent I use (Claude Code, Codex, even PI) leans on the same tool: /bin/bash. PI in particular runs almost entirely through bash, no sandbox in sight. There’s a good reason for that. Bash is one of the most heavily represented languages in any pre-training corpus on the planet, and LLMs write it fluently. If you give a model a file to manipulate, a folder to inspect, or a one-shot pipeline to assemble, the answer that falls out is almost always a few lines of shell.

The downside is the friction. Unless you live in YOLO mode, you spend half your day clicking Allow on find, grep, sed, and cat prompts. Codex in the cloud sidesteps this by spinning up a fresh container per task. On my Mac, both Codex and Claude Code happily edit my actual files, and even with git worktrees I’ve ended up with stray uncommitted changes on main more than once.

So I started wondering: bash isn’t really that complicated a language. What if I just had Opus write me a bash interpreter, in Swift?

A weekend with the 1M context window

Over the last day or so I had Opus on Extra High fill up the 1M context window a couple of times over. I gave it Vercel’s just-bash for inspiration and bashlex as a reference for how a real bash parser is structured, and let it cook.

The constraints I cared about:

  • Pure modern Swift. No Process, no fork, no exec. Has to drop into a Mac, iOS, or Linux app without dragging libc shell-out behavior into a sandboxed binary.
  • Everything an LLM would actually write. ls, cat, grep, sed, find, awk, jq, tar, curl, bc, xargs, mktemp, the lot.
  • Real sandboxing. Either a cordoned-off temp folder that looks like a real filesystem to the script, or a pure in-memory tree that never touches the disk at all.

That last one was the whole point. Codex’s cloud sandboxes are nice precisely because they’re disposable. I wanted the same property locally, and on iOS, where you can’t fork anything anyway.

What it looks like

The library is split into three products plus a CLI. The smallest useful program is this:

import BashInterpreter
import BashCommandKit

let shell = Shell()                    // sandbox-by-default identity
shell.registerStandardCommands()       // ls, cat, grep, sed, find, …

try await shell.run("""
    for f in *.txt; do
      echo "$(basename "$f" .txt): $(wc -l 

Every command is a registered Swift type. Pipelines are AsyncStream channels. The filesystem is a FileSystem protocol, and there are three implementations to pick from:

  • RealFileSystem: the host’s FileManager, for trusted scripts.
  • SandboxedOverlayFileSystem: confines the script to one host directory plus an in-memory /tmp. Symlink escapes are blocked, every path passes through realpath(3), and error messages reference virtual paths only, so host paths never leak.
  • InMemoryFileSystem: pure in-memory tree. Nothing ever hits the disk.
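
The confinement idea behind the overlay filesystem can be sketched in a few lines of shell. This is a hypothetical illustration, not SwiftBash’s code: the function name inside_sandbox and the temp-directory layout are made up, and the check resolves only the parent directory with pwd -P rather than the full path the way realpath(3) does.

```shell
# Hypothetical check, not SwiftBash's code: resolve the parent directory
# physically, then require the result to sit under the sandbox root.
inside_sandbox() {
  root=$(cd "$1" 2>/dev/null && pwd -P) || return 1
  dir=$(cd "$(dirname "$2")" 2>/dev/null && pwd -P) || return 1
  case "$dir/" in
    "$root"/*) return 0 ;;   # resolved directory is the root or below it
    *)         return 1 ;;   # anything else counts as an escape
  esac
}

# Throwaway layout for the demo: a sandbox root with a symlink pointing out of it.
box=$(mktemp -d)
mkdir "$box/root" "$box/outside"
ln -s "$box/outside" "$box/root/escape"

inside_sandbox "$box/root" "$box/root/notes.txt"    && echo "notes.txt: allowed"
inside_sandbox "$box/root" "$box/root/escape/x.txt" || echo "escape/x.txt: blocked"
inside_sandbox "$box/root" "$box/root/../outside/x" || echo "../outside/x: blocked"
rm -rf "$box"
```

Because only the parent directory is resolved, a final path component that is itself a symlink would slip through this sketch; resolving the whole path closes that gap.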

A freshly-constructed Shell() already leaks nothing about the host:

$ echo 'whoami; hostname; ls /Users; cat /etc/passwd' \
    | swift-bash exec --sandbox /tmp/work /dev/stdin
user
sandbox
ls: /Users: No such file or directory
cat: /etc/passwd: No such file or directory

The four virtualisation axes (filesystem, network, processes, identity) are all independent. You opt into each one. Want the script to be able to call your API but nothing else?

shell.networkConfig = NetworkConfig(
    allowedURLPrefixes: ["https://api.example.com/v1/"],
    allowedMethods: ["GET", "POST"],
    denyPrivateIPs: true   // block 127.0.0.1, 10/8, 192.168/16, …
)

That’s it. curl reads from Shell.networkConfig and refuses everything else with exit status 7.
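
To make the allow-list idea concrete, here is a minimal shell sketch of prefix matching. It is an illustration under assumptions, not how SwiftBash implements NetworkConfig: the names ALLOWED_PREFIXES, url_allowed, and fetch are hypothetical, and so is the error text. Only the exit status 7 comes from the description above.

```shell
# Hypothetical sketch of a prefix-based URL allow-list.
ALLOWED_PREFIXES="https://api.example.com/v1/"

url_allowed() {
  for prefix in $ALLOWED_PREFIXES; do
    case "$1" in
      "$prefix"*) return 0 ;;   # URL starts with an allowed prefix
    esac
  done
  return 1
}

fetch() {
  if url_allowed "$1"; then
    echo "would fetch $1"
  else
    echo "curl: $1: not in allow-list" >&2   # message text is made up
    return 7                                  # exit status named in the text
  fi
}

fetch "https://api.example.com/v1/users"
fetch "https://api.example.com.evil.test/v1/" || echo "refused with status $?"
```

The trailing slash in the prefix matters: without it, a plain string comparison would also accept a look-alike host such as api.example.com.evil.test.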

Bash 4, not bash 3.2

One small surprise from this project: macOS still ships /bin/bash 3.2 from 2007, because of a GPL licensing thing. Modern Linux, Homebrew, and basically everybody else are on bash 4 or 5. So when LLMs generate bash, they generate bash 4: associative arrays, ${var^^} case conversion, ${arr[-1]} negative indexing, mapfile, coproc. SwiftBash targets bash 4.x semantics for everything it implements, which means scripts that an LLM writes generally just work, with no “bad substitution” surprises.

declare -A counts
for word in $(cat words.txt); do
  counts[$word]=$(( ${counts[$word]:-0} + 1 ))
done
for k in "${!counts[@]}"; do
  echo "$k: ${counts[$k]}"
done | sort -k2 -rn

That runs in SwiftBash. It doesn’t run in /bin/bash on a stock Mac.

The hard ones, properly done

The thing I’m most pleased about, and honestly a bit surprised by, is how complete the implementations of the staple commands ended up being. These aren’t shims that handle the three flags an LLM happens to use most often. They’re proper implementations of what are, in many cases, full programming languages in their own right.

The biggest ones, ranked by lines of Swift it took to implement them:

Command   Swift LOC   What it actually is
jq        ~4,500      JSON query language: lexer, parser, evaluator, ~80 builtins
awk       ~3,000      Pattern-action language: lexer, parser, expression tree, builtins
sed       ~1,600      Stream-editor mini-language: address ranges, s/// with backrefs, b/t branches, hold space
find      ~900        Expression tree with -and/-or/-not, -exec … {} +, time/size predicates
curl      ~600        HTTP client with the allow-list and SSRF defenses bolted in
bc        ~400        Expression calculator with -l math library (Double-precision)

jq, awk, and sed in particular each needed their own parser and evaluator; they’re real languages. The fact that all three came out coherent, with associative arrays and user-defined functions in awk, with hold-space and labels in sed, with path expressions and reduce/foreach in jq, is the part I keep being a little amazed by. These are the commands that make bash actually useful for data manipulation, and they’re the ones I’d most miss if they were stubbed out.

Beyond that tier there’s solid coverage on grep, rg (ripgrep), sort, tar, gzip/gunzip, diff/patch, yq, tr, cut, paste, join, comm, xargs, and the rest of the textbook unix toolkit.

Cover the majority, fail honestly on the rest

The design rule I kept coming back to: handle the majority of real-world usage, and when you hit a limitation, fail in a way the model can read and route around.

LLMs are remarkably good at recovery if you give them an honest error. They’re terrible if you silently produce wrong output. So every command emits the same kind of error a real GNU/BSD tool would, prefixed with the command name, written to stderr, with a non-zero exit status:

$ swift-bash exec script.sh
column: unknown option: --table-columns
awk: function `gensub' not implemented
ps: -L not supported in sandbox

When an agent sees awk: function 'gensub' not implemented, it does the obvious thing: it rewrites the line as a sed substitution or an awk gsub, and moves on. That recovery loop is the whole reason this works as an LLM tool. A silent failure or a wrong answer would poison the rest of the session; a loud, specific error is just another data point the model handles in stride.
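
The convention is easy to show with a toy command. The sketch below is hypothetical (tally is not a SwiftBash command, and the supported flag is invented); it only demonstrates the three ingredients named above: the command-name prefix, stderr, and a non-zero exit status.

```shell
# Hypothetical command, used only to show the error convention.
tally() {
  for arg in "$@"; do
    case "$arg" in
      -c) ;;                                        # the one flag this sketch supports
      -*) echo "tally: unknown option: $arg" >&2    # name prefix, written to stderr
          return 2 ;;                               # non-zero exit status
    esac
  done
  echo "ok"
}

# A caller (or an agent) can branch on the status and read the message.
if ! err=$(tally --check-order 2>&1 >/dev/null); then
  echo "recovering from: $err"
fi
```

The caller never has to guess whether the flag was honoured: the status says it was not, and the message says which flag to drop or replace.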

The corollary: I’d much rather ship a command with 80% coverage and crisp error messages on the missing 20% than a command with 95% coverage and undefined behavior on the edges. If the post-mortem on a failed agent run is “it tried comm -12 --check-order and SwiftBash quietly ignored the flag,” I’ve made the wrong tradeoff.

Math, because of course you need math

LLM-generated bash loves bc for arithmetic. SwiftBash ships a bc that’s “good enough”: it’s Double-precision rather than arbitrary precision, but for the kinds of expressions an agent actually writes it’s indistinguishable from the real thing:

$ echo "scale=6; 22/7" | bc
3.142857

$ echo "s(1.5707963)" | bc -l        # sine, with the math library
.999999999999

$ echo "sqrt(2) * 100" | bc -l
141.42135623730950488

# sum a column of numbers
$ awk '{print $2}' sales.tsv | paste -sd+ - | bc
18420.50

Combined with awk, paste, and the usual $(( … )) arithmetic expansion, that covers basically every “do a quick calculation” thing an agent reaches for.
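
For comparison, here is a small sketch of the two other arithmetic routes mentioned above. The sample data and variable names are invented for illustration. awk does floating-point sums on its own, while $(( … )) is integer-only and truncates on division.

```shell
# Made-up sample data: a name and an amount per line.
data='widget 10.25
gadget 5.50
gizmo 4.25'

# awk sums the second column in floating point.
total=$(printf '%s\n' "$data" | awk '{ sum += $2 } END { printf "%.2f\n", sum }')
echo "$total"

# $(( ... )) is integer arithmetic: 300 / 8 truncates to 37.
passed=3
runs=8
pct=$(( passed * 100 / runs ))
echo "$pct"
```

Integer expansion suits counts and percentages; anything with a decimal point belongs in awk or bc.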

A few real scripts

Just to give you a sense of what runs unmodified: these are the kind of one-liners and small pipelines that LLMs produce constantly, and they all go through the in-process interpreter without spawning a single subprocess.

# Find the ten largest source files in a tree.
find . -name '*.swift' -type f -print0 \
  | xargs -0 wc -l \
  | sort -rn \
  | head -11 \
  | tail -10

# Count TODO/FIXME comments by author, using grep + awk.
grep -rn -E 'TODO|FIXME' Sources/ \
  | awk -F: '{ print $1 }' \
  | xargs -I{} git log -1 --format="%an" -- {} \
  | sort | uniq -c | sort -rn
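
# Rewrite a config file in place: bump every version: x.y.z by one patch.
# sed cannot do arithmetic on a capture group, so awk does the increment.
cp config.yaml config.yaml.bak
awk -F. -v OFS=. '/^version: [0-9]+\.[0-9]+\.[0-9]+$/ { $3 += 1 } 1' \
  config.yaml.bak > config.yaml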

# Tally HTTP status codes from an access log.
awk '{ print $9 }' access.log \
  | sort | uniq -c | sort -rn \
  | head

None of these need /bin/bash, none need Process. They run inside the same Swift process that hosts your app.

The CLI

There’s a swift-bash binary that mirrors the embedded interpreter: same parser, same commands, same sandbox flags. You can use it as a safer bash for scripts you don’t fully trust:

# AI-generated script, no host access at all.
echo "$llm_output" | swift-bash exec --sandbox /tmp/work /dev/stdin

# Sandboxed run with read-only access to one specific API.
swift-bash exec --sandbox ~/Documents/scratch \
                --allow-url https://api.github.com/repos/example/ \
                analyze.sh

It also has a parse subcommand that prints the AST, which is useful when you’re trying to understand why some weird quoting edge case isn’t doing what you expected.

What it’s actually for

The vision is an iPad coding-agent app that embeds this thing as its bash tool. OpenAI gives you code_interpreter over the wire, and it’s great, but if I have a perfectly serviceable interpreter that runs in-process on the device, why pay a round-trip to run wc -l? Light agentic exploration, summarising a folder of CSVs the user dropped into the sandbox, basic data wrangling: all of it stays local, and all of it stays inside the sandbox the host app handed the script.

To be clear: SwiftBash only manipulates files inside the sandbox you give it. It doesn’t reach into the user’s Photos library or read arbitrary files from the Files app. But the sandbox is a normal Swift FileSystem, which means an embedding app can plug in whatever extra commands it wants. I can imagine pulling in a few of my SwiftText routines (Markdown-to-HTML, HTML-to-PDF, that sort of thing) and registering them as bash commands. Then you can have an LLM produce a report in Markdown inside the sandbox and get a polished HTML or PDF out of the same script.

It also turns out to be a useful CLI in its own right. I now reach for swift-bash exec --sandbox whenever an LLM hands me a script and I haven’t yet read the whole thing.

And one more thing

I asked Opus to summarise the lessons we learned building the bash interpreter: what the abstractions ended up being, where the parser and the executor split, how AsyncStream pipelines actually want to be wired. Then I handed that summary to another Opus and asked it to start a Swift interpreter on the same architecture.

It’s already further along than I expected. Most arithmetic, control flow, and function definitions work. I’ll probably wire it into SwiftBash itself as a stand-in for swiftc so that #!/usr/bin/env swift scripts can run inside the same sandbox as everything else.

Same trick, different language, and the same reason it works. The training data is already there. We just have to give it somewhere safe to run.

Why open source?

Honestly? Because I don’t know how complete or correct this is yet. Bash is a sprawling, decades-old language with all kinds of corners (job control, brace expansion edge cases, the seventeen different ways [[ … ]] differs from [ … ]), and I’ve covered the parts that LLM-generated scripts actually exercise, but “actually exercise” is a moving target. Every model I throw at it finds another quoting wrinkle.

So I’m putting it on GitHub. If you read this and think “that’s a fun idea, but you forgot about X,” please tell me. If you have a use case I haven’t thought of (embedding it in a Shortcuts action, wiring it up to a local model, using it as a teaching sandbox for a bash class), I’d love to hear that too. The repo is the conversation; I’ll meet you there.

