Chapter 43

Build a Simple Shell

Every time you type ls -la | grep ".go" | wc -l in a terminal, far more happens than meets the eye. Your shell must: parse that line of text, understand what | means, create separate processes for ls, grep, and wc, connect ls's standard output to grep's standard input, connect grep's standard output to wc's standard input, wait for all of them to finish, and then print the final result to your terminal.

This entire sequence touches some of the most fundamental concepts in operating systems: process creation (fork/exec), file descriptor inheritance, pipes, and signals. A shell is a process manager — a language translator between humans and the kernel.

Building a shell is the most direct way to learn operating system principles. Not by reading textbook descriptions, but by invoking system calls directly, feeling the kernel's behavior in your own code. In this chapter we build a bash-lite in Go with pipes, I/O redirection, built-in commands, and history.

Level 1 · What a Shell Really Is

Command Interpreter and Process Manager

The word "shell" evokes the outer layer around an operating system kernel (the core). A shell has two core responsibilities:

Command interpreter: translates text typed by humans into operations the operating system can execute. ls -la → find the /bin/ls program → execute it with arguments -l and -a.

Process manager: creates and manages child processes. Every program you run is a child process created by the shell. The shell decides where each child process's standard input/output connects (terminal? file? the other end of a pipe?), decides when to wait for a child to finish, and decides how to send signals (Ctrl+C, Ctrl+Z) to children.

One of Unix's core design philosophies is: everything is a file. Pipes are files, terminals are files, network connections are files. Shell I/O redirection and pipes are fundamentally operations on file descriptors (FDs).

Why Building a Shell Teaches OS Fundamentals

When you implement | (pipes), you must truly understand the pipe() system call: it creates two file descriptors, one read-only and one write-only, and data written to the write end can be read from the read end. You must understand file descriptor inheritance after fork, why every process must close the pipe ends it doesn't use (a reader sees EOF only once every copy of the write end has been closed), and why the parent shell must close its own copies once the children have started (otherwise the pipeline hangs and descriptors leak).

This knowledge reads one way in a textbook and another way entirely when you're debugging your own code. When you first successfully see cat file | wc -l produce the correct output, your understanding of pipes transforms from concept into intuition.

POSIX Shell Standard

What we're building is a simplified shell — not fully POSIX sh compliant. A true POSIX shell must handle: variables, arithmetic expansion, command substitution, glob expansion, here-docs, function definitions, conditionals... every feature has enormous detail.

If you want a complete POSIX shell in Go, look at mvdan/sh — a complete POSIX shell parser and interpreter, with partial support for Bash extensions ([[, (( )), etc.). This chapter's goal is not completeness, but deep understanding of each mechanism by implementing the essential subset.

Level 2 · Core Mechanism Principles

The Hierarchy of Shell Grammar

A shell command line decomposes into these layers:

Command Line
  └── Pipeline: cmd1 | cmd2 | cmd3
        └── Simple Command: cmd [args...] [redirections...]
              └── Token: command name / argument / redirection operator / special character

Tokenization cuts the raw string into tokens. Parsing organizes tokens into a command tree. Execution walks the command tree to create processes.

Special tokens:

Token    Meaning
|        Pipe: previous command's stdout connects to next command's stdin
>        Output redirect: stdout written to file (overwrite)
>>       Output redirect: stdout appended to file
<        Input redirect: stdin reads from file
2>       Error redirect: stderr written to file
&        Background: don't wait for command to finish
;        Command separator: run sequentially
&&       Short-circuit AND: run next only if previous succeeded
||       Short-circuit OR: run next only if previous failed
(cmd)    Subshell: execute in a child process environment
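
The two short-circuit operators are easy to get subtly wrong: in POSIX shells && and || have equal precedence and are evaluated left to right, so false && echo b || echo c still runs echo c. Here is a minimal sketch of that rule. The names step and evalAndOr are illustrative and are not part of bash-lite.

```go
package main

import "fmt"

// step is one command in an and-or list. op is the operator that precedes
// the command: "" for the first command, "&&" or "||" for the rest.
// Both names are illustrative, not part of bash-lite.
type step struct {
    op  string
    run func() int // returns the command's exit status
}

// evalAndOr walks the list left to right. A command runs only if the status
// so far satisfies its operator; a skipped command leaves the status unchanged.
func evalAndOr(steps []step) int {
    status := 0
    for i, s := range steps {
        switch {
        case i == 0,
            s.op == "&&" && status == 0,
            s.op == "||" && status != 0:
            status = s.run()
        }
    }
    return status
}

func main() {
    ok := func(name string) func() int {
        return func() int { fmt.Println("ran", name); return 0 }
    }
    fail := func() int { return 1 }

    // Equivalent to: false && echo b || echo c
    status := evalAndOr([]step{
        {"", fail},
        {"&&", ok("b")},
        {"||", ok("c")},
    })
    fmt.Println("status:", status)
}
```

Because a skipped command does not reset the status, the failure of the first command is still visible when the || is reached.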

os/exec.Cmd vs syscall.ForkExec

Go provides two layers of process creation API at different abstraction levels:

os/exec.Cmd: high-level API, encapsulating the details of fork+exec. exec.Command("ls", "-la") creates a Cmd object; cmd.Start() launches it; cmd.Wait() waits for it to finish. I/O is configured through cmd.Stdin, cmd.Stdout, cmd.Stderr fields, which can be os.File, io.Reader/io.Writer, or pipes from os.Pipe().

syscall.ForkExec / syscall.Exec: direct system call invocation. syscall.ForkExec performs fork(2) followed by exec(2) with fine-grained control over child process attributes (process group, session, controlling terminal, credentials, and the exact set of file descriptors the child receives). It is only needed for deeper control, such as implementing job control or running a child under different credentials.

For building bash-lite, os/exec.Cmd provides sufficient control while avoiding the low-level details of direct file descriptor manipulation. Job control (fg/bg, Ctrl+Z) is where syscall.SysProcAttr comes in, through the Cmd.SysProcAttr field.
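
As a minimal sketch of the high-level API, the function below runs one external command attached to the shell's own terminal and maps the result to a shell-style exit status. The function name run is illustrative. The statuses 127 (not found) and 126 (found but not executable) follow the usual shell convention rather than anything os/exec itself prescribes.

```go
package main

import (
    "errors"
    "fmt"
    "os"
    "os/exec"
)

// run starts one external command and returns a shell-style exit status.
// The name "run" is illustrative, not part of bash-lite.
func run(name string, args ...string) int {
    cmd := exec.Command(name, args...)
    cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr

    err := cmd.Run() // Run is Start followed by Wait
    var exitErr *exec.ExitError
    switch {
    case err == nil:
        return 0
    case errors.As(err, &exitErr):
        // The program ran and exited non-zero (ExitCode is -1 if a signal killed it).
        return exitErr.ExitCode()
    case errors.Is(err, exec.ErrNotFound):
        fmt.Fprintf(os.Stderr, "%s: command not found\n", name)
        return 127
    default:
        fmt.Fprintf(os.Stderr, "%s: %v\n", name, err)
        return 126
    }
}

func main() {
    fmt.Println("status:", run("ls", "-la"))
}
```

Separating "the program failed" (ExitError) from "the program never started" keeps the shell's error messages honest.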

Process Groups and Job Control

Job control lets users move processes between foreground and background, suspend and resume them. It depends on the OS concepts of process groups and sessions.

A process group is a collection of related processes with a shared process group ID (PGID). The three processes in cmd1 | cmd2 | cmd3 typically belong to the same process group.

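Signals can be sent to an entire process group: kill(-pgid, SIGTERM) sends SIGTERM to every process in the group. The terminal driver does the same on Ctrl+C, delivering SIGINT to every process in the foreground process group, which is why Ctrl+C terminates an entire pipeline (not just the first process).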

// Make the child process the leader of a new process group
cmd.SysProcAttr = &syscall.SysProcAttr{
    Setpgid: true,  // child's PGID = child's PID (new process group leader)
}

File Descriptor Inheritance

After fork(), the child process inherits all of the parent's file descriptors. This is both the foundation of how pipes work and the source of resource leaks.

For the pipe cmd1 | cmd2, the correct fd management sequence:

Parent creates pipe: pipe(read_fd, write_fd)

Fork cmd1:
  - cmd1's stdout = write_fd (cmd1 writes its output to the pipe)
  - cmd1 closes read_fd (cmd1 doesn't read the pipe, no need to keep read end)
  - parent closes write_fd (parent doesn't write, no need to keep write end)

Fork cmd2:
  - cmd2's stdin = read_fd (cmd2 reads from the pipe)
  - cmd2 closes write_fd (cmd2 doesn't write, no need to keep write end)
  - parent closes read_fd (parent doesn't read, no need to keep read end)

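The rule behind all of this: a reader sees EOF on a pipe only when every descriptor that refers to the write end has been closed, in every process. If the parent shell (or cmd2 itself) still holds write_fd after cmd1 exits, the kernel cannot tell that no more data is coming, so cmd2 blocks in read() forever. This is the classic "pipe deadlock." The mirror case matters too: if a stray copy of read_fd stays open after cmd2 exits, cmd1 never receives SIGPIPE and keeps writing until the pipe buffer fills, then blocks. In Go, descriptors from os.Pipe() are created close-on-exec, and os/exec gives the child only the files assigned to Stdin, Stdout, Stderr, and ExtraFiles, so the child-side closes happen automatically. The parent-side closes are still your job.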
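
The EOF behavior can be observed without starting any child process. The sketch below uses os.Pipe inside a single program, with a goroutine standing in for cmd1. The helper name readAll is illustrative.

```go
package main

import (
    "fmt"
    "io"
    "os"
)

// readAll writes msg into a fresh pipe, closes the write end, and reads the
// pipe until EOF. The helper name is illustrative.
func readAll(msg string) (string, error) {
    r, w, err := os.Pipe()
    if err != nil {
        return "", err
    }
    defer r.Close()

    // Writer side: in a shell this role is played by cmd1.
    go func() {
        io.WriteString(w, msg)
        // Closing the last write descriptor is what delivers EOF to the reader.
        // Without this Close, io.ReadAll below would block forever.
        w.Close()
    }()

    data, err := io.ReadAll(r) // returns only once EOF is seen
    return string(data), err
}

func main() {
    out, err := readAll("hello from the write end\n")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    fmt.Print(out)
}
```

Commenting out w.Close() reproduces the hang in miniature: the data still arrives, but the read never finishes.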

Level 3 · Building bash-lite From Scratch

Project Structure

bashlit/
├── main.go         ← REPL main loop
├── lexer/
│   └── lexer.go    ← Tokenization: string → token list
├── parser/
│   └── parser.go   ← Parsing: token list → command tree
├── executor/
│   └── executor.go ← Execution: command tree → create processes
└── builtin/
    └── builtin.go  ← Built-in commands: cd, exit, export, history

Step 1: Lexer

package lexer

import (
    "strings"
    "unicode"
)

type TokenType int

const (
    WORD       TokenType = iota
    PIPE
    REDIR_OUT
    REDIR_APP
    REDIR_IN
    REDIR_ERR
    BACKGROUND
    SEMICOLON
    AND
    OR
    LPAREN
    RPAREN
    EOF
)

type Token struct {
    Type TokenType
    Val  string
}

type Lexer struct {
    input []rune
    pos   int
}

func New(input string) *Lexer {
    return &Lexer{input: []rune(input)}
}

func (l *Lexer) Tokenize() []Token {
    var tokens []Token
    for l.pos < len(l.input) {
        ch := l.input[l.pos]

        if unicode.IsSpace(ch) {
            l.pos++
            continue
        }

        if ch == '"' || ch == '\'' {
            tokens = append(tokens, Token{WORD, l.readQuoted(ch)})
            continue
        }

        switch {
        case ch == '|' && l.peek() == '|':
            tokens = append(tokens, Token{OR, "||"})
            l.pos += 2
        case ch == '&' && l.peek() == '&':
            tokens = append(tokens, Token{AND, "&&"})
            l.pos += 2
        case ch == '>':
            if l.peek() == '>' {
                tokens = append(tokens, Token{REDIR_APP, ">>"})
                l.pos += 2
            } else {
                tokens = append(tokens, Token{REDIR_OUT, ">"})
                l.pos++
            }
        case ch == '<':
            tokens = append(tokens, Token{REDIR_IN, "<"})
            l.pos++
        case ch == '|':
            tokens = append(tokens, Token{PIPE, "|"})
            l.pos++
        case ch == '&':
            tokens = append(tokens, Token{BACKGROUND, "&"})
            l.pos++
        case ch == ';':
            tokens = append(tokens, Token{SEMICOLON, ";"})
            l.pos++
        case ch == '(':
            tokens = append(tokens, Token{LPAREN, "("})
            l.pos++
        case ch == ')':
            tokens = append(tokens, Token{RPAREN, ")"})
            l.pos++
        case ch == '2' && l.peek() == '>':
            tokens = append(tokens, Token{REDIR_ERR, "2>"})
            l.pos += 2
        default:
            tokens = append(tokens, Token{WORD, l.readWord()})
        }
    }
    tokens = append(tokens, Token{EOF, ""})
    return tokens
}

func (l *Lexer) readWord() string {
    start := l.pos
    for l.pos < len(l.input) {
        ch := l.input[l.pos]
        if unicode.IsSpace(ch) || strings.ContainsRune("|&;<>()", ch) {
            break
        }
        l.pos++
    }
    return string(l.input[start:l.pos])
}

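// readQuoted returns the contents of a quoted string without the quotes.
// Inside double quotes a backslash escapes the next character; inside single
// quotes every character is literal. An unterminated quote runs to end of input.
func (l *Lexer) readQuoted(quote rune) string {
    l.pos++ // skip opening quote
    var val []rune
    for l.pos < len(l.input) && l.input[l.pos] != quote {
        // Only skip the backslash if a character follows it; otherwise a
        // trailing backslash would push pos past the end of the input.
        if quote == '"' && l.input[l.pos] == '\\' && l.pos+1 < len(l.input) {
            l.pos++ // drop the backslash, keep the escaped character
        }
        val = append(val, l.input[l.pos])
        l.pos++
    }
    if l.pos < len(l.input) {
        l.pos++ // skip closing quote
    }
    return string(val)
}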

func (l *Lexer) peek() rune {
    if l.pos+1 < len(l.input) {
        return l.input[l.pos+1]
    }
    return 0
}

Step 2: Parser (Command Tree)

package parser

import "github.com/yourname/bashlit/lexer"

// Command represents an executable command node in the AST
type Command struct {
    Args       []string  // command name and arguments (raw words; the executor expands them)
    RedirIn    string    // < filename
    RedirOut   string    // > filename
    RedirApp   string    // >> filename
    RedirErr   string    // 2> filename
    Background bool      // & background execution
    Next       *Command  // ; next command
    Pipe       *Command  // | next command
    And        *Command  // && next command
    Or         *Command  // || next command
    Subshell   *Command  // (cmd) subshell
}

type Parser struct {
    tokens []lexer.Token
    pos    int
}

func New(tokens []lexer.Token) *Parser {
    return &Parser{tokens: tokens}
}

func (p *Parser) Parse() *Command {
    return p.parseList()
}

// parseList parses a semicolon-separated list of and-or lists
func (p *Parser) parseList() *Command {
    cmd := p.parseAndOr()
    if p.current().Type == lexer.SEMICOLON {
        p.advance()
        cmd.Next = p.parseList()
    }
    return cmd
}

// parseAndOr parses pipelines joined by && and ||. A pipeline binds tighter
// than either operator, so `a | b && c` means `(a | b) && c`. The link is
// stored on the first command of each pipeline.
func (p *Parser) parseAndOr() *Command {
    cmd := p.parsePipeline()
    switch p.current().Type {
    case lexer.AND:
        p.advance()
        cmd.And = p.parseAndOr()
    case lexer.OR:
        p.advance()
        cmd.Or = p.parseAndOr()
    }
    return cmd
}

// parsePipeline parses simple commands joined by |
func (p *Parser) parsePipeline() *Command {
    cmd := p.parseSimple()
    if p.current().Type == lexer.PIPE {
        p.advance()
        cmd.Pipe = p.parsePipeline()
    }
    return cmd
}

// parseSimple parses a single command with its redirections
func (p *Parser) parseSimple() *Command {
    cmd := &Command{}

    // Subshell: (cmd)
    if p.current().Type == lexer.LPAREN {
        p.advance()
        sub := p.parseList()
        if p.current().Type == lexer.RPAREN {
            p.advance()
        }
        cmd.Subshell = sub
        return cmd
    }

    for {
        tok := p.current()
        switch tok.Type {
        case lexer.WORD:
            cmd.Args = append(cmd.Args, tok.Val)
            p.advance()
        case lexer.REDIR_IN:
            p.advance()
            cmd.RedirIn = p.current().Val
            p.advance()
        case lexer.REDIR_OUT:
            p.advance()
            cmd.RedirOut = p.current().Val
            p.advance()
        case lexer.REDIR_APP:
            p.advance()
            cmd.RedirApp = p.current().Val
            p.advance()
        case lexer.REDIR_ERR:
            p.advance()
            cmd.RedirErr = p.current().Val
            p.advance()
        case lexer.BACKGROUND:
            cmd.Background = true
            p.advance()
            return cmd
        default:
            return cmd
        }
    }
}

func (p *Parser) current() lexer.Token {
    if p.pos < len(p.tokens) {
        return p.tokens[p.pos]
    }
    return lexer.Token{Type: lexer.EOF}
}

func (p *Parser) advance() { p.pos++ }

Step 3: Executor (Pipes and Redirections)

package executor

import (
    "fmt"
    "os"
    "os/exec"
    "path/filepath"
    "strings"

    "github.com/yourname/bashlit/builtin"
    "github.com/yourname/bashlit/parser"
)

type Executor struct {
    env     map[string]string
    history []string
}

func New() *Executor {
    return &Executor{env: make(map[string]string)}
}

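// Execute walks and executes a command tree: and-or lists separated by ';'.
// It returns the status of the last list that ran.
func (e *Executor) Execute(cmd *parser.Command) int {
    code := 0
    for ; cmd != nil; cmd = cmd.Next {
        code = e.executeAndOr(cmd)
    }
    return code
}

// executeAndOr runs a chain linked by && and || from left to right.
// A command whose condition is not met is skipped and the status carries on,
// so `false && a || b` still runs b.
func (e *Executor) executeAndOr(cmd *parser.Command) int {
    code := e.executeOne(cmd)
    for cmd.And != nil || cmd.Or != nil {
        next, run := cmd.And, code == 0
        if next == nil {
            next, run = cmd.Or, code != 0
        }
        if run {
            code = e.executeOne(next)
        }
        cmd = next
    }
    return code
}

// executeOne runs either a pipeline or a single command
func (e *Executor) executeOne(cmd *parser.Command) int {
    if cmd.Pipe != nil {
        return e.executePipeline(cmd)
    }
    return e.executeSingle(cmd)
}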

// executePipeline executes cmd1 | cmd2 | ... | cmdN
func (e *Executor) executePipeline(first *parser.Command) int {
    // Collect all commands in the pipeline
    var cmds []*parser.Command
    for cur := first; cur != nil; cur = cur.Pipe {
        cmds = append(cmds, cur)
    }
    n := len(cmds)

    // Track every file the parent opens (pipe ends and redirect targets) so
    // that all of them are closed on every return path.
    var opened []*os.File
    closeParentFiles := func() {
        for _, f := range opened {
            f.Close()
        }
        opened = nil
    }
    defer closeParentFiles()

    procs := make([]*exec.Cmd, n)
    pipeReads := make([]*os.File, n-1)
    pipeWrites := make([]*os.File, n-1)

    for i := 0; i < n-1; i++ {
        r, w, err := os.Pipe()
        if err != nil {
            fmt.Fprintf(os.Stderr, "pipe: %v\n", err)
            return 1
        }
        pipeReads[i], pipeWrites[i] = r, w
        opened = append(opened, r, w)
    }

    for i, cmd := range cmds {
        args := e.expandArgs(cmd.Args)
        if len(args) == 0 {
            fmt.Fprintln(os.Stderr, "syntax error: empty command in pipeline")
            return 2
        }
        procs[i] = exec.Command(args[0], args[1:]...)
        procs[i].Env = e.buildEnv()

        // Configure stdin
        if i == 0 {
            if cmd.RedirIn != "" {
                f, err := os.Open(cmd.RedirIn)
                if err != nil {
                    fmt.Fprintf(os.Stderr, "open %s: %v\n", cmd.RedirIn, err)
                    return 1
                }
                opened = append(opened, f)
                procs[i].Stdin = f
            } else {
                procs[i].Stdin = os.Stdin
            }
        } else {
            procs[i].Stdin = pipeReads[i-1]
        }

        // Configure stdout
        if i == n-1 {
            if out, err := e.openOutput(cmd); err != nil {
                fmt.Fprintln(os.Stderr, err)
                return 1
            } else if out != nil {
                opened = append(opened, out)
                procs[i].Stdout = out
            } else {
                procs[i].Stdout = os.Stdout
            }
        } else {
            procs[i].Stdout = pipeWrites[i]
        }
        procs[i].Stderr = os.Stderr
    }

    // Start every command. One that fails to start (for example, not found)
    // gets status 127; the rest of the pipeline still runs.
    codes := make([]int, n)
    started := make([]bool, n)
    for i, proc := range procs {
        if err := proc.Start(); err != nil {
            fmt.Fprintf(os.Stderr, "%v\n", err)
            codes[i] = 127
            continue
        }
        started[i] = true
    }

    // CRITICAL: close all pipe ends held by the parent before waiting.
    // If we don't, child processes reading from the pipe will never get EOF.
    closeParentFiles()

    for i, proc := range procs {
        if !started[i] {
            continue
        }
        if err := proc.Wait(); err != nil {
            if exitErr, ok := err.(*exec.ExitError); ok {
                codes[i] = exitErr.ExitCode()
            } else {
                codes[i] = 1
            }
        }
    }
    // Like POSIX shells, report the status of the last command
    return codes[n-1]
}

// executeSingle executes one command with I/O redirections
func (e *Executor) executeSingle(cmd *parser.Command) int {
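    if cmd.Subshell != nil {
        // Simplification: the list runs inside the shell's own process, so
        // state changes such as cd or export leak out. A real subshell forks.
        return e.Execute(cmd.Subshell)
    }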
    if len(cmd.Args) == 0 {
        return 0
    }
    args := e.expandArgs(cmd.Args)

    if code, handled := builtin.Run(args, e.env, &e.history); handled {
        return code
    }

    proc := exec.Command(args[0], args[1:]...)
    proc.Env = e.buildEnv()

    // stdin
    if cmd.RedirIn != "" {
        f, err := os.Open(cmd.RedirIn)
        if err != nil {
            fmt.Fprintf(os.Stderr, "open %s: %v\n", cmd.RedirIn, err)
            return 1
        }
        defer f.Close()
        proc.Stdin = f
    } else {
        proc.Stdin = os.Stdin
    }

    // stdout
    if out, err := e.openOutput(cmd); err != nil {
        fmt.Fprintln(os.Stderr, err)
        return 1
    } else if out != nil {
        defer out.Close()
        proc.Stdout = out
    } else {
        proc.Stdout = os.Stdout
    }

    // stderr
    if cmd.RedirErr != "" {
        f, err := os.Create(cmd.RedirErr)
        if err != nil {
            fmt.Fprintf(os.Stderr, "create %s: %v\n", cmd.RedirErr, err)
            return 1
        }
        defer f.Close()
        proc.Stderr = f
    } else {
        proc.Stderr = os.Stderr
    }

    if cmd.Background {
        if err := proc.Start(); err != nil {
            fmt.Fprintf(os.Stderr, "%s: %v\n", args[0], err)
            return 1
        }
        fmt.Printf("[%d]\n", proc.Process.Pid)
        // Reap the child when it exits so it does not linger as a zombie
        go proc.Wait()
        return 0
    }

    if err := proc.Run(); err != nil {
        if exitErr, ok := err.(*exec.ExitError); ok {
            return exitErr.ExitCode()
        }
        fmt.Fprintf(os.Stderr, "%s: command not found\n", args[0])
        return 127
    }
    return 0
}

func (e *Executor) openOutput(cmd *parser.Command) (*os.File, error) {
    if cmd.RedirOut != "" {
        return os.Create(cmd.RedirOut)
    }
    if cmd.RedirApp != "" {
        return os.OpenFile(cmd.RedirApp, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
    }
    return nil, nil
}

// expandArgs expands environment variables and globs in arguments
func (e *Executor) expandArgs(args []string) []string {
    var result []string
    for _, arg := range args {
        expanded := os.Expand(arg, func(key string) string {
            if v, ok := e.env[key]; ok {
                return v
            }
            return os.Getenv(key)
        })
        if strings.ContainsAny(expanded, "*?[") {
            if matches, err := filepath.Glob(expanded); err == nil && len(matches) > 0 {
                result = append(result, matches...)
                continue
            }
        }
        result = append(result, expanded)
    }
    return result
}

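// buildEnv returns the environment for child processes. os/exec keeps the
// last value when a key appears twice, so the shell's own variables go last
// and override the inherited ones.
func (e *Executor) buildEnv() []string {
    env := os.Environ()
    for k, v := range e.env {
        env = append(env, k+"="+v)
    }
    return env
}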
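
buildEnv combines the shell's own variables with the inherited environment. The os/exec documentation states that when Cmd.Env contains duplicate keys, only the last value for each key is used, so the order of concatenation decides which value wins. One way to avoid depending on order is to merge through a map first. The sketch below shows that approach. mergeEnv is a hypothetical helper, not part of the executor above.

```go
package main

import (
    "fmt"
    "sort"
    "strings"
)

// mergeEnv returns base with every key in overlay replaced or added.
// Each key appears exactly once, so the result does not depend on how the
// consumer resolves duplicates. The function name is illustrative.
func mergeEnv(base []string, overlay map[string]string) []string {
    merged := make(map[string]string, len(base)+len(overlay))
    for _, kv := range base {
        if k, v, ok := strings.Cut(kv, "="); ok {
            merged[k] = v
        }
    }
    for k, v := range overlay {
        merged[k] = v
    }
    out := make([]string, 0, len(merged))
    for k, v := range merged {
        out = append(out, k+"="+v)
    }
    sort.Strings(out) // deterministic order, convenient for tests and debugging
    return out
}

func main() {
    base := []string{"PATH=/usr/bin", "LANG=C"}
    overlay := map[string]string{"PATH": "/opt/bin"}
    fmt.Println(mergeEnv(base, overlay))
}
```

The cost is one extra map per command launch, which is negligible next to the fork and exec that follow.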

Step 4: Built-in Commands

package builtin

import (
    "fmt"
    "os"
    "strconv"
    "strings"
)

// Run executes a built-in command if args[0] matches one.
// Returns (exit_code, true) on match, (0, false) otherwise.
func Run(args []string, env map[string]string, history *[]string) (int, bool) {
    if len(args) == 0 {
        return 0, false
    }
    switch args[0] {
    case "cd":
        return cd(args), true
    case "exit":
        return doExit(args), true
    case "export":
        return export(args, env), true
    case "unset":
        return unset(args, env), true
    case "echo":
        return echo(args, env), true
    case "history":
        return printHistory(*history), true
    case "pwd":
        return pwd(), true
    case "type":
        return typeCmd(args), true
    }
    return 0, false
}

func cd(args []string) int {
    dir := os.Getenv("HOME")
    if len(args) >= 2 {
        dir = args[1]
    }
    if dir == "" {
        fmt.Fprintln(os.Stderr, "cd: HOME not set")
        return 1
    }
    if err := os.Chdir(dir); err != nil {
        fmt.Fprintf(os.Stderr, "cd: %v\n", err)
        return 1
    }
    return 0
}

func doExit(args []string) int {
    code := 0
    if len(args) >= 2 {
        if n, err := strconv.Atoi(args[1]); err == nil {
            code = n
        }
    }
    os.Exit(code)
    return code
}

func export(args []string, env map[string]string) int {
    for _, arg := range args[1:] {
        if idx := strings.Index(arg, "="); idx != -1 {
            env[arg[:idx]] = arg[idx+1:]
        } else if v := os.Getenv(arg); v != "" {
            env[arg] = v
        }
    }
    return 0
}

func unset(args []string, env map[string]string) int {
    for _, key := range args[1:] {
        delete(env, key)
        os.Unsetenv(key)
    }
    return 0
}

// echo prints its arguments. The executor has already expanded them, so
// expanding again here would re-interpret any $ inside a variable's value.
func echo(args []string, env map[string]string) int {
    newline := true
    start := 1
    if len(args) > 1 && args[1] == "-n" {
        newline = false
        start = 2
    }
    out := strings.Join(args[start:], " ")
    if newline {
        fmt.Println(out)
    } else {
        fmt.Print(out)
    }
    return 0
}

func printHistory(history []string) int {
    for i, cmd := range history {
        fmt.Printf("%4d  %s\n", i+1, cmd)
    }
    return 0
}

func pwd() int {
    dir, err := os.Getwd()
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        return 1
    }
    fmt.Println(dir)
    return 0
}

func typeCmd(args []string) int {
    builtins := map[string]bool{
        "cd": true, "exit": true, "export": true, "echo": true,
        "history": true, "pwd": true, "type": true, "unset": true,
    }
    for _, name := range args[1:] {
        if builtins[name] {
            fmt.Printf("%s is a shell builtin\n", name)
        } else {
            fmt.Printf("%s is an external command\n", name)
        }
    }
    return 0
}

Step 5: REPL Main Loop with History and Tab Completion

package main

import (
    "fmt"
    "os"
    "os/signal"
    "strings"
    "syscall"

    "github.com/chzyer/readline"
    "github.com/yourname/bashlit/executor"
    "github.com/yourname/bashlit/lexer"
    "github.com/yourname/bashlit/parser"
)

func main() {
    ex := executor.New()

    // Use chzyer/readline for history and tab completion
    rl, err := readline.NewEx(&readline.Config{
        Prompt:          "\033[32m$ \033[0m",
        HistoryFile:     os.Getenv("HOME") + "/.bashlit_history",
        AutoComplete:    readline.NewPrefixCompleter(buildCompleter()...),
        InterruptPrompt: "^C",
        EOFPrompt:       "exit",
    })
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    defer rl.Close()

    // Handle Ctrl+C: interrupt current input, don't exit the shell
    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, syscall.SIGINT)
    go func() {
        for range sigCh {
            // readline handles Ctrl+C internally; this goroutine prevents shell exit
        }
    }()

    for {
        // Dynamic prompt showing current directory
        dir, _ := os.Getwd()
        home := os.Getenv("HOME")
        if home != "" && strings.HasPrefix(dir, home) {
            dir = "~" + dir[len(home):]
        }
        rl.SetPrompt(fmt.Sprintf("\033[36m%s\033[0m \033[32m$\033[0m ", dir))

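        line, err := rl.Readline()
        if err == readline.ErrInterrupt {
            continue // Ctrl+C: discard the current line and show a fresh prompt
        }
        if err != nil { // io.EOF (Ctrl+D) or a terminal error
            fmt.Println("exit")
            break
        }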

        line = strings.TrimSpace(line)
        if line == "" {
            continue
        }

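        // Note: nothing here records line in the executor's history slice, so
        // the history builtin prints an empty list; readline keeps its own
        // history for arrow-key recall.
        tokens := lexer.New(line).Tokenize()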
        cmdTree := parser.New(tokens).Parse()
        ex.Execute(cmdTree)
    }
}

func buildCompleter() []readline.PrefixCompleterInterface {
    builtins := []string{"cd", "exit", "export", "echo", "history", "pwd", "type", "unset"}
    var items []readline.PrefixCompleterInterface
    for _, b := range builtins {
        items = append(items, readline.PcItem(b))
    }
    return items
}

Level 4 · Advanced Topics and Edge Cases

Job Control: fg/bg and Ctrl+Z

Real shells support job control: Ctrl+Z suspends the foreground process, bg puts it in the background to continue running, fg pulls a background job to the foreground.

Implementing job control requires:

  1. Using SysProcAttr.Setpgid = true so each pipeline group becomes an independent process group.
  2. Using tcsetpgrp() to switch terminal control from the shell's process group to the foreground process group.
  3. Using SIGTTOU/SIGTTIN signals to coordinate terminal I/O.
  4. Maintaining a job table that tracks each background job's process group ID and status.

// Put child in its own process group (required for job control)
cmd.SysProcAttr = &syscall.SysProcAttr{
    Setpgid: true,
}
if err := cmd.Start(); err != nil {
    return 1
}
pgid := cmd.Process.Pid

// Transfer terminal control to child's process group (foreground run)
// tcsetpgrp is accessed via syscall.Syscall:
// syscall.Syscall(syscall.SYS_IOCTL, uintptr(ttyFd),
//     syscall.TIOCSPGRP, uintptr(unsafe.Pointer(&pgid)))

// When child stops (SIGTSTP), wait4 returns with WIFSTOPPED
var wstatus syscall.WaitStatus
syscall.Wait4(cmd.Process.Pid, &wstatus, syscall.WUNTRACED, nil)
if wstatus.Stopped() {
    fmt.Printf("\n[1]+  Stopped  %s\n", strings.Join(cmd.Args, " "))
    // Return terminal control to shell's process group
}
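
Item 4 in the list above, the job table, is plain bookkeeping and can be prototyped without touching any system call. The following is a sketch under that assumption. The types JobState, Job, and JobTable and their methods are hypothetical names, not part of bash-lite.

```go
package main

import "fmt"

// JobState is the lifecycle state the shell tracks for each job.
type JobState int

const (
    Running JobState = iota
    Stopped
    Done
)

func (s JobState) String() string {
    return [...]string{"Running", "Stopped", "Done"}[s]
}

// Job is one pipeline started by the shell, identified by its process group.
type Job struct {
    ID      int // small number shown to the user, as in [1]
    PGID    int
    Command string
    State   JobState
}

// JobTable is a hypothetical in-memory job list, ordered by job ID.
// Its zero value is ready to use.
type JobTable struct {
    jobs   []*Job
    nextID int
}

// Add registers a newly started process group and returns its job.
func (t *JobTable) Add(pgid int, command string) *Job {
    t.nextID++
    j := &Job{ID: t.nextID, PGID: pgid, Command: command, State: Running}
    t.jobs = append(t.jobs, j)
    return j
}

// SetState records a state change for the given process group.
// It returns false if no job has that PGID.
func (t *JobTable) SetState(pgid int, s JobState) bool {
    for _, j := range t.jobs {
        if j.PGID == pgid {
            j.State = s
            return true
        }
    }
    return false
}

// Prune drops finished jobs and returns the ones that remain.
func (t *JobTable) Prune() []*Job {
    kept := t.jobs[:0]
    for _, j := range t.jobs {
        if j.State != Done {
            kept = append(kept, j)
        }
    }
    t.jobs = kept
    return kept
}

func main() {
    var table JobTable
    table.Add(4242, "sleep 100")
    table.Add(4250, "make -j8")
    table.SetState(4242, Stopped) // for example, after Ctrl+Z
    table.SetState(4250, Done)
    for _, j := range table.Prune() {
        fmt.Printf("[%d]  %v  %s\n", j.ID, j.State, j.Command)
    }
}
```

In a real shell, SetState would be driven by the wait status returned from Wait4, and fg/bg would look up the PGID here before signalling the group.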

Heredoc Support

A heredoc (Here Document) lets you embed multi-line text on the command line:

cat <<EOF
line one
line two
EOF

Implementation: recognize the <<DELIMITER syntax, read input lines until DELIMITER appears alone on a line, then pass the collected lines to the command's stdin via strings.NewReader.

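func readHeredoc(delimiter string) string {
    var lines []string
    scanner := bufio.NewScanner(os.Stdin)
    for scanner.Scan() {
        line := scanner.Text()
        if line == delimiter {
            break
        }
        lines = append(lines, line)
    }
    if len(lines) == 0 {
        return ""
    }
    // Every heredoc line, including the last, ends with a newline
    return strings.Join(lines, "\n") + "\n"
}

// In the executor, assuming the parser stored the delimiter from <<WORD in a
// Heredoc field added to Command (not part of the struct shown earlier):
if cmd.Heredoc != "" {
    content := readHeredoc(cmd.Heredoc)
    proc.Stdin = strings.NewReader(content)
}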
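
readHeredoc reads straight from os.Stdin, which makes it awkward to exercise in a test. A variant that accepts any io.Reader is sketched below. It also reports whether the delimiter was actually seen, so the caller can warn about an unterminated heredoc. The names readHeredocFrom and heredocFromString are hypothetical.

```go
package main

import (
    "bufio"
    "fmt"
    "io"
    "strings"
)

// readHeredocFrom collects lines from r until a line equal to delimiter.
// It returns the body (every line newline-terminated) and whether the
// delimiter was found before the input ended.
func readHeredocFrom(r io.Reader, delimiter string) (body string, closed bool, err error) {
    var b strings.Builder
    sc := bufio.NewScanner(r)
    for sc.Scan() {
        if sc.Text() == delimiter {
            return b.String(), true, nil
        }
        b.WriteString(sc.Text())
        b.WriteByte('\n')
    }
    return b.String(), false, sc.Err()
}

// heredocFromString is a convenience wrapper for in-memory input.
func heredocFromString(input, delimiter string) (string, bool) {
    body, closed, _ := readHeredocFrom(strings.NewReader(input), delimiter)
    return body, closed
}

func main() {
    body, closed := heredocFromString("line one\nline two\nEOF\nnot part of it\n", "EOF")
    fmt.Printf("closed=%v body=%q\n", closed, body)
}
```

In the interactive shell the reader would be whatever supplies continuation lines, for instance the line editor rather than raw os.Stdin.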

Full Variable Expansion

Complete variable expansion is far more complex than os.Expand handles:

echo ${VAR:-default}   # use "default" if VAR is unset or empty
echo ${VAR:+set}       # use "set" if VAR is set and non-empty
echo ${#VAR}           # length of VAR
echo ${VAR%suffix}     # remove shortest matching suffix
echo ${VAR##prefix}    # remove longest matching prefix
echo $?                # exit code of last command
echo $$                # PID of current shell
echo $!                # PID of most recent background process

These special variables and expansion forms require your own implementation. os.Expand only handles simple $VAR substitution. A real implementation uses a dedicated expander that recognizes these patterns via a small parser.
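
As a starting point, a few of these forms can be layered on top of os.Expand, because os.Expand hands the mapping function the raw text between ${ and } (for example VAR:-default) and passes single-character specials such as ? as the key. The sketch below handles only ${#VAR}, ${VAR:-word}, ${VAR:+word}, and $?, with no nesting. The names expandParam and expand are hypothetical.

```go
package main

import (
    "fmt"
    "os"
    "strconv"
    "strings"
    "unicode/utf8"
)

// expandParam evaluates the text found between ${ and } for three forms:
// ${#VAR}, ${VAR:-word}, and ${VAR:+word}. Anything else is treated as a
// plain variable name. lookup reports whether the variable is set.
func expandParam(expr string, lookup func(string) (string, bool)) string {
    if len(expr) > 1 && expr[0] == '#' { // ${#VAR}: length in characters
        v, _ := lookup(expr[1:])
        return strconv.Itoa(utf8.RuneCountInString(v))
    }
    if name, word, ok := strings.Cut(expr, ":-"); ok { // ${VAR:-word}
        if v, set := lookup(name); set && v != "" {
            return v
        }
        return word
    }
    if name, word, ok := strings.Cut(expr, ":+"); ok { // ${VAR:+word}
        if v, set := lookup(name); set && v != "" {
            return word
        }
        return ""
    }
    v, _ := lookup(expr)
    return v
}

// expand applies expandParam to every $NAME and ${...} in s and resolves
// $? from lastStatus. Shell variables in vars take priority over the
// process environment.
func expand(s string, vars map[string]string, lastStatus int) string {
    lookup := func(name string) (string, bool) {
        if v, ok := vars[name]; ok {
            return v, true
        }
        return os.LookupEnv(name)
    }
    return os.Expand(s, func(key string) string {
        if key == "?" {
            return strconv.Itoa(lastStatus)
        }
        return expandParam(key, lookup)
    })
}

func main() {
    vars := map[string]string{"NAME": "gopher"}
    fmt.Println(expand("hello ${NAME:-nobody}, ${#NAME} letters, status $?", vars, 0))
}
```

The suffix and prefix removal forms (%, ##) need pattern matching and are better served by a dedicated parser, as the paragraph above says.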

Globbing with filepath.Glob

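File glob patterns (*.go, src/*/*.md) expand to matching filenames before being passed to commands. filepath.Glob follows filepath.Match syntax (*, ?, and [...] character classes), so it does not understand the recursive ** of Bash's globstar option: a pattern such as src/**/*.md behaves like src/*/*.md. Our executor already does basic globbing: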

if strings.ContainsAny(expanded, "*?[") {
    if matches, err := filepath.Glob(expanded); err == nil && len(matches) > 0 {
        result = append(result, matches...)
        continue
    }
}

One important detail: if a glob pattern matches nothing, POSIX shells pass the literal pattern string to the command (as Bash does by default, unless nullglob is set). This is a common source of confusion and bugs.
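
The no-match rule is small enough to isolate in one function. The sketch below makes the fallback explicit and adds a nullglob switch for comparison. expandGlob and its nullglob parameter are hypothetical names, not part of the executor above.

```go
package main

import (
    "fmt"
    "path/filepath"
    "strings"
)

// expandGlob expands one word. With nullglob false, a pattern that matches
// nothing is passed through literally; with nullglob true it expands to
// nothing at all.
func expandGlob(word string, nullglob bool) []string {
    if !strings.ContainsAny(word, "*?[") {
        return []string{word} // no metacharacters: not a pattern
    }
    matches, err := filepath.Glob(word)
    if err != nil {
        // Malformed pattern such as "[": keep the literal word
        return []string{word}
    }
    if len(matches) == 0 {
        if nullglob {
            return nil
        }
        return []string{word}
    }
    return matches
}

func main() {
    fmt.Println(expandGlob("*.go", false))
    fmt.Println(expandGlob("/no-such-dir-for-demo/*.go", false))
    fmt.Println(expandGlob("/no-such-dir-for-demo/*.go", true))
}
```

With the default behavior, a command like ls *.xyz receives the literal string *.xyz and reports that no such file exists, which is exactly the confusing message users see in Bash.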

Comparing to mvdan/sh

mvdan/sh is a complete POSIX shell parser and interpreter in Go, with high code quality that makes it excellent for learning. Its architecture cleanly separates parsing (the syntax package), expansion (expand), and execution (interp).

Compared to our bash-lite, mvdan/sh is tens of times larger in code volume, handling every detail of the POSIX specification. But the core ideas are identical: tokenize → parse → AST → walk and execute.

Once you've built bash-lite and understood each mechanism, reading mvdan/sh's source becomes natural. You'll find that every difference between your simplified implementation and the industrial one corresponds to a real shell behavior detail — not a mysterious engineering black box.

The most important output of this exercise is not the working shell binary. It's the model in your head: you now understand that a pipeline is just a sequence of os.Pipe() calls with carefully managed file descriptor inheritance; that built-ins exist because they need to mutate the shell's own state (you can't cd in a child process and have it affect the parent); that every Ctrl+C sends a signal to a process group, not just one process.

These are the mental models that turn you from a shell user into someone who truly understands the system. And understanding the system is how you build reliable software on top of it.
