
Why Tcl?

Introduction

I use Tcl as my scripting language of choice, and recently someone asked me why. This article is an attempt to answer that question.

Ousterhout's dichotomy claims that there are two general categories of programming languages:

  • low-level, statically-typed systems languages
  • high-level, dynamic scripting languages

Systems languages are best for efficiently handling large quantities of data, implementing algorithms, and managing significant internal complexity. Scripting languages are best for gluing other programs together, exploring a problem, and extending applications. Ousterhout designed Tcl with this dichotomy in mind:

Thus I designed Tcl to make it really easy to drop down into C or C++ when you come across tasks that make more sense in a lower-level language. This way Tcl doesn't have to solve all of the world's problems. Stallman appears to prefer an approach where a single language is used for everything, but I don't know of a successful instance of this approach. Even Emacs uses substantial amounts of C internally, no?

Tcl is a very simple language that you can learn quickly and keep in your head. It is a mature platform, with a comprehensive standard library. Unlike some other scripting languages, Tcl is stable and does not require complex virtual environments or dependency management. To get started, all you need is the interpreter tclsh and the standard library tcllib.
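
For a minimal sketch of what getting started looks like, assuming tclsh and tcllib are installed and picking tcllib's json package arbitrarily as an example:

#!/usr/bin/env tclsh

# Parse a JSON string with the json package from tcllib.
package require json

set parsed [::json::json2dict {{"name": "Tcl", "born": 1988}}]
puts "[dict get $parsed name] appeared in [dict get $parsed born]"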

I reach for Tcl whenever I have a shell script that gets more than trivially complex, because shell scripts are brittle: not only is the syntax tricky—I always use shellcheck to make sure I catch all the necessary double-quotes, for example—but many of the basic tools used in shell scripting are subtly different from machine to machine—even a simple call to echo can break. By contrast, Tcl programs will work across Linux, BSD, macOS, and Windows.
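
For a small illustrative sketch of the difference (assuming a Unix-like system for the exec example): a Tcl value is passed to a command as a single word, so there is none of the word splitting or echo flag interpretation that shellcheck guards against:

#!/usr/bin/env tclsh

# The string stays one value, even with a leading "-n" that some echo
# implementations would treat as a flag.
set message "-n   spaces and a leading flag survive intact"
puts $message

# External programs receive arguments as discrete words, not a
# re-parsed command string (ls assumed to exist, i.e. a Unix-like OS).
puts [exec ls -l [pwd]]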

I also like to use Tcl to prototype and test ideas. Hal Abelson called Tcl "Lisp without a brain", and I find the strange blend of Unix and Lisp ideas in Tcl enjoyable to work with. Tcl is command-oriented and generally procedural, but it can accommodate object-oriented, functional, and metaprogramming styles.
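
For a minimal sketch of those styles, using lmap and apply for the functional flavor and the built-in TclOO package for objects:

#!/usr/bin/env tclsh

# Functional style: map an anonymous function over a list.
set square {x {expr {$x * $x}}}
puts [lmap n {1 2 3 4} {apply $square $n}]   ;# => 1 4 9 16

# Object-oriented style with TclOO, which ships with Tcl 8.6 and later.
package require TclOO

oo::class create Counter {
    variable n
    constructor {} {set n 0}
    method bump {} {incr n}
}

set c [Counter new]
$c bump
puts [$c bump]   ;# => 2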

Example programs

Recently I have been working through The Go Programming Language by Donovan and Kernighan, and thought I would give a few examples of how Tcl compares to Go, Python, and shell scripts. Each example will begin with the Go code from the book and end with a performance benchmark made using hyperfine.

Echo command-line input

To begin with, let's simply take command-line input and print it out.

Go

package main

import (
	"fmt"
	"os"
)

func main() {
	var s, sep string
	for i := 1; i < len(os.Args); i++ {
		s += sep + os.Args[i]
		sep = " "
	}
	fmt.Println(s)
}

Python

#!/usr/bin/env python

import sys

output = " ".join(sys.argv[1:])

print(output)

Tcl

#!/usr/bin/env tclsh

puts $argv

Benchmark

Command                            Mean [ms]    Min [ms]  Max [ms]  Relative
./echo "Hello, world!"             0.5 ± 0.1    0.4       1.0       1.00
echo "Hello, world!"               1.0 ± 0.1    0.8       1.6       1.88 ± 0.35
tclsh ./echo.tcl "Hello, world!"   1.9 ± 0.1    1.6       2.8       3.44 ± 0.57
python ./echo.py "Hello, world!"   10.9 ± 0.3   10.3      11.7      19.90 ± 2.94

We can see that Tcl is very straightforward to write and fast to start up.

Count duplicate lines in files

Given some files like this example.txt:

foo
hello
foo bar
hello
foo bar
bar

We produce the result:

2        foo bar
2        hello

Go

package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"strings"
)

func main() {
	counts := make(map[string]int)
	for _, filename := range os.Args[1:] {
		data, err := ioutil.ReadFile(filename)
		if err != nil {
			fmt.Fprintf(os.Stderr, "dup3: %v\n", err)
			continue
		}
		for _, line := range strings.Split(string(data), "\n") {
			counts[line]++
		}
	}
	for line, n := range counts {
		if n > 1 {
			fmt.Printf("%d\t%s\n", n, line)
		}
	}
}

Shell

#!/bin/sh

# Note: pipefail is not POSIX, so this line fails under some shells
# (dash, for example), which is itself a taste of shell brittleness.
set -e
set -o pipefail

for path in "$@"
do
	sort "$path" | uniq --count --repeated || {
		echo "^dup error^"
		continue
	}
done

Note how awkward it is to try and handle errors in this shell script.

Python

#!/usr/bin/env python

import collections
import sys

for p in sys.argv[1:]:
    try:
        with open(p) as f:
            c = collections.Counter(f.readlines())
            for k, v in c.most_common():
                if v > 1:
                    print(v, "\t", k.replace("\n", ""))
    except Exception as e:
        print("dup: ", str(e))
        continue

Tcl

#!/usr/bin/env tclsh

proc main {paths} {
    foreach path $paths {
        try {
            set channel [open $path r]
            set file [read $channel]
            close $channel
        } on error {err} {
            puts "dup: $err"
            continue
        }

        foreach line [split $file \n] {
            incr count($line)
        }
    }

    foreach {line n} [array get count] {
        if {$n > 1} {
            puts "$n\t$line"
        }
    }
}

main $argv

Benchmarks

To compare these programs, I did a few rounds of testing: first with just one small file, then with a batch of files, and finally including a 230-megabyte torture test.

First Round

Command               Mean [ms]    Min [ms]  Max [ms]  Relative
./dup 0.txt           0.6 ± 0.1    0.5       1.4       1.00
tclsh dup.tcl 0.txt   2.1 ± 0.2    1.8       2.8       3.36 ± 0.69
sh dup.sh 0.txt       3.3 ± 0.3    2.8       4.7       5.32 ± 1.15
python dup.py 0.txt   12.1 ± 0.5   10.9      13.6      19.25 ± 3.76

Second Round

Command                                           Mean [ms]    Min [ms]  Max [ms]  Relative
./dup 0.txt err 1.txt 2.txt 3.txt 4.txt           4.6 ± 0.6    3.6       7.0       1.00
python dup.py 0.txt err 1.txt 2.txt 3.txt 4.txt   17.8 ± 0.8   15.9      20.7      3.84 ± 0.50
sh dup.sh 0.txt err 1.txt 2.txt 3.txt 4.txt       30.1 ± 0.9   27.9      32.3      6.51 ± 0.82
tclsh dup.tcl 0.txt err 1.txt 2.txt 3.txt 4.txt   34.8 ± 1.4   32.0      37.9      7.53 ± 0.97

Third Round

Command                                                       Mean [ms]        Min [ms]  Max [ms]  Relative
./dup 0.txt err 1.txt 2.txt 3.txt 4.txt torture.txt           201.8 ± 7.6      190.7     214.9     1.00
python dup.py 0.txt err 1.txt 2.txt 3.txt 4.txt torture.txt   875.6 ± 28.2     848.8     940.7     4.34 ± 0.21
tclsh dup.tcl 0.txt err 1.txt 2.txt 3.txt 4.txt torture.txt   1292.4 ± 121.4   1165.9    1484.7    6.40 ± 0.65
sh dup.sh 0.txt err 1.txt 2.txt 3.txt 4.txt torture.txt       2212.5 ± 47.7    2146.4    2293.4    10.96 ± 0.47

Although I prefer the Go and Tcl approach above of implementing the solution directly, I do see the advantage of Python leaning on library code like collections.Counter when it comes to performance in this case.

GET request

We do the equivalent of running curl against a URL and print the result.

Go

package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"os"
)

func main() {
	for _, url := range os.Args[1:] {
		resp, err := http.Get(url)
		if err != nil {
			fmt.Fprintf(os.Stderr, "fetch: %v\n", err)
			os.Exit(1)
		}
		b, err := ioutil.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			fmt.Fprintf(os.Stderr, "fetch: reading %s: %v\n", url, err)
			os.Exit(1)
		}
		fmt.Printf("%s", b)
	}
}

Python

#!/usr/bin/env python

import requests
import sys

url = sys.argv[1]

try:
    response = requests.get(url)
except Exception as e:
    print("fetch: ", e)
    sys.exit(1)

print(response.text)

The requests module used here is a third-party package that required installation and a virtual environment. I don't know how to catch more granular exceptions in this example.

Tcl

#!/usr/bin/env tclsh

package require http

set url [lindex $argv 0]

proc main {url} {
    try {
        set http [::http::geturl $url]
    } on error {err} {
        puts stderr "fetch: $err"
        exit 1
    }
    
    try {
        set html [::http::data $http]
    } on error {err} {
        puts stderr "fetch: reading $url $err"
        exit 1
    }
    
    puts $html
}

main $url

Benchmarks

The first test is with a local file served with miniserve, to see how fast it is possible to complete the request. The second test is with a remote server to simulate real-world use.

First Round

Command                             Mean [ms]    Min [ms]  Max [ms]  Relative
./fetch http://[::1]:8080           1.7 ± 0.2    1.3       2.6       1.00
sh fetch.sh http://[::1]:8080       5.1 ± 0.4    4.4       7.0       3.07 ± 0.43
tclsh fetch.tcl http://[::1]:8080   10.2 ± 0.4   9.5       11.9      6.16 ± 0.76
python fetch.py http://[::1]:8080   89.5 ± 1.0   88.0      91.5      54.03 ± 6.34

Second Round

Command                              Mean [ms]      Min [ms]  Max [ms]  Relative
sh fetch.sh http://example.com       251.4 ± 22.7   235.2     289.2     1.00
./fetch http://example.com           253.5 ± 22.9   232.5     282.7     1.01 ± 0.13
tclsh fetch.tcl http://example.com   259.2 ± 21.9   240.7     292.6     1.03 ± 0.13
python fetch.py http://example.com   342.8 ± 21.9   319.7     370.7     1.36 ± 0.15

It's possible I am simply not familiar enough with Python to know how to get more granular exception handling, but I like how Tcl strikes a balance: the code is explicit without being verbose, and I don't want to fuss with third-party libraries.

Parallel GET requests

This time we do multiple parallel requests, and then calculate the size of the response and how long it took to get it in seconds. The output should look like this:

$ ./script http://example.com http://ddg.gg
0.38s     1256  http://example.com
1.08s     5999  http://ddg.gg
1.08s elapsed

Go

package main

import (
    "fmt"
    "io"
    "io/ioutil"
    "net/http"
    "os"
    "time"
)

func main() {
    start := time.Now()
    ch := make(chan string)
    for _, url := range os.Args[1:] {
        go fetch(url, ch) // start a goroutine
    }
    for range os.Args[1:] {
        fmt.Println(<-ch) // receive from channel ch
    }
    fmt.Printf("%.2fs elapsed\n", time.Since(start).Seconds())
}

func fetch(url string, ch chan<- string) {
    start := time.Now()
    resp, err := http.Get(url)
    if err != nil {
        ch <- fmt.Sprint(err) // send to channel ch
        return
    }

    nbytes, err := io.Copy(ioutil.Discard, resp.Body)
    resp.Body.Close() // avoid leaking resources
    if err != nil {
        ch <- fmt.Sprintf("while reading %s: %v", url, err)
        return
    }
    secs := time.Since(start).Seconds()
    ch <- fmt.Sprintf("%.2fs  %7d  %s", secs, nbytes, url)
}

Python

#!/usr/bin/env python

import requests
import sys
import time

from concurrent.futures import ThreadPoolExecutor

def fetch_url(data):
    index, url = data
    try:
        r = requests.get(url, timeout=10)
    except requests.exceptions.ConnectTimeout:
        return

    time_taken = round(time.time()-start, 2)
    print('{}s \t{}\t{}'.format(time_taken, len(r.text), url))

start = time.time()
with ThreadPoolExecutor(max_workers=10) as runner:
    for _ in runner.map(fetch_url, enumerate(sys.argv[1:])):
        pass

    runner.shutdown()

time_taken = round(time.time()-start, 2)
print('{}s elapsed'.format(time_taken))

Tcl

#!/usr/bin/env tclsh

package require http

set done false
set requests 0
set responses 0
set start [clock milliseconds]

proc main {urls} {
    global done start
    
    foreach url $urls {
        fetch $url
    }

    vwait done

    set stop [clock milliseconds]
    puts [format {%0.2fs elapsed} [expr {($stop - $start) / 1000.0}]]
}

proc fetch {url} {
    global requests

    try {
        ::http::geturl $url -command callback
        incr requests
    } on error {err} {
        puts stderr "fetch: $err"
        exit 1
    }
}

proc callback {token} {
    global done requests responses start

    # The http state array for this request holds the original URL.
    upvar #0 $token state

    try {
        set data [::http::data $token]
    } on error {err} {
        puts stderr "fetch: reading $state(url): $err"
        exit 1
    }

    set stop [clock milliseconds]
    set elapsed [format {%0.2fs} [expr {($stop - $start) / 1000.0}]]
    set length [string length $data]

    puts "$elapsed\t$length\t$state(url)"

    ::http::cleanup $token
    incr responses

    if {$requests == $responses} {
        set done true
    }
}

main $argv

Note that in the above code, expr is a command that interprets a domain-specific language for mathematical expressions. Tcl allows you to write your own DSLs as well.
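
For a minimal sketch of such a DSL, here is a homemade retry control structure built from nothing more than a proc and uplevel:

#!/usr/bin/env tclsh

# retry evaluates a block in the caller's scope until it succeeds or
# the attempts run out.
proc retry {times body} {
    for {set i 0} {$i < $times} {incr i} {
        if {![catch {uplevel 1 $body} result]} {
            return $result
        }
    }
    error "retry: failed after $times attempts"
}

retry 3 {
    puts "trying..."
    expr {1 + 1}
}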

Benchmark

Each command was run against the same five URLs: http://example.com, http://github.com, http://127.0.0.1:8080, http://127.0.0.1:1222, and http://127.0.0.1:1223.

Command                 Mean [ms]      Min [ms]  Max [ms]  Relative
tclsh ./fetchall.tcl    245.1 ± 29.0   220.6     336.6     1.00
./fetchall              253.6 ± 64.4   213.3     444.4     1.03 ± 0.29
python ./fetchall.py    348.5 ± 27.0   321.9     400.8     1.42 ± 0.20

I was surprised to see Tcl be faster than Go on average. Perhaps this is due to network variance, but my home connection is pretty stable. In any case, all three programs perform acceptably but work differently. Go of course used its version of light-weight processes, goroutines. The Python code was multi-threaded. Tcl supports multiple threads and coroutines, but for this task it seemed best to use the event loop and callbacks. The Tcl program is longer, but the logic of how it works is all on the page—Python again required third-party packages and a virtual environment to get working.
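
As a minimal sketch of that event-loop style: callbacks are scheduled with after, and vwait runs the loop until a completion flag is set.

#!/usr/bin/env tclsh

# Schedule work on the event loop, then block in vwait until the
# callbacks flip the done flag.
set done false

proc tick {n} {
    puts "tick $n"
    if {$n >= 3} {
        set ::done true
    } else {
        after 100 [list tick [incr n]]
    }
}

after 100 [list tick 1]
vwait done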

Conclusion

Hopefully the above examples give a sense of Tcl. In general, I think the sales pitch for Tcl is that it is simple, fast, and expressive. Tcl has been extended for automating interactions with Expect and writing cross-platform GUI applications with Tk. To learn more, check out these resources:
