@tbl3rd
Last active April 7, 2022 13:41
Codes Without Commas for SoftEng Meetup
"We will waste 45 minutes writing a do-nothing program in a
60-year-old programming language. The program explains a
mistake made by 2 physicists in 1957. (And don't miss the
lame joke at the end.) Luncheon will be served as usual."
(ns commas
  "CODES WITHOUT COMMAS -- with apologies to F.H.C. Crick et al")
(comment "http://www.pnas.org/content/43/5/416" is the paper.
         The year is 1957. What do we know now?)
(def nucleotides ["Adenine" "Cytosine" "Guanine" "Thymine"])
nucleotides ; => ["Adenine" "Cytosine" "Guanine" "Thymine"]
(count nucleotides) ; => 4
(def essential {:F "phenylalanine"
                :H "histidine"
                :I "isoleucine"
                :K "lysine"
                :L "leucine"
                :M "methionine"
                :T "threonine"
                :V "valine"
                :W "tryptophan"})
(def conditional {:C "cysteine"
                  :G "glycine"
                  :P "proline"
                  :Q "glutamine"
                  :R "arginine"
                  :Y "tyrosine"})
(def dispensable {:A "alanine"
                  :D "aspartic acid"
                  :E "glutamic acid"
                  :N "asparagine"
                  :S "serine"})
(def amino-acids (merge essential conditional dispensable))
(count amino-acids) ; => 20
"The problem of how a sequence of four things (nucleotides)
can determine a sequence of twenty things (amino acids)
is known as the 'coding' problem."
(comment Now ... given that. What can we find out?)
;; BREAK
"Begin digression: Model base pairing with a simple Clojure map."
(str "Model" \space "base" \space "pairing" \.) ; => "Model base pairing."
(def sentence ["Model" \space "base" \space "pairing" \.])
sentence ; => ["Model" \space "base" \space "pairing" \.]
(apply str sentence) ; => "Model base pairing."
((partial apply str) sentence) ; => "Model base pairing."
(def string (partial apply str))
(string sentence) ; => "Model base pairing."
nucleotides ; => ["Adenine" "Cytosine" "Guanine" "Thymine"]
(first nucleotides) ; => "Adenine"
(first (second nucleotides)) ; => \C
(map first nucleotides) ; => (\A \C \G \T)
(def ACGT (string (map first nucleotides)))
ACGT ; => "ACGT"
(string (reverse ACGT)) ; => "TGCA"
(def pair (zipmap ACGT (reverse ACGT)))
pair ; => {\A \T \C \G \G \C \T \A}
(map pair ACGT) ; => (\T \G \C \A)
[(rand-nth ACGT) (rand-nth ACGT)] ; => [\G \A]
(def strand (repeatedly (partial rand-nth ACGT)))
(string (take 23 strand)) ; => "TCTGTCCCCGTAGACAAGACGTT"
(string (take 23 (map pair strand))) ; => "AGACAGGGGCATCTGTTCTGCAA"
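;; A hedged aside and not part of the original talk: pairing a strand
;; twice should give back the strand because the map swaps A with T
;; and C with G. The names `pairs-back?` and `partner` are hypothetical
;; and bound locally so this sketch stands on its own.
(def pairs-back?
  (fn [s] (let [bases "ACGT"
                partner (zipmap bases (reverse bases))]
            (= (seq s) (map partner (map partner s))))))
(pairs-back? "GATTACA") ; => true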
"End digression: Now where were we?" "We were counting things."
(count ACGT) ; => 4
(count amino-acids) ; => 20
(count (for [x ACGT] [x])) ; => 4
(count (for [x ACGT y ACGT] [x y])) ; => 16
(count (for [x ACGT y ACGT z ACGT] [x y z])) ; => 64
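;; A hedged aside and not part of the original talk: the same counts
;; follow from arithmetic alone assuming each position is chosen
;; independently from 4 letters. The name `words` is hypothetical.
(def words (fn [n] (reduce * (repeat n (count "ACGT")))))
(map words [1 2 3]) ; => (4 16 64)
;; The shortest word length that can cover 20 amino acids:
(first (filter (fn [n] (<= 20 (words n))) (rest (range)))) ; => 3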
;; BREAK
"Not enough selections taking nucleotides 2 at a time to cover the
aminos but more than enough taken 3 at a time. Call them triples."
(def triples (map string (for [x ACGT y ACGT z ACGT] [x y z])))
triples ; => ("AAA" "AAC" "AAG" "AAT" "ACA" "ACC" "ACG" "ACT"
;; "AGA" "AGC" "AGG" "AGT" "ATA" "ATC" "ATG" "ATT"
;; "CAA" "CAC" "CAG" "CAT" "CCA" "CCC" "CCG" "CCT"
;; "CGA" "CGC" "CGG" "CGT" "CTA" "CTC" "CTG" "CTT"
;; "GAA" "GAC" "GAG" "GAT" "GCA" "GCC" "GCG" "GCT"
;; "GGA" "GGC" "GGG" "GGT" "GTA" "GTC" "GTG" "GTT"
;; "TAA" "TAC" "TAG" "TAT" "TCA" "TCC" "TCG" "TCT"
;; "TGA" "TGC" "TGG" "TGT" "TTA" "TTC" "TTG" "TTT")
(def rotations
  (fn [s] (let [n (count s)]
            (map string (take n (partition n 1 (cycle s)))))))
(rotations ACGT) ; => ("ACGT" "CGTA" "GTAC" "TACG")
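;; A hedged look inside `rotations` (an aside and not from the talk):
;; `cycle` repeats the letters forever and `partition n 1` slides a
;; window of n letters along one step at a time. After n windows the
;; pattern repeats which is why `rotations` takes only n of them.
;; The name `windows` is hypothetical.
(def windows (fn [s k] (take k (partition (count s) 1 (cycle s)))))
(windows "ACG" 4) ; => ((\A \C \G) (\C \G \A) (\G \A \C) (\A \C \G))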
(take 7 (map rotations triples)) ; => (("AAA" "AAA" "AAA")
; ("AAC" "ACA" "CAA")
; ("AAG" "AGA" "GAA")
; ("AAT" "ATA" "TAA")
; ("ACA" "CAA" "AAC")
; ("ACC" "CCA" "CAC")
; ("ACG" "CGA" "GAC"))
(set (str ACGT ACGT ACGT)) ; => #{\A \C \G \T}
[((set ACGT) \T) ((set ACGT) \Z)] ; => [\T nil]
(def codons (set (map (comp set rotations) triples)))
(count codons) ; => 24
(take 4 codons) ; => (#{"ACC" "CCA" "CAC"}
; #{"GGG"}
; #{"TTT"}
; #{"GCC" "CGC" "CCG"})
(map first (group-by count codons)) ; => (3 1)
(def sense-codons ((group-by count codons) 3))
(count sense-codons) ; => 20 Eureka!
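;; Why 20? A hedged check by arithmetic (an aside and not from the talk):
;; the 4 triples made of one repeated letter are their own rotations
;; and the remaining 60 triples fall into classes of 3 rotations each.
;; The name `classes-of-three` is hypothetical.
(def classes-of-three (/ (- (* 4 4 4) 4) 3))
classes-of-three ; => 20
(+ classes-of-three 4) ; => 24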
;; BREAK
(take 5 sense-codons) ; => (#{"ACC" "CCA" "CAC"}
; #{"GCC" "CGC" "CCG"}
; #{"CAA" "ACA" "AAC"}
; #{"CTC" "CCT" "TCC"}
; #{"AGC" "CAG" "GCA"})
(def sense (map first sense-codons))
(count sense) ; => 20
(take 7 sense) ; => ("ACC" "GCC" "CAA" "CTC" "AGC" "TAT" "GAG")
(def nonsense (remove (set sense) (set triples)))
(count nonsense) ; => 44
(def code (zipmap sense (keys amino-acids)))
(sort-by second code) ; => (["AGC" :A]
; ["TAA" :C]
; ["TGT" :D]
; ["TCA" :E]
; ["TAT" :F]
; ["TAG" :G]
; ["CTG" :H]
; ["CAA" :I]
; ["ACG" :K]
; ["ACC" :L]
; ["GCC" :M]
; ["GAA" :N]
; ["CTA" :P]
; ["GAT" :Q]
; ["CTC" :R]
; ["TCG" :S]
; ["CTT" :T]
; ["GTG" :V]
; ["GAG" :W]
; ["GGC" :Y])
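;; A hedged aside and not part of the original talk: taking `first`
;; from each rotation class reproduces the count of 20 but does not by
;; itself make the code comma-free. The sketch below assumes the usual
;; definition: reading any two sense words back to back must never
;; show a sense word out of frame. The names `comma-free?` and
;; `overlaps` are hypothetical.
(def comma-free?
  (fn [words]
    (let [sense (set words)
          overlaps (fn [a b] (map (partial apply str)
                                  (rest (take 3 (partition 3 1 (str a b))))))]
      (not-any? sense (for [a sense b sense o (overlaps a b)] o)))))
(comma-free? ["ACC" "GCC"]) ; => true
;; All three words below appear in the listing above.
(comma-free? ["GGC" "GCC" "CTA"]) ; => false because GGC then CTA shows GCC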
(map string (take 3 (partition 3 1 strand))) ; => ("TCT" "CTG" "TGT")
(def amino-keys (map (comp code string) (partition 3 1 strand)))
(take 13 amino-keys) ; => (nil :H :D nil nil nil nil nil nil nil :G nil nil)
(def aminos (keep amino-acids amino-keys))
(take 4 aminos) ; => ("histidine" "aspartic acid" "glycine" "isoleucine")
"This is fun programming but (as of 1961) bad biology."
"Biology is not physics and certainly not engineering."
"Life’s processes are rarely efficient or mistake free."
"(Crick and Gamow were both degreed in physics.)"
"What is this program about? Can you follow it?"
"This programming language is as old as this paper."
"It was designed to help machines help us think."
"It is almost the simplest language that can work."
"So the program is about the world more than itself."
"A programming language can be a tool of discovery"
"and not just the means of commanding a computer."
"You can develop and explore a program while it runs."
"And see the value of each expression as it’s evaluated."
"They appear here literally as the computer prints them."
"('Notebook systems' like Jupyter return to this idea.)"
"How do you think with your programming language?"
"Does it aid or hinder your understanding of the world?"
"(-: BTW: Did you notice any commas in this code? :-)"