This is a continuation of the discussion on this gist.
I thought of a pretty good analogy for what constructing ECMAScript regexes for functions on the natural numbers is like: it's much like doing straightedge-and-compass geometric construction. There are many things that are impossible to construct with those tools, but that hasn't stopped mathematicians from devoting much time to the subject.
I suspect that only a strict subset of the primitive recursive functions is possible, but at this point I still can't prove it. I'm also pretty sure that with unbounded scratch space, all primitive recursive functions become possible (this would probably be easy to prove).
Sorry to intrude, but there's still one loose end: the log2 golf challenge. I have a length-66 solution, golfed down from 108 and then 73 (md5sum 8cf2ef821f9428e4a611b457ccc57344). What's the length of your b31a12cf9a9982d7d3204d198f87ba4f solution?
Hello, Grimy, and welcome!
Wow, that is awesome. My b31a12cf9a9982d7d3204d198f87ba4f is 69 bytes, so you have beaten me! Very well done. I'd like to exchange our regexes, but privately, to maintain the viability of this challenge for a little longer. I'd like to see your earlier versions too!
I recently forked the log2 regex into:
- A version that strictly rounds down, which is 93 bytes with md5sum cef4f7a85f9703825c97b01012358d4d, and like the original, returns no-match for zero
- A base 10 version that returns the number of digits, which is 112 bytes with md5sum cc065c2c0e212933c0bcb1e3e9717a86 (it's the same as strictly rounded-down logarithm plus one – except with an input of zero, for which both regexes return no-match, because you can't return 1 with an input of 0)
- Logarithm in base N (passed as two comma-delimited unary numbers, base first, and including the base and comma in the returned result) at 165 bytes with md5sum df2b6dab8eb3fdf9a6c2d2e4e52257f3 (have not made a strictly-rounding-down or digit-counting version of this yet). A sketch of what each of these variants computes follows this list.
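For reference only (this says nothing about the regex techniques involved), here is a plain-JavaScript sketch of the functions these three forks are meant to compute, which may be handy for checking a candidate regex's output. The regexes themselves take their input in unary and return the result as the length of their match; the function names here are my own.

```js
// Strict floor(log2(n)); the regex returns no-match for n = 0.
function floorLog2(n) {
  if (n < 1) return null;                          // no-match
  let log = 0;
  while (n >= 2) { n = Math.floor(n / 2); log++; }
  return log;
}

// Number of base-10 digits, i.e. floor(log10(n)) + 1; again no-match for n = 0,
// since a regex can't return 1 on an input of 0.
function digitCount10(n) {
  return n < 1 ? null : String(n).length;
}

// Floor logarithm of n in base b, for input given as "b,n" in unary.
// If I read the description above right, the regex's match also includes the
// base and the comma, so its total match length would be b + 1 + floorLogBase(b, n).
function floorLogBase(b, n) {
  if (b < 2 || n < 1) return null;
  let log = 0;
  while (n >= b) { n = Math.floor(n / b); log++; }
  return log;
}
```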
Tying up another loose end, I also wrote a regex to take a single unary number as input, and divide by sqrt(2) (rounding down) and return the result as a match, using an algorithm very similar to the one I sketched above. It uses molecular lookahead. The first fully correct version was 1159 bytes and I got it down to 849. I plan on porting it to pure ECMAScript, but I expect it will roughly double in length.
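This is not the regex's algorithm, just an exact way to check its output: floor(n/√2) equals the integer square root of floor(n²/2), so a reference value can be computed entirely in integer arithmetic (which a regex is stuck with anyway). A minimal sketch, with names of my own choosing:

```js
// Integer square root by simple search; fine for the sizes a unary input allows.
function isqrt(m) {
  let r = 0;
  while ((r + 1) * (r + 1) <= m) r++;
  return r;
}

// floor(n / sqrt(2)) = floor(sqrt(n^2 / 2)) = isqrt(floor(n^2 / 2))
// e.g. floorDivSqrt2(10) === 7
function floorDivSqrt2(n) {
  return isqrt(Math.floor(n * n / 2));
}
```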
Thanks!
I am now down to 57 chars on the log2 regex (md5sum c58bef1340954f1bbaf02c5790f33a01, --npcg+ only, --neam independent). I'll gladly share it if you want.
I haven't yet tried applying this algorithm to the base-10 or rounding-down versions of the task, but it should give significant improvements.
EDIT:
- 76 chars for the version that strictly rounds down (or 74 if I’m allowed to return 0 instead of no match for a 0 input).
- 97 chars for the base 10 version
Wow! 57 chars. That's incredible. I don't want you to just show it to me right away, though; I'd like some time to try to figure it out myself. I can't work on it right away, but I plan to. I would indeed like to see them, please!
I did have an idea for making my floor-logarithm regexes much shorter. It did a little worse than the technique I'd already used, and frankly I'm very happy about that: the technique I'd already used is much more interesting, and the one I just tried would have trivialized it if it had done better.
And I'm also quite impressed to see that 76-57=19, a smaller difference than my 93-69=24.
This discussion has been continued in the Mathematical Regexes Stack Exchange chat room.
So it turns out that even back when I wrote this, there was already a paper that superseded it: On the density of abundant numbers, Kobayashi (2010). It just didn't show up in my searches at the time.
This is an excellent paper. It goes into depth on things I'd wished were in Deléglise's 1998 paper. It derives the time complexity of computing n digits of the abundant number density to be O(e^(e^n)). It includes the full C++ source code of the program that computes bounds on the density (and describes how it works); I tried it, and it works! I was able to compute bounds narrowed down to [0.2476167, 0.2476528] in 245 minutes, single-threaded, by passing it N=200000000000000 on the command line. That's weird, because it's almost as narrow as the new bounds established by the paper, [0.2476171, 0.2476475]. I would have thought it would take a lot more computing time to reach that.
Edit: After 23.54 hours of computation with N = 5000000000000000, the program gave a result of [0.24761740, 0.24764507]. That's even tighter than the bounds stated in the paper.
Edit 2: After 40.48 hours of computation with N = 15000000000000000, it gave a result of [0.24761748, 0.24764422].
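Purely for intuition, and nothing to do with Kobayashi's rigorous machinery: a brute-force count of the abundant numbers (those n with sigma(n) > 2n) up to some limit gives a rough figure to hold up against the bounds above, though it converges far too slowly to be useful for serious bounds. A throwaway sketch:

```js
// Count n <= LIMIT with sigma(n) > 2n using a divisor-sum sieve, then print
// the proportion as a crude estimate of the density of abundant numbers.
const LIMIT = 1000000;                      // arbitrary; raise it for a better estimate
const sigma = new Float64Array(LIMIT + 1);  // sigma[n] will hold the sum of divisors of n
for (let d = 1; d <= LIMIT; d++)
  for (let m = d; m <= LIMIT; m += d)
    sigma[m] += d;                          // d divides every multiple m of d
let abundant = 0;
for (let n = 1; n <= LIMIT; n++)
  if (sigma[n] > 2 * n) abundant++;
console.log(abundant, '/', LIMIT, '=', abundant / LIMIT);
```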
FWIW, it also gets the closed-open intervals right, stating them as [a, b). It does have at least one error though, on page 6: I think he meant "any finite union of multiple periodic sets is a periodic set".
I do wish the paper discussed what limitations the use of native floating-point data types (80-bit long double) places on the best bounds achievable by the program. There is some discussion on the matter starting on page 112, but if I understand correctly, it only describes how one particular precision problem was solved. In scanning through the paper, I didn't find anything explicitly discussing limitations on precision that still apply, so I don't know whether there's a point at which the program will start narrowing the bounds without rigorous mathematical justification.
The same mathematician also co-wrote an article in 2014 about the error term in the count of abundant numbers. This pertains to another thing I said earlier here.
The function they came up with, though, is very pessimistic: it converges extremely slowly, much more slowly than the density itself seems to converge. It is a start, though, I suppose. Unfortunately, the article itself is behind a paywall.