Skip to content

Instantly share code, notes, and snippets.

@bgschiller
Created November 15, 2013 19:27
Show Gist options
  • Save bgschiller/7490164 to your computer and use it in GitHub Desktop.
Save bgschiller/7490164 to your computer and use it in GitHub Desktop.
Python Sieve of Eratosthenes class
{
"metadata": {
"name": "Sieve of Eratosthenes"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": "#The Sieve of Eratosthenes\nFrom Wikipedia: In mathematics, the sieve of Eratosthenes (Greek: \u03ba\u03cc\u03c3\u03ba\u03b9\u03bd\u03bf\u03bd \u1f18\u03c1\u03b1\u03c4\u03bf\u03c3\u03b8\u03ad\u03bd\u03bf\u03c5\u03c2), one of a number of prime number sieves, is a simple, ancient algorithm for finding all prime numbers up to any given limit. It does so by iteratively marking as composite (i.e. not prime) the multiples of each prime, starting with the multiples of 2.\n\n##Algorithm\n\n 1. Create a list of numbers 2..n\n 2. Initially, let p=2, the first prime.\n 3. Starting from 2*p, count up in increments of p\n 3.1 Strike these numbers from the list.\n 4. Find the first number greater than p still in the list.\n 4.1 If no such number exists, stop.\n 4.2 Otherwise, this is the new p."
},
{
"cell_type": "code",
"collapsed": false,
"input": "def sieve1(limit):\n '''given an upper limit, <limit>, return a list of all primes below\n <limit>'''\n candidates = range(3,limit,2)\n primes = [2]\n while len(candidates) > 0:\n p = candidates[0]\n #We know p is a prime, because it hasn't been crossed out yet.\n primes.append(p)\n multiples = range(p,limit,p)\n #multiples all have a factor of p, so they can't be prime!\n for m in multiples:\n if m in candidates:\n candidates.remove(m)\n return primes",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": "sieves = [sieve1, sieve2]\nfor sieve in sieves:\n print 'evaluating', sieve.__name__\n %timeit sieve(10000)",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": "evaluating sieve1\n1 loops, best of 3: 1.3 s per loop"
},
{
"output_type": "stream",
"stream": "stdout",
"text": "\nevaluating sieve2\n100 loops, best of 3: 2.51 ms per loop"
},
{
"output_type": "stream",
"stream": "stdout",
"text": "\n"
}
],
"prompt_number": 56
},
{
"cell_type": "code",
"collapsed": false,
"input": "def sieve2(limit):\n candidates = [True] * limit\n candidates[0] = False\n candidates[1] = False\n primes = []\n for p, isprime in enumerate(candidates):\n if isprime:\n primes.append(p)\n for multiple in xrange(p*p,limit,p):\n #if x is a multiple of p (x==r*p), and x < p*p,\n #Then r < p, so we caught it when we iterated over\n #multiples of r\n candidates[multiple] = False\n return primes",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 55
},
{
"cell_type": "code",
"collapsed": false,
"input": "_ = sieve2(5000)",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 64
},
{
"cell_type": "code",
"collapsed": false,
"input": "one_to_ten = range(1,11)\nxone_to_ten = xrange(1,11)",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 48
},
{
"cell_type": "code",
"collapsed": false,
"input": "for num in xone_to_ten:\n print num",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n"
}
],
"prompt_number": 51
},
{
"cell_type": "markdown",
"metadata": {},
"source": "##Memoized Version\n\nThis one may be a bit too much to stomach, but I've included it because I like it. After running tests, it turns out that it's not as fast as sieve2. This is very surprising to me! This one only has to do the work once, and then spit out the same answer every other time."
},
{
"cell_type": "code",
"collapsed": false,
"input": "import itertools\nfrom bisect import bisect_left\n\nclass MemoPrimes(object):\n def __init__(self, limit=None):\n def custom_range(start=0,stop=None,step=1):\n '''xrange in python 2.7 fails on numbers larger than C longs.\n we write a custom version'''\n if stop is None:\n #handle single argument case. ugly...\n stop = start\n start = 0\n i = start\n while i < stop:\n yield i\n i += step\n \n self.range = custom_range\n self.primes = [2,3]\n if limit is not None:\n self.primes_up_to(limit)\n \n def primes_up_to(self,limit):\n '''Return all primes up to, but not including, limit.\n Calling this function is equivalent to calling one of the \n sieves. (but probably faster)'''\n self.extend(limit)\n def below_limit(x):\n return x < limit\n return list(itertools.takewhile(below_limit, self.primes))\n \n def get_next_prime(self,num=1):\n '''retrieve the next <num> primes. <num> defaults to 1.\n Calling this function repeatedly is much slower than calling \n MemoPrimes.extend once. You should only use this method if you\n don't know where the next prime will appear, not merely to extend \n the list.'''\n candidate = self.primes[-1] + 2 #next odd after largest known prime\n while num > 0:\n if self.isprime(candidate):\n self.primes.append(candidate)\n candidate += 2 #next odd after largest (newly) known prime\n num -= 1\n \n def isprime(self,x):\n if x <= self.primes[-1]:\n i = bisect_left(self.primes, x) #bisect_left returns the index to place x in a sorted list\n return (i != len(self.primes) and self.primes[i] == x) #if the value at that index IS x, prime\n self.extend(int(x**0.5 + 1))\n assert x <= self.primes[-1]\n return self.isprime(x)\n \n def extend(self,limit):\n\n '''Extend the list to include all primes up to, but not including, limit'''\n if self.primes[-1] >= limit:\n return #no work to do\n candidates = set(self.range(self.primes[-1] + 2, limit,2))\n \n for p in self.primes:\n for multiple in self.range(p*p,limit,p):\n candidates.discard(multiple)\n \n while len(candidates) > 0:\n p = min(candidates)\n candidates.discard(p)\n self.primes.append(p)\n for m in self.range(p*p,limit,p):\n candidates.discard(multiple)\n ",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 101
},
{
"cell_type": "code",
"collapsed": false,
"input": "mp = MemoPrimes()\n_ = mp.primes_up_to(10000)",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 102
},
{
"cell_type": "code",
"collapsed": false,
"input": "sieves = [sieve2,mp.primes_up_to]\nfor sieve in sieves:\n print 'evaluating', sieve.__name__\n %timeit -n 3 sieve(100000)",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": "evaluating sieve2\n3 loops, best of 3: 26.2 ms per loop"
},
{
"output_type": "stream",
"stream": "stdout",
"text": "\nevaluating primes_up_to\n3 loops, best of 3: 48.6 ms per loop"
},
{
"output_type": "stream",
"stream": "stdout",
"text": "\n"
}
],
"prompt_number": 105
},
{
"cell_type": "code",
"collapsed": false,
"input": "cProfile.run('mp.primes_up_to(10000)')",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": " 3339 function calls in 0.002 seconds\n\n Ordered by: standard name\n\n ncalls tottime percall cumtime percall filename:lineno(function)\n 1 0.002 0.002 0.002 0.002 <ipython-input-101-0a5c7f272c64>:23(primes_up_to)\n 3335 0.001 0.000 0.001 0.000 <ipython-input-101-0a5c7f272c64>:28(below_limit)\n 1 0.000 0.000 0.000 0.000 <ipython-input-101-0a5c7f272c64>:53(extend)\n 1 0.000 0.000 0.002 0.002 <string>:1(<module>)\n 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n\n\n"
}
],
"prompt_number": 104
},
{
"cell_type": "code",
"collapsed": false,
"input": "",
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment