Assumptions:
- PBKDF2 with HMAC-SHA1 must execute SHA1 iterations twice to generate a 256 bit key (digest size is < derived key size)
- Each HMAC-SHA1 operation executes two internal SHA1 operations.
- Thus, 4000 iterations of PBKDF2 using HMAC-SHA1 requires 16,000 SHA1 operations
- PBKDF2 with HMAC-SHA256 executes 1 SHA1 operations per iteration to generate a 256 bit key (digest size = derived key size)
- Each HMAC-SHA256 operation executes two internal SHA256 operations.
- Thus, 500 iterations of PBKDF2 using HMAC-SHA256 requires 1000 SHA-256 operations (1/16th the number of operations)
- SHA256 is slower than SHA1, though it's difficult to get exact benchmarks. For ease of caclulations let us assume that on a GPU, SHA1 is twice as fast as SHA256 (potential sources: http://hashcat.net/oclhashcat-lite/, http://www.insidepro.com/eng/egb.shtml)
Conclusion:
Assuming a GPU could calculate SHA1 twice as fast as SHA256 (i.e. SHA256 is 50% slower than SHA1), PBKDF2 with HMAC-SHA256 and 500 iterations would actually be much weaker, and 8x faster to attack, than PBKDF2 with HMAC-SHA1 and 4,000 iterations.