Intuition: importance sampling
Let's pretend we have a 1D function f defined for x in [0,1] that we don't quite know, but we have a feeling for the general shape.
For x < 0.5 it is often small, below 0.1. For x > 0.5 it is higher, often close to 1.0.
Now we can break f into two functions on A = [0,0.5] and B = [0.5,1] and integrate them one at a time.
The sum of the two will be the integral over the entire range.
How many samples should we take for each of those two?
A key observation here is that if we get the integral on A 5% wrong, that matters much less than if we get the integral on B 5% wrong. Why? This is due to the fact that the integral over A will be much smaller than the integral over B, so it will contribute less to the full integral. So we should spend more of our samples on B. If we place a different number of samples in A and B we must compensate, since they will have a different N-factor that we divide with when we do the averaging.
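Here is a minimal sketch of that two-region idea in Python; the integrand f and the sample counts are made-up stand-ins for the situation described above.

```python
import random

def f(x):
    # made-up integrand matching the description: small below 0.5, close to 1.0 above
    return 0.1 * x if x < 0.5 else 0.9 + 0.1 * x

def integrate_region(f, lo, hi, n):
    # average n uniform samples of f over [lo, hi]; note that we divide by
    # this region's own n and scale by the region length
    total = sum(f(random.uniform(lo, hi)) for _ in range(n))
    return (hi - lo) * total / n

# few samples on A (small contribution), many on B (large contribution)
estimate = integrate_region(f, 0.0, 0.5, 16) + integrate_region(f, 0.5, 1.0, 240)
print(estimate)  # converges to the true integral, which is 0.5 for this f
```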
Now we can divide it even further into, say, 1000 pieces and apply the same reasoning.
Or we can use IS, where we do the same thing in a continuous fashion.
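A sketch of the continuous version, assuming we can sample x from a pdf p(x) that roughly follows our guess about the shape of f (the particular two-level pdf below is a made-up choice):

```python
import random

def f(x):
    # same made-up integrand as in the previous sketch
    return 0.1 * x if x < 0.5 else 0.9 + 0.1 * x

def sample_p():
    # pdf that roughly follows our guess about f: put 10% of the samples
    # in [0, 0.5) and 90% in [0.5, 1]; density = region probability / region width
    if random.random() < 0.1:
        x = random.uniform(0.0, 0.5)
        return x, 0.1 / 0.5
    x = random.uniform(0.5, 1.0)
    return x, 0.9 / 0.5

def importance_sample(n):
    # standard IS estimator: average of f(x) / p(x)
    total = 0.0
    for _ in range(n):
        x, pdf = sample_p()
        total += f(x) / pdf
    return total / n

print(importance_sample(1024))  # converges to the integral of f, 0.5 here
```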
The unknown factor here, with regards to path tracing, is that we don't know the shadow factor, and if there is a lot of shadowing in the B range then the contribution from the A range might dominate.
If so we will get MORE noise due to IS, since we didn't handle A properly.
That's pretty much it as far as intuition goes.
Now, if we know that our function f(x) <= g(x)*h(x) but we can only importance sample according to g(x) OR h(x), we can use multiple importance sampling to get the best of both worlds.
If we somehow manage to sample according to g(x)*h(x) directly, that is instead called product sampling.
I think the terms here are a bit loose, since it seems to be called product sampling even if shadows aren't part of the product.
Intro
Let's continue with some notes on multiple importance sampling. My understanding is not as rock solid here as in the importance sampling case. The primary source I learned this from is the MIS chapter in the Veach thesis (https://graphics.stanford.edu/courses/cs348b-03/papers/veach-chapter9.pdf). Veach talks a lot about bidirectional path tracing, but trying to learn MIS in that context is overkill, so I would stick to trying to understand something simpler. The examples in chapter 9 actually aren't about BDPT at all, so they are more straightforward, and I really, really recommend reading and learning MIS from that chapter.
Our surface integral F(x) has three parts: incoming light L(x), visibility V(x) to that light, and then the material response B(x). We assume that we want light in a known outgoing direction, so our only parameter is the incoming direction x. We want to do importance sampling. But if we can't sample according to L(x)*B(x)*V(x), and not even L(x)*B(x), we are in trouble. We could try to importance sample according to the sum of L(x) and B(x), but that might give us something that doesn't actually help.
There are papers that do product sampling of L(x)*B(x). It might work out in some cases, but MIS won the war as far as I understand it.
Sidenote
Sidenote: In my first path tracer, which I wrote together with a friend back in 2004, we had a check for glossiness. If the glossiness was above a certain threshold it switched from doing IS on L(x) to doing it on B(x). It wasn't very good at all; MIS tries to solve the same problem, just better.
MIS
So we have what in MIS-speak is called two techniques.
Now let's pretend we only have one single integral at one point to care about. We have a state (x, material constants, outgoing direction) and we want to integrate over the incoming direction, which we will call in.
Also pretend we only have ONE light source for now.
Technique L: Importance sample according to L(in).
Technique B: Importance sample according to B(in).
Now, for incoming directions that come from the light source, we know that technique L will be awesome. It will shoot a lot of rays in that direction and the probabilities will be high, which means that any high-energy areas will not blow up as fireflies (we divide by the probability, so we want it high). But if there happens to be a lot of light in other directions, due to a very narrow BRDF lobe there, then we will get a very noisy estimate using technique L once we happen to shoot a ray there (and we must, to be unbiased). The problem here is that in directions where the light is dim, the contribution could still be very large due to the BRDF lobe being large (and we don't importance sample according to B).
Same for technique B. It will be awesome at sampling incoming directions inside the BRDF lobe, but for contributions outside the lobe it will not be so good. If it happens to hit a light source outside the lobe, it will be a firefly for sure.
If on the other hand the light source is all white (covering every direction) and the BRDF lobe is very diffuse, then the two estimates will be more or less equally good.
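To see where the fireflies come from, here is a toy calculation with made-up numbers: the unbiased estimator divides each sample's contribution by the probability density of picking it, so a direction that a technique almost never samples, but that still carries a lot of energy, produces one huge sample value.

```python
# contribution of one sample = (light * brdf) / pdf, all numbers made up

# direction inside the BRDF lobe, sampled by technique B with a high density
print((1.0 * 0.8) / 2.5)      # 0.32, well behaved

# bright light outside the lobe: technique B rarely picks it (tiny pdf),
# but when it does, the sample value explodes into a firefly
print((50.0 * 0.01) / 0.001)  # 500.0
```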
The idea then is, for every state (x, material, outgoing direction) and also every incoming direction, to determine a factor that says how much to use each technique. It will be a lerp factor, so maybe we let technique L decide 20% and technique B 80% for a particular state.
If we do both techniques we will shoot two rays. If both estimates use N rays and have uniform probability, we will get a factor of 50% everywhere and it will be as if we actually did 2*N rays using a uniform estimator.
The main thing to take away here is that our estimates for the two techniques don't actually estimate the real integral on their own. They each take responsibility for a fraction of the incoming directions.
Easy situations:
Notice that our importance sampling doesn't know about this weighting, so the techniques will sometimes generate rays whose contribution gets more or less masked away (but not very often, since such rays have low probability under the technique that generated them).
So where is a technique good? We actually have a measure of that, and that is the probability of it choosing a particular direction.
The balance heuristic from Veach says, more or less, that the weight for a technique should be the probability of that technique choosing the direction, divided by the sum of the probabilities over all techniques.
There are also other heuristics, but that is more or less it. If a technique has a high probability, compared to the other techniques, of generating a direction (given the state), we give it a high weighting factor.
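Here is a sketch of a two-technique MIS estimator using the balance heuristic, kept in 1D so it stays self-contained. The integrand f and the two pdfs are made-up stand-ins for "sample according to L" and "sample according to B".

```python
import random

def f(x):
    # made-up integrand standing in for L(in) * B(in) * V(in)
    return x * x

# technique L stand-in: pdf_L(x) = 2x on [0, 1], sampled by inverting the CDF x^2
def sample_L():
    return (1.0 - random.random()) ** 0.5

def pdf_L(x):
    return 2.0 * x

# technique B stand-in: pdf_B(x) = 2(1 - x) on [0, 1]
def sample_B():
    return 1.0 - (1.0 - random.random()) ** 0.5

def pdf_B(x):
    return 2.0 * (1.0 - x)

def balance_weight(p_this, p_other):
    # balance heuristic: this technique's pdf over the sum of all techniques' pdfs
    return p_this / (p_this + p_other)

def mis_estimate(n):
    total = 0.0
    for _ in range(n):
        # one sample from each technique; each is weighted and divided by its own pdf
        xl = sample_L()
        total += balance_weight(pdf_L(xl), pdf_B(xl)) * f(xl) / pdf_L(xl)
        xb = sample_B()
        total += balance_weight(pdf_B(xb), pdf_L(xb)) * f(xb) / pdf_B(xb)
    return total / n

print(mis_estimate(4096))  # converges to the integral of f over [0, 1], here 1/3
```

One nice way to see why this tames fireflies: with the balance heuristic the weighted contribution w * f / p simplifies to f divided by the sum of both pdfs, so a direction that at least one technique samples well is never divided by a tiny number.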
Notes on rendering
That is all the time I have right now due to baby :)