
@danoneata
Last active December 6, 2023 06:25
Fisher vectors with sklearn
import numpy as np
import pdb

from sklearn.datasets import make_classification
from sklearn.mixture import GaussianMixture as GMM


def fisher_vector(xx, gmm):
    """Computes the Fisher vector on a set of descriptors.

    Parameters
    ----------
    xx: array_like, shape (N, D) or (D, )
        The set of descriptors.

    gmm: instance of sklearn.mixture.GaussianMixture
        Gaussian mixture model of the descriptors.

    Returns
    -------
    fv: array_like, shape (K + 2 * D * K, )
        Fisher vector (derivatives with respect to the mixing weights, means
        and variances) of the given descriptors.

    Reference
    ---------
    J. Krapac, J. Verbeek, F. Jurie. Modeling Spatial Layout with Fisher
    Vectors for Image Categorization. In ICCV, 2011.
    http://hal.inria.fr/docs/00/61/94/03/PDF/final.r1.pdf

    """
    xx = np.atleast_2d(xx)
    N = xx.shape[0]

    # Compute posterior probabilities.
    Q = gmm.predict_proba(xx)  # NxK

    # Compute the sufficient statistics of descriptors.
    Q_sum = np.sum(Q, 0)[:, np.newaxis] / N
    Q_xx = np.dot(Q.T, xx) / N
    Q_xx_2 = np.dot(Q.T, xx ** 2) / N

    # Compute derivatives with respect to mixing weights, means and variances.
    d_pi = Q_sum.squeeze() - gmm.weights_
    d_mu = Q_xx - Q_sum * gmm.means_
    d_sigma = (
        - Q_xx_2
        - Q_sum * gmm.means_ ** 2
        + Q_sum * gmm.covariances_
        + 2 * Q_xx * gmm.means_)

    # Merge derivatives into a vector.
    return np.hstack((d_pi, d_mu.flatten(), d_sigma.flatten()))


def main():
    # Short demo: fit a GMM on most of the data and encode the held-out part.
    K = 64
    N = 1000

    xx, _ = make_classification(n_samples=N)
    xx_tr, xx_te = xx[:-100], xx[-100:]

    gmm = GMM(n_components=K, covariance_type='diag')
    gmm.fit(xx_tr)

    fv = fisher_vector(xx_te, gmm)
    pdb.set_trace()


if __name__ == '__main__':
    main()
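
A quick sanity check on the output dimensionality (a hedged sketch, assuming the fisher_vector function above is in scope; the descriptor data and variable names are only illustrative): d_pi contributes K values, while d_mu and d_sigma contribute K * D values each, giving the K + 2 * D * K entries stated in the docstring.

    import numpy as np
    from sklearn.mixture import GaussianMixture as GMM

    # Five sets of D = 20 dimensional descriptors (e.g. one set per image).
    descriptors = [np.random.randn(200, 20) for _ in range(5)]

    # Fit a small diagonal-covariance GMM on all the descriptors pooled together.
    gmm = GMM(n_components=8, covariance_type='diag')
    gmm.fit(np.vstack(descriptors))

    # One Fisher vector per descriptor set, stacked into a feature matrix.
    fvs = np.vstack([fisher_vector(xx, gmm) for xx in descriptors])
    assert fvs.shape == (5, 8 + 2 * 20 * 8)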
@sobhanhemati

Thank you for the clarification.
Do you have an implementation of the analytical diagonal approximation, so that I can add it to the current implementation?
It seems that the analytical diagonal approximation works about 1 percent better :))
Thank you in advance

@danoneata
Author

danoneata commented Nov 25, 2020

@sobhanhemati Equations (16–18) from (Sanchez et al., 2013) provide the Fisher vectors that include the analytical approximation; hence, you can modify the computation of d_pi, d_mu, d_sigma in the gist above as follows:

    # replaces the computation of d_pi, d_mu and d_sigma in the gist above
    s = np.sqrt(gmm.weights_)[:, np.newaxis]
    d_pi = (Q_sum.squeeze() - gmm.weights_) / s.squeeze()
    d_mu = (Q_xx - Q_sum * gmm.means_) * np.sqrt(gmm.covariances_) ** -1 / s
    d_sigma = - (
        - Q_xx_2
        - Q_sum * gmm.means_ ** 2
        + Q_sum * gmm.covariances_
        + 2 * Q_xx * gmm.means_) / (s * np.sqrt(2))

Note that I haven't tested this implementation, so you might want to double-check it. And I would suggest trying both methods for estimating the diagonal Fisher information matrix and seeing which one works better for you; Sanchez et al. mention in their paper:

Note that we do not claim that this difference is significant nor that the closed-form approximation is superior to the empirical one in general.

Finally, do not forget to L2- and power-normalise the Fisher vectors: these transformations yield much more substantial improvements (about 6-7 percentage points each) than the choice of the approximation for the Fisher information matrix (see Table 1 from Sanchez et al.).
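
In code, those two normalisation steps could look roughly like this (again untested; the helper name and the small eps guard are just illustrative):

    import numpy as np

    def normalize_fisher_vector(fv, alpha=0.5, eps=1e-12):
        # Power normalisation (signed square root for alpha = 0.5).
        fv = np.sign(fv) * np.abs(fv) ** alpha
        # L2 normalisation.
        return fv / (np.linalg.norm(fv) + eps)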

@sobhanhemati

Thank you so much for the comprehensive answer.
I really appreciate that.

@PARVATHYAJITHA

Hi, I'm a beginner, so I don't know how Fisher vector encoding works. Please help me understand it.
