Yuge Shi (Jimmy) YugeTen

Prerequisite: gensim.

Two scripts under src/ that you need to look at:

generate_embeddings.py: Creates dataloaders for embedded sentences using a fastText model trained on the CUB dictionary. The train_loader and test_loader each return dataB, which has length 2. dataB[0] has shape [batch_size, sentence_length, embedding_vector_size]. dataB[1] has shape [batch_size] and holds the original sentence lengths before truncation or padding (you can probably ignore this one, but I kept it in case you need the original length to truncate the sentence when calculating correlations).
coherence.py: This one is pretty much ready to go. By default it loads the trained CUB model under expeirments/ft_obj. You just need to import the CCA module (see usage on lines 65, 66 and 77), which can be called with:

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (block1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

	import os
	import io
	import subprocess
	import numpy as np
	from scipy.sparse.linalg import svds
	from scipy.io import loadmat

	import torch
	from torchvision import datasets, transforms

	Paper + Code release! We propose Fish, an effective algorithm for domain generalisation. It learns invariant features by maximising gradient inner product across domains.

	Paper: https://arxiv.org/abs/2104.09937
	Code: https://github.com/YugeTen/fish

	Work done during an internship at FAIR with syhw. A thread [insert Figure 1]

	======

	Our main idea: if the gradients of different domains point in similar directions, taking either gradient step improves the model's performance on both domains, indicating that the features learned by either gradient step are invariant across them.
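
As a rough illustration of this idea (not the Fish algorithm itself, and not taken from the released code), the sketch below computes the inner product between the gradients of two toy domains for a linear least-squares model. The data, the helpers `mse_loss` and `mse_grad`, and the step size `lr` are all hypothetical. When the inner product is positive, a small step along one domain's gradient should lower the loss on both domains.

```python
import numpy as np

def mse_loss(w, X, y):
    """Mean squared error of the linear model X @ w."""
    return float(np.mean((X @ w - y) ** 2))

def mse_grad(w, X, y):
    """Gradient of mse_loss with respect to w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])

# Two hypothetical domains: same underlying relationship,
# but the inputs of domain 2 are scaled differently.
X1 = rng.normal(size=(200, 3))
X2 = 2.0 * rng.normal(size=(200, 3))
y1, y2 = X1 @ w_true, X2 @ w_true

w = np.zeros(3)
g1, g2 = mse_grad(w, X1, y1), mse_grad(w, X2, y2)
inner = float(g1 @ g2)  # gradient inner product across the two domains

# Take a small step along domain 1's gradient only, then compare both losses.
lr = 1e-3
w_new = w - lr * g1
print("gradient inner product:", inner)
print("domain 1 loss:", mse_loss(w, X1, y1), "->", mse_loss(w_new, X1, y1))
print("domain 2 loss:", mse_loss(w, X2, y2), "->", mse_loss(w_new, X2, y2))
```

In this toy setup the two gradients point in similar directions, so the step computed from domain 1 alone also reduces the loss on domain 2.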

	# MNIST model specification

	import torch
	import torch.distributions as dist
	import torch.nn as nn
	import torch.nn.functional as F
	from numpy import prod, sqrt
	from torch.utils.data import DataLoader
	from torchvision import datasets, transforms
	from torchvision.utils import save_image, make_grid

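	import numpy as np

	# X: (n, d) and Y: (m, d), one sample per row; random placeholder data
	X, Y = np.random.rand(4, 3), np.random.rand(5, 3)
	ones = np.ones((X.shape[0], Y.shape[0]))  # (n, m); np.ones(X.shape) only works for square X
	XX = np.diag(np.diag(X.dot(X.T))).dot(ones)  # XX[i, j] = ||x_i||^2
	YY = ones.dot(np.diag(np.diag(Y.dot(Y.T))))  # YY[i, j] = ||y_j||^2
	XY = X.dot(Y.T)  # XY[i, j] = <x_i, y_j>
	C = XX + YY - 2*XY  # C[i, j] = ||x_i - y_j||^2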
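
The lines above appear to assemble a matrix of pairwise squared Euclidean distances from the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, assuming X and Y hold one sample per row. Below is a minimal sketch of the same computation using broadcasting, which avoids building the diagonal matrices, checked against a brute-force reference. The function name `pairwise_sq_dists` and the random data are illustrative only.

```python
import numpy as np

def pairwise_sq_dists(X, Y):
    """Squared Euclidean distance between every row of X and every row of Y."""
    x_sq = np.sum(X ** 2, axis=1)[:, None]  # (n, 1) squared norms of rows of X
    y_sq = np.sum(Y ** 2, axis=1)[None, :]  # (1, m) squared norms of rows of Y
    return x_sq + y_sq - 2.0 * X @ Y.T      # (n, m) via broadcasting

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
Y = rng.normal(size=(5, 3))

C = pairwise_sq_dists(X, Y)

# Brute-force reference: explicit differences between all pairs of rows.
C_ref = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
print(np.allclose(C, C_ref))
```

Here X and Y only need the same number of columns. The number of rows may differ, and C[i, j] is the squared distance between row i of X and row j of Y.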