@ozancaglayan
Created December 10, 2015 17:51
advanced indexing
# pp: (batch, sequence_step, target vocabulary probabilities)
# yy: (batch, sequence_step) true labels
# Question: how can I gather the correct probabilities from the right
# places, i.e. something like pp[yy]?
In [211]: pp.shape
Out[211]: (256, 33, 20004)
In [212]: yy.shape
Out[212]: (256, 33)
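For reference, NumPy advanced indexing can gather exactly these probabilities without any flattening. A minimal sketch with small stand-in shapes (the real arrays here are (256, 33, 20004) and (256, 33)):

```python
import numpy as np

batch, seq, vocab = 4, 3, 10          # stand-ins for (256, 33, 20004)
rng = np.random.default_rng(0)
pp = rng.random((batch, seq, vocab))  # (batch, step, vocab) probabilities
yy = rng.integers(0, vocab, size=(batch, seq))  # (batch, step) true labels

# Broadcast a (batch, 1) row grid and a (1, seq) column grid against yy
# so each (b, s) position picks pp[b, s, yy[b, s]].
rows = np.arange(batch)[:, None]
cols = np.arange(seq)[None, :]
result = pp[rows, cols, yy]           # shape (batch, seq)

assert result.shape == (batch, seq)
assert result[2, 1] == pp[2, 1, yy[2, 1]]
```

Theano's symbolic tensors support a similar indexing style, though the flatten-based trick is what NMT codebases of this era typically used.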
@orhanf

orhanf commented Dec 10, 2015

Here's a solution without using advanced indexing: you just flatten a bit and juggle the indices, that's all.
Also, why categorical crossentropy? If the problem is over sequences, shouldn't the cost be the negative log probabilities? That is what we minimize in NMT (unless you are using some other surrogate, of course).

import numpy as np

# rearrange to (sequence_step, batch, vocab), as in time-major NMT code
pp = np.transpose(pp, (1, 0, 2))
# flatten the labels in the same (time-major) order as pp
y_flat = yy.T.flatten()

num_words = pp.shape[2]
num_labels = y_flat.shape[0]

# offset of each element's vocabulary row in the flattened pp, plus its label
y_flat_idx = np.arange(num_labels) * num_words + y_flat

result = pp.flatten()[y_flat_idx]
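A NumPy sanity check of the flatten trick against direct advanced indexing, plus the negative log-probability cost mentioned above (small stand-in shapes; the key assumption is that pp and yy are flattened in the same order):

```python
import numpy as np

batch, seq, vocab = 4, 3, 10
rng = np.random.default_rng(0)
pp = rng.random((batch, seq, vocab))
yy = rng.integers(0, vocab, size=(batch, seq))

y_flat = yy.flatten()                 # batch-major, matching pp.flatten()
y_flat_idx = np.arange(y_flat.shape[0]) * vocab + y_flat
gathered = pp.flatten()[y_flat_idx]

# Cross-check against direct advanced indexing.
direct = pp[np.arange(batch)[:, None], np.arange(seq)[None, :], yy]
assert np.allclose(gathered, direct.flatten())

# Per-sentence negative log-probability cost, as minimized in NMT.
cost = -np.log(gathered).reshape(batch, seq).sum(axis=1)
assert cost.shape == (batch,)
```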

@ozancaglayan (Author)

y_flat_idx = np.repeat(np.arange(batch_size), seq_size) * num_words + y_flat

Shouldn't it be like this instead?
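A toy NumPy check can compare the two candidate index formulas against direct advanced indexing (here pp is kept batch-major and yy is flattened in the same order; all shapes are stand-ins):

```python
import numpy as np

batch_size, seq_size, num_words = 4, 3, 10
rng = np.random.default_rng(1)
pp = rng.random((batch_size, seq_size, num_words))
yy = rng.integers(0, num_words, size=(batch_size, seq_size))
y_flat = yy.flatten()

# Candidate A: one vocab-row offset per flattened element.
idx_a = np.arange(y_flat.shape[0]) * num_words + y_flat
# Candidate B: the sentence index repeated over the sequence.
idx_b = np.repeat(np.arange(batch_size), seq_size) * num_words + y_flat

direct = pp[np.arange(batch_size)[:, None],
            np.arange(seq_size)[None, :], yy].flatten()

matches_a = np.allclose(pp.flatten()[idx_a], direct)
matches_b = np.allclose(pp.flatten()[idx_b], direct)
print(matches_a, matches_b)
```

Whichever formula wins depends entirely on the flatten order of pp: the offsets must count elements in the same order in which yy was flattened.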
