- Try more architectures
- Basic architectures are sometimes better
- Try other forms of ensembling besides cross-validation (CV) averaging
- Blend model predictions with linear regression
- Rely more on shakeup predictions
- Make sure copied code is correct
- Avoid extensively tuning hyperparameters
- Pay more attention to correlations between folds
- Optimizing thresholds can lead to "brittle" models
- Different random initializations between folds might help diversity
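The "blend with linear regression" tip can be sketched as follows. This is a minimal illustration using scikit-learn; the predictions and labels are placeholder data, and in practice you would fit the blender on out-of-fold predictions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Out-of-fold predictions from two hypothetical base models on a validation set.
preds_model_a = np.array([0.2, 0.8, 0.4, 0.9])
preds_model_b = np.array([0.3, 0.7, 0.5, 0.8])
y_true = np.array([0.0, 1.0, 0.0, 1.0])

# Stack the base-model predictions as features and fit a linear blender;
# the learned coefficients act as blending weights.
X = np.column_stack([preds_model_a, preds_model_b])
blender = LinearRegression().fit(X, y_true)
blended = blender.predict(X)
```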
```python
import os
import numpy as np
import torch
import torchvision
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
```
- Means and stddevs of activations should stay close to 0 and 1 to prevent gradients from exploding or vanishing
- With weights drawn from a standard normal, the activations of a layer have a stddev close to sqrt(num_input_channels)
- So, to bring the stddevs back to 1, multiply the random weights by 1 / sqrt(c_in)
- This works well without activation functions, but results in vanishing or exploding gradients when used with a tanh or sigmoid activation
- Bias weights should be initialized to 0
- Initializations can be drawn from either a uniform distribution or a normal distribution
- Use Xavier initialization for sigmoid and softmax activations
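The scaling argument above can be checked numerically. A minimal sketch; the layer sizes are arbitrary:

```python
import math
import torch

torch.manual_seed(0)

c_in = 512
x = torch.randn(1000, c_in)  # activations with mean ~0, stddev ~1

# Naive init: standard-normal weights blow the output stddev up to ~sqrt(c_in).
w = torch.randn(c_in, 256)
print((x @ w).std())  # roughly sqrt(512) ~ 22.6

# Scaled init: dividing by sqrt(c_in) brings the stddev back to ~1.
w_scaled = torch.randn(c_in, 256) / math.sqrt(c_in)
print((x @ w_scaled).std())  # roughly 1
```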
| version: "2" | |
| networks: | |
| gitea: | |
| external: false | |
| services: | |
| server: | |
| image: gitea/gitea:latest | |
| environment: |
```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# Cyclical momentum is provided by CyclicLR with cycle_momentum=True;
# base_lr and max_lr here are illustrative values.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=0.001, max_lr=0.1, cycle_momentum=True)
data_loader = torch.utils.data.DataLoader(...)

for epoch in range(10):
    for batch in data_loader:
        train_batch(...)
        scheduler.step()  # step once per batch, after the optimizer update
```
```python
class Encoder(nn.Module):
    def __init__(self, in_ch, out_ch, r):
        super(Encoder, self).__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.se = SqueezeAndExcitation(out_ch, r)

    def forward(self, x):
        x = F.relu(self.conv(x), inplace=True)
        x = self.se(x)
        return x
```
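The `SqueezeAndExcitation` module used above is not shown. A minimal sketch of the standard squeeze-and-excitation block with reduction ratio `r` might look like this (the layer layout is an assumption, following the common formulation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SqueezeAndExcitation(nn.Module):
    """Channel attention: squeeze (global pool), then excite (bottleneck MLP)."""
    def __init__(self, ch, r):
        super().__init__()
        self.fc1 = nn.Linear(ch, ch // r)
        self.fc2 = nn.Linear(ch // r, ch)

    def forward(self, x):
        # Squeeze: global average pool over spatial dims -> (N, C)
        s = x.mean(dim=(2, 3))
        # Excite: bottleneck MLP producing per-channel weights in (0, 1)
        s = torch.sigmoid(self.fc2(F.relu(self.fc1(s))))
        # Rescale each input channel by its learned weight
        return x * s[:, :, None, None]
```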
```python
# Send a notification to your phone directly with IFTTT (https://ifttt.com/),
# e.g. when a training run ends or at the end of an epoch.
notify({'value1': 'Notification title', 'value2': 'Notification body'}, key=[IFTTT_KEY])

# Automatically set random seeds for Python, numpy, and PyTorch so results can be reproduced.
seed_environment(42)

# Print how much GPU memory is currently allocated.
gpu_usage(device, digits=4)
# GPU Usage: 6.5 GB
```