Mohammed Innat innat

Preface

The following notes were generated with ChatGPT and edited before being posted here.

GPU Driver:

The GPU driver acts as an interface between your operating system and the hardware.
It ensures your OS can communicate with and utilize the GPU for tasks.

CUDA Toolkit:

The CUDA Toolkit is required for developing and running GPU-accelerated applications.
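
As a rough way to see which of the two pieces is installed, the sketch below looks for one command-line tool from each. It assumes an NVIDIA setup where `nvidia-smi` (installed with the driver) and `nvcc` (installed with the CUDA Toolkit) would be on the PATH; the helper name `find_gpu_tools` is hypothetical and not part of the original notes.

```python
import shutil


def find_gpu_tools():
    """Return the resolved path (or None) for each command-line tool."""
    # Assumption: nvidia-smi comes with the NVIDIA driver and
    # nvcc comes with the CUDA Toolkit, and both live on the PATH.
    return {name: shutil.which(name) for name in ("nvidia-smi", "nvcc")}


if __name__ == "__main__":
    for name, path in find_gpu_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A missing `nvcc` alongside a present `nvidia-smi` would suggest the driver is installed but the toolkit is not, although a non-standard install location can give the same result.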

Note

When video data is written to the TFRecord format, the output TFRecord file is usually much larger than the original video file. For a quick demonstration, some use a frame step while encoding to keep the overall size small. In an actual research or project setting, however, all frames should be encoded to the TFRecord. That way, frames can be sampled with different indices when the TFRecord is read at training time. Check this discussion. The following code is tested in TF 2.12.
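
To illustrate why keeping every frame is useful, here is a minimal sketch of picking frame indices at training time, assuming all frames were stored. The function name `sample_clip_indices` and its parameters are hypothetical and do not come from the original gist; it only computes indices and does not read a TFRecord.

```python
import random


def sample_clip_indices(num_frames, clip_len, frame_step, rng=None):
    """Pick clip_len frame indices spaced frame_step apart from a random start."""
    rng = rng or random.Random()
    # Number of source frames covered by one clip.
    span = 1 + (clip_len - 1) * frame_step
    # Latest start position that still fits the whole clip (0 for short videos).
    max_start = max(num_frames - span, 0)
    start = rng.randint(0, max_start)
    # Clamp to the last frame so short videos repeat their final frame.
    return [min(start + i * frame_step, num_frames - 1) for i in range(clip_len)]


if __name__ == "__main__":
    print(sample_clip_indices(num_frames=100, clip_len=8, frame_step=4))
```

Because the step and start are chosen when reading rather than when encoding, the same stored video can yield different clips in each epoch.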

video data layout

Let's say we have a video dataset in the following format.

About: A simple demonstration of translating multi-head self-attention from PyTorch to Keras.

Multi-Head Self Attention

import torch
import torch.nn as nn

class TorchAttentionModel(nn.Module):

# ref. https://www.tensorflow.org/tutorials/video/video_classification

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class Conv2Plus1D(keras.Model):
    def __init__(self, filters, kernel_size, padding):
        """A sequence of convolutional layers

UNet With ImageNet as Backbone

import numpy as np  # needed for np.sum in short_summary
from tensorflow import keras
from tensorflow.keras import layers as nn
from tensorflow.keras import backend as K
from tensorflow.keras import applications

def short_summary(model):
    trainable_count = np.sum(

import tensorflow as tf

def linear_interpolation(volume, target_depth, depth_axis=0):
    # Get the original depth size along the specified axis
    original_depth = tf.shape(volume)[depth_axis]

    # Generate floating-point indices for the target depth
    indices = tf.linspace(0.0, tf.cast(original_depth - 1, tf.float32), target_depth)

    # Split indices into integer and fractional parts
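
The preview above stops at the step that splits the indices. As an illustration of that idea only, here is a plain-Python sketch that resamples a list of per-slice values rather than a tensor. The name `linear_resample` is hypothetical, and the blending rule is an assumption about standard linear interpolation, not a copy of the rest of the gist.

```python
import math


def linear_resample(values, target_depth):
    """Linearly resample a 1D sequence to target_depth points."""
    depth = len(values)
    if target_depth == 1:
        # A single evenly spaced index over [0, depth - 1] is the start point.
        return [values[0]]
    out = []
    for k in range(target_depth):
        # Floating-point position in the original sequence.
        pos = k * (depth - 1) / (target_depth - 1)
        lower = math.floor(pos)             # integer part
        upper = min(lower + 1, depth - 1)   # neighbor, clamped at the end
        frac = pos - lower                  # fractional part
        # Blend the two neighbors by the fractional distance.
        out.append((1.0 - frac) * values[lower] + frac * values[upper])
    return out


if __name__ == "__main__":
    print(linear_resample([0.0, 10.0, 20.0], 5))
```

The integer part selects the two neighboring slices and the fractional part sets their weights, which is why the indices are split before gathering.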

import tensorflow as tf
from tensorflow.keras import layers

H_AXIS = -3
W_AXIS = -2

class RandomCutout(layers.Layer):
    """Randomly cut out rectangles from images and fill them.

    Args:
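
Since the layer's arguments are cut off in this preview, the following is only a generic sketch of the cutout idea on a nested-list image, not the layer's actual behavior. The function `cutout`, its parameters, and the choice to keep the rectangle fully inside the image are all assumptions made for illustration.

```python
import random


def cutout(image, cut_h, cut_w, fill=0, rng=None):
    """Return a copy of a 2D image with one random rectangle set to fill."""
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    # Assumption: the rectangle is shrunk if needed and stays inside the image.
    cut_h, cut_w = min(cut_h, height), min(cut_w, width)
    top = rng.randint(0, height - cut_h)
    left = rng.randint(0, width - cut_w)
    out = [row[:] for row in image]  # copy so the input is left untouched
    for y in range(top, top + cut_h):
        for x in range(left, left + cut_w):
            out[y][x] = fill
    return out


if __name__ == "__main__":
    for row in cutout([[1] * 6 for _ in range(5)], 2, 3):
        print(row)
```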

# Ref: https://gist.github.com/Rocketknight1/efc47242914788def0144b341b1ad638

import math
import tensorflow as tf
from tensorflow.keras import layers

class TFAdaptiveAveragePooling(layers.Layer):
    def __init__(self, output_size, **kwargs):
        super().__init__(**kwargs)
        if not isinstance(output_size, (list, tuple)):
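
For intuition about what adaptive average pooling computes, here is a one-dimensional plain-Python sketch. It assumes the bin rule that PyTorch-style adaptive pooling follows (start at floor(i * n / out), end at ceil((i + 1) * n / out)); the helper `adaptive_avg_pool_1d` is hypothetical, and the layer in the gist may be implemented differently.

```python
def adaptive_avg_pool_1d(values, output_size):
    """Average a 1D sequence into output_size bins of near-equal width."""
    n = len(values)
    pooled = []
    for i in range(output_size):
        # Bin start: floor(i * n / output_size).
        start = (i * n) // output_size
        # Bin end: ceil((i + 1) * n / output_size), in integer arithmetic.
        end = ((i + 1) * n + output_size - 1) // output_size
        window = values[start:end]
        pooled.append(sum(window) / len(window))
    return pooled


if __name__ == "__main__":
    print(adaptive_avg_pool_1d([1, 2, 3, 4, 5], 2))
```

When the input length is not divisible by the output size, neighboring bins overlap by one element, which is how the output size stays fixed for any input size.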

from typing import Tuple

import tensorflow as tf
from keras import layers

def uniform_temporal_subsample(x, num_samples, temporal_dim=-4):
    """
    Uniformly subsamples num_samples indices from the temporal dimension of the video.
    When num_samples is larger than the size of temporal dimension of the video, it
    will sample frames based on nearest neighbor interpolation.
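
The index selection described in the docstring can be sketched without tensors. The helper `uniform_subsample_indices` below is hypothetical, and it assumes that evenly spaced positions are truncated to integers, which may differ from the rounding used in the original function.

```python
def uniform_subsample_indices(num_frames, num_samples):
    """Return num_samples evenly spaced frame indices in [0, num_frames - 1]."""
    if num_samples == 1:
        return [0]
    last = num_frames - 1
    # Evenly spaced positions, truncated to integers (assumed rounding rule).
    return [(k * last) // (num_samples - 1) for k in range(num_samples)]


if __name__ == "__main__":
    print(uniform_subsample_indices(10, 4))  # fewer samples than frames
    print(uniform_subsample_indices(3, 5))   # more samples than frames
```

When more samples are requested than there are frames, some indices repeat, which matches the docstring's note about falling back to the nearest available frame.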

from keras_core import layers

def ResBlockIdentity(filter, stride):

    def apply(inputs):

        skip = inputs
        x = layers.Conv2D(filter, kernel_size=3, strides=stride, padding='same')(inputs)
        x = layers.BatchNormalization()(x)
        x = layers.Activation('relu')(x)