Last active
July 14, 2025 09:13
-
-
Save Nivratti/ea81e952e07ffbbf03e6d44a7dbbef8f to your computer and use it in GitHub Desktop.
Directory structure for deep learning projects
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
project_name/ | |
│ | |
├── data/ # All data-related directories and files | |
│ ├── raw/ # Original, immutable data dump | |
│ ├── processed/ # Data transformed for modeling, such as TFRecords or HDF5 | |
│ ├── interim/ # Temporary processed datasets, possibly smaller or for prototyping | |
│ | |
├── docs/ # Project documentation | |
│ ├── api/ # API reference | |
│ ├── design_docs/ # Design documents, architectural choices | |
│ ├── tutorials/ # Step-by-step guides and tutorials | |
│ └── diagrams/ # Network architecture diagrams, flowcharts | |
│ | |
├── notebooks/ # Jupyter notebooks | |
│ ├── eda/ # Exploratory Data Analysis notebooks | |
│ ├── model_experiments/ # Prototyping and experiments | |
│ └── results/ # Visualization of final results, comparisons | |
│ | |
├── src/ # Source code | |
│ ├── data_pipeline/ # All modules related to the data pipeline | |
│ │ ├── sourcing/ # Scripts and utilities for sourcing data | |
│ │ │ ├── downloaders/ # Scripts to download data from various sources | |
│ │ │ ├── scrapers/ # Web scraping scripts or API-based scripts | |
│ │ │ ├── validators/ # Scripts to validate data integrity after sourcing | |
│ │ │ └── annotations/ # Tools and scripts related to data annotations or labeling | |
│ │ │ | |
│ │ ├── preprocessing/ # Data preprocessing modules | |
│ │ │ ├── cleaners/ # Scripts to clean data, handle missing values, outliers, etc. | |
│ │ │ ├── transformers/ # Transformation scripts (e.g., normalization, tokenization) | |
│ │ │ ├── encoders/ # Encoding techniques, like one-hot, label encoding, etc. | |
│ │ │ └── splitters/ # Scripts to split datasets into train/test/validation sets | |
│ │ │ | |
│ │ ├── augmentations/ # Data augmentation modules | |
│ │ │ ├── image/ # Image data augmentation (e.g., rotations, flips) | |
│ │ │ ├── text/ # Text data augmentation (e.g., back translation, synonym replacement) | |
│ │ │ ├── audio/ # Audio data augmentation (e.g., noise injection, speed change) | |
│ │ │ └── utils/ # Common utilities for augmentation | |
│ │ │ | |
│ │ ├── loaders/ # Data loading utilities for feeding to models | |
│ │ │ ├── dataset_classes/ # Framework-specific dataset classes (e.g., TensorFlow's tf.data.Dataset, PyTorch's Dataset) | |
│ │ │ ├── generators/ # Data generators for on-the-fly data feeding and augmentation | |
│ │ │ ├── batchers/ # Batch data and manage batch-related operations | |
│ │ │ └── async_loaders/ # Asynchronous data loading, especially useful for large datasets | |
│ │ │ | |
│ │ ├── validation/ # Data validation post-processing | |
│ │ │ ├── sanity_checks/ # Scripts to check data sanity post-processing | |
│ │ │ ├── statistics/ # Scripts to compute and visualize data statistics | |
│ │ │ └── comparers/ # Compare data before and after preprocessing | |
│ │ │ | |
│ │ └── utils/ # Common utilities for the data pipeline | |
│ │ ├── visualization/ # Tools to visualize samples, augmentations, etc. | |
│ │ ├── logging/ # Data processing related logging utilities | |
│ │ └── helpers/ # Miscellaneous helper functions and scripts | |
│ │ | |
│ ├── models/ # Neural network models and components | |
│ │ ├── architectures/ # Different neural network architectures | |
│ │ ├── layers/ # Custom layers | |
│ │ ├── losses/ # Custom loss functions | |
│ │ └── metrics/ # Custom evaluation metrics | |
│ │ | |
│ ├── training/ | |
│ │ ├── experiments/ # Specific experiment settings and results | |
│ │ │ ├── experiment1/ # Each experiment can have its own subdirectory | |
│ │ │ │ ├── config.yml # Configuration file for the experiment | |
│ │ │ │ ├── logs/ # Training logs for this experiment | |
│ │ │ │ └── results/ # Resulting plots, performance metrics, etc. | |
│ │ │ ├── experiment2/ | |
│ │ │ └── ... | |
│ │ │ | |
│ │ ├── scripts/ # Actual scripts to run training | |
│ │ │ ├── train_model.py # Main training script | |
│ │ │ ├── validate_model.py # Validation script | |
│ │ │ ├── resume_training.py # Script to resume training from checkpoints | |
│ │ │ └── distributed_train.py # For distributed and parallel training setups | |
│ │ │ | |
│ │ ├── hyperparameters/ # Hyperparameter tuning and optimization | |
│ │ │ ├── search_algos/ # Algorithms for hyperparameter search | |
│ │ │ │ ├── grid_search.py | |
│ │ │ │ ├── random_search.py | |
│ │ │ │ └── bayesian_optimization.py | |
│ │ │ ├── search_spaces/ # Definitions of hyperparameter search spaces | |
│ │ │ └── tuning_results/ # Results from hyperparameter tuning runs | |
│ │ │ | |
│ │ ├── callbacks/ # Custom callbacks used during training | |
│ │ │ ├── lr_schedulers/ # Learning rate scheduling callbacks | |
│ │ │ ├── early_stopping.py # Early stopping callback | |
│ │ │ ├── model_checkpointing.py # Save model checkpoints during training | |
│ │ │ └── tensorboard_logging.py # Log metrics and other details to TensorBoard | |
│ │ │ | |
│ │ ├── strategies/ # Training strategies for different setups | |
│ │ │ ├── single_gpu_strategy.py # Training strategy for a single GPU setup | |
│ │ │ ├── multi_gpu_strategy.py # Multi-GPU training strategy | |
│ │ │ └── tpu_strategy.py # TPU training strategy | |
│ │ │ | |
│ │ ├── metrics/ # Metrics to evaluate model during training | |
│ │ │ ├── accuracy.py # Accuracy metric | |
│ │ │ ├── f1_score.py # F1 Score | |
│ │ │ └── custom_metric.py # Any other custom metrics | |
│ │ │ | |
│ │ └── utils/ # Miscellaneous utilities for training | |
│ │ ├── gradient_clipping.py # Gradient clipping utilities | |
│ │ ├── weight_initialization.py # Weight initialization strategies | |
│ │ └── mixed_precision.py # Mixed precision training utilities | |
│ │ | |
│ ├── inference/ # Inference related modules | |
│ │ ├── deployment/ # Deployment related utilities | |
│ │ │ ├── docker/ # Docker files for containerized deployment | |
│ │ │ ├── cloud_functions/ # Cloud function code for serverless deployment | |
│ │ │ ├── api_endpoints/ # RESTful API endpoints | |
│ │ │ └── edge_devices/ # Deployment scripts for edge devices (like Raspberry Pi, etc.) | |
│ │ │ | |
│ │ ├── tools/ # Inference tools | |
│ │ │ ├── model_converters/ # Convert models to different formats (ONNX, TensorFlow Lite, etc.) | |
│ │ │ ├── benchmarking/ # Benchmarking the model's performance in real-world scenarios | |
│ │ │ └── visualization/ # Visualize predictions, attention maps, etc. | |
│ │ │ | |
│ │ └── utils/ # Common utilities for inference | |
│ │ ├── preprocessing/ # Preprocess input data for inference | |
│ │ ├── postprocessing/ # Convert raw model outputs to interpretable results | |
│ │ └── logging/ # Inference related logging utilities | |
│ │ | |
│ ├── testing/ | |
│ │ ├── unit_tests/ # Unit tests for individual functions and components | |
│ │ │ ├── data_pipeline_tests/ | |
│ │ │ ├── model_tests/ | |
│ │ │ └── training_utils_tests/ | |
│ │ │ | |
│ │ ├── integration_tests/ # Tests that check the interactions between modules | |
│ │ │ ├── pipeline_integration/ | |
│ │ │ └── model_training_integration/ | |
│ │ │ | |
│ │ └── utils/ # Utilities for testing | |
│ │ ├── fixtures/ # Sample data, models, etc. to be used during tests | |
│ │ ├── mockers/ # Mock functions, classes, etc. for testing | |
│ │ └── visualization/ # Visualize results, errors, etc. during tests | |
│ │ | |
│ └── utils/ # Miscellaneous utilities | |
│ ├── file_handlers/ # Handle file read/write operations | |
│ ├── visualization/ # Common visualization tools | |
│ └── others/ # Any other miscellaneous utilities | |
│ | |
├── results/ # Final results and outputs | |
│ ├── models/ # Trained models, weights, checkpoints | |
│ ├── plots/ # Resulting plots and visualizations | |
│ └── tables/ # Resulting tables, usually in CSV or Excel format | |
│ | |
├── config/ # Configuration files and environment settings | |
│ ├── environment.yml # Conda environment file | |
│ ├── requirements.txt # Python package requirements file | |
│ └── model_config.yml # Model and training configuration file | |
│ | |
└── README.md # Overview and instructions for the project |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment