@markgarrigan
Created August 6, 2025 13:21

Example Copilot Instruction File

Project Overview

This project builds predictive models of customer churn using Python, pandas, scikit-learn, and XGBoost. It covers data preprocessing, feature engineering, model training, and evaluation.

Folder Structure

  • /data: Contains raw and processed datasets. Raw data should not be committed.
  • /notebooks: Jupyter notebooks for exploratory data analysis and prototyping.
  • /src: Python scripts for data processing, modeling, and evaluation.
  • /models: Saved model artifacts.
  • /reports: Generated reports and visualizations.

Libraries and Frameworks

  • pandas and NumPy for data manipulation.
  • scikit-learn and XGBoost for modeling.
  • matplotlib and seaborn for visualization.
  • Jupyter for interactive development.

Coding Standards

  • Use consistent naming conventions for variables (e.g., snake_case).
  • Avoid hardcoding file paths; use config files or environment variables.
  • Include docstrings for all functions and classes.
  • Use logging instead of print statements for debugging.
  • Set random seeds for reproducibility in modeling scripts.
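
As a rough illustration of the standards above, the sketch below reads a data path from an environment variable, logs instead of printing, documents each function, and fixes a random seed. The names `CHURN_DATA_DIR`, `DEFAULT_SEED`, and `customers.csv` are hypothetical and not part of this project; adapt them to the real configuration.

```python
import logging
import os
import random
from pathlib import Path

logger = logging.getLogger(__name__)

# Hypothetical default; any fixed integer works as long as it is recorded.
DEFAULT_SEED = 42


def get_data_dir() -> Path:
    """Return the data directory named by the CHURN_DATA_DIR environment variable.

    CHURN_DATA_DIR is an assumed variable name. When it is unset, the
    function falls back to ./data relative to the working directory.
    """
    return Path(os.environ.get("CHURN_DATA_DIR", "data"))


def set_seed(seed: int = DEFAULT_SEED) -> int:
    """Seed Python's random module and return the seed for reuse.

    The returned value can also be passed as `random_state` to
    scikit-learn estimators so that all components share one seed.
    """
    random.seed(seed)
    logger.info("Random seed set to %d", seed)
    return seed


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    set_seed()
    raw_path = get_data_dir() / "raw" / "customers.csv"  # hypothetical file name
    logger.info("Raw data path resolved to %s", raw_path)
```

Keeping the path lookup and the seeding in small functions makes them easy to reuse from both `/src` scripts and notebooks.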

Notebook Guidelines

  • Clear outputs before committing notebooks.
  • Use markdown cells to explain each step of the analysis.
  • Avoid committing large datasets or outputs.
  • Keep notebooks modular and focused on a single task.
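
One way to clear outputs before committing is sketched below. It relies only on the fact that `.ipynb` files are JSON in which code cells carry `outputs` and `execution_count` fields; the function names are illustrative, not an existing project utility. Tools such as `jupyter nbconvert --clear-output --inplace` may be a better fit in practice.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Remove outputs and execution counts from every code cell.

    The notebook dict is modified in place and also returned.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path: Path) -> None:
    """Rewrite the .ipynb file at `path` with all code-cell outputs cleared."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    cleaned = clear_outputs(notebook)
    path.write_text(json.dumps(cleaned, indent=1) + "\n", encoding="utf-8")
```

Markdown cells are left untouched, so the explanatory text for each analysis step survives the cleanup.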

Model Evaluation Guidelines

  • Include metrics such as accuracy, precision, recall, F1-score, and ROC-AUC.
  • Visualize confusion matrices and ROC curves.
  • Compare multiple models and justify selection.
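
The metrics listed above could be collected in one place along the lines of the sketch below, which uses functions from `sklearn.metrics`. The helper name `evaluate_binary_classifier` and the returned dictionary layout are assumptions made for illustration; the labels are assumed to be binary with 1 meaning "churned".

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)


def evaluate_binary_classifier(y_true, y_pred, y_score) -> dict:
    """Compute standard binary-classification metrics.

    y_true:  ground-truth labels (0 or 1)
    y_pred:  predicted labels (0 or 1)
    y_score: predicted probability or score for the positive class,
             needed for ROC-AUC
    """
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_score),
        # Rows are true classes, columns are predicted classes.
        "confusion_matrix": confusion_matrix(y_true, y_pred).tolist(),
    }


if __name__ == "__main__":
    # Tiny hand-made example; real inputs would come from a held-out test set.
    metrics = evaluate_binary_classifier(
        y_true=[0, 0, 1, 1],
        y_pred=[0, 1, 1, 1],
        y_score=[0.1, 0.6, 0.7, 0.9],
    )
    for name, value in metrics.items():
        print(f"{name}: {value}")
```

Running the same helper on each candidate model's held-out predictions yields comparable dictionaries, which can support the model comparison and selection rationale.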

Data Privacy and Security

  • Do not commit raw data files containing sensitive information.
  • Use .gitignore to exclude large or private files.
  • Mask or anonymize any personally identifiable information (PII).
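
A minimal sketch of masking PII is shown below, assuming identifiers are strings and that a secret key is supplied from outside the repository. `pseudonymize` and `mask_email` are hypothetical helper names. Keyed hashing is pseudonymization rather than full anonymization, so whether it satisfies a given privacy requirement has to be assessed separately.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC).

    The same value and key always give the same token, so joins across
    tables still work, but the original value is not stored.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the domain only.

    Assumes a simple 'local@domain' address; the domain is kept and may
    itself be identifying in small datasets.
    """
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


if __name__ == "__main__":
    key = b"replace-with-a-secret-loaded-from-the-environment"  # placeholder key
    print(pseudonymize("customer-12345", key))
    print(mask_email("jane.doe@example.com"))
```

The key should be kept out of version control, in line with the rule against hardcoding configuration.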

Review Focus Areas

  • Are functions well-documented and modular?
  • Is the code reproducible (e.g., random seeds, environment setup)?
  • Are notebooks clean and readable?
  • Are evaluation metrics appropriate and clearly presented?
  • Are data privacy practices followed?