This project builds predictive models to analyze customer churn using Python, pandas, scikit-learn, and XGBoost. It covers data preprocessing, feature engineering, model training, and evaluation.
## Project Structure

- `/data`: Contains raw and processed datasets. Raw data should not be committed.
- `/notebooks`: Jupyter notebooks for exploratory data analysis and prototyping.
- `/src`: Python scripts for data processing, modeling, and evaluation.
- `/models`: Saved model artifacts.
- `/reports`: Generated reports and visualizations.
## Key Libraries

- pandas and NumPy for data manipulation.
- scikit-learn and XGBoost for modeling.
- matplotlib and seaborn for visualization.
- Jupyter for interactive development.
## Coding Standards

- Use consistent naming conventions for variables (e.g., snake_case).
- Avoid hardcoding file paths; use config files or environment variables (see the path-handling sketch after this list).
- Include docstrings for all functions and classes.
- Use logging instead of print statements for debugging (see the logging sketch after this list).
- Set random seeds for reproducibility in modeling scripts (see the seeding sketch after this list).
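For the path rule, a minimal sketch of one common approach: resolve every path in a single module from an environment variable with a repo-relative default. The variable names and file names here are hypothetical, not project fixtures.

```python
import os
from pathlib import Path

# Hypothetical convention: one module (e.g., src/paths.py) owns every path.
# CHURN_DATA_DIR and the file names below are illustrative placeholders.
DATA_DIR = Path(os.environ.get("CHURN_DATA_DIR", "data"))
MODEL_DIR = Path(os.environ.get("CHURN_MODEL_DIR", "models"))

RAW_PATH = DATA_DIR / "raw" / "customers.csv"
PROCESSED_PATH = DATA_DIR / "processed" / "train.parquet"
```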
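For the logging rule, a minimal standard-library setup; the format string, level, and log messages are illustrative.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s: %(message)s",
)
logger = logging.getLogger(__name__)

logger.info("Loaded %d rows from the training set", 10_000)     # shown at INFO
logger.debug("Class balance: %s", {"churn": 0.2, "stay": 0.8})  # hidden unless level=DEBUG
```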
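For the seeding rule, one pattern is a single helper that seeds the global RNGs in use; the helper name is hypothetical. Note that scikit-learn and XGBoost are seeded per estimator via `random_state` rather than globally.

```python
import random

import numpy as np

def set_seed(seed: int = 42) -> None:
    """Hypothetical helper: seed the global RNGs this project relies on."""
    random.seed(seed)     # Python's built-in RNG
    np.random.seed(seed)  # NumPy's global RNG (also used by pandas sampling)

set_seed(42)
# scikit-learn and XGBoost take explicit per-estimator seeds, e.g.:
# RandomForestClassifier(random_state=42), xgb.XGBClassifier(random_state=42)
```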
## Notebook Guidelines

- Clear outputs before committing notebooks (see the sketch after this list).
- Use markdown cells to explain each step of the analysis.
- Avoid committing large datasets or outputs.
- Keep notebooks modular and focused on a single task.
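One way to automate output clearing, sketched with the `nbformat` package; many teams instead use `jupyter nbconvert --clear-output --inplace` or the nbstripout git filter.

```python
import nbformat

def clear_outputs(path: str) -> None:
    """Strip cell outputs and execution counts in place before committing."""
    nb = nbformat.read(path, as_version=4)
    for cell in nb.cells:
        if cell.cell_type == "code":
            cell.outputs = []
            cell.execution_count = None
    nbformat.write(nb, path)

clear_outputs("notebooks/eda.ipynb")  # hypothetical notebook path
```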
## Model Evaluation

- Include metrics such as accuracy, precision, recall, F1-score, and ROC-AUC.
- Visualize confusion matrices and ROC curves (see the sketch after this list).
- Compare multiple models and justify selection.
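A sketch of computing these metrics and plots, assuming scikit-learn ≥ 1.0; a synthetic dataset and a logistic regression stand in for the real churn data and model so the snippet runs on its own.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    ConfusionMatrixDisplay,
    RocCurveDisplay,
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the churn data, just to make this runnable.
X, y = make_classification(n_samples=1_000, weights=[0.8], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)

model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]  # positive-class probabilities

print(f"Accuracy:  {accuracy_score(y_test, y_pred):.3f}")
print(f"Precision: {precision_score(y_test, y_pred):.3f}")
print(f"Recall:    {recall_score(y_test, y_pred):.3f}")
print(f"F1-score:  {f1_score(y_test, y_pred):.3f}")
print(f"ROC-AUC:   {roc_auc_score(y_test, y_proba):.3f}")

ConfusionMatrixDisplay.from_predictions(y_test, y_pred)  # confusion matrix plot
RocCurveDisplay.from_predictions(y_test, y_proba)        # ROC curve plot
plt.show()
```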
## Data Privacy

- Do not commit raw data files containing sensitive information.
- Use `.gitignore` to exclude large or private files.
- Mask or anonymize any personally identifiable information (PII); a masking sketch follows this list.
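One masking approach, sketched below: replacing PII columns with truncated SHA-256 hashes. Hashing is pseudonymization rather than full anonymization, and the column names here are hypothetical.

```python
import hashlib

import pandas as pd

def mask_pii(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Replace the given columns with truncated one-way hashes."""
    out = df.copy()
    for col in columns:
        out[col] = out[col].astype(str).map(
            lambda v: hashlib.sha256(v.encode()).hexdigest()[:16]
        )
    return out

# Hypothetical usage with an illustrative column name:
df = pd.DataFrame({"email": ["a@example.com"], "tenure": [12]})
df = mask_pii(df, ["email"])
```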
## Code Review Checklist

- Are functions well-documented and modular?
- Is the code reproducible (e.g., random seeds, environment setup)?
- Are notebooks clean and readable?
- Are evaluation metrics appropriate and clearly presented?
- Are data privacy practices followed?