pizofreude/R_RStudioCloud_RStudioDesktop.md

Created June 26, 2025 16:55


Cheatsheet for R, RStudio Cloud, RStudio Desktop


📚 R & RStudio: Working Directory Cheatsheet

This cheatsheet covers how to control and troubleshoot the working directory in R, RStudio Desktop, and RStudio Cloud. A correct working directory makes data import, script sourcing, and project management much smoother.

1️⃣ RStudio Desktop: Setting the Working Directory

A. Launch from Terminal with Correct Directory

Instead of just:

```
rstudio .
```

Use:

```
rstudio --cwd /path/to/your/directory
```

Example:

```
rstudio --cwd /c/workspace/My_Projects/alarm-projects
```

This ensures RStudio starts in the specified directory.

B. Change Directory Inside RStudio

Menu: Session → Set Working Directory → Choose Directory...
Shortcut: Ctrl + Shift + H

R Console Command:

```
setwd("C:/workspace/My_Projects/alarm-projects")
```

C. Set a Default Working Directory (for all new sessions)

Go to Tools → Global Options → General
Under Default working directory, set your path (e.g., C:/workspace/My_Projects/alarm-projects)
Click Apply and restart RStudio

D. Use RStudio Projects for Best Practice

RStudio Projects automatically set the working directory to the project folder.

File → New Project → Existing Directory
Select your folder (e.g., C:/workspace/My_Projects/alarm-projects)
RStudio creates a .Rproj file—always open this file to launch the project with the right directory!

2️⃣ RStudio Cloud: Working Directory Tips

RStudio Cloud (now Posit Cloud) always starts in the project’s root directory, /cloud/project.
For reproducibility, always use RStudio Projects in the cloud too.
To check your current directory:
```
getwd()
```
To change it:
```
setwd("/cloud/project/subfolder")
```
Upload files to /cloud/project for easy access.

3️⃣ R Console (Base R): Set or Check Working Directory

Check current directory:
```
getwd()
```
Set working directory:
```
setwd("/path/to/your/directory")
```
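If you only need a different directory for one step, it is usually safer to switch temporarily and restore the original afterwards. The following is a minimal sketch in base R; `with_working_dir` is a hypothetical helper name, not a built-in function.

```r
# Hypothetical helper: run an expression in another directory, then restore
# the previous working directory even if the expression fails.
with_working_dir <- function(path, code) {
  stopifnot(dir.exists(path))        # fail early on a bad path
  old <- setwd(path)                 # setwd() invisibly returns the previous directory
  on.exit(setwd(old), add = TRUE)    # always switch back on exit
  force(code)                        # evaluate the expression while inside `path`
}

before <- getwd()
inside <- with_working_dir(tempdir(), getwd())
after  <- getwd()
```

Here `inside` holds the temporary directory reported by `getwd()` during the call, while `before` and `after` are the same directory.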

4️⃣ Common Troubleshooting

Paths on Windows: use either / or double backslashes \\. A single \ starts an escape sequence inside an R string, so it will not work as a path separator.
Always check your current directory with getwd() if file loading fails.
Use Projects whenever possible—they save a ton of headaches!
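The path rules above can be checked directly in the console. This sketch reuses the example folder from earlier in the cheatsheet; the raw-string form assumes R 4.0 or later.

```r
# Three ways to write the same Windows path in R
p_forward <- "C:/workspace/My_Projects/alarm-projects"
p_double  <- "C:\\workspace\\My_Projects\\alarm-projects"
p_raw     <- r"(C:\workspace\My_Projects\alarm-projects)"  # raw string, R >= 4.0

# The double-backslash and raw-string forms are the same string
same_string <- identical(p_double, p_raw)

# Converting backslashes to forward slashes gives the first form
converted <- gsub("\\\\", "/", p_double)

# file.path() builds paths with forward slashes, which R accepts on Windows too
built <- file.path("C:/workspace", "My_Projects", "alarm-projects")
```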

5️⃣ Reference

Pro Tip:
Always use RStudio Projects for each analysis or codebase. They save window layouts, history, and—most importantly—set your working directory automatically!

Last updated: 2025-06-26

pizofreude commented Jul 11, 2025

Example Usage of R Packages for Data Analytics & Engineering

Based on our specific use cases for the Divvy bike data engineering project, here is a curated list of 16 essential R packages, along with why this selection suits our needs:

Key Highlights:

Core Advantages:

Minimal Package Count: Only 16 packages rather than everything mentioned in the full listings, avoiding bloat
Project-Specific: Focused on bike share data analysis, revenue calculations, and Redshift connectivity
Professional Presentation: Quarto + reveal.js for modern business slides
ELT Integration: Database connectivity packages for our Redshift-based architecture

What I Excluded and Why:

arrow - We're using Redshift, not direct Parquet manipulation
data.table - Tidyverse is sufficient for our analysis scale
shiny - We're using Tableau Public for final dashboards
testthat - dbt handles our data testing needs
Machine Learning packages - Not required for our business analysis focus

Project-Specific Inclusions:

DBI, RPostgres, dbplyr - Essential for connecting R to our Redshift data warehouse
scales - Perfect for formatting revenue calculations ($0.19/minute, percentages)
plotly - Interactive exploration of station utilization patterns
lubridate - Critical for analyzing trip timestamps and duration calculations
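As a rough illustration of the lubridate and scales items above, the sketch below computes trip durations and formats revenue at the $0.19/minute rate. The two sample trips and the column names `started_at` and `ended_at` are made up for the example.

```r
library(lubridate)
library(scales)

# Two made-up trips; column names are assumptions for illustration
trips <- data.frame(
  started_at = ymd_hms(c("2025-06-01 08:00:00", "2025-06-01 09:15:00")),
  ended_at   = ymd_hms(c("2025-06-01 08:25:00", "2025-06-01 09:45:00"))
)

# Trip duration in minutes
trips$minutes <- as.numeric(difftime(trips$ended_at, trips$started_at, units = "mins"))

# Revenue at the per-minute rate mentioned above
rate_per_minute <- 0.19
trips$revenue <- trips$minutes * rate_per_minute

# Human-readable labels for reports and charts
trips$revenue_label <- dollar(trips$revenue)
trips$revenue_share <- percent(trips$revenue / sum(trips$revenue), accuracy = 0.1)
```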

Installation Strategy:

The sections below provide a phased installation approach so we can install packages as needed, plus the complete renv workflow for reproducible environments.

Essential R Packages for Divvy Data Engineering Project

Core Installation Command for renv

```
# Initialize renv environment
renv::init()

# Install essential packages
renv::install(c(
  # Core tidyverse (attaches ggplot2, dplyr, tidyr, readr, purrr, stringr, forcats, tibble, lubridate)
  "tidyverse",
  "tidyr",      # already part of the core tidyverse; listed explicitly

  # Data manipulation and analysis
  "lubridate",
  "janitor",
  "skimr",

  # Database connectivity (for Redshift)
  "DBI",
  "RPostgres",
  "dbplyr",

  # Visualization and rapid prototyping
  "plotly",
  "scales",
  "viridis",
  "patchwork",

  # Presentation and reporting
  "quarto",
  "knitr",
  "rmarkdown",

  # Project management
  "here",
  "renv"
))

# Snapshot the environment
renv::snapshot()
```

Package Categories and Justifications

1. Core Data Science Workflow (Essential)

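tidyverse - Comprehensive suite whose core includes ggplot2, dplyr, tidyr, readr, purrr, stringr, forcats, tibble, and (since tidyverse 2.0.0) lubridate
tidyr - Data reshaping and cleaning (part of the core tidyverse; listed separately only to make the dependency explicit)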
lubridate - Date/time manipulation (crucial for bike trip timestamps)
janitor - Data cleaning and column name standardization
skimr - Quick dataset overviews and summary statistics
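A short sketch of the janitor and skimr steps, assuming a raw export with inconsistent column names. The data frame and its columns are invented for illustration.

```r
library(janitor)
library(skimr)

# Made-up raw data with spaces and mixed case in the column names
raw <- data.frame(
  `Ride ID`            = c("a1", "b2", "c3"),
  `Start Station Name` = c("Clark St", "State St", NA),
  `Member Casual`      = c("member", "casual", "member"),
  check.names = FALSE
)

# Standardize column names to snake_case
clean <- clean_names(raw)

# Quick overview: column types, missing values, summary statistics
overview <- skim(clean)
```

Printing `overview` in the console shows one summary row per column, which is a convenient first check before any analysis.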

2. Database Connectivity (Project-Specific)

DBI - Database interface foundation
RPostgres - PostgreSQL/Redshift connectivity
dbplyr - dplyr syntax for database queries (essential for Redshift integration)
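One possible way to open the connection is sketched below. The environment variable names and the `connect_redshift` helper are hypothetical; 5439 is the default Redshift port and may differ in your cluster.

```r
# Hypothetical helper: open a Redshift connection using credentials
# stored in environment variables (names are assumptions).
connect_redshift <- function() {
  DBI::dbConnect(
    RPostgres::Redshift(),
    host     = Sys.getenv("REDSHIFT_HOST"),
    port     = 5439,                          # default Redshift port
    dbname   = Sys.getenv("REDSHIFT_DB"),
    user     = Sys.getenv("REDSHIFT_USER"),
    password = Sys.getenv("REDSHIFT_PASSWORD")
  )
}

# Typical usage (requires a reachable cluster, so it is left commented out):
# con <- connect_redshift()
# DBI::dbListTables(con)
# DBI::dbDisconnect(con)
```

Keeping credentials in environment variables (for example in a project-level .Renviron that is excluded from version control) avoids hard-coding passwords in scripts.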

3. Visualization and Rapid Prototyping (Core Need)

plotly - Interactive visualizations for exploration
scales - Scale functions for ggplot2 (revenue formatting, percentages)
viridis - Color scales that are colorblind-friendly
patchwork - Combining multiple ggplot2 plots
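The visualization packages above fit together roughly as follows. The station names and figures are invented; this is only a sketch of the pattern, not the project's actual charts.

```r
library(ggplot2)
library(scales)
library(patchwork)

# Made-up station summary for illustration
usage <- data.frame(
  station = c("Clark St", "State St", "Lake Shore"),
  trips   = c(120, 80, 45),
  revenue = c(570.00, 342.50, 190.25)
)

# Trip counts with a colorblind-friendly viridis palette
p_trips <- ggplot(usage, aes(station, trips, fill = station)) +
  geom_col(show.legend = FALSE) +
  viridis::scale_fill_viridis(discrete = TRUE) +
  labs(title = "Trips by station", x = NULL, y = "Trips")

# Revenue with dollar-formatted axis labels from scales
p_revenue <- ggplot(usage, aes(station, revenue)) +
  geom_col() +
  scale_y_continuous(labels = label_dollar()) +
  labs(title = "Revenue by station", x = NULL, y = "Revenue")

# patchwork combines the two plots side by side
combined <- p_trips + p_revenue
```

For interactive exploration, `plotly::ggplotly(p_trips)` converts a single ggplot into a plotly widget.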

4. Presentation Tools (Your Preference)

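quarto - R interface to the Quarto publishing system (the Quarto CLI itself is installed separately), with reveal.js slide output
knitr - Code chunk processing (the engine Quarto uses to run R code)
rmarkdown - Markdown processing (needed by Quarto's knitr engine)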
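Rendering slides from R could look like the sketch below. It assumes the Quarto CLI is installed and that a file named `slides.qmd` exists; both the file name and the `render_slides` helper are hypothetical.

```r
# Hypothetical helper: render a Quarto document to reveal.js slides.
render_slides <- function(input = "slides.qmd") {
  # quarto_path() returns NULL when the Quarto CLI cannot be found
  if (is.null(quarto::quarto_path())) {
    stop("Quarto CLI not found; install it before rendering.")
  }
  if (!file.exists(input)) {
    stop("Input file not found: ", input)
  }
  quarto::quarto_render(input, output_format = "revealjs")
}

# Usage (left commented out because it needs the CLI and an input file):
# render_slides("slides.qmd")
```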

5. Project Management (Professional Standards)

here - Robust file path management
renv - Package dependency management (already chosen)
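As a brief example of what here adds, the sketch below builds a file path relative to the project root instead of relying on the current working directory. The `data/raw/divvy_trips.csv` location is an assumption for illustration.

```r
library(here)

# Build a path relative to the project root, wherever the script is run from.
# The folder layout and file name are hypothetical.
trips_path <- here("data", "raw", "divvy_trips.csv")

# Read the file only if it is actually there
if (file.exists(trips_path)) {
  trips <- read.csv(trips_path)
}
```

Because the path is anchored at the project root, the same script works from the console, from a Quarto document in a subfolder, and on a collaborator's machine.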

Packages NOT Recommended for Your Use Case

Skip These (Not Needed):

arrow - You're using Redshift, not Parquet files directly in R
data.table - tidyverse approach is sufficient for your analysis scale
testthat - dbt handles data testing; R code will be exploratory
SimDesign - Monte Carlo simulations not relevant to bike share analysis
shiny - Using Tableau Public for final dashboards
tidymodels/caret/mlr3 - No machine learning requirements mentioned
packrat - Superseded by renv

Installation Strategy

Phase 1: Core Setup

```
# Essential packages for immediate work
core_packages <- c("tidyverse", "tidyr", "lubridate", "janitor", "skimr", "here")
renv::install(core_packages)
# OR, as an alternative to the line above:
# install.packages(core_packages)
```

Phase 2: Database Integration

```
# Database connectivity for Redshift
db_packages <- c("DBI", "RPostgres", "dbplyr")
renv::install(db_packages)
# OR, as an alternative to the line above:
# install.packages(db_packages)
```

Phase 3: Visualization Enhancement

```
# Advanced visualization capabilities
viz_packages <- c("plotly", "scales", "viridis", "patchwork")
renv::install(viz_packages)
# OR, as an alternative to the line above:
# install.packages(viz_packages)
```

Phase 4: Presentation Tools

```
# Modern presentation system
presentation_packages <- c("quarto", "knitr", "rmarkdown")
renv::install(presentation_packages)
# OR, as an alternative to the line above:
# install.packages(presentation_packages)
```

Project-Specific Considerations

For Divvy Data Analysis:

lubridate - Essential for trip start/end time analysis
scales - Format revenue calculations (dollar signs, percentages)
dbplyr - Write dplyr code that translates to SQL for Redshift
plotly - Interactive exploration of station utilization patterns
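To see how dbplyr turns dplyr code into Redshift SQL without connecting to a cluster, a simulated backend can be used. This is a sketch only; the table name `trips` and its columns are invented.

```r
library(dplyr)
library(dbplyr)

# A lazy table backed by a simulated Redshift connection (no database needed).
# The columns are made up for illustration.
trips <- lazy_frame(
  start_station_name = "Clark St",
  minutes = 25,
  con = simulate_redshift(),
  .name = "trips"
)

# Ordinary dplyr verbs; nothing is executed, only translated
station_summary <- trips %>%
  group_by(start_station_name) %>%
  summarise(
    trip_count    = n(),
    total_minutes = sum(minutes, na.rm = TRUE)
  )

# Inspect the generated SQL
sql_text <- as.character(sql_render(station_summary))
cat(sql_text, "\n")
```

With a real connection, replacing `lazy_frame(...)` with `tbl(con, "trips")` and calling `collect()` on the result would run the same query in the warehouse and return a data frame.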

For Business Presentations:

quarto + reveal.js - Professional slide presentations
viridis - Accessible color palettes for executive presentations
patchwork - Combine multiple revenue/usage charts

renv Workflow

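```
# Initialize project
renv::init()

# Install packages as needed ("package_name" is a placeholder)
renv::install("package_name")
# OR, as an alternative to the line above:
# install.packages("package_name")

# Snapshot current state into renv.lock
renv::snapshot()

# Share the project; collaborators rebuild the same library with
renv::restore()
```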

Total Package Count: 16 Essential Packages

This curated list focuses on your specific needs while avoiding bloat. The selection prioritizes:

Redshift connectivity for ELT pipeline integration
Rapid visualization prototyping with ggplot2 ecosystem
Professional presentation capabilities with Quarto
Business-focused analysis tools for revenue and operational metrics

This streamlined approach ensures fast installation, minimal dependency conflicts, and focused functionality for your Divvy bike data engineering project.