20241122 AWS SageMaker JupyterLab (or any other IDE): set up GitHub username and password
  • Don't use the email you registered with GitHub for commits. Instead, use the noreply proxy email GitHub provides for this purpose: go to 'Settings - Emails' in your GitHub account and you'll find it there (set it as your commit email with `git config user.email`).
  • Don't use your GitHub login password for commits either. Instead, go to 'Settings - Developer Settings - Personal access tokens', create a token, and use it as your password when pushing. Since fine-grained tokens are still in preview, I'm using a classic token for now.

🟒 Different Levels of AWS Resources for Machine Learning Model Training and Deployment

  1. πŸ‘‰ EC2 Instances: Full User Control (Least Pre-built Content)
    With EC2, you have complete control over the entire setup (see the sketch after this list). You need to:
    • Start an EC2 instance (e.g., GPU-enabled for training deep learning models).
    • Install dependencies manually (e.g., Python, ML libraries like PyTorch or TensorFlow).
    • Copy or configure the training script, and handle the training data management (downloading data from S3 or other sources).
    • Run the training process manually using your own code.
    • Manage all aspects of the environment, scaling, and resource management.
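A minimal sketch of the first step only (launching a GPU instance programmatically), assuming boto3 and placeholder values for the region, AMI, and key pair; everything after launch (installing libraries, copying the training script, pulling data from S3) would still be done by hand:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is a placeholder

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical Deep Learning AMI ID
    InstanceType="g4dn.xlarge",       # GPU instance type for training
    KeyName="my-key-pair",            # hypothetical EC2 key pair
    MinCount=1,
    MaxCount=1,
)
print("Launched:", response["Instances"][0]["InstanceId"])
```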

To apply distributed training for the AWS SageMaker Linear Learner algorithm, you would typically rely on SageMaker's built-in distributed training capabilities. The Linear Learner algorithm supports distributed training by scaling across multiple instances and using multiple GPUs or CPU cores.

How to Apply Distributed Training for the Linear Learner Algorithm in SageMaker

1. Using SageMaker Pre-built Containers with Distributed Training

The SageMaker Linear Learner algorithm provides a straightforward way to train across multiple instances: set the instance_count parameter to more than 1 (see the sketch below).
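
A minimal sketch with the SageMaker Python SDK; the IAM role ARN, S3 bucket, and instance type are placeholders, and the key setting is instance_count=2:

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # hypothetical IAM role

# AWS-managed container image for the built-in Linear Learner algorithm
image_uri = image_uris.retrieve("linear-learner", session.boto_region_name)

estimator = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=2,              # >1 enables distributed training across instances
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/linear-learner/output",  # hypothetical bucket
    sagemaker_session=session,
)
estimator.set_hyperparameters(
    predictor_type="binary_classifier",
    mini_batch_size=200,
)
estimator.fit({"train": "s3://my-bucket/linear-learner/train"})  # hypothetical S3 prefix
```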

Steps:
  • Uninstall all VS Code extensions
    Delete C:\Users\*\.vscode\extensions folder
    Reinstall extensions

  • Remove Jupyter kernels (e.g., with `jupyter kernelspec remove <kernel_name>`)

(base) PS D:\github\udacity-nd009t-capstone-starter> jupyter kernelspec list
Available kernels:
  • ⚠️🟒 Issue: training error
[1,mpirank:0,algo-1]<stderr>:../aten/src/ATen/native/cuda/Loss.cu:242: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
[1,mpirank:0,algo-1]<stderr>:../aten/src/ATen/native/cuda/Loss.cu:242: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [6,0,0] Assertion `t >= 0 && t < n_classes` failed.
[1,mpirank:0,algo-1]<stderr>:../aten/src/ATen/native/cuda/Loss.cu:242: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [30,0,0] Assertion `t >= 0 && t < n_classes` failed.
...
[1,mpirank:1,algo-2]<stdout>:  File "train.py", line 675, in <module>
[1,mpirank:1,algo-2]<stdout>:    main(task)
[1,mpirank:1,algo-2]<stdout>:  File "train.py", line 572, in main
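
This CUDA assertion usually means a target label handed to NLLLoss/CrossEntropyLoss is negative or >= the number of classes. A small sanity check that can be run before training (an illustrative helper, not part of the original train.py), assuming the labels are a PyTorch tensor:

```python
import torch

def check_labels(labels: torch.Tensor, n_classes: int) -> None:
    """Fail fast if any label would trip the nll_loss `t >= 0 && t < n_classes` assertion."""
    lo, hi = int(labels.min()), int(labels.max())
    if lo < 0 or hi >= n_classes:
        raise ValueError(f"labels must lie in [0, {n_classes - 1}], got min={lo}, max={hi}")

# usage (hypothetical tensor and class count):
# check_labels(train_labels, n_classes=133)
```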

βœ…βœ…βœ… My working code: Create WebDataset from local data files to local .tar files

## example code for webdataset
import webdataset as wds
import io
import json
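
The original snippet is truncated above; the following is a minimal sketch of the same idea using wds.ShardWriter, with hypothetical local image files and a labels.json mapping, not the author's full code:

```python
import json
import os
import webdataset as wds

# Hypothetical inputs: a folder of JPEGs and a {"file.jpg": class_id} mapping.
image_dir = "./images"
with open("./labels.json") as f:
    labels = json.load(f)

# Write samples into local shards data-000000.tar, data-000001.tar, ...
with wds.ShardWriter("data-%06d.tar", maxcount=1000) as sink:
    for fname, label in labels.items():
        with open(os.path.join(image_dir, fname), "rb") as g:
            image_bytes = g.read()
        sink.write({
            "__key__": os.path.splitext(fname)[0],  # sample key, no extension
            "jpg": image_bytes,                     # raw JPEG bytes
            "cls": str(label).encode("utf-8"),      # label stored as bytes
        })
```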

⚠️ Issue

(awsmle_py310) PS D:\github\udacity-cd13926-Building-Apps-Amazon-Bedrock-exercises\Experiments> aws bedrock invoke-model `
>> --model-id anthropic.claude-3-5-sonnet-20240620-v1:0 `
>> --body file://claude_input.json output.json

usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:
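
One likely cause (an assumption on my part; the note above is cut off) is that InvokeModel is exposed under the bedrock-runtime service, not bedrock, so the CLI rejects the subcommand and prints its usage text. A minimal sketch of the same call through boto3, assuming claude_input.json already contains the Anthropic Messages-format request body:

```python
import json
import boto3

# InvokeModel lives on the "bedrock-runtime" client, not "bedrock".
client = boto3.client("bedrock-runtime", region_name="us-east-1")  # region is a placeholder

with open("claude_input.json") as f:
    body = f.read()

response = client.invoke_model(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    body=body,
    contentType="application/json",
    accept="application/json",
)
print(json.dumps(json.loads(response["body"].read()), indent=2))
```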
  • ☝️ Check my Google Docs


🟒⚠️ It is hard to launch (via roslaunch) the Gazebo world that I created with udacity_office.launch.

🟒 ChatGPT output (could be wrong; verify carefully)

To continuously export SAP HANA data in real time, you can use several methods depending on the target system and the purpose. Here are some of the most common approaches:

1. Smart Data Integration (SDI)

  • Use Case: Real-time data replication and transformation.
  • How: SDI allows you to create real-time data replication tasks between SAP HANA and other systems. You can define data flows that continuously export data from HANA and send it to another system, such as another HANA instance or a non-HANA database.
  • Steps:
    1. Set up a Data Provisioning Agent.
    2. Configure the SDI connection to the target system.