@lanzafame
Forked from SgtCoin/singularity-workshop.md
Last active February 8, 2024 10:42
Singularity Workshop

Singularity Workshop Guide

Welcome to the Singularity Workshop!

Getting Started

This guide walks you through the steps needed to use Singularity to:

  1. Prepare an open dataset from S3
  2. Send deals to a local emulated storage provider (f02815405)
  3. Retrieve the data from the emulated storage provider over HTTP and Bitswap

Initial Setup

Note: This workshop requires you to download sim-sp, a simulated storage provider.

Download Singularity and sim-sp:

  • Get the latest Singularity release here.
  • Download the latest sim-sp release here.
  • Note: Avoid package formats such as '.deb' or '.rpm'.
  1. Extract and Organize:

    • Locate the downloaded files, usually in the Downloads folder.
    • Double-click each file to extract. Move the extracted contents (look for files named singularity and sim-sp) into a new folder on your Desktop, named SingularityWorkshop.
  2. Prepare Your Environment:

    • Open the Terminal application (found in Applications > Utilities).
    • Change the directory to your workshop folder:
cd ~/Desktop/SingularityWorkshop
  • Verify the setup by running:
SingularityWorkshop % ./singularity version  

singularity v0.5.11-5c597b4
     
SingularityWorkshop % ./sim-sp     

NAME:
sim-sp - Utility for simulating a storage provider

USAGE:
sim-sp [global options] command [command options] [arguments...]

COMMANDS:
run            Run the simulated storage provider f02815405
generate-peer  Generate a new peer
help, h        Shows a list of commands or help for one command

GLOBAL OPTIONS:
--help, -h  show help

If macOS reports that the software cannot be opened because it is from an unidentified developer, remove the quarantine attribute so that Gatekeeper allows the binaries to run, using the following commands.

xattr -d com.apple.quarantine singularity
xattr -d com.apple.quarantine sim-sp

Windows users: use the .exe binaries, without the leading ./, for every command in this workshop. For example:

singularity.exe version
sim-sp.exe -h
  3. Download IPFS CLI:
    • Download the IPFS CLI for Mac from here.
    • Repeat the extraction process and ensure the ipfs executable is in the SingularityWorkshop folder.
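Before moving on, it can help to confirm that all three binaries ended up in the workshop folder. The following is a minimal sketch, assuming a POSIX shell on macOS or Linux; the function name check_workshop_bins is made up for this example and is not part of Singularity, sim-sp, or IPFS.

```shell
#!/bin/sh
# Report which workshop binaries are present and executable in a folder.
# check_workshop_bins is a hypothetical helper written for this guide.
check_workshop_bins() {
    dir=$1
    missing=0
    for name in singularity sim-sp ipfs; do
        if [ -x "$dir/$name" ]; then
            echo "ok       $name"
        else
            echo "missing  $name"
            missing=1
        fi
    done
    # Exit status is 0 only when every binary was found.
    return "$missing"
}

# Example call (not executed here):
#   check_workshop_bins ~/Desktop/SingularityWorkshop
```

On Windows the file names end in .exe, so the list of names in the loop would need adjusting.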

Singularity Setup

  1. Initialize Singularity:
    • In the Terminal, within your SingularityWorkshop folder, initialize Singularity's database:
    ./singularity admin init
     
    2024-01-23T00:15:08.417-0500	INFO	model	model/migrate.go:51	Auto migrating tables
    

Resetting the Database:

  • In case you need to reset the database, perform the following:
./singularity admin reset --really-do-it

2024-01-23T00:17:47.840-0500	INFO	model	model/migrate.go:100	Dropping all tables
2024-01-23T00:17:47.847-0500	INFO	model	model/migrate.go:51	Auto migrating tables

Preparing Data

  1. Connect to AWS S3:
    • Register the AWS S3 bucket as a storage connection named civic:
./singularity storage create s3 aws --region us-west-2 --name civic --path civic-aws-opendata
     
ID  Name   Type  Path                
1   civic  s3    civic-aws-opendata  
  • Check the connection:
./singularity storage list
     
ID  Name   Type  Path                
1   civic  s3    civic-aws-opendata  
  2. Data Preparation:
    • Create a preparation named civic that uses the civic storage connection as its source:
./singularity prep create --name civic --source civic

ID  Name   DeleteAfterExport  MaxSize      PieceSize    NoInline  NoDag  
1   civic  false              33822867456  34359738368  false     false  
Source Storages:
   ID  Name   Type  Path                
   1   civic  s3    civic-aws-opendata  
  • Start the scanning process:
./singularity prep start-scan civic civic

ID  Type  State  ErrorMessage  WorkerID  
1   scan  ready                <nil>     

Note: The data source is now marked as ready to be scanned, but no worker is running yet to scan and prepare the dataset. Normally a dataset worker runs all the time; in this workshop we run it on demand. The worker command in the next step takes about a minute, depending on your Internet speed. It may appear to hang at a line such as created pack job 2 with 43 file ranges (the numbers may vary), but it will finish shortly and exit once data preparation is complete.

  3. Run Dataset Worker:
    • Open a new Terminal window and navigate to SingularityWorkshop.
    • Run the dataset worker:
./singularity run dataset-worker --exit-on-error --exit-on-complete

...
2024-01-23T09:05:01.837-0500	INFO	pushfile	push/pushfile.go:97	new file	{"file":         {"id":0,"cid":"","path":"VariantSummaries/date=01-Sep-2023/VariantSummaries.tsv","hash":"28efaeaa78cb555dfbab637306f5fcb3","size":383564,"lastModifiedNano":1695064925000000000,"attachmentId":1,"directoryId":null}}
2024-01-23T09:05:01.841-0500	INFO	scan	scan/scan.go:122	created pack job 2 with 67 file ranges
  4. DAG Generation:
    • Once the dataset-worker has completed, start DAG generation:
./singularity prep start-daggen civic civic

ID  Type    State  ErrorMessage  WorkerID  
3   daggen  ready                <nil>    
  • Run the dataset worker again to complete the process:
./singularity run dataset-worker --exit-on-error --exit-on-complete

2024-01-23T09:07:18.435-0500	INFO	datasetworker	service/service.go:61	starting service	{"name": "Preparation Worker Thread - 6be79f85-881e-49d1-8991-6832d22108ad"}
2024-01-23T09:07:18.448-0500	INFO	datasetworker	datasetworker/datasetworker.go:281	no work found, exiting	{"workerID": "6be79f85-881e-49d1-8991-6832d22108ad"}
2024-01-23T09:07:18.448-0500	INFO	datasetworker	datasetworker/datasetworker.go:105	healthcheck cleanup stopped	{"workerID": "6be79f85-881e-49d1-8991-6832d22108ad"}
2024-01-23T09:07:18.448-0500	INFO	datasetworker	datasetworker/datasetworker.go:98	health report stopped	{"workerID": "6be79f85-881e-49d1-8991-6832d22108ad"}
2024-01-23T09:07:18.448-0500	INFO	datasetworker	datasetworker/datasetworker.go:124	cleanup complete	{"workerID": "6be79f85-881e-49d1-8991-6832d22108ad"}
2024-01-23T09:07:18.448-0500	INFO	datasetworker	datasetworker/datasetworker.go:132	worker thread finished	{"workerID": "6be79f85-881e-49d1-8991-6832d22108ad"}
2024-01-23T09:07:18.448-0500	INFO	datasetworker	service/service.go:82	service stopped	{"stopped": "1 out of 1"}
2024-01-23T09:07:18.448-0500	INFO	datasetworker	service/service.go:103	all services stopped

Great, data preparation is complete. You can now list all prepared pieces with the command below. It shows the piece size (PieceSize) and piece CID (PieceCID) of each piece, which are key parameters in deal proposals to storage providers.

  • List all prepared pieces:
   ./singularity prep list-pieces civic

   AttachmentID  SourceStorageID  
   1             1                
   SourceStorage
      ID  Name   Type  Path                
      1   civic  s3    civic-aws-opendata  
   Pieces
      PieceCID                                                          PieceSize    RootCID                                                      FileSize  StoragePath  
      baga6ea4seaqibzb2aj5ueuzexkwo5bqn465gpnkmpinz3ycnfeuiknyx6yfgmga  34359738368  bafkreienh4xooo72bvfbcxls5paoh4zjbms3mtyncce3acs235d2cx3mmu  60303302               
      baga6ea4seaqfsqn7f2rhapcbysiamw7rfd6tes5p3hq6jh75vmvicb4lmoscsda  34359738368  bafybeihien4mh6kuiggh3hpejefwihkedhcjwihvz35j7iqj3gfvqmukgi  11748 
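The sizes in these tables are raw byte counts, which are hard to read at a glance. As a small worked example, the sketch below converts the values shown above into binary units with integer shell arithmetic. The helper names are made up for illustration, and the results are rounded down to whole units.

```shell
#!/bin/sh
# Convert byte counts to whole GiB or MiB (rounded down).
# bytes_to_gib and bytes_to_mib are hypothetical helpers for this guide.
bytes_to_gib() { echo $(( $1 / 1073741824 )); }   # 1 GiB = 2^30 bytes
bytes_to_mib() { echo $(( $1 / 1048576 )); }      # 1 MiB = 2^20 bytes

bytes_to_gib 34359738368   # PieceSize          -> 32    (32 GiB)
bytes_to_mib 33822867456   # MaxSize            -> 32256 (31.5 GiB)
bytes_to_mib 60303302      # FileSize, piece 1  -> 57    (about 57 MiB)
```

In this run the first piece carries roughly 57 MiB of file data while its PieceSize is 32 GiB, so the piece size is far larger than the data it holds.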

Distributing and Retrieving Data

  1. Running the Content Provider:
    • In a third Terminal window, run the content provider:
./singularity run content-provider
2024-01-23T09:08:57.182-0500	INFO	contentprovider	service/service.go:61	starting service	{"name": "HTTPServer"}

      ____    __
     / __/___/ /  ___
    / _// __/ _ \/ _ \
   /___/\__/_//_/\___/ v4.10.2
   High performance, minimalist Go web framework
   https://echo.labstack.com
   ____________________________________O/_______
                                       O\
   ⇨ http server started on 127.0.0.1:7777
  2. Making Deals:

Note: Go back to your first Terminal window before executing the following commands.

  • Import a test wallet. We will use a known private key for the test:
./singularity wallet import

   Enter the private key:          7b2254797065223a22736563703235366b31222c22507269766174654b6579223a226b35507976337148327349586343595a58594f5775453149326e32554539436861556b6c4e36695a5763453d227d
ID        Address                                    
f0808055  f1fib3pv7jua2ockdugtz7viz3cyy6lkhh7rfx3sa 
  • Attach the wallet to your preparation civic:
./singularity prep attach-wallet civic f0808055

ID  Name   DeleteAfterExport  MaxSize      PieceSize    NoInline  NoDag  
1   civic  false              33822867456  34359738368  false     false  
Wallets
   ID        Address                                    
   f0808055  f1fib3pv7jua2ockdugtz7viz3cyy6lkhh7rfx3sa  
Source Storages:
   ID  Name   Type  Path                
   1   civic  s3    civic-aws-opendata
  • In your second Terminal window (the one used for the dataset worker), run the emulated storage provider:
./sim-sp run

2024-01-23T09:34:25.065-0500	INFO	sim-sp	sim-sp/start.go:92	Creating libp2p host with peer ID: 12D3KooWDeNSud283YaRmhqbZDynLNmtATBxjUPAUJxtPyEXXp9u
2024-01-23T09:34:25.080-0500	INFO	sim-sp	sim-sp/contentprovider.go:43	Starting bitswap content provider
2024-01-23T09:34:25.082-0500	INFO	sim-sp	sim-sp/contentprovider.go:55	Starting HTTP server on :7778
  3. Send Out Deals:

Finally, it's time to send out the deals. To keep things simple, we will send deals for all available pieces of this open dataset to the storage provider. Do not change the miner ID f02815405, and do not replace {PIECE_CID}; keep the placeholder exactly as written.

  • In your first Terminal window, schedule the deals:
./singularity deal schedule create --verified=false --preparation civic --provider f02815405 --url-template "http://127.0.0.1:7777/piece/{PIECE_CID}"

ID  Provider   TotalDealSize  Verified  StartDelay  Duration    State   ScheduleCron  ScheduleCronPerpetual  ScheduleDealNumber  ScheduleDealSize  MaxPendingDealNumber  MaxPendingDealSize  Notes  Force  PreparationID  
1   f02815405  0              false     72h0m0s     12840h0m0s  active                false                  0                   0                 0                     0                          false  1              
  • In another new Terminal window, push out the deals:
./singularity run deal-pusher

2024-01-23T09:41:50.255-0500	INFO	dealpusher	service/service.go:61	starting service	{"name": "DealPusher"}
2024-01-23T09:41:50.258-0500	INFO	dealpusher	dealpusher/dealpusher.go:64	start
2024-01-23T09:41:50.263-0500	INFO	dealpusher	dealpusher/dealpusher.go:497	adding new schedule	{"schedule_id": 1}
2024-01-23T09:41:50.265-0500	INFO	dealpusher	dealpusher/dealpusher.go:276	current stats for schedule	{"schedule_id": 1, "pending": {"DealNumber":0,"DealSize":0}, "total": {"DealNumber":0,"DealSize":0}, "current": {"DealNumber":0,"DealSize":0}}
2024-01-23T09:41:50.266-0500	INFO	replication	replication/makedeal.go:520	making deal	{"client": "f0808055", "pieceCID": "baga6ea4seaqibzb2aj5ueuzexkwo5bqn465gpnkmpinz3ycnfeuiknyx6yfgmga", "provider": "f02815405"}
2024-01-23T09:41:51.834-0500	INFO	replication	replication/makedeal.go:520	making deal	{"client": "f0808055", "pieceCID": "baga6ea4seaqfsqn7f2rhapcbysiamw7rfd6tes5p3hq6jh75vmvicb4lmoscsda", "provider": "f02815405"}
2024-01-23T09:41:51.838-0500	INFO	dealpusher	dealpusher/dealpusher.go:343	no more pieces to send deal	{"schedule_id": 1}
2024-01-23T09:41:51.839-0500	INFO	dealpusher	dealpusher/dealpusher.go:115	schedule completed	{"schedule": 1}
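The instruction to leave {PIECE_CID} untouched in the --url-template value suggests that Singularity fills in the placeholder itself for each piece. As an illustration only, and assuming a plain string substitution, the sketch below shows what the download URL for the first piece would look like. The helper expand_url_template is hypothetical and is not a Singularity command.

```shell
#!/bin/sh
# Expand a download URL template for one piece.
# expand_url_template is a made-up helper; the real substitution is
# assumed to happen inside Singularity when it builds deal proposals.
expand_url_template() {
    template=$1
    piece_cid=$2
    printf '%s\n' "$template" | sed "s|{PIECE_CID}|$piece_cid|g"
}

# Template and PieceCID copied from the commands and logs above.
expand_url_template "http://127.0.0.1:7777/piece/{PIECE_CID}" \
    "baga6ea4seaqibzb2aj5ueuzexkwo5bqn465gpnkmpinz3ycnfeuiknyx6yfgmga"
```

The host and port in the template match the address the content provider reported when it started (127.0.0.1:7777), which is why that service must be running while deals are made.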

You will now see activity in three different Terminal windows:

  • Singularity Deal Pusher is sending deal proposals to the emulated storage provider.
  • sim-sp is receiving Boost online deals and is trying to download and parse the CAR files from the Singularity Content Provider.
  • Singularity Content Provider is reading S3 objects and converting them into a CAR stream for download.

Wait until you see two lines of the form Deal xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx completed successfully; you are then ready for the retrieval section. At that point you may stop the deal-pusher and content-provider services, but leave sim-sp running for retrievals.
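If you would rather not watch the window, you can count the completion messages in a saved log. This is a rough sketch that assumes you started sim-sp with its output copied to a file, for example with ./sim-sp run 2>&1 | tee sim-sp.log. The file name sim-sp.log and both function names are made up for this example, and the only log text relied on is the phrase completed successfully quoted above.

```shell
#!/bin/sh
# Count "completed successfully" lines in a saved sim-sp log file.
# Both helpers are hypothetical and written for this guide.
count_completed_deals() {
    grep -c 'completed successfully' "$1"
}

# Succeeds once at least $2 deals (default: 2) have completed.
deals_ready() {
    [ "$(count_completed_deals "$1")" -ge "${2:-2}" ]
}

# Example call (not executed here):
#   deals_ready sim-sp.log && echo "ready for retrievals"
```

The default of two matches the two pieces prepared in this workshop; a different dataset would need a different count.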

Retrievals

The emulated storage provider enables an IPFS gateway, so you can browse the dataset by its RootCID at http://127.0.0.1:7778/ipfs/bafybeihien4mh6kuiggh3hpejefwihkedhcjwihvz35j7iqj3gfvqmukgi. This RootCID comes from the preparation result, which you can view with ./singularity prep explore civic civic

You can browse the files through the IPFS gateway and download them with any HTTP client.
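As a sketch of an HTTP download, the helper below builds a gateway URL from a RootCID and a file path. The gateway address and RootCID are taken from the text above. The file path is the one that appeared in the dataset worker log, and it is an assumption that the gateway exposes it at the same relative path under the RootCID. The function name gateway_url is made up for this example.

```shell
#!/bin/sh
# Build an IPFS gateway URL for a file under a RootCID.
# gateway_url is a hypothetical helper; 127.0.0.1:7778 is the sim-sp
# HTTP address shown in its startup log.
gateway_url() {
    root_cid=$1
    path=$2
    printf 'http://127.0.0.1:7778/ipfs/%s/%s\n' "$root_cid" "$path"
}

gateway_url bafybeihien4mh6kuiggh3hpejefwihkedhcjwihvz35j7iqj3gfvqmukgi \
    "VariantSummaries/date=01-Sep-2023/VariantSummaries.tsv"
```

With sim-sp still running, the printed URL can be passed to a client such as curl, for example curl -fL -o VariantSummaries.tsv followed by the URL in quotes.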

  1. Initialize IPFS:
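    • In a new Terminal window, initialize and run IPFS. If the ipfs binary is only in your SingularityWorkshop folder and not on your PATH, run it as ./ipfs. The daemon keeps running in the foreground, so use another Terminal window for the commands that follow: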
ipfs init
ipfs daemon
  2. Connect to Storage Provider:
    • Connect to the emulated storage provider:
ipfs swarm connect /ip4/127.0.0.1/tcp/24001/p2p/12D3KooWDeNSud283YaRmhqbZDynLNmtATBxjUPAUJxtPyEXXp9u

connect 12D3KooWDeNSud283YaRmhqbZDynLNmtATBxjUPAUJxtPyEXXp9u success
  3. Retrieve Dataset:
    • Retrieve the dataset using the RootCID:
ipfs get -o out bafybeihien4mh6kuiggh3hpejefwihkedhcjwihvz35j7iqj3gfvqmukgi
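The address used in the swarm connect step combines the local host, port 24001, and the peer ID that sim-sp printed when it started. If your sim-sp reports a different peer ID, the address must change to match. The sketch below extracts the peer ID from that startup line and rebuilds the address. Both function names are made up for this example, and the port is simply the one used in the command above.

```shell
#!/bin/sh
# Extract the peer ID from a sim-sp startup log line read on stdin.
# peer_id_from_log and swarm_addr are hypothetical helpers for this guide.
peer_id_from_log() {
    sed -n 's/.*Creating libp2p host with peer ID: \([A-Za-z0-9]*\).*/\1/p' | head -n 1
}

# Build the multiaddr for "ipfs swarm connect" from a peer ID.
swarm_addr() {
    printf '/ip4/127.0.0.1/tcp/24001/p2p/%s\n' "$1"
}

# Startup line copied from the sim-sp output above (whitespace simplified).
line='2024-01-23T09:34:25.065-0500 INFO sim-sp sim-sp/start.go:92 Creating libp2p host with peer ID: 12D3KooWDeNSud283YaRmhqbZDynLNmtATBxjUPAUJxtPyEXXp9u'
swarm_addr "$(printf '%s\n' "$line" | peer_id_from_log)"
```

The printed address is the argument to pass to ipfs swarm connect.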

You've successfully navigated the Singularity Workshop! If you encounter any issues or have questions, refer to the Filecoin documentation for additional information.

Note: The new version of Boost (v2) no longer supports the trusted IPFS gateway by default. See https://boost.filecoin.io/retrieving-data-from-filecoin/http-retrieval#trusted-plain-file-directory-retrievals
