Yash Bonde yashbonde

File System UX on Postgres

What are the properties of a file system UX, and how did we implement it on Postgres?

Before we begin, here's the table SQL:

CREATE TABLE documents (
	id text NOT NULL,
	collection_id text NULL,

NimbleBox Apprenticeship Open Challenges

Hi there, thank you for your interest in NimbleBox.

We built ChatNBX and ChainFury

We are looking for talented coders who sit at the intersection of ML and SDE. There are three positions:

ML Engineer: Systems-level thinker, eats servers for lunch
ML Researcher: Model whisperer, any model, anytime, anywhere
Front-End Engineer: UI Wizard, 'nuff said

Why the fury?

(ENG-01) The first engineering blog.

ChainFury started as a weekend hackathon but has since grown into a much bigger project (dare I say, one of the last systems). The core idea behind it is rapid development (with chains), deployment (with an embeddable chatbot UI), and gathering feedback on performance. It was initially built with langflow as inspiration, which was in turn built on top of langchain.

Chandrani's written a great starting blog on ChainFury.

	// This is a copy of saturn/runner.py translated to Go

	package main

	import (
		"encoding/json"
		"fmt"
		"io/ioutil"
		"os"

	import os
	import sys
	import json
	import logging
	import subprocess
	from pathlib import Path

	try:
	    import tensorflow as tf
	except:

	#!/usr/bin/env python

	# Copyright 2020 DeepMind Technologies Limited. All Rights Reserved.
	# Licensed under the Apache License, Version 2.0
	# Modifications copyright Yash Bonde (C) 2021 Nimblebox.ai, Inc.

	# This file is peak Google! <3
	# How far can you push Python before it's just too hard?

	from typing import Any, Dict, Iterable, List, Tuple, Optional

	# wrapper for using GPT generation first-class
	# MIT - License, 2021, Yash Bonde

	import os
	import torch
	import pickle
	import hashlib
	import warnings
	import numpy as np
	from time import time

	In this quick script we are trying to solve the sharding problem:
	often, with very large datasets, there is no way to tokenize everything and
	store it. For CLM datasets we have a fixed dataset where each row
	has a dynamic number of tokens. A dummy example looks as follows:

	j n sequence (w/o EOT = 42)
	[0] [15] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
	[1] [13] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
	[2] [11] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
	[3] [13] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
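
The table above can be read as: `n` is the row length once the EOT token (42) is appended. What follows is only a minimal sketch, not the original script: it assumes the goal is to stream rows one at a time, append EOT, and cut the concatenated token stream into fixed-size blocks without ever materializing the whole dataset. The names `rows_with_eot`, `pack_blocks`, and `block_size`, and the choice to drop the trailing partial block, are assumptions made for illustration.

```python
from typing import Iterable, Iterator, List

EOT = 42  # end-of-text token id, taken from the table header above


def rows_with_eot(rows: Iterable[List[int]], eot: int = EOT) -> Iterator[List[int]]:
    """Lazily append the EOT token to every row."""
    for row in rows:
        yield list(row) + [eot]


def pack_blocks(
    rows: Iterable[List[int]], block_size: int, eot: int = EOT
) -> Iterator[List[int]]:
    """Yield fixed-length blocks cut from the concatenated token stream.

    The buffer never holds more than one row plus block_size - 1 tokens.
    The trailing partial block is dropped (an assumption of this sketch).
    """
    buffer: List[int] = []
    for row in rows_with_eot(rows, eot):
        buffer.extend(row)
        while len(buffer) >= block_size:
            yield buffer[:block_size]
            buffer = buffer[block_size:]


# The four dummy rows from the table, without EOT.
dummy = [list(range(14)), list(range(12)), list(range(10)), list(range(12))]
lengths = [len(r) for r in rows_with_eot(dummy)]
blocks = list(pack_blocks(dummy, block_size=8))
```

Here `lengths` reproduces the `n` column of the table, and each entry of `blocks` is a fixed-length slice that may span a row boundary, with EOT marking where one row ends and the next begins.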
