Jason Vertrees inchoate

Selectively Querying Documents in Llama Index

Note: This gist has been updated to be far simpler than the original implementation, focusing on a more streamlined approach to selectively querying documents based on metadata.

Introduction

When working with Llama Index and other Retrieval-Augmented Generation (RAG) systems, most tutorials focus on ingesting and querying a single document. You typically read the document from a source, parse it, embed it, and store it in your vector store. Once there, querying is straightforward. But what if you have multiple documents and want to selectively query only one, such as Document #2 (doc_id=2), from your vector store?

This article demonstrates how to encapsulate the creation of a filtered query engine, which allows you to specify the nodes to query based on custom metadata. This approach provides a more structured and efficient way to retrieve relevant information, making it easier to manage and scale your querying process.
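The idea of a filtered query can be sketched without any particular library. The example below is a minimal, library-independent illustration, not the gist's implementation and not the LlamaIndex API: the `Node` class, the `filtered_query` helper, and the toy two-dimensional embeddings are all hypothetical names and values chosen for this sketch. It keeps only the nodes whose metadata matches the requested `doc_id`, then ranks the survivors by cosine similarity.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Node:
    """Hypothetical stand-in for a stored chunk: text, embedding, metadata."""
    text: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_query(nodes, query_embedding, filters, top_k=2):
    """Keep nodes matching every metadata filter, then rank by similarity."""
    candidates = [
        n for n in nodes
        if all(n.metadata.get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda n: cosine(n.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data: three nodes from two documents (embeddings are made up).
nodes = [
    Node("Doc 1 intro", [1.0, 0.0], {"doc_id": 1}),
    Node("Doc 2 intro", [0.9, 0.1], {"doc_id": 2}),
    Node("Doc 2 details", [0.1, 0.9], {"doc_id": 2}),
]

# Only nodes tagged doc_id=2 are considered, even though a Doc 1 node
# is the closest match to the query vector.
results = filtered_query(nodes, [1.0, 0.0], {"doc_id": 2})
print([n.text for n in results])
```

Filtering before ranking is the design choice that matters here: the Doc 1 node never competes for a slot in the results, no matter how similar it is to the query.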

Emitting Fully Nested Python Schemas and Models with SQLModel

I ran into a problem while developing some cool tools. I had a nested Pydantic structure that I needed to migrate to SQLModel, but I also needed to retain the ability to emit fully nested schemas and models, which was lost in the migration. So I had to write a few helper functions and models.

In the realm of modern web development and data-driven applications, managing complex data structures and their relationships can be a daunting task. SQLModel, a powerful library that bridges the gap between SQLAlchemy and Pydantic, offers an elegant solution for defining and interacting with SQL databases using Python. In this small write-up, we dive into a practical demonstration of leveraging SQLModel to emit fully nested Python schemas and models. Surprisingly, this doesn't work out of the box.

Our focus is on a versatile script that illustrates how to dynamically generate and manipulate nested data structures. By defining SQLModel-based types, we can seamlessly map th

If you've instantiated your CrewAI Crew as a class, using the convenient decorators (@agent, @task, for example) as shown in the documentation, you probably ended up with something like this:

@crew
def crew(self) -> Crew:
    """Creates the crew"""
    return Crew(
        agents=self.agents,
        tasks=self.tasks,
        process=Process.sequential,
        full_output=True,

I found a bug in Azure's az CLI when it parses variables.

I was tagging a lot of resources and initially did this:

export ENV=demo
export VERSION=
export RG=rg-apps-${ENV}${VERSION}
export LOCATION=eastus
export SUBSCRIPTION="my-subscription-id"

Here's my TL;DR on Azure Service Principals, with a full example of creating one to push docker images to Azure Container Registry.

First, you need an Azure Container Registry. So, go make one. Once you're done, go to its page in the UI and click Overview > JSON View. See that Resource ID? Copy that. It will become the scope: the Azure thing we want to give our principal access to. Scopes can be at various levels. In this example, I'm very finely scoping down to ONLY this container registry resource. This principal will have no permissions anywhere else. You can go up levels in the scope hierarchy if you want and, say, provide access to all resources in a resource group or even a subscription. My resource ID looks a bit like this: /subscriptions/1d6a...982f/resourceGroups/my-container-registry/providers/Microsoft.ContainerRegistry/registries/registryname. Also, store the name of your registry if you want to push an image to it later.
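To make the scope hierarchy concrete, here is a small sketch that derives the three scope levels (subscription, resource group, single resource) from a resource ID of the shape shown above. It is only string handling, not an Azure SDK call; the `scope_levels` helper is a hypothetical name, and the all-zero subscription ID is a placeholder, not a real value.

```python
def scope_levels(resource_id: str) -> dict:
    """Split a resource ID into the scopes it sits under, widest first.

    Assumes the layout
    /subscriptions/<id>/resourceGroups/<rg>/providers/<namespace>/<type>/<name>
    """
    parts = resource_id.strip("/").split("/")
    well_formed = (
        len(parts) >= 8
        and parts[0].lower() == "subscriptions"
        and parts[2].lower() == "resourcegroups"
        and parts[4].lower() == "providers"
    )
    if not well_formed:
        raise ValueError(f"unexpected resource ID layout: {resource_id!r}")
    return {
        # Widest scope: everything in the subscription.
        "subscription": "/" + "/".join(parts[:2]),
        # Middle scope: everything in one resource group.
        "resource_group": "/" + "/".join(parts[:4]),
        # Narrowest scope: just this one resource.
        "resource": "/" + "/".join(parts),
    }


# Placeholder subscription ID; substitute your own resource ID.
example_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/my-container-registry"
    "/providers/Microsoft.ContainerRegistry/registries/registryname"
)

levels = scope_levels(example_id)
for name, scope in levels.items():
    print(f"{name}: {scope}")
```

Granting a role at the `resource` scope is the narrow option described above; the other two values are what you would use to go up a level in the hierarchy.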

Second, you need to determine what permissions

Problem

When I click on links from Slack or Outlook on macOS, they open in seemingly random browser windows/profiles. This is annoying.

Solution

Open links in a particular Google Chrome profile window. Be less annoyed.

In Chrome, visit chrome://version and find the desired profile name. Mine was Default. Copy that profile's directory name, like Profile 2 or Default, not the profile's vanity name you see when you click on your profile icon in the browser.
Install Finicky: brew install finicky. After installation, it should be running and you should see its icon in the menu bar at the top of the screen.
From the Finicky Toolbar Item, click > Config > Create New
Edit the new file ~/.finicky and make it look something like this, filling in your profile name:

	"""
	Adding this SQLModel mixin will add created_at, updated_at, and deleted_at to your tables.
	`created_at` will auto-populate upon creation. `updated_at` will do the right thing on updates.

	Usage:

	class YourModel(TimestampMixin):
	...

	At this point all classes inheriting from YourModel will have the usual fields added.

	# Service Graph:
	kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=servicegraph -o jsonpath='{.items[0].metadata.name}') 8088:8088 &

	# Kiali:
	kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=kiali -o jsonpath='{.items[0].metadata.name}') 20001:20001 &

	# Grafana:
	kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=grafana -o jsonpath='{.items[0].metadata.name}') 3000:3000 &

	# Prometheus:

	#
	# This will build the memtier_benchmark tool from Redis Labs and
	# run it against a Redis server running on your Mac's localhost.
	#
	git clone https://github.com/RedisLabs/memtier_benchmark.git
	cd memtier_benchmark
	docker build -t memtier_benchmark .
	docker run --rm -it memtier_benchmark -s host.docker.internal

	"""Checks to see if media URIs listed in the given file are valid.

	Warning: this is untested, shit code.

	"Valid" means:
	- the file is under 32 megs
	- the file exists at the URI

	Usage:
	# check all URLs in the `your-uri-file.txt` and output failures only.