@dtaivpp
Last active April 16, 2025 21:12
Semantic Search with OpenSearch and Cohere

Cluster Settings:

PUT /_cluster/settings
{
    "persistent": {
        "plugins.ml_commons.allow_registering_model_via_url": true,
        "plugins.ml_commons.only_run_on_ml_node": false,
        "plugins.ml_commons.connector_access_control_enabled": true,
        "plugins.ml_commons.model_access_control_enabled": true,
        "plugins.ml_commons.trusted_connector_endpoints_regex": [
          "^https://runtime\\.sagemaker\\..*[a-z0-9-]\\.amazonaws\\.com/.*$",
          "^https://api\\.openai\\.com/.*$",
          "^https://api\\.cohere\\.ai/.*$"
        ]
    }
}
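The `trusted_connector_endpoints_regex` list decides which URLs a connector is allowed to call, so it is worth checking a connector URL against the patterns before creating the connector. The following is a minimal Python sketch of that check. It assumes Python's `re` module behaves like the cluster's regex engine for these simple patterns, and the helper name `is_trusted` is hypothetical.

```python
import re

# Patterns copied from the cluster settings above (JSON escaping removed).
TRUSTED_ENDPOINTS = [
    r"^https://runtime\.sagemaker\..*[a-z0-9-]\.amazonaws\.com/.*$",
    r"^https://api\.openai\.com/.*$",
    r"^https://api\.cohere\.ai/.*$",
]


def is_trusted(url: str) -> bool:
    """Return True if the URL matches at least one allow-listed pattern."""
    return any(re.match(pattern, url) for pattern in TRUSTED_ENDPOINTS)


# The URL used by the connector in this gist.
print(is_trusted("https://api.cohere.ai/v1/embed"))
# A different host is rejected unless its own pattern is added to the list.
print(is_trusted("https://api.cohere.com/v1/embed"))
```

If a connector points at a host that is not covered by one of the patterns, the list in the cluster settings would need an extra entry for it.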

Create Model Group:

POST /_plugins/_ml/model_groups/_register
{
    "name": "Cohere_Group",
    "description": "Public Cohere Model Group",
    "access_mode": "public"
}
# MODEL_GROUP_ID: 

Create Connector:

POST /_plugins/_ml/connectors/_create
{
    "name": "Cohere Connector",
    "description": "External connector for connections into Cohere",
    "version": "1.0",
    "protocol": "http",
    "credential": {
        "cohere_key": "<COHERE KEY HERE>"
    },
    "parameters": {
        "model": "embed-english-v2.0",
        "truncate": "END"
    },
    "actions": [{
        "action_type": "predict",
        "method": "POST",
        "url": "https://api.cohere.ai/v1/embed",
        "headers": {
            "Authorization": "Bearer ${credential.cohere_key}"
        },
        "request_body": "{ \"texts\": ${parameters.prompt}, \"truncate\": \"${parameters.truncate}\", \"model\": \"${parameters.model}\" }",
        "pre_process_function": "connector.pre_process.cohere.embedding",
        "post_process_function": "connector.post_process.cohere.embedding"
    }]
}
# CONNECTOR_ID:
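The connector's `request_body` is a template: each `${parameters.<name>}` placeholder is filled in before the request is sent to Cohere. The Python sketch below illustrates that idea only. It is not the ml-commons implementation, and the `render` helper and its rule for inserting non-string values as JSON are assumptions made for the example.

```python
import json
import re

# request_body template copied from the connector definition above.
REQUEST_BODY = (
    '{ "texts": ${parameters.prompt}, '
    '"truncate": "${parameters.truncate}", '
    '"model": "${parameters.model}" }'
)


def render(template: str, parameters: dict) -> str:
    """Replace each ${parameters.<name>} placeholder with its value.

    Strings are inserted as-is (the template supplies the quotes);
    any other value is inserted as JSON.
    """
    def substitute(match: re.Match) -> str:
        value = parameters[match.group(1)]
        return value if isinstance(value, str) else json.dumps(value)

    return re.sub(r"\$\{parameters\.(\w+)\}", substitute, template)


body = render(
    REQUEST_BODY,
    {
        "prompt": ["This should get embedded."],
        "truncate": "END",
        "model": "embed-english-v2.0",
    },
)
# The rendered body is valid JSON in the shape the template describes.
print(json.loads(body))
```

Note that `texts` is not quoted in the template because it receives a JSON array, while `truncate` and `model` are quoted because they receive plain strings.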

Register and deploy a model to the cluster:

POST /_plugins/_ml/models/_register?deploy=true
{
    "name": "embed-english-v2.0",
    "function_name": "remote",
    "description": "test model",
    "model_group_id": "<MODEL_GROUP_ID>",
    "connector_id": "<CONNECTOR_ID>"
}
# TASK_ID: 

Check the task. Once it has completed, the response should show the model as created and include the Model ID:

GET /_plugins/_ml/tasks/<TASK_ID>
# MODEL_ID: 
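Registration runs as a task, so the model ID may not be available on the first request. A minimal Python sketch of polling for it is shown here. The `wait_for_model_id` helper is hypothetical, and it assumes the task response carries a `state` field (with values such as `COMPLETED` and `FAILED`), a `model_id` field, and an `error` field on failure. The cluster call itself is left to the caller so that the example runs without a cluster.

```python
import time
from typing import Callable, Optional


def wait_for_model_id(
    fetch_task: Callable[[], dict],
    attempts: int = 10,
    delay_seconds: float = 1.0,
) -> Optional[str]:
    """Poll a task until it completes and return its model_id.

    fetch_task is any callable that returns the parsed JSON of
    GET /_plugins/_ml/tasks/<TASK_ID>.
    Returns None if the task has not completed after all attempts.
    """
    for _ in range(attempts):
        task = fetch_task()
        state = task.get("state")  # assumed field name
        if state == "COMPLETED":
            return task.get("model_id")
        if state == "FAILED":
            raise RuntimeError(task.get("error", "model registration failed"))
        time.sleep(delay_seconds)
    return None


# Stand-in responses for demonstration; a real fetch_task would call the cluster.
responses = iter([
    {"state": "CREATED"},
    {"state": "COMPLETED", "model_id": "example-model-id"},
])
print(wait_for_model_id(lambda: next(responses), delay_seconds=0))
```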

Test the model embedding

POST /_plugins/_ml/_predict/text_embedding/<MODEL_ID>
{
  "text_docs": ["This should get embedded."],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}

Create Ingestion Pipeline

PUT _ingest/pipeline/cohere-ingest-pipeline
{
  "description": "Cohere Neural Search Pipeline",
  "processors" : [
    {
      "text_embedding": {
        "model_id": "<MODEL_ID>",
        "field_map": {
          "content": "content_embedding"
        }
      }
    }
  ]
}

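Create k-NN index. Note: the index's space type needs to match the similarity metric recommended for the model (e.g. embed-english-v2.0 recommends cosine similarity), and the dimension needs to match the model's embedding size (4096 for embed-english-v2.0):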

PUT /cohere-index
{
	"settings": {
		"index.knn": true,
		"default_pipeline": "cohere-ingest-pipeline"
	},
	"mappings": {
		"properties": {
			"content_embedding": {
				"type": "knn_vector",
				"dimension": 4096,
				"method": {
					"name": "hnsw",
					"space_type": "cosinesimil",
					"engine": "nmslib"
				}
			},
			"content": {
				"type": "text"
			}
		}
	}
}
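For reference, cosine similarity compares the direction of two vectors and ignores their length, which is why the space type matters for how neighbors are ranked. The plain-Python sketch below computes it for two small vectors. It is only an illustration of the metric and says nothing about how the k-NN engine converts it into a search score.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Same direction, different length.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))
# Orthogonal vectors.
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))
```

The dimension check mirrors the index mapping: a vector whose length differs from the mapped `dimension` cannot be compared with the stored vectors.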

Hydrate index with _bulk

POST _bulk
{ "create" : { "_index" : "cohere-index", "_id" : "1" }}
{ "content":"Testing neural search"}
{ "create" : { "_index" : "cohere-index", "_id" : "2" }}
{ "content": "What are we doing"}
{ "create" : { "_index" : "cohere-index", "_id" : "3" } }
{ "content": "This should exist"}
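The `_bulk` body is newline-delimited JSON: one action line followed by one document line per item, with a newline after the last line. When the documents come from a script rather than being typed by hand, the body can be generated as in this Python sketch. The `bulk_create_body` helper is hypothetical, and sending the body to the cluster is left out.

```python
import json
from typing import Iterable, Tuple


def bulk_create_body(index: str, docs: Iterable[Tuple[str, dict]]) -> str:
    """Build an NDJSON _bulk body of create actions for (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"create": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_create_body(
    "cohere-index",
    [
        ("1", {"content": "Testing neural search"}),
        ("2", {"content": "What are we doing"}),
        ("3", {"content": "This should exist"}),
    ],
)
print(body, end="")
```

Because the index sets `default_pipeline`, each document only needs the `content` field; the embedding field is filled in by the ingest pipeline.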

Search

GET /cohere-index/_search
{
  "query": {
    "bool" : {
      "should" : [
        {
          "script_score": {
            "query": {
              "neural": {
                "content_embedding": {
                  "query_text": "How do I ingest to opensearch",
                  "model_id": "<MODEL_ID>",
                  "k": 10
                }
              }
            },
            "script": {
              "source": "_score * 1.5"
            }
          }
        },
        {
          "script_score": {
            "query": {
              "match": { "content": "I want information about the new compression algorithms in OpenSearch" }
            },
            "script": {
              "source": "_score * 1.7"
            }
          }
        }
      ]
    }
  }
}
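The search combines a neural clause and a keyword clause under `bool`/`should`, each wrapped in `script_score` to weight it. A `bool` query adds up the scores of the `should` clauses that match, so a document matched by both clauses ends up with the sum of the two weighted scores. The Python sketch below builds a query of this shape and shows that arithmetic. The helper names are hypothetical, the neural clause is nested under `query` because `script_score` expects its inner query there, and passing `model_id` explicitly is an assumption.

```python
import json


def hybrid_query(
    neural_text: str,
    match_text: str,
    model_id: str,
    k: int = 10,
    neural_boost: float = 1.5,
    match_boost: float = 1.7,
) -> dict:
    """Build a bool/should query with a weighted neural and a weighted match clause."""
    neural_clause = {
        "script_score": {
            "query": {
                "neural": {
                    "content_embedding": {
                        "query_text": neural_text,
                        "model_id": model_id,  # assumption: passed explicitly
                        "k": k,
                    }
                }
            },
            "script": {"source": f"_score * {neural_boost}"},
        }
    }
    match_clause = {
        "script_score": {
            "query": {"match": {"content": match_text}},
            "script": {"source": f"_score * {match_boost}"},
        }
    }
    return {"query": {"bool": {"should": [neural_clause, match_clause]}}}


def combined_score(
    neural_score: float,
    match_score: float,
    neural_boost: float = 1.5,
    match_boost: float = 1.7,
) -> float:
    """Score of a document matched by both clauses (bool sums matching should clauses)."""
    return neural_boost * neural_score + match_boost * match_score


query = hybrid_query(
    "How do I ingest to opensearch",
    "compression algorithms in OpenSearch",
    model_id="<MODEL_ID>",
)
print(json.dumps(query, indent=2))
```

Since the two clauses produce scores on different scales, the weights (1.5 and 1.7 here) are values to tune against real queries rather than fixed recommendations.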

Cleanup

POST /_plugins/_ml/models/<MODEL_ID>/_undeploy
DELETE /_plugins/_ml/models/<MODEL_ID>
DELETE /_plugins/_ml/connectors/<CONNECTOR_ID>
DELETE _ingest/pipeline/cohere-ingest-pipeline
DELETE cohere-index

Troubleshoot:

POST /_plugins/_ml/models/<MODEL_ID>/_predict
{
  "parameters": {
    "texts": ["This should exist"]
  }
}

GET /cohere-index/_search
{
  "query": {
    "match_all": {}
  }
}
@cloudsmithy


Is this your locally deployed OpenSearch? I got an error using AWS hosting: "Message": "Your request: '/_cluster/settings' payload is not allowed."

@dtaivpp

dtaivpp commented Dec 8, 2023

Author

Yes, Amazon OpenSearch Service does not have these feature flags (for conversational search and RAG). Connectors and hybrid search should be available, however, if I'm not mistaken.

@jonwiggins

jonwiggins commented Dec 12, 2023


@dtaivpp I get that error whenever I try to enable plugins.ml_commons.allow_registering_model_via_url - It looks like this is required to load the sparse search models (https://opensearch.org/docs/latest/ml-commons-plugin/pretrained-models/#step-2-register-a-local-opensearch-provided-model). Does Amazon OpenSearch not support this? Or do you know if there are any plans to?

@dtaivpp

dtaivpp commented Dec 13, 2023

Author

@jonwiggins Reading through the docs, it seems like it should, but let me take a look and see if I can test it.

@jonwiggins


@dtaivpp Awesome, thank you! Totally stuck trying to use this feature, I really appreciate the help.

@dtaivpp

dtaivpp commented Dec 14, 2023

Author

@jonwiggins Just occurred to me, which version of OpenSearch are you currently deploying?

@jonwiggins


@dtaivpp OpenSearch_2_11_R20231113-P1 - which I think is the latest version AWS allows.
Are you able to turn that feature on in AWS OpenSearch?

@dtaivpp

dtaivpp commented Dec 22, 2023

Author

@jonwiggins So after some digging, it seems it is available, but only with hosted models on SageMaker. Self-run models aren't currently supported -_-

@jonwiggins


Darn, that's really disappointing. Thanks for getting back to me though; otherwise I would have kept trying to make it work. I appreciate it.

@dtaivpp

dtaivpp commented Dec 22, 2023

Author

All good @jonwiggins. Also, if you have a TAM (Technical Account Manager) for your AWS account, I’d ask that they open what’s called a PFR (product feature request) for this. Enough people have been asking for this that I want to start getting it documented so we can eventually support it.
