โ† Back to AI & LLMs
AI & LLMs by @mrlynn

voyageai-skill

Voyage AI embedding and reranking CLI integrated with MongoDB

0
Source Code

๐Ÿงญ Voyage AI Skill

Uses the vai CLI (voyageai-cli) for Voyage AI embeddings, reranking, and MongoDB Atlas Vector Search. Pure Node.js โ€” no Python required.

Setup

npm install -g voyageai-cli

Environment Variables

Variable Required For Description
VOYAGE_API_KEY embed, rerank, store, search, similarity, ingest, ping Model API key from MongoDB Atlas
MONGODB_URI store, search, index, ingest, ping (optional) Atlas connection string

Get your API key: MongoDB Atlas โ†’ AI Models โ†’ Create model API key

Command Reference (14 commands)

embed โ€” Generate embeddings

vai embed "What is MongoDB?"
vai embed "search query" --model voyage-4-large --input-type query --dimensions 512
vai embed --file document.txt --input-type document
cat texts.txt | vai embed
vai embed "hello" --output-format array

rerank โ€” Rerank documents

vai rerank --query "database performance" --documents "MongoDB is fast" "SQL is relational"
vai rerank --query "best database" --documents-file candidates.json --top-k 3

store โ€” Embed and store in Atlas

vai store --db mydb --collection docs --field embedding \
  --text "MongoDB Atlas is a cloud database" \
  --metadata '{"source": "docs"}'

# Batch from JSONL
vai store --db mydb --collection docs --field embedding --file documents.jsonl

search โ€” Vector search

vai search --query "cloud database" --db mydb --collection docs \
  --index vector_index --field embedding

# With pre-filter
vai search --query "performance" --db mydb --collection docs \
  --index vector_index --field embedding --filter '{"category": "guides"}' --limit 5

index โ€” Manage vector search indexes

vai index create --db mydb --collection docs --field embedding \
  --dimensions 1024 --similarity cosine --index-name my_index
vai index list --db mydb --collection docs
vai index delete --db mydb --collection docs --index-name my_index

models โ€” List available models

vai models
vai models --type embedding
vai models --type reranking
vai models --json

ping โ€” Test connectivity

vai ping
vai ping --json

config โ€” Manage persistent configuration

vai config set api-key "pa-your-key"
echo "pa-your-key" | vai config set api-key --stdin
vai config get
vai config delete api-key
vai config path
vai config reset

demo โ€” Interactive guided walkthrough

vai demo
vai demo --no-pause
vai demo --skip-pipeline
vai demo --keep

explain โ€” Learn about AI concepts

vai explain                      # List all topics
vai explain embeddings
vai explain reranking
vai explain vector-search
vai explain rag
vai explain cosine-similarity
vai explain two-stage-retrieval
vai explain input-type
vai explain models
vai explain api-keys
vai explain api-access
vai explain batch-processing

similarity โ€” Compare text similarity

vai similarity "MongoDB is a document database" "MongoDB Atlas is a cloud database"
vai similarity "database performance" --against "MongoDB is fast" "PostgreSQL is relational"
vai similarity --file1 doc1.txt --file2 doc2.txt
vai similarity "text A" "text B" --json

ingest โ€” Bulk import with progress

vai ingest --file corpus.jsonl --db myapp --collection docs --field embedding
vai ingest --file data.csv --db myapp --collection docs --field embedding --text-column content
vai ingest --file corpus.jsonl --db myapp --collection docs --field embedding \
  --model voyage-4 --batch-size 100 --input-type document
vai ingest --file corpus.jsonl --db myapp --collection docs --field embedding --dry-run

completions โ€” Shell completion scripts

vai completions bash    # Output bash completion script
vai completions zsh     # Output zsh completion script

# Install bash completions
vai completions bash >> ~/.bashrc && source ~/.bashrc

# Install zsh completions
vai completions zsh > ~/.zsh/completions/_vai

help โ€” Display help

vai help
vai help embed
vai embed --help

Common Workflows

Embed โ†’ Store โ†’ Search Pipeline

# 1. Store documents
vai store --db myapp --collection articles --field embedding \
  --text "MongoDB Atlas provides a fully managed cloud database" \
  --metadata '{"title": "Atlas Overview"}'

# 2. Create index
vai index create --db myapp --collection articles --field embedding \
  --dimensions 1024 --similarity cosine --index-name article_search

# 3. Search
vai search --query "how does cloud database work" \
  --db myapp --collection articles --index article_search --field embedding

Two-Stage Retrieval (Embed + Rerank)

# 1. Get candidates via vector search
vai search --query "database scaling" --db myapp --collection articles \
  --index article_search --field embedding --limit 20 --json > candidates.json

# 2. Rerank for precision
vai rerank --query "database scaling" --documents-file candidates.json --top-k 5

Bulk Ingest Pipeline

# 1. Validate data (dry run)
vai ingest --file corpus.jsonl --db myapp --collection docs --field embedding --dry-run

# 2. Ingest with progress
vai ingest --file corpus.jsonl --db myapp --collection docs --field embedding

# 3. Create index
vai index create --db myapp --collection docs --field embedding \
  --dimensions 1024 --similarity cosine

Global Flags

Flag Description
--json Machine-readable JSON output
--quiet Suppress non-essential output

References