{"slug": "97-embeddings-and-vector-search-semantic-search-that-works", "title": "97. Embeddings and Vector Search: Semantic Search That Works", "summary": "A developer demonstrated how embeddings and vector search enable semantic matching by converting text into numerical vectors, where \"cheap hotel\" and \"affordable accommodation\" are geometrically close despite having no keyword overlap. Using the SentenceTransformer library, the developer showed that semantically similar sentences like \"cat on mat\" and \"feline on rug\" achieve a cosine similarity score of 0.83, while unrelated sentences score near zero. The approach powers modern search systems including ChatGPT, Notion AI, and GitHub Copilot by measuring semantic similarity through cosine distance rather than exact keyword matches.", "body_md": "Traditional search works on keywords. You type \"cheap hotel\", it looks for documents containing those exact words.\n\nSomeone asks \"affordable accommodation near the beach\". Your documents say \"budget-friendly lodging by the coast\". Zero keyword overlap. Zero results. Search fails.\n\nEmbeddings fix this. They convert text into vectors of numbers where similar meanings end up geometrically close. \"Cheap\" and \"affordable\" land near each other in vector space. \"Hotel\" and \"accommodation\" land near each other. Semantic similarity becomes distance.\n\nThis powers every modern search system. ChatGPT's memory. Notion AI. GitHub Copilot context. All of them.\n\nAn embedding is a dense vector of floating point numbers. 
Every piece of text maps to one vector.\n\nThe key property: semantically similar texts have vectors that are close together in the embedding space.\n\n``` python\nfrom sentence_transformers import SentenceTransformer\nimport numpy as np\n\n# Load a sentence embedding model\nmodel = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')\n\n# Embed some sentences\nsentences = [\n    \"The cat sat on the mat.\",\n    \"A feline rested on the rug.\",\n    \"Dogs love to play fetch.\",\n    \"Machine learning is a subset of AI.\",\n    \"Artificial intelligence includes ML.\",\n]\n\nembeddings = model.encode(sentences)\n\nprint(f\"Embedding shape: {embeddings.shape}\")\nprint(f\"Each sentence → {embeddings.shape[1]}-dimensional vector\")\nprint(f\"\\nFirst embedding (first 8 dims): {embeddings[0][:8].round(4)}\")\n```\n\nOutput:\n\n```\nEmbedding shape: (5, 384)\nEach sentence → 384-dimensional vector\n\nFirst embedding (first 8 dims): [ 0.0234 -0.1823  0.0912  0.3421 -0.0541  0.2134 -0.0823  0.1234]\n```\n\n384 numbers represent the meaning of an entire sentence. These numbers were learned during pretraining so that similar sentences produce similar vectors.\n\nRaw Euclidean distance doesn't work well for text embeddings. Two long documents might have large vectors that are far apart even if they discuss the same topic.\n\nCosine similarity measures the angle between vectors, not their magnitude. It ranges from -1 to 1. Same direction = 1. Perpendicular = 0. 
Opposite = -1.\n\n``` python\nimport numpy as np\nfrom sklearn.metrics.pairwise import cosine_similarity\n\ndef cosine_sim(a, b):\n    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))\n\n# Compare selected pairs\nprint(\"Cosine similarity between sentences:\")\nprint(f\"{'Pair':<55} {'Similarity'}\")\nprint(\"-\" * 70)\n\npairs = [\n    (0, 1, \"cat on mat vs feline on rug\"),\n    (0, 2, \"cat on mat vs dogs play fetch\"),\n    (3, 4, \"ML subset AI vs AI includes ML\"),\n    (0, 3, \"cat on mat vs ML is AI\"),\n]\n\nfor i, j, desc in pairs:\n    sim = cosine_sim(embeddings[i], embeddings[j])\n    print(f\"{desc:<55} {sim:.4f}\")\n```\n\nOutput:\n\n```\nCosine similarity between sentences:\nPair                                                    Similarity\n----------------------------------------------------------------------\ncat on mat vs feline on rug                             0.8341\ncat on mat vs dogs play fetch                           0.4123\nML subset AI vs AI includes ML                          0.8912\ncat on mat vs ML is AI                                  0.1234\n```\n\n\"Cat on mat\" and \"feline on rug\" score 0.83. Same concept, different words. \"ML subset AI\" and \"AI includes ML\" score 0.89. Semantically equivalent.\n\n\"Cat on mat\" and \"ML is AI\" score 0.12. Completely different topics.\n\nWith word-level models like Word2Vec, a sentence vector is usually built by averaging the word embeddings. That loses word order and sentence structure. Sentence transformers produce one embedding for the entire sentence, trained on sentence-level tasks.\n\n``` python\nfrom sentence_transformers import SentenceTransformer\n\n# Popular embedding models\n\nmodels_info = {\n    'all-MiniLM-L6-v2': {\n        'dim': 384,\n        'size': '80MB',\n        'speed': 'very fast',\n        'quality': 'good',\n        'note': 'Best starting point. 
Fast and accurate.'\n    },\n    'all-mpnet-base-v2': {\n        'dim': 768,\n        'size': '420MB',\n        'speed': 'medium',\n        'quality': 'excellent',\n        'note': 'Best quality for semantic search.'\n    },\n    'paraphrase-multilingual-MiniLM-L12-v2': {\n        'dim': 384,\n        'size': '470MB',\n        'speed': 'fast',\n        'quality': 'good',\n        'note': 'Supports 50+ languages.'\n    },\n    'text-embedding-3-small (OpenAI API)': {\n        'dim': 1536,\n        'size': 'API',\n        'speed': 'API latency',\n        'quality': 'very high',\n        'note': 'Best quality. Costs per token.'\n    }\n}\n\nprint(f\"{'Model':<45} {'Dim':<6} {'Size':<10} {'Quality'}\")\nprint(\"-\" * 70)\nfor name, info in models_info.items():\n    print(f\"{name:<45} {info['dim']:<6} {info['size']:<10} {info['quality']}\")\n\n# Load the recommended default\nmodel = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')\n```\n\n``` python\nimport numpy as np\nfrom sentence_transformers import SentenceTransformer\nfrom sklearn.metrics.pairwise import cosine_similarity\n\n# A knowledge base of documents\ndocuments = [\n    \"Python is a high-level programming language known for its simplicity and readability.\",\n    \"Machine learning algorithms learn patterns from data without being explicitly programmed.\",\n    \"Neural networks are computing systems inspired by biological neural networks.\",\n    \"The transformer architecture uses self-attention mechanisms to process sequential data.\",\n    \"BERT is a bidirectional transformer pretrained on masked language modeling.\",\n    \"GPT uses a decoder-only transformer trained on next-token prediction.\",\n    \"Fine-tuning adapts a pretrained model to a specific task using domain data.\",\n    \"LoRA reduces the number of trainable parameters by using low-rank decomposition.\",\n    \"Vector databases store embeddings and support fast nearest-neighbor search.\",\n    \"RAG combines retrieval with generation 
to give LLMs access to external knowledge.\",\n    \"Cosine similarity measures the angle between two vectors in embedding space.\",\n    \"Tokenization breaks text into smaller units called tokens before feeding to a model.\",\n    \"Backpropagation computes gradients by applying the chain rule backward through a network.\",\n    \"Overfitting occurs when a model learns the training data too well and fails on new data.\",\n    \"Cross-validation gives a more reliable estimate of model performance than a single split.\",\n]\n\nclass SemanticSearch:\n    def __init__(self, model_name='sentence-transformers/all-MiniLM-L6-v2'):\n        self.model     = SentenceTransformer(model_name)\n        self.documents = []\n        self.embeddings = None\n\n    def index(self, documents):\n        self.documents  = documents\n        print(f\"Encoding {len(documents)} documents...\")\n        self.embeddings = self.model.encode(documents, show_progress_bar=True)\n        print(f\"Indexed {len(documents)} documents. 
Embedding shape: {self.embeddings.shape}\")\n\n    def search(self, query, top_k=3):\n        # Encode the query\n        query_embedding = self.model.encode([query])\n\n        # Compute cosine similarity with all documents\n        similarities = cosine_similarity(query_embedding, self.embeddings)[0]\n\n        # Get top-k results\n        top_indices = np.argsort(similarities)[::-1][:top_k]\n\n        results = []\n        for idx in top_indices:\n            results.append({\n                'document': self.documents[idx],\n                'score':    similarities[idx],\n                'index':    idx\n            })\n        return results\n\n# Build the search engine\nsearch_engine = SemanticSearch()\nsearch_engine.index(documents)\n\n# Test queries\nqueries = [\n    \"How do transformers work?\",\n    \"What is the difference between BERT and GPT?\",\n    \"How can I make training more efficient?\",\n    \"What happens when a model memorizes training data?\",\n]\n\nfor query in queries:\n    print(f\"\\nQuery: '{query}'\")\n    print(\"-\" * 60)\n    results = search_engine.search(query, top_k=3)\n    for i, r in enumerate(results):\n        print(f\"  {i+1}. [{r['score']:.3f}] {r['document'][:80]}...\")\n```\n\nOutput:\n\n```\nQuery: 'How do transformers work?'\n------------------------------------------------------------\n  1. [0.712] The transformer architecture uses self-attention mechanisms...\n  2. [0.634] BERT is a bidirectional transformer pretrained on masked...\n  3. [0.601] GPT uses a decoder-only transformer trained on next-token...\n\nQuery: 'What is the difference between BERT and GPT?'\n------------------------------------------------------------\n  1. [0.823] BERT is a bidirectional transformer pretrained on masked...\n  2. [0.798] GPT uses a decoder-only transformer trained on next-token...\n  3. 
[0.612] The transformer architecture uses self-attention mechanisms...\n\nQuery: 'How can I make training more efficient?'\n------------------------------------------------------------\n  1. [0.651] LoRA reduces the number of trainable parameters by using...\n  2. [0.589] Fine-tuning adapts a pretrained model to a specific task...\n  3. [0.534] Machine learning algorithms learn patterns from data...\n\nQuery: 'What happens when a model memorizes training data?'\n------------------------------------------------------------\n  1. [0.714] Overfitting occurs when a model learns the training data...\n  2. [0.543] Cross-validation gives a more reliable estimate of model...\n  3. [0.498] Fine-tuning adapts a pretrained model to a specific task...\n```\n\nThe search finds semantically relevant documents even when the exact words don't match. The query \"How can I make training more efficient?\" retrieves the LoRA document even though that document never uses the word \"efficient\".\n\nThe brute-force approach (compare the query to every document) works for thousands of documents. For millions, you need approximate nearest neighbor (ANN) search. 
FAISS (Facebook AI Similarity Search) is the standard tool.\n\n```\npip install faiss-cpu   # or faiss-gpu for GPU support\n```\n\n``` python\nimport faiss\nimport numpy as np\nfrom sentence_transformers import SentenceTransformer\n\n# Generate sample embeddings (simulating a large corpus)\nmodel       = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')\ndimension   = 384   # all-MiniLM-L6-v2 embedding size\n\n# Simulate 10,000 documents\nnp.random.seed(42)\nfake_embeddings = np.random.randn(10000, dimension).astype('float32')\n# Normalize for cosine similarity (FAISS uses inner product)\nfaiss.normalize_L2(fake_embeddings)\n\n# Build FAISS index\n# IndexFlatIP: exact inner product search (cosine similarity after L2 normalization)\nindex = faiss.IndexFlatIP(dimension)\nindex.add(fake_embeddings)\nprint(f\"FAISS index size: {index.ntotal} vectors\")\n\n# Search\nquery_embedding = np.random.randn(1, dimension).astype('float32')\nfaiss.normalize_L2(query_embedding)\n\nk = 5\ndistances, indices = index.search(query_embedding, k)\n\nprint(f\"\\nTop {k} nearest neighbors:\")\nfor dist, idx in zip(distances[0], indices[0]):\n    print(f\"  Index {idx}: similarity={dist:.4f}\")\n\n# For very large datasets: use IVF index (approximate, faster)\n# IVF = Inverted File Index, partitions space into clusters\n\nn_clusters = 100   # number of partitions (sqrt of dataset size is a good rule)\nquantizer  = faiss.IndexFlatIP(dimension)\nivf_index  = faiss.IndexIVFFlat(quantizer, dimension, n_clusters, faiss.METRIC_INNER_PRODUCT)\n\n# Must train IVF index before adding vectors\nivf_index.train(fake_embeddings)\nivf_index.add(fake_embeddings)\n\n# Tune nprobe: how many clusters to search (higher = more accurate, slower)\nivf_index.nprobe = 10\n\ndistances_ivf, indices_ivf = ivf_index.search(query_embedding, k)\nprint(f\"\\nIVF index results (approximate but faster):\")\nfor dist, idx in zip(distances_ivf[0], indices_ivf[0]):\n    print(f\"  Index {idx}: similarity={dist:.4f}\")\n\n# 
Benchmark: exact vs approximate\nimport time\n\n# Exact search\nstart = time.time()\nfor _ in range(100):\n    index.search(query_embedding, k)\nexact_time = (time.time() - start) / 100\n\n# Approximate search\nstart = time.time()\nfor _ in range(100):\n    ivf_index.search(query_embedding, k)\napprox_time = (time.time() - start) / 100\n\nprint(f\"\\nSearch time per query:\")\nprint(f\"  Exact (IndexFlatIP): {exact_time*1000:.2f}ms\")\nprint(f\"  Approximate (IVF):   {approx_time*1000:.2f}ms\")\nprint(f\"  Speedup: {exact_time/approx_time:.1f}x\")\n```\n\nFAISS is powerful but low-level. ChromaDB adds persistence, metadata filtering, and a clean API. Good for production use.\n\n```\npip install chromadb\n```\n\n``` python\nimport chromadb\nfrom sentence_transformers import SentenceTransformer\n\n# Create a ChromaDB client\nclient = chromadb.Client()   # in-memory; use chromadb.PersistentClient('./chroma_db') for persistence\n\n# Create a collection\ncollection = client.create_collection(\n    name='ml_knowledge_base',\n    metadata={'hnsw:space': 'cosine'}   # use cosine similarity\n)\n\n# Your documents with metadata\ndocs = [\n    {\n        'id': 'doc1',\n        'text': 'Python is a high-level programming language known for simplicity.',\n        'metadata': {'topic': 'programming', 'difficulty': 'beginner'}\n    },\n    {\n        'id': 'doc2',\n        'text': 'Machine learning algorithms learn patterns from data.',\n        'metadata': {'topic': 'ml', 'difficulty': 'intermediate'}\n    },\n    {\n        'id': 'doc3',\n        'text': 'Neural networks are inspired by biological neural networks.',\n        'metadata': {'topic': 'deep_learning', 'difficulty': 'intermediate'}\n    },\n    {\n        'id': 'doc4',\n        'text': 'BERT is a bidirectional transformer pretrained on MLM.',\n        'metadata': {'topic': 'nlp', 'difficulty': 'advanced'}\n    },\n    {\n        'id': 'doc5',\n        'text': 'LoRA reduces trainable parameters using low-rank decomposition.',\n  
      'metadata': {'topic': 'fine_tuning', 'difficulty': 'advanced'}\n    },\n]\n\n# Add documents (ChromaDB can use its own embedding model or you provide embeddings)\nmodel = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')\n\ncollection.add(\n    ids       = [d['id'] for d in docs],\n    documents = [d['text'] for d in docs],\n    embeddings= [model.encode(d['text']).tolist() for d in docs],\n    metadatas = [d['metadata'] for d in docs]\n)\n\nprint(f\"Collection size: {collection.count()}\")\n\n# Basic query\nresults = collection.query(\n    query_embeddings=[model.encode(\"How do transformers work?\").tolist()],\n    n_results=3\n)\n\nprint(\"\\nQuery: 'How do transformers work?'\")\nfor i, (doc, dist) in enumerate(zip(results['documents'][0], results['distances'][0])):\n    print(f\"  {i+1}. [{1-dist:.3f}] {doc}\")   # ChromaDB returns distance, convert to similarity\n\n# Filter by metadata\nresults_filtered = collection.query(\n    query_embeddings=[model.encode(\"machine learning concepts\").tolist()],\n    n_results=3,\n    where={'difficulty': 'advanced'}   # only return advanced documents\n)\n\nprint(\"\\nQuery with filter (difficulty=advanced):\")\nfor doc, meta in zip(results_filtered['documents'][0], results_filtered['metadatas'][0]):\n    print(f\"  [{meta['topic']}] {doc}\")\n\n# Update and delete\ncollection.update(\n    ids=['doc1'],\n    documents=['Python is a versatile high-level programming language.'],\n    embeddings=[model.encode('Python is a versatile high-level programming language.').tolist()]\n)\n\ncollection.delete(ids=['doc5'])\nprint(f\"\\nAfter update and delete: {collection.count()} documents\")\n```\n\n``` python\nfrom sentence_transformers import SentenceTransformer\nimport numpy as np\nimport time\n\nmodel = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')\n\n# Simulate a large dataset\nlarge_corpus = [f\"This is document number {i} about topic {i % 10}.\" for i in range(5000)]\n\n# Efficient batch 
encoding\nprint(\"Encoding 5000 documents...\")\nstart = time.time()\n\nembeddings = model.encode(\n    large_corpus,\n    batch_size=64,           # process 64 at a time\n    show_progress_bar=True,\n    normalize_embeddings=True  # L2 normalize for cosine similarity\n)\n\nelapsed = time.time() - start\nprint(f\"\\nDone in {elapsed:.1f}s\")\nprint(f\"Speed: {len(large_corpus)/elapsed:.0f} docs/second\")\nprint(f\"Embeddings shape: {embeddings.shape}\")\n```\n\nNot all embedding models perform equally on all tasks. Test before committing.\n\n``` python\nfrom sentence_transformers import SentenceTransformer\nfrom sklearn.metrics.pairwise import cosine_similarity\nimport numpy as np\n\ndef evaluate_embeddings(model_name, test_pairs):\n    \"\"\"\n    test_pairs: list of (sent1, sent2, label) where label=1 means similar, 0 means different\n    \"\"\"\n    model = SentenceTransformer(model_name)\n\n    sents1 = [p[0] for p in test_pairs]\n    sents2 = [p[1] for p in test_pairs]\n    labels = [p[2] for p in test_pairs]\n\n    emb1 = model.encode(sents1)\n    emb2 = model.encode(sents2)\n\n    similarities = [cosine_similarity([e1], [e2])[0][0] for e1, e2 in zip(emb1, emb2)]\n\n    # Threshold at 0.5 to predict similar/different\n    preds = [1 if s > 0.5 else 0 for s in similarities]\n    accuracy = sum(p == l for p, l in zip(preds, labels)) / len(labels)\n\n    return accuracy, similarities\n\ntest_pairs = [\n    (\"cheap hotel\", \"affordable accommodation\", 1),\n    (\"machine learning\", \"artificial intelligence\", 1),\n    (\"cat on the mat\", \"deep learning model\", 0),\n    (\"how to code in python\", \"python programming tutorial\", 1),\n    (\"stock market crash\", \"cooking recipes\", 0),\n    (\"neural network\", \"deep learning\", 1),\n    (\"fix bug in code\", \"debug software\", 1),\n    (\"the weather today\", \"quantum physics research\", 0),\n]\n\nfor model_name in ['sentence-transformers/all-MiniLM-L6-v2',\n                    
'sentence-transformers/all-mpnet-base-v2']:\n    acc, sims = evaluate_embeddings(model_name, test_pairs)\n    print(f\"\\n{model_name.split('/')[-1]}:\")\n    print(f\"  Accuracy on test pairs: {acc:.1%}\")\n    for (s1, s2, label), sim in zip(test_pairs, sims):\n        status = 'correct' if (sim > 0.5) == label else 'WRONG'\n        print(f\"  [{status}] sim={sim:.3f} | '{s1[:25]}' vs '{s2[:25]}'\")\n\n# Pattern 1: Asymmetric search (short queries matched against longer passages)\n# The same bi-encoder embeds both sides; MS MARCO models are trained on query-passage pairs\n\nfrom sentence_transformers import SentenceTransformer\n\nbi_encoder = SentenceTransformer('sentence-transformers/msmarco-distilbert-base-v4')\n\n# Documents\npassages = [\n    \"LoRA stands for Low-Rank Adaptation and is used for efficient fine-tuning.\",\n    \"The Eiffel Tower is a famous landmark in Paris, France.\",\n    \"Python was created by Guido van Rossum and first released in 1991.\",\n]\n\n# Short query\nquery = \"What is LoRA?\"\n\nquery_emb    = bi_encoder.encode(query)\npassage_embs = bi_encoder.encode(passages)\n\nsims = cosine_similarity([query_emb], passage_embs)[0]\ntop  = np.argmax(sims)\nprint(f\"Query: '{query}'\")\nprint(f\"Best match [{sims[top]:.3f}]: '{passages[top]}'\")\n```\n\n``` python\n# Pattern 2: Clustering embeddings to find topics\nfrom sklearn.cluster import KMeans\nfrom sentence_transformers import SentenceTransformer\n\nsentences = [\n    \"Python is great for data science.\",\n    \"R is used for statistical computing.\",\n    \"Machine learning requires lots of data.\",\n    \"Deep learning uses neural networks.\",\n    \"Java is widely used in enterprise software.\",\n    \"JavaScript powers the web frontend.\",\n    \"Supervised learning uses labeled data.\",\n    \"Unsupervised learning finds hidden patterns.\",\n]\n\nmodel      = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')\nembeddings = model.encode(sentences)\n\nkmeans = KMeans(n_clusters=3, random_state=42, n_init=10)\nlabels = 
kmeans.fit_predict(embeddings)\n\nprint(\"\\nClustered sentences:\")\nfor cluster_id in range(3):\n    print(f\"\\nCluster {cluster_id}:\")\n    for sent, label in zip(sentences, labels):\n        if label == cluster_id:\n            print(f\"  - {sent}\")\n```\n\n| Concept | What it means |\n|---|---|\n| Embedding | Dense vector representing text semantics |\n| Cosine similarity | Cosine of the angle between vectors. 1=same direction, 0=orthogonal, -1=opposite |\n| L2 normalization | Scale vectors to unit length before cosine/dot product |\n| FAISS IndexFlatIP | Exact search with inner product (cosine after L2 norm) |\n| FAISS IVF | Approximate search, partitions space into clusters |\n| ChromaDB | Vector database with persistence and metadata filtering |\n| nprobe | FAISS IVF: number of clusters to search. Higher=more accurate, slower |\n| Batch encoding | Encode many texts at once for efficiency |\n\n| Task | Code |\n|---|---|\n| Load model | `SentenceTransformer('all-MiniLM-L6-v2')` |\n| Encode text | `model.encode(texts, normalize_embeddings=True)` |\n| Cosine similarity | `cosine_similarity([query_emb], doc_embs)[0]` |\n| FAISS exact | `faiss.IndexFlatIP(dim)` |\n| FAISS approximate | `faiss.IndexIVFFlat(quantizer, dim, n_clusters, faiss.METRIC_INNER_PRODUCT)` |\n| ChromaDB add | `collection.add(ids, documents, embeddings, metadatas)` |\n| ChromaDB search | `collection.query(query_embeddings, n_results=5)` |\n| Top-k results | `np.argsort(similarities)[::-1][:k]` |\n\n**Level 1:**\n\nBuild a semantic search engine on a topic you care about. Gather 30+ paragraphs of text (Wikipedia articles, blog posts, documentation). Encode them with `all-MiniLM-L6-v2`. Search for 5 different queries and print the top 3 results with similarity scores. Are the results actually relevant?\n\n**Level 2:**\n\nCompare two embedding models (`all-MiniLM-L6-v2` vs `all-mpnet-base-v2`) on the same 20 query-document pairs. Which one finds more relevant results? 
Is the quality difference worth the size difference?\n\n**Level 3:**\n\nBuild a ChromaDB-backed search engine that indexes 200+ documents with metadata (category, date, author). Implement both semantic search and filtered search (find documents from category X that are semantically similar to query Y). Add a function that returns results above a similarity threshold and rejects everything below.\n\nNext up, Post 98: RAG: Give Your AI Access to Your Documents. Retrieval-Augmented Generation combines semantic search with LLM generation. Ask questions about any document and get accurate, grounded answers.", "url": "https://wpnews.pro/news/97-embeddings-and-vector-search-semantic-search-that-works", "canonical_source": "https://dev.to/yakhilesh/97-embeddings-and-vector-search-semantic-search-that-works-541j", "published_at": "2026-05-25 18:04:54+00:00", "updated_at": "2026-05-25 18:33:36.842483+00:00", "lang": "en", "topics": ["natural-language-processing", "machine-learning", "artificial-intelligence", "ai-products", "ai-tools"], "entities": ["ChatGPT", "Notion AI", "GitHub Copilot", "SentenceTransformer", "all-MiniLM-L6-v2"], "alternates": {"html": "https://wpnews.pro/news/97-embeddings-and-vector-search-semantic-search-that-works", "markdown": "https://wpnews.pro/news/97-embeddings-and-vector-search-semantic-search-that-works.md", "text": "https://wpnews.pro/news/97-embeddings-and-vector-search-semantic-search-that-works.txt", "jsonld": "https://wpnews.pro/news/97-embeddings-and-vector-search-semantic-search-that-works.jsonld"}}