Vector Databases

Understand what vector databases are and how they power retrieval-augmented generation and semantic search in modern AI systems.

What is a Vector Database?

A vector database stores data as high-dimensional numeric vectors (embeddings). Similar vectors are located “near” each other in vector space, allowing fast nearest-neighbour search. This makes vector databases ideal for semantic search, recommendation engines, and retrieval-augmented generation (RAG).
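
To make the idea of "nearness" concrete, here is a minimal sketch of nearest-neighbour search using cosine similarity. The three-dimensional vectors, the item names, and the helper functions `cosine_similarity` and `nearest_neighbours` are all invented for illustration; real embeddings come from a model and typically have hundreds of dimensions. This sketch compares the query against every stored vector, whereas production vector databases generally rely on approximate nearest-neighbour indexes to avoid a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbours(query, items, k=2):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings" (assumption: made up, not produced by a model).
toy_embeddings = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "cat": [0.7, 0.6, 0.3],
    "car": [0.1, 0.2, 0.95],
}

# Made-up query vector that sits close to the animal vectors.
query_vector = [0.88, 0.82, 0.12]
top = nearest_neighbours(query_vector, toy_embeddings, k=2)
print([name for name, _ in top])
```

With these toy values the two animal vectors closest to the query rank first, and "car" ranks last because it points in a very different direction.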

Why Use One?

  • Sub-second similarity search over millions of items.
  • Modality-agnostic: works for text, image, audio, or code embeddings.
  • Scales horizontally and supports real-time updates.

🧒 Explain Like I’m 5

Think of a vector database like a magical toy box that groups similar toys together, even if they don’t share the same name. Put in a “dog” figurine and the box knows “puppy” and even “cat” are close by because they’re all animals. This makes it super fast to find things that are alike!

Popular Vector Databases

  • Milvus – open-source, high-performance, with optional GPU acceleration.
  • Pinecone – fully managed service with metadata filtering.
  • Qdrant – open-source with hybrid (text + vector) search.
  • Weaviate – GraphQL API and built-in vectorisers.
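
Several of these products combine vector similarity with conditions on stored metadata. The following is a rough, library-free sketch of that idea, not the API of any database listed above: the record layout, the `where` parameter, and the function `filtered_search` are hypothetical names chosen for illustration. It keeps only records whose metadata matches the filter, then ranks the survivors by cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical records: an id, a toy 2-dimensional embedding, and metadata.
records = [
    {"id": "doc-1", "vector": [0.9, 0.1], "metadata": {"lang": "en", "year": 2023}},
    {"id": "doc-2", "vector": [0.8, 0.2], "metadata": {"lang": "de", "year": 2023}},
    {"id": "doc-3", "vector": [0.1, 0.9], "metadata": {"lang": "en", "year": 2021}},
    {"id": "doc-4", "vector": [0.7, 0.3], "metadata": {"lang": "en", "year": 2024}},
]


def filtered_search(query, records, where, k=2):
    """Keep records matching every key/value in `where`, then rank by similarity."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(query, r["vector"]),
        reverse=True,
    )
    return [r["id"] for r in ranked[:k]]


# Only English documents are considered, even though doc-2 is a close match.
result = filtered_search([1.0, 0.0], records, where={"lang": "en"}, k=2)
print(result)
```

Filtering before ranking, as done here, is only one possible design; a real engine may interleave the filter with its index traversal to keep queries fast.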