Vector Databases

Vector databases keep showing up in every AI conversation right now. Pinecone, Weaviate, Qdrant, pgvector — you can’t read about building with AI without bumping into one of them. But if you haven’t dug into how AI search actually works, the whole thing can feel a bit abstract and hand-wavy.

So let’s talk through it properly. What vector data is, how it works, where people are actually using it, and whether your project needs one at all.

What Is Vector Data, Really?

A vector is just a list of numbers. That’s it. But what makes it interesting is what those numbers represent — the meaning of something.

Take the sentence “I love hiking in the mountains.” When you pass this through a language model, it doesn’t just see words. It converts the whole thing into something like [0.23, -0.87, 0.45, 0.11, ...] — hundreds or thousands of decimal numbers called an embedding. The clever bit is that sentences with similar meaning end up with similar numbers. “I enjoy trekking through the hills” produces a list very close to the first one. “I prefer the beach” is a bit further away. “What is the capital of France?” is very far away.

So meaning gets turned into geometry. Things that mean similar things end up near each other in mathematical space. That’s the whole idea behind vector search — instead of matching words, you’re matching meaning.
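The geometric idea can be made concrete with a toy example. The four-dimensional vectors below are invented for illustration (real embeddings have hundreds or thousands of dimensions); cosine similarity is one of the standard ways to measure how closely two vectors point in the same direction:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up toy embeddings for the sentences from the text
hiking   = [0.23, -0.87, 0.45, 0.11]   # "I love hiking in the mountains"
trekking = [0.20, -0.80, 0.50, 0.09]   # "I enjoy trekking through the hills"
capital  = [-0.70, 0.30, -0.10, 0.95]  # "What is the capital of France?"

print(cosine_similarity(hiking, trekking))  # close to 1.0: similar meaning
print(cosine_similarity(hiking, capital))   # much lower: unrelated meaning
```

A similarity near 1.0 means the vectors point the same way; lower or negative values mean the meanings have little in common.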

How Vector Databases Actually Work

A regular database finds things by matching exact values. Give me every row where status = 'active'. Give me every user where name LIKE '%smith%'. That works brilliantly for structured data. It completely falls apart when someone searches for “something cosy to wear in winter” and you want to surface a wool jumper even though the word “cosy” doesn’t appear in the product description.

A vector database stores embeddings and lets you search by how close they are to each other. You convert your search query into a vector, the database finds the stored vectors that are mathematically closest to it, and you get results that match by meaning rather than by words. The underlying algorithm is called approximate nearest neighbour search — it’s built to do this quickly across millions of records.

The result is search that actually understands what someone is looking for. “What’s a good trail for beginners?” can surface a result labelled “easy forest walk for all ages” even though the two phrases have no words in common.
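What the database does at query time can be sketched as brute-force Python: score every stored vector against the query and keep the closest. The three-dimensional embeddings here are made up, and a real vector database would use an approximate index (such as HNSW) rather than scanning every record:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Tiny invented corpus: title -> stored embedding
documents = {
    "easy forest walk for all ages":  [0.9, 0.1, 0.2],
    "advanced alpine climbing route": [0.4, 0.9, 0.1],
    "wool jumper, winter collection": [0.1, 0.2, 0.9],
}

def search(query_vec, top_k=2):
    """Return the top_k titles whose embeddings are closest to the query."""
    scored = sorted(documents.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [title for title, _ in scored[:top_k]]

# Query embedding for "What's a good trail for beginners?" (invented)
print(search([0.85, 0.2, 0.15]))  # the easy walk ranks first
```

The phrases share no words with the query; only the vectors are compared.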

How Do You Create Embeddings?

You don’t write the vectors yourself — that would be a nightmare. You use an embedding model, a neural network trained specifically to convert content into meaningful numbers. OpenAI’s text-embedding-3-small is a common hosted choice, and there are open-source models you can run yourself.

The flow is simple: take your content, send it to an embedding model, get a vector back, store it. Do that for everything you want to make searchable. Then when a user searches, embed their query the same way and find the closest matches.

Where People Are Actually Using This

RAG — making AI chatbots answer from your own content

This is the big one right now. When you ask an AI assistant a question about your company’s internal documentation, what’s happening behind the scenes is: your question gets embedded, the system searches the vector database for the most relevant chunks of your content, and those chunks get passed to the language model so it can answer based on your actual data — not just whatever it was trained on. This is how Notion AI, custom ChatGPT integrations, and most enterprise knowledge bases are built.
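The RAG flow described above can be sketched in a few lines. The embed() function here is a toy bag-of-words stand-in for a real embedding model, the documentation chunks are invented, and the final language-model call is only indicated; the shape of the pipeline is the point, not the components:

```python
import math

VOCAB = ["holiday", "policy", "days", "expenses", "laptop", "security"]

def embed(text):
    """Toy embedding: count vocabulary words. A real system calls a model."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a)) or 1.0
    norm_b = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (norm_a * norm_b)

# Invented internal-documentation chunks, embedded once and stored
chunks = [
    "holiday policy staff get 25 days per year",
    "expenses must be filed within 30 days",
    "laptop security enable disk encryption",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(question, top_k=1):
    """Find the stored chunks closest in meaning to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

question = "how many holiday days do I get"
context = retrieve(question)[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
# The prompt would now be sent to the language model of your choice.
print(prompt)
```

The retrieved chunk is prepended to the question, so the model answers from your content rather than from its training data alone.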

Search that actually works

Keyword search breaks the moment a user doesn’t know the exact terms. Someone searching for “something to help me sleep” shouldn’t get zero results because nothing is literally named “something to help me sleep.” Vector search finds what they mean — melatonin, sleep hygiene guides, white noise machines — without requiring the exact words to match.

Recommendations

“Users who liked this also liked…” is fundamentally a similarity search problem. If you can embed your products, articles, songs, or videos into vectors, you can find what’s genuinely similar rather than just sharing the same tags. The results feel noticeably better to users.

Image search

Images get embedded just like text. Upload a photo and find visually similar results. Describe an image in words and find photos that match. Pinterest’s visual search, Google Lens, and reverse image search all work on variations of this idea.

Fraud detection

Normal transactions cluster together in vector space. Fraudulent ones sit unusually far from the cluster. Vector databases make it straightforward to flag outliers in real time.
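A minimal sketch of that outlier idea, with invented two-dimensional transaction vectors and an arbitrary distance threshold (real systems would use far more dimensions and a tuned threshold):

```python
import math

# Invented embeddings of normal transactions, which form a tight cluster
transactions = [
    [1.0, 0.9], [1.1, 1.0], [0.9, 1.1], [1.0, 1.0],
]

def centroid(points):
    """Component-wise mean of a list of vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_suspicious(tx, threshold=1.0):
    """Flag a transaction that sits unusually far from the normal cluster."""
    return distance(tx, centroid(transactions)) > threshold

print(is_suspicious([1.05, 0.95]))  # False: inside the cluster
print(is_suspicious([8.0, -3.0]))   # True: far from everything
```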

Personalisation

As users interact with your product, their behaviour can be tracked and represented as a vector. Match that against content vectors and you get personalisation that adapts based on what users actually do, rather than what category you’ve manually put them in.

The Main Databases Worth Knowing

  • Pinecone — fully managed, easy to set up, scales well. Most popular for teams that want something production-ready without managing infrastructure.
  • Weaviate — open-source with a cloud option, supports hybrid search combining vector and keyword. Good for more complex setups.
  • Qdrant — open-source, fast, great developer experience, runs locally or in the cloud. The one to look at if you want to self-host.
  • Chroma — open-source, easy to run locally, built specifically for AI applications. Popular for prototyping.
  • pgvector — a Postgres extension. Adds vector search to your existing database. If you’re already on Postgres, this is the most practical starting point.
  • Redis Vector Search — vector search on top of Redis, useful if you’re already using Redis and need low latency.

The Good Parts

  • Finds meaning, not just matching words. This genuinely changes how search feels to users. It’s not a small improvement.
  • Works across content types. Text, images, audio, code — anything that can be embedded can be searched and compared in the same system.
  • Scales for similarity search. Designed exactly for this problem, with indexing built to keep results fast even at massive scale.
  • Makes AI features actually useful. RAG and semantic search built on vector databases tend to feel dramatically better than their keyword-based equivalents. The difference is noticeable.
  • Not locked to one provider. Your embedding model and your vector database are separate. You can switch either one without rebuilding everything.

The Not-So-Good Parts

  • Another thing to run. Unless you’re using pgvector on existing Postgres, you’re adding another database to your stack. Another thing to host, monitor, back up, and keep running.
  • Results are approximate, not exact. Most vector search uses approximate algorithms that trade a tiny bit of accuracy for speed. For most use cases that’s totally fine, but it’s worth knowing you’re not getting a mathematically guaranteed result.
  • Embedding costs something. Every piece of content needs to be embedded before it goes in. That means API calls, time, and money. And if you switch embedding models later, you may need to re-embed your entire dataset.
  • Not a replacement for a regular database. Looking up a user by ID or filtering by date range — vector databases aren’t built for that. Most production setups run both, which adds complexity.
  • Harder to debug when something goes wrong. When keyword search returns a bad result, you can usually see why. When vector search returns something unexpected, figuring out why that vector was the closest match takes more digging.

Can You Use It in a Normal Web Project?

Yes, and it’s less of a big deal to set up than it sounds. The real question is just whether your product actually has a problem that vector search solves.

If you have any of these, it’s worth thinking about:

  • A search feature where users regularly can’t find things even though the content exists
  • An AI chatbot or assistant that needs to answer questions from your own content
  • Recommendations that need to be smarter than “same category”
  • A big content library where users struggle to discover what’s in it

The lowest-friction starting point for most web apps is pgvector. If you’re already on PostgreSQL — and most web apps are — it’s a single extension. You get vector search in the same database you’re already running, with familiar SQL syntax, no new infrastructure, no extra service to manage. For projects that aren’t dealing with billions of records, pgvector handles most of what you’d need.
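As a rough sketch of what that looks like in practice (the table and column names are hypothetical; the dimension matches OpenAI’s text-embedding-3-small):

```sql
-- Enable the extension once per database
CREATE EXTENSION IF NOT EXISTS vector;

-- A hypothetical content table with an embedding column
CREATE TABLE items (
    id        bigserial PRIMARY KEY,
    content   text,
    embedding vector(1536)
);

-- Nearest neighbours by cosine distance (the <=> operator);
-- $1 is the embedded query vector, passed in from your application
SELECT content
FROM items
ORDER BY embedding <=> $1
LIMIT 5;
```

It really is just SQL: insert rows with their vectors, then order by distance to the query vector.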

If you want something fully managed with more scale, Pinecone is the quickest path to production. Chroma is the easiest to get running locally for a prototype.

The basic loop for any project is:

  • Pick an embedding model — OpenAI’s text-embedding-3-small is cheap and works well
  • Embed your content when it’s created or updated, store the vectors
  • When a user searches, embed the query the same way and run a similarity search
  • Return the matching results

The tooling has become simple enough that the implementation genuinely isn’t the hard part anymore. The harder part is figuring out which features in your product actually benefit from semantic search — and building retrieval pipelines that surface the right content consistently.

Worth Bookmarking