Vector Search with Pinecone for AI Applications
15 May 2026 · by Yunmin Shin
What Is Vector Search and When Do You Need It?
Traditional database search is keyword-based: it finds records where a text field contains the exact words you searched for. Vector search is semantic: it finds records that are conceptually similar to your query, even if they share no words.
A keyword search for "where can I eat late at night in Bangkok" would miss a record phrased as "24-hour restaurants in the city." A vector search would surface it, because the two phrases mean roughly the same thing.
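To make "conceptually similar" concrete: embeddings are compared with a distance metric, most often cosine similarity. A minimal sketch of the idea — the three-dimensional vectors here are made-up toy examples; real embeddings have hundreds or thousands of dimensions:

```typescript
// Cosine similarity: dot product divided by the product of magnitudes.
// Ranges from -1 (opposite direction) to 1 (same direction).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Toy "embeddings" for illustration only
const lateNightEats = [0.9, 0.8, 0.1];
const allNightDiner = [0.85, 0.75, 0.15];
const taxRegulations = [0.1, 0.2, 0.95];

cosineSimilarity(lateNightEats, allNightDiner);  // high — similar meaning
cosineSimilarity(lateNightEats, taxRegulations); // low — unrelated
```

A vector database does exactly this comparison, but across millions of stored vectors, using approximate-nearest-neighbor indexes to avoid scanning every one.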
Vector search is the enabling technology behind AI-powered features: intelligent product search, document Q&A systems (RAG), semantic content recommendations, and customer support chatbots that retrieve relevant knowledge base articles.
What Is Pinecone?
Pinecone is a managed vector database. You store embedding vectors (arrays of floating point numbers that represent the semantic meaning of text or other data) in Pinecone, and query it to find the most similar vectors to a query embedding. Pinecone handles indexing, storage, and retrieval at scale with low latency.
A common alternative is pgvector, a PostgreSQL extension that adds vector support to your existing database. pgvector is a good choice if you are already on PostgreSQL and your vector dataset is small to medium (under a few million vectors). Pinecone scales more easily to very large datasets and provides fast approximate nearest neighbor search without you managing the index yourself.
How Do You Generate Embeddings?
An embedding model converts text into a vector. OpenAI's text-embedding-3-small model is a popular default: it is cheap, fast, and produces high-quality embeddings:
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "Your text to embed here",
});

const embedding = response.data[0].embedding; // 1536-dimensional float array
Generate embeddings for all documents when you index them, and for each user query at search time.
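Since the embeddings endpoint accepts an array of inputs, index-time embedding is usually done in batches rather than one request per document. A minimal batching helper — the batch size of 100 is an arbitrary choice for illustration, not an API limit:

```typescript
// Split an array into chunks of at most `size` items, e.g. to send
// document texts to the embeddings API in batched requests.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Example: 250 documents in batches of 100 -> 3 requests
chunk(new Array(250).fill("doc text"), 100).length; // 3
```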
How Do You Index and Query Pinecone?
Install the Pinecone client:
npm install @pinecone-database/pinecone
Index a document:
import { Pinecone } from "@pinecone-database/pinecone";

const pinecone = new Pinecone(); // reads PINECONE_API_KEY from the environment
const index = pinecone.index("your-index-name");

await index.upsert([{
  id: document.id,
  values: embedding, // the vector from the embedding step
  metadata: { title: document.title, url: document.url },
}]);
Query for similar documents:
const results = await index.query({
  vector: queryEmbedding,
  topK: 5,
  includeMetadata: true,
});
The topK results are the most semantically similar documents to the query. Use the metadata to display them to the user or pass them to a language model for RAG.
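Assuming the metadata shape from the upsert example above (`title` and `url` fields), the matches can be flattened into lines for display or for a prompt. A sketch — the `Match` interface is a simplified stand-in for the SDK's response type:

```typescript
// Simplified shape of one Pinecone query match; metadata keys
// follow the upsert example above.
interface Match {
  id: string;
  score: number;
  metadata?: { title: string; url: string };
}

// Turn topK matches into a plain-text list, one line per result
function buildContext(matches: Match[]): string {
  return matches
    .filter((m) => m.metadata !== undefined)
    .map((m, i) => `[${i + 1}] ${m.metadata!.title} (${m.metadata!.url})`)
    .join("\n");
}
```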
What Is RAG and How Does It Work?
Retrieval-Augmented Generation (RAG) is the pattern that powers document chatbots. Instead of asking a language model to answer from its training data, you:
- Embed the user's question
- Retrieve the most relevant documents from Pinecone
- Include those documents in the prompt to the language model
- Ask the model to answer based on the provided context
This allows the model to answer questions about private company data, recent events, or domain-specific knowledge that was not in its training data. It is the standard architecture for internal knowledge bases, product documentation chatbots, and any AI assistant that needs accurate, up-to-date answers.
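Putting the steps together, the prompt-assembly stage might look like this. A sketch only — the system-message wording and document format are illustrative choices, not a fixed API:

```typescript
interface RetrievedDoc {
  title: string;
  content: string;
}

// Build a RAG prompt: retrieved documents become the context, and the
// model is instructed to answer only from that context.
function buildRagPrompt(question: string, docs: RetrievedDoc[]): string {
  const context = docs
    .map((d, i) => `--- Document ${i + 1}: ${d.title} ---\n${d.content}`)
    .join("\n\n");
  return [
    "Answer the question using only the context below.",
    'If the answer is not in the context, say "I don\'t know."',
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

The resulting string goes into a chat-completion call as the user (or system) message; grounding the model in retrieved context is what keeps its answers tied to your data rather than its training set.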
Ready to Build Something Fast?
Get a free quote on LINE. We reply within 24 hours.