Vector indexes and semantic search | VDF AI Documentation

When search-by-meaning needs an index of its own

Searching your knowledge covers how VDF AI searches across your sources by default. The default works beautifully out of the box for documents.

But there are moments when you want more control. You want to decide what gets indexed. You want to choose how text is split into searchable pieces. You want to pick the model that powers the search. And you want to know exactly what your team is searching across when they ask a question.

That’s what a vector index gives you: a search surface you own, scoped to data you choose, powered by a model you pick.

You can ignore this page completely and search will still work. Vector indexes are for the moments when default search isn't sharp enough — usually because you're searching across a database column, a curated feature list, or a very large corpus and want the search experience to match.

Who this is for

Data leads building a focused semantic search over a specific dataset.
Analysts and operations teams who want to ask natural-language questions over tables.
Workspace admins curating a “golden” search index for their team to use across products.

You don’t need any technical background. The screens walk you through every choice.

What a vector index is, in one sentence

A vector index is a searchable layer built from one of your sources, where the search understands meaning rather than just keywords.

You build it. You name it. You pick what goes in. Once built, every other product surface — Chat, Agents, Networks — can search it.

What you can build an index from

The most common sources:

A database asset

Index one or more text-heavy columns of a connected table — descriptions, notes, reviews, transcripts, status comments.

A feature list

Use the curated list you built in [Features and relationships](/docs/products/vdf-ai-data/features-and-relationships) as the scope of the index.

A connected app

Index a specific Confluence space, Jira project, or GitHub repo as a standalone search surface.

A file collection

Group uploaded files and build an index over only that collection.

Building an index — the choices you make

Every index is built from one short form with three meaningful choices.

1. What goes in

Pick the source and scope it as narrowly as you can. A narrower index produces sharper search than a wider one. You can always add a second index later for a different scope.

2. How text is split

VDF AI Data breaks the source into small pieces before it can be searched. Two numbers control how:

Chunk size. How much text goes into each piece. Smaller chunks (a few sentences) make searches precise. Larger chunks (a few paragraphs) preserve more context.
Overlap. How much of one chunk is repeated at the start of the next. Some overlap helps the search find ideas that span chunk boundaries.
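
To make the two numbers concrete, here is a minimal sketch of sliding-window chunking in Python. It is an illustration only, not how VDF AI Data does it internally: the function name split_into_chunks is hypothetical, and it measures chunk size and overlap in words, whereas the product may count in other units.

```python
def split_into_chunks(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into word-based chunks.

    Each chunk after the first starts with the last `overlap` words of the
    previous chunk. Units are words here, which is an assumption.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reached the end of the text
    return chunks


sample = "one two three four five six seven eight nine ten"
for chunk in split_into_chunks(sample, chunk_size=4, overlap=1):
    print(chunk)
# one two three four
# four five six seven
# seven eight nine ten
```

Notice how "four" and "seven" appear in two chunks each. That repetition is the overlap, and it is what lets a search match an idea that straddles a boundary.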

The defaults work for most teams. For the first index you build, accept the defaults. After you've searched it for a few days, you'll know whether the answers feel "too narrow" (raise chunk size) or "too generic" (lower it).

A simple guide:

If your source is…	Try this
Short text snippets (reviews, support tickets, status comments)	Smaller chunks, low overlap
Long-form documents (policies, contracts, articles)	Medium chunks, moderate overlap
Mixed-length material	Defaults — they’re tuned for this case

3. Which embedding model

The embedding model is what turns text into a form a search can match by meaning. VDF AI Data has a catalog of models. They differ on three axes:

Quality. Some are better at picking up subtle differences in meaning.
Speed. Some are faster to index and faster to search.
Language coverage. Some are tuned for English, others handle many languages well.

The catalog labels the trade-offs clearly. For most teams, the default model is a solid starting point. Switch when you know why — for example, when your data is mostly in a non-English language and a language-specific model exists.
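
If you are curious what "match by meaning" looks like underneath, the sketch below compares texts by the angle between their vectors (cosine similarity), a common way to compare embeddings. The three-number vectors are hand-made toys, not output from any model in the catalog, and the product may use a different similarity measure.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Toy vectors invented for illustration; real embeddings are much longer.
refund_request = [0.9, 0.1, 0.0]   # "I want a refund"
money_back = [0.8, 0.2, 0.1]       # "Can I get my money back?"
login_error = [0.0, 0.2, 0.9]      # "I can't log in"

related = cosine_similarity(refund_request, money_back)
unrelated = cosine_similarity(refund_request, login_error)
print(related > unrelated)  # True
```

The two refund phrases share no keywords, yet their vectors point in nearly the same direction. A better model is one that places texts with similar meaning closer together in this way.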

What happens after you click “Build”

Three stages, all visible in the build log.

Reading. VDF AI Data pulls the source content. For a database column, this is a snapshot read. For documents, it's a fetch of the current versions.
Chunking. The content is split into searchable pieces according to your chunk-size and overlap settings.
Embedding. Each chunk is processed by the embedding model. This is where most of the build time goes, which is why bigger indexes take longer.
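
Because every chunk is embedded once, build time grows roughly with the number of chunks. The sketch below is a back-of-the-envelope estimate of that number. It assumes sizes are measured in words and that chunks are cut with a simple sliding window; both are assumptions, and estimate_chunk_count is a hypothetical helper, not a product feature.

```python
def estimate_chunk_count(total_words: int, chunk_size: int, overlap: int) -> int:
    """Estimate how many chunks a sliding window produces over `total_words`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    if total_words <= 0:
        return 0
    if total_words <= chunk_size:
        return 1
    step = chunk_size - overlap
    remaining = total_words - chunk_size
    # One chunk for the first window, plus one per step needed to cover the rest.
    return -(-remaining // step) + 1  # -(-a // b) is integer ceiling division


print(estimate_chunk_count(1_000_000, chunk_size=200, overlap=20))  # 5556
print(estimate_chunk_count(1_000_000, chunk_size=100, overlap=10))  # 11111
```

Halving the chunk size roughly doubles the number of chunks to embed, so smaller chunks usually mean a longer build for the same source.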

When the build finishes, the index moves to the Ready state and is immediately usable across every other VDF AI product surface.

Build status, in plain language

State	What it means	What to do
Draft	You’re still configuring; not building yet	Finish the form and click Build
Running	Build in progress; you can watch the log	Wait — the time scales with source size
Ready	Index is live and searchable	Start using it in Chat, Agents, Networks
Needs attention	Build hit an issue (source unavailable, model busy)	Read the log; usually retry-able
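
If you track index builds in your own tooling, the four states can be modeled as a small state machine. The sketch below is one reading of the table above. The allowed transitions are an assumption inferred from it (retry after an issue, rebuild from Ready), not a documented product contract.

```python
from enum import Enum


class BuildState(Enum):
    DRAFT = "Draft"
    RUNNING = "Running"
    READY = "Ready"
    NEEDS_ATTENTION = "Needs attention"


# Assumed transitions, inferred from the status table.
ALLOWED_TRANSITIONS = {
    BuildState.DRAFT: {BuildState.RUNNING},                            # click Build
    BuildState.RUNNING: {BuildState.READY, BuildState.NEEDS_ATTENTION},
    BuildState.READY: {BuildState.RUNNING},                            # rebuild
    BuildState.NEEDS_ATTENTION: {BuildState.RUNNING},                  # retry
}


def can_transition(current: BuildState, target: BuildState) -> bool:
    """Return True if moving from `current` to `target` is allowed in this sketch."""
    return target in ALLOWED_TRANSITIONS[current]


print(can_transition(BuildState.DRAFT, BuildState.RUNNING))  # True
print(can_transition(BuildState.DRAFT, BuildState.READY))    # False
```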

Searching the index

Once an index is Ready, you search it from any product surface — or directly from the index detail page.

The search interface is plain language:

“What do customers say about onboarding speed?”
“Find every reference to refunds for the EU market.”
“Show me the comments about latency on the orders table.”

You can also pick how many results to return (top 5 is usually enough; bump to 20 for a broader sweep) and read the matching chunks ranked by relevance.
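
The "top 5" setting can be pictured as keeping only the highest-scoring chunks. Here is a minimal sketch, assuming each chunk already has a relevance score where higher means more relevant. The function name top_k, the scores, and the sample text are all made up for illustration.

```python
import heapq


def top_k(scored_chunks: list[tuple[float, str]], k: int = 5) -> list[tuple[float, str]]:
    """Return the k highest-scoring (score, chunk) pairs, best first."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return heapq.nlargest(k, scored_chunks, key=lambda pair: pair[0])


# Invented scores and chunks, for illustration only.
scored = [
    (0.91, "Onboarding took two days end to end."),
    (0.42, "The invoice layout changed in March."),
    (0.87, "Setup was quick once we had access."),
    (0.15, "Office hours are 9 to 5."),
]

for score, chunk in top_k(scored, k=2):
    print(score, chunk)
# 0.91 Onboarding took two days end to end.
# 0.87 Setup was quick once we had access.
```

Raising the number of results widens the sweep but also lets in weaker matches from further down the ranking.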

Search history

Every search is logged for you — the query, the time, how many results came back. Two reasons to look at the history:

Refining a workflow. If you keep asking variations of the same question, your team probably needs a feature list, an agent, or a saved view of that question.
Understanding usage. Workspace admins use search history to see which indexes are getting used and which aren’t.
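
If you keep your own copy of the search history, a few lines of Python can answer the usage question. This sketch assumes each record carries the index name alongside the query, time, and result count. The SearchRecord shape and the index names are hypothetical, not the product's export format.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SearchRecord:
    index_name: str        # assumption: which index was searched
    query: str
    searched_at: datetime
    result_count: int


def searches_per_index(history: list[SearchRecord]) -> Counter:
    """Count how many searches each index received."""
    return Counter(record.index_name for record in history)


def unused_indexes(all_indexes: list[str], history: list[SearchRecord]) -> list[str]:
    """Return the indexes that never appear in the history, in their given order."""
    used = {record.index_name for record in history}
    return [name for name in all_indexes if name not in used]


# Invented records, for illustration only.
history = [
    SearchRecord("support-comments", "refunds for the EU market", datetime(2024, 5, 1, 9, 30), 5),
    SearchRecord("support-comments", "onboarding speed", datetime(2024, 5, 1, 10, 15), 5),
    SearchRecord("order-notes", "latency on orders", datetime(2024, 5, 2, 14, 0), 20),
]

print(searches_per_index(history).most_common(1))  # [('support-comments', 2)]
print(unused_indexes(["support-comments", "order-notes", "contracts"], history))  # ['contracts']
```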

When to rebuild an index

A vector index is a snapshot. It doesn’t refresh automatically — and most of the time you don’t want it to.

Two natural rebuild triggers:

The source changed substantially. A bulk reload, a schema change, or a new batch of documents.
You changed your mind about chunking or model. Rebuild with the new settings; compare the search results before/after.

Set a calendar reminder for a monthly rebuild on indexes that back important workflows. A routine, predictable schedule means you never get caught with a stale index powering a customer-facing search.
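
If you would rather check staleness in a script than rely on a reminder, the rule is a simple date comparison. This sketch treats "monthly" as 30 days and assumes you record the last build time yourself. The function name is_due_for_rebuild is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_due_for_rebuild(last_built: datetime, now: datetime, max_age_days: int = 30) -> bool:
    """Return True when the index is at least `max_age_days` old."""
    return now - last_built >= timedelta(days=max_age_days)


last_built = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_due_for_rebuild(last_built, now=datetime(2024, 1, 20, tzinfo=timezone.utc)))  # False
print(is_due_for_rebuild(last_built, now=datetime(2024, 2, 15, tzinfo=timezone.utc)))  # True
```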

A few patterns that work

One narrow index per use case, not one giant index

A focused index on “support ticket comments” produces sharper search than a sprawling index on “everything from the support database.” Multiple narrow indexes beat one wide one almost every time.

Pair an index with a feature list

If you’ve built a feature list for an asset, scope the index to the same list. The search and the analysis stay aligned.

Track which products use which index

In your team’s docs, write down which index powers which agent, network, or workflow. When you rebuild, you know what to retest.
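
That record can be as simple as a lookup table. Here is a minimal sketch, with index and consumer names invented purely for illustration. Swap in your own.

```python
# Hypothetical mapping: index name -> the agents, networks, or workflows that use it.
INDEX_CONSUMERS: dict[str, list[str]] = {
    "support-ticket-comments": ["Support triage agent", "Weekly escalation workflow"],
    "contract-clauses": ["Legal review agent"],
}


def what_to_retest(index_name: str) -> list[str]:
    """Return everything that depends on the given index (empty if none is recorded)."""
    return INDEX_CONSUMERS.get(index_name, [])


print(what_to_retest("support-ticket-comments"))
# ['Support triage agent', 'Weekly escalation workflow']
```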

Where to go next

Searching your knowledge — how to ask great questions, with or without a custom index.
Features and relationships — scope an index to the columns that matter for your use case.
Fine-tuning datasets — go from “searchable” to “trained on.”
VDF AI Agents — give an agent a vector index as a knowledge source.