The GitHub Semantic Code Search Tool
Search your vectorized GitHub data by meaning across repositories and components, with similarity scores and links back to the source — the retrieval layer for any agent that reasons over your codebase.
The function you need exists — somewhere
In a large codebase, finding the right implementation means knowing the file name or the exact symbol. Newcomers and agents alike waste time grepping for code they can’t name.
You must know the name
Text search only works if you already know what the code is called.
Concepts span files
"How do we handle auth?" isn’t one file — it’s a pattern scattered across the repo.
No relevance ranking
Grep returns every match equally; nothing tells you which one matters.
Proprietary code can’t leave
Your source is IP — it can’t go to a hosted code-search service.
Meaning-aware search across your repos
Semantics
Describe it, don’t name it
Find code by what it does.
The tool embeds your natural-language query and matches it against vectorized repositories and components, returning the most relevant code even when you don’t know the file or symbol name.
- Natural-language code search
- Repositories and/or components
- Similarity threshold filtering
- Scope to a single owner/repo
No file name needed
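To make the matching step concrete, here is a minimal sketch of ranking indexed code by cosine similarity to a query vector. It is an illustration under stated assumptions, not the tool's implementation: the three-dimensional vectors and file paths are made-up placeholders, and the embedding model that would produce real vectors is not specified here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, indexed):
    """Return (score, path) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, vec), path)
              for path, vec in indexed.items()]
    return sorted(scored, reverse=True)

# Placeholder vectors standing in for real embeddings (assumption).
INDEX = {
    "auth/verify_token.py": [0.9, 0.1, 0.0],
    "billing/invoice.py": [0.0, 0.2, 0.9],
    "webhooks/validate_signature.py": [0.7, 0.6, 0.1],
}

# Stand-in for the embedded query "how do we handle auth?" (assumption).
query_vec = [1.0, 0.2, 0.0]

ranked = rank_by_similarity(query_vec, INDEX)
for score, path in ranked:
    print(f"{score:.2f}  {path}")
```

The query never mentions a file or symbol name; the ordering comes entirely from how close the vectors are.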
Precision
Ranked, linked, thresholded
Only the matches worth reading.
Each hit comes with a similarity score and a URL back to the source, and min_similarity lets an agent drop weak matches so it acts only on high-confidence results.
Filter low-confidence hits
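The effect of the threshold can be sketched in a few lines. This example assumes each hit is a dictionary with "similarity" and "url" keys; those field names, the scores, and the example.com URLs are illustrative assumptions, and in practice the tool applies min_similarity itself before returning results.

```python
def confident_hits(hits, min_similarity=0.3):
    """Return hits at or above the threshold, strongest first."""
    if not 0 <= min_similarity <= 1:
        raise ValueError("min_similarity must be between 0 and 1")
    if min_similarity == 0:
        kept = list(hits)  # 0 disables the threshold
    else:
        kept = [h for h in hits if h["similarity"] >= min_similarity]
    return sorted(kept, key=lambda h: h["similarity"], reverse=True)

# Made-up hits; the URLs are placeholders, not real repositories.
hits = [
    {"similarity": 0.22, "url": "https://example.com/acme/api/blob/main/README.md"},
    {"similarity": 0.81, "url": "https://example.com/acme/api/blob/main/webhooks/verify.py"},
    {"similarity": 0.47, "url": "https://example.com/acme/api/blob/main/auth/session.py"},
]

for hit in confident_hits(hits, min_similarity=0.4):
    print(f'{hit["similarity"]:.2f}  {hit["url"]}')
```

Raising the threshold to 0.4 drops the weak README match, so an agent only reads the two results that are likely to be relevant.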
Governance
On-prem code retrieval
Your source stays your source.
The index and search run inside your perimeter, scoped per user with audit logging — making semantic code search usable on proprietary repositories that can’t touch hosted tools.
Per-tenant, logged
Parameters
The github_vector_search tool accepts these inputs when an agent calls it. Required inputs are flagged.
Optional. Default: all. Type of GitHub data to search. Allowed values: repositories, components, all.
Optional. Default: 10. Maximum number of results to return (1–50).
Optional. Default: 0.3. Minimum cosine similarity (0–1) a result must reach. Set 0 to disable.
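As a rough sketch of how an agent might assemble and check these inputs before calling github_vector_search: the defaults, ranges, and allowed values below come from the list above, but the field names query, search_type, and limit are hypothetical stand-ins, since only min_similarity is named on this page.

```python
from dataclasses import dataclass, asdict

ALLOWED_TYPES = ("repositories", "components", "all")

@dataclass
class SearchArgs:
    query: str                   # hypothetical name for the natural-language query
    search_type: str = "all"     # hypothetical name; values and default from the page
    limit: int = 10              # hypothetical name; range 1-50 from the page
    min_similarity: float = 0.3  # named on the page; range 0-1, 0 disables

    def __post_init__(self):
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if self.search_type not in ALLOWED_TYPES:
            raise ValueError(f"search_type must be one of {ALLOWED_TYPES}")
        if not 1 <= self.limit <= 50:
            raise ValueError("limit must be between 1 and 50")
        if not 0 <= self.min_similarity <= 1:
            raise ValueError("min_similarity must be between 0 and 1")

# Only the query is supplied explicitly; everything else falls back to defaults.
args = asdict(SearchArgs(query="where do we validate webhooks?", limit=5))
print(args)
```

Validating on the agent side is a design choice: an out-of-range value fails early with a readable message instead of surfacing as a tool error mid-run.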
Where code search pays back
Codebase onboarding
Let a new engineer ask "where do we validate webhooks?" and jump straight to the code.
Reuse discovery
Find an existing utility before someone reimplements it.
Impact scoping
Locate every component that touches a concept before a refactor.
Agent grounding
Ground an engineering agent’s answers in your actual repositories.
Pattern audits
Find all places a risky pattern appears, by meaning rather than regex.
Cross-repo search
Search across many repos at once, or scope to one with owner/repo.
Assigned to agents, orchestrated as networks
On VDF AI, an industry’s use cases map to agents, and you assign tools like this one to those agents. Compose multiple agents into a governed, on-premise network.
Questions about the GitHub Semantic Code Search tool
What is GitHub semantic code search?
It is a tool that searches your vectorized GitHub repositories and components by meaning, returning ranked matches with similarity scores and source URLs. Assigned to an agent, it lets the agent find and reason over your real code without knowing exact file names.
Can I search a single repository?
Yes. Provide owner and repo together to scope the search to one repository, or omit them to search across everything indexed for the user.
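The "together" rule is easy to get wrong, so here is a small sketch of how a caller might enforce it. The helper name repo_scope and the example values acme and payments are hypothetical; only the owner and repo input names come from the answer above.

```python
def repo_scope(owner=None, repo=None):
    """Build the scoping inputs: both owner and repo, or neither."""
    if (owner is None) != (repo is None):
        raise ValueError("owner and repo must be provided together")
    if owner is None:
        return {}  # no scope: search everything indexed for the user
    return {"owner": owner, "repo": repo}

print(repo_scope())                   # search across all indexed repositories
print(repo_scope("acme", "payments")) # scope to a single repository
```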
What does min_similarity do?
It sets the minimum cosine similarity (0–1) a result must reach to be returned, and defaults to 0.3. Raising it returns fewer, closer matches; setting it to 0 disables the threshold.
Is our source code exposed?
No. Indexing and search run on-premise or in your sovereign cloud, scoped per user and audit-logged, so proprietary code never leaves your perimeter.
How does it pair with other tools?
It is often assigned alongside the GitHub repository explorer and code review tools so an agent can find code, inspect it, and review it. The agent can also be combined with other agents in a network.
Let agents search your code by meaning
See GitHub semantic code search assigned to an engineering agent — on infrastructure you control.