Skip to content

feat: add select:repo query modifier to return matching repositories#1015

Open
thomasleveil wants to merge 15 commits intosourcebot-dev:mainfrom
thomasleveil:feat/select-repo
Open

feat: add select:repo query modifier to return matching repositories#1015
thomasleveil wants to merge 15 commits intosourcebot-dev:mainfrom
thomasleveil:feat/select-repo

Conversation

@thomasleveil
Copy link

@thomasleveil thomasleveil commented Mar 18, 2026

Summary

Adds select:repo as a query modifier that returns a deduplicated, sorted list of matching repositories instead of individual file results. This is modeled after Sourcegraph's select: projection syntax.

Use cases


Changes

Query language (packages/queryLanguage)

  • Adds SelectExpr to the Lezer grammar (select: "repo" | "file")
  • Adds "select:" to the tokenizer prefix list so it is not parsed as a plain word

Search pipeline (packages/web)

  • parser.ts: extractSelectMode() walks the parse tree for SelectExpr and returns { ir, selectMode }. SelectExpr maps to { const: true } in the IR — a no-op for zoekt, processed client-side only.
  • searchApi.ts: applySelectRepo() deduplicates file results by repository and returns RepoResult[] sorted by match count descending.
  • zoektSearcher.ts: accumulates a repo map across SSE chunks during streaming; intermediate chunks emit files=[], repoResults=[...partial]; final message emits the complete sorted list.
  • useStreamedSearch.ts: state gains repoResults: RepoResult[], included in the cache entry for instant replay on back/forward navigation.

UI (packages/web)

  • New RepoResultsPanel component: displays repos with match counts; each row navigates to a scoped repo:<name> search.
  • searchResultsPage: detects select:repo in the active query and swaps the file results panel for RepoResultsPanel.
  • Search bar: select: autocompletion, syntax highlighting, and refine-mode suggestion.

MCP (packages/mcp)

  • New search_repos tool: appends select:repo internally, accepts query, filterByLanguages, caseSensitive, ref, maxResults, returns a formatted repo list with match counts and optional URLs.
  • repoResultSchema and RepoResult type added to schemas/types (mirroring the web package).

Tests (packages/mcp)

  • 15 tests across 4 suites using node:test + tsx (no extra test framework):
    • repoResultSchema validation (3)
    • searchResponseSchema backward compatibility (3)
    • search_code hasModifiers transform (5)
    • search_repos end-to-end via InMemoryTransport (4)

Demo

// Search bar:
react hooks select:repo

// Returns: list of repos matching "react hooks", sorted by number of matches
// Click any repo → navigates to: react hooks repo:<repo-name>
// MCP:
search_repos({ query: "useState", filterByLanguages: ["TypeScript"] })
// → "1. my-org/frontend (42 matches)\n2. my-org/utils (7 matches)"

Summary by CodeRabbit

  • New Features

    • Added select:repo query modifier to return deduplicated repository results sorted by match count
    • New repository results panel showing matched projects, counts, and quick navigation
    • Search bar gains "Select mode" suggestions to refine queries for repository-level results
  • Tests

    • Added comprehensive tests covering select:repo behavior, suggestion handling, and end-to-end repo result rendering

Introduces SelectExpr as a new top-level expression in query.grammar:

  SelectExpr { selectKw selectValue }
  selectValue { "repo" | "file" }

Also adds "select:" to the PREFIXES list in tokens.ts so it is not
tokenised as a plain word by the lezer lexer.

The generated parser (parser.ts / parser.terms.ts) must be rebuilt
after this change via `yarn build` in packages/queryLanguage.
Adds SelectMode ('repo' | null) and extractSelectMode() which walks the
lezer parse tree looking for a SelectExpr node.

parseQuerySyntaxIntoIR() now returns { ir, selectMode } so callers can
act on the projection modifier without touching the zoekt IR.

SelectExpr maps to { const: true } in the IR — a no-op for zoekt — so
the modifier is transparent to the search engine and only processed
client-side.

Also adds:
- RepoResult type (repositoryId, repository, repositoryInfo?, matchCount)
  exported from @/features/search
- select: entry in useRefineModeSuggestions so it surfaces in the
  refine dropdown alongside the type it represents
When selectMode === 'repo', results are post-processed rather than
filtered at the zoekt level:

  searchApi.ts
    - applySelectRepo() deduplicates file results by repository and
      returns RepoResult[] sorted by matchCount desc
    - search() and streamSearch() detect selectMode and route
      accordingly

  zoektSearcher.ts
    - zoektStreamSearch() accepts selectMode and accumulates a
      _repoMap across SSE chunks
    - Intermediate chunks emit files=[], repoResults=[...partial]
    - Final message emits the complete sorted repoResults list

  useStreamedSearch.ts
    - State gains repoResults: RepoResult[]
    - CacheEntry includes repoResults for instant replay on back/forward
Adds RepoResultsPanel — a new component that displays the deduplicated
list of matching repositories with their match count. Each row is
clickable and navigates to a scoped `repo:<name>` search.

searchResultsPage now detects select:repo in the active query and
swaps the file results panel for RepoResultsPanel. The two modes are
mutually exclusive — no layout change is needed for the file panel.
- constants.ts: adds select:repo to the suggestion completions
- searchSuggestionsBox.tsx: surfaces select: suggestions in the dropdown
- useSuggestionModeMappings.ts: maps 'select' to its completions
- zoektLanguageExtension.ts: highlights 'select:' as a keyword prefix

Typing 'select:' in the search bar now shows 'repo' and 'file' as
completions, consistent with how lang:, repo:, and file: work.
schemas.ts / types.ts
  - Adds repoResultSchema and RepoResult, mirroring the web package
  - Adds repoResults?: RepoResult[] to searchResponseSchema so the
    MCP client can parse select:repo API responses

index.ts — new tool: search_repos
  - Accepts query, filterByLanguages, caseSensitive, ref, maxResults
  - Appends select:repo to the query before calling the search API
  - Returns a formatted list: repo name, match count, optional URL
  - Designed to answer 'which repos use X?' questions directly

index.ts — shared helpers (reusable by future tools e.g. search_commits)
  - searchFilterParamsSchema: zod schema spread into each tool's params
  - buildQueryFilters(): pure function that appends lang:, repo:,
    file:, rev: filter tokens to a query string

index.ts — exports `server` for use in tests
15 tests across 4 suites using node:test + tsx (no extra test framework):

  repoResultSchema (3)
    - valid parse, with optional repositoryInfo, missing required field

  searchResponseSchema with repoResults (3)
    - backward compat (no repoResults), with repoResults, invalid entry

  search_code hasModifiers transform (5)
    - detects select:, lang:, repo:; no false positives on plain text
      or partial words like selector:

  search_repos tool end-to-end via InMemoryTransport (4)
    - returns repo list, empty-results message, lang: filter appended,
      maxResults respected with total count in output

Run: yarn workspace @sourcebot/mcp test
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 18, 2026

Walkthrough

Adds a select:repo query modifier end-to-end: grammar/tokenizer, MCP tool and tests, search API and streaming aggregation, schemas/types, and web UI components to display deduplicated repository results with match counts.

Changes

Cohort / File(s) Summary
Query Language
packages/queryLanguage/src/query.grammar, packages/queryLanguage/src/tokens.ts
Add SelectExpr rule and select: token; integrate into prefix/value productions so select: is recognized as a modifier.
MCP Server & Tests
packages/mcp/src/index.ts, packages/mcp/src/schemas.ts, packages/mcp/src/types.ts, packages/mcp/src/__tests__/select-repo.test.ts, packages/mcp/package.json
Export server instance, introduce repoResultSchema/RepoResult type, add search_repos tool and shared query builders, add tests for select:repo behavior, and add package test script.
Search Schemas & Types (web)
packages/web/src/features/search/types.ts, packages/web/src/features/search/index.ts
Add repoResultSchema and RepoResult type; extend streamed and final response schemas with optional repoResults.
Search Parser & Mode Extraction
packages/web/src/features/search/parser.ts
Add SelectMode type, getSelectModeFromQuery, extract select mode during parse, and thread selectMode out of parseResult (parse returns {ir, selectMode}).
Search API & Zoekt Integration
packages/web/src/features/search/searchApi.ts, packages/web/src/features/search/zoektSearcher.ts
Thread selectMode into search/stream paths, add accumulateRepoMap and applySelectRepo to aggregate per-repo counts, and include repoResults in streaming final responses when selectMode==='repo'.
Streaming & Client State
packages/web/src/app/[domain]/search/useStreamedSearch.ts
Propagate repoResults through streaming state, cache entries, and final state; initialize and merge incoming repoResults.
Web UI — Search Bar & Tokenization
packages/web/src/app/[domain]/components/searchBar/constants.ts, .../searchSuggestionsBox.tsx, .../useRefineModeSuggestions.ts, .../useSuggestionModeMappings.ts, .../zoektLanguageExtension.ts
Add select: prefix to suggestion lists and tokenizer; introduce "select" suggestion mode and related mappings/suggestions.
Web UI — Results UI
packages/web/src/app/[domain]/search/components/repoResultsPanel.tsx, packages/web/src/app/[domain]/search/components/searchResultsPage.tsx
New RepoResultsPanel component; detect select:repo mode and conditionally render repository-level results with match counts, and pass repoResults/isSelectRepoMode/searchQuery through PanelGroup.
Docs
CHANGELOG.md
Document the new select:repo modifier, search_repos MCP tool, and RepoResultsPanel UI.

Sequence Diagram

sequenceDiagram
    participant User
    participant Parser as Query Parser
    participant SearchAPI as Search API
    participant Zoekt as Zoekt Search
    participant UI as Search UI

    User->>Parser: submit query with select:repo
    Parser->>Parser: parse & extract selectMode='repo'
    Parser-->>SearchAPI: IR + selectMode

    SearchAPI->>Zoekt: execute search request (with selectMode)
    Zoekt-->>SearchAPI: stream file chunks

    alt selectMode == 'repo'
        SearchAPI->>SearchAPI: accumulate per-repo counts (accumulateRepoMap)
        SearchAPI->>SearchAPI: applySelectRepo() => build repoResults (dedup + counts)
        SearchAPI-->>UI: final SearchResponse with repoResults
    else
        SearchAPI-->>UI: file-centric SearchResponse
    end

    UI->>UI: isSelectRepoMode = true
    UI->>UI: render RepoResultsPanel with repoResults
    User->>UI: click repo row
    UI->>UI: remove select:repo, add repo:<name>, navigate to file-level search
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

sourcebot-team

Suggested reviewers

  • brendan-kellam
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main feature: adding a select:repo query modifier that returns matching repositories instead of file-level results.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Important

Merge conflicts detected (Beta)

  • Resolve merge conflict in branch feat/select-repo
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/web/src/app/[domain]/search/components/searchResultsPage.tsx (1)

339-345: ⚠️ Potential issue | 🟡 Minor

Streaming status text is file-centric even in repo projection mode.

When isSelectRepoMode is active, the loading line still reports file counts (...in 0 files). Consider a repo-aware streaming status string here to avoid confusing feedback.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/app/`[domain]/search/components/searchResultsPage.tsx around
lines 339 - 345, The streaming status currently always shows file-centric text;
update the JSX that renders when isStreaming to branch on isSelectRepoMode and
render a repo-centric message when true (e.g., "Found X matches in Y
repositories" or similar) instead of "in N files". Locate the isStreaming block
in searchResultsPage.tsx (the fragment containing RefreshCwIcon, the
"Searching..." <p>, and the conditional that uses numMatches and
fileMatches.length) and change the inner conditional to check isSelectRepoMode
and use repository-aware counts (use existing variables that represent repo
counts or compute repos from fileMatches if necessary) so the status text
reflects repo projection mode.
🧹 Nitpick comments (2)
packages/mcp/src/__tests__/select-repo.test.ts (1)

106-117: Consider importing the regex from the source instead of duplicating it.

The hasModifiers regex is duplicated here rather than imported from the actual implementation. If the implementation changes, this test could still pass while the actual behavior differs.

That said, this approach has merit for testing expected behavior independently of implementation details. If this is intentional (testing the contract rather than the implementation), consider adding a comment to clarify.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/mcp/src/__tests__/select-repo.test.ts` around lines 106 - 117, The
test duplicates the hasModifiers regex (currently assigned to RE) instead of
using the canonical value from the implementation; update the test to import the
hasModifiers (or exported regex) from the source module and use that in place of
the local RE variable so the test tracks real behavior (or, if duplication was
intentional, replace the local RE with a short comment explaining that the test
asserts the contract and must remain independent). Ensure you reference and
replace usages of RE in the test with the imported hasModifiers symbol (or add
the explanatory comment above the RE declaration).
packages/web/src/features/search/zoektSearcher.ts (1)

242-253: Consider centralizing repo aggregation logic used in both streaming and non-streaming paths.

This block duplicates the same aggregation strategy from packages/web/src/features/search/searchApi.ts (applySelectRepo). Extracting a shared helper would reduce behavior drift (especially for match counting and sorting rules).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/features/search/zoektSearcher.ts` around lines 242 - 253,
The repo-aggregation logic inside the selectMode === 'repo' block is duplicated
with applySelectRepo in searchApi.ts; extract a shared helper (e.g.,
aggregateRepoMap or applySelectRepoShared) that accepts (files, repositoryInfo,
_accumulatedRepoMap) and encapsulates the repositoryId lookup, repositoryInfo
find, and the matchCount calculation (file.chunks.reduce((acc, chunk) => acc +
chunk.matchRanges.length, 0)) and update/insert semantics currently performed on
_accumulatedRepoMap; replace the in-place loop in zoektSearcher.ts with a call
to that new helper and update searchApi.ts to call the same helper so both
streaming and non-streaming paths share identical aggregation and sorting
behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/mcp/src/index.ts`:
- Around line 489-490: The handler for search_repos is dropping the repo/path
filters from the request schema so filterByRepos and filterByFilepaths never get
applied; update the search_repos handler to read filterByRepos and
filterByFilepaths from the parsed input (from searchFilterParamsSchema) and
apply them to the search pipeline (either by passing them into the
repo/filepath-aware search function you call or by filtering the results before
returning). Specifically, locate the search_repos handler and the use of
searchFilterParamsSchema and ensure the parsed object includes filterByRepos and
filterByFilepaths, then wire those values into the function that performs
repository/filepath scoping (or add an explicit filtering step using those
arrays) so user-supplied filters are honored.
- Around line 506-508: The current substring check on fullQuery for
'select:repo' can misfire when that sequence is embedded in another token;
update the detection to be token-aware by splitting fullQuery into tokens or
using a regex with word boundaries to match the exact token 'select:repo' (e.g.,
/^\bselect:repo\b$/-style matching across tokens), and only append '
select:repo' when no exact token is present; adjust the logic around the
existing fullQuery variable so that you don't incorrectly skip adding the
modifier when the sequence appears inside another token.

In `@packages/web/src/app/`[domain]/search/components/repoResultsPanel.tsx:
- Around line 18-24: The repo name is interpolated raw in navigateToRepo causing
invalid/overbroad filters for names with spaces, quotes, or special chars;
sanitize and quote it before appending: produce a safeRepo by escaping
backslashes and double-quotes in repoName (e.g., replace "\" with "\\", and `"`
with `\"`), then wrap safeRepo in double quotes if it contains whitespace or any
non-alphanumeric/-/._ characters, and finally use that quoted safeRepo in the
concat call (replace the current repo:${repoName} insertion). Update
navigateToRepo and any helper used for building newQuery to use this
escaped/quoted value.

In `@packages/web/src/app/`[domain]/search/components/searchResultsPage.tsx:
- Around line 87-89: Replace the syntax-unaware regex check that defines
isSelectRepoMode (currently using searchQuery with
/(?:^|\s)select:repo(?:\s|$)/) with the parser-derived select-mode signal from
the search pipeline/hook (e.g., a provided selectMode or searchState.selectMode
value exposed by the search hook you use elsewhere). Locate where
isSelectRepoMode and searchQuery are used in searchResultsPage.tsx and read the
authoritative select mode from the search hook/state instead of inspecting raw
query text so panel selection matches the parser behavior.

In `@packages/web/src/features/search/searchApi.ts`:
- Around line 124-137: The repo aggregation currently sets matchCount by summing
chunk.matchRanges only, which omits filename-only matches; update the logic in
searchApi.ts (the repoMap population and the existing.matchCount update) to also
add the filename match count (e.g., add 1 when the file has a filename-only hit
flag/property) when computing matchCount, and make the same change in the
mirrored logic inside zoektSearcher.ts so both paths increment matchCount for
filename-only matches as well as chunk matches; reference the repoMap variable
and the matchCount field to locate and update the two places.

---

Outside diff comments:
In `@packages/web/src/app/`[domain]/search/components/searchResultsPage.tsx:
- Around line 339-345: The streaming status currently always shows file-centric
text; update the JSX that renders when isStreaming to branch on isSelectRepoMode
and render a repo-centric message when true (e.g., "Found X matches in Y
repositories" or similar) instead of "in N files". Locate the isStreaming block
in searchResultsPage.tsx (the fragment containing RefreshCwIcon, the
"Searching..." <p>, and the conditional that uses numMatches and
fileMatches.length) and change the inner conditional to check isSelectRepoMode
and use repository-aware counts (use existing variables that represent repo
counts or compute repos from fileMatches if necessary) so the status text
reflects repo projection mode.

---

Nitpick comments:
In `@packages/mcp/src/__tests__/select-repo.test.ts`:
- Around line 106-117: The test duplicates the hasModifiers regex (currently
assigned to RE) instead of using the canonical value from the implementation;
update the test to import the hasModifiers (or exported regex) from the source
module and use that in place of the local RE variable so the test tracks real
behavior (or, if duplication was intentional, replace the local RE with a short
comment explaining that the test asserts the contract and must remain
independent). Ensure you reference and replace usages of RE in the test with the
imported hasModifiers symbol (or add the explanatory comment above the RE
declaration).

In `@packages/web/src/features/search/zoektSearcher.ts`:
- Around line 242-253: The repo-aggregation logic inside the selectMode ===
'repo' block is duplicated with applySelectRepo in searchApi.ts; extract a
shared helper (e.g., aggregateRepoMap or applySelectRepoShared) that accepts
(files, repositoryInfo, _accumulatedRepoMap) and encapsulates the repositoryId
lookup, repositoryInfo find, and the matchCount calculation
(file.chunks.reduce((acc, chunk) => acc + chunk.matchRanges.length, 0)) and
update/insert semantics currently performed on _accumulatedRepoMap; replace the
in-place loop in zoektSearcher.ts with a call to that new helper and update
searchApi.ts to call the same helper so both streaming and non-streaming paths
share identical aggregation and sorting behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e1babde4-96b6-4b14-bacd-4df2ce2178b1

📥 Commits

Reviewing files that changed from the base of the PR and between e29056b and 09ccf9e.

📒 Files selected for processing (21)
  • CHANGELOG.md
  • packages/mcp/package.json
  • packages/mcp/src/__tests__/select-repo.test.ts
  • packages/mcp/src/index.ts
  • packages/mcp/src/schemas.ts
  • packages/mcp/src/types.ts
  • packages/queryLanguage/src/query.grammar
  • packages/queryLanguage/src/tokens.ts
  • packages/web/src/app/[domain]/components/searchBar/constants.ts
  • packages/web/src/app/[domain]/components/searchBar/searchSuggestionsBox.tsx
  • packages/web/src/app/[domain]/components/searchBar/useRefineModeSuggestions.ts
  • packages/web/src/app/[domain]/components/searchBar/useSuggestionModeMappings.ts
  • packages/web/src/app/[domain]/components/searchBar/zoektLanguageExtension.ts
  • packages/web/src/app/[domain]/search/components/repoResultsPanel.tsx
  • packages/web/src/app/[domain]/search/components/searchResultsPage.tsx
  • packages/web/src/app/[domain]/search/useStreamedSearch.ts
  • packages/web/src/features/search/index.ts
  • packages/web/src/features/search/parser.ts
  • packages/web/src/features/search/searchApi.ts
  • packages/web/src/features/search/types.ts
  • packages/web/src/features/search/zoektSearcher.ts

The streaming (zoektStreamSearch) and non-streaming (applySelectRepo) paths
had identical repo-aggregation loops. Extract accumulateRepoMap() in
zoektSearcher.ts and use it in both places so any future change to match
counting or deduplication stays in one spot.

Addresses CodeRabbit review comment on PR sourcebot-dev#1015.
… regex

searchResultsPage was using a hand-rolled regex to detect select:repo in the
query. If the grammar evolves (new select values, quoting rules) the regex could
silently drift.

Export getSelectModeFromQuery() from parser.ts — it runs the same non-strict
Lezer parser as the backend — and use it as the authoritative signal for
isSelectRepoMode in searchResultsPage.

Addresses CodeRabbit review comment on PR sourcebot-dev#1015.
repo:my org/repo-name is invalid zoekt syntax — the space terminates the
token. Wrap names that contain anything other than alphanumerics, hyphens,
dots, underscores, or slashes in double-quotes and escape inner backslashes
and double-quotes so the resulting filter is always a valid zoekt token.

Addresses CodeRabbit review comment on PR sourcebot-dev#1015.
…ndler

The search_repos tool spread searchFilterParamsSchema (which includes
filterByRepos and filterByFilepaths) but the handler destructured neither
field, so both filters were silently dropped. Callers expecting to scope
a search_repos call to specific repositories or file paths got unscoped
results instead.

Destructure and forward both fields to buildQueryFilters.

Addresses CodeRabbit review comment on PR sourcebot-dev#1015.
String.includes('select:repo') would match the substring inside a longer
token such as 'myprefix-select:repo' and suppress the modifier incorrectly.
Use a regex that requires whitespace or string boundaries on both sides,
consistent with how zoekt tokenises query modifiers.

Addresses CodeRabbit review comment on PR sourcebot-dev#1015.
While streaming, the status bar displayed 'Found N matches in M files'
even when the query uses select:repo (where files is always empty).
Branch on isSelectRepoMode to show 'Found N repositories' instead,
consistent with the RepoResultsPanel that is rendered below.

Addresses CodeRabbit review comment on PR sourcebot-dev#1015.
The hasModifiers regex was defined locally rather than imported from
the implementation. Add a comment explaining this is deliberate: the
suite asserts the expected detection contract, not the implementation
detail, so a regression in source will still fail these tests even if
the regex was simultaneously updated there.

Addresses CodeRabbit nitpick on PR sourcebot-dev#1015.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
packages/web/src/features/search/searchApi.ts (1)

121-126: ⚠️ Potential issue | 🟠 Major

Repo counts still miss filename-only hits.

applySelectRepo() inherits its matchCount from accumulateRepoMap(), and the shared helper in packages/web/src/features/search/zoektSearcher.ts:156-171 still sums only chunk ranges. Repos matched only via filename can therefore show 0 matches and be sorted too low in both repo-mode code paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/features/search/searchApi.ts` around lines 121 - 126,
applySelectRepo / accumulateRepoMap currently uses match counts derived only
from chunk ranges, so repos that matched solely by filename end up with
matchCount = 0; update the shared helper (accumulateRepoMap and the helper used
in zoektSearcher) to include filename-only matches when computing/incrementing
RepoResult.matchCount (e.g., add filenameMatches or increment by 1 per filename
hit) so that repoResults sorting reflects filename hits as well; ensure the same
change is applied to the helper invoked by zoektSearcher to keep both repo-mode
code paths consistent.
🧹 Nitpick comments (1)
packages/mcp/src/index.ts (1)

553-559: Entry-point guard works but is fragile.

The detection logic handles the .ts.js filename comparison, and tests confirm it works. However, the string manipulation (replace, split, pop, endsWith) could mismatch on edge cases like symlinked paths or different working directories.

A more robust alternative would be a dedicated CLI entry file that imports and starts the server, keeping index.ts purely as a module export. That said, the current approach is functional for the test use case demonstrated.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/mcp/src/index.ts` around lines 553 - 559, The current fragile
entry-point guard using isMain (process.argv[1] with import.meta.url and string
ops) should be removed from index.ts and replaced by creating a dedicated CLI
entry file that imports runServer from this module and calls it; specifically,
delete the isMain block in packages/mcp/src/index.ts and export runServer (and
any needed startup helpers) as pure module exports, then add a new small CLI
script (e.g., mcp-cli) that imports { runServer } and runs
runServer().catch(...) to handle startup errors and process.exit. Ensure
references to import.meta.url and the isMain symbol are eliminated so index.ts
is safe to import in tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/mcp/src/index.ts`:
- Around line 497-531: The parameter destructure uses "filterByRepos: repos"
which conflicts with the later "const repos = response.repoResults" declaration;
rename the search result variable (for example to "repoResults" or
"searchRepos") wherever "const repos = response.repoResults", its later checks
(repos.length === 0), and the "limited = repos.slice(...)" use it so the
parameter "repos" remains the input filter array and the new name holds
response.repoResults; update all references (response.repoResults, repos.length,
repos.slice, limited) to the new variable name.

In `@packages/web/src/app/`[domain]/search/components/repoResultsPanel.tsx:
- Around line 18-35: navigateToRepo currently rebuilds the URL using only the
query param which drops active search flags (case, regex, match cap, etc.);
modify navigateToRepo to preserve and include the existing search option params
when calling createPathWithQueryParams (e.g., include the same SearchQueryParams
keys used elsewhere such as case, regexp/regex, patternType, and
matchLimit/matchCap) by reading them from the current route/props and passing
them alongside SearchQueryParams.query and the newQuery; also update the parent
SearchResultsPage to forward those active search option props into the component
so navigateToRepo can access them.

In `@packages/web/src/app/`[domain]/search/components/searchResultsPage.tsx:
- Around line 344-347: RepoResults are being accumulated chunk-by-chunk in
useStreamedSearch and can contain duplicates (same repo with updated
matchCount), so before using repoResults.length and passing repoResults into
RepoResultsPanel deduplicate by repo identifier: transform repoResults (from
useStreamedSearch) into a map keyed by repository id/name that keeps the latest
entry (highest matchCount or newest chunk), then use the map's values for the
count and as the array passed to RepoResultsPanel; update references to
repoResults in this component (the length check and the prop passed to
RepoResultsPanel) to use the deduplicated array instead.

In `@packages/web/src/features/search/parser.ts`:
- Around line 458-460: The SelectExpr is being lowered to { const: true, query:
"const" } inside boolean trees which incorrectly turns expressions like "foo or
select:repo" into match-all and "-select:repo" into match-none; modify the
parser logic that constructs AndExpr/OrExpr/NegateExpr (and any
boolean-normalization code handling SelectExpr) so that SelectExpr is either
validated/allowed only at the top-level or removed/stripped before building
boolean nodes: detect SelectExpr in functions that build/flatten AndExpr,
OrExpr, and NegateExpr and either reject it or strip it out (and preserve it
separately as a projection modifier) instead of converting it to a const:true
term inside the boolean tree.

---

Duplicate comments:
In `@packages/web/src/features/search/searchApi.ts`:
- Around line 121-126: applySelectRepo / accumulateRepoMap currently uses match
counts derived only from chunk ranges, so repos that matched solely by filename
end up with matchCount = 0; update the shared helper (accumulateRepoMap and the
helper used in zoektSearcher) to include filename-only matches when
computing/incrementing RepoResult.matchCount (e.g., add filenameMatches or
increment by 1 per filename hit) so that repoResults sorting reflects filename
hits as well; ensure the same change is applied to the helper invoked by
zoektSearcher to keep both repo-mode code paths consistent.

---

Nitpick comments:
In `@packages/mcp/src/index.ts`:
- Around line 553-559: The current fragile entry-point guard using isMain
(process.argv[1] with import.meta.url and string ops) should be removed from
index.ts and replaced by creating a dedicated CLI entry file that imports
runServer from this module and calls it; specifically, delete the isMain block
in packages/mcp/src/index.ts and export runServer (and any needed startup
helpers) as pure module exports, then add a new small CLI script (e.g., mcp-cli)
that imports { runServer } and runs runServer().catch(...) to handle startup
errors and process.exit. Ensure references to import.meta.url and the isMain
symbol are eliminated so index.ts is safe to import in tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 29342677-d9e0-4b07-b9e2-30484423665c

📥 Commits

Reviewing files that changed from the base of the PR and between 09ccf9e and 8779b74.

📒 Files selected for processing (7)
  • packages/mcp/src/__tests__/select-repo.test.ts
  • packages/mcp/src/index.ts
  • packages/web/src/app/[domain]/search/components/repoResultsPanel.tsx
  • packages/web/src/app/[domain]/search/components/searchResultsPage.tsx
  • packages/web/src/features/search/parser.ts
  • packages/web/src/features/search/searchApi.ts
  • packages/web/src/features/search/zoektSearcher.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/mcp/src/tests/select-repo.test.ts
  • packages/web/src/features/search/zoektSearcher.ts

Comment on lines +497 to +531
async ({
query,
filterByLanguages: languages = [],
filterByRepos: repos = [],
filterByFilepaths: filepaths = [],
caseSensitive = false,
ref,
useRegex = false,
maxResults = 50,
}) => {
let fullQuery = buildQueryFilters({ query, filterByLanguages: languages, filterByRepos: repos, filterByFilepaths: filepaths, ref });
if (!/(?:^|\s)select:repo(?:\s|$)/.test(fullQuery)) {
fullQuery += ' select:repo';
}

const response = await search({
query: fullQuery,
matches: env.DEFAULT_MATCHES,
contextLines: 0,
isRegexEnabled: useRegex,
isCaseSensitivityEnabled: caseSensitive,
});

const repos = response.repoResults ?? [];

if (repos.length === 0) {
return {
content: [{
type: "text",
text: `No repositories found matching: ${query}`,
}],
};
}

const limited = repos.slice(0, maxResults);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Variable repos is redeclared, causing a runtime error.

The destructuring parameter filterByRepos: repos = [] on line 500 declares repos in the function scope. Line 520 then attempts to redeclare it with const repos, which will throw SyntaxError: Identifier 'repos' has already been declared.

Proposed fix — rename the result variable
         const response = await search({
             query: fullQuery,
             matches: env.DEFAULT_MATCHES,
             contextLines: 0,
             isRegexEnabled: useRegex,
             isCaseSensitivityEnabled: caseSensitive,
         });

-        const repos = response.repoResults ?? [];
+        const repoResults = response.repoResults ?? [];

-        if (repos.length === 0) {
+        if (repoResults.length === 0) {
             return {
                 content: [{
                     type: "text",
                     text: `No repositories found matching: ${query}`,
                 }],
             };
         }

-        const limited = repos.slice(0, maxResults);
+        const limited = repoResults.slice(0, maxResults);
         const lines = limited.map(r =>
             `repo: ${r.repository}  matches: ${r.matchCount}${r.repositoryInfo?.webUrl ? `  url: ${r.repositoryInfo.webUrl}` : ''}`
         );

         const text = [
-            `Found ${repos.length} repositor${repos.length === 1 ? 'y' : 'ies'} matching "${query}"${limited.length < repos.length ? ` (showing top ${maxResults})` : ''}:`,
+            `Found ${repoResults.length} repositor${repoResults.length === 1 ? 'y' : 'ies'} matching "${query}"${limited.length < repoResults.length ? ` (showing top ${maxResults})` : ''}:`,
             '',
             ...lines,
         ].join('\n');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/mcp/src/index.ts` around lines 497 - 531, The parameter destructure
uses "filterByRepos: repos" which conflicts with the later "const repos =
response.repoResults" declaration; rename the search result variable (for
example to "repoResults" or "searchRepos") wherever "const repos =
response.repoResults", its later checks (repos.length === 0), and the "limited =
repos.slice(...)" use it so the parameter "repos" remains the input filter array
and the new name holds response.repoResults; update all references
(response.repoResults, repos.length, repos.slice, limited) to the new variable
name.

Comment on lines +18 to +35
const navigateToRepo = (repoName: string) => {
// Quote repo names that contain spaces or special characters so the
// zoekt repo: filter parses them as a single token.
const needsQuoting = /[^a-zA-Z0-9\-._/]/.test(repoName);
const safeRepo = needsQuoting
? `"${repoName.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`
: repoName;

// Replace select:repo with repo:xxx, preserving all other filters
const newQuery = searchQuery
.replace(/(?:^|\s)select:repo(?:\s|$)/g, ' ')
.trim()
.concat(` repo:${safeRepo}`)
.trim();
const path = createPathWithQueryParams(
`/${domain}/search`,
[SearchQueryParams.query, newQuery],
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep the active search options when drilling into a repo.

Line 32 rebuilds the URL with only the query param. Clicking a repo from a regex or case-sensitive select:repo search currently drops those flags (and the active match cap), so the follow-up search can return a different result set than the repo list came from.

Proposed fix
 interface RepoResultsPanelProps {
     repoResults: RepoResult[];
     searchQuery: string;
+    matches: number;
+    isRegexEnabled: boolean;
+    isCaseSensitivityEnabled: boolean;
 }
 
-export const RepoResultsPanel = ({ repoResults, searchQuery }: RepoResultsPanelProps) => {
+export const RepoResultsPanel = ({
+    repoResults,
+    searchQuery,
+    matches,
+    isRegexEnabled,
+    isCaseSensitivityEnabled,
+}: RepoResultsPanelProps) => {
     const domain = useDomain();
     const router = useRouter();
 
@@
         const path = createPathWithQueryParams(
             `/${domain}/search`,
             [SearchQueryParams.query, newQuery],
+            [SearchQueryParams.matches, `${matches}`],
+            [SearchQueryParams.isRegexEnabled, isRegexEnabled ? "true" : null],
+            [SearchQueryParams.isCaseSensitivityEnabled, isCaseSensitivityEnabled ? "true" : null],
         );
         router.push(path);
     };

Please also pass the new props from SearchResultsPage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/app/`[domain]/search/components/repoResultsPanel.tsx around
lines 18 - 35, navigateToRepo currently rebuilds the URL using only the query
param which drops active search flags (case, regex, match cap, etc.); modify
navigateToRepo to preserve and include the existing search option params when
calling createPathWithQueryParams (e.g., include the same SearchQueryParams keys
used elsewhere such as case, regexp/regex, patternType, and matchLimit/matchCap)
by reading them from the current route/props and passing them alongside
SearchQueryParams.query and the newQuery; also update the parent
SearchResultsPage to forward those active search option props into the component
so navigateToRepo can access them.

Comment on lines +344 to +347
{isSelectRepoMode ? (
(repoResults?.length ?? 0) > 0 && (
<p className="text-sm font-medium">{`Found ${repoResults!.length} ${repoResults!.length === 1 ? 'repository' : 'repositories'}`}</p>
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Deduplicate partial repo results before using them in the streaming UI.

useStreamedSearch currently appends chunk-level repoResults (packages/web/src/app/[domain]/search/useStreamedSearch.ts:200-210). Because the backend is already accumulating repo totals across chunks, the same repo can arrive again with a higher matchCount. Using repoResults.length at Line 346 and passing the raw array to RepoResultsPanel at Line 402 will overcount and show duplicate rows mid-stream.

Also applies to: 400-402

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/app/`[domain]/search/components/searchResultsPage.tsx around
lines 344 - 347, RepoResults are being accumulated chunk-by-chunk in
useStreamedSearch and can contain duplicates (same repo with updated
matchCount), so before using repoResults.length and passing repoResults into
RepoResultsPanel deduplicate by repo identifier: transform repoResults (from
useStreamedSearch) into a map keyed by repository id/name that keeps the latest
entry (highest matchCount or newest chunk), then use the map's values for the
count and as the array passed to RepoResultsPanel; update references to
repoResults in this component (the length check and the prop passed to
RepoResultsPanel) to use the deduplicated array instead.

Comment on lines +458 to +460
case SelectExpr: {
// select: is a projection modifier — no-op match-all, optimized away by zoekt
return { const: true, query: "const" };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

select: should not be lowered to const:true inside the boolean tree.

Line 460 is safe only in implicit-AND cases. foo or select:repo becomes match-all, and -select:repo becomes match-none. Since select: is a projection modifier, validate it as top-level-only or strip it before building AndExpr/OrExpr/NegateExpr.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/features/search/parser.ts` around lines 458 - 460, The
SelectExpr is being lowered to { const: true, query: "const" } inside boolean
trees which incorrectly turns expressions like "foo or select:repo" into
match-all and "-select:repo" into match-none; modify the parser logic that
constructs AndExpr/OrExpr/NegateExpr (and any boolean-normalization code
handling SelectExpr) so that SelectExpr is either validated/allowed only at the
top-level or removed/stripped before building boolean nodes: detect SelectExpr
in functions that build/flatten AndExpr, OrExpr, and NegateExpr and either
reject it or strip it out (and preserve it separately as a projection modifier)
instead of converting it to a const:true term inside the boolean tree.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant