Skip to content

docs: plan to centralize hardcoded configuration#482

Merged
carlos-alm merged 8 commits intomainfrom
docs/plan-centralize-config
Mar 17, 2026
Merged

docs: plan to centralize hardcoded configuration#482
carlos-alm merged 8 commits intomainfrom
docs/plan-centralize-config

Conversation

@carlos-alm
Copy link
Contributor

@carlos-alm carlos-alm commented Mar 17, 2026

Summary

  • Inventoried ~70 individual hardcoded constants (34 inventory entries, expanding to ~70 discrete values when counting sub-keys) scattered across the codebase (thresholds, depths, weights, limits)
  • Designed a 6-phase plan to route them through the existing .codegraphrc.json config system
  • Includes PR feat: add codegraph brief command for hook context injection #480 brief command thresholds (risk tiers, BFS depths)
  • Zero breaking changes — all defaults match current hardcoded values

Categories found

Category Count Examples
Analysis params 12 BFS depths, risk tier thresholds, false-positive filter
Risk weights 3 groups Metric weights, role weights, default weight
Search 6 Already in DEFAULTS but modules duplicate locally
Display 5 Column widths, truncation limits
MCP pagination 22 Per-tool default page sizes (hard cap stays hardcoded)
Infrastructure 7 Kept hardcoded (standard formulas, security, MCP_MAX_LIMIT)

Implementation phases

  1. Extend DEFAULTS schema + fix mergeConfig for recursive deep merge (hard prerequisite)
  2. Wire analysis parameters (impact, audit, check, sequence, module-map, brief)
  3. Wire risk & community parameters (classifiers, Louvain, structure)
  4. Wire search parameters (connect existing config keys)
  5. Wire display & MCP parameters
  6. Documentation + CLAUDE.md configuration guidance

Test plan

  • Plan review — verify inventory completeness against codebase
  • Validate proposed schema against existing .codegraphrc.json usage

Inventory 78+ hardcoded magic numbers across the codebase and design
a 6-phase plan to route them through the existing .codegraphrc.json
config system. Includes PR #480 brief command thresholds.
@claude
Copy link

claude bot commented Mar 17, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Configuration Centralization Plan

Starting comprehensive review of the configuration centralization plan...

Review Checklist

  • Read and analyze the planning document
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity assessment
  • Phase 4: Code quality evaluation
  • Phase 5: Backlog compliance grading
  • Final assessment and recommendation

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR adds a planning document (docs/tasks/PLAN_centralize_config.md) that designs a 6-phase initiative to route ~70 hardcoded behavioral constants through the existing .codegraphrc.json config system. The plan has been substantially refined over multiple review rounds, correctly classifying constants into configurable vs. infrastructure-hardcoded categories, fixing mergeConfig shallow-merge as a hard Phase 1 prerequisite, splitting asymmetric BFS depth keys, keeping MCP_MAX_LIMIT hardcoded, and aligning inventory line numbers with the current source.

Two implementation planning gaps remain:

  • Phase 3 file list omits triage.jsscoreRisk in risk.js already accepts a weights parameter, so B1 config wiring must happen at the call site in triage.js (the only caller). triage.js is not listed in Phase 3's files-to-change, meaning a Phase 3 implementer following the plan verbatim would miss it. B2/B3 (ROLE_WEIGHTS, DEFAULT_ROLE_WEIGHT) remain hardcoded inside the function body and do require risk.js signature changes.
  • Phase 5 threading approach for file-utils.js is unresolved — the plan notes that file-utils.js shouldn't call loadConfig() directly and offers two alternatives ("pass down from callers" vs. "module-level config cache"), but doesn't commit to either. The scope table entry of "3 files changed" only holds for the config-cache path; the pass-down path adds context.js and exports.js as required changes. Leaving this open makes Phase 5's risk estimate unreliable.

One minor inaccuracy: D5 points to line 76 in file-utils.js, but the + 5 constant is on line 77 (line 76 is the comment).

Confidence Score: 4/5

  • Safe to merge as a planning document — no code changes, zero runtime risk, and the identified gaps are implementation-time concerns rather than design flaws.
  • The plan is thorough and has been iteratively improved through multiple review rounds addressing critical issues (mergeConfig fix, MCP_MAX_LIMIT, phase count, BFS depth split, D4 split, etc.). The two remaining gaps (triage.js omission in Phase 3 and unresolved threading approach in Phase 5) will cause friction for implementing engineers but don't affect correctness of the overall design. They are scoped to specific phases and discoverable during implementation.
  • docs/tasks/PLAN_centralize_config.md — Phase 3 and Phase 5 sections need minor corrections before implementation begins

Important Files Changed

Filename Overview
docs/tasks/PLAN_centralize_config.md Well-structured 6-phase centralization plan that has been substantially improved through multiple review rounds; two remaining gaps: Phase 3 omits triage.js from its file list (it's the sole scoreRisk caller and owns B1 wiring), and Phase 5 leaves the file-utils.js threading approach unresolved, making the "3 files" scope estimate unreliable. One minor line-number inaccuracy (D5: 76→77).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    P1["Phase 1 — Extend DEFAULTS schema\n• Add analysis, community, structure, risk, display, mcp\n• Fix mergeConfig for recursive deep merge\n⚠️ Hard prerequisite for all subsequent phases"]
    P2["Phase 2 — Wire analysis parameters\n• impact.js, audit.js, check.js\n• sequence.js, module-map.js, brief.js\n(A1–A12)"]
    P3["Phase 3 — Wire risk & community parameters\n• risk.js (B2, B3 — extend scoreRisk signature)\n• triage.js (B1 — call-site wiring)\n• louvain.js (A7), structure.js (A8)"]
    P4["Phase 4 — Wire search parameters\n• hybrid.js (C1, C2)\n• semantic.js (C3, C4, C5)\n(config.search already exists)"]
    P5["Phase 5 — Wire display & MCP parameters\n• result-formatter.js (D1)\n• file-utils.js (D2–D5)\n• paginate.js (E1)\n⚠️ Threading approach unresolved"]
    P6["Phase 6 — Documentation & migration\n• README.md, docs/configuration.md\n• .codegraphrc.schema.json\n• CLAUDE.md configuration guidance"]

    P1 --> P2
    P1 --> P3
    P1 --> P4
    P1 --> P5
    P2 --> P6
    P3 --> P6
    P4 --> P6
    P5 --> P6
Loading

Last reviewed commit: e2885eb

Comment on lines +104 to +121
risk: {
weights: { // B1
fanIn: 0.25,
complexity: 0.3,
churn: 0.2,
role: 0.15,
mi: 0.1,
},
roleWeights: { // B2
core: 1.0,
utility: 0.9,
entry: 0.8,
adapter: 0.5,
leaf: 0.2,
dead: 0.1,
},
defaultRoleWeight: 0.5, // B3
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 mergeConfig is shallow — risk.weights partial overrides will silently drop un-specified keys

The existing mergeConfig in src/infrastructure/config.js (lines 123–138) only merges one level deep. For a top-level key like risk, it does { ...defaults['risk'], ...user['risk'] }. That's fine for flat properties like defaultRoleWeight, but when the user supplies a partial risk.weights override, the entire nested object is replaced, not merged:

// user config:
{ "risk": { "weights": { "complexity": 0.4, "churn": 0.1 } } }

// result after mergeConfig:
{ risk: { weights: { complexity: 0.4, churn: 0.1 },   // ← fanIn, role, mi are GONE
           roleWeights: { ... }, defaultRoleWeight: 0.5 } }

The dropped weights default to undefined, causing scoreRisk to compute 0.25 * normFanIn + undefined * ... — which silently produces NaN risk scores.

The Example .codegraphrc.json in the plan (lines 238–244) shows exactly this partial override pattern, which would be broken out of the box.

Phase 1 must include making mergeConfig recursive (at least 2 levels deep) before any of the nested sub-keys (risk.weights, risk.roleWeights) are wired. Without this fix, partial user overrides will silently corrupt risk scoring. Consider noting this as a hard blocker, not just a "may need to verify."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — upgraded the mergeConfig note from "may need recursive merge" to a hard Phase 1 prerequisite. The plan now explicitly states that the current 1-level shallow merge will silently drop sibling keys on partial nested overrides, producing NaN risk scores, and that this must be fixed before any nested config keys are wired.

Comment on lines +131 to +134
mcp: {
defaults: { /* E1: current MCP_DEFAULTS object */ },
maxLimit: 1000, // E2
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Making MCP_MAX_LIMIT user-configurable defeats its security purpose

MCP_MAX_LIMIT is explicitly documented in src/shared/paginate.js as a "Hard cap to prevent abuse via MCP." Routing it through .codegraphrc.json means a user — or any process with write access to the project directory — can raise it arbitrarily (e.g. "maxLimit": 999999). This would allow MCP clients to request unbounded result sets, defeating the abuse-prevention intent.

Unlike the other constants in the plan (thresholds, depths, weights), this cap is a server-side safety boundary, not a tuning knob. Consider keeping MCP_MAX_LIMIT hardcoded (move it to Category F), or at minimum require an operator-level mechanism (e.g. a CLI flag or env var) rather than the project config file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — moved MCP_MAX_LIMIT to Category F (keep hardcoded). Removed it from the proposed DEFAULTS schema, updated the Phase 5 file list to note it stays hardcoded, and added it to the "What stays hardcoded" section with explanation: it's a server-side safety boundary, not a tuning knob.

| 5 — Display/MCP | 3 | 2 | Medium (shared utility coupling) |
| 6 — Docs + CLAUDE.md | 4 | 0 | None |

**Total: ~20 files changed, 6 PRs, one concern per PR.**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Phase count mismatch — plan body has 7 phases, scope table shows 6

The PR description calls this a "6-phase plan" and the scope table lists Phase 1–6, but the Implementation Plan section defines 7 phases (Phase 1–6, then a separate Phase 7 for CLAUDE.md updates).

Phase 7 is also not in the scope table, so there's no effort/file estimate for it. Please either:

  • Merge Phase 7 into Phase 6 (they share the same PR according to the plan text), or
  • Add Phase 7 to the scope table and update the PR description and plan title to say "7-phase plan"
Suggested change
**Total: ~20 files changed, 6 PRs, one concern per PR.**
**Total: ~20 files changed, 7 PRs, one concern per PR.**

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — merged Phase 7 into Phase 6 (they were already the same PR). The scope table now shows 6 phases consistently, matching the PR description and total line. Phase 6 description updated to include the CLAUDE.md configuration guidance as item 5.


## Problem

The config system already exists and handles deep-merge + env overrides, but ~50+ behavioral constants are hardcoded in individual modules and never read from config. Users cannot tune thresholds, depths, weights, or limits without editing source code.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Inconsistent magic-number count between plan and PR description

The problem statement (line 7) says "~50+ behavioral constants", but the PR description header claims "78+ hardcoded magic numbers". The two numbers differ significantly and neither is sourced/derived in the document.

For a plan whose explicit goal is completeness of inventory, documenting how the total is arrived at (e.g. counting individual values in B1 and B2 separately, counting each MCP_DEFAULTS entry as distinct) would help reviewers verify completeness and give future implementers confidence that the inventory is exhaustive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — updated the problem statement to say "~70 individual behavioral constants (34 inventory entries expanding to ~70 discrete values when counting sub-keys in B1, B2, and E1)" with a derivation inline. PR description also updated to match.

…482)

- Fix magic-number count: ~70 individual values (34 entries), add derivation
- Make recursive mergeConfig a hard Phase 1 prerequisite (not optional)
- Move MCP_MAX_LIMIT to Category F (keep hardcoded — security boundary)
- Merge Phase 7 into Phase 6 to fix phase count mismatch
@carlos-alm
Copy link
Contributor Author

@greptileai

Comment on lines +92 to +96
sequenceDepth: 10, // A5: sequence diagram depth
falsePositiveCallers: 20, // A6: generic function filter threshold
briefBfsDepth: 5, // A11/A12: brief command BFS depth (callers + importers)
briefHighRiskCallers: 10, // A9: brief high-risk tier threshold
briefMediumRiskCallers: 3, // A10: brief medium-risk tier threshold
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 build.driftThreshold belongs in build, not community

The existing build.driftThreshold controls incremental-build health in src/domain/graph/builder/stages/finalize.js (line 52):

const driftThreshold = config.build?.driftThreshold ?? 0.2;
if (nodeDrift > driftThreshold || edgeDrift > driftThreshold) { ... }

This threshold asks "has the graph structure changed significantly between builds?" — it's a build-pipeline concern, not a community-detection concern. Moving it to community.driftThreshold creates a misleading namespace: users searching for how to tune incremental build warnings will not look under community.

structureDriftThreshold (A8) is also a poor fit for community: it controls cohesion in module boundary detection (moduleBoundariesData), not the Louvain resolution parameter.

Consider:

  • Keep build.driftThreshold exactly where it is (already in DEFAULTS, already wired — it needs no migration)
  • Use structure.cohesionThreshold (or add it under build) for A8
  • Reserve the community section purely for Louvain parameters (resolution)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — kept build.driftThreshold in the build namespace where it already lives (wired in finalize.js line 52). Moved A8 (structureDriftThreshold) to a new structure.cohesionThreshold key instead. The community section now contains only the Louvain resolution parameter. Removed all references to the deprecated alias since no migration is needed.

Comment on lines +25 to +28
| A9 | `maxCallers >= 10` | `domain/analysis/brief.js` | 40 | `brief` high-risk tier threshold |
| A10 | `maxCallers >= 3` | `domain/analysis/brief.js` | 41 | `brief` medium-risk tier threshold |
| A11 | `maxDepth = 5` | `domain/analysis/brief.js` | 50 | `brief` transitive caller BFS depth |
| A12 | `maxDepth = 5` | `domain/analysis/brief.js` | 76 | `brief` transitive importer BFS depth |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Stale line numbers in brief.js inventory entries

All four brief.js entries point to incorrect lines. The actual locations (verified against current src/domain/analysis/brief.js) are:

Entry Plan says Actual line
A9 maxCallers >= 10 40 38
A10 maxCallers >= 3 41 39
A11 maxDepth = 5 (callers) 50 47
A12 maxDepth = 5 (importers) 76 73

Since this plan is primarily an implementation guide, each implementing PR will start by locating these constants. Off-by-3 line references add friction, especially for A11/A12 where both entries share the same value and file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — corrected all four line numbers to match current brief.js: A9→38, A10→39, A11→47, A12→73.

Comment on lines +97 to +100
},

community: {
resolution: 1.0, // A7: Louvain resolution
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 briefBfsDepth conflates two independently meaningful BFS depths

A11 (countTransitiveCallers) and A12 (countTransitiveImporters) both default to maxDepth = 5 today, but they traverse the graph in opposite directions and have different performance profiles (callers go up the call graph; importers go up the import graph). Collapsing them into a single briefBfsDepth key prevents users from independently tuning one without the other.

Consider two separate keys:

briefCallerDepth: 5,    // A11: transitive caller BFS (countTransitiveCallers)
briefImporterDepth: 5,  // A12: transitive importer BFS (countTransitiveImporters)

If keeping a single key is preferred for simplicity, at minimum document this limitation in Phase 6.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — split briefBfsDepth into two independent keys: briefCallerDepth (A11, countTransitiveCallers) and briefImporterDepth (A12, countTransitiveImporters). Updated the DEFAULTS schema, Phase 2 wiring references, and brief.js config key references accordingly.

Comment on lines +154 to +157
**Files:** `src/infrastructure/config.js`, `tests/unit/config.test.js`

1. Add `analysis`, `community`, `risk`, `display`, `mcp` sections to `DEFAULTS`
2. Move `build.driftThreshold` → `community.driftThreshold` (keep `build.driftThreshold` as deprecated alias)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Phase 2's check.js wiring is ambiguous about which key takes precedence

The plan says check.js should read config.check.depth (already exists) and config.analysis.defaultDepth, but it doesn't specify the precedence relationship between them. Currently, check.js line 220 is:

const maxDepth = opts.depth || 3;

config.check.depth is already in DEFAULTS (check: { depth: 3 }) but the code ignores it in favour of the hardcoded 3. There are three ways to resolve this in Phase 2:

  1. Replace hardcoded 3 with config.check.depth (keeping check.depth as the sole authoritative key — cleanest)
  2. Replace with config.check.depth ?? config.analysis.defaultDepth (chain with new key as a global fallback)
  3. Use config.analysis.defaultDepth directly, making config.check.depth an override

Option 1 seems cleanest since check.depth already exists. Options 2 and 3 leave users unclear about which config key to set when they want to tune check's blast-radius. The plan should spell out the resolution order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — clarified that config.check.depth is the sole authoritative key for check.js depth (already in DEFAULTS). The plan now explicitly says: replace hardcoded 3 with config.check.depth, do not chain with config.analysis.defaultDepth. This avoids ambiguous dual-key precedence.

- Keep build.driftThreshold in `build` namespace (not community)
- Add `structure.cohesionThreshold` for A8 instead
- Fix stale line numbers for A9-A12 (brief.js)
- Split briefBfsDepth into briefCallerDepth + briefImporterDepth
- Clarify check.js uses config.check.depth as sole authoritative key
@carlos-alm
Copy link
Contributor Author

@greptileai

Comment on lines +45 to +46
| C4 | `minScore = 0.2` | `domain/search/search/semantic.js` | 13, 52 | Minimum similarity threshold |
| C5 | `SIMILARITY_WARN_THRESHOLD = 0.85` | `domain/search/search/semantic.js` | 71 | Duplicate query warning |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 C5 SIMILARITY_WARN_THRESHOLD has no config key and no wiring phase

SIMILARITY_WARN_THRESHOLD = 0.85 at src/domain/search/search/semantic.js:71 is included in the Category C inventory (C5), but it is absent from the proposed DEFAULTS schema additions and from the Phase 4 file list. Phase 4 only says "wire existing config keys" — there is no existing config.search.similarityWarnThreshold key to wire, so C5 would remain hardcoded after all six phases complete, contrary to the plan's stated goal.

A new key should be added to the DEFAULTS design block and wired in Phase 4:

search: {
  defaultMinScore: 0.2,
  rrfK: 60,
  topK: 15,
  similarityWarnThreshold: 0.85,  // C5: duplicate-query warning in multiSearchData
},

Phase 4 should be updated to mention src/domain/search/search/semantic.js → replace SIMILARITY_WARN_THRESHOLD with config.search.similarityWarnThreshold.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added search.similarityWarnThreshold: 0.85 to the proposed DEFAULTS schema, and updated Phase 4 to explicitly wire C5 by replacing the hardcoded SIMILARITY_WARN_THRESHOLD in semantic.js with config.search.similarityWarnThreshold.

| D1 | `MAX_COL_WIDTH = 40` | `presentation/result-formatter.js` | 82 | Table column width |
| D2 | `50 lines` | `shared/file-utils.js` | 23 | Source context excerpt length |
| D3 | `100 chars` | `shared/file-utils.js` | 48, 63 | Summary/docstring truncation |
| D4 | `10 / 20 lines` | `shared/file-utils.js` | 36, 54 | JSDoc scan depth |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 D4 conflates two distinct constants with different values into one key

The plan records D4 as 10 / 20 lines from file-utils.js lines 36 and 54 and maps both to display.jsdocScanLines: 10. These are two separate constants with different semantics:

  • Line 36Math.max(0, idx - 10): scans 10 lines upward from a definition to detect the end of a comment block (*/).
  • Line 54Math.max(0, jsdocEnd - 20): scans 20 lines upward from the block end to locate the /** opening.

If an implementer in Phase 5 wires config.display.jsdocScanLines to both locations, line 54's range silently shrinks from 20 → 10. For JSDoc blocks longer than 10 lines, extractSummary would fail to find the /** opener and return null where it previously returned a description.

Consider splitting into two keys:

display: {
  jsdocEndScanLines: 10,      // D4a: lines to scan for block-end marker
  jsdocOpenScanLines: 20,     // D4b: lines to scan for /** opening
  ...
}

Or, at minimum, note in Phase 5 that line 54 uses 20 (not 10) and should remain 20 or be wired to a separate key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — split D4 into two distinct entries: D4a (jsdocEndScanLines: 10, line 36 — scans upward for */) and D4b (jsdocOpenScanLines: 20, line 54 — scans upward for /**). Both the inventory table and DEFAULTS schema now reflect the correct separate values. Phase 5 wiring explicitly lists both keys with their different defaults.

- Add C5 (similarityWarnThreshold) to DEFAULTS schema and Phase 4 wiring
- Split D4 into D4a (jsdocEndScanLines: 10) and D4b (jsdocOpenScanLines: 20)
- Update Phase 5 file-utils.js wiring to reference both D4 keys
@carlos-alm
Copy link
Contributor Author

@greptileai

| C3 | `limit = 15` | `domain/search/search/semantic.js` | 12 | Semantic search default limit |
| C4 | `minScore = 0.2` | `domain/search/search/semantic.js` | 13, 52 | Minimum similarity threshold |
| C5 | `SIMILARITY_WARN_THRESHOLD = 0.85` | `domain/search/search/semantic.js` | 71 | Duplicate query warning |
| C6 | Batch sizes per model | `domain/search/models.js` | 66-75 | Embedding batch sizes |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 C6 has no config key and no definitive phase assignment

C6 | Batch sizes per model | domain/search/models.js | 66-75 is listed in the Category C inventory as a hardcoded value, but:

  • It is absent from the proposed DEFAULTS schema additions (no search.batchSizes or similar key is designed).
  • Phase 4 mentions it only with hedging language: "batch sizes could be config-overridable per model" — no config key is specified, no file→key mapping is given.

This means C6 will remain hardcoded after all six phases complete, contrary to the plan's goal of eliminating all magic numbers from the inventory. Every other entry in Category C has a concrete key assigned.

There are two clean resolutions:

  1. Add a config key — e.g. search.batchSizeMap — and wire models.js in Phase 4 (more complex due to the per-model map shape).
  2. Move C6 to Category F (keep hardcoded) with an explicit rationale (e.g. batch sizes are model-specific implementation details rarely tuned by end-users, analogous to F3 watcher debounce).

Either way, the ambiguity should be resolved so Phase 4 implementers don't discover an undesigned item mid-PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — moved C6 (embedding batch sizes) to Category F with explicit rationale: model-specific implementation details rarely tuned by end-users, analogous to watcher debounce (F3). Removed ambiguous "could be" wording from Phase 4 and updated models.js entry to state batch sizes stay hardcoded.

// ... existing fields ...

analysis: {
defaultDepth: 3, // A2: BFS depth for impact/diff-impact
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 analysis.defaultDepth name implies a global fallback it doesn't provide

The key analysis.defaultDepth: 3 is mapped exclusively to A2 ("BFS depth for impact/diff-impact" — impact.js lines 31, 144). However, every other per-feature depth already has its own specific key in the same section (auditDepth, sequenceDepth, fnImpactDepth, briefCallerDepth, briefImporterDepth) and check.depth lives under check. The name defaultDepth implies it's a catch-all fallback that governs all analysis commands, but setting it to 5 would only affect impact/diff-impact — not audit, check, or sequence.

A user who wants to raise blast-radius depth across the board would be surprised to find only impact affected.

Consider renaming to impactDepth for consistency with the rest of the section:

Suggested change
defaultDepth: 3, // A2: BFS depth for impact/diff-impact
impactDepth: 3, // A2: BFS depth for impact/diff-impact

The Phase 2 wiring instructions would need the same update (config.analysis.impactDepth instead of config.analysis.defaultDepth).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — renamed analysis.defaultDepth to analysis.impactDepth throughout the plan (DEFAULTS schema, Phase 2 wiring instructions, and check.js note). This is consistent with the other per-feature depth keys (auditDepth, sequenceDepth, fnImpactDepth, briefCallerDepth, etc.).

@carlos-alm
Copy link
Contributor Author

PR #484 (fix for #481) introduced three new hardcoded constants in the native Rust engine that should be reflected in this plan's inventory under Category F (infrastructure, keep hardcoded):

# Value File Line Controls
F8 MAX_WALK_DEPTH = 200 crates/codegraph-core/src/extractors/helpers.rs 6 AST walk recursion limit for all extractors + walk_ast_nodes_with_config — stack overflow prevention
F9 MAX_WALK_DEPTH = 200 crates/codegraph-core/src/complexity.rs 6 Complexity walk/walk_children recursion limit — stack overflow prevention
F10 MAX_WALK_DEPTH = 200 crates/codegraph-core/src/cfg.rs 5 CFG process_if recursion limit — stack overflow prevention

These match the existing MAX_VISIT_DEPTH = 200 in dataflow.rs (already hardcoded). All four are safety boundaries — making them user-configurable could allow crash-inducing values. Same rationale as MCP_MAX_LIMIT.

Note: dataflow.rs:11 already has MAX_VISIT_DEPTH = 200 which could be added as F11 for completeness.

- Move C6 (embedding batch sizes) to Category F with rationale:
  model-specific implementation details rarely tuned by end-users
- Rename analysis.defaultDepth to analysis.impactDepth for consistency
  with other per-feature depth keys (auditDepth, sequenceDepth, etc.)
- Remove ambiguous Phase 4 wording about models.js configurability
@carlos-alm
Copy link
Contributor Author

@greptileai

Reflects MAX_WALK_DEPTH constants introduced by #484 (fix for #481)
and the pre-existing MAX_VISIT_DEPTH in dataflow.rs plus MCP_MAX_LIMIT.
All are safety boundaries that should stay hardcoded.
Comment on lines +207 to +208
**Tests:** Unit tests for risk scoring with custom weights. Integration test for Louvain with custom resolution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 C3 (semantic.js limit) missing from Phase 4 wiring

C3 (limit = 15 in semantic.js line 12) is listed in the inventory and the Phase 4 note acknowledges that config.search.topK already exists in DEFAULTS. However, the explicit wiring instruction for semantic.js only covers config.search.defaultMinScore and config.search.similarityWarnThreshold (C5) — config.search.topK is never mentioned for semantic.js.

After all six phases complete, C3 remains hardcoded, contrary to the plan's stated goal of eliminating every inventory item. An implementer following Phase 4 line-by-line would wire topK for hybrid.js (C1) but silently miss it for semantic.js (C3).

Suggested change
**Tests:** Unit tests for risk scoring with custom weights. Integration test for Louvain with custom resolution.
- `src/domain/search/search/hybrid.js` → read `config.search.rrfK`, `config.search.topK`
- `src/domain/search/search/semantic.js` → read `config.search.defaultMinScore`, `config.search.topK` (C3), and `config.search.similarityWarnThreshold` (C5, replaces hardcoded `SIMILARITY_WARN_THRESHOLD`)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added config.search.topK (C3) to the semantic.js wiring instruction in Phase 4. The line now reads: read config.search.defaultMinScore, config.search.topK (C3), and config.search.similarityWarnThreshold (C5).

### Phase 6 — Documentation & migration (1 PR)

1. Update `README.md` configuration section with the full schema
2. Add a `docs/configuration.md` reference with all keys, types, defaults, and descriptions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 structure section missing from Phase 6 CLAUDE.md list

Phase 6, item 5 instructs adding a CLAUDE.md Configuration section documenting "the full list of configurable sections" — but the enumerated list (analysis, community, risk, display, mcp, search, check, coChange, manifesto) omits structure, despite structure.cohesionThreshold being a new section explicitly added in Phase 1 and wired in Phase 3. A user reading the CLAUDE.md reference would never discover that structure.cohesionThreshold is configurable.

Suggested change
2. Add a `docs/configuration.md` reference with all keys, types, defaults, and descriptions
- The full list of configurable sections (`analysis`, `community`, `structure`, `risk`, `display`, `mcp`, `search`, `check`, `coChange`, `manifesto`)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added structure to the Phase 6 configurable sections list. It now reads: analysis, community, structure, risk, display, mcp, search, check, coChange, manifesto.

- Add config.search.topK (C3) to semantic.js wiring in Phase 4
- Add structure to the Phase 6 CLAUDE.md configurable sections list
@carlos-alm
Copy link
Contributor Author

@greptileai

@carlos-alm
Copy link
Contributor Author

@claude

@claude
Copy link

claude bot commented Mar 17, 2026

Claude finished @carlos-alm's task —— View job


Analyzing PR State

I see you've tagged me on this configuration centralization planning PR. Let me analyze the current state and provide guidance.

Todo List

  • Read the current planning document
  • Review latest Greptile feedback status
  • Assess the new constants mentioned in recent comments
  • Provide guidance on next steps

Current State Analysis

I've reviewed your comprehensive planning document - this is excellent work! You've methodically inventoried ~70 hardcoded constants across 6 categories and designed a clean 6-phase implementation plan.

The plan shows you've addressed most of Greptile feedback from previous rounds:

  • ✅ Fixed the mergeConfig deep-merge prerequisite
  • ✅ Moved MCP_MAX_LIMIT to Category F (keep hardcoded)
  • ✅ Split BFS depth keys for brief command
  • ✅ Properly handled the JSDoc scan-line split (D4a/D4b)
  • ✅ Fixed line number references

Outstanding Issues

From the latest Greptile review, there are still 2 minor gaps:

  1. C3 missing from Phase 4 wiring: The semantic.js topK config (C3) is in your inventory but not explicitly mentioned in the Phase 4 wiring instructions for semantic.js
  2. structure namespace missing from Phase 6 docs: The CLAUDE.md documentation list in Phase 6 doesn't include the new structure section

New Constants from PR #484

You mentioned 3 new MAX_WALK_DEPTH = 200 constants from PR #484 that should be added to Category F:

  • F10: extractors/helpers.rs:6 - AST walk recursion limit
  • F11: complexity.rs:6 - Complexity walk recursion limit
  • F12: cfg.rs:5 - CFG recursion limit

These are correctly categorized as infrastructure/safety boundaries that should stay hardcoded.

Recommendation

The plan is in excellent shape and ready for implementation. The 2 remaining Greptile gaps are minor and could be:

  • Fixed now with a quick edit to the plan
  • Or addressed during implementation (since they're plan-level gaps, not code issues)

What specific help do you need? Are you looking to:

  1. Fix the 2 remaining gaps in the plan?
  2. Get guidance on implementation approach?
  3. Something else entirely?

@carlos-alm carlos-alm merged commit 07e8bc0 into main Mar 17, 2026
22 checks passed
@carlos-alm carlos-alm deleted the docs/plan-centralize-config branch March 17, 2026 07:58
@github-actions github-actions bot locked and limited conversation to collaborators Mar 17, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant