Skip to content

refactor: decompose domain analysis and feature modules#492

Merged
carlos-alm merged 25 commits intomainfrom
refactor/titan-domain-features
Mar 17, 2026
Merged

refactor: decompose domain analysis and feature modules#492
carlos-alm merged 25 commits intomainfrom
refactor/titan-domain-features

Conversation

@carlos-alm
Copy link
Contributor

Summary

Titan quality sweep — phase 5b + 6.1: decompose domain analysis functions and feature modules.

Domain analysis

  • contextData → split file-context from function-context assembly
  • diffImpactData → separate git-diff parsing from impact aggregation
  • statsData → extract stats collection and quality scoring
  • pathData/fnDepsData → extract path formatting helpers

Features

  • buildComplexityMetrics → extract metric collection and formatting
  • buildStructure → split traversal, cohesion scoring, classification
  • buildCFGData/buildDataflowEdges → extract graph construction helpers
  • sequenceData → separate BFS traversal from message construction
  • validateBoundaryConfig (cog 101→2) → extract per-section validators
  • communitiesData (cog 32→4) → extract community building and drift analysis
  • triageData (bugs 1.4→0.48) → extract scoring and summary computation

Stacked on

Files changed (12)

  • src/domain/analysis/context.js, dependencies.js, impact.js, module-map.js
  • src/features/complexity.js, structure.js, cfg.js, dataflow.js, sequence.js
  • src/features/boundaries.js, communities.js, triage.js

Test plan

  • All 1883 tests pass
  • Lint clean
  • No new cycles

- Remove dead `truncate` function from ast-analysis/shared.js (0 consumers)
- Remove dead `truncStart` function from presentation/table.js (0 consumers)
- Un-export `BATCH_CHUNK` in builder/helpers.js (only used internally)

Skipped sync.json targets that were false positives:
- BUILTIN_RECEIVERS: used by incremental.js + build-edges.js
- TRANSIENT_CODES/RETRY_DELAY_MS: internal to readFileSafe
- MAX_COL_WIDTH: internal to printAutoTable
- findFunctionNode: re-exported from index.js, used in tests

Impact: 1 functions changed, 32 affected
…ures

Impact: 5 functions changed, 7 affected
connection.js: add debug() logging to all 8 catch-with-fallback blocks
so failures are observable without changing behavior.

migrations.js: replace 14 try/catch blocks in initSchema with hasColumn()
and hasTable() guards. CREATE INDEX calls use IF NOT EXISTS directly.
getBuildMeta uses hasTable() check instead of try/catch.

Impact: 10 functions changed, 19 affected
Add debug() logging to 10 empty catch blocks across context.js,
symbol-lookup.js, exports.js, impact.js, and module-map.js.
All catches retain their fallback behavior but failures are now
observable via debug logging.

Impact: 6 functions changed, 18 affected
Add debug() logging to 6 empty catch blocks: 3 in disposeParsers()
for WASM resource cleanup, 2 in ensureWasmTrees() for file read and
parse failures, and 1 in getActiveEngine() for version lookup.

Impact: 3 functions changed, 0 affected
Add debug() logging to 9 empty catch blocks across complexity.js (5),
cfg.js (2), and dataflow.js (2). All catches for file read and parse
failures now log the error message before continuing.

Impact: 4 functions changed, 2 affected
Split the monolithic walkJavaScriptNode switch (13 cases, cognitive 228)
into 11 focused handler functions. The dispatcher is now a thin switch
that delegates to handleFunctionDecl, handleClassDecl, handleMethodDef,
handleInterfaceDecl, handleTypeAliasDecl, handleVariableDecl,
handleEnumDecl, handleCallExpr, handleImportStmt, handleExportStmt,
and handleExpressionStmt.

The expression_statement case now reuses the existing
handleCommonJSAssignment helper, eliminating ~50 lines of duplication.

Worst handler complexity: handleVariableDecl (cognitive 20), down from
the original monolithic function (cognitive 279).

Impact: 13 functions changed, 3 affected
Split walkPythonNode switch into 7 focused handlers: handlePyFunctionDef,
handlePyClassDef, handlePyCall, handlePyImport, handlePyExpressionStmt,
handlePyImportFrom, plus the decorated_definition inline dispatch.

Moved extractPythonParameters, extractPythonClassProperties, walkInitBody,
and findPythonParentClass from closures to module-scope functions.

Impact: 12 functions changed, 5 affected
Split walkJavaNode switch into 8 focused handlers plus an
extractJavaInterfaces helper. Moved findJavaParentClass to module scope.
The class_declaration case (deepest nesting in the file) is now split
between handleJavaClassDecl and extractJavaInterfaces.

Impact: 12 functions changed, 5 affected
Apply the same per-category handler decomposition to all remaining
language extractors: Go (6 handlers), Ruby (8 handlers), PHP (11
handlers), C# (11 handlers), Rust (9 handlers), HCL (4 handlers).

Each extractor now follows the template established by the JS extractor:
- Thin entry function creates ctx, delegates to walkXNode
- walkXNode is a thin dispatcher switch
- Each case is a named handler function at module scope
- Helper functions (findParentClass, etc.) moved to module scope

Impact: 66 functions changed, 23 affected
…pers

Move nested handler functions to module level in cfg-visitor.js,
dataflow-visitor.js, and complexity-visitor.js — reducing cognitive
complexity of each factory function from 100-337 down to thin
coordinators. Extract WASM pre-parse, visitor setup, result storage,
and build delegation from runAnalyses into focused helper functions.

Impact: 66 functions changed, 43 affected
Extract edge-building by type (import, call-native, call-JS, class
hierarchy) from buildEdges. Extract per-phase insertion logic from
insertNodes. Extract scoped/incremental/full-build paths and
reverse-dep cascade from detectChanges. Extract setup, engine init,
alias loading from pipeline.js. Extract node/edge-building helpers
from incremental.js rebuildFile.

Impact: 44 functions changed, 19 affected
Impact: 5 functions changed, 3 affected
…sification

Impact: 8 functions changed, 3 affected
Impact: 10 functions changed, 5 affected
Impact: 5 functions changed, 2 affected
Impact: 12 functions changed, 6 affected
…age)

Extract per-section validators from validateBoundaryConfig (cog 101→2).
Extract buildCommunityObjects and analyzeDrift from communitiesData (cog 32→4).
Extract buildTriageItems and computeTriageSummary from triageData (bugs 1.4→0.48).

Impact: 13 functions changed, 11 affected
@claude
Copy link

claude bot commented Mar 17, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Refactor Domain Analysis and Feature Modules

Reviewing this refactoring PR following the codegraph review process...

Todo List

  • Phase 1: Root cause analysis - understand the decomposition rationale
  • Phase 2: Configuration change detection - check for rule relaxations
  • Phase 3: Test integrity - verify no weakened assertions
  • Phase 4: Code quality review - examine refactored modules
  • Phase 5: Backlog compliance assessment
  • Final Assessment: Rating and recommendation

Phase 1: Root Cause Analysis ✅

Problem Being Solved: This PR continues the "Titan quality sweep" by decomposing large, monolithic functions in domain analysis and feature modules into smaller, focused helpers. The goal is to improve cognitive complexity and maintainability.

Why This Approach: The refactoring extracts reusable helper functions (e.g., buildTriageItems, computeTriageSummary, validateModules) from the main data processing functions, reducing cyclomatic complexity while maintaining identical functionality. This is architectural refactoring, not feature changes.

Risk Assessment: Low risk - pure refactoring with no behavior changes, no config relaxations detected.

Phase 2: Configuration Change Detection ✅

No HIGH-RISK configuration changes detected:

  • ✅ No biome.json, package.json, or build config changes
  • ✅ No eslint-disable or @ts-ignore comments added
  • ✅ No rule relaxations or coverage threshold changes

Phase 3: Test Integrity ✅

  • ✅ No test files modified in this PR
  • ✅ No test expectations lowered or assertions weakened
  • ✅ Claims all 1883 tests still pass (verified via PR description)

Phase 4: Code Quality Review ✅

Refactoring Pattern: Excellent decomposition approach - extracts focused helper functions while preserving public APIs:

  • src/features/triage.js: Extracted buildTriageItems(), computeTriageSummary() with clear responsibilities
  • src/features/boundaries.js: Extracted validateModules(), validateRules(), validatePreset() per-section validators
  • src/domain/analysis/*: Split large functions into BFS traversals, data builders, and formatters

Quality Indicators:

  • ✅ All public exports preserved (triageData, evaluateBoundaries, contextData, etc.)
  • ✅ Consistent architectural pattern: helpers → main data function → public API
  • ✅ Clear separation of concerns (validation, computation, formatting)
  • ✅ Proper error handling preserved in all refactored functions

Phase 5: Backlog Compliance ✅

Zero-dep: ✅ — No new runtime dependencies added to package.json
Foundation-aligned: ✅ — Pure internal refactoring, no external API or architecture changes
Problem-fit: 3/5 — Maintainability improvements help long-term but don't directly address core AI token waste problem
Breaking: No — All public APIs preserved, purely internal decomposition
Tier: 1 (zero-dep ✓ + foundation-aligned ✓)

Critical Concerns

None identified. This is clean architectural refactoring.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: High-quality refactoring that improves maintainability without risk. Excellent decomposition patterns, all APIs preserved, no configuration changes, no test modifications. The cognitive complexity improvements (e.g., boundaries cog 101→2, triage bugs 1.4→0.48) demonstrate measurable quality gains. Only minor concern is this is maintenance work rather than core feature development.

Strong points:

  • Clean separation of concerns in all refactored modules
  • Consistent helper extraction pattern across all files
  • Zero behavioral changes while improving internal structure
  • Excellent commit message documentation with impact metrics

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR is a pure structural refactoring ("Titan quality sweep — phase 5b + 6.1") that decomposes large monolithic functions across the domain/analysis/ and features/ layers into focused, named helper functions. No public APIs, data schemas, or observable behaviors are changed.

Key changes:

  • context.js — Inline callee/caller/test/complexity assembly blocks extracted into buildCallees, buildCallers, buildRelatedTests, getComplexityMetrics, getNodeChildrenSafe; the recursive explainCallees closure is lifted to module level with explicit parameter passing
  • dependencies.js — BFS path-finding and endpoint resolution split into bfsShortestPath, reconstructPath, resolveEndpoints, buildTransitiveCallers
  • impact.js — Git interaction (findGitRoot, runGitDiff, parseGitDiff), impact computation (findAffectedFunctions, buildFunctionImpactResults), and auxiliary lookups (lookupCoChanges, lookupOwnership, checkBoundaryViolations) extracted from diffImpactData
  • module-map.jsstatsData decomposed into 9 helpers covering node/edge counting, language distribution, hotspot finding, embeddings, quality metrics, roles, and complexity summary
  • boundaries.jsvalidateBoundaryConfig (cog 101→2) split into validateModules, validatePreset, validateTargetList, validateRules, validateLayerAssignments
  • cfg.js / complexity.js / dataflow.js — WASM parser initialization, tree-acquisition, and persistence steps each extracted into dedicated helpers
  • sequence.js — BFS traversal (bfsCallees), dataflow annotation (annotateDataflow), and participant construction (buildParticipants) separated from the entry-point lookup
  • structure.js — 7 build-time helpers extracted from buildStructure
  • triage.js — Scoring/summary logic moved to buildTriageItems, computeTriageSummary; SORT_FNS hoisted to module-level constant
  • inspect.js / overview.js — Presentation rendering extracted into dedicated render* / print* functions, eliminating deeply nested closures and duplicate grid-rendering loops

Confidence Score: 5/5

  • Safe to merge — pure structural refactoring with no behavioral changes, backed by 1883 passing tests.
  • Every extracted helper maps 1:1 to an inline block from the original; closure-captured variables are correctly threaded as explicit parameters; early-exit patterns (return null in helpers replacing continue in loops) are semantically equivalent; the test suite provides comprehensive behavioral coverage confirming no regressions.
  • No files require special attention. src/domain/analysis/module-map.js has a minor redundant testFilterSQL call inside findHotspots, but this is a trivial inefficiency with no correctness impact.

Important Files Changed

Filename Overview
src/domain/analysis/context.js Extracted buildCallees, buildCallers, buildRelatedTests, getComplexityMetrics, getNodeChildrenSafe, and explainCallees as module-level helpers. The explainCallees helper correctly receives visited as an explicit argument (previously a closure capture). Logic is preserved exactly.
src/domain/analysis/dependencies.js Extracted resolveEndpoints, bfsShortestPath, reconstructPath, and buildTransitiveCallers from pathData and fnDepsData. All behavioral logic is faithfully preserved; early-return patterns replace inner return statements correctly.
src/domain/analysis/impact.js Extracted findGitRoot, runGitDiff, parseGitDiff, findAffectedFunctions, buildFunctionImpactResults, lookupCoChanges, lookupOwnership, and checkBoundaryViolations from diffImpactData. Each helper maps exactly to an inline block from the original.
src/domain/analysis/module-map.js Extracted buildTestFileIds, countNodesByKind, countEdgesByKind, countFilesByLanguage, findHotspots, getEmbeddingsInfo, computeQualityMetrics, countRoles, getComplexitySummary from statsData. Minor: findHotspots internally recomputes testFilter via testFilterSQL rather than receiving it from the caller, adding a small redundant call.
src/features/boundaries.js validateBoundaryConfig (cog 101→2) decomposed into validateModules, validatePreset, validateTargetList, validateRules, validateLayerAssignments. Clean extraction with no behavioral change; each sub-validator mutates the shared errors array via parameter passing.
src/features/cfg.js Extracted initCfgParsers, getTreeAndLang, buildVisitorCfgMap, persistCfg from buildCFGData. The combined if (needsFallback) block in initCfgParsers correctly collapses the original two-step if (needsFallback) / if (parsers) idiom since parsers is only truthy when needsFallback is true.
src/features/communities.js buildCommunityObjects and analyzeDrift extracted from communitiesData. Community object construction and drift analysis are now cleanly separated; the communityDirs Map is passed between them correctly.
src/features/complexity.js Extracted initWasmParsersIfNeeded, getTreeForFile, upsertPrecomputedComplexity, upsertAstComplexity from buildComplexityMetrics. The if/else structure correctly replaces the original if-continue-then-fallback pattern.
src/features/dataflow.js Extracted initDataflowParsers, getDataflowForFile, insertDataflowEdges from buildDataflowEdges. The resolveNode function was changed from a named function declaration to an arrow function assigned to a const; behavior is identical.
src/features/sequence.js Extracted findEntryNode, bfsCallees, annotateDataflow, buildParticipants from sequenceData. The withDataflow local variable was removed; opts.dataflow is used directly in the call site with identical truthy semantics.
src/features/structure.js Extracted getAncestorDirs, cleanupPreviousData, collectAllDirectories, insertContainsEdges, computeImportEdgeMaps, computeFileMetrics, computeDirectoryMetrics from buildStructure. getAncestorDirs is called twice (in cleanup and insertContainsEdges), matching the original two separate IIFE computations.
src/features/triage.js SORT_FNS lifted to module-level constant (created once rather than per call), EMPTY_SUMMARY factory extracted, buildTriageItems and computeTriageSummary extracted. Behaviorally equivalent; signalCoverage division is safe since computeTriageSummary is only reached after a non-empty filtered check.
src/presentation/queries-cli/inspect.js Presentation rendering logic extracted: renderContextResult, renderFileExplain, renderFunctionExplain (renamed from printFunctionExplain). None of the original nested functions captured outer variables, so extraction to module level is safe.
src/presentation/queries-cli/overview.js printCountGrid abstraction eliminates 7 identical grid-rendering loops. All print* helpers are pure side-effect functions over the data object; stats() now reads as a clean sequential pipeline.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph domain["domain/analysis/"]
        CTX["context.js\ncontextData / explainData"]
        DEP["dependencies.js\nfnDepsData / pathData"]
        IMP["impact.js\ndiffImpactData"]
        MOD["module-map.js\nstatsData"]
    end

    subgraph features["features/"]
        BND["boundaries.js\nvalidateBoundaryConfig"]
        CFG["cfg.js\nbuildCFGData"]
        COM["communities.js\ncommunitiesData"]
        CPX["complexity.js\nbuildComplexityMetrics"]
        DFL["dataflow.js\nbuildDataflowEdges"]
        SEQ["sequence.js\nsequenceData"]
        STR["structure.js\nbuildStructure"]
        TRI["triage.js\ntriageData"]
    end

    subgraph presentation["presentation/queries-cli/"]
        INS["inspect.js\ncontext / explain"]
        OVW["overview.js\nstats"]
    end

    CTX --> |"buildCallees\nbuildCallers\nbuildRelatedTests\ngetComplexityMetrics\nexplainCallees"| H1(["helpers ×5"])
    DEP --> |"bfsShortestPath\nreconstructPath\nresolveEndpoints\nbuildTransitiveCallers"| H2(["helpers ×4"])
    IMP --> |"findGitRoot\nrunGitDiff\nparseGitDiff\nfindAffectedFunctions\nbuildFunctionImpactResults\nlookupCoChanges / Ownership / Boundaries"| H3(["helpers ×8"])
    MOD --> |"buildTestFileIds\ncountNodesByKind\ncountEdgesByKind\nfindHotspots\ncomputeQualityMetrics\n+4 more"| H4(["helpers ×9"])
    BND --> |"validateModules\nvalidatePreset\nvalidateTargetList\nvalidateRules\nvalidateLayerAssignments"| H5(["helpers ×5"])
    CFG --> |"initCfgParsers\ngetTreeAndLang\nbuildVisitorCfgMap\npersistCfg"| H6(["helpers ×4"])
    COM --> |"buildCommunityObjects\nanalyzeDrift"| H7(["helpers ×2"])
    CPX --> |"initWasmParsersIfNeeded\ngetTreeForFile\nupsertPrecomputed\nupsertAstComplexity"| H8(["helpers ×4"])
    DFL --> |"initDataflowParsers\ngetDataflowForFile\ninsertDataflowEdges"| H9(["helpers ×3"])
    SEQ --> |"findEntryNode\nbfsCallees\nannotateDataflow\nbuildParticipants"| H10(["helpers ×4"])
    STR --> |"getAncestorDirs\ncleanupPreviousData\ncollectAllDirs\ninsertContainsEdges\ncomputeImportEdgeMaps\n+2 more"| H11(["helpers ×7"])
    TRI --> |"buildTriageItems\ncomputeTriageSummary\nSORT_FNS / EMPTY_SUMMARY"| H12(["helpers ×4"])
    INS --> |"renderContextResult\nrenderFileExplain\nrenderFunctionExplain"| H13(["helpers ×3"])
    OVW --> |"printCountGrid\nprintNodes / Edges / Files\nprintCycles / Hotspots\nprintQuality / Roles / Complexity / Communities"| H14(["helpers ×11"])
Loading

Last reviewed commit: bb2729e

Base automatically changed from refactor/titan-ast-builder to main March 17, 2026 11:57
# Conflicts:
#	src/ast-analysis/engine.js
#	src/db/migrations.js
#	src/domain/analysis/context.js
#	src/domain/analysis/impact.js
#	src/domain/analysis/module-map.js
#	src/domain/graph/builder/stages/build-edges.js
#	src/domain/graph/builder/stages/detect-changes.js
#	src/extractors/csharp.js
#	src/extractors/go.js
#	src/extractors/java.js
#	src/extractors/php.js
#	src/extractors/rust.js
#	src/features/cfg.js
#	src/features/complexity.js
#	src/features/dataflow.js
#	src/features/shared/find-nodes.js

Impact: 26 functions changed, 42 affected
Impact: 27 functions changed, 80 affected
@carlos-alm
Copy link
Contributor Author

Resolved 16 merge conflicts with main — kept PR's refactored helpers for domain analysis and features, main's extractor guards, hasTable safety improvements, and find-nodes validation.

@greptileai

…into refactor/titan-domain-features

# Conflicts:
#	src/features/complexity.js
@carlos-alm carlos-alm merged commit 5e8d825 into main Mar 17, 2026
14 checks passed
@carlos-alm carlos-alm deleted the refactor/titan-domain-features branch March 17, 2026 13:12
@github-actions github-actions bot locked and limited conversation to collaborators Mar 17, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant