fix: restore complexity/CFG/dataflow analysis in builds by carlos-alm · Pull Request #469 · optave/codegraph

carlos-alm · 2026-03-16T10:48:26Z

Summary

Fixes #468. Two bugs were causing all analysis tables (function_complexity, cfg_blocks, cfg_edges, dataflow) to be empty on every build — full or incremental, native or WASM.

Broken import paths in ast-analysis/engine.js: The buildXxx functions were dynamically imported as ../complexity.js, ../cfg.js, ../ast.js, ../dataflow.js — which resolved to non-existent src/complexity.js etc. instead of src/features/complexity.js. The imports failed silently inside try/catch blocks, so analysis data was never written to the DB. Fixed to ../features/complexity.js etc.
Missing WASM tree pre-parse for native incremental rebuilds: ensureWasmTrees was only triggered when dataflow analysis needed it, but complexity and CFG visitors also require a WASM tree-sitter tree for their visitor walks. On native incremental rebuilds, changed files had their analysis data purged by purgeFilesFromGraph but never re-computed. Broadened the ensureWasmTrees gate to trigger when any of complexity/CFG/dataflow need it.

Test plan

New tests in incremental-parity.test.js verify complexity, CFG, and dataflow rows survive incremental rebuilds
Full test suite passes (662 tests, 42 files — failures are only pre-existing missing WASM grammars for Rust/Java/C#/Ruby/PHP in worktree)
Manual verification: WASM full build, native full build, and native incremental rebuild all produce non-zero analysis rows

Two bugs fixed: 1. Broken import paths in ast-analysis/engine.js — the buildXxx functions (buildComplexityMetrics, buildCFGData, buildDataflowEdges, buildAstNodes) were imported as ../complexity.js etc. which resolved to non-existent src/complexity.js instead of src/features/complexity.js. The dynamic imports failed silently inside try/catch, so all analysis tables (function_complexity, cfg_blocks, dataflow) were always empty on every build — both full and incremental, both engines. 2. Missing WASM tree pre-parse for native engine incremental rebuilds — ensureWasmTrees was only called when dataflow needed it, but complexity and CFG visitors also require a WASM tree. On native incremental rebuilds, changed files had their analysis data purged but never re-computed because no WASM tree was available for the visitor walk. Impact: 1 functions changed, 0 affected

claude · 2026-03-16T10:48:43Z

Claude finished @carlos-alm's task —— View job

PR Review: Fix complexity/CFG/dataflow analysis in builds

Conducting comprehensive review following 5-phase process:

Todo List:

Phase 1: Root cause analysis - understand the real problem
Phase 2: Configuration change detection - check for risky changes
Phase 3: Test integrity verification
Phase 4: Code quality review
Phase 5: Backlog compliance assessment
Final: Provide structured assessment and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Two critical bugs causing all analysis tables to be empty on every build:

Broken import paths in src/ast-analysis/engine.js:340-373 - Dynamic imports were using ../complexity.js instead of ../features/complexity.js, causing silent failures inside try/catch blocks
Missing WASM tree pre-parse for native incremental rebuilds - Only dataflow analysis triggered ensureWasmTrees(), but complexity and CFG visitors also require WASM trees

Why This Approach: This fixes the root cause, not symptoms. The import paths were genuinely wrong after the codebase reorganization (commit 68bd052 moved files from src/ to src/features/), and the WASM tree logic was incomplete.

Risk Assessment: Low risk - surgical fixes to import paths + broadened condition for WASM tree generation.

Phase 2: Configuration Change Detection ✅

No configuration, linting rules, timeouts, or module resolution changes detected. This is pure bug fixing.

Phase 3: Test Integrity ✅

Strong test coverage added:

3 new tests in incremental-parity.test.js:147-164 specifically verify complexity, CFG, and dataflow analysis survive incremental rebuilds
Tests use readAnalysisTables() function to query the actual DB tables: function_complexity, cfg_blocks, dataflow
Tests verify row counts match between full vs incremental builds
Proper assertions: expect(incrAnalysis.complexity.length).toBeGreaterThan(0) ensures data exists

No red flags - tests strengthen verification without relaxing standards.

Phase 4: Code Quality ✅

Import path fixes (lines 340, 351, 362, 373):

- const { buildAstNodes } = await import('../ast.js');
+ const { buildAstNodes } = await import('../features/ast.js');

✅ Correct - the files are actually in src/features/, verified via ls src/features/

WASM tree logic enhancement (lines 81-116):

Broadened condition from if (doDataflow) to if (doComplexity || doCfg || doDataflow)
Added proper per-analysis checking: needsComplexity, needsCfg, needsDataflow
Logic is sound: complexity/CFG visitors need WASM trees for AST walking, not just dataflow

Code quality: Clean, defensive, follows existing patterns. No performance issues introduced.

Phase 5: Backlog Compliance ✅

Root Cause Analysis

Problem Being Solved: Critical infrastructure bug - all analysis tables empty due to broken import paths and incomplete WASM tree logic
Why This Approach: Fixes actual import resolution failures and missing dependencies, not masking symptoms
Risk Assessment: Low risk surgical fixes to core functionality

Backlog Compliance

Zero-dep: ✓ - No new runtime dependencies added
Foundation-aligned: ✓ - Fixes existing functionality, maintains offline-first architecture
Problem-fit: 4/5 - Directly prevents analysis data loss that would confuse AI agents about code complexity/structure
Breaking: No - Pure bug fixes, no API changes
Tier: 1

Critical Concerns

None. This is a well-executed bug fix with:

Precise root cause identification
Minimal, surgical changes
Strong test coverage for regression prevention
No relaxed standards or workarounds

Final Recommendation

Rating: ⭐⭐⭐⭐☆ (4/5)
Action: APPROVE
Reasoning: Excellent bug fix - addresses real infrastructure issues with proper testing. Restores critical analysis functionality that would otherwise leave AI agents with incomplete code structure data. Only missing 5th star due to lack of documentation about why these imports failed originally (though git history provides context).

greptile-apps · 2026-03-16T10:52:23Z

Greptile Summary

This PR correctly fixes two independent root causes that were silently emptying all analysis tables (function_complexity, cfg_blocks, cfg_edges, dataflow) on every build. The changes are surgical and well-reasoned.

Root cause fixes:

Wrong dynamic import paths (../complexity.js → ../features/complexity.js etc.) — all four buildXxx imports were resolving to non-existent paths; because they were inside try/catch blocks the failures were swallowed silently. Fixed across all four import sites.
Too-narrow WASM pre-parse gate — the ensureWasmTrees call was gated only on doDataflow, so native incremental rebuilds would purge complexity and CFG data for changed files and never re-compute it. The gate is now broadened to any of doComplexity || doCfg || doDataflow, with fine-grained per-file guards (needsComplexity, needsCfg, needsDataflow) that match the extension sets and state checks used by the unified walk.

Regression test quality:

All three new analysis-parity tests assert toBeGreaterThan(0) before comparing counts, preventing vacuous passes.
Correct column names (function_node_id, source_id) are used in the SQL helpers.
db.close() in readAnalysisTables is guarded by try/finally.

Minor observation: The existing readGraph helper (not introduced by this PR) still calls db.close() inline without try/finally, inconsistent with the pattern correctly applied in the new readAnalysisTables.

Confidence Score: 5/5

This PR is safe to merge — it fixes two clearly identified silent failures with minimal, targeted changes and adds non-vacuous regression tests for all three affected analysis types.
Both fixes are straightforward corrections (wrong path strings, missing gate condition) with no behavioural changes beyond restoring previously broken functionality. The only outstanding item is a pre-existing try/finally omission in readGraph which is a test-helper resource-leak risk, not a production correctness issue.
No files require special attention; the one minor style note is in the test helper readGraph at line 28 of tests/integration/incremental-parity.test.js.

Important Files Changed

Filename	Overview
src/ast-analysis/engine.js	Fixes two root causes of empty analysis tables: broken `../features/*` import paths and a too-narrow WASM pre-parse gate. The new gate correctly checks all three analysis types (`needsComplexity`, `needsCfg`, `needsDataflow`) with consistent extension and state guards. The `COMPLEXITY_EXTENSIONS` constant and `d.cfg !== null` guard align this module with the patterns already used by the unified walk. No new issues introduced.
tests/integration/incremental-parity.test.js	Adds `readAnalysisTables` helper and three new regression tests for #468. Column names are correct (`function_node_id`, `source_id`), `toBeGreaterThan(0)` guards prevent vacuous passes, and `db.close()` in `readAnalysisTables` is properly guarded by `try/finally`. Minor: the pre-existing `readGraph` function was not updated with the same `try/finally` pattern, leaving a small resource-leak inconsistency.

Sequence Diagram

sequenceDiagram
    participant Builder as builder.js
    participant Engine as engine.js (runAnalyses)
    participant Parser as domain/parser.js
    participant Features as features/*.js

    Builder->>Engine: runAnalyses(db, fileSymbols, rootDir, opts)

    Note over Engine: WASM pre-parse gate (NEW: doComplexity || doCfg || doDataflow)
    loop For each file without _tree
        Engine->>Engine: needsComplexity = doComplexity && COMPLEXITY_EXTENSIONS && !d.complexity
        Engine->>Engine: needsCfg = doCfg && CFG_EXTENSIONS && d.cfg !== null && !blocks
        Engine->>Engine: needsDataflow = doDataflow && !symbols.dataflow && DATAFLOW_EXTENSIONS
        alt Any need is true
            Engine->>Parser: ensureWasmTrees(fileSymbols, rootDir)
            Parser-->>Engine: fileSymbols updated with _tree
            Note over Engine: break — no need to check further files
        end
    end

    Note over Engine: Unified walk (files with _tree)
    loop For each file with _tree
        Engine->>Engine: createComplexityVisitor / createCfgVisitor / createDataflowVisitor
        Engine->>Engine: walkWithVisitors → store results on symbols/defs
    end

    Note over Engine: Delegate to buildXxx (FIXED import paths)
    Engine->>Features: import('../features/ast.js') → buildAstNodes
    Engine->>Features: import('../features/complexity.js') → buildComplexityMetrics
    Engine->>Features: import('../features/cfg.js') → buildCFGData
    Engine->>Features: import('../features/dataflow.js') → buildDataflowEdges
    Features-->>Engine: DB rows written
    Engine-->>Builder: timing object

_{Last reviewed commit: 93ec404}

greptile-apps · 2026-03-16T10:52:27Z

src/ast-analysis/engine.js

+      const needsComplexity =
+        doComplexity &&
+        defs.some((d) => (d.kind === 'function' || d.kind === 'method') && d.line && !d.complexity);
+      const needsCfg =
+        doCfg &&
+        CFG_EXTENSIONS.has(ext) &&
+        defs.some(
+          (d) =>
+            (d.kind === 'function' || d.kind === 'method') &&
+            d.line &&
+            !Array.isArray(d.cfg?.blocks),
+        );


needsCfg gate is looser than needsWasmCfg

The WASM pre-parse gate here uses !Array.isArray(d.cfg?.blocks) which is true when d.cfg === null. However, the unified walk gate on line 197 uses d.cfg !== null && !Array.isArray(d.cfg?.blocks), which skips null entries.

In cfg.js, d.cfg === null means "the native engine explicitly decided CFG can't/needn't be computed for this definition" and is treated as satisfied. The mismatch means this gate may trigger WASM tree loading for files where the visitor will then skip all definitions.

This isn't a correctness bug (CFG results are still correct), but it's a minor inefficiency — and a consistency concern. Consider aligning the check:

Suggested change

const needsComplexity =

doComplexity &&

defs.some((d) => (d.kind === 'function' || d.kind === 'method') && d.line && !d.complexity);

const needsCfg =

doCfg &&

CFG_EXTENSIONS.has(ext) &&

defs.some(

(d) =>

(d.kind === 'function' || d.kind === 'method') &&

d.line &&

!Array.isArray(d.cfg?.blocks),

);

const needsComplexity =

doComplexity &&

defs.some((d) => (d.kind === 'function' || d.kind === 'method') && d.line && !d.complexity);

const needsCfg =

doCfg &&

CFG_EXTENSIONS.has(ext) &&

defs.some(

(d) =>

(d.kind === 'function' || d.kind === 'method') &&

d.line &&

d.cfg !== null &&

!Array.isArray(d.cfg?.blocks),

);

Fixed — added d.cfg !== null to the needsCfg gate, aligning it with the unified walk guard on line 197 and avoiding unnecessary WASM loading for null-CFG definitions.

greptile-apps · 2026-03-16T10:52:28Z

tests/integration/incremental-parity.test.js

+  it('preserves CFG blocks for changed file (#468)', () => {
+    const fullAnalysis = readAnalysisTables(path.join(fullDir, '.codegraph', 'graph.db'));
+    const incrAnalysis = readAnalysisTables(path.join(incrDir, '.codegraph', 'graph.db'));
+    expect(incrAnalysis.cfgBlocks.length).toBe(fullAnalysis.cfgBlocks.length);
+  });


CFG and dataflow tests may pass vacuously

The complexity test on line 150 asserts toBeGreaterThan(0) to ensure rows actually exist before comparing counts. The CFG and dataflow tests only compare counts between full and incremental — if both return 0 (e.g., because the fixture is too simple or a grammar is missing), the test passes without actually verifying analysis data survives incremental rebuilds.

Consider adding expect(incrAnalysis.cfgBlocks.length).toBeGreaterThan(0) and similar for dataflow, matching the complexity test's approach.

Fixed — added toBeGreaterThan(0) assertions for CFG blocks and dataflow statements, matching the complexity test pattern. Also fixed wrong column names in readAnalysisTables (node_id → function_node_id, source_node_id → source_id) that were causing the queries to silently return empty arrays via the try/catch fallback — confirming the tests were indeed passing vacuously.

Add d.cfg !== null check to needsCfg predicate, matching the unified walk guard and avoiding unnecessary WASM loading for null-CFG defs. Fix wrong column names in readAnalysisTables (node_id -> function_node_id, source_node_id -> source_id) that caused CFG/dataflow queries to silently return empty arrays. Add toBeGreaterThan(0) assertions so these tests fail fast if analysis data is missing. Impact: 1 functions changed, 0 affected

carlos-alm · 2026-03-16T10:58:43Z

@greptileai

greptile-apps · 2026-03-16T11:02:45Z

src/ast-analysis/engine.js

+      const needsComplexity =
+        doComplexity &&
+        defs.some((d) => (d.kind === 'function' || d.kind === 'method') && d.line && !d.complexity);


needsComplexity missing language/extension filter

Unlike needsCfg (which guards with CFG_EXTENSIONS.has(ext)) and needsDataflow (which guards with DATAFLOW_EXTENSIONS.has(ext)), needsComplexity has no extension filter. If a file has function definitions but belongs to a language with no COMPLEXITY_RULES entry (e.g. a language the native parser produces definitions for but the WASM complexity visitor doesn't support), this gate can fire and trigger ensureWasmTrees for all files in the batch — even though the unified walk will later skip complexity for that language because cRules is undefined.

The check should be consistent with the other gates:

Suggested change

const needsComplexity =

doComplexity &&

defs.some((d) => (d.kind === 'function' || d.kind === 'method') && d.line && !d.complexity);

const needsComplexity =

doComplexity &&

COMPLEXITY_RULES.has(extToLang.get(ext)) &&

defs.some((d) => (d.kind === 'function' || d.kind === 'method') && d.line && !d.complexity);

This prevents unnecessary WASM loading and brings the three gate conditions in line with each other.

Fixed in 93ec404. Added COMPLEXITY_EXTENSIONS = buildExtensionSet(COMPLEXITY_RULES) at module level (matching CFG_EXTENSIONS and DATAFLOW_EXTENSIONS), and added COMPLEXITY_EXTENSIONS.has(ext) to the needsComplexity check so it now filters by language/extension like the other two guards.

greptile-apps · 2026-03-16T11:02:46Z

tests/integration/incremental-parity.test.js

+  const db = new Database(dbPath, { readonly: true });
+  const result = {};
+  try {
+    result.complexity = db
+      .prepare(
+        `SELECT fc.node_id, fc.cognitive, fc.cyclomatic, n.name, n.file
+         FROM function_complexity fc JOIN nodes n ON fc.node_id = n.id
+         ORDER BY n.name, n.file`,
+      )
+      .all();
+  } catch {
+    result.complexity = [];
+  }
+  try {
+    result.cfgBlocks = db
+      .prepare(
+        `SELECT cb.function_node_id, cb.block_index, cb.block_type, n.name, n.file
+         FROM cfg_blocks cb JOIN nodes n ON cb.function_node_id = n.id
+         ORDER BY n.name, n.file, cb.block_index`,
+      )
+      .all();
+  } catch {
+    result.cfgBlocks = [];
+  }
+  try {
+    result.dataflow = db
+      .prepare(
+        `SELECT d.source_id, d.kind, n.name, n.file
+         FROM dataflow d JOIN nodes n ON d.source_id = n.id
+         ORDER BY n.name, n.file, d.kind`,
+      )
+      .all();
+  } catch {
+    result.dataflow = [];
+  }
+  db.close();
+  return result;
+}


db.close() not guarded by finally

db.close() at line 82 is called after three separate try/catch blocks. In the current implementation this is always reachable — the inner catch blocks swallow all exceptions — but the pattern is fragile: a future catch that rethrows, or a change that adds code between the last catch and db.close(), would silently leak the database handle. A try/finally makes the intent explicit and is idiomatic for resource cleanup:

Suggested change

const db = new Database(dbPath, { readonly: true });

const result = {};

try {

result.complexity = db

.prepare(

`SELECT fc.node_id, fc.cognitive, fc.cyclomatic, n.name, n.file

FROM function_complexity fc JOIN nodes n ON fc.node_id = n.id

ORDER BY n.name, n.file`,

)

.all();

} catch {

result.complexity = [];

}

try {

result.cfgBlocks = db

.prepare(

`SELECT cb.function_node_id, cb.block_index, cb.block_type, n.name, n.file

FROM cfg_blocks cb JOIN nodes n ON cb.function_node_id = n.id

ORDER BY n.name, n.file, cb.block_index`,

)

.all();

} catch {

result.cfgBlocks = [];

}

try {

result.dataflow = db

.prepare(

`SELECT d.source_id, d.kind, n.name, n.file

FROM dataflow d JOIN nodes n ON d.source_id = n.id

ORDER BY n.name, n.file, d.kind`,

)

.all();

} catch {

result.dataflow = [];

}

db.close();

return result;

}

function readAnalysisTables(dbPath) {

const db = new Database(dbPath, { readonly: true });

const result = {};

try {

try {

result.complexity = db

.prepare(

`SELECT fc.node_id, fc.cognitive, fc.cyclomatic, n.name, n.file

FROM function_complexity fc JOIN nodes n ON fc.node_id = n.id

ORDER BY n.name, n.file`,

)

.all();

} catch {

result.complexity = [];

}

try {

result.cfgBlocks = db

.prepare(

`SELECT cb.function_node_id, cb.block_index, cb.block_type, n.name, n.file

FROM cfg_blocks cb JOIN nodes n ON cb.function_node_id = n.id

ORDER BY n.name, n.file, cb.block_index`,

)

.all();

} catch {

result.cfgBlocks = [];

}

try {

result.dataflow = db

.prepare(

`SELECT d.source_id, d.kind, n.name, n.file

FROM dataflow d JOIN nodes n ON d.source_id = n.id

ORDER BY n.name, n.file, d.kind`,

)

.all();

} catch {

result.dataflow = [];

}

} finally {

db.close();

}

return result;

}

Fixed in 93ec404. Wrapped the three inner try/catch blocks in an outer try/finally so db.close() is guaranteed to run even if an unexpected error propagates past the individual catch blocks.

Impact: 1 functions changed, 0 affected

carlos-alm · 2026-03-16T11:13:53Z

@greptileai

greptile-apps bot reviewed Mar 16, 2026

View reviewed changes

carlos-alm added 2 commits March 16, 2026 04:54

Merge branch 'main' into fix/incremental-analysis-468

ce7fb8c

greptile-apps bot reviewed Mar 16, 2026

View reviewed changes

carlos-alm added 2 commits March 16, 2026 05:11

Merge branch 'main' into fix/incremental-analysis-468

71f5b27

fix: add complexity extension filter and guard db.close with finally

93ec404

Impact: 1 functions changed, 0 affected

carlos-alm added 2 commits March 16, 2026 05:18

Merge branch 'main' into fix/incremental-analysis-468

e022f37

Merge branch 'main' into fix/incremental-analysis-468

eb1b155

carlos-alm merged commit 2cebba1 into main Mar 16, 2026
13 checks passed

carlos-alm deleted the fix/incremental-analysis-468 branch March 16, 2026 11:22

github-actions bot locked and limited conversation to collaborators Mar 16, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: restore complexity/CFG/dataflow analysis in builds#469

fix: restore complexity/CFG/dataflow analysis in builds#469
carlos-alm merged 7 commits intomainfrom
fix/incremental-analysis-468

carlos-alm commented Mar 16, 2026

Uh oh!

claude bot commented Mar 16, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Mar 16, 2026 •

edited

Loading

Uh oh!

greptile-apps bot Mar 16, 2026

Uh oh!

carlos-alm Mar 16, 2026

Uh oh!

greptile-apps bot Mar 16, 2026

Uh oh!

carlos-alm Mar 16, 2026

Uh oh!

carlos-alm commented Mar 16, 2026

Uh oh!

greptile-apps bot Mar 16, 2026

Uh oh!

carlos-alm Mar 16, 2026

Uh oh!

greptile-apps bot Mar 16, 2026

Uh oh!

carlos-alm Mar 16, 2026

Uh oh!

carlos-alm commented Mar 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

carlos-alm commented Mar 16, 2026

Summary

Test plan

Uh oh!

claude bot commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Fix complexity/CFG/dataflow analysis in builds

Todo List:

Phase 1: Root Cause Analysis ✅

Phase 2: Configuration Change Detection ✅

Phase 3: Test Integrity ✅

Phase 4: Code Quality ✅

Phase 5: Backlog Compliance ✅

Root Cause Analysis

Backlog Compliance

Critical Concerns

Final Recommendation

Uh oh!

greptile-apps bot commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm commented Mar 16, 2026

Uh oh!

greptile-apps bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm commented Mar 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Mar 16, 2026 •

edited

Loading

greptile-apps bot commented Mar 16, 2026 •

edited

Loading