RUSTFORGE — SAFETY AUDIT REPORT

10-Repo Rust Ecosystem — Complete Findings
Scope: 10 Rust repos, 365 source files
Focus: Unwrap/panic safety, unsafe audit, hash-chain integrity
Tooling: Custom static scanner (15 tests, 4 bugs found in own tool)
Date: February 2026
Repos Audited: 10
Source Files: 365
Tests Verified: 1,025+
Prod Unwraps Classified: 140
Fixes Delivered: 8
Methodology Sample — This report documents an internal audit of our own Rust ecosystem, published to demonstrate our methodology and analytical rigor. Client audits are performed under NDA.

Executive Summary

We audited 10 Rust repositories spanning AI inference, agent runtimes, authentication, search, government services, ethical learning, and emulation — totaling 365 source files with 1,025+ tests. Every production .unwrap() and .expect() was manually classified: 140 total, all either algorithmically invariant or CLI-startup pattern. 8 actionable issues were found and fixed on the spot: 5 NaN-panic risks via partial_cmp().unwrap(), 1 hash-chain verification crash, 1 unguarded error propagation, and 1 ambiguous unwrap upgraded to explicit expect. A custom static analysis scanner was built and improved during the audit, catching 4 bugs in our own tooling along the way. Final result: 0 P0 risks, 0 actionable unwraps remaining.

Risk Findings

Every finding below was detected, classified, fixed, and verified with passing tests during the audit.

RF-2026-001 High
NaN Panic in f32 Ordering — 5 Sites, 2 Repos
What
partial_cmp().unwrap() on f32 values panics when either operand is NaN. Found in LLM token sampling (argmax, top-k sort, top-p sort) and anomaly detection confidence scoring. Any NaN from upstream tensor ops would crash the inference pipeline.
Where
llm-sampler/src/lib.rs — lines 192, 204, 225 (argmax, top-k, top-p)
llm-models/src/llama.rs — line 880 (softmax argmax)
b3-core/detectors.rs — line 330 (confidence sort)
Impact
NaN values are common in neural network inference (division by zero in attention, overflow in softmax). A single NaN token would crash the entire LLM serving pipeline. In b3-core, corrupted confidence scores would crash the anomaly detector.
Evidence
// Before (panics on NaN):
logits.iter().enumerate()
    .max_by(|(_, a), (_, b)| a.partial_cmp(b).unwrap())

// After (NaN-safe, total ordering):
logits.iter().enumerate()
    .max_by(|(_, a), (_, b)| a.total_cmp(b))
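FIXED All 5 sites: partial_cmp().unwrap() → total_cmp(). Ordering is now total and deterministic: a positive NaN sorts above +∞ and a negative NaN below −∞, so a caller that must never select NaN still needs to filter it out. 88 + 88 tests pass.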
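The behavioral difference can be sketched with the standard library alone. The helper names argmax_total and argmax_skip_nan are hypothetical and are not taken from llm-sampler; the second variant is an assumption about how a sampler might avoid selecting NaN, not something the audit applied.

```rust
/// Hypothetical helper: index of the largest logit under IEEE 754 total ordering.
/// Never panics, but a positive NaN ranks above every other value.
fn argmax_total(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.total_cmp(b))
        .map(|(i, _)| i)
}

/// Hypothetical variant: ignore NaN entries before taking the maximum.
/// Returns None if the slice is empty or contains only NaN.
fn argmax_skip_nan(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .filter(|(_, v)| !v.is_nan())
        .max_by(|(_, a), (_, b)| a.total_cmp(b))
        .map(|(i, _)| i)
}
```

For logits [0.1, NaN, 2.5, -1.0] with a positive NaN, argmax_total returns index 1 (the NaN itself) while argmax_skip_nan returns index 2. Which behavior is right depends on whether a NaN logit should be surfaced or suppressed.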
RF-2026-002 High
Hash-Chain Verify Crash on Malformed Input
What
The verify() function of an append-only hash-chain log used serde_json::from_str(&line).unwrap() to parse each line. A single malformed line (disk corruption, partial write, manual edit) would panic instead of returning verification failure.
Where
logstore/src/lib.rs — line 158, verify() function
Impact
The logstore provides tamper-evidence for an agent runtime. If verification panics, corruption surfaces as a crash rather than as a reported verification failure. An attacker who can append garbage to the log file can therefore trigger a denial-of-service on the integrity check.
Evidence
// Before (panics on malformed JSON):
let v: serde_json::Value = serde_json::from_str(&line).unwrap();

// After (returns chain-invalid without crash):
let v: serde_json::Value = match serde_json::from_str(&line) {
    Ok(v) => v,
    Err(_) => return Ok(false),
};
FIXED Explicit match returns Ok(false) on parse failure. Verification never panics. 41 tests pass.
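The same "malformed input means invalid chain, not panic" rule can be sketched without serde_json. Everything here is an assumption made for illustration: the line format, the append helper, and the digest function, which uses std's DefaultHasher as a non-cryptographic stand-in for whatever hash the real logstore uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::io::{self, BufRead};

// Stand-in digest (NOT cryptographic); a real log would use a proper hash.
fn digest(prev: u64, payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prev.hash(&mut h);
    payload.hash(&mut h);
    h.finish()
}

// Hypothetical line format: "<hash as 16 hex digits> <payload>".
fn append(log: &mut String, prev: u64, payload: &str) -> u64 {
    let h = digest(prev, payload);
    log.push_str(&format!("{h:016x} {payload}\n"));
    h
}

// Returns Ok(false) for any malformed or mismatching line; only I/O errors are Err.
fn verify<R: BufRead>(reader: R) -> io::Result<bool> {
    let mut prev = 0u64;
    for line in reader.lines() {
        let line = line?;
        let Some((hex, payload)) = line.split_once(' ') else {
            return Ok(false); // no separator: malformed record
        };
        let Ok(stored) = u64::from_str_radix(hex, 16) else {
            return Ok(false); // hash field is not valid hex
        };
        if stored != digest(prev, payload) {
            return Ok(false); // chain link does not match
        }
        prev = stored;
    }
    Ok(true)
}
```

In this sketch a garbage line appended to a valid log makes verify return Ok(false), so the caller can report tampering instead of crashing.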
RF-2026-003 Medium
Unguarded Error Propagation in Agent Runtime
What
submit(request).unwrap() in a runtime execution path swallowed the error type. If the intervention channel was closed or full, the runtime panicked instead of propagating the error through the normal error chain.
Where
runtime.rs — line 243, submit() call inside execution loop
Impact
A closed channel during graceful shutdown or backpressure would crash the entire runtime instead of triggering the retry/fallback path. The panic would poison any shared state held at that point.
Evidence
// Before (panic on channel error):
submit(request).unwrap();

// After (propagates via error chain):
submit(request)?;
// Added: Intervention(#[from] InterventionError) to RuntimeError
FIXED .unwrap() → ? with new InterventionError variant in RuntimeError enum. 179 tests pass.
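A minimal stdlib sketch of the same shape follows. The enum variants, the step function, and the use of a bounded std channel are assumptions; the real runtime's channel type is not shown in this report, and the hand-written From impl stands in for what the #[from] attribute would generate.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

#[derive(Debug, PartialEq)]
enum InterventionError {
    Closed,
    Full,
}

#[derive(Debug, PartialEq)]
enum RuntimeError {
    Intervention(InterventionError),
}

// Hand-written equivalent of a derived `#[from]` conversion.
impl From<InterventionError> for RuntimeError {
    fn from(e: InterventionError) -> Self {
        RuntimeError::Intervention(e)
    }
}

// Hypothetical submit: a non-blocking send that reports why it failed.
fn submit(tx: &SyncSender<String>, request: String) -> Result<(), InterventionError> {
    tx.try_send(request).map_err(|e| match e {
        TrySendError::Full(_) => InterventionError::Full,
        TrySendError::Disconnected(_) => InterventionError::Closed,
    })
}

// One iteration of a hypothetical execution loop: `?` converts and propagates.
fn step(tx: &SyncSender<String>, request: String) -> Result<(), RuntimeError> {
    submit(tx, request)?;
    Ok(())
}
```

With this shape, a full or closed channel reaches the caller as RuntimeError::Intervention, where a retry or fallback policy can decide what to do.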
RF-2026-004 Medium
10 Unsafe Pointer Casts Without Bounds Checking
What
LLM quantization and tensor operations used raw pointer casts (as *const f16, std::slice::from_raw_parts) to reinterpret byte buffers, with no alignment or bounds validation. Building a slice from a misaligned pointer is undefined behavior in Rust regardless of target architecture.
Where
llm-quant/src/*.rs — 5 pointer casts in quantization kernels
llm-core/src/tensor.rs — 3 pointer casts in tensor view operations
Additional 2 in GGUF model loading
Impact
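Undefined behavior from misaligned reads. On x86 this usually appears to work, but on ARM (Apple Silicon, mobile) it can fault or be miscompiled. A buffer overread occurs if the element count covers more bytes than the buffer holds, for example when the byte length is not a multiple of the element size.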
Evidence
// Before (raw pointer cast, no validation):
let ptr = bytes.as_ptr() as *const f16;
let slice = unsafe { std::slice::from_raw_parts(ptr, count) };

// After (safe, validated by bytemuck):
let slice: &[f16] = bytemuck::try_cast_slice(bytes)
    .map_err(|e| TensorError::AlignmentError(e))?;
FIXED 10 pointer casts → bytemuck::from_bytes / try_cast_slice. 9 remaining unsafe blocks are Metal FFI, Send/Sync, mmap — all documented with SAFETY comments. 7 boundary tests added.
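The checks that a validated cast has to perform can be sketched with the standard library. This is an illustration rather than bytemuck's implementation: f32 stands in for f16 (which is not a stable std type), and CastError, try_view_f32, and decode_f32_le are hypothetical names.

```rust
use std::mem::{align_of, size_of};

#[derive(Debug, PartialEq)]
enum CastError {
    Misaligned,
    BadLength,
}

/// Zero-copy view of `bytes` as f32, refused unless length and alignment fit.
fn try_view_f32(bytes: &[u8]) -> Result<&[f32], CastError> {
    if bytes.len() % size_of::<f32>() != 0 {
        return Err(CastError::BadLength);
    }
    if bytes.as_ptr() as usize % align_of::<f32>() != 0 {
        return Err(CastError::Misaligned);
    }
    let count = bytes.len() / size_of::<f32>();
    // SAFETY: the pointer comes from a live slice, is aligned for f32 (checked
    // above), covers exactly `count` elements, and every bit pattern is a valid f32.
    Ok(unsafe { std::slice::from_raw_parts(bytes.as_ptr().cast::<f32>(), count) })
}

/// Copying fallback with no alignment requirement: decode little-endian f32s.
fn decode_f32_le(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

The zero-copy path fails closed on a misaligned or truncated buffer; the copying path trades an allocation for independence from alignment and host endianness.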

Scanner Methodology

We built a custom Rust static analysis scanner during the audit. It found 4 bugs in itself — each improving accuracy across all 365 files.

SCANNER Tooling
4 Scanner Bugs Found & Fixed
BUG-1: Test Indirection
#[cfg(test)] mod tests; → src/tests.rs was not followed. Result: +27 false positives in one repo. Fix: collect_test_module_paths() resolves indirection before counting.
BUG-2: Unsafe String Match
\bunsafe\b matched inside string literals, comments, and lint attributes. Fix: count_real_unsafe() strips strings/comments first, then matches only unsafe {/fn/impl/trait.
BUG-3: Method Name Collision
Parser::expect(TokenKind, msg)? matched the .expect( regex. +16 false positives in one repo. Fix: require string literal argument — \.expect\s*\(\s*".
BUG-4: Doc Comment False Positives
/// and //! lines containing .unwrap() counted as production code. +18 false positives across 4 repos. Fix: strip_doc_comments() filters before counting.
15 TESTS Scanner v3 with full regression suite. Located at newcool-cognitive-core/crates/extractor-rust/.
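Two of these fixes (BUG-3 and BUG-4) can be approximated without a regex crate. The sketch below is not the scanner's code: strip_doc_comments borrows the name mentioned above but its body is an assumption, and count_unwraps and count_expect_with_literal are hypothetical helpers that mimic the \.expect\s*\(\s*" pattern by hand.

```rust
/// Drop lines that are doc comments (`///` or `//!`) before counting.
fn strip_doc_comments(src: &str) -> String {
    src.lines()
        .filter(|line| {
            let t = line.trim_start();
            !(t.starts_with("///") || t.starts_with("//!"))
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Count `.unwrap()` occurrences outside doc comments.
fn count_unwraps(src: &str) -> usize {
    strip_doc_comments(src).matches(".unwrap()").count()
}

/// Count `.expect(` calls whose first argument starts with a string literal,
/// so parser-style methods such as `p.expect(TokenKind::X, msg)` are skipped.
fn count_expect_with_literal(src: &str) -> usize {
    let mut count = 0;
    let mut rest = src;
    while let Some(pos) = rest.find(".expect") {
        let after = &rest[pos + ".expect".len()..];
        if let Some(args) = after.trim_start().strip_prefix('(') {
            if args.trim_start().starts_with('"') {
                count += 1;
            }
        }
        rest = after;
    }
    count
}
```

A line-based filter like this is deliberately crude (it does not handle block doc comments or string literals that contain `///`), which is the kind of gap the regression suite exists to catch.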

Analysis Observations

  1. The ecosystem has strong safety foundations. Zero panic!, todo!, or unimplemented! in production code across all 10 repos. 11 unreachable!() are all guarded by bitwise/enum invariants and documented.
  2. partial_cmp on floats is a systemic blind spot. Found independently in 3 crates (llm-sampler, llm-models, b3-core). The pattern looks correct at first glance but panics on NaN, a value that routinely appears in neural network inference. total_cmp should be the default for all f32 ordering in Rust.
  3. Library-level code needs different unwrap discipline than CLI code. 34 of 40 unwraps in the agent-runtime are .expect("msg") in fn main — correct CLI pattern. But 1 unwrap in a library verify() function was a genuine crash risk. The distinction between binary-level and library-level unwrap tolerance is critical.
  4. Static analysis tooling must audit itself. 4 bugs in our own scanner over the course of 10 repos. Each bug was caught because manual classification disagreed with scanner counts. Dogfooding the scanner on known-good code is essential.
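The library-versus-binary distinction in observation 3 can be illustrated with a small sketch. The function names and the port-parsing scenario are hypothetical and do not come from the audited repos.

```rust
use std::num::ParseIntError;

/// Library-level code: never panics on bad input, the caller decides what to do.
pub fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
    raw.trim().parse::<u16>()
}

/// Binary-level startup code (would be called from fn main): failing fast with
/// a descriptive message is acceptable because nothing useful can run without it.
fn startup(raw: &str) -> u16 {
    parse_port(raw).expect("PORT must be a valid u16")
}
```

The same parse failure is an ordinary Err inside the library and a deliberate, documented abort at process start.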

Methodology: Unwrap Classification

Scanned: 365 files
Classified: 140 prod unwraps
Result: 0 actionable remaining

Project Scoreboard

Each repo scored by: prod unwrap count, test coverage, unsafe usage, invariant documentation quality.

apex-learning

10
0 prod unwraps. 0 unsafe. 106 tests. DDD architecture with 7 crates. Clean.

gameboy

9.5
0 prod unwraps. 5 unreachable! (all bitmask-guarded, documented). 28 tests.

rcr

9.0
4 prod unwraps (all match-arm/enum invariant). 330 tests. Best test density in ecosystem.

nccr

9.0
4 prod unwraps (UTF-8 from ASCII, self-ref lookup). 5 unreachable! documented. 68 tests.

estado-transparente

8.5
4 prod unwraps (chrono hardcoded month/day, always valid). 45 tests. Government portal.

llm-runtime

8.0
10 prod unwraps post-fix. 9 unsafe (Metal FFI, all documented). 88 tests. 4 NaN fixes applied.
4 FIXED

agent-core

8.0
14 prod unwraps post-fix (JSON read-back, serde, RFC3339). 41 tests. Verify() hardened.
1 FIXED

auth-rust

7.5
16 prod unwraps (7 chrono invariant + 9 startup expect). 29 tests. All classified safe.

agent-runtime

7.5
40 prod unwraps post-fix (34 CLI expect pattern). 179 tests. 2 fixes applied.
2 FIXED

b3-core

7.0
48 prod unwraps (30 RwLock poison + 18 Regex Lazy). 1 NaN fix. 88 tests. Highest density.
1 FIXED

Cross-Project Patterns

Recurring patterns across the 10-repo ecosystem.

Pattern                                 Repos    Count   Status
partial_cmp().unwrap() on f32           3 / 10   5       FIXED
CLI .expect("msg") in fn main           3 / 10   44      SAFE
RwLock poison .unwrap()                 1 / 10   30      INVARIANT
Regex::new(literal).unwrap() in Lazy    2 / 10   21      INVARIANT
unreachable!() with invariant guard     3 / 10   11      DOCUMENTED
unsafe with SAFETY comment              1 / 10   9       DOCUMENTED
serde_json::to_string infallible        3 / 10   6       INVARIANT
panic! / todo! in production            0 / 10   0       PASS
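For readers unfamiliar with the RwLock poison pattern in the table, the sketch below shows what the unwrap is guarding. A std RwLock becomes poisoned only when a thread panics while holding the write guard, so lock.read().unwrap() can panic only after an earlier panic. The helper names are hypothetical, and the recovering variant is shown for contrast, not as a change the audit made.

```rust
use std::sync::RwLock;

/// The classified pattern: panics only if the lock was already poisoned.
fn read_len_strict(lock: &RwLock<Vec<u32>>) -> usize {
    lock.read().unwrap().len()
}

/// Alternative: take the guard out of the PoisonError and keep reading.
/// Only appropriate when the protected data stays valid after a writer panic.
fn read_len_recovering(lock: &RwLock<Vec<u32>>) -> usize {
    lock.read().unwrap_or_else(|poisoned| poisoned.into_inner()).len()
}
```

Whether to propagate the earlier panic or recover is a per-lock design decision, which is why each such site still has to be classified by hand.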

Want the same depth on your Rust codebase?

Every unwrap classified. Every unsafe documented. Every fix verified with tests.
Prepaid via Stripe. No actionable findings? Full refund.

Start Your Audit — from $1,800 audit@newcool.io
