Auditor Core v2.2.1 | Enterprise Security Analysis Engine

How Auditor Core Works

A deterministic pipeline — from file intake to AI-verified, SPI-scored report, with Chain Analysis in between.

Step 1. FileIntake: Smart Collection

Collects and filters relevant files — skips binaries, media, build artifacts and vendor folders. Processes thousands of files without memory exhaustion.
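The collection step can be pictured as a lazy directory walk. The sketch below is illustrative only: the skip lists, the NUL-byte binary check, and the names `collect_files` and `looks_binary` are assumptions, not the actual FileIntake rules.

```python
import os
from pathlib import Path
from typing import Iterator

# Hypothetical skip lists; the real FileIntake rules are not documented here.
SKIP_DIRS = {".git", "node_modules", "vendor", "dist", "build", "__pycache__"}
SKIP_SUFFIXES = {".png", ".jpg", ".gif", ".mp4", ".zip", ".exe", ".so", ".pyc"}


def looks_binary(path: Path, sample_size: int = 1024) -> bool:
    """Treat a file as binary if its first bytes contain a NUL byte."""
    try:
        with path.open("rb") as handle:
            return b"\x00" in handle.read(sample_size)
    except OSError:
        return True  # unreadable files are skipped as well


def collect_files(root: str) -> Iterator[Path]:
    """Lazily yield scannable files so memory use stays flat on large trees."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if path.suffix.lower() in SKIP_SUFFIXES or looks_binary(path):
                continue
            yield path
```

Because `collect_files` is a generator, a caller can stream paths to the detectors one at a time instead of building a full list first.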

Step 2. 11 Detection Engines: Parallel Scanning

Bandit, Semgrep, Gitleaks, Secret Detector, CICD Analyzer, IaC Scanner, Dependency Scanner, License Scanner, Bridge Detector, SAST Scanner, Slither — each engine contributes weighted findings.
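As a rough picture of parallel scanning with weighted findings, the sketch below runs stand-in engines on a thread pool and tags each finding with its detector's trust weight. The function `run_engines`, the finding fields, and the default weight of 1.0 are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Finding = Dict[str, object]
Engine = Callable[[str], List[Finding]]


def run_engines(target: str, engines: Dict[str, Engine],
                trust: Dict[str, float]) -> List[Finding]:
    """Run every engine concurrently and tag findings with a trust weight."""
    with ThreadPoolExecutor(max_workers=len(engines) or 1) as pool:
        futures = {name: pool.submit(fn, target) for name, fn in engines.items()}
        merged: List[Finding] = []
        # Iterate in sorted name order so the merged output is reproducible
        # regardless of which engine happens to finish first.
        for name in sorted(futures):
            for finding in futures[name].result():
                merged.append({**finding, "detector": name,
                               "trust": trust.get(name, 1.0)})
    return merged
```

Collecting results in a fixed order, rather than in completion order, is one simple way to keep a parallel pipeline deterministic.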

Step 3. Chain Analysis: Attack Path Detection

Correlates findings across detectors. When a hardcoded secret (LOW) and a command injection (MEDIUM) exist in the same scope, they form a CRITICAL chain. Severity escalation ensures correlated risks are never underreported.

Chain ID: CHAIN_0001 — secret_to_command_injection CRITICAL

Step 1: SECRET_HIGH_ENTROPY (LOW) → config.py:42
Step 2: SAST_COMMAND_INJECTION (MEDIUM) → config.py:87
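The escalation in the example above can be sketched as follows. This is a minimal illustration, assuming that "same scope" means "same file" and that chains are defined as rule pairs; the `Finding` class, `CHAIN_RULES` table, and `escalate_chains` function are hypothetical names, not the product's API.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Finding:
    rule_id: str
    severity: str
    file: str
    line: int
    chain_id: Optional[str] = None
    original_severity: Optional[str] = None


# Hypothetical rule table: (first rule, second rule) -> (chain name, chain risk).
CHAIN_RULES = {
    ("SECRET_HIGH_ENTROPY", "SAST_COMMAND_INJECTION"):
        ("secret_to_command_injection", "CRITICAL"),
}


def escalate_chains(findings: List[Finding]) -> List[Finding]:
    """Escalate pairs of findings that match a chain rule within the same file."""
    result = list(findings)
    counter = 0
    for i, first in enumerate(findings):
        for j, second in enumerate(findings):
            rule = CHAIN_RULES.get((first.rule_id, second.rule_id))
            if rule is None or first.file != second.file:
                continue
            counter += 1
            chain_id = f"CHAIN_{counter:04d}"
            for idx in (i, j):
                original = findings[idx]
                # Keep the pre-escalation severity so reports can show both.
                result[idx] = replace(original, severity=rule[1],
                                      chain_id=chain_id,
                                      original_severity=original.severity)
    return result
```

Applied to the two findings in config.py above, both would come back as CRITICAL with chain_id CHAIN_0001, while a command injection in an unrelated file would stay MEDIUM.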

Step 4. WSPM v2.2: Weighted Scoring

Context weighting (CORE / TEST / DOCS / INFRA), detector trust balancing, reachability scoring, and cross-detector consensus — producing a mathematically reproducible SPI from 0 to 100. Chain-escalated findings contribute with their escalated severity.

Step 5. AI Advisory: Chain-Aware False Positive Elimination

Top findings, together with their chain context, are sent to Gemini 2.5 Flash for verification. If the daily quota is exhausted, Groq (Llama 3.3 70B) takes over automatically. An optional local LLM mode supports air-gapped environments.

Flow: Chain + Findings → Gemini 2.5 → Groq Fallback → Chain-Aware Verdict
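The fallback behaviour can be sketched as an ordered list of providers tried in turn. Everything here is an assumption for illustration: the `QuotaExhausted` exception, the `verify_with_fallback` function, and the provider callables stand in for real API clients, which are not shown.

```python
from typing import Callable, Optional, Sequence, Tuple


class QuotaExhausted(Exception):
    """Raised by a provider stub when its quota is used up (hypothetical)."""


Provider = Tuple[str, Callable[[str], str]]


def verify_with_fallback(
    prompt: str, providers: Sequence[Provider]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (provider_name, verdict) from the first provider that answers."""
    for name, call in providers:
        try:
            return name, call(prompt)
        except QuotaExhausted:
            continue  # move on to the next provider in the list
    # No provider available: the caller keeps the deterministic results as-is.
    return None, None
```

Returning `(None, None)` rather than raising lets the rest of the pipeline finish when no AI provider is reachable.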

Step 6. Reports: PDF + HTML + JSON

PDF Executive Summary: a 7-page audit-ready document for SOC 2 readiness and cyber insurance underwriting. It includes an Attack Path Analysis section (chain visualization), an evidence appendix with source-level code context for every CRITICAL/HIGH finding, a remediation roadmap, and an attestation block.

Interactive HTML Report — Enterprise posture dashboard with SOC 2 / CIS / ISO 27001 control tags, AI analysis badges, collapsible chain cards, and reachability breakdown.

Machine-readable JSON: built for CI/CD gating, with instance_count, a chain block per finding (chain_id, chain_risk, original_severity, partner_finding_id), and a root-level attack_paths array aggregating all chains.
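A CI/CD gate over the JSON report might look like the sketch below. The chain field names come from the description above; the top-level `findings` list and the per-finding `id` and `severity` keys are assumptions about the report layout, so adjust them to the actual schema.

```python
from typing import List


def blocking_findings(report: dict) -> List[str]:
    """Return IDs of findings that are CRITICAL directly or via a chain."""
    blocked = []
    for finding in report.get("findings", []):  # "findings" key is assumed
        chain = finding.get("chain") or {}
        if (finding.get("severity") == "CRITICAL"
                or chain.get("chain_risk") == "CRITICAL"):
            blocked.append(finding.get("id"))
    return blocked


def passes_gate(report: dict, max_critical: int = 0) -> bool:
    """True when the number of blocking findings is within the allowed limit."""
    return len(blocking_findings(report)) <= max_critical
```

In a pipeline, a wrapper script could load the report with `json.load` and exit with a non-zero status when `passes_gate` returns False.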

11 Detection Engines

Each detector contributes trust-weighted findings to the unified WSPM v2.2 score. Chain Analysis correlates findings across detectors.

Bandit

Python SAST

Python-specific static analysis — injection, insecure crypto, unsafe deserialization, command execution.

Semgrep

Multi-lang SAST

Multi-language pattern matching — OWASP Top 10, custom rules, cross-file dataflow analysis.

Gitleaks

Secrets

Git history secret scanning — API keys, tokens, passwords committed at any point in history.

Secret Detector

Credentials

Entropy-based credential detection — finds secrets that pattern-matching tools miss.

CICD Analyzer

Pipeline

GitHub Actions, GitLab CI, Jenkinsfile — unpinned actions, secret exposure, privilege escalation.

IaC Scanner

Infrastructure

Kubernetes, Terraform, Docker — misconfigurations, privilege escalation, exposed ports.

Dependency Scanner

Supply Chain

CVE lookup for all dependencies — vulnerable packages, unpinned versions, supply chain risks.

License Scanner

Compliance

OSS license compliance — GPL contamination, copyleft risks, license mismatches.

Bridge Detector

Proprietary

Proprietary rule engine — cross-file correlation, contextual risk assessment, custom policies.

SAST Scanner

Deep Analysis

Advanced static analysis — taint tracking, semantic rules, CWE-aligned classification.

Slither

Smart Contracts

Solidity smart contract analysis: reentrancy, overflow, access control. This engine is an optional module.

Security Posture Index (SPI)

A calibrated score from 0 to 100 computed by WSPM v2.2: context-weighted, weighted by detector trust, and noise-normalized.
Chain-escalated findings contribute with their escalated severity, which gives a more accurate risk picture.

SPI Range	Grade	Status	Meaning
90 – 100	A	Resilient	Minimal exploitable exposure. Production-ready.
70 – 89	B	Guarded	Manageable risk. Prioritized remediation recommended.
40 – 69	C	Elevated Risk	Significant exposure. Remediation required before production.
0 – 39	D	Critical Exposure	Active risk. Immediate remediation required.

WSPM v2.2 Formula

$$\mathrm{SPI} = 100 \times e^{-\left(\sum \mathrm{WeightedExposure}\right) / K}$$

WeightedExposure factors in context (CORE / TEST / DOCS / INFRA), detector trust, reachability, and cross-detector consensus. Chain escalation updates the severity of participating findings before SPI calculation. K scales dynamically with project size to prevent noise amplification in large codebases.
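The formula translates directly into code. In this sketch the per-finding exposures and K are passed in by the caller, because how WSPM derives each weighted exposure and how K scales with project size are not specified here; the function name `spi` is illustrative.

```python
import math
from typing import Iterable


def spi(weighted_exposures: Iterable[float], k: float) -> float:
    """SPI = 100 * exp(-sum(exposures) / K).

    With non-negative exposures the result stays in (0, 100].
    """
    if k <= 0:
        raise ValueError("K must be positive")
    return 100.0 * math.exp(-sum(weighted_exposures) / k)


# No findings gives a perfect score; more exposure decays the score smoothly.
clean = spi([], k=50.0)            # 100.0
moderate = spi([10.0, 5.0], k=50.0)  # about 74.08
```

The exponential form means each extra unit of exposure lowers the score by a fixed percentage of its current value, and a larger K softens that decay.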

Context Intelligence: Findings in test/, docs/, and examples/ directories are classified as NON_RUNTIME and excluded from SPI calculation by default. Detector fixture files are recognised as SETUP context — eliminating false severity inflation from non-production code.
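A path-based classifier along these lines could implement the default exclusion. It is a sketch under assumptions: only the three directory names listed above are treated as NON_RUNTIME, the `RUNTIME` label and both function names are hypothetical, and SETUP detection for fixture files is left out.

```python
from pathlib import PurePosixPath
from typing import Dict, List

# Directory names taken from the text above; the real list may be longer.
NON_RUNTIME_DIRS = {"test", "docs", "examples"}


def classify_context(path: str) -> str:
    """Label a file NON_RUNTIME if any parent directory is in the list."""
    parents = PurePosixPath(path).parts[:-1]  # ignore the file name itself
    if any(part in NON_RUNTIME_DIRS for part in parents):
        return "NON_RUNTIME"
    return "RUNTIME"  # hypothetical label for everything else


def scored_findings(findings: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the findings that would count toward the SPI by default."""
    return [f for f in findings if classify_context(f["file"]) != "NON_RUNTIME"]
```

Matching on directory components instead of substrings keeps a file such as src/test.py in scope while still excluding test/fixtures/key.pem.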

Gate Override (v2.2.1)

If CRITICAL findings exist in production code (including chain‑escalated CRITICAL), the effective grade is capped at C — regardless of the mathematical SPI score. This resolves the cognitive dissonance of a high SPI alongside a FAIL decision for CISO and underwriter audiences.
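Combined with the grade table above, the override can be sketched like this. The thresholds come from the SPI range table; treating fractional scores with the same lower bounds, and the function name `effective_grade`, are assumptions.

```python
# Lower bound of each grade band, taken from the SPI range table.
GRADE_FLOORS = [(90, "A"), (70, "B"), (40, "C"), (0, "D")]


def effective_grade(spi_score: float, has_production_critical: bool) -> str:
    """Map an SPI score to a grade, capping at C when a CRITICAL finding exists."""
    grade = "D"
    for floor, letter in GRADE_FLOORS:
        if spi_score >= floor:
            grade = letter
            break
    # Gate Override: A and B are lowered to C; C and D are left unchanged.
    if has_production_critical and grade in ("A", "B"):
        return "C"
    return grade
```

For example, a score of 95 yields A on its own but C once a production CRITICAL (including a chain-escalated one) is present, while a score of 30 stays at D either way.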

Compliance Framework Coverage

Every finding is automatically mapped to industry-standard controls. Reports include a framework_summary block ready for direct submission to SOC 2 auditors or cyber insurance underwriters.

SOC 2 TSC

CC6.1 CC6.6 CC7.1 CC8.1 + more

CIS Controls v8

CIS-16.1 CIS-16.12 CIS-3.11 + more

ISO/IEC 27001:2022

A.8.28 A.8.26 A.5.17 + more
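One way to picture a framework summary is as a count of how often each control is triggered across findings. The sketch below assumes each finding carries a `controls` mapping from framework name to control IDs; that field name, the output shape, and the function name are illustrative, not the documented framework_summary schema.

```python
from collections import Counter
from typing import Dict, List


def framework_summary(findings: List[dict]) -> Dict[str, Dict[str, int]]:
    """Count how many findings trigger each control, grouped by framework."""
    counts: Dict[str, Counter] = {}
    for finding in findings:
        # "controls" is an assumed field: {framework name: [control IDs]}.
        for framework, controls in finding.get("controls", {}).items():
            counts.setdefault(framework, Counter()).update(controls)
    # Sort control IDs so the summary is stable between runs.
    return {fw: dict(sorted(c.items())) for fw, c in counts.items()}
```

Two findings tagged CC6.1 and one tagged CC7.1 would produce `{"SOC 2 TSC": {"CC6.1": 2, "CC7.1": 1}}`.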

This report does not constitute a formal SOC 2 audit opinion. For SOC 2 Type I/II certification, engage a licensed CPA firm. The PDF report format is designed to align with underwriting pre-assessment requirements from Marsh, Aon, At-Bay, and Coalition.

Hardware-Bound Licensing

Each license is cryptographically tied to your machine's hardware identifier. Non-transferable by design.

Step 1. Get Your Machine ID

Run this on every machine where Auditor Core will be deployed:

    python3 -c "from auditor.security.guard import AuditorGuard; print(AuditorGuard().get_machine_id())"

Step 2. Send Machine ID to DataWizual

Email your Machine ID to [email protected]. You will receive a License Key unique to that machine.

Step 3. Run start.sh: Automated Setup

The provisioning script handles everything interactively: virtual environment, dependencies, Docker PostgreSQL, and environment configuration.

    bash start.sh

Step 4. Run Your First Audit

    ./audit /path/to/project

A License Key issued for one Machine ID will not work on any other machine. Each deployment requires its own key.
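To illustrate why a key issued for one Machine ID fails on another, here is a generic sketch of binding a key to an identifier with an HMAC. This is not Auditor Core's actual licensing scheme, which is not documented here; `issue_license`, `verify_license`, and the vendor secret are hypothetical.

```python
import hashlib
import hmac


def issue_license(machine_id: str, vendor_secret: bytes) -> str:
    """Derive a key from the machine ID (illustrative scheme, vendor side)."""
    digest = hmac.new(vendor_secret, machine_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def verify_license(machine_id: str, license_key: str,
                   vendor_secret: bytes) -> bool:
    """Recompute the key for this machine and compare in constant time."""
    expected = issue_license(machine_id, vendor_secret)
    return hmac.compare_digest(expected, license_key)
```

Because the key is a function of the Machine ID, verification on any other machine recomputes a different value and fails. A production scheme would more likely use an asymmetric signature so that no signing secret has to ship with the client.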

Frequently Asked Questions

Does Auditor Core send data anywhere?

No, unless you enable AI advisory. The engine itself operates fully offline, and data is stored only in the local PostgreSQL instance. The only outbound connection is to the Gemini or Groq API, and only if AI advisory is enabled and explicitly configured by you. Local LLM mode requires no outbound calls.

What is Chain Analysis?

Chain Analysis correlates findings that together form a complete attack path (e.g., hardcoded secret + command injection). When a chain is detected, all participating findings are escalated to the chain's resulting risk level (typically CRITICAL). This prevents underreporting of correlated risks. Chains are visualized in all report formats and included in the JSON output.

What if Gemini API quota is exhausted?

Groq (Llama 3.3 70B Versatile) automatically takes over for remaining chunks — no manual intervention needed. If both are unavailable, all deterministic findings, chain analysis, SPI scoring, and reports are produced without interruption.

Can I run Auditor Core without Docker?

No. Docker is required to run the PostgreSQL database used for baseline tracking and audit history. The scanning engine itself runs in the Python virtual environment.

How is Auditor Core different from Sentinel Core?

Auditor Core is a deep audit engine — run on demand to produce comprehensive posture reports. Sentinel Core is a real-time enforcement system that intercepts every commit. Sentinel Core uses Auditor Core internally as its scanning engine.

Can I integrate JSON output into my SIEM?

Yes. The JSON report is machine-readable and designed for downstream integration with SIEM platforms, dashboards, and CI/CD quality gates. Each finding includes a chain block where applicable, and the root contains an attack_paths array aggregating all chains.
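Many SIEM platforms ingest newline-delimited JSON, so one possible integration is to flatten each finding into a single event line. The sketch assumes a top-level `findings` list and copies the chain fields named above onto each event; the function name and the flattened layout are illustrative.

```python
import json
from typing import Iterator


def to_ndjson(report: dict) -> Iterator[str]:
    """Yield one JSON line per finding, with chain fields flattened in."""
    for finding in report.get("findings", []):  # "findings" key is assumed
        chain = finding.get("chain") or {}
        event = {key: value for key, value in finding.items() if key != "chain"}
        # Findings outside any chain get explicit nulls for these fields.
        event["chain_id"] = chain.get("chain_id")
        event["chain_risk"] = chain.get("chain_risk")
        yield json.dumps(event, sort_keys=True)
```

Writing the yielded lines to a file or a log forwarder gives the SIEM one self-contained record per finding, which is usually easier to index than a nested report.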

Can the PDF report be used for SOC 2 readiness?

Yes — as supporting evidence for pre-assessment and gap analysis. The report includes SOC 2 TSC control mappings, an evidence appendix with source-level code context for every CRITICAL/HIGH finding, an attack path analysis section (chains), and a remediation roadmap. It does not constitute a formal audit opinion. For SOC 2 Type I/II certification, engage a licensed CPA firm.

Can the PDF report be submitted to cyber insurance underwriters?

Yes. The report format is designed to align with underwriting pre-assessment requirements from Marsh, Aon, At-Bay, and Coalition. The framework_summary block in JSON aggregates which controls are triggered across all findings — ready for direct submission.

How does local LLM mode work?

Set ai.mode: "local" in audit-config.yml and provide a model path (e.g., /models/your-model.gguf). Auditor Core will run AI validation entirely on your machine — no outbound calls, no data leaving your environment. Ideal for air‑gapped or high‑sensitivity deployments.