Penetration testing today: what it actually covers, what it misses, what needs repeating

One in three companies that receive a pentest report with zero critical findings suffers an incident within twelve months. The problem is rarely the testing vendor. It's the scope definition.

What a standard pentest actually covers

When a client commissions a penetration test, they typically receive one of three classic variants:

Black box: the tester has no credentials or documentation. Simulates an external attacker with no prior knowledge.
Grey box: limited credentials, partial architecture shared. The most common scenario for B2B web application tests.
White box: full access to source code, documentation, and test credentials. The most expensive, the most thorough.

A grey box test on a web application typically covers the OWASP Top 10 categories, with particular attention to authentication and session management, vertical and horizontal authorization (privilege escalation and access to other users' data), injection (SQL, NoSQL, command), sensitive data exposure in REST APIs, and misconfigured HTTP headers.

That's already substantial. The more useful question is what almost never makes it into scope.

Structural gaps that no annual pentest resolves

1. The attack surface keeps changing

A typical B2B SaaS application ships code every one to two weeks. A pentest run in January captures the January state. By March, three new endpoints have been added, a third-party payment provider integration has gone live, and a CSV export feature is in production. None of these have been tested.

The answer isn't running twelve pentests a year — that's not economically viable. The answer is pairing periodic pentests with a continuous DAST (Dynamic Application Security Testing) program on every deploy, integrated into the CI/CD pipeline.

A practical example with OWASP ZAP in GitHub Actions:

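- name: DAST scan
  uses: zaproxy/action-full-scan@v0.10.0
  with:
    target: 'https://staging.example.com'  # the full scan is active, so point it at staging
    rules_file_name: '.zap/rules.tsv'      # per-rule overrides such as IGNORE or FAIL
    cmd_options: '-a'                      # also include alpha-quality scan rules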

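This doesn't replace manual pentesting, but it means that new endpoints the scanner can discover are at least scanned automatically before reaching production. API endpoints that aren't linked from any page usually need an OpenAPI definition or similar input before a scanner will find them.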
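
To make the drift between pentests visible, one option is to diff the current API surface against what was in scope at the last manual test. The sketch below assumes the application publishes an OpenAPI document and that a copy was archived when the pentest ran. The function names and the two sample specs are illustrative and not part of ZAP or any other tool mentioned here.

```python
"""List API operations added since the last pentest baseline (illustrative sketch)."""

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operations(spec: dict) -> set[tuple[str, str]]:
    """Return the (METHOD, path) pairs declared under an OpenAPI `paths` object."""
    found = set()
    for path, item in spec.get("paths", {}).items():
        for method in item:
            # Path items can also hold non-operation keys such as `parameters`.
            if method.lower() in HTTP_METHODS:
                found.add((method.upper(), path))
    return found


def untested_operations(baseline: dict, current: dict) -> list[tuple[str, str]]:
    """Operations present now but absent from the spec archived at pentest time."""
    return sorted(operations(current) - operations(baseline))


# Hypothetical specs: January (pentest scope) versus March (production).
january = {"paths": {"/invoices": {"get": {}, "post": {}}}}
march = {
    "paths": {
        "/invoices": {"get": {}, "post": {}},
        "/invoices/export.csv": {"get": {}},
        "/payments/webhook": {"post": {}, "parameters": []},
    }
}

for method, path in untested_operations(january, march):
    print(f"{method} {path}")
```

The resulting list is a reasonable input for deciding whether a scoped retest is due before the next annual engagement.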

2. The cloud infrastructure layer is often out of scope

Most application pentests don't touch the underlying cloud infrastructure. IAM policies on AWS, accidentally public S3 buckets, overly permissive security groups, secrets leaking into CloudWatch logs — these are real attack surface, but they're excluded because "this is a web app test".

Covering this layer requires dedicated tooling: Prowler for AWS, ScoutSuite for multi-cloud, or a CSPM solution like AWS Security Hub for continuous findings.

A quick audit with Prowler:

prowler aws --services s3 iam cloudtrail \
  --output-formats json-asff \
  --output-directory ./prowler-reports

On an average AWS account, a run of about an hour typically surfaces dozens of findings that no application pentest would have caught.
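
Dozens of findings are only useful once they are triaged. As a minimal sketch, the snippet below counts failed findings by severity, assuming the report is a JSON array of findings in AWS Security Finding Format (ASFF) with `Severity.Label` and `Compliance.Status` fields. The sample records are invented for illustration, so check the actual file layout produced by your Prowler version before relying on it.

```python
"""Summarise failed findings by severity from an ASFF-style list (illustrative sketch)."""
from collections import Counter

SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFORMATIONAL"]


def failed_by_severity(findings: list[dict]) -> dict[str, int]:
    """Count findings whose compliance status is FAILED, grouped by severity label."""
    counts = Counter(
        f.get("Severity", {}).get("Label", "UNKNOWN")
        for f in findings
        if f.get("Compliance", {}).get("Status") == "FAILED"
    )
    # Keep the known labels in descending severity; unknown labels go last.
    ordered = [s for s in SEVERITY_ORDER if s in counts]
    ordered += sorted(s for s in counts if s not in SEVERITY_ORDER)
    return {s: counts[s] for s in ordered}


# Invented sample records; a real report would be loaded with json.load().
sample = [
    {"Title": "S3 bucket allows public read",
     "Severity": {"Label": "CRITICAL"}, "Compliance": {"Status": "FAILED"}},
    {"Title": "IAM policy grants full access",
     "Severity": {"Label": "HIGH"}, "Compliance": {"Status": "FAILED"}},
    {"Title": "Root account has MFA enabled",
     "Severity": {"Label": "HIGH"}, "Compliance": {"Status": "PASSED"}},
    {"Title": "CloudTrail log validation disabled",
     "Severity": {"Label": "MEDIUM"}, "Compliance": {"Status": "FAILED"}},
]

print(failed_by_severity(sample))
```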

3. The human factor is not testable in a standard pentest

Phishing campaigns, social engineering, vishing — these vectors are typically contracted separately (red team engagement) or excluded for organizational reasons. Yet 74% of breaches — per the Verizon DBIR 2023 — involve the human element.

If your budget doesn't allow for a full red team, consider at least a quarterly simulated phishing campaign using tools like GoPhish (open source) or managed services. The data you get — click rate, credential submission rate — are concrete operational metrics, not opinions.
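
The two metrics are simple enough to compute yourself from a campaign export. The sketch below assumes a hypothetical per-recipient record with `clicked` and `submitted` flags; it does not reproduce GoPhish's actual export format, which you would need to map onto these fields.

```python
"""Compute click and credential-submission rates for a phishing simulation (sketch)."""
from dataclasses import dataclass


@dataclass
class Recipient:
    email: str
    clicked: bool = False
    submitted: bool = False  # entered credentials on the landing page


def campaign_rates(recipients: list[Recipient]) -> dict[str, float]:
    """Return click rate and submission rate as fractions of all recipients."""
    total = len(recipients)
    if total == 0:
        return {"click_rate": 0.0, "submission_rate": 0.0}
    # Anyone who submitted must have followed the link, so count them as clicks too.
    clicks = sum(1 for r in recipients if r.clicked or r.submitted)
    submissions = sum(1 for r in recipients if r.submitted)
    return {"click_rate": clicks / total, "submission_rate": submissions / total}


# Hypothetical results for a four-person team.
results = [
    Recipient("a@example.com", clicked=True, submitted=True),
    Recipient("b@example.com", clicked=True),
    Recipient("c@example.com"),
    Recipient("d@example.com"),
]

rates = campaign_rates(results)
print(f"click rate: {rates['click_rate']:.0%}, submission rate: {rates['submission_rate']:.0%}")
```

Tracking the same two numbers quarter over quarter matters more than the absolute value of any single campaign.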

4. Third-party dependencies are not tested

Your code may be clean. But if you're using twenty npm packages, three external provider SDKs, and a chat widget in an iframe, the real surface area is far larger than what the pentest covered.

Dependency analysis is a separate process: SCA (Software Composition Analysis) with tools like Snyk, Dependabot, or OWASP Dependency-Check in the pipeline.
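
To see how far the installed surface exceeds the packages you declared, you can count both from the lockfile. This is a rough sketch assuming an npm `package-lock.json` with `lockfileVersion` 2 or 3, where a `packages` map holds the project under the empty key and every installed package under a `node_modules/...` key. The sample lockfile is invented and heavily trimmed.

```python
"""Compare declared versus installed npm packages from a lockfile (illustrative sketch)."""


def dependency_footprint(lock: dict) -> dict[str, int]:
    """Count direct, installed, and transitive packages in a package-lock.json."""
    packages = lock.get("packages", {})
    root = packages.get("", {})  # the empty key describes the project itself
    direct = set(root.get("dependencies", {})) | set(root.get("devDependencies", {}))
    installed = [key for key in packages if key != ""]
    # A top-level entry named after a declared dependency is direct;
    # everything else, including nested copies, was pulled in transitively.
    transitive = [k for k in installed if k.removeprefix("node_modules/") not in direct]
    return {
        "direct": len(direct),
        "installed": len(installed),
        "transitive": len(transitive),
    }


# Invented, trimmed lockfile: two declared packages, five installed.
lock = {
    "lockfileVersion": 3,
    "packages": {
        "": {
            "dependencies": {"express": "^4.18.0"},
            "devDependencies": {"jest": "^29.0.0"},
        },
        "node_modules/express": {},
        "node_modules/jest": {},
        "node_modules/body-parser": {},
        "node_modules/qs": {},
        "node_modules/express/node_modules/debug": {},
    },
}

print(dependency_footprint(lock))
```

An SCA tool does the real work of matching those packages against known vulnerabilities; the count only shows how much there is to match.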

What to repeat and how often

There's no universal cadence, but there's a reasonable framework:

| Activity | Recommended cadence | Additional triggers |
|---|---|---|
| Manual pentest (grey/white box) | Annual | Major feature release, significant architectural change |
| Automated DAST | Every deploy (staging) | None |
| CSPM scan (cloud posture) | Weekly / continuous | Every IaC change |
| SCA (dependencies) | Every build | None |
| Simulated phishing | Quarterly | Post-incident, significant team change |
| Manual IAM/secrets review | Twice yearly | Senior developer offboarding |

The annual pentest remains essential because human intelligence finds logical vulnerabilities that no scanner will ever catch. But it's an anchor point, not complete coverage.

The report is not the end: it's the beginning

A recurring mistake: receive the report, fix critical and high findings, close the ticket. Medium and low findings sit open for months.

Correct prioritization doesn't rely solely on the CVSS score. It relies on business context: a CVSS 5.5 finding on an endpoint handling payment data is more urgent than a CVSS 7.0 on a public documentation page.
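
One simple way to encode that judgement is to scale the CVSS score by a weight for the affected asset. The sketch below is an assumption-laden illustration: the asset classes and multipliers are hypothetical and would need calibrating for each organisation, and the finding titles are invented. It is not a standard scoring method.

```python
"""Rank findings by CVSS adjusted for business context (illustrative sketch)."""
from dataclasses import dataclass

# Hypothetical multipliers; every organisation would calibrate its own.
ASSET_WEIGHT = {"payment_data": 2.0, "internal_tool": 1.0, "public_docs": 0.5}


@dataclass
class Finding:
    title: str
    cvss: float
    asset_class: str

    @property
    def priority(self) -> float:
        """CVSS score scaled by the sensitivity of the affected asset."""
        return round(self.cvss * ASSET_WEIGHT[self.asset_class], 2)


findings = [
    Finding("Reflected XSS on docs site", 7.0, "public_docs"),
    Finding("IDOR on invoice endpoint", 5.5, "payment_data"),
]

ranked = sorted(findings, key=lambda f: f.priority, reverse=True)
for f in ranked:
    print(f"{f.priority:>5} {f.title}")
```

With these weights the 5.5 finding on the payment endpoint ranks above the 7.0 finding on the documentation page, which matches the ordering argued for above.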

Tracking pentest findings in your issue tracker (Jira, Linear, whatever you use) with defined SLAs by category is the minimum required to turn a document into a process.
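
As a rough sketch of what "SLAs by category" can look like in practice, the snippet below derives a due date from severity and flags open findings that have passed it. The SLA values, ticket IDs, and field names are hypothetical; in a real setup the records would come from your issue tracker's API or export.

```python
"""Flag pentest findings that have exceeded their remediation SLA (illustrative sketch)."""
from datetime import date, timedelta

# Hypothetical SLA in days per severity; set these to match your own policy.
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90, "low": 180}


def due_date(severity: str, reported: date) -> date:
    """Date by which a finding of the given severity should be fixed."""
    return reported + timedelta(days=SLA_DAYS[severity])


def overdue(findings: list[dict], today: date) -> list[str]:
    """IDs of open findings whose SLA deadline has passed."""
    return [
        f["id"]
        for f in findings
        if f["status"] == "open" and due_date(f["severity"], f["reported"]) < today
    ]


report_date = date(2024, 1, 15)
tickets = [
    {"id": "PT-1", "severity": "critical", "status": "closed", "reported": report_date},
    {"id": "PT-2", "severity": "medium", "status": "open", "reported": report_date},
    {"id": "PT-3", "severity": "low", "status": "open", "reported": report_date},
]

print(overdue(tickets, today=date(2024, 5, 1)))
```

A check like this can run on a schedule and post the overdue list to the team channel, so medium and low findings don't quietly age.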

Operational takeaways

Define the scope of your next pentest to explicitly include cloud infrastructure, internal APIs, and third-party integrations — not just the main web app.
Add automated DAST to your CI/CD pipeline to cover the gap between one pentest and the next.
Use CSPM for the cloud layer, which is almost always out of scope in standard pentests.
Track findings as tasks with SLAs, not as PDFs to archive.
Schedule a new pentest whenever the architecture changes significantly, not just when the calendar says so.

Security is a continuous process with periodic checkpoints, not an annual audit with a sign-off at the end.

---

Evviva Group helps partners and system integrators structure continuous security testing programs. If you're revisiting your approach to pentesting, we're happy to think it through together.