katana: next-generation crawling and spidering framework

by ddos · Published May 18, 2025 · Updated May 17, 2025

Katana

A next-generation crawling and spidering framework

Features

Fast and fully configurable web crawling
Standard and Headless mode support
JavaScript parsing / crawling
Customizable automatic form filling
Scope control – Preconfigured field / Regex
Customizable output – Preconfigured fields
INPUT – STDIN, URL and LIST
OUTPUT – STDOUT, FILE, and JSON

Crawling Mode

Standard Mode

Standard crawling mode uses the standard Go HTTP library under the hood to handle HTTP requests and responses. It is much faster because it has no browser overhead. However, it analyzes each HTTP response body as is, without any JavaScript or DOM rendering, so it can miss endpoints that only appear after the DOM is rendered, as well as asynchronous calls that complex web applications make in response to, for example, browser-specific events.
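To make the trade-off concrete, here is a minimal sketch of what analyzing a response body "as is" can look like. It is written in Python for illustration only and is not katana's implementation; the `LinkExtractor` class, the `extract_links` helper, and the sample page are all hypothetical. It collects URL-bearing attributes from static HTML and, because nothing is executed, never sees the endpoint that the script would request at runtime.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href/src/action values from static HTML (hypothetical helper)."""

    URL_ATTRS = {"href", "src", "action"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.URL_ATTRS and value:
                # Resolve relative references against the page URL.
                url = urljoin(self.base_url, value)
                if url not in self.links:
                    self.links.append(url)


def extract_links(html_text, base_url):
    """Return the URLs found in the markup, in order of first appearance."""
    parser = LinkExtractor(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.links


# Assumed sample page: two static endpoints and one built only at runtime.
SAMPLE = """
<html><body>
  <a href="/login">Login</a>
  <form action="/search"></form>
  <script>window.onload = () => fetch("/api/" + "v1" + "/users");</script>
</body></html>
"""

links = extract_links(SAMPLE, "https://example.com/")
print(links)
```

The static pass finds `/login` and `/search`, while the URL assembled inside the script stays invisible to it, since it only exists once a browser runs the page.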

Headless Mode

Headless mode hooks internal headless calls to handle HTTP requests/responses directly within the browser context. This offers two advantages:

The HTTP fingerprint (TLS and user agent) fully identifies the client as a legitimate browser.
Better coverage, since endpoints are discovered by analyzing both the standard raw response, as in the previous mode, and the browser-rendered one with JavaScript enabled.

Headless crawling is optional and can be enabled using the -headless option.

Scope Control

Crawling can be endless if it is not scoped, so katana comes with several ways to define the crawl scope.
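As an illustration of the idea, the sketch below shows one way a crawler could decide whether a discovered URL is in scope, using either a host-based rule or a regular expression. It is a Python sketch under assumptions, not katana's actual scoping logic: the `in_scope` function, its parameters, and the rule that subdomains of the seed host count as in scope are all hypothetical.

```python
import re
from urllib.parse import urlsplit


def in_scope(url, seed_url, include_pattern=None):
    """Decide whether url should be crawled (hypothetical scope rule).

    With include_pattern set, the URL must match that regular expression.
    Otherwise the URL's host must equal the seed host or be a subdomain of it.
    """
    if include_pattern is not None:
        return re.search(include_pattern, url) is not None

    seed_host = urlsplit(seed_url).hostname
    host = urlsplit(url).hostname
    if not seed_host or not host:
        return False
    # The leading dot prevents "evil-example.com" from matching "example.com".
    return host == seed_host or host.endswith("." + seed_host)


seed = "https://example.com"
print(in_scope("https://app.example.com/home", seed))
print(in_scope("https://evil-example.com/", seed))
print(in_scope("https://example.com/admin/users", seed, r"/admin/"))
```

A crawler would run a check like this on every discovered link before queueing it, which is what keeps the crawl from wandering onto unrelated sites.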
