GuardDog: The Open-Source CLI Tool for Hunting Malicious Packages in npm, PyPI, and More

by Nam Phong · September 29, 2025

GuardDog is a CLI tool that identifies malicious PyPI and npm packages, Go modules, GitHub Actions, and VSCode extensions. It runs a set of heuristics on the package source code (through Semgrep rules) and on the package metadata.

GuardDog can scan local or remote PyPI and npm packages, Go modules, GitHub Actions, or VSCode extensions using any of the available heuristics.

It downloads and scans code from:

npm: Packages hosted on npmjs.org
PyPI: Source (tar.gz) packages hosted on PyPI.org
Go: Go source files of repositories hosted on GitHub.com
GitHub Actions: JavaScript source files of repositories hosted on GitHub.com
VSCode Extensions: Extension (.vsix) packages hosted on marketplace.visualstudio.com

Heuristics

GuardDog comes with 2 types of heuristics:

Source code heuristics: Semgrep rules running against the package source code.
Package metadata heuristics: Python or JavaScript heuristics running against the package metadata on PyPI or npm.

PyPI

Source code heuristics:

Heuristic	Description
shady-links	Identify when a package contains a URL to a domain with a suspicious extension
obfuscation	Identify when a package uses a common obfuscation method often used by malware
clipboard-access	Identify when a package reads or writes data from the clipboard
exfiltrate-sensitive-data	Identify when a package reads and exfiltrates sensitive data from the local system
download-executable	Identify when a package downloads and makes executable a remote binary
exec-base64	Identify when a package dynamically executes base64-encoded code
silent-process-execution	Identify when a package silently executes an executable
dll-hijacking	Identify when a malicious package manipulates a trusted application into loading a malicious DLL
steganography	Identify when a package retrieves hidden data from an image and executes it
code-execution	Identify when an OS command is executed in the setup.py file
cmd-overwrite	Identify when the ‘install’ command is overwritten in setup.py, indicating a piece of code automatically running when the package is installed
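
To make the exec-base64 pattern concrete, here is a minimal sketch of the idea. GuardDog itself expresses this as a Semgrep rule; the version below swaps in Python's standard ast module purely for illustration, and the function names (find_exec_base64, _call_name) are hypothetical, not part of GuardDog.

```python
import ast

# Assumption: we only look for exec/eval fed by a b64decode call.
EXEC_FUNCS = {"exec", "eval"}
DECODE_FUNCS = {"b64decode"}


def _call_name(node):
    """Return the trailing name of a call target, e.g. 'b64decode' for base64.b64decode(...)."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None


def find_exec_base64(source):
    """Return the line numbers where exec/eval receives base64-decoded data."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and _call_name(node) in EXEC_FUNCS:
            # Search every argument subtree for a nested b64decode call.
            if any(
                isinstance(inner, ast.Call) and _call_name(inner) in DECODE_FUNCS
                for arg in node.args
                for inner in ast.walk(arg)
            ):
                hits.append(node.lineno)
    return hits


sample = "import base64\nexec(base64.b64decode('cHJpbnQoMSk='))\nprint('ok')\n"
print(find_exec_base64(sample))
```

The sketch only parses the source and never executes it, which is the property you want when triaging a package that may be hostile.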

Metadata heuristics:

Heuristic	Description
empty_information	Identify packages with an empty description field
release_zero	Identify packages with a release version that’s 0.0 or 0.0.0
typosquatting	Identify packages that are named closely to a highly popular package
potentially_compromised_email_domain	Identify when a package maintainer e-mail domain (and therefore package manager account) might have been compromised
unclaimed_maintainer_email_domain	Identify when a package maintainer e-mail domain (and therefore PyPI account) is unclaimed and can be registered by an attacker
repository_integrity_mismatch	Identify packages with a linked GitHub repository where the package has extra unexpected files
single_python_file	Identify packages that have only a single Python file
bundled_binary	Identify packages bundling binaries
deceptive_author	Identify when an author is using a disposable email address
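
Two of these metadata checks are simple enough to sketch in a few lines. The code below is an illustration under stated assumptions, not GuardDog's implementation: the popular-package list is a hypothetical three-entry stand-in, plain Levenshtein distance with a threshold of 1 is an assumed similarity measure, and the function names are made up for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, keeping only two rows of the table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Hypothetical short list; a real check would use a much larger popularity ranking.
POPULAR = ["requests", "numpy", "urllib3"]


def possible_typosquat(name, popular=POPULAR, max_distance=1):
    """Return the popular names that `name` is suspiciously close to (but not equal to)."""
    name = name.lower()
    return [p for p in popular if p != name and edit_distance(name, p) <= max_distance]


def is_release_zero(version):
    """Mirror the release_zero description: flag versions 0.0 and 0.0.0."""
    return version in {"0.0", "0.0.0"}


print(possible_typosquat("reqests"))
print(is_release_zero("0.0.0"))
```

Note that a swapped pair of letters counts as two edits under plain Levenshtein distance, so a stricter or looser threshold changes what gets flagged.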

npm

Source code heuristics:

Heuristic	Description
npm-serialize-environment	Identify when a package serializes ‘process.env’ to exfiltrate environment variables
npm-obfuscation	Identify when a package uses a common obfuscation method often used by malware
npm-silent-process-execution	Identify when a package silently executes an executable
shady-links	Identify when a package contains a URL to a domain with a suspicious extension
npm-exec-base64	Identify when a package dynamically executes code through ‘eval’
npm-install-script	Identify when a package has a pre-install or post-install script automatically running commands
npm-steganography	Identify when a package retrieves hidden data from an image and executes it
npm-dll-hijacking	Identify when a malicious package manipulates a trusted application into loading a malicious DLL
npm-exfiltrate-sensitive-data	Identify when a package reads and exfiltrates sensitive data from the local system
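
The npm-install-script idea can be approximated by reading a package.json manifest and looking at its scripts section. This is a rough sketch rather than GuardDog's Semgrep rule; the function name and the sample manifest are hypothetical, and only the preinstall and postinstall hooks named in the description are checked.

```python
import json

# Assumption: only the two hooks mentioned in the heuristic description are checked.
INSTALL_HOOKS = ("preinstall", "postinstall")


def find_install_scripts(package_json_text):
    """Return the install-time hooks declared in a package.json document."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest used only to exercise the function.
manifest_text = '{"name": "demo", "scripts": {"test": "jest", "postinstall": "node setup.js"}}'
print(find_install_scripts(manifest_text))
```

A hit here is a reason to read the script, not proof of malice: plenty of legitimate packages build native code at install time.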

Metadata heuristics:

Heuristic	Description
empty_information	Identify packages with an empty description field
release_zero	Identify packages with a release version that’s 0.0 or 0.0.0
potentially_compromised_email_domain	Identify when a package maintainer e-mail domain (and therefore package manager account) might have been compromised. Note that npm’s API may not provide accurate information regarding the maintainer’s email, so this detector may cause false positives for npm packages. See https://www.theregister.com/2022/05/10/security_npm_email/
unclaimed_maintainer_email_domain	Identify when a package maintainer e-mail domain (and therefore npm account) is unclaimed and can be registered by an attacker. Note that npm’s API may not provide accurate information regarding the maintainer’s email, so this detector may cause false positives for npm packages. See https://www.theregister.com/2022/05/10/security_npm_email/
typosquatting	Identify packages that are named closely to a highly popular package
direct_url_dependency	Identify packages with direct URL dependencies. Dependencies fetched this way are not immutable and can be used to inject untrusted code or reduce the likelihood of a reproducible install.
npm_metadata_mismatch	Identify packages which have mismatches between the npm package manifest and the package info for some critical fields
bundled_binary	Identify packages bundling binaries
deceptive_author	Identify when an author is using a disposable email address
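
As an illustration of direct_url_dependency, the sketch below scans the dependency sections of a package.json for version specifiers that point at a URL instead of a registry version range. The prefix list, the function name, and the sample manifest are assumptions made for this example and may not match what GuardDog actually checks.

```python
import json

# Assumed set of specifier prefixes that indicate a dependency fetched from a URL.
URL_PREFIXES = ("http://", "https://", "git://", "git+", "github:")
DEPENDENCY_SECTIONS = ("dependencies", "devDependencies", "optionalDependencies")


def find_direct_url_dependencies(package_json_text):
    """Return {dependency name: specifier} for dependencies pinned to a URL."""
    manifest = json.loads(package_json_text)
    flagged = {}
    for section in DEPENDENCY_SECTIONS:
        for name, spec in (manifest.get(section) or {}).items():
            if isinstance(spec, str) and spec.startswith(URL_PREFIXES):
                flagged[name] = spec
    return flagged


# Hypothetical manifest: one registry dependency, one fetched from a tarball URL.
manifest_text = """
{
  "name": "demo",
  "dependencies": {
    "express": "^4.18.0",
    "helper": "https://example.com/helper-1.0.0.tgz"
  }
}
"""
print(find_direct_url_dependencies(manifest_text))
```

The registry dependency is left alone because its specifier is a version range, while the tarball URL is reported since whatever sits behind it can change without a version bump.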

Go

Source code heuristics:

Heuristic	Description
shady-links	Identify when a package contains a URL to a domain with a suspicious extension
go-exec-base64	Identify Base64-decoded content being passed to execution functions in Go
go-exfiltrate-sensitive-data	Identify when a package reads and exfiltrates sensitive data from the local system
go-exec-download	Identify when a package downloads a remote binary, sets executable permissions on it, and executes it
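
The shady-links heuristic appears in every ecosystem, so it is worth a quick sketch. The version below is a simplified stand-in for the Semgrep rule: it pulls URLs out of source text with a regular expression and checks the last label of the hostname against a list. The list of suspicious extensions and the function name are hypothetical placeholders chosen for the example.

```python
import re
from urllib.parse import urlparse

# Hypothetical list of top-level domains treated as suspicious in this sketch.
SUSPICIOUS_TLDS = {"xyz", "top", "tk"}

# Simplified URL matcher: stops at whitespace, quotes, angle brackets, or ')'.
URL_RE = re.compile(r"https?://[^\s'\"<>)]+")


def find_shady_links(text):
    """Return URLs in `text` whose hostname ends in a suspicious extension."""
    hits = []
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname or ""
        if host.rsplit(".", 1)[-1] in SUSPICIOUS_TLDS:
            hits.append(url)
    return hits


# Hypothetical Go snippet, scanned as plain text.
go_source = 'resp, _ := http.Get("http://updates.example.xyz/payload")\n// docs: https://go.dev/doc\n'
print(find_shady_links(go_source))
```

Because the check works on raw text, the same function applies to Python, JavaScript, or Go sources alike.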

Metadata heuristics:

Heuristic	Description
typosquatting	Identify packages that are named closely to a highly popular package

GitHub Action

Source code heuristics:

Heuristic	Description
npm-serialize-environment	Identify when a package serializes ‘process.env’ to exfiltrate environment variables
npm-obfuscation	Identify when a package uses a common obfuscation method often used by malware
npm-silent-process-execution	Identify when a package silently executes an executable
shady-links	Identify when a package contains a URL to a domain with a suspicious extension
npm-exec-base64	Identify when a package dynamically executes code through ‘eval’
npm-install-script	Identify when a package has a pre-install or post-install script automatically running commands
npm-steganography	Identify when a package retrieves hidden data from an image and executes it
npm-dll-hijacking	Identify when a malicious package manipulates a trusted application into loading a malicious DLL
npm-exfiltrate-sensitive-data	Identify when a package reads and exfiltrates sensitive data from the local system

Install & Use
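
GuardDog is published on PyPI, so a typical setup is `pip install guarddog`, after which scans are invoked as `guarddog <ecosystem> scan <target>`. The sketch below only builds such command lines from Python so they can be reviewed or handed to subprocess. The helper name is hypothetical, and the `--rules` flag for restricting a scan to specific heuristics is used on the assumption that it matches your installed version; check `guarddog --help` to confirm.

```python
import shlex


def guarddog_scan_command(ecosystem, target, rules=()):
    """Build an argument list of the form: guarddog <ecosystem> scan <target> [--rules NAME]..."""
    cmd = ["guarddog", ecosystem, "scan", target]
    for rule in rules:
        # One --rules flag per heuristic to run (assumed flag, see lead-in).
        cmd += ["--rules", rule]
    return cmd


# Print the commands instead of running them, so nothing is downloaded here.
print(shlex.join(guarddog_scan_command("pypi", "requests")))
print(shlex.join(guarddog_scan_command("npm", "express", rules=["npm-install-script"])))
```

To actually run a scan, pass the returned list to `subprocess.run(cmd)` on a machine where GuardDog is installed.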
