Directory Traversal Attacks: How They Work and How to Prevent Them
A directory traversal attack (also called path traversal) lets an attacker access files and directories outside the intended web root by manipulating file path inputs with sequences like ../. This class of vulnerability is deceptively simple, yet it consistently appears in real-world breaches, where attackers use it to read /etc/passwd, configuration files containing credentials, or application source code.
Check if your application exposes sensitive paths — [scan it free with ZeriFlow](https://zeriflow.com).
How Directory Traversal Works
Web applications often use user-supplied input to construct file paths:
https://example.com/download?file=report.pdf
https://example.com/image?src=avatar.png
https://example.com/template?name=welcome

If the application appends this input directly to a base path without validation:
    # Vulnerable code
    import os

    def download_file(filename):
        base_dir = '/var/www/uploads/'
        file_path = base_dir + filename
        return open(file_path, 'rb').read()

An attacker passes ../../../../etc/passwd as the filename:
    GET /download?file=../../../../etc/passwd

The resulting path is /var/www/uploads/../../../../etc/passwd, which resolves to /etc/passwd. The server happily returns the system's user account file.
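To see why the concatenated path escapes the upload directory, here is a minimal sketch that collapses the `..` segments lexically. It uses `posixpath` rather than `os.path` so the result is the same on any operating system; the variable names are illustrative only.

```python
import posixpath

base_dir = "/var/www/uploads/"
user_input = "../../../../etc/passwd"

# Naive concatenation, as in the vulnerable handler above
joined = base_dir + user_input

# normpath collapses ".." segments lexically; with no symlinks involved,
# this is the same file the operating system would open
resolved = posixpath.normpath(joined)

print(joined)    # /var/www/uploads/../../../../etc/passwd
print(resolved)  # /etc/passwd
```

Extra `..` segments beyond the filesystem root are simply ignored, which is why attackers can over-supply them without knowing how deep the base directory is.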
URL-Encoded Variants
WAFs and basic filters often check for ../ literally. Attackers bypass these with encoding:
| Variant | Encoded Form |
|---|---|
| ../ | %2e%2e%2f |
| ../ | ..%2f |
| ../ | %2e%2e/ |
| ..\ (Windows) | ..%5c |
Double-encoding is also common: %252e%252e%252f (where %25 is the encoding of %).
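The sketch below shows why a literal `../` filter misses these variants: every one of them decodes to a traversal sequence, and the double-encoded form needs two decoding passes. The helper names `fully_decode` and `has_traversal` are hypothetical, and this is a detection illustration under those assumptions, not a substitute for canonicalizing the final path.

```python
from urllib.parse import unquote

def fully_decode(value: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the value stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value

def has_traversal(value: str) -> bool:
    """Return True if any path segment is '..' after decoding."""
    # Treat backslashes as separators so Windows-style input is covered too
    decoded = fully_decode(value).replace("\\", "/")
    return ".." in decoded.split("/")

for payload in ["%2e%2e%2f", "..%2f", "%2e%2e/", "..%5c", "%252e%252e%252f"]:
    print(payload, "->", repr(fully_decode(payload)), has_traversal(payload + "etc/passwd"))
```

Checking whole segments rather than searching for the substring `..` avoids false positives on legitimate names such as `report..pdf`.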
Real-World Path Traversal Examples
CVE-2021-41773 — Apache HTTP Server
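CVE-2021-41773 is one of the best-known recent path traversal vulnerabilities. Apache 2.4.49 introduced a flaw that let attackers bypass URL path normalization with partially encoded dot segments:
    GET /cgi-bin/.%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd HTTP/1.1

This was exploited in the wild within hours of disclosure. Apache 2.4.50 shipped a fix, but it proved incomplete (tracked as CVE-2021-42013), and the issue was fully addressed in 2.4.51, by which point exploitation attempts were already widespread.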
Zip Slip
A path traversal variant targeting archive extraction. A malicious ZIP file contains entries with paths like ../../../../etc/cron.d/malicious. When extracted, the file lands outside the intended directory and can achieve remote code execution if placed in an executable location.
Log4Shell Adjacent Exposure
While Log4Shell was primarily an RCE via JNDI injection, many affected systems also had path traversal vulnerabilities that allowed attackers to read application configuration files containing credentials used for lateral movement.
Prevention Strategies
1. Canonicalize and Validate File Paths
Always resolve the canonical (absolute) path of user-supplied input and verify it starts with your intended base directory:
    import os

    def safe_file_read(filename):
        base_dir = '/var/www/uploads'
        # Resolve to absolute canonical path
        requested_path = os.path.realpath(os.path.join(base_dir, filename))
        # Ensure the resolved path is within the base directory
        if not requested_path.startswith(os.path.realpath(base_dir) + os.sep):
            raise PermissionError('Path traversal attempt detected')
        with open(requested_path, 'rb') as f:
            return f.read()

The key: os.path.realpath() resolves all .. sequences and symlinks before you check the prefix.
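The same check can be written with `pathlib`, which compares whole path components instead of string prefixes. This is a sketch assuming Python 3.9 or later (for `Path.is_relative_to`); the function name `resolve_inside` and the temporary directory are illustrative only.

```python
import tempfile
from pathlib import Path

def resolve_inside(base_dir: str, filename: str) -> Path:
    """Return the resolved path, or raise PermissionError if it escapes base_dir."""
    base = Path(base_dir).resolve()
    candidate = (base / filename).resolve()
    # is_relative_to compares whole components, so "/srv/uploads_old"
    # is not mistaken for a child of "/srv/uploads"
    if candidate == base or not candidate.is_relative_to(base):
        raise PermissionError("Path traversal attempt detected")
    return candidate

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "report.pdf").write_bytes(b"demo")
    print(resolve_inside(tmp, "report.pdf").name)  # report.pdf
    try:
        resolve_inside(tmp, "../../etc/passwd")
    except PermissionError as exc:
        print("blocked:", exc)
```

An absolute `filename` such as `/etc/passwd` replaces the base when joined, so it is rejected by the same containment check.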
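Java equivalent:
    import java.nio.file.Path;
    import java.nio.file.Paths;

    // toRealPath() resolves symlinks in the base directory (throws IOException if it is missing)
    Path basePath = Paths.get("/var/www/uploads").toRealPath();
    // normalize() collapses ".." lexically; it does not resolve symlinks under basePath
    Path requestedPath = basePath.resolve(userInput).normalize();
    if (!requestedPath.startsWith(basePath)) {
        throw new SecurityException("Path traversal detected");
    }

2. Use Indirect File References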
Instead of accepting filenames directly, accept an index or identifier that maps to a file server-side:
    FILE_MAP = {
        '1': 'report_q1.pdf',
        '2': 'report_q2.pdf',
        '3': 'user_guide.pdf'
    }

    def download_file(file_id):
        filename = FILE_MAP.get(file_id)
        if not filename:
            raise ValueError('Invalid file ID')
        with open(f'/var/www/uploads/{filename}', 'rb') as f:
            return f.read()

No user input ever touches the file system path. This is the most robust approach.
3. Allowlist File Extensions
If you must accept filenames, validate extensions against a strict allowlist:
    import os

    ALLOWED_EXTENSIONS = {'.pdf', '.png', '.jpg', '.csv'}

    def is_safe_filename(filename):
        _, ext = os.path.splitext(filename)
        return ext.lower() in ALLOWED_EXTENSIONS

This doesn't prevent traversal on its own, but combined with path canonicalization it reduces the attack surface significantly.
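A short illustration of why the extension check cannot stand alone: a traversal payload that ends in an allowed extension passes it. The file names below are made up for the example.

```python
import os

ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".csv"}

def has_allowed_extension(filename: str) -> bool:
    _, ext = os.path.splitext(filename)
    return ext.lower() in ALLOWED_EXTENSIONS

# The check passes even though the name climbs out of the base directory
print(has_allowed_extension("../../backups/db_dump.csv"))  # True

# It does stop targets without an allowed extension
print(has_allowed_extension("../../etc/passwd"))  # False
```

The allowlist narrows what an attacker can reach, while the canonical path check is what actually keeps the request inside the base directory.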
4. Run with Minimal Permissions
Even if path traversal occurs, damage is limited if the web process can only read files it needs:
- Run the web server as a dedicated user with no shell access.
- Mount the web root with `noexec` if possible.
- Use Docker volume mounts to limit filesystem visibility.
- On Linux, use `chroot` jails or seccomp to restrict syscalls.
Server-Side Indicators and ZeriFlow Detection
ZeriFlow checks for several indicators that may signal path traversal exposure:
- Directory listing enabled: If your server returns a file listing for directories, it dramatically aids path traversal reconnaissance.
- Sensitive file headers: Responses with MIME types inconsistent with the requested resource type.
- Server version disclosure: Revealing exact server versions in headers helps attackers target known CVEs like CVE-2021-41773.
Run a free ZeriFlow security scan to identify these misconfigurations before attackers do.
Windows-Specific Considerations
Path traversal on Windows has additional complexity:
- Backslash variants: `..\..\windows\system32\drivers\etc\hosts`
- Drive letter injection: On some frameworks, absolute paths like `C:\Windows\...` bypass relative path checks.
- Case-insensitivity: Windows file systems are case-insensitive by default, so some security checks that work on Linux fail on Windows.
- 8.3 filename aliases: short names such as `PROGRA~1` for `Program Files` can bypass filename-based filters.
Always use the platform's canonical path resolution API rather than string manipulation.
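The first three pitfalls can be demonstrated without a Windows machine using Python's `ntpath` module, which applies Windows path rules on any platform. This is a lexical sketch only: it does not resolve 8.3 aliases, symlinks, or junctions, so on a real Windows host a resolving call such as `os.path.realpath` is still the better basis for the check. The function name and the `C:\uploads` base are assumptions for the example.

```python
import ntpath

def is_inside_windows(base_dir: str, user_path: str) -> bool:
    """Lexical containment check using Windows path rules."""
    base = ntpath.normcase(ntpath.normpath(base_dir))
    # ntpath.join discards base_dir when user_path is absolute or on another drive
    joined = ntpath.join(base_dir, user_path)
    # normpath collapses ".." and unifies separators; normcase lowercases
    candidate = ntpath.normcase(ntpath.normpath(joined))
    return candidate.startswith(base + "\\")

base = r"C:\uploads"
print(is_inside_windows(base, "REPORT.PDF"))              # True
print(is_inside_windows(base, r"..\..\Windows\win.ini"))  # False
print(is_inside_windows(base, "../../Windows/win.ini"))   # False
print(is_inside_windows(base, r"C:\Windows\win.ini"))     # False
```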
Zip Slip Prevention
When processing uploaded archives:
    import os
    import zipfile

    def safe_extract(zip_path, extract_to):
        # Resolve the destination once, before inspecting any entry
        real_extract_to = os.path.realpath(extract_to)
        with zipfile.ZipFile(zip_path) as zf:
            for entry in zf.namelist():
                # Resolve the target path
                target_path = os.path.realpath(os.path.join(extract_to, entry))
                if not target_path.startswith(real_extract_to + os.sep):
                    raise ValueError(f'Zip Slip detected: {entry}')
                zf.extract(entry, extract_to)

Python's zipfile strips .. components and leading slashes from member names during extract(), but many archive libraries and hand-written extraction loops do not, so validate every entry explicitly rather than relying on the library.
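To exercise a check like the one above without writing anything to disk, the sketch below builds a small archive in memory and reports which entries would land outside the destination. The helper name `find_unsafe_entries` and the `/tmp/extract_here` destination are hypothetical, and the type hints assume Python 3.9 or later.

```python
import io
import os
import zipfile

def find_unsafe_entries(zf: zipfile.ZipFile, extract_to: str) -> list[str]:
    """Return entry names that would resolve outside extract_to."""
    root = os.path.realpath(extract_to)
    unsafe = []
    for name in zf.namelist():
        target = os.path.realpath(os.path.join(root, name))
        if target != root and not target.startswith(root + os.sep):
            unsafe.append(name)
    return unsafe

# Build an archive in memory with one benign and one hostile entry
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("docs/readme.txt", "hello")
    zf.writestr("../../evil.txt", "payload")

with zipfile.ZipFile(buf) as zf:
    print(find_unsafe_entries(zf, "/tmp/extract_here"))  # ['../../evil.txt']
```

Scanning the whole archive before extracting anything means a hostile entry late in the file cannot leave a half-extracted directory behind.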
FAQ
Q: What's the difference between directory traversal and LFI?
A: Directory traversal refers to accessing files outside the intended directory via path manipulation. Local File Inclusion (LFI) is a vulnerability, most commonly seen in PHP, where user input is passed to a file-inclusion call such as include() or require(); it often leverages path traversal techniques. LFI can be more dangerous because it may execute code embedded in the included files. Directory traversal is the broader technique; LFI is a specific application of it.
Q: Can a WAF fully prevent path traversal attacks?
A: WAFs provide partial protection by blocking known traversal patterns, but sophisticated encodings and double-encoding attacks frequently bypass them. WAFs are a supplement, not a replacement for secure coding practices. Always fix the vulnerability in code; use WAF rules as a defence-in-depth layer.
Q: Is path traversal possible in cloud storage (S3, GCS)?
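A: Yes, in an equivalent form. Object stores treat keys as flat strings and do not resolve ../ themselves, but applications that construct object keys from user input can still be tricked into addressing another prefix, and proxies or client layers in front of the store may normalize paths. For example, if an app stores users/{user_id}/{filename} and user_id isn't validated, an attacker might submit a value such as ../../admin/secrets to steer the key away from their own prefix. Always validate and sanitize cloud storage key components.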
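One way to validate key components is sketched below. The allowed character set, the 128-character limit, and the function name `build_object_key` are assumptions chosen for the example, not requirements of S3 or GCS.

```python
import re

# Hypothetical policy: letters, digits, dot, dash, and underscore only
_SAFE_COMPONENT = re.compile(r"[A-Za-z0-9._-]{1,128}")

def build_object_key(user_id: str, filename: str) -> str:
    """Build 'users/{user_id}/{filename}' from validated components."""
    for part in (user_id, filename):
        # Reject separators, dot segments, and anything outside the allowlist
        if part in {".", ".."} or not _SAFE_COMPONENT.fullmatch(part):
            raise ValueError(f"Invalid key component: {part!r}")
    return f"users/{user_id}/{filename}"

print(build_object_key("42", "report.pdf"))  # users/42/report.pdf

try:
    build_object_key("../../admin", "secrets")
except ValueError as exc:
    print("rejected:", exc)
```

Because `/` is not in the allowed set, a single component can never introduce a new level in the key hierarchy.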
Q: How do I test for path traversal vulnerabilities?
A: Manual testing: try appending ../../../etc/passwd (Linux) or ..\..\..\windows\win.ini (Windows) to any file-related parameter. Automated testing: tools like Burp Suite's active scanner, OWASP ZAP, or ZeriFlow can flag potential traversal exposure automatically.
Q: Does containerization prevent path traversal?
A: Containers limit the blast radius — an attacker can only traverse within the container's filesystem. But containers typically contain application secrets, environment variables, and internal service credentials. Path traversal in a container can still lead to credential theft and lateral movement. Fix the vulnerability regardless.
Conclusion
Directory traversal is a consistently exploited vulnerability class — from high-profile Apache CVEs to supply chain attacks via Zip Slip. The fix is always the same: never trust user input when constructing file paths, use canonical path resolution before access checks, and prefer indirect file references over accepting raw filenames.
Start by understanding your exposure. [Scan your site with ZeriFlow](https://zeriflow.com) to check for directory listing, server version disclosure, and other indicators that make path traversal attacks easier. Free, instant, no account required.