Add Password Protection to PDF Files
When adding password protection to PDF files with legacy Python libraries, you hit PdfReadError or NotImplementedError because the old RC4 encryption path is deprecated or absent. The fix is migrating to pypdf 3.x and passing algorithm="AES-256" to writer.encrypt(). This page covers the exact error signature, root cause, corrected implementation, permission flags, validation, and batch patterns.
Root Cause
Legacy PyPDF2 and early pypdf releases default to RC4-40 or RC4-128. Modern PDF readers either reject these ciphers outright or flag the document as insecure. When you call .encrypt() on an outdated version, the interpreter raises:
NotImplementedError: Encryption algorithm not supported
or, when you try to read pages from an already-encrypted file without decrypting first:
pypdf.errors.PdfReadError: Stream has not been decrypted
Three specific triggers:
- PyPDF2 < 3.0.0: the .encrypt() method silently falls back to RC4-40, which current Adobe and Chrome PDF engines reject.
- pypdf < 3.0.0: the algorithm parameter did not exist; passing it raises TypeError.
- Re-encrypting an already-encrypted file without calling reader.decrypt() first: pypdf cannot parse the cross-reference table of a locked stream, so any write attempt raises PdfReadError.
Verify your installation before continuing:
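pip show pypdf | grep Version
# Should report 3.17 or higher, matching the install pins used on this page
# If you see PyPDF2 installed: pip uninstall PyPDF2
# AES also needs a crypto backend: pip install "pypdf[crypto]"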
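The same check can be scripted, for example as a CI step. This is a minimal sketch, assuming the 3.17 floor from the install pins on this page. The names meets_minimum and installed_version are hypothetical helpers, and the comparison only handles plain numeric versions such as 3.17.4.

```python
from importlib import metadata
from typing import Optional, Tuple


def _as_tuple(version: str) -> Tuple[int, ...]:
    """Turn '3.17.4' into (3, 17, 4); parsing stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "3.17") -> bool:
    """Compare dotted numeric versions component by component."""
    return _as_tuple(installed) >= _as_tuple(minimum)


def installed_version(package: str = "pypdf") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("pypdf is not installed")
    else:
        status = "ok" if meets_minimum(found) else "too old"
        print(f"pypdf {found}: {status}")
```

Comparing integer tuples avoids the string-comparison trap where "3.9" sorts after "3.17".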
Minimal Diagnostic
Confirm the failure mode against your exact file before touching production code:
# pip install pypdf
from pathlib import Path

from pypdf import PdfReader

SOURCE = Path("report.pdf")

try:
    reader = PdfReader(SOURCE)
    print(f"Encrypted: {reader.is_encrypted}")
    print(f"Pages    : {len(reader.pages)}")
except Exception as exc:
    # If PdfReadError fires here the file is encrypted and needs decrypt() first
    print(f"Open error: {type(exc).__name__}: {exc}")
If is_encrypted is True and you want to re-encrypt, call reader.decrypt(existing_password) before copying pages to the writer. If is_encrypted is False, proceed directly to the encryption step.
Fix: AES-256 Encryption with pypdf
Replace any PyPDF2 or legacy pypdf writer logic with the following:
# pip install "pypdf>=3.17"
import os
from pathlib import Path

from pypdf import PdfReader, PdfWriter

INPUT_PDF = Path("report.pdf")
SECURED_PDF = Path("report_secured.pdf")


def encrypt_pdf(
    source: Path,
    output: Path,
    user_password: str,
    owner_password: str,
    algorithm: str = "AES-256",
) -> None:
    """
    Encrypt source PDF with AES-256 and write to output.

    user_password:  required to open/view the document
    owner_password: grants full editing rights; overrides permission flags
    algorithm:      "AES-256" for PDF 2.0 compliance; "AES-128" for older
                    reader compatibility; never use "RC4-*" for new work
    """
    if not source.exists():
        raise FileNotFoundError(f"Source PDF not found: {source}")

    try:
        reader = PdfReader(source)

        # Decrypt first if the source file is already protected
        if reader.is_encrypted:
            result = reader.decrypt(os.environ.get("PDF_EXISTING_PW", ""))
            if result == 0:
                raise ValueError("Wrong existing password; cannot re-encrypt")

        writer = PdfWriter()
        for page in reader.pages:
            writer.add_page(page)

        # algorithm="AES-256" selects PDF 2.0-compliant AES-256 (pypdf 3.x+)
        writer.encrypt(
            user_password=user_password,
            owner_password=owner_password,
            algorithm=algorithm,
        )

        output.parent.mkdir(parents=True, exist_ok=True)
        with open(output, "wb") as fh:
            writer.write(fh)
        print(f"Encrypted ({algorithm}): {output}")
    except Exception as exc:
        # Re-raise so the caller decides whether to halt or continue a batch
        raise RuntimeError(f"Encryption failed for {source.name}: {exc}") from exc


if __name__ == "__main__":
    encrypt_pdf(
        INPUT_PDF,
        SECURED_PDF,
        user_password=os.environ["PDF_USER_PW"],  # never hardcode credentials
        owner_password=os.environ["PDF_OWNER_PW"],
    )
Key points on the changed lines:
- algorithm="AES-256": explicit algorithm selection; without it pypdf falls back to RC4-128 (or RC4-40 when use_128bit=False).
- reader.is_encrypted check: prevents PdfReadError when the source is already locked.
- reader.decrypt() return-value check: 0 means wrong password, 1 means user-password success, 2 means owner-password success.
- Environment variables: never embed passwords in .py files; they end up in version control.
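One way to keep the environment-variable handling in a single place is a small loader that fails early. This is a sketch and not part of pypdf: load_passwords is a hypothetical helper, and PDF_USER_PW and PDF_OWNER_PW are the variable names used in the example above.

```python
import os
from typing import Mapping, Tuple


def load_passwords(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read both passwords from the environment and reject unsafe combinations."""
    required = ("PDF_USER_PW", "PDF_OWNER_PW")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing or empty environment variables: {', '.join(missing)}")

    user_pw, owner_pw = env["PDF_USER_PW"], env["PDF_OWNER_PW"]
    if user_pw == owner_pw:
        # Identical passwords can undermine permission-flag enforcement
        raise ValueError("User and owner passwords must differ")
    return user_pw, owner_pw
```

Accepting any mapping rather than reading os.environ directly makes the function easy to exercise with a plain dict in tests.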
Variant: Adding Permission Flags
Encryption without permission flags leaves all operations open to anyone with the user password. Restrict printing, copying, and editing by passing a permissions bitmask built from UserAccessPermissions, which pypdf defines in pypdf.constants:
# pip install "pypdf>=3.17"
from pathlib import Path

from pypdf import PdfReader, PdfWriter
from pypdf.constants import UserAccessPermissions

# Allow viewing and printing; deny content copy and modification
READ_AND_PRINT = (
    UserAccessPermissions.PRINT
    | UserAccessPermissions.PRINT_TO_REPRESENTATION  # high-quality printing
)


def encrypt_with_permissions(
    source: Path,
    output: Path,
    user_pw: str,
    owner_pw: str,
    permissions: UserAccessPermissions = READ_AND_PRINT,
) -> None:
    reader = PdfReader(source)
    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)

    writer.encrypt(
        user_password=user_pw,
        owner_password=owner_pw,
        permissions_flag=permissions,  # bitmask controls what user-pw holders can do
        algorithm="AES-256",
    )

    output.parent.mkdir(parents=True, exist_ok=True)
    with open(output, "wb") as fh:
        writer.write(fh)
The owner_password bypasses all permissions_flag restrictions regardless. Set it to a different, stronger value than user_password — some PDF readers silently disable flag enforcement when both passwords are identical.
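To see what the bitmask encodes, the sketch below mirrors four permission bits from the PDF specification's access-permission table in a local IntFlag. Perm is a stand-in defined here purely for illustration and is not a pypdf class; in real code, pass the library's own flag values to permissions_flag.

```python
from enum import IntFlag


class Perm(IntFlag):
    """Local mirror of a few PDF permission bits (bit n has value 2 ** (n - 1))."""

    PRINT = 1 << 2            # bit 3: print the document
    MODIFY = 1 << 3           # bit 4: modify contents
    EXTRACT = 1 << 4          # bit 5: copy text and graphics
    PRINT_HIGH_RES = 1 << 11  # bit 12: print at full quality


READ_AND_PRINT = Perm.PRINT | Perm.PRINT_HIGH_RES


def allows(mask: Perm, action: Perm) -> bool:
    """True when every bit of `action` is set in `mask`."""
    return (mask & action) == action


print(int(READ_AND_PRINT))                   # 2052 (4 + 2048)
print(allows(READ_AND_PRINT, Perm.PRINT))    # True
print(allows(READ_AND_PRINT, Perm.EXTRACT))  # False
```

Any bit left unset is denied to user-password holders, which is why a mask built only from the two print bits blocks copying and modification.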
Variant: Batch Encryption
# pip install "pypdf>=3.17"
import os
from pathlib import Path

from pypdf import PdfReader, PdfWriter

INPUT_DIR = Path("./raw_pdfs")
OUTPUT_DIR = Path("./secured_pdfs")


def batch_encrypt(
    source_dir: Path,
    output_dir: Path,
    user_pw: str,
    owner_pw: str,
) -> None:
    output_dir.mkdir(parents=True, exist_ok=True)
    pdfs = sorted(source_dir.glob("*.pdf"))
    if not pdfs:
        print(f"No PDFs found in {source_dir}")
        return

    ok, failed = 0, 0
    for pdf in pdfs:
        out = output_dir / f"secure_{pdf.name}"
        try:
            reader = PdfReader(pdf)
            writer = PdfWriter()
            for page in reader.pages:
                writer.add_page(page)
            writer.encrypt(
                user_password=user_pw,
                owner_password=owner_pw,
                algorithm="AES-256",
            )
            with open(out, "wb") as fh:
                writer.write(fh)
            ok += 1
            print(f"  OK   {pdf.name}")
        except Exception as exc:
            # Log and continue so one bad file does not stop the batch
            failed += 1
            print(f"  ERR  {pdf.name}: {exc}")

    print(f"\nDone: {ok} encrypted, {failed} failed")


if __name__ == "__main__":
    batch_encrypt(
        INPUT_DIR,
        OUTPUT_DIR,
        user_pw=os.environ["PDF_USER_PW"],
        owner_pw=os.environ["PDF_OWNER_PW"],
    )
Verification
Confirm encryption succeeded and the password is correct before routing to downstream systems:
# pip install "pypdf>=3.17"
from pathlib import Path

from pypdf import PdfReader
from pypdf.errors import FileNotDecryptedError


def verify_encryption(file_path: Path, user_password: str) -> bool:
    """Return True if file is encrypted and decrypts cleanly with user_password."""
    try:
        reader = PdfReader(file_path)
        if not reader.is_encrypted:
            print(f"FAIL: {file_path.name} is not encrypted")
            return False

        result = reader.decrypt(user_password)
        if result == 0:
            print(f"FAIL: wrong user password for {file_path.name}")
            return False

        page_count = len(reader.pages)
        print(f"PASS: {file_path.name} is encrypted, {page_count} pages accessible")
        return True
    except FileNotDecryptedError:
        # pypdf raises this if you access .pages before calling decrypt()
        print("FAIL: FileNotDecryptedError; call reader.decrypt() before reading pages")
        return False
    except Exception as exc:
        print(f"ERROR: {exc}")
        return False
If the file is also watermarked, run this check after the security layer has been added as the final pipeline step, never before.
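For a quick self-test that does not depend on any existing file, the same checks can run against a one-page PDF built in memory. This is a sketch under assumptions: roundtrip_check is a hypothetical helper, it assumes pypdf>=3.17 with a crypto backend installed, and it returns None rather than failing when either is missing.

```python
import io
from typing import Optional


def roundtrip_check(user_pw: str, owner_pw: str) -> Optional[bool]:
    """Encrypt a blank page in memory, reopen it, and confirm it decrypts.

    Returns None when pypdf or its AES backend is not available.
    """
    try:
        from pypdf import PdfReader, PdfWriter
        from pypdf.errors import DependencyError
    except ImportError:
        return None

    writer = PdfWriter()
    writer.add_blank_page(width=72, height=72)  # 1 x 1 inch placeholder page
    buffer = io.BytesIO()
    try:
        writer.encrypt(
            user_password=user_pw,
            owner_password=owner_pw,
            algorithm="AES-256",
        )
        writer.write(buffer)
    except DependencyError:
        return None  # no crypto backend installed for AES

    buffer.seek(0)
    reader = PdfReader(buffer)
    if not reader.is_encrypted:
        return False
    if reader.decrypt(user_pw) == 0:  # 0 means the password was rejected
        return False
    return len(reader.pages) == 1


if __name__ == "__main__":
    print(roundtrip_check("user-pass", "owner-pass"))
```

Working against an in-memory buffer keeps the check free of temporary files and makes it safe to run in any environment.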
Metadata and Bookmark Preservation
By default, copying pages with add_page() does not transfer the source document's /Info metadata dictionary (author, title, subject, creation date) or the outline tree (bookmarks). If your compliance workflow requires preserving these, copy them explicitly before writing:
# pip install "pypdf>=3.17"
import os
from pathlib import Path

from pypdf import PdfReader, PdfWriter


def encrypt_preserve_metadata(
    source: Path,
    output: Path,
    user_pw: str,
    owner_pw: str,
) -> None:
    reader = PdfReader(source)
    if reader.is_encrypted:
        if reader.decrypt(os.environ.get("PDF_EXISTING_PW", "")) == 0:
            raise ValueError("Wrong existing password")

    writer = PdfWriter()
    # Clone the whole document: pages, outline (bookmark) tree, named destinations.
    # Do not combine this with an add_page() loop; the clone already has every page.
    writer.clone_document_from_reader(reader)

    # Re-apply /Info metadata explicitly in case the clone did not carry it over
    if reader.metadata:
        writer.add_metadata(dict(reader.metadata))

    writer.encrypt(user_password=user_pw, owner_password=owner_pw, algorithm="AES-256")

    output.parent.mkdir(parents=True, exist_ok=True)
    with open(output, "wb") as fh:
        writer.write(fh)
clone_document_from_reader() copies the full document structure, including the page tree, named destinations, and embedded files, so it replaces the add_page() loop rather than supplementing it. If you only need the /Info metadata and not the outline, keep the add_page() loop and call add_metadata() alone; it is faster and avoids cloning the whole document.
Integrating with the Broader PDF Pipeline
Encryption is always the terminal step. The order matters:
1. Generate or assemble content: see Generating PDF Reports Dynamically for ReportLab-based report creation.
2. Apply structural edits (merge, split, reorder): see Merging and Splitting PDF Documents.
3. Stamp watermarks (optional): overlay a ReportLab transparency layer.
4. Encrypt: call writer.encrypt() on the final composed writer.
Reversing steps 3 and 4 means the watermark step must decrypt, modify, and re-encrypt, which doubles the I/O and risks losing the encryption settings. Reversing steps 2 and 4 means every individual source file must be decrypted before merging — see Remove a Password from a PDF with Python for that pattern.
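The ordering rule can be enforced in code rather than by convention. This is a minimal sketch, assuming each stage is identified by a plain string; the stage names and the check_order helper are hypothetical and not part of any library.

```python
from typing import Sequence

# Canonical order from the list above; encryption comes last.
STAGES = ("assemble", "structure", "watermark", "encrypt")


def check_order(plan: Sequence[str]) -> None:
    """Raise ValueError if the plan names an unknown stage or runs stages out of order."""
    unknown = [stage for stage in plan if stage not in STAGES]
    if unknown:
        raise ValueError(f"Unknown stages: {unknown}")

    positions = [STAGES.index(stage) for stage in plan]
    if positions != sorted(positions):
        raise ValueError(f"Stages out of order: {list(plan)}")


check_order(["assemble", "structure", "encrypt"])  # optional watermark skipped: fine
```

Because encrypt has the highest position, any plan that passes the check necessarily ends with it when it is present.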
Common Mistakes
| Issue | Explanation | Fix |
|---|---|---|
| NotImplementedError on .encrypt() | Using PyPDF2 (unmaintained) or pypdf < 3.0 | pip install "pypdf>=3.17" and remove PyPDF2 |
| use_128bit=False parameter | Legacy keyword that selects RC4-40; it has no effect once algorithm is set | Replace with algorithm="AES-256" |
| User and owner passwords identical | Some readers ignore permission flags when passwords match | Always use distinct, different-strength passwords |
| Overwriting the source file | Writing encrypted output to INPUT_PDF corrupts the stream mid-write | Always define a separate output path |
| Encrypting before merging | Merge operations require unencrypted pages | Apply encryption as the final step after merging |
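The overwrite mistake in the table can be guarded against mechanically. This is a sketch, assuming the secure_ filename prefix from the batch example; safe_output_path is a hypothetical helper.

```python
from pathlib import Path


def safe_output_path(source: Path, output_dir: Path, prefix: str = "secure_") -> Path:
    """Build an output path and refuse one that resolves to the source file."""
    candidate = output_dir / f"{prefix}{source.name}"
    if candidate.resolve() == source.resolve():
        raise ValueError(f"Output would overwrite the source: {source}")
    return candidate
```

Resolving both paths before comparing catches the case where relative and absolute spellings point at the same file.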
Frequently Asked Questions
Why does pypdf throw PdfReadError when adding a password?
The source file is already encrypted. Call reader.decrypt(existing_password) before copying pages to the writer. Check the return value — 0 means the password is wrong.
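Can I add password protection without changing file size significantly?
Usually, yes. Encryption adds an /Encrypt dictionary, and AES adds a 16-byte initialization vector plus up to 16 bytes of padding to each encrypted string and stream, so the overhead scales with the number of objects rather than with page content. Large size increases usually indicate uncompressed streams or embedded font duplication unrelated to encryption.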
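To measure the overhead on your own files instead of relying on a general figure, compare sizes on disk. This is a small sketch; size_overhead is a hypothetical helper, and the two paths are whatever your pipeline produced.

```python
from pathlib import Path
from typing import Tuple


def size_overhead(original: Path, secured: Path) -> Tuple[int, float]:
    """Return (extra bytes, percent growth) of secured relative to original."""
    before = original.stat().st_size
    after = secured.stat().st_size
    if before == 0:
        raise ValueError(f"Original file is empty: {original}")
    extra = after - before
    return extra, 100.0 * extra / before
```

A negative result is possible, since rewriting a PDF can also compact it.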
Does encryption preserve bookmarks and metadata?
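Not when pages are copied one at a time: add_page() transfers neither the document outline (bookmarks) nor the /Info metadata. To keep them, use clone_document_from_reader() or add_metadata() as described under Metadata and Bookmark Preservation. If your compliance workflow requires stripping metadata instead, the add_page() loop already leaves the source /Info entries behind, so no extra clearing step is needed for them before calling encrypt().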
Related
- Watermarking and Securing PDFs — full guide covering visual overlays, permission flags, and batch pipelines
- Remove a Password from a PDF with Python — decrypt an authorized file before re-encrypting
- Merging and Splitting PDF Documents — complete structural edits before applying encryption