Fix docx2pdf Error on Linux
Running docx2pdf on a Linux machine raises one of two errors immediately:
NotImplementedError: docx2pdf is not implemented for linux as it requires Microsoft Word to be installed
or, if you have imported win32com directly:
ModuleNotFoundError: No module named 'win32com'
Both errors have the same root cause.
Root Cause
docx2pdf converts .docx to PDF by driving an installed copy of Microsoft Word: on Windows through COM automation via win32com.client, on macOS through osascript automation. Neither mechanism exists on Linux, and Word itself is not available there. The library contains a hard platform check that raises NotImplementedError before attempting any conversion. No configuration flag or workaround makes docx2pdf work on Linux; the dependency on Word is architectural.
If you reached this page by running a script copied from a Windows developer's machine, or by deploying to a Linux server without changing the conversion call, you need to replace docx2pdf with LibreOffice headless for the Linux execution path.
Minimal Diagnostic
The following snippet confirms the error without touching any files:
# pip install docx2pdf
import platform

print(f"Platform: {platform.system()}")  # expect 'Linux'
try:
    from docx2pdf import convert
    convert("dummy.docx")
except NotImplementedError as exc:
    print(f"Confirmed root cause: {exc}")
except Exception as exc:
    print(f"Other error: {exc}")
Expected output on Linux:
Platform: Linux
Confirmed root cause: docx2pdf is not implemented for linux as it requires Microsoft Word to be installed
Fix: Use LibreOffice Headless
LibreOffice ships a headless conversion mode via the soffice binary. It requires no Python package — call it with subprocess.
Step 1 — Install LibreOffice
# Ubuntu / Debian
sudo apt update && sudo apt install -y libreoffice
# RHEL / CentOS / Rocky
sudo yum install -y libreoffice
# Verify
soffice --version
# LibreOffice 7.x.x ...
Step 2 — Replace the docx2pdf call
# No pip package needed; requires soffice on PATH
import subprocess
from pathlib import Path

def convert_with_libreoffice(docx_path: Path, output_dir: Path) -> Path:
    """Convert a .docx file to PDF using LibreOffice headless."""
    docx_path = docx_path.resolve()  # absolute paths avoid working-directory surprises
    output_dir = output_dir.resolve()
    if not docx_path.is_file():
        raise FileNotFoundError(f"Input not found: {docx_path}")
    output_dir.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        [
            "soffice",
            "--headless",
            "--convert-to", "pdf",  # target format
            "--outdir", str(output_dir),  # where the PDF lands
            str(docx_path),
        ],
        capture_output=True,
        text=True,
        timeout=120,  # seconds; increase for large documents
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"LibreOffice conversion failed:\n{result.stderr.strip()}"
        )
    pdf_path = output_dir / (docx_path.stem + ".pdf")
    if not pdf_path.exists():
        raise FileNotFoundError(f"Expected output not found: {pdf_path}")
    return pdf_path

# Usage
input_file = Path("documents/report.docx")
output_dir = Path("output_pdfs")
try:
    pdf = convert_with_libreoffice(input_file, output_dir)
    print(f"Converted: {pdf}")
except FileNotFoundError as exc:
    print(f"Missing file: {exc}")  # input, output, or the soffice binary itself
except RuntimeError as exc:
    print(f"Conversion error: {exc}")
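One failure mode the usage above does not catch: when soffice exceeds the timeout, subprocess.run raises subprocess.TimeoutExpired instead of returning a non-zero exit code. A minimal sketch of folding that case into the same RuntimeError path follows. The helper name run_conversion_command is hypothetical and not part of any library.

```python
import subprocess
import sys

def run_conversion_command(cmd: list[str], timeout: float = 120) -> str:
    """Run cmd and return its stdout; raise RuntimeError on timeout or non-zero exit.

    Hypothetical helper for illustration only.
    """
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run has already killed the child process at this point.
        raise RuntimeError(f"{cmd[0]} timed out after {exc.timeout} s") from exc
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or f"{cmd[0]} exited with {result.returncode}")
    return result.stdout
```

With this in place, a single except RuntimeError clause covers both a failed and a hung conversion.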
Cross-Platform Wrapper
If your codebase runs on both Windows/macOS (where docx2pdf is correct) and Linux/CI servers (where LibreOffice is correct), use this wrapper instead of calling either library directly. The full batch workflow using this pattern is in Converting DOCX to PDF with Python.
# pip install docx2pdf (Windows/macOS only; harmless to install on Linux)
# Linux requires: sudo apt install libreoffice
import platform
import subprocess
import tempfile
from pathlib import Path

_SYSTEM = platform.system()

def convert_docx_to_pdf(docx_path: Path, output_dir: Path) -> Path:
    """
    Convert a single .docx to PDF.
    - Windows / macOS: uses docx2pdf (requires Microsoft Word)
    - Linux / other: uses LibreOffice headless (requires soffice on PATH)
    """
    docx_path = docx_path.resolve()
    output_dir = output_dir.resolve()
    output_dir.mkdir(parents=True, exist_ok=True)
    pdf_path = output_dir / (docx_path.stem + ".pdf")
    if _SYSTEM in ("Windows", "Darwin"):
        # Windows/macOS path: drives Microsoft Word via COM / AppleScript
        from docx2pdf import convert  # noqa: PLC0415
        convert(docx_path, pdf_path)
    else:
        # Linux / server path: drives LibreOffice headless
        _soffice_convert(docx_path, output_dir)
    if not pdf_path.exists():
        raise FileNotFoundError(f"Conversion produced no output at {pdf_path}")
    return pdf_path

def _soffice_convert(docx_path: Path, output_dir: Path) -> None:
    """Internal: run soffice with an isolated user profile to allow parallelism."""
    with tempfile.TemporaryDirectory() as tmp:
        profile = Path(tmp) / "lo_profile"
        profile.mkdir()
        result = subprocess.run(
            [
                "soffice",
                f"-env:UserInstallation=file://{profile}",  # isolate profile
                "--headless",
                "--convert-to", "pdf",
                "--outdir", str(output_dir),
                str(docx_path),
            ],
            capture_output=True,
            text=True,
            timeout=120,
        )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or "soffice returned non-zero exit code")

# Example usage
if __name__ == "__main__":
    try:
        out = convert_docx_to_pdf(
            docx_path=Path("documents/contract.docx"),
            output_dir=Path("output_pdfs"),
        )
        print(f"Success: {out}")
    except Exception as exc:
        print(f"Failed: {exc}")
Variant Fix A: soffice Not Found on PATH
If LibreOffice is installed but soffice is not on your PATH, the subprocess call raises FileNotFoundError: [Errno 2] No such file or directory: 'soffice'.
# No pip install needed: stdlib only
import shutil
from pathlib import Path

def find_soffice() -> str:
    """Return the absolute path to soffice, or raise if not found."""
    candidate = shutil.which("soffice") or shutil.which("libreoffice")
    if candidate:
        return candidate
    # Common non-PATH install locations on Linux
    for fallback in [
        "/usr/bin/soffice",
        "/usr/lib/libreoffice/program/soffice",
        "/opt/libreoffice/program/soffice",
    ]:
        if Path(fallback).exists():
            return fallback
    raise FileNotFoundError(
        "soffice not found. Install LibreOffice:\n"
        "  Ubuntu/Debian: sudo apt install libreoffice\n"
        "  RHEL/CentOS: sudo yum install libreoffice"
    )

soffice_bin = find_soffice()
print(f"Using: {soffice_bin}")
Then substitute soffice_bin for "soffice" in the subprocess.run call above.
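One way to make that substitution is to build the argument list in a small helper that takes the binary path as a parameter. This is only a sketch; build_soffice_command is a hypothetical name, not part of LibreOffice or any package.

```python
from pathlib import Path

def build_soffice_command(soffice_bin: str, docx_path: Path, output_dir: Path) -> list[str]:
    """Return the argument list for a headless DOCX-to-PDF conversion."""
    return [
        soffice_bin,                   # e.g. the value returned by find_soffice()
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(output_dir),
        str(docx_path),
    ]
```

Pass the returned list to subprocess.run with the same capture_output, text, and timeout arguments used earlier.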
Variant Fix B: LibreOffice Profile Lock / tmp Permission Issues
On headless servers, multiple processes sharing a single LibreOffice user profile directory cause:
[Java framework] Error in function createSettingsDocument (elements.cxx)
user installation could not be completed
Or a conversion appears to succeed (exit code 0) but produces no PDF. The root cause is that LibreOffice tries to write its configuration to ~/.config/libreoffice/ and either the directory is locked by another soffice process or the running user has no home directory (common in Docker containers).
The fix is already shown in the cross-platform wrapper above: pass -env:UserInstallation=file://… pointing to a fresh tempfile.TemporaryDirectory() for each conversion. This makes every soffice invocation fully independent.
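Because each invocation is independent, conversions on the LibreOffice path can run side by side. Below is a minimal sketch using a thread pool, assuming a converter with the same signature as convert_docx_to_pdf is passed in as the convert argument. The name convert_many and the default max_workers=4 are illustrative choices, not tuned values.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable

def convert_many(
    docx_paths: list[Path],
    output_dir: Path,
    convert: Callable[[Path, Path], Path],
    max_workers: int = 4,
) -> tuple[list[Path], dict[Path, Exception]]:
    """Run convert(docx, output_dir) for each input concurrently.

    Returns (successful PDF paths, {failed input path: exception}).
    """
    done: list[Path] = []
    failed: dict[Path, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(convert, p, output_dir): p for p in docx_paths}
        for future in as_completed(futures):
            src = futures[future]
            try:
                done.append(future.result())
            except Exception as exc:  # one bad document should not stop the batch
                failed[src] = exc
    return done, failed
```

Threads are sufficient here because the heavy work happens in the soffice child processes. Note that inputs sharing a file stem would write to the same output PDF, so give them separate output directories.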
For Docker environments where ~ does not exist, also set:
ENV HOME=/tmp
RUN mkdir -p /tmp/.config/libreoffice
Or use the -env:UserInstallation flag with an explicit writable path rather than relying on $HOME.
Variant Fix C: GitHub Actions and CI/CD Pipelines
docx2pdf is especially problematic in CI because GitHub-hosted ubuntu-latest runners are Linux, and GitHub-hosted Windows runners do not have Microsoft Word installed. The correct approach is to install LibreOffice on the CI runner and use the cross-platform wrapper shown above.
Add to your workflow .yml:
# .github/workflows/convert.yml
name: Convert DOCX to PDF
on:
  push:
    paths:
      - 'documents/**/*.docx'
jobs:
  convert:
    runs-on: ubuntu-latest  # Linux runner; docx2pdf will NOT work here
    steps:
      - uses: actions/checkout@v4
      - name: Install LibreOffice
        run: |
          sudo apt-get update
          # Pre-accept the Microsoft fonts EULA so the install cannot block on a prompt
          echo "ttf-mscorefonts-installer msttcorefonts/accepted-mscorefonts-eula select true" | sudo debconf-set-selections
          sudo apt-get install -y libreoffice ttf-mscorefonts-installer
          sudo fc-cache -f -v
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install Python deps
        run: pip install pypdf
      - name: Convert documents
        run: python scripts/batch_convert.py documents/ output_pdfs/
      - name: Upload PDFs
        uses: actions/upload-artifact@v4
        with:
          name: converted-pdfs
          path: output_pdfs/
Key points:
- Do not install docx2pdf in the CI pip install step: it imports cleanly on Linux but raises NotImplementedError at runtime. If other parts of your codebase import it, guard the import with if platform.system() != "Linux".
- ttf-mscorefonts-installer installs Arial, Times New Roman, Courier New, and other common fonts. Without it, LibreOffice substitutes its own fonts and text reflows.
- For self-hosted runners on Windows, switch to docx2pdf and ensure Word is installed on the runner machine.
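The import guard from the first point can be sketched as follows. load_docx2pdf_convert is a hypothetical helper name, and the optional system parameter exists only so the function can be exercised on any platform.

```python
import platform
from typing import Callable, Optional

def load_docx2pdf_convert(system: Optional[str] = None) -> Optional[Callable]:
    """Return docx2pdf.convert on Windows/macOS, or None where it cannot work."""
    system = system or platform.system()
    if system not in ("Windows", "Darwin"):
        return None  # Linux and others: caller should fall back to LibreOffice
    try:
        from docx2pdf import convert  # imported lazily, only where Word can exist
    except ImportError:
        return None  # package not installed in this environment
    return convert
```

Callers check the return value for None and take the LibreOffice path in that case, so docx2pdf never has to be installed on the CI runner.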
Variant Fix D: Checking LibreOffice Font Rendering
Even after installing ttf-mscorefonts-installer, some documents use fonts not in that package (Calibri, Cambria, Wingdings, custom brand fonts). LibreOffice silently substitutes them, which shifts text and can cause overflows or empty pages.
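Diagnose font substitution by checking which fonts fontconfig exposes, since LibreOffice on Linux relies on fontconfig for system fonts:
# List every font family visible to fontconfig
fc-list : family | sort -u
# Check for one specific family
fc-list | grep -i "arial"
# Show which installed font would be used when a document requests a given family
fc-match "Calibri"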
To add a custom font on a server:
# Copy the .ttf or .otf file
sudo mkdir -p /usr/share/fonts/custom
sudo cp BrandFont-Regular.ttf /usr/share/fonts/custom/
sudo fc-cache -f -v
# Verify it is now visible
fc-list | grep BrandFont
Then re-run the conversion. For production pipelines that must guarantee exact font rendering, consider running conversions on a macOS or Windows CI runner where docx2pdf can use Word's own rendering engine.
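The fc-list check can also be scripted so a pipeline fails early when a required font is absent. This is a rough sketch, assuming fc-list is on PATH and that the family names you pass match what fontconfig reports. The names parse_families, installed_font_families, and missing_fonts are hypothetical.

```python
import subprocess

def parse_families(fc_list_output: str) -> set[str]:
    """Parse `fc-list : family` output into a set of lowercase family names."""
    families: set[str] = set()
    for line in fc_list_output.splitlines():
        # A single line may list several comma-separated names for one font.
        for name in line.split(","):
            name = name.strip().lower()
            if name:
                families.add(name)
    return families

def installed_font_families() -> set[str]:
    """Ask fontconfig for every installed family (requires fc-list on PATH)."""
    result = subprocess.run(
        ["fc-list", ":", "family"], capture_output=True, text=True, check=True
    )
    return parse_families(result.stdout)

def missing_fonts(required: list[str], installed: set[str]) -> list[str]:
    """Return the required families that are not installed, preserving order."""
    return [name for name in required if name.lower() not in installed]
```

A typical call is missing_fonts(["Arial", "BrandFont"], installed_font_families()); a non-empty result means the conversion would substitute at least one font.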
Troubleshooting Quick Reference
| Symptom | Likely cause | Fix |
|---|---|---|
| NotImplementedError: docx2pdf is not implemented for linux | Hard platform check in docx2pdf | Replace with LibreOffice headless (Step 2 above) |
| ModuleNotFoundError: No module named 'win32com' | win32com is a Windows-only package | Same: switch to LibreOffice on Linux |
| FileNotFoundError: soffice | LibreOffice not installed or not on PATH | sudo apt install libreoffice; use find_soffice() (Variant A) |
| user installation could not be completed | LibreOffice profile dir locked or missing | Use -env:UserInstallation with a temp dir (Variant B) |
| PDF produced but has wrong fonts / text overflow | Required fonts not installed on server | Install ttf-mscorefonts-installer and custom fonts (Variant D) |
| Exit code 0 but no PDF file in outdir | soffice failed silently (corrupt docx, unwritable profile dir) | Check stderr; isolate the profile (Variant B); validate source .docx with python-docx first |
| GitHub Actions CI fails on ubuntu runner | CI uses Linux, docx2pdf hard-fails | Install LibreOffice on runner; use cross-platform wrapper (Variant C) |
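For the "Exit code 0 but no PDF" row, a cheap pre-flight check can catch corrupt inputs before soffice runs. The table suggests python-docx; the sketch below swaps in a lighter structural test that uses only the standard library. It relies on the fact that a .docx file is a ZIP archive whose main document part is conventionally word/document.xml. looks_like_docx is a hypothetical helper name, and passing this check does not guarantee that LibreOffice can render the file.

```python
import zipfile
from pathlib import Path

def looks_like_docx(path: Path) -> bool:
    """Return True if path is a readable ZIP archive containing word/document.xml."""
    try:
        with zipfile.ZipFile(path) as archive:
            if archive.testzip() is not None:  # some member failed its CRC check
                return False
            return "word/document.xml" in archive.namelist()
    except (zipfile.BadZipFile, OSError):
        # Not a ZIP at all, or the file is missing or unreadable.
        return False
```

Running this before each conversion lets a batch job skip or report damaged files instead of waiting for a silent failure.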
Verification
After applying the fix, convert a real document and check that the resulting PDF contains pages:
# pip install pypdf
from pathlib import Path
from pypdf import PdfReader
from your_module import convert_docx_to_pdf  # the wrapper above

docx = Path("documents/test.docx")
out_dir = Path("output_pdfs")
try:
    pdf = convert_docx_to_pdf(docx, out_dir)
    reader = PdfReader(pdf)
    print(f"PDF pages: {len(reader.pages)}")  # should be > 0
    assert len(reader.pages) > 0, "Empty PDF: conversion failed silently"
    print("Conversion verified.")
except AssertionError as exc:
    print(f"Verification failed: {exc}")
except Exception as exc:
    print(f"Error: {exc}")
A page count greater than zero confirms that LibreOffice produced a readable PDF. It does not prove that layout or fonts match the original, so spot-check a sample visually. If you are generating the source .docx files programmatically, see Automating Word Document Creation for the templating patterns that pair with this conversion step. For building PDFs from data without the .docx intermediate, see Generating PDF Reports Dynamically.
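Where installing pypdf is not an option, a dependency-free sanity check is to confirm that the output starts with the PDF header bytes. This is a weaker test than counting pages, and has_pdf_header is a hypothetical helper name.

```python
from pathlib import Path

def has_pdf_header(pdf_path: Path) -> bool:
    """Return True if the file exists and begins with the '%PDF-' signature."""
    try:
        with pdf_path.open("rb") as fh:
            return fh.read(5) == b"%PDF-"
    except OSError:
        # Missing or unreadable file counts as a failed check.
        return False
```

An empty or truncated-to-zero output file fails this check, which covers the most common silent failure.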
Related
- Converting DOCX to PDF with Python — full guide covering both engines, batch conversion, font fidelity, and Docker deployment
- Automating Word Document Creation — create the .docx source files before converting them
- Generating PDF Reports Dynamically — alternative: generate PDFs directly from data using ReportLab or WeasyPrint
Part of Converting DOCX to PDF with Python.