Fix docx2pdf Error on Linux

Running docx2pdf on a Linux machine raises one of two errors immediately:

NotImplementedError: docx2pdf is not implemented for linux as it requires Microsoft Word to be installed

or, if you have imported win32com directly:

ModuleNotFoundError: No module named 'win32com'

Both errors have the same root cause.

Root Cause

docx2pdf does not render documents itself. It drives an installed copy of Microsoft Word: on Windows through COM automation (win32com.client), and on macOS through application scripting. Neither mechanism exists on Linux, and Word itself is not available there. The library contains a hard platform check that raises NotImplementedError before attempting any conversion. There is no configuration flag or workaround that makes docx2pdf work on Linux; the dependency on Word is architectural.
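The shape of that platform check can be sketched as follows. This is a simplified illustration, not docx2pdf's actual source; the function name pick_backend and the backend labels are hypothetical.

```python
import sys


def pick_backend(platform_name: str = sys.platform) -> str:
    """Return a label for the Word automation backend (hypothetical names)."""
    if platform_name == "win32":
        return "word-via-com"  # Windows: COM automation through win32com
    if platform_name == "darwin":
        return "word-via-macos-script"  # macOS: scripted control of the Word app
    # Every other platform, including Linux, has no Word to drive
    raise NotImplementedError(
        f"no Microsoft Word automation backend for platform {platform_name!r}"
    )


if __name__ == "__main__":
    try:
        print(pick_backend())
    except NotImplementedError as exc:
        print(f"Unsupported: {exc}")
```

Because the failure comes from the dispatch itself, no argument passed to the conversion call can change the outcome on Linux.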

If you reached this page by running a script copied from a Windows developer's machine, or by deploying to a Linux server without changing the conversion call, you need to replace docx2pdf with LibreOffice headless for the Linux execution path.

Minimal Diagnostic

The following snippet confirms the error without touching any files:

# pip install docx2pdf
import platform

print(f"Platform: {platform.system()}")   # expect 'Linux'

try:
    from docx2pdf import convert
    convert("dummy.docx")
except NotImplementedError as exc:
    print(f"Confirmed root cause: {exc}")
except Exception as exc:
    print(f"Other error: {exc}")

Expected output on Linux:

Platform: Linux
Confirmed root cause: docx2pdf is not implemented for linux as it requires Microsoft Word to be installed

Fix: Use LibreOffice Headless

LibreOffice ships a headless conversion mode via the soffice binary. It requires no Python package — call it with subprocess.

Step 1 — Install LibreOffice

# Ubuntu / Debian
sudo apt update && sudo apt install -y libreoffice

# RHEL / CentOS / Rocky
sudo yum install -y libreoffice

# Verify
soffice --version
# LibreOffice 7.x.x ...

Step 2 — Replace the docx2pdf call

# No pip package needed — requires soffice on PATH
import subprocess
from pathlib import Path

def convert_with_libreoffice(docx_path: Path, output_dir: Path) -> Path:
    """Convert a .docx file to PDF using LibreOffice headless."""
    docx_path = docx_path.resolve()   # absolute path, independent of the working directory
    output_dir = output_dir.resolve()
    output_dir.mkdir(parents=True, exist_ok=True)

    result = subprocess.run(
        [
            "soffice",
            "--headless",
            "--convert-to", "pdf",  # target format
            "--outdir", str(output_dir),  # where the PDF lands
            str(docx_path),
        ],
        capture_output=True,
        text=True,
        timeout=120,  # seconds; increase for large documents
    )

    if result.returncode != 0:
        raise RuntimeError(
            f"LibreOffice conversion failed:\n{result.stderr.strip()}"
        )

    pdf_path = output_dir / (docx_path.stem + ".pdf")
    if not pdf_path.exists():
        raise FileNotFoundError(f"Expected output not found: {pdf_path}")

    return pdf_path


# Usage
input_file = Path("documents/report.docx")
output_dir = Path("output_pdfs")

try:
    pdf = convert_with_libreoffice(input_file, output_dir)
    print(f"Converted: {pdf}")
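except FileNotFoundError as exc:
    # Raised when soffice is not on PATH or when no PDF was written
    print(f"File not found: {exc}")
except subprocess.TimeoutExpired as exc:
    print(f"Conversion timed out: {exc}")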
except RuntimeError as exc:
    print(f"Conversion error: {exc}")
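convert_with_libreoffice does not check the input before calling soffice, so a missing source file surfaces only indirectly, as a soffice error or a missing output file. A small pre-check makes that case explicit. This is a minimal sketch; the helper name check_docx_input is hypothetical and not part of any library.

```python
from pathlib import Path


def check_docx_input(docx_path: Path) -> Path:
    """Return the resolved path, or raise if it is not an existing .docx file."""
    resolved = docx_path.resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"Input file does not exist: {resolved}")
    if resolved.suffix.lower() != ".docx":
        raise ValueError(f"Expected a .docx file, got: {resolved.name}")
    return resolved
```

Calling it before the conversion keeps "input missing" and "output missing" as two distinct errors.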

Cross-Platform Wrapper

If your codebase runs on both Windows/macOS (where docx2pdf is correct) and Linux/CI servers (where LibreOffice is correct), use this wrapper instead of calling either library directly. The full batch workflow using this pattern is in Converting DOCX to PDF with Python.

# pip install docx2pdf   (Windows/macOS only; harmless to install on Linux)
# Linux requires: sudo apt install libreoffice
import platform
import subprocess
import tempfile
from pathlib import Path

_SYSTEM = platform.system()


def convert_docx_to_pdf(docx_path: Path, output_dir: Path) -> Path:
    """
    Convert a single .docx to PDF.

    - Windows / macOS: uses docx2pdf (requires Microsoft Word)
    - Linux / other:   uses LibreOffice headless (requires soffice on PATH)
    """
    docx_path = docx_path.resolve()
    output_dir = output_dir.resolve()
    output_dir.mkdir(parents=True, exist_ok=True)
    pdf_path = output_dir / (docx_path.stem + ".pdf")

    if _SYSTEM in ("Windows", "Darwin"):
        # Windows/macOS path — drives Microsoft Word via COM / AppleScript
        from docx2pdf import convert  # noqa: PLC0415
        convert(docx_path, pdf_path)
    else:
        # Linux / server path — drives LibreOffice headless
        _soffice_convert(docx_path, output_dir)

    if not pdf_path.exists():
        raise FileNotFoundError(f"Conversion produced no output at {pdf_path}")

    return pdf_path


def _soffice_convert(docx_path: Path, output_dir: Path) -> None:
    """Internal: run soffice with an isolated user profile to allow parallelism."""
    with tempfile.TemporaryDirectory() as tmp:
        profile = Path(tmp) / "lo_profile"
        profile.mkdir()
        result = subprocess.run(
            [
                "soffice",
                f"-env:UserInstallation={profile.as_uri()}",  # isolate profile; as_uri() yields a valid file:// URI
                "--headless",
                "--convert-to", "pdf",
                "--outdir", str(output_dir),
                str(docx_path),
            ],
            capture_output=True,
            text=True,
            timeout=120,
        )
        if result.returncode != 0:
            raise RuntimeError(result.stderr.strip() or "soffice returned non-zero exit code")


# Example usage
if __name__ == "__main__":
    try:
        out = convert_docx_to_pdf(
            docx_path=Path("documents/contract.docx"),
            output_dir=Path("output_pdfs"),
        )
        print(f"Success: {out}")
    except Exception as exc:
        print(f"Failed: {exc}")
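Because each soffice call gets its own profile, several conversions can run at once on the LibreOffice path. The following is a rough sketch of that idea, assuming a converter with the same signature as convert_docx_to_pdf; the name convert_many and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable


def convert_many(
    docx_paths: Iterable[Path],
    output_dir: Path,
    convert: Callable[[Path, Path], Path],
    max_workers: int = 4,
) -> tuple[dict[Path, Path], dict[Path, Exception]]:
    """Run `convert` for each input in a thread pool.

    Returns (successes, failures): input -> PDF path, and input -> exception.
    """
    successes: dict[Path, Path] = {}
    failures: dict[Path, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(convert, p, output_dir): p for p in docx_paths}
        for future in as_completed(futures):
            source = futures[future]
            try:
                successes[source] = future.result()
            except Exception as exc:  # keep going; report errors per file
                failures[source] = exc
    return successes, failures
```

A call might look like convert_many(Path("documents").glob("*.docx"), Path("output_pdfs"), convert_docx_to_pdf). Threads are enough here because the work happens in the soffice subprocess. Two inputs with the same file stem would overwrite each other in a shared output directory, so give them separate directories if that can occur.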

Variant Fix A: soffice Not Found on PATH

If LibreOffice is installed but soffice is not on your PATH, the subprocess call raises FileNotFoundError: [Errno 2] No such file or directory: 'soffice'.

# No pip install needed: standard library only
import shutil
from pathlib import Path

def find_soffice() -> str:
    """Return the absolute path to soffice, or raise if not found."""
    candidate = shutil.which("soffice") or shutil.which("libreoffice")
    if candidate:
        return candidate

    # Common non-PATH install locations on Linux
    for fallback in [
        "/usr/bin/soffice",
        "/usr/lib/libreoffice/program/soffice",
        "/opt/libreoffice/program/soffice",
    ]:
        if Path(fallback).exists():
            return fallback

    raise FileNotFoundError(
        "soffice not found. Install LibreOffice:\n"
        "  Ubuntu/Debian: sudo apt install libreoffice\n"
        "  RHEL/CentOS:   sudo yum install libreoffice"
    )

soffice_bin = find_soffice()
print(f"Using: {soffice_bin}")

Then substitute soffice_bin for "soffice" in the subprocess.run call above.
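One way to make that substitution without editing string literals is to build the argument list in a small helper that takes the binary path as a parameter. This is a sketch only; build_soffice_command is a hypothetical name, and the flags are the same ones used in the conversion function above.

```python
from pathlib import Path


def build_soffice_command(
    soffice_bin: str, docx_path: Path, output_dir: Path
) -> list[str]:
    """Return the argument list for a headless DOCX-to-PDF conversion."""
    return [
        soffice_bin,                  # e.g. the value returned by find_soffice()
        "--headless",
        "--convert-to", "pdf",        # target format
        "--outdir", str(output_dir),  # where the PDF lands
        str(docx_path),
    ]
```

The result can be passed straight to subprocess.run in place of the inline list.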

Variant Fix B: LibreOffice Profile Lock / tmp Permission Issues

On headless servers, multiple processes sharing a single LibreOffice user profile directory cause:

[Java framework] Error in function createSettingsDocument (elements.cxx)
user installation could not be completed

Alternatively, the conversion appears to succeed (exit code 0) but produces no PDF. In both cases LibreOffice is trying to write its configuration to ~/.config/libreoffice/, and either that directory is locked by another soffice process or the running user has no writable home directory (common in Docker containers).

The fix is already shown in the cross-platform wrapper above: pass -env:UserInstallation=file://… pointing to a fresh tempfile.TemporaryDirectory() for each conversion. This makes every soffice invocation fully independent.

For Docker environments where ~ does not exist, also set:

ENV HOME=/tmp
RUN mkdir -p /tmp/.config/libreoffice

Or use the -env:UserInstallation flag with an explicit writable path rather than relying on $HOME.
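The flag value has to be a file URI, not a bare path. A minimal sketch of building it from a directory follows; the helper name user_installation_arg is hypothetical, and Path.as_uri() is used so that spaces and similar characters are percent-encoded.

```python
import tempfile
from pathlib import Path


def user_installation_arg(profile_dir: Path) -> str:
    """Build the -env:UserInstallation argument for a given profile directory."""
    # as_uri() needs an absolute path and percent-encodes unsafe characters
    return f"-env:UserInstallation={profile_dir.resolve().as_uri()}"


if __name__ == "__main__":
    # The prefix contains spaces on purpose, to show the encoding
    with tempfile.TemporaryDirectory(prefix="lo profile ") as tmp:
        print(user_installation_arg(Path(tmp)))
```

Any directory the process can write to works, which is why this approach does not depend on $HOME.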

Variant Fix C: GitHub Actions and CI/CD Pipelines

docx2pdf is especially problematic in CI environments: the default GitHub-hosted runner (ubuntu-latest) is Linux, and GitHub-hosted Windows runners do not have Microsoft Word installed. The correct approach is to install LibreOffice on the CI runner and use the cross-platform wrapper shown above.

Add to your workflow .yml:

# .github/workflows/convert.yml
name: Convert DOCX to PDF

on:
  push:
    paths:
      - 'documents/**/*.docx'

jobs:
  convert:
    runs-on: ubuntu-latest  # Linux runner; docx2pdf will NOT work here

    steps:
      - uses: actions/checkout@v4

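      - name: Install LibreOffice
        run: |
          sudo apt-get update
          # Pre-accept the Microsoft core fonts EULA so the install cannot block on a prompt
          echo "ttf-mscorefonts-installer msttcorefonts/accepted-mscorefonts-eula select true" | sudo debconf-set-selections
          sudo apt-get install -y libreoffice ttf-mscorefonts-installer
          sudo fc-cache -f -v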

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install Python deps
        run: pip install pypdf

      - name: Convert documents
        run: python scripts/batch_convert.py documents/ output_pdfs/

      - name: Upload PDFs
        uses: actions/upload-artifact@v4
        with:
          name: converted-pdfs
          path: output_pdfs/

Key points:

  • Do not install docx2pdf in the CI pip install step — it imports cleanly on Linux but raises NotImplementedError at runtime. If other parts of your codebase import it, guard the import with if platform.system() != "Linux".
  • ttf-mscorefonts-installer installs Arial, Times New Roman, Courier New, and other common fonts. Without it, LibreOffice substitutes its own fonts and text reflows.
  • For self-hosted runners on Windows, switch to docx2pdf and ensure Word is installed on the runner machine.
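The import guard mentioned in the first point could look like the sketch below. The function name load_word_converter is hypothetical; it returns None wherever docx2pdf cannot work, so the caller can fall back to LibreOffice.

```python
import platform
from typing import Callable, Optional


def load_word_converter(system: Optional[str] = None) -> Optional[Callable]:
    """Return docx2pdf.convert on Windows/macOS, or None where it cannot work."""
    system = system or platform.system()
    if system not in ("Windows", "Darwin"):
        return None  # Linux and others: fall back to LibreOffice
    try:
        from docx2pdf import convert  # imported lazily, only where Word can exist
    except ImportError:
        return None  # docx2pdf is not installed on this machine
    return convert


if __name__ == "__main__":
    word_convert = load_word_converter()
    print("docx2pdf available" if word_convert else "use LibreOffice instead")
```

Keeping the import inside the function means the module loads on a Linux runner even when docx2pdf is absent from the environment.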

Variant Fix D: Checking LibreOffice Font Rendering

Even after installing ttf-mscorefonts-installer, some documents use fonts not in that package (Impact, Wingdings, custom brand fonts). LibreOffice silently substitutes them, which shifts text and can cause overflows or empty pages.

Diagnose font substitution with fontconfig, which LibreOffice uses to locate fonts on Linux:

# Is the font installed at all?
fc-list | grep -i "arial"
# Which installed font is actually chosen when a document asks for Arial?
fc-match "Arial"

To add a custom font on a server:

# Copy the .ttf or .otf file
sudo mkdir -p /usr/share/fonts/custom
sudo cp BrandFont-Regular.ttf /usr/share/fonts/custom/
sudo fc-cache -f -v
# Verify it is now visible
fc-list | grep BrandFont

Then re-run the conversion. For production pipelines that must guarantee exact font rendering, consider running conversions on a self-hosted macOS or Windows runner with Microsoft Word installed, where docx2pdf can use Word's own rendering engine.
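The font check can also be scripted so a pipeline fails early when a required family is missing. This is a sketch under assumptions: the list of required fonts is something you supply yourself, the helper names are hypothetical, and fc-list must be available on the machine.

```python
import subprocess


def parse_families(fc_list_output: str) -> set[str]:
    """Extract lower-cased family names from `fc-list : family` output."""
    families: set[str] = set()
    for line in fc_list_output.splitlines():
        # One line can carry several comma-separated names for the same font
        for name in line.split(","):
            name = name.strip().lower()
            if name:
                families.add(name)
    return families


def installed_font_families() -> set[str]:
    """Ask fontconfig for every installed family name."""
    result = subprocess.run(
        ["fc-list", ":", "family"], capture_output=True, text=True, check=True
    )
    return parse_families(result.stdout)


def missing_fonts(required: list[str], installed: set[str]) -> list[str]:
    """Return the required family names that are not installed."""
    return [name for name in required if name.lower() not in installed]
```

For example, missing_fonts(["Arial", "Impact"], installed_font_families()) returns the families that still need to be copied onto the server.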

Troubleshooting Quick Reference

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| NotImplementedError: docx2pdf is not implemented for linux | Hard platform check in docx2pdf | Replace with LibreOffice headless (Step 2 above) |
| ModuleNotFoundError: No module named 'win32com' | win32com is a Windows-only package | Same fix: switch to LibreOffice on Linux |
| FileNotFoundError: soffice | LibreOffice not installed or not on PATH | sudo apt install libreoffice; use find_soffice() (Variant A) |
| user installation could not be completed | LibreOffice profile dir locked or missing | Use -env:UserInstallation with a temp dir (Variant B) |
| PDF produced but has wrong fonts / text overflow | Required fonts not installed on server | Install ttf-mscorefonts-installer and custom fonts (Variant D) |
| Exit code 0 but no PDF file in outdir | soffice failed silently (corrupt .docx, or profile directory not writable) | Check stderr; use an isolated profile (Variant B); validate source .docx with python-docx first |
| GitHub Actions CI fails on ubuntu runner | CI uses Linux, docx2pdf hard-fails | Install LibreOffice on runner; use cross-platform wrapper (Variant C) |

Verification

After applying the fix, convert a real document and check that the resulting PDF opens and contains pages:

# pip install pypdf
from pathlib import Path
from pypdf import PdfReader
from your_module import convert_docx_to_pdf  # the wrapper above

docx = Path("documents/test.docx")
out_dir = Path("output_pdfs")

try:
    pdf = convert_docx_to_pdf(docx, out_dir)
    reader = PdfReader(pdf)
    print(f"PDF pages: {len(reader.pages)}")   # should be > 0
    assert len(reader.pages) > 0, "Empty PDF — conversion failed silently"
    print("Conversion verified.")
except AssertionError as exc:
    print(f"Verification failed: {exc}")
except Exception as exc:
    print(f"Error: {exc}")

A page count greater than zero confirms that LibreOffice produced a readable PDF. It does not prove that the layout matches Word, so spot-check fonts and pagination on a sample document. If you are generating the source .docx files programmatically, see Automating Word Document Creation for the templating patterns that pair with this conversion step. For building PDFs from data without the .docx intermediate, see Generating PDF Reports Dynamically.
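Where adding pypdf to an environment is not practical, a cheaper standard-library check can catch the most common silent failures (no file, empty file, or a non-PDF written under a .pdf name). This is only a sketch and weaker than parsing the file; looks_like_pdf is a hypothetical helper name.

```python
from pathlib import Path


def looks_like_pdf(pdf_path: Path) -> bool:
    """Cheap sanity check: the file is non-empty and starts with the PDF header."""
    if not pdf_path.is_file() or pdf_path.stat().st_size == 0:
        return False
    with pdf_path.open("rb") as handle:
        return handle.read(5) == b"%PDF-"
```

It says nothing about page count or rendering quality, so it complements the pypdf check rather than replacing it.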

Part of Converting DOCX to PDF with Python.