Fix Images Too Large in python-docx

Calling doc.add_picture("logo.png") without a width or height argument inserts the image at its native physical size derived from the file's DPI metadata. A 2000 x 1500 px image saved at 72 DPI renders as a 27 x 20 inch shape — it overflows the page margins, pushes all subsequent content down, and frequently causes Word to flag the document as corrupt on open.

Root Cause

python-docx converts pixel dimensions to EMU (English Metric Units) using the image's embedded DPI:

width_emu = pixel_width / dpi * 914400

A 2000 px wide image at 72 DPI produces 2000 / 72 * 914400 = 25,400,000 EMU = 27.8 inches. A standard US Letter page is only 8.5 inches wide, so the image blows past both margins.

The same logic applies in reverse for high-DPI images: a 600 x 450 px image at 300 DPI renders at 2 x 1.5 inches, which may be smaller than expected.

Minimal Reproducible Diagnostic

Run this to confirm your image's native physical size before inserting it:

# pip install Pillow
from PIL import Image
from pathlib import Path

img_path = Path("assets/logo.png")
try:
    with Image.open(img_path) as img:
        w, h = img.size
        dpi = img.info.get("dpi", (72, 72))
        print(f"Pixels: {w} x {h}")
        print(f"DPI: {dpi[0]} x {dpi[1]}")
        print(f"Native size: {w/dpi[0]:.2f} x {h/dpi[1]:.2f} inches")
except FileNotFoundError:
    print(f"File not found: {img_path}")
except Exception as e:
    print(f"Could not read image: {e}")

If "Native size" is larger than your usable page area (typically 6–7 inches for a standard letter/A4 page with 1-inch margins), the image will overflow without an explicit size argument.

Fix: Always Pass an Explicit Width

The primary fix is to always supply width (or height) to add_picture. python-docx preserves the aspect ratio when only one dimension is given.

# pip install python-docx
from pathlib import Path
from docx import Document
from docx.shared import Inches

IMAGE = Path("assets/logo.png")
OUTPUT = Path("output/fixed_image.docx")

doc = Document()
doc.add_heading("Fixed Report", level=1)

try:
    # Pass explicit width — aspect ratio is preserved automatically
    doc.add_picture(str(IMAGE), width=Inches(3.0))  # never omit this argument
    doc.save(str(OUTPUT))
    print(f"Saved: {OUTPUT}")
except FileNotFoundError:
    print(f"Image not found: {IMAGE}")
except Exception as e:
    print(f"Insertion failed: {e}")

Use Cm(n) instead of Inches(n) if your document uses metric measurements. Both are valid arguments; they are just different units for the same EMU value.

Fix: Compute Width to Fit the Usable Page Area

Hard-coding Inches(3.0) works for a single document but breaks when the template uses non-standard margins. Derive the usable width from the section geometry:

# pip install python-docx
from pathlib import Path
from docx import Document

IMAGE = Path("assets/logo.png")
OUTPUT = Path("output/auto_width_image.docx")

doc = Document()
section = doc.sections[0]

# Compute usable width in EMU — works for any margin configuration
usable_width = (
    section.page_width        # total page width in EMU
    - section.left_margin     # subtract left margin
    - section.right_margin    # subtract right margin
)

doc.add_heading("Auto-Fitted Image", level=1)
try:
    doc.add_picture(str(IMAGE), width=usable_width)  # fills the text column exactly
    doc.save(str(OUTPUT))
    print(f"Usable width: {usable_width / 914400:.2f} in — saved {OUTPUT}")
except FileNotFoundError:
    print(f"Image not found: {IMAGE}")
except Exception as e:
    print(f"Error: {e}")

All three values (page_width, left_margin, right_margin) are already in EMU, so arithmetic is exact. Pass the result directly to width= — no unit conversion needed.

Variant: High-DPI Images Come Out Too Small

The opposite problem occurs with 300 DPI assets: 1200 / 300 * 914400 = 3,657,600 EMU = 4 inches. That is a reasonable size but it may be too small for a full-width chart or too large for a logo corner. The fix is identical — always pass an explicit width.

For a helper that reads DPI and returns a safe EMU width:

# pip install python-docx Pillow
from pathlib import Path
from docx.shared import Inches
from PIL import Image

def safe_image_width(img_path: Path, max_inches: float = 4.0) -> int:
    """Return an EMU width that never exceeds max_inches, preserving aspect ratio."""
    try:
        with Image.open(img_path) as img:
            w, h = img.size
            dpi = img.info.get("dpi", (96, 96))
            native_inches = w / dpi[0]
            # Clamp to max_inches
            target_inches = min(native_inches, max_inches)
            return int(target_inches * 914400)  # convert to EMU
    except Exception:
        return int(Inches(max_inches))  # fallback to max

# Usage:
# doc.add_picture(str(img_path), width=safe_image_width(img_path, max_inches=3.0))

Variant: Oversized Image Inside a Table Cell

Inside a table cell, the image must fit within the cell width. Cell width is accessible via the XML, but the simplest reliable approach is to compute a target width based on the number of columns and the usable page width:

# pip install python-docx
from pathlib import Path
from docx import Document
from docx.shared import Inches

LOGO = Path("assets/logo.png")
OUTPUT = Path("output/table_constrained.docx")

doc = Document()
section = doc.sections[0]
usable = section.page_width - section.left_margin - section.right_margin

num_cols = 3
# Leave a small gutter; divide evenly across columns
cell_image_width = int((usable / num_cols) * 0.85)  # 85% of the cell share

table = doc.add_table(rows=1, cols=num_cols)
for i in range(num_cols):
    cell = table.cell(0, i)
    para = cell.paragraphs[0]
    run = para.add_run()
    try:
        run.add_picture(str(LOGO), width=cell_image_width)  # constrained to cell share
    except FileNotFoundError:
        cell.text = "[logo missing]"

try:
    doc.save(str(OUTPUT))
    print(f"Saved: {OUTPUT}")
except Exception as e:
    print(f"Save failed: {e}")

The 0.85 factor gives breathing room for cell padding. Adjust to taste; the important constraint is that cell_image_width never exceeds usable / num_cols.

Variant: Image with No DPI Metadata

Some images — especially those generated programmatically, captured with screen-recording tools, or exported from vector software — carry no DPI tag at all. img.info.get("dpi", ...) returns the fallback tuple. python-docx falls back to 72 DPI internally, which is why screen captures almost always overflow.

You have two options:

Option A — assume a sane DPI and scale accordingly. If you know the image was a screen capture at 96 DPI, pass that assumption explicitly:

# pip install python-docx Pillow
from pathlib import Path
from docx import Document
from docx.shared import Inches
from PIL import Image

ASSUMED_DPI = 96  # override for images with no DPI metadata
MAX_WIDTH_INCHES = 5.0

img_path = Path("assets/screenshot.png")
OUTPUT = Path("output/screenshot_report.docx")

doc = Document()
try:
    with Image.open(img_path) as img:
        w, _ = img.size
        native_inches = w / ASSUMED_DPI
        target_width = min(native_inches, MAX_WIDTH_INCHES)
        target_emu = int(target_width * 914400)

    doc.add_picture(str(img_path), width=target_emu)  # use computed EMU width
    doc.save(str(OUTPUT))
    print(f"Saved: {OUTPUT}")
except FileNotFoundError:
    print(f"Image not found: {img_path}")
except Exception as e:
    print(f"Error: {e}")

Option B — ignore pixel count entirely and always pass a fixed display size. This is the safest approach for batch pipelines where the image source is varied and you want a consistent thumbnail size regardless of file characteristics:

# pip install python-docx
from pathlib import Path
from docx import Document
from docx.shared import Inches

# Fixed display width regardless of source DPI or pixel dimensions
THUMBNAIL_WIDTH = Inches(2.5)

img_path = Path("assets/screenshot.png")
OUTPUT = Path("output/thumbnail_report.docx")

doc = Document()
try:
    doc.add_picture(str(img_path), width=THUMBNAIL_WIDTH)  # fixed size, DPI irrelevant
    doc.save(str(OUTPUT))
    print(f"Saved: {OUTPUT}")
except FileNotFoundError:
    print(f"Image not found: {img_path}")
except Exception as e:
    print(f"Error: {e}")

Option B is idiomatic for logo slots, watermarks, and icon-sized images where the display size is a design constraint rather than a function of the source file.

Understanding the EMU Scale

The diagram below shows the relationship between pixel count, DPI, and the resulting physical size in inches, and how passing an explicit width short-circuits the DPI dependency entirely.

Troubleshooting Table

Symptom	Likely cause	Fix
Image pushes content off page	No `width` argument; native size > page width	Always pass `width=Inches(n)` or computed `usable_width`
Image is a tiny thumbnail	High-DPI source (300 DPI); native size too small	Pass larger explicit `width`; use `safe_image_width()` helper
Image correct in body, huge in header	Header paragraph uses a different context; no width passed to header run	Call `header_run.add_picture(str(img), width=Inches(n))`
Word opens file with "repaired content" warning	Image XML is malformed; often from a BytesIO with wrong seek position	Call `buf.seek(0)` before each `add_picture(buf, ...)` call
Image in table cell overflows the cell border	Cell width not accounted for; passed page-wide `usable_width` to cell	Divide `usable_width` by column count and apply an 85% gutter factor

Verification

Confirm the inserted image is within page bounds by reading back the saved document:

# pip install python-docx
from pathlib import Path
from docx import Document

OUTPUT = Path("output/fixed_image.docx")
MAX_INCHES = 8.5  # letter page width; adjust for A4 (8.27 in)

try:
    doc = Document(str(OUTPUT))
    section = doc.sections[0]
    page_inches = section.page_width / 914400

    for i, shape in enumerate(doc.inline_shapes):
        w_in = shape.width / 914400
        h_in = shape.height / 914400
        status = "OK" if w_in <= page_inches else "OVERFLOW"
        print(f"Shape {i}: {w_in:.2f} x {h_in:.2f} in — {status}")

    print(f"Page width: {page_inches:.2f} in")
except FileNotFoundError:
    print(f"File not found: {OUTPUT}")
except Exception as e:
    print(f"Verification error: {e}")

All shapes should print OK. Any shape where width exceeds the page width will show OVERFLOW — trace it back to the add_picture call and add an explicit width argument.

Inserting Images into Word Documents — full guide covering headers, table cells, BytesIO, and batch insertion
Automating Word Document Creation — document generation pipeline context
Dynamic Mail Merge with Python — per-recipient image insertion via docxtpl

Part of Inserting Images into Word Documents.