Fix Images Too Large in python-docx
Calling doc.add_picture("logo.png") without a width or height argument inserts the image at its native physical size, which python-docx derives from the file's DPI metadata. A 2000 x 1500 px image saved at 72 DPI renders as a shape of roughly 27.8 x 20.8 inches: it overflows the page margins, runs past the page edge, and disrupts the layout of everything that follows it.
Root Cause
python-docx converts pixel dimensions to EMU (English Metric Units) using the image's embedded DPI:
width_emu = pixel_width / dpi * 914400
A 2000 px wide image at 72 DPI produces 2000 / 72 * 914400 = 25,400,000 EMU, or about 27.8 inches. A standard US Letter page is only 8.5 inches wide, so the image extends far beyond both margins.
The same logic applies in reverse for high-DPI images: a 600 x 450 px image at 300 DPI renders at 2 x 1.5 inches, which may be smaller than expected.
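The arithmetic above can be checked in a few lines of plain Python. This is a minimal sketch rather than python-docx's own code: the helper name native_size_emu is hypothetical, and the only fixed fact it relies on is 914,400 EMU per inch.

```python
EMU_PER_INCH = 914_400  # EMU per inch, as used in the formula above


def native_size_emu(px_width: int, px_height: int, dpi: float = 72.0) -> tuple[int, int]:
    """Convert pixel dimensions to EMU at the given DPI (hypothetical helper)."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return (
        round(px_width / dpi * EMU_PER_INCH),
        round(px_height / dpi * EMU_PER_INCH),
    )


# The two cases discussed in the text: a low-DPI and a high-DPI image
for label, (w_emu, h_emu) in {
    "2000 x 1500 px at 72 DPI": native_size_emu(2000, 1500, 72),
    "600 x 450 px at 300 DPI": native_size_emu(600, 450, 300),
}.items():
    print(f"{label}: {w_emu / EMU_PER_INCH:.2f} x {h_emu / EMU_PER_INCH:.2f} in")
```

The first case lands far outside an 8.5 inch page, while the second comes out at 2 x 1.5 inches.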
Minimal Reproducible Diagnostic
Run this to confirm your image's native physical size before inserting it:
# pip install Pillow
from pathlib import Path
from PIL import Image
img_path = Path("assets/logo.png")
try:
    with Image.open(img_path) as img:
        w, h = img.size
        # Fall back to 72 DPI when the file carries no DPI tag
        dpi = img.info.get("dpi", (72, 72))
    print(f"Pixels: {w} x {h}")
    print(f"DPI: {dpi[0]} x {dpi[1]}")
    print(f"Native size: {w/dpi[0]:.2f} x {h/dpi[1]:.2f} inches")
except FileNotFoundError:
    print(f"File not found: {img_path}")
except Exception as e:
    print(f"Could not read image: {e}")
If "Native size" is larger than your usable page area (typically 6–7 inches for a standard letter/A4 page with 1-inch margins), the image will overflow without an explicit size argument.
Fix: Always Pass an Explicit Width
The primary fix is to always supply width (or height) to add_picture. python-docx preserves the aspect ratio when only one dimension is given.
# pip install python-docx
from pathlib import Path
from docx import Document
from docx.shared import Inches
IMAGE = Path("assets/logo.png")
OUTPUT = Path("output/fixed_image.docx")
doc = Document()
doc.add_heading("Fixed Report", level=1)
try:
    # Pass an explicit width; the aspect ratio is preserved automatically
    doc.add_picture(str(IMAGE), width=Inches(3.0))  # never omit this argument
    # save() does not create missing folders, so make sure the target exists
    OUTPUT.parent.mkdir(parents=True, exist_ok=True)
    doc.save(str(OUTPUT))
    print(f"Saved: {OUTPUT}")
except FileNotFoundError:
    print(f"Image not found: {IMAGE}")
except Exception as e:
    print(f"Insertion failed: {e}")
Use Cm(n) instead of Inches(n) if your document uses metric measurements. Both are valid arguments; they are just different units for the same EMU value.
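To see why the two units are interchangeable, the conversion can be written out by hand. This is an illustrative sketch only: inches_to_emu and cm_to_emu are hypothetical stand-ins, not the docx.shared classes, and rounding to the nearest EMU is a choice made here.

```python
EMU_PER_INCH = 914_400
EMU_PER_CM = 360_000  # 914,400 / 2.54


def inches_to_emu(inches: float) -> int:
    """Convert inches to EMU, rounded to the nearest unit (hypothetical helper)."""
    return round(inches * EMU_PER_INCH)


def cm_to_emu(cm: float) -> int:
    """Convert centimetres to EMU, rounded to the nearest unit (hypothetical helper)."""
    return round(cm * EMU_PER_CM)


# 3 inches and 7.62 cm describe the same physical width
print(inches_to_emu(3.0), cm_to_emu(7.62))
```

Both calls yield the same EMU value, so the choice between inches and centimetres is purely a matter of which unit the document's layout is specified in.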
Fix: Compute Width to Fit the Usable Page Area
Hard-coding Inches(3.0) works for a single document but breaks when the template uses non-standard margins. Derive the usable width from the section geometry:
# pip install python-docx
from pathlib import Path
from docx import Document
IMAGE = Path("assets/logo.png")
OUTPUT = Path("output/auto_width_image.docx")
doc = Document()
section = doc.sections[0]
# Compute usable width in EMU; works for any margin configuration
usable_width = (
    section.page_width      # total page width in EMU
    - section.left_margin   # subtract left margin
    - section.right_margin  # subtract right margin
)
doc.add_heading("Auto-Fitted Image", level=1)
try:
    doc.add_picture(str(IMAGE), width=usable_width)  # fills the text column exactly
    # save() does not create missing folders, so make sure the target exists
    OUTPUT.parent.mkdir(parents=True, exist_ok=True)
    doc.save(str(OUTPUT))
    print(f"Usable width: {usable_width / 914400:.2f} in; saved {OUTPUT}")
except FileNotFoundError:
    print(f"Image not found: {IMAGE}")
except Exception as e:
    print(f"Error: {e}")
All three values (page_width, left_margin, right_margin) are already in EMU, so arithmetic is exact. Pass the result directly to width= — no unit conversion needed.
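The subtraction is easy to sanity-check with known page sizes. The sketch below assumes a US Letter page with 1-inch side margins purely as an example; usable_width_emu is a hypothetical helper, and real documents may use different margins.

```python
EMU_PER_INCH = 914_400


def usable_width_emu(page_width: int, left_margin: int, right_margin: int) -> int:
    """Return the text-column width in EMU (hypothetical helper)."""
    usable = page_width - left_margin - right_margin
    if usable <= 0:
        raise ValueError("margins leave no room for content")
    return usable


# Assumed example geometry: US Letter (8.5 in wide) with 1-inch side margins
letter = usable_width_emu(
    page_width=int(8.5 * EMU_PER_INCH),
    left_margin=EMU_PER_INCH,
    right_margin=EMU_PER_INCH,
)
print(f"{letter} EMU = {letter / EMU_PER_INCH:.2f} in")
```

Because every operand is an integer EMU count, the result can be passed straight to width= with no rounding step.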
Variant: High-DPI Images Come Out Too Small
The opposite problem occurs with 300 DPI assets: 1200 / 300 * 914400 = 3,657,600 EMU = 4 inches. That is a reasonable size but it may be too small for a full-width chart or too large for a logo corner. The fix is identical — always pass an explicit width.
The following helper reads the image's DPI and returns an EMU width that is capped at a maximum:
# pip install python-docx Pillow
from pathlib import Path
from docx.shared import Inches
from PIL import Image
def safe_image_width(img_path: Path, max_inches: float = 4.0) -> int:
    """Return the native width in EMU, capped at max_inches.
    add_picture scales the height to match, so the aspect ratio is preserved.
    """
    try:
        with Image.open(img_path) as img:
            w, _ = img.size
            # Use 72 DPI when the tag is missing, matching python-docx's fallback
            dpi = img.info.get("dpi", (72, 72))
        native_inches = w / dpi[0]
        # Clamp to max_inches; smaller images keep their native width
        target_inches = min(native_inches, max_inches)
        return int(target_inches * 914400)  # convert to EMU
    except Exception:
        return int(Inches(max_inches))  # unreadable image: fall back to the cap
# Usage:
# doc.add_picture(str(img_path), width=safe_image_width(img_path, max_inches=3.0))
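The clamping step on its own needs neither Pillow nor an image file, which makes it easy to unit-test. A minimal sketch, assuming the pixel width and DPI have already been read; clamp_width_emu is a hypothetical name, not part of any library.

```python
EMU_PER_INCH = 914_400


def clamp_width_emu(px_width: int, dpi: float, max_inches: float) -> int:
    """Return the native width in EMU, capped at max_inches (hypothetical helper)."""
    if px_width <= 0 or dpi <= 0 or max_inches <= 0:
        raise ValueError("px_width, dpi and max_inches must all be positive")
    native_inches = px_width / dpi
    # Never enlarge: images narrower than the cap keep their native width
    return round(min(native_inches, max_inches) * EMU_PER_INCH)


print(clamp_width_emu(2000, 72, 4.0))   # oversized image, capped at 4 inches
print(clamp_width_emu(600, 300, 4.0))   # small image, left at its native 2 inches
```

Note that clamping only shrinks. An image that is too small still needs an explicit, larger width.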
Variant: Oversized Image Inside a Table Cell
Inside a table cell, the image must fit within the cell width. python-docx exposes a cell.width property, but it can be None when the cell stores no explicit width, so the simplest reliable approach is to compute a target width from the number of columns and the usable page width:
# pip install python-docx
from pathlib import Path
from docx import Document
LOGO = Path("assets/logo.png")
OUTPUT = Path("output/table_constrained.docx")
doc = Document()
section = doc.sections[0]
usable = section.page_width - section.left_margin - section.right_margin
num_cols = 3
# Leave a small gutter; divide evenly across columns
cell_image_width = int((usable / num_cols) * 0.85)  # 85% of the cell share
table = doc.add_table(rows=1, cols=num_cols)
for i in range(num_cols):
    cell = table.cell(0, i)
    para = cell.paragraphs[0]
    run = para.add_run()
    try:
        run.add_picture(str(LOGO), width=cell_image_width)  # constrained to cell share
    except FileNotFoundError:
        cell.text = "[logo missing]"
try:
    # save() does not create missing folders, so make sure the target exists
    OUTPUT.parent.mkdir(parents=True, exist_ok=True)
    doc.save(str(OUTPUT))
    print(f"Saved: {OUTPUT}")
except Exception as e:
    print(f"Save failed: {e}")
The 0.85 factor gives breathing room for cell padding. Adjust to taste; the important constraint is that cell_image_width never exceeds usable / num_cols.
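The per-cell calculation can be isolated into a small function that enforces that constraint. This is a sketch under the same equal-column assumption as above; cell_image_width_emu and its fill_percent parameter are hypothetical names, and integer arithmetic is used here to avoid float rounding.

```python
def cell_image_width_emu(usable_emu: int, num_cols: int, fill_percent: int = 85) -> int:
    """Return an image width that fits one of num_cols equal columns (hypothetical helper)."""
    if num_cols < 1:
        raise ValueError("num_cols must be at least 1")
    if not 0 < fill_percent <= 100:
        raise ValueError("fill_percent must be in the range 1..100")
    # Integer maths: share of the column, scaled by the fill percentage
    return usable_emu * fill_percent // (num_cols * 100)


usable = 5_943_600  # assumed example: 6.5 inches of usable width
width = cell_image_width_emu(usable, num_cols=3)
print(width, width <= usable // 3)
```

Capping fill_percent at 100 guarantees the result never exceeds one column's share.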
Variant: Image with No DPI Metadata
Some images — especially those generated programmatically, captured with screen-recording tools, or exported from vector software — carry no DPI tag at all. img.info.get("dpi", ...) returns the fallback tuple. python-docx falls back to 72 DPI internally, which is why screen captures almost always overflow.
You have two options:
Option A — assume a sane DPI and scale accordingly. If you know the image was a screen capture at 96 DPI, pass that assumption explicitly:
# pip install python-docx Pillow
from pathlib import Path
from docx import Document
from PIL import Image
ASSUMED_DPI = 96  # override for images with no DPI metadata
MAX_WIDTH_INCHES = 5.0
img_path = Path("assets/screenshot.png")
OUTPUT = Path("output/screenshot_report.docx")
doc = Document()
try:
    with Image.open(img_path) as img:
        w, _ = img.size
    native_inches = w / ASSUMED_DPI
    target_width = min(native_inches, MAX_WIDTH_INCHES)
    target_emu = int(target_width * 914400)
    doc.add_picture(str(img_path), width=target_emu)  # use computed EMU width
    # save() does not create missing folders, so make sure the target exists
    OUTPUT.parent.mkdir(parents=True, exist_ok=True)
    doc.save(str(OUTPUT))
    print(f"Saved: {OUTPUT}")
except FileNotFoundError:
    print(f"Image not found: {img_path}")
except Exception as e:
    print(f"Error: {e}")
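If some files in a batch do carry a DPI tag and others do not, the assumption in Option A can be applied only when the metadata is missing. A minimal sketch: resolve_dpi is a hypothetical helper that takes a plain dict shaped like Pillow's img.info, so it runs without Pillow installed.

```python
def resolve_dpi(info: dict, assumed_dpi: float = 96.0) -> float:
    """Return the horizontal DPI from an info dict, or assumed_dpi if absent or invalid."""
    dpi = info.get("dpi")
    if not dpi:
        return assumed_dpi  # no DPI tag at all
    horizontal = float(dpi[0])
    # Treat zero or negative values as missing metadata
    return horizontal if horizontal > 0 else assumed_dpi


print(resolve_dpi({}))                    # no tag: assumed value is used
print(resolve_dpi({"dpi": (300, 300)}))   # tag present: metadata wins
print(resolve_dpi({"dpi": (0, 0)}))       # invalid tag: assumed value is used
```

With a real image, img.info would be passed in place of the literal dicts.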
Option B — ignore pixel count entirely and always pass a fixed display size. This is the safest approach for batch pipelines where the image source is varied and you want a consistent thumbnail size regardless of file characteristics:
# pip install python-docx
from pathlib import Path
from docx import Document
from docx.shared import Inches
# Fixed display width regardless of source DPI or pixel dimensions
THUMBNAIL_WIDTH = Inches(2.5)
img_path = Path("assets/screenshot.png")
OUTPUT = Path("output/thumbnail_report.docx")
doc = Document()
try:
    doc.add_picture(str(img_path), width=THUMBNAIL_WIDTH)  # fixed size, DPI irrelevant
    # save() does not create missing folders, so make sure the target exists
    OUTPUT.parent.mkdir(parents=True, exist_ok=True)
    doc.save(str(OUTPUT))
    print(f"Saved: {OUTPUT}")
except FileNotFoundError:
    print(f"Image not found: {img_path}")
except Exception as e:
    print(f"Error: {e}")
Option B is idiomatic for logo slots, watermarks, and icon-sized images where the display size is a design constraint rather than a function of the source file.
Understanding the EMU Scale
[Diagram: the relationship between pixel count, DPI, and the resulting physical size in inches, and how passing an explicit width short-circuits the DPI dependency entirely.]
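The same relationship can be expressed as a small model. This is a simplified sketch of the sizing rule described in this article, not python-docx's actual implementation; display_size_emu is a hypothetical function.

```python
from typing import Optional

EMU_PER_INCH = 914_400


def display_size_emu(
    px_w: int, px_h: int, dpi: float, width_emu: Optional[int] = None
) -> tuple[int, int]:
    """Model the displayed size: native if no width is given, else scaled to width_emu."""
    native_w = px_w / dpi * EMU_PER_INCH
    native_h = px_h / dpi * EMU_PER_INCH
    if width_emu is None:
        return round(native_w), round(native_h)  # size depends on DPI
    scale = width_emu / native_w  # DPI cancels out of native_h * scale
    return width_emu, round(native_h * scale)


three_inches = 3 * EMU_PER_INCH
print(display_size_emu(2000, 1500, 72))                 # native size at 72 DPI
print(display_size_emu(2000, 1500, 72, three_inches))   # explicit width
print(display_size_emu(2000, 1500, 300, three_inches))  # same result at 300 DPI
```

With an explicit width, the height depends only on the pixel aspect ratio, so the last two calls agree regardless of DPI.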
Troubleshooting Table
| Symptom | Likely cause | Fix |
|---|---|---|
| Image pushes content off page | No width argument; native size > page width | Always pass width=Inches(n) or computed usable_width |
| Image is a tiny thumbnail | High-DPI source (300 DPI); native size too small | Pass a larger explicit width such as width=Inches(n); safe_image_width() only caps the size and never enlarges |
| Image correct in body, huge in header | Header paragraph uses a different context; no width passed to header run | Call header_run.add_picture(str(img), width=Inches(n)) |
| Word opens file with "repaired content" warning | Image XML is malformed; often from a BytesIO with wrong seek position | Call buf.seek(0) before each add_picture(buf, ...) call |
| Image in table cell overflows the cell border | Cell width not accounted for; passed page-wide usable_width to cell | Divide usable_width by column count and apply an 85% gutter factor |
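The seek-position issue in the table can be seen with the standard library alone. The sketch below only demonstrates how io.BytesIO behaves after a write; it does not show python-docx's handling of streams, and the payload is stand-in bytes rather than a real image.

```python
import io

# Stand-in bytes (PNG signature plus padding), not a decodable image
payload = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16

buf = io.BytesIO()
buf.write(payload)      # the stream position is now at the end

at_end = buf.read()     # reading from the end yields nothing
buf.seek(0)             # rewind before handing the buffer to a consumer
from_start = buf.read()

print(len(at_end), len(from_start))
```

The same rewind applies when one buffer is reused for several insertions, since each read leaves the position at the end again.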
Verification
Confirm that every inserted image fits within the usable text width by reading back the saved document:
# pip install python-docx
from pathlib import Path
from docx import Document
OUTPUT = Path("output/fixed_image.docx")
EMU_PER_INCH = 914400
try:
    doc = Document(str(OUTPUT))
    section = doc.sections[0]
    # Usable width = page width minus left and right margins (all in EMU)
    usable = section.page_width - section.left_margin - section.right_margin
    for i, shape in enumerate(doc.inline_shapes):
        w_in = shape.width / EMU_PER_INCH
        h_in = shape.height / EMU_PER_INCH
        # Compare in integer EMU to avoid float rounding at the boundary
        status = "OK" if shape.width <= usable else "OVERFLOW"
        print(f"Shape {i}: {w_in:.2f} x {h_in:.2f} in, {status}")
    print(f"Usable width: {usable / EMU_PER_INCH:.2f} in")
except FileNotFoundError:
    print(f"File not found: {OUTPUT}")
except Exception as e:
    print(f"Verification error: {e}")
All shapes should print OK. Any shape wider than the usable width will show OVERFLOW; trace it back to the add_picture call and add an explicit width argument.
Related
- Inserting Images into Word Documents — full guide covering headers, table cells, BytesIO, and batch insertion
- Automating Word Document Creation — document generation pipeline context
- Dynamic Mail Merge with Python — per-recipient image insertion via docxtpl