Fix openpyxl Formulas Showing as Blank

You write ws["B8"].value = "=SUM(B2:B7)" with openpyxl, save the file, then read it back — and ws["B8"].value is None. Or the cell appears blank when you open the .xlsx in another tool that does not recalculate on open. This is the most common openpyxl confusion: openpyxl writes formulas as strings and has no calculation engine.


Root cause

openpyxl is a read/write library, not a spreadsheet engine. When you assign cell.value = "=SUM(B2:B7)", openpyxl stores the formula string in the XML. It does not evaluate =SUM(B2:B7) and does not store a cached result (<v> element in the XML).

Excel's .xlsx format has two separate fields per cell: the formula (<f>) and the last-computed value (<v>). When a file is opened in Excel, Excel fills <v> and saves it. Until that happens, <v> is absent or stale.

openpyxl's data_only=True mode reads <v> only — so it returns None for any cell whose <v> has never been written.

What the XML looks like

Inside xl/worksheets/sheet1.xml (you can inspect it by unzipping the .xlsx), a formula cell written by openpyxl looks like:

<c r="B8" t="str">
  <f>SUM(B2:B7)</f>
  <!-- no <v> element — openpyxl never writes it -->
</c>

After Excel opens and saves the file:

<c r="B8">
  <f>SUM(B2:B7)</f>
  <v>99000</v>   <!-- Excel computed and cached this -->
</c>

data_only=True maps to reading <v>. If <v> is absent, openpyxl returns None.

Choosing the right fix

ScenarioBest fix
CI pipeline, no Excel/LibreOffice availableFix 2: compute in Python, write literals
Need live formulas in the file for end-usersFix 3: LibreOffice headless on Linux, or Fix 4: xlwings on Windows/macOS
Just auditing or rewriting formula textFix 1: data_only=False
Human workflow — file opened manually in ExcelVariant: data_only=True after Excel save

Minimal diagnostic

Confirm the root cause before applying a fix:

# pip install openpyxl
from pathlib import Path
import openpyxl

WORKBOOK = Path("report.xlsx")

try:
    # Read formula string (data_only=False is the default)
    wb_formula = openpyxl.load_workbook(WORKBOOK, data_only=False)
    formula_val = wb_formula["Sheet1"]["B8"].value
    print(f"data_only=False: {formula_val!r}")   # e.g. '=SUM(B2:B7)'

    # Read cached value
    wb_data = openpyxl.load_workbook(WORKBOOK, data_only=True)
    cached_val = wb_data["Sheet1"]["B8"].value
    print(f"data_only=True:  {cached_val!r}")    # None if Excel never opened the file
except FileNotFoundError as exc:
    raise SystemExit(f"File not found: {exc}") from exc

If data_only=False returns the formula string and data_only=True returns None, the file was never opened in Excel (or was created entirely by openpyxl without Excel touching it). That confirms the root cause.


Fix 1 — Read the formula, not the cached value

If you only need to inspect or transform the formula text, use data_only=False (the default). This is not a workaround — it is the correct mode for formula-aware reads.

# pip install openpyxl
from pathlib import Path
import openpyxl

WORKBOOK = Path("report.xlsx")

try:
    wb = openpyxl.load_workbook(WORKBOOK, data_only=False)  # default; explicit for clarity
except FileNotFoundError as exc:
    raise SystemExit(f"File not found: {exc}") from exc

ws = wb["Sheet1"]

for row in ws.iter_rows(min_row=2, max_row=ws.max_row):
    for cell in row:
        if isinstance(cell.value, str) and cell.value.startswith("="):
            print(f"{cell.coordinate}: formula = {cell.value!r}")

Use this when you are auditing formulas, rewriting cell references, or building a report template — anywhere the formula text matters more than the numeric result.


Fix 2 — Compute in Python and write literal values

The most portable fix: skip formulas entirely. Compute the result in Python (using pandas or plain arithmetic) and write the numeric value directly. No Excel round-trip required.

# pip install openpyxl pandas
from pathlib import Path
import openpyxl
import pandas as pd

SOURCE   = Path("sales_data.xlsx")
DEST     = Path("report_computed.xlsx")

try:
    df = pd.read_excel(SOURCE, sheet_name="Sales", engine="openpyxl")
except FileNotFoundError as exc:
    raise SystemExit(f"File not found: {exc}") from exc

# Compute values in pandas — no formulas needed
df["Margin"] = (df["Revenue"] - df["Cost"]) / df["Revenue"]
total_revenue = df["Revenue"].sum()
total_cost    = df["Cost"].sum()
avg_margin    = df["Margin"].mean()

# Write literal values — openpyxl stores them as <v>, readable with data_only=True
try:
    wb = openpyxl.load_workbook(SOURCE)
except FileNotFoundError as exc:
    raise SystemExit(f"Cannot open source: {exc}") from exc

ws = wb["Sales"]
summary_row = ws.max_row + 2
ws[f"A{summary_row}"] = "Total / Avg"
ws[f"B{summary_row}"] = total_revenue   # literal float, not a formula string
ws[f"C{summary_row}"] = total_cost
ws[f"D{summary_row}"] = avg_margin
ws[f"B{summary_row}"].number_format = '"$"#,##0'
ws[f"C{summary_row}"].number_format = '"$"#,##0'
ws[f"D{summary_row}"].number_format = "0.0%"

wb.save(DEST)
print(f"Written with literal values: {DEST}")

This approach integrates naturally with the Automating Excel Report Generation workflow and works in any CI/CD environment where Excel is not installed.


Fix 3 — Force recalculation with LibreOffice headless

If you must have Excel-compatible cached values (for downstream tools that read <v>) without opening Excel manually, use LibreOffice headless to open and re-save the file.

# Requires: LibreOffice installed (apt install libreoffice or brew install --cask libreoffice)
# pip install openpyxl (for verification step)
import subprocess
from pathlib import Path

WORKBOOK = Path("report.xlsx").resolve()
OUTPUT_DIR = WORKBOOK.parent

try:
    result = subprocess.run(
        [
            "libreoffice", "--headless",
            "--calc",
            "--convert-to", "xlsx",
            "--outdir", str(OUTPUT_DIR),
            str(WORKBOOK),
        ],
        capture_output=True,
        text=True,
        timeout=60,
        check=True,   # raises CalledProcessError on non-zero exit
    )
    print(result.stdout)
except FileNotFoundError:
    raise SystemExit("LibreOffice not found — install it or use Fix 2 instead.")
except subprocess.CalledProcessError as exc:
    raise SystemExit(f"LibreOffice failed: {exc.stderr}") from exc
except subprocess.TimeoutExpired:
    raise SystemExit("LibreOffice timed out after 60 s.")

# Verify the cached values are now present
import openpyxl
wb = openpyxl.load_workbook(WORKBOOK, data_only=True)
cached = wb["Sheet1"]["B8"].value
print(f"Cached value after LibreOffice recalc: {cached!r}")   # should be a number now

LibreOffice evaluates formulas on open and writes <v> before converting, so data_only=True returns numeric results. The output file replaces the input (LibreOffice uses the same filename).


Variant fix — data_only=True after Excel has opened the file

If a human has already opened the file in Excel and saved it, the cached values exist. You can then read them reliably with data_only=True.

# pip install openpyxl
from pathlib import Path
import openpyxl

# This only works if the file was previously saved by Excel (cached values present)
WORKBOOK = Path("report_excel_saved.xlsx")

try:
    wb = openpyxl.load_workbook(WORKBOOK, data_only=True)
except FileNotFoundError as exc:
    raise SystemExit(f"File not found: {exc}") from exc

ws = wb["Sheet1"]
cached = ws["B8"].value
print(f"Cached value: {cached!r}")   # numeric result, e.g. 99000

This works for interactive workflows (a human maintains the file) but is not reliable in fully automated pipelines — you cannot guarantee Excel has been run.


Variant fix — xlwings for live recalculation (Windows/macOS only)

On Windows or macOS, xlwings drives Excel via COM/AppleScript, forcing a true recalculation.

# pip install xlwings
# Requires Excel installed on Windows or macOS
from pathlib import Path
import xlwings as xw

WORKBOOK = Path("report.xlsx").resolve()

try:
    app = xw.App(visible=False)
    wb  = app.books.open(str(WORKBOOK))
    wb.app.calculate()         # force recalculation
    wb.save()
    wb.close()
    app.quit()
    print(f"Recalculated and saved: {WORKBOOK}")
except Exception as exc:
    raise SystemExit(f"xlwings error: {exc}") from exc

xlwings is the right tool when you need Excel-precise results and are running on a desktop OS. It does not work in Linux CI environments without Excel.


Additional troubleshooting: formulas that appear blank in Excel itself

Sometimes the formula is stored correctly but displays blank in Excel without triggering a recalculation. This is a separate problem from the data_only=True / Python read-back issue.

Symptom: The cell shows no value in Excel, even after opening.

Cause 1 — calcPr recalculation is set to manual. openpyxl sets calcPr to fullCalcOnLoad="1" by default, but if the workbook was created with a template that sets recalculation to manual, Excel will not auto-calculate on open.

Fix: In the Python script, explicitly set fullCalcOnLoad:

# pip install openpyxl
from pathlib import Path
import openpyxl

WORKBOOK = Path("report.xlsx")

try:
    wb = openpyxl.load_workbook(WORKBOOK)
except FileNotFoundError as exc:
    raise SystemExit(f"File not found: {exc}") from exc

# Force Excel to recalculate all formulas on open
wb.calculation.calcMode = "auto"
wb.calculation.fullCalcOnLoad = True

wb.save(WORKBOOK)
print("calcPr set to fullCalcOnLoad.")

Cause 2 — Array formula not marked as array. Formulas like =SUMPRODUCT(...) work as regular formulas, but =SUM(IF(...)) requires array entry (Ctrl+Shift+Enter in Excel). openpyxl does not support writing array formulas natively. In this case, rewrite the formula to avoid array syntax, or use a helper column.

Cause 3 — Circular reference. If the formula references its own cell (directly or via a chain), Excel may display zero or blank. Inspect the formula logic — openpyxl does not validate formula semantics.


Verification

After applying any fix, confirm the cached value is now readable:

# pip install openpyxl
from pathlib import Path
import openpyxl

WORKBOOK = Path("report.xlsx")

try:
    wb = openpyxl.load_workbook(WORKBOOK, data_only=True)
except FileNotFoundError as exc:
    raise SystemExit(f"File not found: {exc}") from exc

ws = wb["Sheet1"]
value = ws["B8"].value

assert value is not None, (
    "B8 is still None — file was not recalculated by Excel/LibreOffice, "
    "or cached value was not written. Use Fix 2 (literal values) instead."
)
assert isinstance(value, (int, float)), f"Expected numeric, got {type(value)}"
print(f"B8 cached value: {value}   ✓")

If the assertion fails after Fix 3 (LibreOffice), check that LibreOffice did not error silently — inspect result.stderr from the subprocess.run call.


Part of Writing Excel Formulas and Charts with openpyxl.