Reading and Writing Files in Python

A practical guide to file I/O in Python — open modes, the with statement, reading and writing text and binary, working with paths, and handling common errors.

·7 min read · By Yash Kesharwani
Intermediate 10 min read

What you'll learn

  • How open and the with statement work together
  • The full set of file modes and when to use each
  • Idiomatic ways to read text line by line and all at once
  • How to write and append safely
  • Working with paths using pathlib
  • Reading and writing binary files and JSON

Prerequisites

Almost every useful program reads or writes something — configuration, logs, data files, CSVs, JSON, or images. Python’s file handling is short on ceremony and long on convention. Learn the half-dozen patterns in this post and you will handle the vast majority of file work cleanly.

open and the with statement

open(path, mode) returns a file object. The simplest usage:

f = open("notes.txt")
contents = f.read()
f.close()

This works but is fragile: if read() raises, close() never runs. The idiomatic form is the with statement, which closes the file automatically when the block ends — even on exceptions:

with open("notes.txt") as f:
    contents = f.read()

Always use with for file I/O. There is no good reason not to.
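
To see that guarantee in action, the following minimal sketch raises an error inside a with block and then checks the file object's closed attribute. The temporary directory and the file name demo.txt are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.txt"  # hypothetical file name
    path.write_text("hello\n", encoding="utf-8")

    try:
        with open(path, encoding="utf-8") as f:
            raise ValueError("simulated failure inside the block")
    except ValueError:
        pass

    # The with statement closed the file even though the block raised.
    closed_after_error = f.closed

print(closed_after_error)  # True
```

The name f is still bound after the block ends, but the file behind it is closed, so any further read would raise ValueError.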

You can also open multiple files in one with:

with open("input.txt") as src, open("output.txt", "w") as dst:
    dst.write(src.read())

File modes

The mode string controls how the file is opened. The main characters:

Mode  Meaning
r     Read (default). File must exist.
w     Write. Creates or truncates the file.
a     Append. Creates if missing.
x     Exclusive create. Fails if file exists.
b     Binary mode (combine with another).
t     Text mode (default, combine with another).
+     Read and write.

The most common combinations are "r", "w", "a", "rb", and "wb".
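
As a rough illustration of how "x", "a", and "w" differ, the sketch below applies each in turn to the same file. The file name modes.txt and the temporary directory are assumptions chosen for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "modes.txt"  # hypothetical file name

    with open(path, "x", encoding="utf-8") as f:  # create; file must not exist yet
        f.write("one\n")

    with open(path, "a", encoding="utf-8") as f:  # append to the end
        f.write("two\n")
    after_append = path.read_text(encoding="utf-8")

    with open(path, "w", encoding="utf-8") as f:  # truncate, then write
        f.write("three\n")
    after_write = path.read_text(encoding="utf-8")

    try:
        open(path, "x", encoding="utf-8")         # the file exists now, so this fails
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(repr(after_append))  # 'one\ntwo\n'
print(repr(after_write))   # 'three\n'
print(exclusive_failed)    # True
```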

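Always specify the encoding when working with text. When encoding is omitted, Python uses the locale's preferred encoding, which is UTF-8 on most Linux and macOS systems but is often a legacy code page such as cp1252 on Windows: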

with open("notes.txt", "r", encoding="utf-8") as f:
    ...

Specifying encoding="utf-8" explicitly makes your code portable.
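
The following sketch shows what can go wrong when the writer and the reader disagree. It writes a file as UTF-8 and then reads it back as cp1252, which stands in here for a mismatched platform default. The file name and sample text are illustrative assumptions.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "menu.txt"  # hypothetical file name
    path.write_text("café\n", encoding="utf-8")

    right = path.read_text(encoding="utf-8")   # 'café\n'
    wrong = path.read_text(encoding="cp1252")  # 'cafÃ©\n': the two UTF-8 bytes of é become two characters

print(len(right), len(wrong))  # 5 6
print(right == wrong)          # False
```

No exception is raised in the mismatched case; the text is silently corrupted, which is why an explicit encoding is worth the extra keystrokes.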

Reading text

A text file object is iterable. The cleanest way to read line by line is to iterate directly:

with open("notes.txt", encoding="utf-8") as f:
    for line in f:
        print(line.rstrip())

Each line includes its trailing newline. rstrip() removes it along with any other trailing whitespace; use rstrip("\n") if you want to strip only the newline. Iterating reads one line at a time, so this works for huge files.

To read the whole file at once:

with open("notes.txt", encoding="utf-8") as f:
    text = f.read()

Use read() only when you actually need the whole content in memory. For multi-gigabyte files, iterate line by line instead.

To read all lines into a list:

with open("notes.txt", encoding="utf-8") as f:
    lines = f.readlines()

This is equivalent to list(f). Each line still includes its trailing newline.
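
If you would rather have the lines without their newlines, one common alternative is read().splitlines(). The sketch below compares it with readlines() and also counts lines by streaming. The sample content is made up for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"
    path.write_text("alpha\nbeta\ngamma\n", encoding="utf-8")  # sample content

    with open(path, encoding="utf-8") as f:
        with_newlines = f.readlines()

    with open(path, encoding="utf-8") as f:
        without_newlines = f.read().splitlines()

    # Streaming count: only one line is held in memory at a time.
    with open(path, encoding="utf-8") as f:
        line_count = sum(1 for _ in f)

print(with_newlines)     # ['alpha\n', 'beta\n', 'gamma\n']
print(without_newlines)  # ['alpha', 'beta', 'gamma']
print(line_count)        # 3
```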

Writing text

"w" mode truncates the file (or creates it). "a" appends. In both cases, write does not add a newline — you do:

with open("output.txt", "w", encoding="utf-8") as f:
    f.write("first line\n")
    f.write("second line\n")

For many lines, writelines takes any iterable of strings (and still does not add newlines):

lines = ["first\n", "second\n", "third\n"]
with open("output.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)

Or use print with a file= argument, which does add the newline for you:

with open("output.txt", "w", encoding="utf-8") as f:
    for line in ["first", "second", "third"]:
        print(line, file=f)

I find print(..., file=f) the most readable for one-line-per-record output.
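
When the strings you hold do not already end in a newline, writelines can still be used by adding the newline in a generator expression. This is a small sketch with made-up records, not the only way to do it.

```python
import tempfile
from pathlib import Path

records = ["first", "second", "third"]  # sample strings without trailing newlines

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "output.txt"
    with open(path, "w", encoding="utf-8") as out:
        # writelines adds nothing itself, so each item supplies its own newline.
        out.writelines(f"{record}\n" for record in records)
    written = path.read_text(encoding="utf-8")

print(repr(written))  # 'first\nsecond\nthird\n'
```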

Try it yourself. Create a file numbers.txt containing the integers 1 through 20, one per line. Then read it back and print only the even numbers. Use with for both the write and the read.

Paths with pathlib

The pathlib module is the modern, object-oriented way to work with paths. It is part of the standard library and replaces most uses of os.path:

from pathlib import Path

p = Path("data") / "users" / "alice.json"
print(p)              # data/users/alice.json (backslashes on Windows)
print(p.parent)       # data/users
print(p.name)         # alice.json
print(p.stem)         # alice
print(p.suffix)       # .json

Path objects support direct I/O:

from pathlib import Path

text = Path("notes.txt").read_text(encoding="utf-8")
Path("output.txt").write_text("Hello!\n", encoding="utf-8")

read_text and write_text are convenient for small files. For anything streaming or line-by-line, fall back to Path.open(...), which is the same as open(path, ...).
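
A minimal sketch of the streaming case, using Path.open to filter lines one at a time. The file contents and the "todo:" prefix are assumptions invented for this example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"  # hypothetical file
    path.write_text(
        "todo: buy milk\ndone: call bank\ntodo: write report\n",
        encoding="utf-8",
    )

    todos = []
    with path.open(encoding="utf-8") as f:  # takes the same arguments as open()
        for line in f:                      # one line at a time
            if line.startswith("todo:"):
                todos.append(line.rstrip("\n"))

print(todos)  # ['todo: buy milk', 'todo: write report']
```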

A few more handy methods:

p = Path("notes.txt")
p.exists()                      # True or False
p.is_file()                     # True if regular file
p.is_dir()                      # True if directory
p.stat().st_size                # size in bytes
p.unlink(missing_ok=True)       # delete if it exists (missing_ok needs Python 3.8+)

Path("logs").mkdir(parents=True, exist_ok=True)
for entry in Path("logs").iterdir():
    print(entry)

for path in Path("src").rglob("*.py"):
    print(path)

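pathlib methods that produce paths (the / operator, parent, iterdir, rglob) return Path objects, while name, stem, and suffix return plain strings. Most standard-library functions that take a path accept either a Path or a str, but stay consistent within a function.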
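
To make the directory methods concrete, here is a sketch that builds a small tree in a temporary directory and searches it with rglob. The directory layout and file names are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src" / "pkg").mkdir(parents=True)
    for rel in ("src/main.py", "src/pkg/util.py", "src/README.md"):
        (root / rel).write_text("", encoding="utf-8")

    # rglob searches recursively; sort because the order is not guaranteed.
    found = sorted(
        p.relative_to(root).as_posix() for p in (root / "src").rglob("*.py")
    )

    first = next((root / "src").rglob("*.py"))
    parent_is_path = isinstance(first.parent, Path)  # path-producing attribute
    name_is_str = isinstance(first.name, str)        # string-producing attribute

print(found)                        # ['src/main.py', 'src/pkg/util.py']
print(parent_is_path, name_is_str)  # True True
```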

Handling missing files and other errors

In read mode, open raises FileNotFoundError if the path does not exist, and in any mode it raises PermissionError if the operating system refuses access. Wrap I/O in try/except only when the program can do something useful on failure:

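import json
from pathlib import Path

def load_optional_config(path: str) -> dict:
    # json.loads stands in for whatever parser your config format needs.
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        # A missing optional config is expected: fall back to defaults.
        return {}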

If you cannot recover, do nothing — let the exception propagate. A traceback with a clear FileNotFoundError is far more useful than a silent empty result. See Error Handling for the full philosophy.
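
Both FileNotFoundError and PermissionError are subclasses of OSError, so a single except OSError clause can cover them when you really do want to treat every I/O failure the same way. A small sketch, with a deliberately missing file as the assumption:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    missing = Path(tmp) / "does-not-exist.txt"  # never created on purpose
    try:
        missing.read_text(encoding="utf-8")
        caught = None
    except OSError as exc:
        # The concrete subclass is still available for logging or branching.
        caught = type(exc).__name__

print(caught)  # FileNotFoundError
print(issubclass(FileNotFoundError, OSError), issubclass(PermissionError, OSError))  # True True
```

Catching the narrowest exception you can actually handle is usually the safer default; the broad form trades precision for brevity.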

Binary files

For non-text content — images, executables, compressed data — open in binary mode. read and write then deal in bytes, not str:

with open("image.png", "rb") as src, open("copy.png", "wb") as dst:
    dst.write(src.read())

For very large binary files, read in chunks:

CHUNK = 64 * 1024
with open("input.bin", "rb") as src, open("output.bin", "wb") as dst:
    while True:
        chunk = src.read(CHUNK)
        if not chunk:
            break
        dst.write(chunk)

Do not mix text and binary modes. A file opened in binary mode rejects str (write raises TypeError), and a file opened in text mode rejects bytes the same way. Decide upfront which kind of data you are handling.
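
The sketch below shows the bytes/str boundary directly: encoded bytes are accepted by a binary file, a str is rejected, and decode turns the bytes back into text. The file name and sample string are illustrative assumptions.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.bin"  # hypothetical file name

    with open(path, "wb") as f:
        f.write("héllo".encode("utf-8"))  # bytes are accepted in binary mode
        try:
            f.write("héllo")              # a str is not
            str_accepted = True
        except TypeError:
            str_accepted = False

    with open(path, "rb") as f:
        raw = f.read()                    # bytes, not str

text = raw.decode("utf-8")
print(str_accepted)         # False
print(len(raw), len(text))  # 6 5
```

The lengths differ because é occupies two bytes in UTF-8 but counts as one character once decoded.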

Working with JSON

JSON is the most common structured-data format. The json module integrates with file I/O directly:

import json
from pathlib import Path

data = {"name": "Alice", "roles": ["admin", "editor"]}

# Write
Path("user.json").write_text(json.dumps(data, indent=2), encoding="utf-8")

# Read
loaded = json.loads(Path("user.json").read_text(encoding="utf-8"))
print(loaded["roles"])    # ['admin', 'editor']

json.dump(obj, file) and json.load(file) take file objects directly if you want to stream:

import json
with open("user.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)

with open("user.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)

For data that fits in memory, the read_text/json.loads pair is hard to beat for clarity.
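
Two details often come up once JSON meets real files. By default json.dumps escapes non-ASCII characters, which ensure_ascii=False turns off, and malformed input raises json.JSONDecodeError. The sketch below illustrates both; the sample data is made up.

```python
import json

data = {"name": "Zoë", "roles": ["admin"]}  # sample data

escaped = json.dumps(data)                       # non-ASCII written as \u00eb
readable = json.dumps(data, ensure_ascii=False)  # non-ASCII kept as-is

# Both forms decode to the same object.
round_trips = json.loads(escaped) == json.loads(readable) == data

try:
    json.loads("{not valid json}")
    error_name = None
except json.JSONDecodeError as exc:
    error_name = type(exc).__name__

print("\\u00eb" in escaped, "ë" in readable)  # True True
print(round_trips)                            # True
print(error_name)                             # JSONDecodeError
```

If you use ensure_ascii=False, writing the result with encoding="utf-8" matters even more, since the output now contains non-ASCII characters.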

Try it yourself. Write a function count_words_in_file(path) that returns a dictionary mapping each word to its count, using the counting pattern from Python Dictionaries. Then write the result to counts.json, formatted with indent=2.

A worked example: a tiny log analyser

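A small program that reads a log file, counts log levels, and writes a summary as JSON. It assumes each line has the form <timestamp> <LEVEL> <message>, with a timestamp that contains no spaces, because the level is taken from the second whitespace-separated field: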

import json
from collections import Counter
from pathlib import Path

LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}

def analyze(log_path: Path, summary_path: Path) -> None:
    counter: Counter[str] = Counter()
    line_count = 0

    with log_path.open(encoding="utf-8") as f:
        for line in f:
            line_count += 1
            parts = line.split(None, 2)
            if len(parts) >= 2 and parts[1] in LEVELS:
                counter[parts[1]] += 1

    summary = {
        "source": str(log_path),
        "lines": line_count,
        "by_level": dict(counter),
    }
    summary_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")

# Example usage (assuming app.log exists):
# analyze(Path("app.log"), Path("summary.json"))

This combines streaming line-by-line reading, pathlib, a Counter, and JSON output — a complete, realistic shape for a small data-processing script.
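
To see how the parsing step treats individual lines, here is the same split-and-check logic applied to a handful of invented log lines. The timestamp format shown is an assumption; the analyser only requires that the level be the second whitespace-separated field.

```python
from collections import Counter

LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}

# Made-up sample lines, including two that should be ignored.
sample_lines = [
    "2024-05-01T10:00:00 INFO service started\n",
    "2024-05-01T10:00:05 ERROR could not open config\n",
    "2024-05-01T10:00:06 INFO retrying\n",
    "a line with no level\n",
    "\n",
]

counter = Counter()
for line in sample_lines:
    parts = line.split(None, 2)  # at most three fields: timestamp, level, rest
    if len(parts) >= 2 and parts[1] in LEVELS:
        counter[parts[1]] += 1

print(dict(counter))  # {'INFO': 2, 'ERROR': 1}
```

Lines that do not match are skipped in the level counts, though the full analyser still includes them in its total line count.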

Recap

You now know:

  • Always use with open(...) as f: — it closes the file automatically
  • Specify encoding="utf-8" for text files to keep your code portable
  • Iterate the file object for line-by-line reading; read() for whole-file
  • "w" truncates, "a" appends, "x" fails if the file exists
  • pathlib is the modern path API — Path.read_text, Path.write_text, Path.iterdir
  • Binary mode ("rb", "wb") deals in bytes, not str
  • JSON read/write fits naturally with json.dumps/json.loads and Path

Next steps

You now have a complete toolkit for reading and writing data. The final post in this intermediate series shows how to break larger code into modules and import them — the last building block before you start writing genuinely substantial Python.

→ Next: Modules and Imports in Python

Questions or feedback? Email codeloomdevv@gmail.com.