Reading and Writing Files in Python
A practical guide to file I/O in Python — open modes, the with statement, reading and writing text and binary, working with paths, and handling common errors.
What you'll learn
- How open and the with statement work together
- The full set of file modes and when to use each
- Idiomatic ways to read text line by line and all at once
- How to write and append safely
- Working with paths using pathlib
- Reading and writing binary files and JSON
Prerequisites
- Comfortable with for loops (see For Loops)
- Comfortable with error handling (see Error Handling)
Almost every useful program reads or writes something — configuration, logs, data files, CSVs, JSON, or images. Python’s file handling is short on ceremony and long on convention. Learn the half-dozen patterns in this post and you will handle the vast majority of file work cleanly.
open and the with statement
open(path, mode) returns a file object. The simplest usage:
f = open("notes.txt")
contents = f.read()
f.close()
This works but is fragile: if read() raises, close() never runs. The idiomatic form is the with statement, which closes the file automatically when the block ends — even on exceptions:
with open("notes.txt") as f:
    contents = f.read()
Always use with for file I/O. There is no good reason not to.
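To see the automatic close in action, here is a minimal sketch that raises an exception inside the block and then checks f.closed. The file name and the simulated error are made up for illustration, and the example works in a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"  # hypothetical file name
    path.write_text("hello\n", encoding="utf-8")

    try:
        with open(path, encoding="utf-8") as f:
            first = f.readline()
            raise ValueError("simulated failure mid-read")
    except ValueError:
        pass

    # The with block closed the file on the way out, despite the exception.
    closed_after_error = f.closed

print(closed_after_error)  # True
```

The name f is still bound after the block ends, but the file behind it is closed, so further reads would fail.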
You can also open multiple files in one with:
with open("input.txt") as src, open("output.txt", "w") as dst:
    dst.write(src.read())
File modes
The mode string controls how the file is opened. The main characters:
| Mode | Meaning |
|---|---|
| r | Read (default). File must exist. |
| w | Write. Creates or truncates the file. |
| a | Append. Creates if missing. |
| x | Exclusive create. Fails if file exists. |
| b | Binary mode (combine with another). |
| t | Text mode (default, combine with another). |
| + | Read and write (combine with another, as in r+). |
The most common combinations are "r", "w", "a", "rb", and "wb".
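The difference between "x", "a", and "w" is easiest to see on a single file. The sketch below is illustrative only: the file name is invented, and it runs in a temporary directory so nothing on your disk is touched.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "modes.txt"  # hypothetical file name

    # "x": exclusive create, succeeds because the file does not exist yet.
    with open(path, "x", encoding="utf-8") as f:
        f.write("one\n")

    # "a": append, keeps the existing content and adds to the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("two\n")
    after_append = path.read_text(encoding="utf-8")

    # "x" again: the file now exists, so open raises FileExistsError.
    try:
        with open(path, "x", encoding="utf-8") as f:
            pass
    except FileExistsError:
        second_create_failed = True
    else:
        second_create_failed = False

    # "w": truncates the existing content before writing.
    with open(path, "w", encoding="utf-8") as f:
        f.write("three\n")
    after_write = path.read_text(encoding="utf-8")

print(repr(after_append))     # 'one\ntwo\n'
print(second_create_failed)   # True
print(repr(after_write))      # 'three\n'
```

"x" is a reasonable choice whenever silently overwriting an existing file would be a bug.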
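Always specify the encoding when working with text. If you omit it, the default depends on the platform and locale: it is usually UTF-8 on Linux and macOS, but on Windows it is often a legacy code page such as cp1252: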
with open("notes.txt", "r", encoding="utf-8") as f:
    ...
Specifying encoding="utf-8" explicitly makes your code portable.
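To illustrate what goes wrong when the encoding used for reading does not match the one used for writing, the sketch below writes a short UTF-8 string and reads it back twice. The file name and sample text are invented; latin-1 is used here only as an example of a mismatched encoding.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greeting.txt"  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        f.write("café")

    # Same encoding on the way back: the text round-trips.
    with open(path, encoding="utf-8") as f:
        right = f.read()

    # Mismatched encoding: no error, just silently wrong characters.
    with open(path, encoding="latin-1") as f:
        wrong = f.read()

print(right)  # café
print(wrong)  # cafÃ©
```

Note that the mismatched read does not raise. Being explicit on both sides is what prevents this kind of quiet corruption.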
Reading text
A text file object is iterable. The cleanest way to read line by line is to iterate directly:
with open("notes.txt", encoding="utf-8") as f:
    for line in f:
        print(line.rstrip())
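Each line includes its trailing newline. rstrip() with no arguments strips it along with any other trailing whitespace; use rstrip("\n") if you want to remove only the newline. Iterating reads one line at a time, so this works for huge files.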
To read the whole file at once:
with open("notes.txt", encoding="utf-8") as f:
    text = f.read()
Use read() only when you actually need the whole content in memory. For multi-gigabyte files, iterate line by line instead.
To read all lines into a list:
with open("notes.txt", encoding="utf-8") as f:
    lines = f.readlines()
This is equivalent to list(f). Each line still includes its trailing newline.
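If you want the lines without their newlines, one common alternative is read().splitlines(). The sketch below compares the two on a small invented file; like read(), it loads the whole file into memory, so it suits small inputs.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"  # hypothetical file name
    path.write_text("alpha\nbeta\ngamma\n", encoding="utf-8")

    # readlines keeps the newline at the end of each line.
    with open(path, encoding="utf-8") as f:
        with_newlines = f.readlines()

    # splitlines drops the line endings.
    with open(path, encoding="utf-8") as f:
        without_newlines = f.read().splitlines()

print(with_newlines)     # ['alpha\n', 'beta\n', 'gamma\n']
print(without_newlines)  # ['alpha', 'beta', 'gamma']
```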
Writing text
"w" mode truncates the file (or creates it). "a" appends. In both cases, write does not add a newline — you do:
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("first line\n")
    f.write("second line\n")
For many lines, writelines takes any iterable of strings (and still does not add newlines):
lines = ["first\n", "second\n", "third\n"]
with open("output.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)
Or use print with a file= argument, which does add the newline for you:
with open("output.txt", "w", encoding="utf-8") as f:
    for line in ["first", "second", "third"]:
        print(line, file=f)
I find print(..., file=f) the most readable for one-line-per-record output.
Try it yourself. Create a file numbers.txt containing the integers 1 through 20, one per line. Then read it back and print only the even numbers. Use with for both the write and the read.
Paths with pathlib
The pathlib module is the modern, object-oriented way to work with paths. It is part of the standard library and replaces most uses of os.path:
from pathlib import Path
p = Path("data") / "users" / "alice.json"
print(p) # data/users/alice.json (backslashes on Windows)
print(p.parent) # data/users
print(p.name) # alice.json
print(p.stem) # alice
print(p.suffix) # .json
Path objects support direct I/O:
from pathlib import Path
text = Path("notes.txt").read_text(encoding="utf-8")
Path("output.txt").write_text("Hello!\n", encoding="utf-8")
read_text and write_text are convenient for small files. For anything streaming or line-by-line, fall back to Path.open(...), which is the same as open(path, ...).
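As a rough sketch of the streaming case, the example below uses Path.open to count the non-blank lines of a file without loading it all at once. The file name and contents are invented for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "big.txt"  # hypothetical file name
    path.write_text("a\nbb\n\nccc\n", encoding="utf-8")

    non_blank = 0
    # Path.open returns the same kind of file object as the built-in open.
    with path.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip lines that contain only whitespace
                non_blank += 1

print(non_blank)  # 3
```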
A few more handy methods:
p = Path("notes.txt")
p.exists() # True or False
p.is_file() # True if regular file
p.is_dir() # True if directory
p.stat().st_size # size in bytes
p.unlink(missing_ok=True) # delete if it exists
Path("logs").mkdir(parents=True, exist_ok=True)
for entry in Path("logs").iterdir():
    print(entry)
for path in Path("src").rglob("*.py"):
    print(path)
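Operations that produce paths (the / operator, parent, iterdir, rglob) return Path objects, while name, stem, and suffix are plain strings. Most standard-library functions, including open, accept either kind, but stay consistent within a function.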
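As a quick check of what common pathlib operations hand back, the sketch below inspects a few results. The path is the same illustrative one used earlier and does not need to exist on disk, because none of these operations touch the file system.

```python
from pathlib import Path

p = Path("data") / "users" / "alice.json"

# Operations that yield another path give back a Path object.
parent_is_path = isinstance(p.parent, Path)
backup = p.with_suffix(".bak")
backup_is_path = isinstance(backup, Path)

# Individual components such as the file name are plain strings.
name_is_str = isinstance(p.name, str)

# Convert explicitly when some other API insists on a string.
as_text = str(backup)

print(parent_is_path, backup_is_path, name_is_str)  # True True True
print(backup.name)                                  # alice.bak
```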
Handling missing files and other errors
open raises FileNotFoundError if the path does not exist in read mode, and PermissionError if the operating system refuses access. Wrap I/O in try/except when the program can do something useful on failure:
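import json
from pathlib import Path

def load_optional_config(path: str) -> dict:
    """Return the parsed config, or an empty dict if the file is missing."""
    try:
        # json.loads stands in for whatever parser your config format needs.
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}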
If you cannot recover, do nothing — let the exception propagate. A traceback with a clear FileNotFoundError is far more useful than a silent empty result. See Error Handling for the full philosophy.
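To make the "catch only what you can handle" idea concrete, here is a small sketch. The helper read_or_default is hypothetical: it recovers from a missing file but lets every other error through, which the example demonstrates by pointing it at a directory.

```python
import tempfile
from pathlib import Path

def read_or_default(path: Path, default: str = "") -> str:
    """Hypothetical helper: return the file's text, or default if it is missing."""
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return default

propagated = None
with tempfile.TemporaryDirectory() as tmp:
    # Missing file: a case the helper knows how to handle.
    missing = read_or_default(Path(tmp) / "absent.txt", default="(none)")

    # A directory is not a readable file. The helper does not catch this,
    # so the error reaches the caller. The exact exception class depends
    # on the operating system, but it is always a subclass of OSError.
    try:
        read_or_default(Path(tmp))
    except OSError as exc:
        propagated = type(exc).__name__

print(missing)  # (none)
print(propagated)
```

Catching the narrow FileNotFoundError rather than the broad OSError is what keeps unexpected failures visible.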
Binary files
For non-text content — images, executables, compressed data — open in binary mode. read and write then deal in bytes, not str:
with open("image.png", "rb") as src, open("copy.png", "wb") as dst:
    dst.write(src.read())
For very large binary files, read in chunks:
CHUNK = 64 * 1024
with open("input.bin", "rb") as src, open("output.bin", "wb") as dst:
    while True:
        chunk = src.read(CHUNK)
        if not chunk:
            break
        dst.write(chunk)
Do not mix text and binary modes — the type errors are confusing. Decide upfront which kind of data you are handling.
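For illustration, the sketch below shows what happens when a str is written to a file opened in binary mode, and how encode and decode bridge the two worlds when you need to cross over. The file name and sample text are made up.

```python
import tempfile
from pathlib import Path

rejected = False
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.bin"  # hypothetical file name

    with open(path, "wb") as f:
        # Encode str to bytes before writing to a binary file.
        f.write("héllo".encode("utf-8"))
        try:
            f.write("oops")  # a str, not bytes
        except TypeError:
            rejected = True

    with open(path, "rb") as f:
        raw = f.read()  # bytes

    text = raw.decode("utf-8")  # back to str

print(rejected)  # True
print(raw)       # b'h\xc3\xa9llo'
print(text)      # héllo
```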
Working with JSON
JSON is the most common structured-data format. The json module integrates with file I/O directly:
import json
from pathlib import Path
data = {"name": "Alice", "roles": ["admin", "editor"]}
# Write
Path("user.json").write_text(json.dumps(data, indent=2), encoding="utf-8")
# Read
loaded = json.loads(Path("user.json").read_text(encoding="utf-8"))
print(loaded["roles"]) # ['admin', 'editor']
json.dump(obj, file) and json.load(file) take file objects directly if you want to stream:
import json
# data is the dictionary defined in the previous example.
with open("user.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)
with open("user.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)
For data that fits in memory, the read_text/json.loads pair is hard to beat for clarity.
Try it yourself. Write a function count_words_in_file(path) that returns a dictionary mapping each word to its count, using the counting pattern from Python Dictionaries. Then write the result to counts.json, formatted with indent=2.
A worked example: a tiny log analyser
A small program that reads a log file, counts log levels, and writes a summary as JSON:
import json
from collections import Counter
from pathlib import Path
LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
def analyze(log_path: Path, summary_path: Path) -> None:
    counter: Counter[str] = Counter()
    line_count = 0
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            line_count += 1
            parts = line.split(None, 2)
            if len(parts) >= 2 and parts[1] in LEVELS:
                counter[parts[1]] += 1
    summary = {
        "source": str(log_path),
        "lines": line_count,
        "by_level": dict(counter),
    }
    summary_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")
# Example usage (assuming app.log exists):
# analyze(Path("app.log"), Path("summary.json"))
This combines streaming line-by-line reading, pathlib, a Counter, and JSON output — a complete, realistic shape for a small data-processing script.
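The parsing step treats the second whitespace-separated token of each line as the log level. The sketch below applies the same split(None, 2) logic to a few invented sample lines to show which formats it recognises. The helper level_of and the sample lines are assumptions for illustration, not part of the analyser itself.

```python
from typing import Optional

LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}

def level_of(line: str) -> Optional[str]:
    """Hypothetical helper: return the level if it is the second token, else None."""
    parts = line.split(None, 2)  # split on whitespace, at most two splits
    if len(parts) >= 2 and parts[1] in LEVELS:
        return parts[1]
    return None

samples = [
    "2024-05-01T12:00:00 INFO service started",  # timestamp is a single token
    "2024-05-01 12:00:01 ERROR disk full",       # date and time are two tokens
    "",                                          # blank line
]

results = [level_of(line) for line in samples]
print(results)  # ['INFO', None, None]
```

The second sample is skipped because its second token is the time, not the level. If your logs separate date and time with a space, the index checked would need to change accordingly.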
Recap
You now know:
- Always use with open(...) as f: so the file is closed automatically
- Specify encoding="utf-8" for text files to keep your code portable
- Iterate the file object for line-by-line reading; use read() for whole-file reading
- "w" truncates, "a" appends, "x" fails if the file exists
- pathlib is the modern path API (Path.read_text, Path.write_text, Path.iterdir)
- Binary mode ("rb", "wb") deals in bytes, not str
- JSON read/write fits naturally with json.dumps/json.loads and Path
Next steps
You now have a complete toolkit for reading and writing data. The final post in this intermediate series shows how to break larger code into modules and import them — the last building block before you start writing genuinely substantial Python.
→ Next: Modules and Imports in Python
Questions or feedback? Email codeloomdevv@gmail.com.