When working with large datasets or infinite sequences, loading everything into memory isn't practical. Generators and iterators solve this problem by producing values on demand, one at a time. They're memory-efficient and form the foundation of how Python handles iteration.
In this lesson, you'll learn the difference between iterables and iterators, how to create generator functions with yield, when to use generators instead of lists, and how to build custom iterable objects. These concepts are essential for efficient Python programming.
What You'll Learn
- Understanding iterators and iterables
- Creating custom iterators
- Generator functions (yield keyword)
- Generator expressions
- When to use generators (memory efficiency)
- Practical examples with large data
Understanding Iterables and Iterators
An iterable is any object you can loop over (lists, strings, dictionaries). An iterator is an object that produces values one at a time:
# Lists are iterables
my_list = [1, 2, 3, 4, 5]
# Getting an iterator from an iterable
my_iterator = iter(my_list)
print(next(my_iterator)) # 1
print(next(my_iterator)) # 2
print(next(my_iterator)) # 3
# The for loop does this automatically
for item in my_list:
    print(item)
When you use a for loop, Python automatically calls iter() on the iterable and uses next() to get each value until StopIteration is raised.
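That protocol can be sketched by hand. This is a simplified version of what a for loop does internally:

```python
# Manually replicate the for loop's iteration protocol
my_list = [1, 2, 3]
iterator = iter(my_list)          # calls my_list.__iter__()
items = []
while True:
    try:
        items.append(next(iterator))  # calls iterator.__next__()
    except StopIteration:
        break                     # the loop ends when the iterator is exhausted
print(items)  # [1, 2, 3]
```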
Creating Custom Iterators
You can create your own iterator by implementing __iter__() and __next__() methods:
class CountDown:
    """Iterator that counts down from a number."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

# Using the custom iterator
counter = CountDown(5)
for num in counter:
    print(num)  # 5, 4, 3, 2, 1
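One consequence of __iter__() returning self is that the iterator is single-use: once __next__() has raised StopIteration, iterating again yields nothing. The same holds for any iterator, such as one obtained from a list:

```python
it = iter([5, 4, 3, 2, 1])
first_pass = list(it)    # consumes the iterator completely
second_pass = list(it)   # already exhausted, so nothing is produced
print(first_pass)   # [5, 4, 3, 2, 1]
print(second_pass)  # []
```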
Here's a more practical example:
class Range:
    """Custom range iterator."""

    def __init__(self, start, stop, step=1):
        self.start = start
        self.stop = stop
        self.step = step
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if (self.step > 0 and self.current >= self.stop) or \
           (self.step < 0 and self.current <= self.stop):
            raise StopIteration
        value = self.current
        self.current += self.step
        return value

# Using the custom range
for i in Range(0, 10, 2):
    print(i)  # 0, 2, 4, 6, 8
Generator Functions
Generator functions use the yield keyword instead of return. Calling one doesn't run its body; it returns a generator object that produces values lazily:
def countdown(n):
    """Generator that counts down from n."""
    while n > 0:
        yield n
        n -= 1

# Create a generator
counter = countdown(5)
print(counter)  # <generator object countdown at 0x...>

# Get values one at a time
print(next(counter))  # 5
print(next(counter))  # 4
print(next(counter))  # 3

# Or iterate over it
for num in countdown(5):
    print(num)  # 5, 4, 3, 2, 1
The key difference: return ends the function, but yield pauses it and resumes on the next call.
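You can watch the pause-and-resume behavior directly by recording when each part of the body runs (a small illustrative sketch):

```python
events = []

def demo():
    events.append("before first yield")
    yield 1
    events.append("between yields")
    yield 2
    events.append("after last yield")

gen = demo()
a = next(gen)   # runs the body up to the first yield, returns 1
b = next(gen)   # resumes after the first yield, returns 2
print(events)   # ['before first yield', 'between yields']

# A third next() resumes, runs to the end of the body, and raises StopIteration
try:
    next(gen)
except StopIteration:
    pass
print(events[-1])  # 'after last yield'
```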
Why Use Generators?
Generators are memory-efficient because they produce values on demand:
# Creating a list (stores everything in memory)
def squares_list(n):
    result = []
    for i in range(n):
        result.append(i ** 2)
    return result

# Using a generator (produces values on demand)
def squares_generator(n):
    for i in range(n):
        yield i ** 2

# List: stores 1,000,000 numbers in memory
big_list = squares_list(1000000)

# Generator: produces values as needed
big_gen = squares_generator(1000000)
print(next(big_gen))  # 0
print(next(big_gen))  # 1
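The gap is easy to measure with sys.getsizeof (the two functions are restated here so the snippet stands alone). Note that getsizeof reports only the container object itself, not the integers inside it, and exact numbers vary across Python versions:

```python
import sys

def squares_list(n):
    return [i ** 2 for i in range(n)]

def squares_generator(n):
    for i in range(n):
        yield i ** 2

list_size = sys.getsizeof(squares_list(1_000_000))
gen_size = sys.getsizeof(squares_generator(1_000_000))
print(f"list:      {list_size:,} bytes")  # several megabytes
print(f"generator: {gen_size:,} bytes")   # a small constant, regardless of n
```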
Practical Generator Examples
Here are real-world generator use cases:
# Example 1: Reading large files line by line
def read_large_file(filename):
    """Generator that reads a file line by line."""
    with open(filename, 'r') as file:
        for line in file:
            yield line.strip()

# Process the file without loading its entire content
for line in read_large_file("large_file.txt"):
    process(line)  # Process one line at a time
# Example 2: Infinite sequences
def fibonacci():
    """Generate Fibonacci numbers infinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Get the first 10 Fibonacci numbers
fib = fibonacci()
for _ in range(10):
    print(next(fib))  # 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
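The standard library's itertools.islice gives a tidy way to take a bounded slice from an infinite generator without a manual loop:

```python
from itertools import islice

def fibonacci():
    """Generate Fibonacci numbers infinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice stops after 10 items, so the infinite generator never runs away
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```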
# Example 3: Processing data in chunks
def chunks(data, size):
    """Split data into chunks of the specified size."""
    for i in range(0, len(data), size):
        yield data[i:i + size]

numbers = list(range(20))
for chunk in chunks(numbers, 5):
    print(chunk)  # [0, 1, 2, 3, 4], [5, 6, 7, 8, 9], ...
# Example 4: Filtering and transforming
def filter_and_transform(data, condition, transform):
    """Filter data and apply a transformation."""
    for item in data:
        if condition(item):
            yield transform(item)

numbers = range(10)
result = filter_and_transform(
    numbers,
    lambda x: x % 2 == 0,  # Condition: even numbers
    lambda x: x ** 2       # Transform: square
)
print(list(result))  # [0, 4, 16, 36, 64]
Generator Expressions
Generator expressions are like list comprehensions but create generators:
# List comprehension (creates list)
squares_list = [x ** 2 for x in range(10)]
# Generator expression (creates generator)
squares_gen = (x ** 2 for x in range(10))
print(type(squares_list)) # <class 'list'>
print(type(squares_gen)) # <class 'generator'>
# Generator expressions are memory-efficient
large_gen = (x ** 2 for x in range(1000000))
print(sum(large_gen)) # Processes without storing all values
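When a generator expression is the only argument to a function call, the extra parentheses can be dropped, which makes one-shot aggregations very compact:

```python
# Generator expression passed directly to a function: no extra parentheses
total = sum(x ** 2 for x in range(10))
print(total)  # 285

any_big = any(x > 5 for x in range(10))
print(any_big)  # True
```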
Sending Values to Generators
Generators can receive values using .send():
def number_generator():
    """Generator that can receive values."""
    while True:
        value = yield
        if value is not None:
            print(f"Received: {value}")

gen = number_generator()
next(gen)     # Advance to the first yield ("prime" the generator)
gen.send(10)  # Received: 10
gen.send(20)  # Received: 20
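A more useful pattern makes the sent value part of the computation. Here is an illustrative running-average coroutine: each send() feeds in a number and gets back the average so far:

```python
def running_average():
    """Coroutine that yields the running average of the values sent in."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # hand back the current average, wait for input
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)               # advance to the first yield
r1 = avg.send(10)
r2 = avg.send(20)
r3 = avg.send(30)
print(r1, r2, r3)       # 10.0 15.0 20.0
```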
Generator Pipelines
You can chain generators to create processing pipelines:
def numbers():
    """Generate numbers."""
    for i in range(10):
        yield i

def even_numbers(source):
    """Filter even numbers."""
    for num in source:
        if num % 2 == 0:
            yield num

def square(source):
    """Square numbers."""
    for num in source:
        yield num ** 2

# Create the pipeline
pipeline = square(even_numbers(numbers()))
print(list(pipeline))  # [0, 4, 16, 36, 64]
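Because every stage is lazy, pulling one result drives a single value through the whole chain; nothing runs until something consumes the final stage. The same pipeline written with generator expressions makes this easy to poke at:

```python
def numbers():
    for i in range(10):
        yield i

# Each stage is itself a generator; no work happens at these two lines
evens = (n for n in numbers() if n % 2 == 0)
squares = (n ** 2 for n in evens)

first = next(squares)   # pulls exactly one value through every stage
second = next(squares)
rest = list(squares)    # consumes whatever remains, on demand
print(first, second, rest)  # 0 4 [16, 36, 64]
```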
Try It Yourself
Practice with these exercises:
- Prime Generator: Create a generator that produces prime numbers. Use it to find the first 20 primes.
- File Processor: Create a generator that reads a CSV file and yields dictionaries (one per row) without loading the entire file.
- Sliding Window: Create a generator that takes a list and a window size, yielding sliding windows of that size.
- Infinite Sequence: Create a generator for an infinite sequence (like powers of 2) and use it to find values up to a limit.
- Generator Pipeline: Create a pipeline that filters, transforms, and aggregates data using multiple generator functions.
Summary
Generators and iterators are essential for memory-efficient Python programming. Iterators produce values one at a time, while generators use yield to create iterators easily. Generator expressions provide a concise syntax for creating generators.
Use generators when working with large datasets, infinite sequences, or when you only need to iterate once. They save memory and can make your code more efficient. Understanding generators will help you write better Python code and understand how many Python features work internally.
What's Next?
In the next lesson, we'll begin exploring object-oriented programming. You'll learn how to define classes, create objects, and understand the fundamental concepts of OOP. This is a major paradigm shift that will open up new ways to organize and structure your code.