How lazy loading works across Python, mobile and web
Yesterday, a close friend of mine was asked about virtualization in a Flutter interview. I'd never heard of virtualization in the context of mobile apps before; apparently it's just lazy loading. Mobile apps aren't really my strong suit, but I've worked with ListView.builder in the past.
On the web, I barely think about it. It's nuanced, sure, but I consider pandas chunking, Python generators, and PyTorch dataloaders all to be lazy loaders. That got me curious: how exactly does lazy loading work across these? And do I have to call it virtualization?
Why be lazy about loading data?
Say you have GBs worth of data/UI. You don't need to load everything; you load only what you need. Same logic as pagination: you don't send all 10,000 products, you send back page 1 with 10 products. To me, it's the same thing under different names:
- ML: not loading entire dataset into RAM
- Backend: not reading huge files (logs/databases) in their entirety
- Frontend: not rendering components that aren’t visible (think blades of grass in a game)
- Mobile: not building widgets outside the viewport
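The pagination version of this is a one-liner of index math. A minimal sketch (1-based page numbers assumed):

```python
def get_page(items, page, page_size=10):
    """Return one page of results instead of the whole list."""
    start = (page - 1) * page_size
    return items[start:start + page_size]

products = list(range(10_000))     # pretend catalog
print(len(get_page(products, 1)))  # 10 -- only page 1 goes over the wire
```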
Lazy loading across technologies
Generator is Python being lazy
Python handles this natively.
```python
def get_doubled_numbers():
    result = []
    for i in range(1_000_000):
        result.append(i * 2)
    return result

numbers = get_doubled_numbers()  # ~8 MB of memory allocated

# VERSUS

def get_doubled_numbers_lazy():
    for i in range(1_000_000):
        yield i * 2

numbers = get_doubled_numbers_lazy()  # memory saved
```

When you call a function containing yield, Python returns a generator object (and doesn't run the body) with three methods:
- `__iter__()` - makes it iterable
- `__next__()` - gets the next value
- `send()` - for two-way communication (I don't use it; look it up if you want to see what it does)
The function's state is frozen between yields. Generators don't store the entire sequence, just the current state, so they take up almost no memory. If that makes no sense, think of playing from a checkpoint vs starting from the beginning.
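A minimal demo of the pause/resume behavior: each next() call runs the body until the following yield, then freezes the frame again, locals and all.

```python
def countdown(n):
    while n > 0:
        yield n   # pause here; frame (including n) is frozen
        n -= 1    # local state survives between next() calls

gen = countdown(3)
print(next(gen))  # 3
print(next(gen))  # 2
print(next(gen))  # 1
```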
A normal function's bytecode looks like:

```
LOAD_CONST
RETURN_VALUE
```

A generator function's bytecode looks like:

```
LOAD_CONST
YIELD_VALUE  (save state, return value, pause)
```

The YIELD_VALUE opcode literally saves the current frame state (local vars, instruction pointer, eval stack), returns the value, and suspends the frame without destroying it. That's what pauses execution. When you call next() again, Python restores the saved frame, continues from the next instruction, and runs until the next YIELD_VALUE or the end of the function.
Basically, generators are functions that can pause and resume. Coroutines are the same idea, except they're built around receiving values after being started (generators can too, via send(), but it's rarely used).
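For completeness, here's a small sketch of the two-way communication send() enables: the value you pass to send() becomes the result of the yield expression inside the generator.

```python
def running_total():
    total = 0
    while True:
        received = yield total  # pause, hand back total, wait for a value
        total += received

acc = running_total()
next(acc)            # prime the generator; runs up to the first yield
print(acc.send(5))   # 5
print(acc.send(10))  # 15
```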
Looking at the bytecode confirms this. YIELD_VALUE is basically a checkpoint. When you call next(), it jumps directly to that checkpoint.
```python
import dis

def simple_gen():
    yield 1
    yield 2

dis.dis(simple_gen)
# output
#   3           0 RETURN_GENERATOR
#               2 POP_TOP
#               4 RESUME                   0
#
#   4           6 LOAD_CONST               1 (1)
#               8 YIELD_VALUE              1    --> this is a checkpoint
#              10 RESUME                   1
#              12 POP_TOP
#
#   5          14 LOAD_CONST               2 (2)
#              16 YIELD_VALUE              1    --> another checkpoint
#              18 RESUME                   1
#              20 POP_TOP
#              22 RETURN_CONST             0 (None)
#
#  >>          24 CALL_INTRINSIC_1         3 (INTRINSIC_STOPITERATION_ERROR)
#              26 RERAISE                  1
```

Chunking is lazy loading for data
Pandas has an easy hack for loading huge CSVs - chunksize. Each chunk is just a dataframe.
```python
for chunk in pd.read_csv('huge_file.csv', chunksize=10_000):
    # process chunk
    ...
```

pd.read_csv(huge_file_path, chunksize=N) returns a TextFileReader object - basically a generator wrapper around the file. I'm simplifying, but something like this should work:
```python
class TextFileReader:
    def __init__(self, filepath, chunksize):
        self.file = open(filepath)
        self.chunksize = chunksize
        self.parser = CSVParser()

    def __iter__(self):
        return self

    def __next__(self):
        # read N lines
        lines = []
        for _ in range(self.chunksize):
            line = self.file.readline()
            if not line:
                break  # end of file; keep the partial chunk
            lines.append(line)
        if not lines:
            raise StopIteration
        # parse into df
        return self.parser.parse(lines)
```

Memory usage here is just the single chunk + parser overhead.
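To see the same pattern without pandas, here's a stdlib-only sketch that chunks lines lazily (io.StringIO stands in for the huge file):

```python
import io

def read_in_chunks(fileobj, chunksize):
    """Yield lists of lines, chunksize at a time; the file is
    never fully materialized in memory."""
    while True:
        # zip with a bounded range pulls at most chunksize lines
        lines = [line for _, line in zip(range(chunksize), fileobj)]
        if not lines:
            return
        yield lines

fake_csv = io.StringIO("\n".join(f"row{i}" for i in range(7)) + "\n")
sizes = [len(chunk) for chunk in read_in_chunks(fake_csv, 3)]
print(sizes)  # [3, 3, 1]
```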
Batching in PyTorch is the same thing
PyTorch has a Dataset and DataLoader (to load items in batches).
```python
from torch.utils.data import Dataset, DataLoader

class ImageDataset(Dataset):
    def __init__(self, image_paths):
        self.paths = image_paths

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # only loads when requested
        img = load_image(self.paths[idx])
        return transform(img)

dataset = ImageDataset(paths)  # no images loaded yet
loader = DataLoader(dataset, batch_size=32, num_workers=4)

for batch in loader:
    # images actually load here
    train(model, batch)
```

This, of course, uses worker processes. While one batch is training, a worker fetches the next batch so the GPU doesn't have to wait for data.
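The core batching mechanics can be sketched without torch: a generator that groups indices into batches and only calls the per-item loader on demand (load_item here is a hypothetical stand-in for your loading/transform code).

```python
def batched_loader(dataset_len, batch_size, load_item):
    """Yield batches of lazily loaded items; only one batch's
    worth of items is materialized at a time."""
    for start in range(0, dataset_len, batch_size):
        indices = range(start, min(start + batch_size, dataset_len))
        yield [load_item(i) for i in indices]  # items load on demand

loaded = []
def fake_load(i):
    loaded.append(i)   # track when each item actually loads
    return i * i

batches = batched_loader(10, 4, fake_load)
print(len(loaded))  # 0 -- nothing loaded yet
first = next(batches)
print(first)        # [0, 1, 4, 9]
print(len(loaded))  # 4 -- only the first batch was loaded
```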
What about virtualization?
In OS land, we like to pretend we have more RAM than we do by swapping to disk.
In UI context, virtualization apparently means - pretend you have N items, but only render M items (M << N).
I think lazy loading is clearer, and that's what I'd say we're doing: building more widgets only when we have to.
Next.js lazy loading
In Next.js, lazy loading is about loading JavaScript, not data.
```javascript
// this loads immediately
import HeavyComponent from './HeavyComponent'

// this loads lazily
const HeavyComponent = dynamic(() => import('./HeavyComponent'))
```
next/dynamic:
- creates a separate js bundle for that component at build time
- returns a wrapper component that loads bundle when rendered (at runtime)
Next.js uses Webpack’s code splitting for this.
Is Intersection Observer obsolete?
Not really. next/dynamic handles code splitting (loading js). Intersection Observer handles loading images/content and viewport stuff.
```javascript
// for lazy loading images
const [isVisible, setIsVisible] = useState(false);

useEffect(() => {
  const observer = new IntersectionObserver(([entry]) => {
    if (entry.isIntersecting) {
      setIsVisible(true);
      observer.disconnect();
    }
  });
  observer.observe(elementRef.current);
}, []);

return isVisible ? <img src="large.jpg" /> : <div>loading large image..</div>;
```

The browser's native loading="lazy" attribute does the equivalent internally, so you don't have to write this by hand. Intersection Observer was kind of a big thing because it let the browser handle viewport tracking itself, batch notifications, and optimize the whole thing. I think most devs know this.
Final Thoughts
I think lazy loading is something we take for granted. It’s great that we don’t have to care, but implementation varies wildly:
- Python generators use function state freezing
- Pandas uses file cursors and iterators
- PyTorch uses multi-process worker pools
- Flutter tracks viewport changes and manages widget lifecycle
- Next.js uses code splitting and dynamic imports
The idea is: defer, load on demand, discard. I don't know why you'd prefer to call it virtualization - I'd just call it lazy loading widgets/data/whatever.
Footnotes:

- Pandas chunksize works with read_csv, read_sql, and read_json (with lines=True), but not with read_parquet, because Parquet files are column-oriented and streaming row-by-row isn't efficient.
- I've heard the Intersection Observer API was added to browsers because people kept writing terrible scroll event listeners for lazy loading. It's way better because the intersection checks happen inside the browser's rendering pipeline instead of in your scroll handlers. I'm certainly happy Next.js takes care of it.