I'm not entirely sure I follow what you're aiming for here yet. Are you trying to aggregate across features per pixel to calculate various statistics? For example: what is the average feature size across all features that intersect a given pixel?
So if the first pixel has features 1, 2, and 3, you want to calculate statistics on an attribute of those features, and if the second pixel has only features 1 and 3, you'd do the same? In other words, your 3rd dimension, and your basis for statistics, is the set of features per pixel?
How many features do you have? As written, I'd expect the runtime to scale as (number of features) × (time to rasterize a single feature), plus the cost of converting the data between formats on each iteration of the loop (i.e., from GDAL's internal representation to numpy). I'd recommend timing the rasterization of a single feature and checking how many features you have; it may be that 3 hours is perfectly normal for that count.
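A quick back-of-the-envelope check along those lines (a sketch; `rasterize_one` is a stand-in for whatever your actual per-feature rasterization call is):

```python
import time

def estimate_total_seconds(rasterize_one, sample_feature, n_features):
    """Time one per-feature rasterization and extrapolate to all features."""
    t0 = time.perf_counter()
    rasterize_one(sample_feature)
    per_feature = time.perf_counter() - t0
    return per_feature * n_features

# Stand-in callable; swap in your actual per-feature rasterize call
estimate = estimate_total_seconds(lambda feature: None, None, 50_000)
```

If the extrapolated total is in the same ballpark as your observed 3 hours, the loop itself is the cost, not some hidden inefficiency.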
In contrast, when you rasterize all features at once, you pay those data format conversion costs only once. A few things I'd try:
1) Rasterize feature IDs (the index within your gdf) instead of a float attribute value. This lets rasterize work on an unsigned integer dtype instead of float64, which in theory means a much smaller memory footprint and less I/O during the data format conversion.
You then only have to rasterize once; afterwards you can use the gdf to build attribute arrays on the fly by mapping each feature ID to the attribute you want.
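A minimal sketch of that lookup step, assuming you've already rasterized feature IDs (e.g. by passing `zip(gdf.geometry, gdf.index)` as the shapes to `rasterio.features.rasterize`); the IDs and attribute values here are made up:

```python
import numpy as np

NODATA = -1  # marks pixels no feature covers

# Hypothetical result of rasterizing (geometry, feature_id) pairs
id_raster = np.array([[0, 0, 2],
                      [1, 2, NODATA]])

# Attribute values aligned with gdf.index (made-up numbers, e.g. feature area)
attr = np.array([10.0, 20.0, 30.0])

# Build the attribute raster on the fly: index the attribute column by ID,
# then mask out the nodata pixels
attr_raster = np.where(id_raster == NODATA, np.nan, attr[id_raster])
# attr_raster:
# [[10. 10. 30.]
#  [20. 30. nan]]
```

The fancy-indexing step (`attr[id_raster]`) is a single vectorized numpy operation, so swapping which attribute you analyze becomes essentially free compared to re-rasterizing.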
2) Rasterize onto the smallest possible array for each feature. The idea is that you figure out your target array dimensions, then calculate the "window" within them covered by the feature's bounds. Rasterizing onto that smaller window is a bit more efficient because it shrinks the arrays that need to be converted; you then paste the result into the target array. You'll have to keep the transforms aligned so that the windowed result can be pasted directly into the target. The gains (if any) are likely limited to cases where individual features cover only small areas of the target.
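Here's a sketch of the window bookkeeping with plain arithmetic (rasterio's `rasterio.windows.from_bounds` does essentially this for you, given an affine transform); the grid origin, pixel size, and bounds below are made up, and the `ones` array stands in for the actual windowed rasterization result:

```python
import math
import numpy as np

def bounds_to_window(bounds, origin_x, origin_y, pixel_size, shape):
    """Convert feature bounds to (row, col) slices in the target grid.

    Assumes a north-up grid with square pixels: x increases with columns,
    y decreases with rows, and (origin_x, origin_y) is the top-left corner.
    """
    minx, miny, maxx, maxy = bounds
    col0 = max(0, math.floor((minx - origin_x) / pixel_size))
    col1 = min(shape[1], math.ceil((maxx - origin_x) / pixel_size))
    row0 = max(0, math.floor((origin_y - maxy) / pixel_size))
    row1 = min(shape[0], math.ceil((origin_y - miny) / pixel_size))
    return slice(row0, row1), slice(col0, col1)

target = np.zeros((100, 100))

# Feature bounds (minx, miny, maxx, maxy) on a grid with top-left at (0, 100)
rows, cols = bounds_to_window((2.0, 90.0, 5.0, 95.0), 0.0, 100.0, 1.0,
                              target.shape)

# Rasterize onto the small window (stand-in: ones), then paste it in
small = np.ones((rows.stop - rows.start, cols.stop - cols.start))
target[rows, cols] = small
```

Clamping to the target shape handles features that spill past the grid edge; the floor/ceil pair guarantees the window fully covers the bounds.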
3) Do some profiling / timing of individual operations to find the ones taking the most time. I'd be interested in seeing where your code is spending most of its time.
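For example, with the standard library's cProfile (the `workload` function here is just a stand-in for your rasterization loop):

```python
import cProfile
import io
import pstats

def workload():
    # Stand-in for your per-feature loop; replace with your actual code
    total = 0
    for i in range(1000):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Print the five entries with the highest cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

If the hot spot turns out to be inside GDAL's rasterization itself rather than the Python-side conversions, that tells you options 1 and 2 above will only buy you so much.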