Performance Optimization: Eliminating Redundant Function Calls
Retrospective from the ArtCraft project - a Warcraft-style RTS built with Love2D
Objective
Eliminate redundant calls to getAllUnits() and getAllBuildings() which were being called 3+ times per frame, creating new tables and iterating through 7 unit types + 10 building types each time.
Analysis
Initial Investigation
Examined gameplay.lua to trace all call sites:
| Line | Function | Context |
|---|---|---|
| 2668-2669 | Both | Fog of war update |
| 2675 | getAllBuildings() |
Immediately after above (redundant) |
| 2971-2972 | Both | Combat updates |
| 298-299 | Both | Inside separateUnits() |
| 3041 | getAllUnits() |
After separateUnits() (redundant) |
Key Finding
Not all calls were truly redundant. Between some calls, the underlying data changes:
- Lines 2668 -> 2971: Units can spawn (peons from town hall, footmen from barracks, etc.) during building updates. Fresh call is necessary.
- Lines 2971 -> 3038:
removeDeadUnits()modifies source arrays.separateUnits()needs fresh data. Necessary. - Line 2675: Called 6 lines after 2669 with no data changes. Redundant.
- Line 3041: Called immediately after
separateUnits()which already computed the same list. Redundant.
Solution Approaches Considered
Approach 1: Frame-Level Caching (Rejected)
Initially proposed adding module-level cache variables:
local cachedBuildings = nil
local cachedUnits = nil
local function invalidateFrameCache()
cachedBuildings = nil
cachedUnits = nil
end
Why rejected: User raised valid concern about polluting module scope with more locals that could cause problems later (stale data bugs, forgotten invalidation). Agreed to pursue simpler approach.
Approach 2: Remove Redundant Calls (Implemented)
Simply restructure code to reuse existing variables where data hasn’t changed.
Implementation
Change 1: Remove redundant getAllBuildings() at line 2675
Before:
local allUnits = getAllUnits()
local allBuildings = getAllBuildings()
map:updateFog(allUnits, allBuildings, playerTeam)
calculatePopulation()
updateRequirementsState()
local buildings = getAllBuildings() -- REDUNDANT
-- Town hall
local peonReady, upgradeComplete, _ = townHall:update(gameDt)
After:
local allUnits = getAllUnits()
local allBuildings = getAllBuildings()
map:updateFog(allUnits, allBuildings, playerTeam)
calculatePopulation()
updateRequirementsState()
-- Town hall
local peonReady, upgradeComplete, _ = townHall:update(gameDt)
Note: The removed buildings variable was never used - allBuildings from line 2669 was already available.
Change 2: Modify separateUnits() to return its computed lists
Before:
local function separateUnits()
local allUnits = getAllUnits()
local buildings = getAllBuildings()
-- ... separation logic ...
end
After:
local function separateUnits()
local allUnits = getAllUnits()
local allBuildings = getAllBuildings()
-- ... separation logic ...
return allUnits, allBuildings
end
Also renamed internal buildings to allBuildings for consistency.
Change 3: Reuse returned values instead of redundant call
Before:
-- Separate overlapping units
separateUnits()
-- Ensure no units are inside buildings (safety check)
local allUnits = getAllUnits() -- REDUNDANT
for _, unit in ipairs(allUnits) do
pushUnitOutOfBuildings(unit)
end
After:
-- Separate overlapping units (returns fresh lists after dead removal)
local allUnits, allBuildings = separateUnits()
-- Ensure no units are inside buildings (safety check)
for _, unit in ipairs(allUnits) do
pushUnitOutOfBuildings(unit)
end
Verification
Benchmark Creation
Created benchmark_getAllUnits.lua to measure the improvement. Added --benchmark flag to main.lua for easy execution.
The benchmark simulates both OLD (8 function calls per frame) and NEW (6 function calls per frame) patterns across different game sizes.
Benchmark Results
Ran love . --benchmark:
| Scenario | Units | Buildings | OLD | NEW | Improvement | Per Frame |
|---|---|---|---|---|---|---|
| Early game | 20 | 10 | 0.028s | 0.021s | 25% | 0.7 us |
| Mid game | 50 | 20 | 0.065s | 0.048s | 27% | 1.75 us |
| Late game | 100 | 30 | 0.116s | 0.086s | 26% | 3.06 us |
| Stress test | 200 | 50 | 0.206s | 0.165s | 20% | 4.05 us |
Real-World Impact
At 60 FPS with 100 units (typical late-game scenario):
- 3 us saved per frame
- 180 us saved per second
- Reduces table allocations by 25% for these functions
- Less GC pressure from fewer temporary tables
Issues Encountered
-
Edit tool string matching: Initial edit failed because I copied grep output that had a typo (
ballistasinstead ofunits). Resolved by re-reading the exact file content. -
No standalone Lua interpreter:
luaandluacweren’t installed. Resolved by integrating benchmark into Love2D via--benchmarkflag. -
Git stash workflow: Used
git stash/git stash popto compare before/after using the same benchmark code.
Conclusion
Reduced getAllUnits()/getAllBuildings() calls from 8 to 6 per frame by eliminating genuinely redundant calls while preserving necessary ones where data changes. The optimization is minimal in code complexity (no new module-level state) while providing measurable improvement (~25% reduction in this specific overhead).