The Reason I Decided to Write This Tutorial and What We'll Be Covering

On the internet, I often see misinformation about optimization, or people using Unreal Engine tools in the most unoptimized ways possible. This includes, but is not limited to: creating individual materials for every texture, using inefficient file formats for textures, assigning complex collision to meshes where it's not needed, running CPU calculations on thousands of particles just to create puddle splash effects, using particles where a material shader could get the job done, and focusing way too much on triangle counts (this one especially came up a lot during my teaching sessions).

I’ll try to dive into every major topic, from baking lights to creating efficient materials, reducing draw calls, and identifying what overloads your CPU, in the simplest way possible. That said, you should be aware that every project will have its own needs, and the "sacrifices" you need to make will depend on the genre and how your game is structured. This will be quite an extensive guide, so I’ll break it into different parts. At the end of each post, you’ll find links to the previous and next tutorials. Now that we’ve got that out of the way, let’s begin the tutorial.

The Fundamentals of Video Game Optimization: Concepts and Terms You Need to Be Familiar With


Draw Call

Almost everything in your game contributes to a "draw call."

  • A static mesh in the scene: Contributes a draw call.
  • A different material on the same mesh: Can cause a separate draw call.
  • An actor: Might contribute a draw call if it includes visible components.
  • Lighting: Dynamic lights can increase draw calls. Baked/static lighting generally doesn't, because it is calculated offline and adds no draw calls at runtime.

What the hell is a "draw call," then?

A "draw call" is basically when the CPU "calls" the GPU and says, "Hey, we have this mesh here. You need to draw (render) it." And the GPU goes, "Alright, I'll do it." Modern GPUs can render LOTS of polygons, but every call has to be prepared and submitted by the CPU. When the CPU has to organize too many of them per frame, it gets saturated and performance takes a hit, even if the GPU itself has capacity to spare. This is the main reason why we need to keep our draw calls at a reasonable level.
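
To make the counting a bit more concrete, here is a rough Python sketch. It is a toy model, not how Unreal's renderer actually works: it simply assumes one draw call per material slot on every visible mesh. The `Mesh` class, its fields, and `estimate_draw_calls` are names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mesh:
    """Hypothetical stand-in for a static mesh placed in the level."""
    name: str
    material_slots: int   # each distinct material on the mesh adds a call
    visible: bool = True  # culled meshes are never submitted to the GPU


def estimate_draw_calls(meshes):
    """Toy estimate: one draw call per material slot on every visible mesh."""
    return sum(m.material_slots for m in meshes if m.visible)


scene = [
    Mesh("Table", material_slots=1),
    Mesh("Lamp", material_slots=3),   # shade, stand, bulb
    Mesh("Chair", material_slots=2),
    Mesh("Bookshelf", material_slots=1, visible=False),  # behind the camera
]

print(estimate_draw_calls(scene))  # 1 + 3 + 2 = 6
```

Real numbers will be higher than this estimate, since passes such as shadows can submit the same mesh again, but the relationship holds: more meshes and more materials mean more calls for the CPU to prepare.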


Understanding CPU and GPU Overhead; and Common Practices to Prevent It

When it comes to draw calls, the heavy work (the actual rendering) is handled by the GPU. The CPU’s role is mostly to prepare and issue the “call.” Most of the time, an unoptimized scene will overload the GPU, since more complex materials, shaders, high-resolution textures, post-processing effects, etc., all mean more work for it. However, it's important to remember that the CPU still plays a crucial role in the process, and its resources are used as well.

The CPU is more concerned with collisions. It constantly checks for collision overlaps, handling responses such as: "Did it trigger? Did it bounce? Did it block?". More collisions to check = more overhead on the CPU. The more complex the collision gets = the more overhead on the CPU.

If you're wondering which has more impact on performance, draw calls or collisions, the answer is a bit of both. For static objects, high draw calls might matter more. For dynamic or interactive objects, complex collision is usually more damaging. A single merged mesh with complex collision can wreck performance far worse than a dozen simple meshes with more draw calls.

Collision is expensive. Complex collision is even more expensive. Remove collision from objects the player can't interact with. A good example would be small items like books, lamps, keys, or pens placed on a table or sofa. If the player can't interact with them and won't collide with them, remove their collision entirely. For objects the player can collide with, delete the default collision generated by Unreal and add a simple box (or capsule) collision instead. Instead of using complex collision for traversal or objects with intricate creases, bumps, or shapes, use simple convex collision or combine multiple basic shapes. I explain why and how to do this in more detail below.

Your biggest direct enemy isn’t triangle count. It still matters, but here’s the truth: modern hardware can handle far more triangles than you’ll typically need. The real performance bottleneck, especially in modern engines, is lighting complexity and shadows. Think dynamic lights, overlapping lights, reflections, volumetrics: basically anything that’s calculated in real time. All of this puts a heavy load on the GPU. Triangle count affects how many vertices need to be processed, but unlike lighting, that cost doesn’t scale with screen resolution (which is good).

Then there are your shaders. Complex materials and effects, like subsurface scattering, dynamic global illumination, and especially translucent materials, are GPU-intensive. Most modern engines use deferred rendering, which separates lighting from geometry. That allows you to push more triangles, but lighting still dominates the cost. Lighting affects all objects, no matter their poly count. Even low-poly meshes can be expensive if they’re lit by multiple dynamic lights with shadows.

Then come post-processing effects, and after that, draw calls and object counts. Triangle count becomes more of a factor if you have tons of unique objects (which increases draw calls and physics complexity if collisions are enabled), or if you're targeting mobile platforms. None of this means you should stop optimizing your assets or ignore LODs, but triangle count usually isn’t the most critical issue to tackle first.
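
The point about screen resolution can be shown with some back-of-the-envelope arithmetic. The sketch below is a deliberately crude cost model with arbitrary units: it assumes vertex work depends only on triangle count, and that every dynamic light touches every pixel (a worst case). The function names and the numbers are mine, not measurements.

```python
def vertex_work(triangles):
    """Toy cost: grows with geometry only, independent of resolution."""
    return triangles * 3  # three vertices per triangle, ignoring sharing


def lighting_work(width, height, dynamic_lights):
    """Toy cost: every shaded pixel pays for every dynamic light."""
    return width * height * dynamic_lights


triangles = 2_000_000
lights = 4

for label, (w, h) in {"1080p": (1920, 1080), "4K": (3840, 2160)}.items():
    print(label, vertex_work(triangles), lighting_work(w, h, lights))
```

Going from 1080p to 4K leaves the vertex number untouched but multiplies the lighting number by four, which is why trimming lights and shadows often pays off more than trimming triangles.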

Modern hardware (even low-end systems) can handle a lot of draw calls, but it’s super easy to go overboard, and excessive numbers still hurt performance, especially on lower-end systems or mobile platforms. If you have a bunch of stuff lying on a bed, merge everything into a single mesh using the "Static Mesh Merge Tool" in Unreal. Be aware that when multiple static meshes are merged into one, each of the original objects retains its respective collision. You'll need to remove those collisions from the merged mesh and add a single simplified one instead. More collisions to calculate = bigger impact on performance.

If you've been reading about optimization in Unreal Engine 5, you might have come across the terms "per-poly" and "simple convex collision." But what do they actually mean?

  • Per-Poly: Complex Collision (collision tested against the mesh's actual triangles)
  • Simple Convex Collision: Simple Collision (simplified, convex shapes like boxes, spheres, or capsules)

Now, here's a question for you:


Which one is cheaper: Having multiple simple box collisions in one mesh, or a single complex collision?

In almost all cases, using multiple simple box collisions is significantly cheaper than relying on a single complex (per-poly) collision. Complex collisions place a substantial load on the CPU and should rarely be relied on at runtime. For performance-critical objects, especially if they’re interactable or movable, use multiple simple box or convex collisions instead. Avoid complex collision whenever possible. Instead of placing multiple basic shapes by hand, you can alternatively double-click the asset, go to Collision > Auto Convex Collision, and make the necessary adjustments in the bottom-right corner. This allows you to create a simple yet detailed collision, and it’s still more performance-friendly than using complex collision.


Let’s say we merged a few meshes into one. If I add 3 box collisions in the collision panel, is it still considered one simple collision or 3 simple collisions?

If you manually add three box collisions to a mesh, Unreal Engine will treat them as three separate simple collision shapes, not one unified shape. While it still counts as “Simple Collision,” using multiple collision primitives adds a small amount of processing overhead compared to using a single box. However, it’s still far cheaper than using per-poly (complex) collision. Wherever possible, try to use a single simple collision shape for the entire object. That said, if the shape is complex and requires multiple boxes, using several simple box collisions is still the recommended approach over complex collision.
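
Here is a small Python sketch of why a handful of boxes is so cheap. It is not Unreal's physics code, just a plain axis-aligned box overlap test run against a compound of three boxes. The `Box` class, the couch dimensions, and the 12,000-triangle figure are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box given by min and max corners (hypothetical helper)."""
    min_corner: tuple
    max_corner: tuple

    def overlaps(self, other):
        # Two boxes overlap only if their extents overlap on every axis.
        return all(
            self.min_corner[i] <= other.max_corner[i]
            and other.min_corner[i] <= self.max_corner[i]
            for i in range(3)
        )


def hits_simple_collision(boxes, query):
    """Return (hit, tests_performed) for a compound of simple boxes."""
    tests = 0
    for box in boxes:
        tests += 1
        if box.overlaps(query):
            return True, tests
    return False, tests


# A couch approximated by three boxes (made-up dimensions, in cm).
couch = [
    Box((0, 0, 0), (200, 80, 40)),    # seat
    Box((0, 0, 40), (200, 20, 90)),   # backrest
    Box((0, 0, 40), (20, 80, 60)),    # armrest
]
player = Box((190, 70, 0), (230, 110, 180))

hit, tests = hits_simple_collision(couch, player)
print(hit, tests)

# A per-poly version has the whole render mesh as candidates instead.
render_triangles = 12_000  # assumed triangle count of the visual mesh
print(f"worst case: {len(couch)} box tests vs {render_triangles} triangle tests")
```

A real physics engine uses acceleration structures, so it will not literally test every triangle, but the gap between "three cheap box checks" and "thousands of candidate triangles" is the reason simple collision wins.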


What if we have a cave-like system?

The best approach is still to use multiple simple collision shapes, such as boxes, capsules, or convex hulls, to build out the shape of the cave interior. Just to underline: complex collision is very CPU-intensive and can cause issues with dynamic physics objects, line traces, or AI navigation. Another solid option is to render only the cave asset while overlaying the walkable areas with custom blocking volumes or collision-only meshes that aren't rendered. Always test with the player controller, AI navigation, and physics objects to ensure that both performance and collision behavior are acceptable.


If complex collision is so expensive, why the hell do we even have that option?

There are cases where complex collision is genuinely useful, but in day-to-day game development its main appeal is convenience: the “Use Complex Collision as Simple” option lets developers quickly move on with prototyping. It lets artists or level designers get assets into the game without needing to create custom collision meshes right away.


Hierarchical Instanced Meshes vs. Static Meshes: When to Use Which?

Hierarchical Instanced Static Meshes (HISMs) are great if you have many instances of the same object, like fences, pots, etc. If you instance one mesh 100 times, it still only results in one draw call (assuming the mesh uses a single material). Another way to put it: if you have 10 different meshes, each populated 100 times using a Hierarchical Instanced Static Mesh (HISM), you’d end up with 10 draw calls. One important thing to watch out for is collision. Each instance retains its own collision, but it's handled efficiently thanks to instance-based physics optimizations.
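
The arithmetic above can be sketched in a few lines of Python. This is a toy count, assuming one material per mesh and ignoring LODs and any automatic batching the engine may do on its own. The mesh names and both helper functions are made up for the example.

```python
def draw_calls_separate(placements):
    """Each placed static mesh is submitted on its own."""
    return len(placements)


def draw_calls_instanced(placements):
    """One instanced batch per unique mesh, however many copies exist."""
    return len(set(placements))


# 10 different meshes, each placed 100 times (names are made up).
placements = [f"SM_Prop_{i:02d}" for i in range(10) for _ in range(100)]

print(draw_calls_separate(placements))   # 1000
print(draw_calls_instanced(placements))  # 10
```

The instanced count only depends on how many unique meshes you use, not on how many copies you scatter around the level.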


Question 01: Which one is better?

Let’s say we have a garden level. Inside the garden, there’s a house structure with various assets (pots, flowers, wood piles, etc.) scattered around it. Is it better to combine all of these into one static mesh, or use HISM?

Answer: Suppose you combined all the pots, flowers, and wood piles into a single mesh. The advantage would be having only one draw call, whereas you'd have three draw calls with HISM (one per unique mesh). The downside is that even if just 1% of that combined mesh is in the camera view, the entire thing will be rendered. Culling efficiency would be practically nonexistent, and LOD management would become much more difficult. You could consider combining objects area by area to create several smaller merged meshes, but then you'd increase the draw calls and add more static meshes to manage, which can become time-consuming, especially when dealing with individual collisions. So the answer is to use HISM for widely scattered objects.
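
To illustrate the culling trade-off, here is a simplified Python sketch. It works in 2D (top-down) and uses a rectangle for the camera's view instead of a real frustum. The `Bounds` class, the pot layout, and the triangle count per pot are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    """2D top-down bounding rectangle (a simplification of a 3D box)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other):
        return (self.x0 <= other.x1 and other.x0 <= self.x1
                and self.y0 <= other.y1 and other.y0 <= self.y1)

    def union(self, other):
        return Bounds(min(self.x0, other.x0), min(self.y0, other.y0),
                      max(self.x1, other.x1), max(self.y1, other.y1))


TRIS_PER_POT = 500  # assumed triangle count per pot

# Ten pots spread along a 100 m garden path, one every 10 m (units: metres).
pots = [Bounds(x, 0, x + 1, 1) for x in range(0, 100, 10)]

# The camera only sees the first 15 m of the path.
view = Bounds(0, -5, 15, 5)

# Per-instance culling: only pots whose own bounds touch the view are drawn.
instanced_tris = sum(TRIS_PER_POT for p in pots if p.intersects(view))

# Merged mesh: one big bounding box, so it is all or nothing.
merged_bounds = pots[0]
for p in pots[1:]:
    merged_bounds = merged_bounds.union(p)
merged_tris = TRIS_PER_POT * len(pots) if merged_bounds.intersects(view) else 0

print(instanced_tris, merged_tris)
```

With only two pots in view, the instanced setup draws two pots' worth of triangles, while the merged mesh draws all ten because its single bounding box touches the view.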


Question 02: Here’s another scenario for you to think about. Let’s say I have 5 unique meshes. Which of the following would be more performant:

  • 5 unique static meshes, each instanced 5 times via HISM (so we have 25 instances total in the level)
  • 5 unique static meshes, each duplicated manually 5 times (copy-pasted), then merged into a single mesh using the Static Mesh Merge tool, with a single simple box collision applied to the combined asset.

Which is more effective in terms of performance?

  • 5 unique meshes, each instanced 5 times: 5 draw calls
  • 25 meshes combined into one: 1 draw call

Answer: Well, the answer is a bit more complicated in this case. Remember, we said HISM is amazing for scattered objects. Here, we’re comparing 5 draw calls (HISM) versus 1 draw call (merged static mesh). Now, the second important factor comes into play: are these assets all non-interactive decorative meshes? (Think of a bunch of books, pens, paper, etc. on a table.) Will they all be rendered at the same time using a shared simple box collision? If we know all objects will always appear together and never need to be culled individually, merging can be slightly more efficient due to the reduced number of draw calls.

When it comes to choosing between merging and using HISM, the verdict boils down to draw calls, collision complexity, and culling efficiency.

If the objects are repetitive and widely spread: use HISM.

  • 🗸 Better/faster culling (since HISM allows Unreal to batch-cull multiple instances).
  • 🗸 Physics is optimized, as UE5 can handle instanced collisions more efficiently.
  • 🗸 If 9 out of 10 instances are off-screen, only the visible one is drawn. Each instance has its own bounding box, and culling happens per instance. This makes HISMs very efficient for scenes with many repeated objects like fences, barrels, trees, etc.
  • ✗ The only downside of HISM is higher CPU overhead when managing large instance counts dynamically.

In most cases, HISM is the better choice, unless the entire merged object is always fully visible.

If the objects are tightly packed, will always be together (think of books on a bookshelf), never need to be culled separately, and have minimal dynamic needs: merge them.

  • 🗸 Lowest draw call count
  • 🗸 Culling is handled per-object (one visibility test for the whole merged mesh instead of one per piece), which can be beneficial for large meshes.
  • ✗ Less efficient culling (If any part of the merged mesh’s bounding box is visible to the camera, even just 5% of it, the entire mesh is rendered. This is especially problematic when the mesh is large or the merged objects are spread far apart.)
  • ✗ Less flexibility (You can’t dynamically remove or reposition individual parts of the merged mesh.)
  • ✗ Combined collision (Collision is also merged, which can be less efficient if your original setup relied on per-mesh physics calculations.)

Please note that if the books are spread out across a scene or reused in many places (e.g., on multiple shelves, tables, or floors), using HISM is preferable, as it offers better per-instance culling, efficient batching, and more flexible reuse.



The Final Verdict on HISM vs. Merged Objects

  • Close-together static objects → Merge into a single mesh.
  • Far-apart, repeatable objects → Use HISM.
  • Far-apart, non-repeatable objects → Keep as separate meshes or manually group small clusters.
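
If it helps to see the verdict as a decision rule, here is a tiny Python sketch of it. The function and its two yes/no inputs are a simplification I made up. In a real project you would also weigh collision needs, LODs, and whether anything has to move at runtime.

```python
def choose_strategy(close_together: bool, repeated: bool) -> str:
    """Map the rules of thumb above to a suggestion (hypothetical helper)."""
    if close_together:
        return "merge"      # tightly packed, always seen together
    if repeated:
        return "hism"       # same mesh scattered far and wide
    return "separate"       # unique meshes, far apart


print(choose_strategy(close_together=True, repeated=False))
print(choose_strategy(close_together=False, repeated=True))
print(choose_strategy(close_together=False, repeated=False))
```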