# of-graphics
s
A comprehensive look at what’s new in Metal from WWDC 2020. TBDR is coming to Macs with Apple Silicon, harmonizing CPU and GPU architectures across all Apple devices. This comes after they’ve been designing their own CPUs for a decade and their own GPUs for six years. The GPU compute capabilities seem very mature; have a look at the function pointer feature and the ray tracing APIs. It seems like Apple is ramping up to position this as a serious alternative to the energy-hungry, overdraw-heavy classic world of desktop graphics. It will be interesting to see how the industry reacts to a different way of slicing the market: previously you’d build for either PC/console or mobile, with mobile often an afterthought, ported separately if at all. Soon you can target part of mobile plus part of notebook/desktop (plus perhaps a more capable Apple TV, which would qualify as a gaming console?) with the same architecture, though of course limited to Apple devices. It was easy to ignore Macs for gaming in the past, but now Mac support basically comes “for free” if you’re already shipping for mobile/iOS. http://metalkit.org/2020/07/03/wwdc20-whats-new-in-metal.html
👀 1
🧐 1
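As a rough illustration of the feature checks behind those new capabilities, here is a minimal Swift sketch (not from the linked article; it assumes the macOS 11 / iOS 14 SDKs) that asks a Metal device whether it supports the new ray tracing and function pointer APIs:

```swift
import Metal

// Minimal capability check for the WWDC20 features mentioned above.
// Illustrative sketch only; assumes the macOS 11 / iOS 14 SDKs.
if let device = MTLCreateSystemDefaultDevice() {
    print("GPU:", device.name)
    print("ray tracing supported:", device.supportsRaytracing)
    print("function pointers supported:", device.supportsFunctionPointers)
}
```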
w
correction: it is not an alternative to overdraw, it just makes overdraw cheap
g
I absolutely want Apple to be a gaming platform, specifically the Mac. But the current GPUs are not remotely fast enough, even in the top MacBook Pro, for anything other than casual gaming. The three-year-old NVIDIA GTX 1060 in my 2017 Razer is often twice as fast as the AMD Radeon Pro 5500M in the top-end 16" MacBook Pro. If you only care about casual games, then an Apple TV probably already has a fast enough GPU, but if you want Cyberpunk 2077 or other AAA games on the Mac or anywhere else, Apple is going to have to step up their game. I don't believe TBDR will magically make them competitive with what AAA games need, but I'd be happy if graphics didn't suck on the Mac and all the GPU and gaming work switched to the Mac so I could get rid of my PC. I don't expect it to happen, though.
e
If Apple promotes some unique-to-them technology for graphics, it won't help at all. The big game companies need uniformity of hardware, and any time there is a weak spot in some area, their AAA titles get built around those weak spots. In effect, the least powerful platform sets the expectations and technologies used for a project. Sure, you can turn off fancy shadows in some games to help keep the frame rates up, but it is pure torture for game companies when hardware firms come out with some unique-to-them feature set that is only useful in rare conditions. The fact that Xbox and PlayStation are so close in hardware technology makes their lives much easier.

Another example is the instruction sets of the various Intel processors. There is such a random assortment of available instructions that most programs don't even bother using the fancy new ones, because it is less code overall just to go with software implementations of those functions, since you're going to need to test the software version anyway. Patching code and maintaining alternate versions just makes more work for the software vendor for perhaps a thousandth of a percent of overall improvement in the product. It just isn't worth the effort.

Apple is a control freak, and would make the raw metal for the cases if they could. They have no great staff of semiconductor physicists, so they will remain dependent on their foundry (TSMC), and of course on ARM to continue improving the core designs. Given that their mobile devices outsell their desktop machines by somewhere between 50 and 100:1 in unit volume, using the same chips for the desktop is a logical cost-saving measure. One article claimed it is going to cost Intel 6% of its total income; a huge blow for Intel. Intel has seen this coming for years and has been furiously working on lowering the power consumption of its chips to match ARM, with great progress. I expect the coming battle between ARM and Intel to be one of the most titanic business struggles in history. Great for consumers: the result of this battle will probably be a 10x improvement in price × performance × power.
👍 1
s
> correction: it is not an alternative to overdraw, it just makes overdraw cheap
@Wouter Would you mind elaborating? Do you mean Apple’s TBDR approach “makes overdraw cheap”?
i
> Note: The Apple Silicon GPU will return false when queried for `isLowPower` because its performance is at least as good as the discrete GPU performance in current Macs, while offering a much better performance/watt ratio.
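For context, a minimal Swift sketch of the query that note refers to (macOS only, since `MTLCopyAllDevices()` and `isLowPower` aren't available on iOS; an illustration, not Apple's sample code):

```swift
import Metal

// List every Metal device and its isLowPower flag (macOS-only APIs).
for device in MTLCopyAllDevices() {
    // On Apple Silicon this is expected to report false, per the note above.
    print(device.name, "isLowPower:", device.isLowPower)
}
```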
It's probably safe to assume Apple Silicon will feature much more powerful GPUs than iPhone/iPad, but where have they given any indication of how much more powerful?
w
@Edward de Jong / Beads Project this (tile rendering) is not "unique to them"; many mobile chips have been using these techniques for ages, and most big rendering engines are well aware of them. If anything, it's new to Apple laptops; Metal is a bigger stumbling block.
👍 1
@Stefan yes, if you have N triangles overlapping you're still drawing N triangles there, but having that resolved entirely within a tile makes it cheap and allows the pixel shader to run just once. A desktop chip, if it's unlucky, not only has to touch memory N times but, if the engine accidentally renders them back to front, also has to run the pixel shader N times.
s
@Wouter My understanding is that the architecture encourages you to avoid overdraw to get the most out of it: use hidden surface removal, draw everything in the right order, and use acceleration structures. Isn't that different from other architectures, where overdrawing is basically the efficient approach and you throw more GPU cores at it? We had another thread here where we were discussing the “software rendering” approach of the new Unreal 5 engine, which hinted at a shift in the direction of compute vs. classic pipelines. It looks to me like TBDR, and this architecture in particular, are built for compute-heavy approaches, whereas classic GPU architectures are still closer to the earlier fixed-function pipelines. I'm not an expert on this, so I'm really interested in understanding this better; please correct me if I'm wrong!
d
For me, the importance of Apple Silicon is that they won't port OpenGL to Apple Silicon, so it will be dropped. Although I'd like my project to run on MacOS, I don't have the resources to maintain a separate Metal port. For me, the main choices are between targeting Vulkan (Linux, Windows, Android, and emulated on MacOS using MoltenVK) and targeting WebGPU (which also runs in a web browser). Either way, the latest features that are exclusive to Metal won't be available to me. Since I'm not writing an AAA game, I will try WebGPU and hope that the limitations will be acceptable.
I don't see evidence that Apple has ever been particularly interested in supporting AAA games on MacOS. Especially at this point, with Metal, I think they primarily want people to develop mobile-class games exclusively for iOS (and secondarily MacOS). Big game engines have the resources to support MacOS and Metal as a secondary platform, but the games developed using those game engines will not be primarily developed, tested or played on the Mac or iOS. IMO.
As a coda, for me personally, the future of coding has never been "Apple only". I'd rather drop Apple as a platform than develop exclusively for it.
w
@Stefan but that hidden surface removal is done by the hardware; the software that renders to it is still largely the same.
Take a simple example: render 2 triangles that have some amount of overlap.
Classical hardware, back to front: each triangle gets rasterized (its pixels are determined), shaded, and written to the frame buffer. This is maximally expensive.
Classical hardware, front to back: the second triangle will have some pixels for which the depth buffer says there is a closer pixel, so the shader won't run for them; still, it has to touch memory to figure this out.
Tile-based renderer, any order: it can track, for each pixel in the tile, exactly which triangle touches it, so it runs the shader exactly once per pixel, and tracking those pixels happens entirely in on-chip memory.
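To make that bookkeeping concrete, here is a toy Swift sketch of the idea (made-up types, depths, and coverage tests, nothing like the real hardware data structures): each pixel of a tile keeps only its closest fragment, and shading happens once per pixel at resolve time, regardless of submission order.

```swift
// Toy model of the two-triangle example: one 4×4 tile, two overlapping
// "triangles" given as a depth plus a pixel-coverage test.
struct Fragment {
    let triangle: Int
    let depth: Float
}

let tileWidth = 4
let tileHeight = 4

// For every pixel in the tile, remember only the closest fragment seen so far
// (this is the bookkeeping that stays in on-chip memory on a TBDR).
var winners = [Fragment?](repeating: nil, count: tileWidth * tileHeight)

func rasterize(triangle: Int, depth: Float, covers: (Int, Int) -> Bool) {
    for y in 0..<tileHeight {
        for x in 0..<tileWidth where covers(x, y) {
            let index = y * tileWidth + x
            // No shading happens here; we only track the front-most triangle.
            if winners[index] == nil || depth < winners[index]!.depth {
                winners[index] = Fragment(triangle: triangle, depth: depth)
            }
        }
    }
}

// Submit two overlapping triangles in an arbitrary order.
rasterize(triangle: 0, depth: 0.8) { x, y in x + y < 5 }        // larger, farther away
rasterize(triangle: 1, depth: 0.2) { x, y in x >= 1 && y >= 1 } // smaller, closer

// Resolve: the pixel shader runs at most once per covered pixel,
// no matter how many triangles overlapped it.
var shaderInvocations = 0
for fragment in winners where fragment != nil {
    shaderInvocations += 1   // shade(fragment!) would happen here
}
print("pixels shaded:", shaderInvocations)
```

However the triangles are submitted, the shader count never exceeds the number of covered pixels, which is the point being made above.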
g
I don't personally think game companies worry so much about APIs. They are all effectively the same: they all have a way to write shaders, a way to provide streamed data to those shaders (typically vertex data), a way to read from sampled arrays (textures), and a way to write to arrays (textures). The rest is mostly trivial differences in APIs. The major engines (Unreal/Unity) already generate their shaders per platform, so it's relatively trivial to output a different shader language. The rest isn't all that much work. There are perf issues (for a large number of draw calls, OpenGL is much slower than Vulkan/Metal/DirectX 12), but most engines already optimized for draw calls because it was important in the older APIs (OpenGL/DirectX 9), so when they upgraded to Vulkan/Metal/DirectX 12 they didn't see much of an improvement (there are plenty of GDC presentations on this). Sure, a company making their own engine might complain about the extra work of supporting another API, but in the grand scheme of a game it's a tiny amount of work.

As for tile-based renderers, they have pluses and minuses. The plus is that they help with overdraw for opaque pixels. The minus is that they often hit a perf cliff for translucent pixels: they gambled on a tradeoff, accepting slower per-pixel operations in exchange for optimizing away redundant ones, and when it becomes impossible to optimize those operations away (as with particle effects, smoke, glows, transparent windows), that tradeoff suddenly fails. There are also plenty of cases where they run out of memory for their internal structures and then have to rasterize what they have so far, and you take a big perf hit.
👍 1
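On the translucency point, a tiny Swift sketch of why that tradeoff fails (made-up numbers; the only point is that blended fragments can't be culled, so the shader runs once per layer even on a TBDR):

```swift
// With alpha blending, ordering matters and no fragment can be discarded
// early, so the "shade once per pixel" win from the opaque case disappears.
struct BlendedFragment {
    let depth: Float
    let alpha: Float
}

// Three translucent layers covering the same pixel (smoke, glass, a glow, ...).
let layers = [
    BlendedFragment(depth: 0.9, alpha: 0.5),
    BlendedFragment(depth: 0.5, alpha: 0.3),
    BlendedFragment(depth: 0.2, alpha: 0.4),
]

var color: Float = 0.0          // stand-in for an RGBA value
var blendShaderInvocations = 0

// Blend back to front: every layer must be shaded and composited in order.
for fragment in layers.sorted(by: { $0.depth > $1.depth }) {
    blendShaderInvocations += 1                                  // the shader runs for every layer
    color = color * (1 - fragment.alpha) + fragment.alpha * 1.0  // "over" blend against a white layer
}
print("shader ran \(blendShaderInvocations) times for a single pixel")
```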
d
Ugh, all of this still requires low-level scheduling of memory, tiles, and such. There's gotta be some kind of HotSpot-style implementation for GPU programming that doesn't involve SIMD-level coding and low-level optimizations for memory bandwidth vs. other concerns. Apple doesn't seem to be going that way, but is instead just optimizing for their transparency + blur with rounded corners style (cough windows)...
I suppose that making the right thing the easy thing is an achievement for them.