Edward de Jong / Beads Project
05/21/2019, 6:36 AM
yairchu
05/21/2019, 2:26 PM
Scott Anderson
05/21/2019, 6:23 PM
Scott Anderson
05/21/2019, 6:30 PM
Scott Anderson
05/21/2019, 6:35 PM
Scott Anderson
05/21/2019, 6:36 PM
Scott Anderson
05/21/2019, 7:04 PM
Edward de Jong / Beads Project
05/21/2019, 8:43 PM
Scott Anderson
05/22/2019, 9:00 AM
Edward de Jong / Beads Project
05/22/2019, 11:17 PM
Kartik Agaram
05/23/2019, 8:16 AM
I did a comparison once where I turned off the compiler flags for instructions past the Pentium... and it didn't affect the performance in either storage or speed more than 1%.
Over what programs in what domain? Could other domains have different needs?
Edward de Jong / Beads Project
05/24/2019, 6:58 AM
Kartik Agaram
Scott Anderson
05/24/2019, 5:43 PM
hem. You have to admit that VGF2P8AFFINEINVQB as an opcode is absurd, and how many programmers out of a million will ever use the Galois Field Affine Transformation Inverse instruction?
Yes, many of the higher-level, sometimes domain-specific instructions are wasteful; I agree with that. I think we're talking about different CPU instructions. I'm talking about explicit, efficient use of basic SIMD instructions like move, shuffle, add, multiply, etc. that effectively exist on all consumer devices. SSE has existed for 20 years. SSE, AltiVec, or NEON is on almost every consumer device you'd want to ship a game on, including the Raspberry Pi and other cheap SoCs. I assume you're only talking about the new AVX instructions because you mentioned fractional-market-share devices?
Turning on compiler flags in many compilers usually does not (and cannot!) take advantage of SIMD, so I'm not surprised you saw no difference. Sometimes auto-vectorization can be a win, but often it isn't, and some compilers don't support it (MSVC) or require you to write special code to make sure it is done in a performant way. Often this stuff is hidden in a math library (https://github.com/Microsoft/DirectXMath, https://glm.g-truc.net/0.9.9/index.html) or some place where it really matters, but I'd argue that even for 3D gameplay code there are advantages.
I'm talking about the domain of games specifically because that's what Jon is trying to address in Jai. I'm not anti-Jai btw, there are a ton of things I like about it, but SIMD and multithreading are fundamental to game performance. The reason I'm bringing this up in the context of language is that we've seen language features introduced that make writing vectorized, cache-friendly, and parallel code easier, and I believe there are ways to improve on the existing patterns that have been demonstrated.
If you're not familiar, please read up on ISPC (https://ispc.github.io/index.html) and Burst Compiler/HPC# (https://docs.unity3d.com/Packages/com.unity.burst@0.2/manual/index.html, https://lucasmeijer.com/posts/cpp_unity/). It's also completely possible that Jon Blow is thinking about all of this and just hasn't demonstrated it in Jai.
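For readers who haven't written this kind of code, here is a minimal sketch in C of what "explicit use of basic SIMD instructions" can look like. It is an illustration only, not code from anyone in this thread. The helper name `scale_add` and the `HAVE_SSE` macro are made up for the example. It uses the SSE load, multiply, add, and store intrinsics from `<xmmintrin.h>` when the compiler targets SSE, and falls back to a plain scalar loop otherwise (the same scalar loop also handles the leftover elements when `n` is not a multiple of 4).

```c
#include <stddef.h>

/* Use SSE when the target supports it (GCC/Clang define __SSE__;
 * MSVC defines _M_X64 or _M_IX86_FP). HAVE_SSE is a made-up macro. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: out[i] = a[i] * scale + b[i] for i in [0, n). */
void scale_add(float *out, const float *a, const float *b,
               float scale, size_t n)
{
    size_t i = 0;
#if HAVE_SSE
    /* Broadcast the scalar into all four lanes once, outside the loop. */
    const __m128 s = _mm_set1_ps(scale);
    /* Process four floats per iteration; unaligned loads and stores
     * mean the caller's arrays need no special alignment. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, s), vb));
    }
#endif
    /* Scalar tail, and the whole loop on targets without SSE. */
    for (; i < n; ++i)
        out[i] = a[i] * scale + b[i];
}
```

Calling `scale_add(out, a, b, 2.0f, 6)` on six-element arrays would run one four-wide SSE iteration and then two scalar iterations. Whether a given compiler would have vectorized the plain loop on its own depends on the compiler and its flags, which is the uncertainty the message above is pointing at.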
Edward de Jong / Beads Project
06/10/2019, 9:33 AM