#thinking-together

Nick Smith

09/04/2021, 8:57 AM

Shubhadeep Roychowdhury

09/04/2021, 10:18 AM

“And relatedly: operations on tensors are typically massively-parallelizable, thus could be a good foundation for a high-performance programming language that compiles to AI hardware.” You hooked me there.

Luke Persola

09/04/2021, 5:59 PM

So we already store tensors in RAM and perform various operations on them. You said software not hardware, so is the difference here that the abstraction between the 1D (flattened) data and its higher dimensional form is provided at a lower level in the software?
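
For readers who want the "flattening" in the question made concrete, here is a minimal sketch of how a dense N-dimensional index is conventionally mapped onto a flat 1D buffer using row-major strides. The function names and the example shape are illustrative assumptions, not something taken from the thread or from any particular hardware.

```python
from math import prod

def row_major_strides(shape):
    """Strides (in elements) for a dense row-major layout of `shape`."""
    strides = []
    step = 1
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))

def flat_index(index, shape):
    """Map a multi-dimensional index to an offset in the flat 1D buffer."""
    strides = row_major_strides(shape)
    return sum(i * s for i, s in zip(index, strides))

shape = (2, 3, 4)
buffer = list(range(prod(shape)))        # stands in for linear RAM
offset = flat_index((1, 2, 3), shape)    # 1*12 + 2*4 + 3*1
element = buffer[offset]
```

In today's software stacks this arithmetic lives in a library or compiler; the question above is whether it could move below the level the programmer ever sees.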

Nick Smith

09/04/2021, 10:42 PM

I mean the *assembly language* of the hardware should be phrased in terms of operations on tensors. 🙂 The programmer should not be concerned with whether the tensor is ultimately flattened into a linear array of SRAM or DRAM cells. (In AI hardware, they definitely won’t be.)

10:45 PM

My goal with this post is just to get people thinking a little differently about the memory model upon which a programming language is built. Tensor-based memory models are about to become mainstream (next 5 years) thanks to the AI boom. Could lead to some exciting new paradigms of programming.

10:51 PM

Here’s a challenge for everyone: when you visualise the act of “allocating memory”, what do you see? If you see a big linear chunk that you can address with a pointer, then maybe you’re trapped in 1-dimensional thinking. I certainly was/am.

Denny Vrandečić

09/05/2021, 1:48 AM

How's your memory mostly zeros? If it is, you should use smaller machines. I think the idea that RAM is similar to a sparse vector is often not right. At least not if you use Chrome for browsing.

Nick Smith

09/05/2021, 1:53 AM

I’m referring to virtual memory (i.e. what apps see). I guarantee you that your 64-bit virtual address space is mostly zeroes! And with a paging system, you can write across vast swathes of the memory whilst only consuming physical resources for the pages you actually touch. That’s what I mean when I describe linear memory as a sparse vector.
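
A toy model may help picture "linear memory as a sparse vector". The sketch below assumes a 4096-byte page size and a made-up `PagedMemory` class; it only imitates the idea of demand paging and is not how any real operating system implements it.

```python
PAGE_SIZE = 4096  # bytes per page; a common size, assumed here

class PagedMemory:
    """Toy zero-filled address space that allocates pages on first write."""

    def __init__(self):
        self.pages = {}  # page number -> bytearray

    def write(self, address, value):
        page, offset = divmod(address, PAGE_SIZE)
        if page not in self.pages:
            self.pages[page] = bytearray(PAGE_SIZE)
        self.pages[page][offset] = value

    def read(self, address):
        page, offset = divmod(address, PAGE_SIZE)
        if page not in self.pages:
            return 0  # untouched memory reads as zero
        return self.pages[page][offset]

    def resident_bytes(self):
        """Physical storage actually consumed."""
        return len(self.pages) * PAGE_SIZE

mem = PagedMemory()
mem.write(0x10, 7)
mem.write(2**40, 9)  # a terabyte further along, yet only one more page
```

Two writes a terabyte apart leave just two pages resident, which is the sense in which the address space behaves like a sparse vector at page granularity.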

2:00 AM

Now imagine what it would be like to have a memory model where your memory is sparse at the *byte*-level, and is multidimensional, and you can have multiple memories and perform massively-parallel operations (such as aggregations) over them. This is what sparse tensors are.
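
As a rough software picture of that memory model, a sparse tensor can be sketched as a map from coordinates to values, with an aggregation that collapses one axis. The dictionary layout, the sample data and the `sum_over_axis` helper are assumptions made for illustration. The Python loop is sequential, whereas the hardware discussed here would presumably run the per-cell sums in parallel.

```python
from collections import defaultdict

# A sparse 3D tensor in coordinate form: only nonzero cells are stored,
# so the (huge) nominal extent of each axis costs nothing.
tensor = {
    (0, 5, 2): 3,
    (0, 9_000, 2): 4,
    (7, 5, 1_000_000): 10,
}

def sum_over_axis(sparse, axis):
    """Aggregate (sum) along one axis, returning a sparse tensor of lower rank."""
    result = defaultdict(int)
    for coords, value in sparse.items():
        reduced = coords[:axis] + coords[axis + 1:]
        result[reduced] += value
    return dict(result)

totals = sum_over_axis(tensor, axis=1)
```

Each output cell depends only on the input cells that share its remaining coordinates, so the sums are independent of one another.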

Konrad Hinsen

09/05/2021, 8:04 AM

N-dimensional arrays as a fundamental data representation? That's an idea that has been around since the days of Fortran and APL. The 1960s. Efficient parallelization has been investigated as well, with today's Fortran containing very good support, though it's less automatic / miraculous than people tend to expect.

8:05 AM

BTW, I avoid calling N-dimensional arrays tensors because a tensor for me is an algebraic and geometric object, not a data structure: https://en.wikipedia.org/wiki/Tensor

Nick Smith

09/05/2021, 8:47 AM

Does Fortran handle large and high-dimensional (10000x10000x10000x...) **sparse** tensors, though? That's the main enabler of a lot of interesting applications. Tenstorrent handles sparse tensors completely in hardware; as a programmer you work with them as if they were dense. For context: if you multiply a pair of 99% sparse tensors using a dense multiplication algorithm, you’re doing roughly 10000x more work than you need to (only about 0.01 × 0.01 of the elementwise products have two nonzero operands; the rest are multiplications by 0). In general, the asymptotic complexity is different.
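
The 10000x figure can be checked on a small case. The sketch below multiplies two 99% sparse 100x100 matrices stored as coordinate dictionaries and counts the multiplications performed, against the `n ** 3` a dense triple loop would do. The matrices use a hypothetical fixed pattern (one nonzero per row) chosen so the count is exact. For randomly placed nonzeros the ratio would only hold in expectation.

```python
def sparse_matmul(a, b):
    """Multiply sparse matrices stored as {(row, col): value}; count multiplications."""
    # Index b by row so each nonzero of a only meets matching nonzeros of b.
    b_rows = {}
    for (k, j), value in b.items():
        b_rows.setdefault(k, []).append((j, value))

    product, multiplications = {}, 0
    for (i, k), a_value in a.items():
        for j, b_value in b_rows.get(k, []):
            product[(i, j)] = product.get((i, j), 0) + a_value * b_value
            multiplications += 1
    return product, multiplications

n = 100
# 100 nonzeros out of 10000 cells each: 1% dense, i.e. 99% sparse.
a = {(i, (3 * i) % n): 1 for i in range(n)}
b = {(k, (7 * k + 1) % n): 2 for k in range(n)}

product, sparse_work = sparse_matmul(a, b)
dense_work = n ** 3  # a dense triple loop multiplies every (i, k, j) triple
ratio = dense_work // sparse_work
```

Here the sparse routine performs 100 multiplications against 1,000,000 for the dense loop.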

8:50 AM

I'm aware of the more "mathematical" definition of tensor. But I believe the difference is just that the array representation is what you get once you've chosen a basis. You can also talk about tensors without reference to any basis.

Konrad Hinsen

09/05/2021, 6:48 PM

Fortran doesn't support sparse arrays as a language feature, but library support has been around for decades, getting better all the time.
As for the tensor, yes, once you pick a basis, you get an array representation. But the whole point of tensor algebra and tensor analysis is that the tensor has a meaning (and properties) *independent* of the choice of a basis.
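
The distinction can be illustrated with a linear map, which is a rank-2 tensor: its component array changes with the basis, while quantities such as the trace and determinant do not. The 2x2 matrices below are arbitrary illustrative values, with the change-of-basis matrix `P` chosen to have an integer inverse.

```python
def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[2, 1], [0, 3]]        # components of a linear map in the original basis
P = [[1, 1], [0, 1]]        # columns are the new basis vectors
P_inv = [[1, -1], [0, 1]]   # inverse of P

# Components of the same map in the new basis: P^-1 A P.
A_new = matmul(P_inv, matmul(A, P))
```

The two arrays `A` and `A_new` differ entry by entry, yet both have trace 5 and determinant 6: they are two representations of one object.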

Alex Chichigin

09/06/2021, 9:07 AM

Have you heard about https://chapel-lang.org/ ? It natively supports not only sparse but **distributed** N-dimensional arrays. Still take a look at the *problems* and the means to make it *efficient*.

9:09 AM

Besides, "tensor operations" are only good for *numerical* computations. I don't know how much numerical computation you develop, but I develop none. Haven't come across a single one in any of the web-dev projects I was involved in. 🤷♀️

Nick Smith

09/06/2021, 9:13 AM

Thanks, I’ll check out what Chapel does. But you’re not correct with the claim that tensor operations are only for numerical computation. I gave a very important example in my original post: contraction of Bool-valued tensors is precisely an equi-join between two database tables. I guarantee you’ve come across databases in your web dev projects 😇
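
Here is a small sketch of the 2D case, under assumed table contents. Each two-column table becomes a Boolean matrix with one axis per column, and contracting over the shared customer axis (with OR in place of sum and AND in place of product) yields the pairs an equi-join on that column would produce, with the join column projected away. All table, variable and helper names are hypothetical.

```python
orders = [("o1", "alice"), ("o2", "bob"), ("o3", "alice")]    # (order, customer)
regions = [("alice", "EU"), ("bob", "US"), ("carol", "EU")]   # (customer, region)

def index_of(values):
    """Assign each distinct value a position along a tensor axis."""
    return {v: i for i, v in enumerate(sorted(set(values)))}

order_ix = index_of(o for o, _ in orders)
cust_ix = index_of([c for _, c in orders] + [c for c, _ in regions])
region_ix = index_of(r for _, r in regions)

def to_bool_matrix(rows, row_ix, col_ix):
    m = [[False] * len(col_ix) for _ in row_ix]
    for a, b in rows:
        m[row_ix[a]][col_ix[b]] = True
    return m

R = to_bool_matrix(orders, order_ix, cust_ix)     # R[order][customer]
S = to_bool_matrix(regions, cust_ix, region_ix)   # S[customer][region]

def contract(r, s):
    """Boolean contraction over the shared index: OR plays sum, AND plays product."""
    return [
        [any(r[i][k] and s[k][j] for k in range(len(s))) for j in range(len(s[0]))]
        for i in range(len(r))
    ]

T = contract(R, S)                                # T[order][region]
joined = sorted(
    (o, g)
    for o, i in order_ix.items()
    for g, j in region_ix.items()
    if T[i][j]
)
```

`joined` matches what `SELECT order, region FROM orders JOIN regions USING (customer)` would return for these rows. Keeping the customer axis instead of reducing over it would give the full three-column join.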

9:14 AM

This isn’t merely a curiosity. Given hardware like Tenstorrent’s, we might now have the opportunity to radically rethink high-performance databases, including in-memory DBs (i.e. the memory model of arbitrary programs).

Alex Chichigin

09/06/2021, 10:15 AM

Yeah and my DBs were full of Strings, DateTimes and Foreign Keys -- good luck putting all of that into "tensors" and performing *parallel* operations on them! 😁

10:17 AM

In reality your parallelism stops as soon as you encounter a fold with *non-associative* operations. And pretty much all operations performing *side effects* are non-associative. That basically leaves you with numeric (plus boolean) computations.
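
The role of associativity can be seen by comparing a sequential left fold with a pairwise tree reduction, which is the shape a parallel machine would typically use. The `tree_reduce` helper and the sample values are illustrative assumptions.

```python
from functools import reduce
import operator

def tree_reduce(op, items):
    """Combine neighbours pairwise, level by level, as a parallel machine could."""
    items = list(items)
    while len(items) > 1:
        paired = [op(items[i], items[i + 1]) for i in range(0, len(items) - 1, 2)]
        if len(items) % 2:
            paired.append(items[-1])  # odd element carries over to the next level
        items = paired
    return items[0]

values = [8, 4, 2, 1]

# Associative operation: both evaluation orders agree.
fold_sum = reduce(operator.add, values)        # ((8 + 4) + 2) + 1
tree_sum = tree_reduce(operator.add, values)   # (8 + 4) + (2 + 1)

# Non-associative operation: regrouping changes the answer.
fold_diff = reduce(operator.sub, values)       # ((8 - 4) - 2) - 1
tree_diff = tree_reduce(operator.sub, values)  # (8 - 4) - (2 - 1)
```

With addition the two strategies agree; with subtraction they give 1 and 3 respectively, so the fold cannot be regrouped for parallel execution without changing its meaning.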

Nick Smith

09/06/2021, 10:20 AM

Mate, foreign keys aren’t a problem, they’re the very thing you equi-join on. Have a play around with the idea in the 2D case (Boolean matrices); you should be able to figure out how it works. Stuff like strings aren’t a problem either, they’re just a bunch of bytes along one dimension of the tensor.
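
One way to read "a bunch of bytes along one dimension" is sketched below: each string becomes a fixed-width row of bytes, and string equality becomes an elementwise comparison reduced with AND along the byte axis. The fixed width of 8 and the zero padding are assumptions made for the sketch. How real tensor hardware would lay out variable-length strings is not stated in the thread.

```python
WIDTH = 8  # hypothetical fixed width; shorter strings are zero-padded

def encode(strings, width=WIDTH):
    """Turn strings into a 2D array: one row per string, one byte per column."""
    rows = []
    for s in strings:
        raw = s.encode("utf-8")
        if len(raw) > width:
            raise ValueError(f"{s!r} does not fit in {width} bytes")
        rows.append(list(raw.ljust(width, b"\x00")))
    return rows

def rows_equal(a, b):
    """Elementwise byte comparison, reduced with AND along the byte dimension."""
    return all(x == y for x, y in zip(a, b))

names = encode(["alice", "bob", "alice"])

# Boolean matrix marking which rows hold the same string.
same = [[rows_equal(a, b) for b in names] for a in names]
```

The resulting Boolean matrix is the kind of operand the equi-join discussion relies on.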

Alex Chichigin

09/06/2021, 11:15 AM

I know how it works. I know how it *performs*. I know how GPUs and "tensor processors" are implemented and what they are capable of. Do you? 🙂

Nick Smith

09/06/2021, 12:17 PM

I came to share some exciting ideas with the community; I’m not interested in having a pissing contest. Clearly you’ve come to this thread with an ulterior motive. I think we can leave it there.

Alex Chichigin

09/06/2021, 12:24 PM

Vijay Chakravarthy

09/06/2021, 2:56 PM

Very interesting - I’ve been looking at XLA and JAX as a means of abstraction over such hardware. I also think a number of problems can be decomposed into “tensor friendly” representations - lots of interesting work going on in this area.

Konrad Hinsen

09/06/2021, 3:58 PM

Alex Chichigin

09/06/2021, 4:12 PM

Nick Smith

09/06/2021, 10:43 PM

On these new distributed memory tensor processors, you don't usually "point" at things (there is no global memory / address space), you usually *join* things (in the form of tensor contraction). These chips are specifically built to do insanely fast contractions. You won't be running Java code on these chips (the epitome of pointer-chasing), but that's fine, that's not the promise.
Operations on databases are mostly equi-joins and aggregations, **both** of which are basic operations in the NumPy/PyTorch API (which Tenstorrent is essentially using as the initial "instruction set" for their hardware). Fancier kinds of joins (i.e. on predicates) are less-obviously translated to tensor operations. It would be fun to explore how to re-implement them efficiently in terms of lower-level ops (cross-join + filter is always an option, but probably not the smartest one). A good ISA would have the right primitives available.
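
To make the two cases concrete, here is a plain-Python sketch with made-up tables: a group-by aggregation, and a predicate join written as the cross-join + filter fallback mentioned above. It shows why that fallback is probably not the smartest option, since every pair of rows is formed before any is discarded. It does not use or represent the NumPy, PyTorch or Tenstorrent APIs.

```python
from collections import defaultdict
from itertools import product

sales = [("EU", 10), ("US", 7), ("EU", 5)]        # (region, amount)
price_bands = [("low", 0, 6), ("high", 6, 100)]   # (band, min, max), max exclusive

# Aggregation: GROUP BY region, SUM(amount).
totals = defaultdict(int)
for region, amount in sales:
    totals[region] += amount
totals = dict(totals)

# Predicate join as cross-join + filter: form every pair, then test it.
banded = [
    (region, amount, band)
    for (region, amount), (band, lo, hi) in product(sales, price_bands)
    if lo <= amount < hi
]
pairs_examined = len(sales) * len(price_bands)
```

Six pairs are examined to produce three output rows. The cost of the cross-join grows with the product of the table sizes, which is what a smarter primitive would try to avoid.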

10:45 PM

Thanks **@Konrad Hinsen**. That article looks really interesting! I'll have a dive in.

Kartik Agaram

09/07/2021, 2:33 PM

This thread is definitely made for "thinking together" ❤️
