How To Make a Tensor Class
EthanHorowitz (40)

Warning: This is very, very long, but I wanted to put everything together in one place.

A couple weeks ago, I thought about numpy arrays and wondered how they worked. Numpy arrays are basically tensors, which is what this whole post is about. I couldn't find any online resources explaining how to make a tensor class from scratch, so I tried to make my own in Python.

Understanding Tensors

The first step to making a tensor class is to understand what a tensor is. A tensor is essentially a collection of numbers arranged along any number of dimensions.
A one-dimensional tensor is a line of numbers, a two-dimensional tensor is a rectangular grid of numbers, and a three-dimensional tensor is a cube of numbers. A code-oriented way to think about it is a multi-dimensional array.
Here's a good visual:

Notice how, when a 4D cube can't be drawn, a list of 3D cubes is drawn. This is the same principle I used when making my tensor class: Simplify larger-dimensional tensors into lists of smaller-dimensional ones.
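
To make the "list of smaller tensors" idea concrete, here is a small sketch using plain nested Python lists. The variable names are only for illustration and aren't part of the tensor class.

```python
# A 3D tensor with dimensions [2, 2, 3], written as nested lists.
cube = [
    [[1, 2, 3], [4, 5, 6]],     # first 2D slice
    [[7, 8, 9], [10, 11, 12]],  # second 2D slice
]

# Each element of the outer list is a 2D tensor (a matrix)...
first_matrix = cube[0]
# ...and each element of a matrix is a 1D tensor (a vector).
first_row = first_matrix[0]

print(len(cube), len(first_matrix), len(first_row))  # prints "2 2 3"
```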

What We Can Do With Tensors

Most operations that can be done on a number can be done on a tensor. The ones that will be covered are negation and element-wise or scalar addition, subtraction, multiplication, and division. The tensor-specific operations that will be covered are transposition, matrix multiplication, direct sums, and the tensor product.

Setting Up The Tensor Class

The tensor class will take in an array, like numpy.array.
To start off, we need to process the array. To simplify tensors as much as possible, the tensor will be represented as a list of tensors or 1D arrays. This means that every tensor will always look like a 1-dimensional tensor, which makes writing code easier.
Here's how I wrote the function to process the incoming array:

def process(self, arr):
        newarr = [] # make a new array
        for i in arr:
            if isinstance(i, list): # turn nested arrays into new tensors and add them to the array
                newarr.append(Tensor(i))
            else:
                newarr.append(i) # numbers (and existing tensors) are added as they are
        return newarr

This turns all arrays into tensors of their own, which makes every tensor just a list, or vector, of either other tensors or numbers.

An important aspect of a tensor is its dimensions. For a numpy array, this would be arr.shape.
For a tensor, we can assume that all tensors in our array of tensors have the same shape. This makes getting the dimensions fairly easy. Just loop through the first elements of each contained tensor and find the length of that tensor’s array.

def getDims(self):
        dims = [] # array that will contain dimension numbers ([outer, inner])
        dims.append(len(self.__arr)) # get the length of the processed array
        if self.__arr and type(self.__arr[0]) == Tensor: # an empty tensor has nothing to recurse into
            dims += self.__arr[0].getDims() # add on the dimension of the contained tensor
        return dims

With those two functions defined, we can finally define the init function of our Tensor class:

class Tensor:
    def __init__(self, arr=None):
        if arr is None: arr = [] # avoid sharing one default list between tensors
        self.__arr = self.process(arr)
        self.dimensions = self.getDims()
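
Putting the three pieces together, a minimal runnable sketch might look like the following. It assumes both helper methods live inside the class, and it condenses process into a list comprehension for brevity, so treat it as an illustration rather than the exact code used in the rest of the post.

```python
class Tensor:
    def __init__(self, arr=None):
        self.__arr = self.process(arr if arr is not None else [])
        self.dimensions = self.getDims()

    def process(self, arr):
        # wrap nested lists in Tensors; leave numbers (and existing Tensors) alone
        return [Tensor(i) if isinstance(i, list) else i for i in arr]

    def getDims(self):
        dims = [len(self.__arr)]
        # only recurse when there is a contained tensor to look at
        if self.__arr and isinstance(self.__arr[0], Tensor):
            dims += self.__arr[0].getDims()
        return dims


t = Tensor([[1, 2, 3], [4, 5, 6]])
print(t.dimensions)  # prints [2, 3]
```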

Accessing The Tensor

It's important to be able to access the numbers in a tensor, and it can be useful to turn a tensor back into an array.
Accessing parts of the tensor is easy: reference parts of the already-defined representation of the tensor.

def __getitem__(self, ind):
        return self.__arr[ind]

To turn the tensor into an array, we need to do the opposite of the process function: turn all tensors back into arrays. To do this, we loop through all elements in the tensor and add a list if that element is a tensor or the number if the element is a number.

def asArray(self):
        arr = [] # the full array
        isVector = len(self.dimensions) == 1 # see if the tensor contains numbers or tensors
        for i in self.__arr: 
            if isVector: # append the number if it's a number
                arr.append(i) 
            else: # append the array made by the tensor (i)
                arr.append(i.asArray()) # add the array that that tensor represents
        return arr

If, however, we want to flatten the tensor into a vector, we can just loop through all the numbers and add them to a single list.

def asVector(self):
        vector = []

        isVector = len(self.dimensions) == 1
        for i in self.__arr:
            if isVector: vector.append(i)
            else: vector += i.asVector()

        return vector
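
The difference between the two conversions is easiest to see on plain nested lists. The helper below, flatten, is a hypothetical stand-in for asVector that works on lists rather than on the Tensor class.

```python
def flatten(nested):
    """Collect every number in a nested list, left to right."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat += flatten(item)  # recurse into inner lists
        else:
            flat.append(item)      # numbers are added directly
    return flat


nested = [[1, 2, 3], [4, 5, 6]]  # what asArray would give for a [2, 3] tensor
print(flatten(nested))           # prints [1, 2, 3, 4, 5, 6]
```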

Creating Default Tensors

Instead of defining your own array, it's nice to have the option to create a blank tensor. For convenience, this will be split into three parts: one function that computes the product of all numbers in an array, one part that packages a 1D array into a tensor with all the dimensions we want, and one part that creates a 1D array of numbers.

The product function is the easiest. It just multiplies all the elements in a list.

def product(arr):
    i = 1
    for x in arr: i*= x
    return i

Next comes the packaging function.

@staticmethod
def package(arr, dims):
        if not dims:  # no dimensions left, so a single number remains: return the number
            return arr[0]

        newarr = []

        index = 0
        p = product(dims[1:]) # get the size of the tensors contained in the tensor
        for i in range(dims[0]): # dims[0] = the length of the tensor
            newarr.append(Tensor.package(arr[index:index+p], dims[1:])) # package all the way down
            index += p # increment by the group size

        return Tensor(newarr)

This function isn't as self-explanatory.
It takes in a 1D array and splits it into parts of a calculated size.
First, the 1D array is split into groups. If the dimensions are [2, 3, 4], then the array passed in will be split into 2 arrays of length 3*4. These two arrays will become the two tensors in self.__arr. Recursion is then used to package the two new arrays. When a 0-dimensional tensor (one number) is reached, the number is returned since it can't be split any further.
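
As a worked example of the splitting step, here is a sketch that does the same grouping with plain nested lists instead of Tensor objects. The names package_list and size are hypothetical, chosen just for this illustration.

```python
from math import prod  # available in Python 3.8+


def package_list(flat, dims):
    """Reshape a flat list into nested lists with the given dimensions."""
    if not dims:           # no dimensions left: a single number
        return flat[0]
    size = prod(dims[1:])  # how many numbers each sub-tensor holds
    return [package_list(flat[i * size:(i + 1) * size], dims[1:])
            for i in range(dims[0])]


print(package_list(list(range(6)), [2, 3]))  # prints [[0, 1, 2], [3, 4, 5]]
```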

Lastly, the blank tensor. In this case, it'll be full of zeros. It creates a 1D array of zeros and packages it.

@staticmethod
def zeros(*dims):
        arr = [0] * product(dims)
        if len(dims) == 1: return Tensor(arr)
        return Tensor.package(arr, dims)

Transposing A Tensor

Transposing a tensor is the same as reordering its indices. Down becomes right and right becomes down. You might know this from matrices, where the transpose of a matrix means that the width and height (m and n) are swapped.

When thinking abstractly, though, things get a bit tougher. When there are more than just two dimensions, you can order things in more than two ways. The default transposition is just reversing the dimensions: a tensor with dimensions [2, 3, 4] will have dimensions [4, 3, 2] after being transposed. But, in case you don't want to reverse it, you can also choose the new ordering of dimensions. This is the same as numpy.transpose(arr, order).

def transpose(self, order=[]):
        # GROUP 1
        if len(self.dimensions) == 1: return Tensor(self.__arr)

        # GROUP 2
        if not order: 
            order = list(reversed(range(len(self.dimensions))))

        newDims = [self.dimensions[i] for i in order]

        # make an array (to be a tensor) with the dimensions of the new tensor, all spots are defaulted to 0
        arr = Tensor.zeros(*newDims).asArray()

        # GROUP 3
        # go through all possible paths in the tensor
        paths = [0]*len(self.dimensions)
        while paths[0] < self.dimensions[0]:
         
            # get references to the path, put the number in the tensor to its corresponding spot in the new tensor
            ref = self
            place = arr
            for i in range(len(paths) - 1):
                ref = ref[paths[i]]
                place = place[paths[order[i]]]
            
            place[paths[order[-1]]] = ref[paths[-1]]

            # GROUP 4
            # go to the next path (sequentially)
            paths[-1] += 1
            for i in range(len(paths)-1, 0, -1):
                if paths[i] >= self.dimensions[i]:
                    paths[i] = 0
                    paths[i-1] += 1
                else: 
                    break

        return Tensor(arr)

This is quite a long function, so I split it into four groups to better explain what's going on.
Group 1 means that, if the tensor is a vector, a copy of it is returned. This is because a vector can't be transposed: it only has one dimension, so there is nothing to reorder. As an example, the vector [2, 3, 4] can only map to [2, 3, 4].

Group 2 produces the blank tensor. Only the order of the dimensions is passed into the function, so we need to get the actual dimensions and make a tensor with them. The first statement, starting with if not order, builds a default order (reversed dimensions) if one wasn't specified. After that, a list is created with the actual dimensions in the order specified by the order array. Lastly, a blank tensor is created using the zeros function we defined earlier and converted to an array so its elements can be assigned.

Group 3 starts by defining the path of the element we're trying to access. That way, we can access the element in the current tensor and reorder its path to place it in its corresponding spot in the new tensor. Because we can't assume how many dimensions the tensor has, we need to create references (the variables ref and place) and walk down to each number, one path at a time, inside a while loop. The path is then looped through to update the reference to the original tensor (ref) and the reference to the new array (place). For the new array, the order array defines which spot in the path is used at each level, which is why place = place[paths[order[i]]] is used.

Group 4 updates the path. It increments the last number by 1, and then checks down the line to make sure there are no overflows.

The while loop runs until all possible combinations have been referenced. This is indicated by the first number in the path overflowing, that is, becoming equal to the size of the tensor's first dimension, so paths[0] is compared to self.dimensions[0].
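
For comparison, the same path-reordering idea can be sketched on plain nested lists, letting itertools.product enumerate the paths instead of the manual carry logic in Group 4. The helpers get, blank, and transpose_list are hypothetical names for this illustration, and the dimensions are passed in explicitly because plain lists don't store them.

```python
from itertools import product


def get(nested, path):
    """Follow a path of indices down into a nested list."""
    for index in path:
        nested = nested[index]
    return nested


def blank(shape):
    """Nested lists of zeros with the given shape."""
    if len(shape) == 1:
        return [0] * shape[0]
    return [blank(shape[1:]) for _ in range(shape[0])]


def transpose_list(nested, dims, order=None):
    if order is None:
        order = list(reversed(range(len(dims))))  # default: reverse the dimensions
    result = blank([dims[i] for i in order])

    # visit every path in the original and write the value to the reordered path
    for path in product(*(range(d) for d in dims)):
        new_path = [path[i] for i in order]
        target = get(result, new_path[:-1])
        target[new_path[-1]] = get(nested, path)
    return result


matrix = [[1, 2, 3], [4, 5, 6]]
print(transpose_list(matrix, [2, 3]))  # prints [[1, 4], [2, 5], [3, 6]]
```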

Addition, Subtraction, Division, Multiplication

These four operations are done element-wise. This means that, when adding another tensor, the first number of the first tensor will be added with the first number of the second tensor, the second number of the first tensor will be added with the second number of the second tensor, and so on. A scalar, or number, can also be added to a tensor, with the only difference being that the scalar will be added to all numbers in the tensor.
Here's how the __add__ function is written:

def __add__(self, other):
        newarr = [] # create a new tensor

        if type(other) != Tensor: # if the thing being added is a scalar, add it to all elements
            for i in self.__arr:
                newarr.append(i+other)
        
        else: # if a tensor is being added, add the corresponding elements of the two tensors
            for i in range(len(self.__arr)): 
                newarr.append(self.__arr[i] + other[i])
        
        return Tensor(newarr)

By overloading the + operator, an added scalar will be added to all numbers of a given tensor.
Subtraction, multiplication, and division are all done in the same way, but with the operator changed from + to - (__sub__), /(__truediv__), or *(__mul__).
However, subtracting a number is the same as adding a negative number. So, we can shorten the __sub__ method to just:

def __sub__(self, other):
        return self + (-other)
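
Since the four operators differ only in the operation applied, one way to avoid repeating the loop is a shared helper. The sketch below shows the idea on plain nested lists. The function name elementwise is hypothetical, and it assumes the two operands have the same shape whenever both are lists.

```python
import operator


def elementwise(a, b, op):
    """Apply op element-wise; b is a number or a nested list shaped like a."""
    if not isinstance(a, list):  # reached a pair of numbers
        return op(a, b)
    if isinstance(b, list):      # tensor with tensor: pair up the elements
        return [elementwise(x, y, op) for x, y in zip(a, b)]
    return [elementwise(x, b, op) for x in a]  # tensor with scalar


m = [[1, 2], [3, 4]]
print(elementwise(m, 10, operator.add))                # prints [[11, 12], [13, 14]]
print(elementwise(m, [[1, 1], [2, 2]], operator.mul))  # prints [[1, 2], [6, 8]]
```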

Negating A Tensor

In the subtraction function I just showed, I assumed that a tensor could be negated. To negate everything, we can just multiply all values by -1.

def __neg__(self):
        return self * -1

Direct Sum

This is pretty simple. In a direct sum, one tensor is just stacked onto another. So, the direct sum of [1, 2, 3] and [4, 5, 6] is [1, 2, 3, 4, 5, 6].

def directsum(self, other):
        if other.dimensions[1:] == self.dimensions[1:]: # everything but the outer length must match
            return Tensor(self.asArray() + other.asArray())

This just appends the elements of the second tensor to the first. If the inner dimensions don't match, the function will return None by default.
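
Here is a small sketch of the same stacking rule on plain nested lists, with the dimension check written out. The helper names shape and direct_sum are hypothetical, and the sketch assumes non-empty, rectangular lists.

```python
def shape(x):
    """Dimensions of a non-empty, rectangular nested list."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return dims


def direct_sum(a, b):
    """Stack b onto a; everything except the outer length must match."""
    if shape(a)[1:] != shape(b)[1:]:
        return None
    return a + b


print(direct_sum([1, 2, 3], [4, 5, 6]))          # prints [1, 2, 3, 4, 5, 6]
print(direct_sum([[1, 2]], [[3, 4], [5, 6]]))    # prints [[1, 2], [3, 4], [5, 6]]
print(direct_sum([[1, 2]], [[1, 2, 3]]))         # prints None
```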

Tensor Product

The tensor product multiplies everything in one tensor by everything in another. For the two vectors [1, 2] and [3, 4], the products are 1*3, 1*4, 2*3, and 2*4, which are packaged into the 2x2 tensor [[3, 4], [6, 8]].

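def tensorProduct(self, other):
        newarr = []
        for i in range(self.dimensions[0]):
            for j in range(other.dimensions[0]):
                if len(self.dimensions) == 1: # self[i] is a number, so scale other[j] by it
                    newarr.append(other[j] * self[i])
                elif len(other.dimensions) == 1: # other[j] is a number, so scale self[i] by it
                    newarr.append(self[i] * other[j])
                else: # both are tensors, so take their tensor product
                    newarr.append(self[i].tensorProduct(other[j]))

        return Tensor.package(newarr, [self.dimensions[0], other.dimensions[0]])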

This will multiply all parts of one tensor by all parts of the other.
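
For the simplest case, two vectors, the tensor product is the familiar outer product. The sketch below uses plain lists, and outer is a hypothetical helper name.

```python
def outer(u, v):
    """Tensor product of two vectors: every element of u times every element of v."""
    return [[a * b for b in v] for a in u]


print(outer([1, 2], [3, 4]))  # prints [[3, 4], [6, 8]]
```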

Matrix Multiplication

Last one! It also happens to be one of the most complex functions in this post.
Matrix multiplication works by multiplying the rows of one 2d tensor by the columns of another.

Multiplying two vectors will yield a single number, a matrix times a vector is a vector, and a matrix times a matrix is a matrix. For tensors, though, the arguments can have more than two dimensions. So, I followed the rules that numpy uses:

If both arguments are 2-D they are multiplied like conventional matrices.
If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.
If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.

Numpy uses broadcasting to multiply tensors with more than two dimensions. I won't get into that, but just know that those tensors are treated as a stack of matrices.
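
To see how the promotion rules affect the shape of the result, here is a sketch that works only with dimension lists and only covers 1-D and 2-D arguments (no broadcasting). The function name matmul_dims is hypothetical.

```python
def matmul_dims(d1, d2):
    """Result dimensions for 1-D and 2-D arguments (no broadcasting)."""
    a = [1] + d1 if len(d1) == 1 else d1  # prepend a 1 to a vector on the left
    b = d2 + [1] if len(d2) == 1 else d2  # append a 1 to a vector on the right
    if a[1] != b[0]:
        raise ValueError("inner dimensions must match")
    out = [a[0], b[1]]
    if len(d1) == 1:
        out = out[1:]   # remove the prepended 1
    if len(d2) == 1:
        out = out[:-1]  # remove the appended 1
    return out


print(matmul_dims([2, 3], [3, 4]))  # prints [2, 4]
print(matmul_dims([3], [3, 4]))     # prints [4]
print(matmul_dims([2, 3], [3]))     # prints [2]
print(matmul_dims([3], [3]))        # prints [] (a single number)
```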

def matmul(self, other):
 
        def firstVector(v, m):
            v = Tensor([v])
            return Tensor.matmul(v, m)[0]
 
        def secondVector(m, v):
            v = Tensor([[v[i]] for i in range(v.dimensions[0])])
            r = Tensor.matmul(m, v)
            return Tensor([r[i][0] for i in range(r.dimensions[0])])
 
        def matrices(m1, m2):
 
            newarr = []
 
            for _ in range(m1.dimensions[0]):
                newarr.append([0]*m2.dimensions[1])
                      
            for i in range(m2.dimensions[1]): # for all columns in the second matrix
                for j in range(m1.dimensions[0]): # for all rows in the first matrix
                    newarr[j][i] = sum([m2[k][i] * m1[j][k] for k in range(m1.dimensions[1])]) # for all numbers in the row of the first matrix
          
            return Tensor(newarr)
      
        def nd(nm1, nm2): # treat as a stack of matrices
            newarr = []
            if len(nm1.dimensions) > 2:
                for i in range(nm1.dimensions[0]):
                    newarr.append(nm1[i].matmul(nm2))
                return Tensor(newarr)
 
            elif len(nm2.dimensions) > 2:
                for i in range(nm2.dimensions[0]):
                    newarr.append(nm1.matmul(nm2[i]))
                return Tensor(newarr)
          
          
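        if len(self.dimensions) == 1 and len(other.dimensions) == 1: # two vectors give a single number
            return sum([self[i] * other[i] for i in range(self.dimensions[0])])
        elif len(self.dimensions) == 1 and len(other.dimensions) == 2: return firstVector(self, other)
        elif len(self.dimensions) == 2 and len(other.dimensions) == 1: return secondVector(self, other)
        elif len(self.dimensions) == 2 and len(other.dimensions) == 2: return matrices(self, other)
        else: return nd(self, other)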

Four rules require four functions. While nested functions are NOT the most efficient way to go, they made it simple to package everything into one function.
The first two functions are pretty straightforward. The main operation is matrix multiplication, so whenever one tensor is a vector, turn it into a matrix. If the first tensor is a vector, make it a row vector, and if the second tensor is a vector, make it a column vector.
The matrices() function does standard matrix multiplication, as referenced above. Go through each row in the first matrix and each column in the second and sum the products of corresponding points in each row and column to get each point.
The last function (nd, which stands for n-dimensional) takes in two tensors where at least one has more than 2 dimensions. If the first tensor has more than 2 dimensions, multiply each tensor inside it by the second tensor. If the second tensor has more than two dimensions, multiply the first tensor by each tensor inside the second tensor. This means that each tensor with more than 2 dimensions will be treated as a stack of matrices, being multiplied by a vector, a matrix, or another stack of matrices. Note that when both tensors have more than two dimensions, this code pairs every matrix in the first stack with every matrix in the second, which is not what numpy's broadcasting does.
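
As a worked example of the 2D case, here is the row-times-column rule on plain nested lists. The name matmul_2d is hypothetical, and the sketch assumes the inner dimensions already match.

```python
def matmul_2d(m1, m2):
    """Multiply two matrices stored as lists of rows."""
    rows, inner, cols = len(m1), len(m2), len(m2[0])
    return [[sum(m1[r][k] * m2[k][c] for k in range(inner))  # row r times column c
             for c in range(cols)]
            for r in range(rows)]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_2d(a, b))  # prints [[19, 22], [43, 50]]

# A vector on the left is promoted to a single-row matrix, then unwrapped again.
print(matmul_2d([[1, 2]], b)[0])  # prints [19, 22]
```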

And there we go! For everyone who read all that, you should earn a medal. That was a lot of words.

TLDR

We made a tensor class using recursion. The tensor class supports addition, subtraction, multiplication, division, negation, the direct sum, the tensor product, and matrix multiplication. The transpose of the tensor can be taken, and a tensor can be converted into a multidimensional array.

EthanHorowitz (40)

@CodeABC123 The 3D thing I mentioned in the beginning was just to show that tensors are lists of other tensors. So, a 4D tensor is a list of 3D tensors, 3D tensors are lists of 2D tensors, and so on. This means that tensors can always be stored as a single list of other, smaller tensors. I hope that clarifies it a little bit.