# Neural Nets from Scratch - 1/4

# The Goal⌗

The media is buzzing with discussions about ChatGPT, Stable Diffusion, and other artificial intelligences (AIs) that seemingly threaten to replace all our jobs. However, are these stories presenting a complete and accurate picture? To gain a better understanding of these emerging technologies and their implications, it is crucial to delve deeper. In my humble opinion, the most effective approach to unraveling these questions is to deconstruct and reconstruct some of these technologies ourselves. Thankfully, brilliant minds such as Grant Sanderson (3Blue1Brown) and Andrej Karpathy (Stanford University) have already paved the way for us. In this blog post, we will leverage the frameworks they have provided to rebuild these models from scratch. I highly recommend watching the videos in the upcoming section, as they will form the foundation of our journey today.

This series will consist of the following parts:

- Create a *Value* object that can store floats and has the basic arithmetic operators implemented
- Implement a graphing solution to visualize our arithmetic operations
- Modify our *Value* object with some bonus features (Gradient Descent, Topological Sorting, ReLU)
- Train and solve the MNIST handwriting database

Hopefully by the end of this series of posts, we will have a deep understanding of how these Neural Networks are designed, implemented, trained, and tested.

# Prerequisite Learning⌗

# The Initial Library⌗

Let’s start by creating a library of Values that have some special properties. Those properties are:

- Can store (wrap) a *float* value
- Can execute simple arithmetic operators (+, -, *, /, **)
- Can store the operands of the resulting *Value*

Our requirements will evolve over time as we continue on our journey.

## Storing Values and Arithmetic⌗

First, we need a *Value* object that can store a float. We will then build some simple arithmetic on top of it.

```
class Value:
    pass

class Value:
    def __init__(self, value = 0) -> Value:
        self.value = value
```

**Note:** The first time we define the *Value* class we just write *pass*. This is so that our type hinting in the second class definition will not error out when we output a *Value*. This is the lazy solution.
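As an aside, the same forward-reference problem can be solved without the throwaway definition by postponing annotation evaluation (available since Python 3.7); a minimal sketch:

```python
# `from __future__ import annotations` makes all annotations lazy, so the
# class body can name `Value` before the class object fully exists --
# no throwaway `class Value: pass` needed.
from __future__ import annotations

class Value:
    def __init__(self, value = 0) -> None:
        self.value = value

    def __add__(self, other) -> Value:  # `Value` resolves lazily here
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value + other.value)
```

Quoting the annotation as a string (`-> "Value"`) achieves the same effect on older Python versions.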

Now let's see what our *Value* object looks like:

```
from value import Value
a = Value(2.0)
print(a)
```

When we run this we get:

```
<value.Value object at 0x7f516c103550>
```

Which is the memory address and not a useful representation of our object.

This is where Python’s magic methods come into play. Magic methods let us define how our objects behave inside built-in expressions. Simply put, when we type:

`print(a)`

or `b = a + 3.0`

what the Python interpreter actually evaluates is:

`print(a.__repr__())`

and `b = a.__add__(3.0)`

Knowing this, we can now define the methods for addition, multiplication, and representation.

```
class Value:
    pass

class Value:
    def __init__(self, value = 0) -> Value:
        self.value = value

    def __add__(self, other) -> Value:
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value + other.value)

    def __mul__(self, other) -> Value:
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value * other.value)

    def __repr__(self) -> str:
        return "Value({})".format(self.value)
```

Which would now make a test like this work:

```
from value import Value
a = Value(2.0)
b = 3.0
c = a + b
d = a * b
print(c)
print(d)
```

Resulting in:

```
Value(5.0)
Value(6.0)
```

But what if we did the operations the other way?

```
a = Value(2.0)
b = 3.0
c = b + a
d = b * a
print(c)
print(d)
```

Results in:

```
Traceback (most recent call last):
File "/home/otis/github/llms/tests.py", line 6, in <module>
c = b + a
TypeError: unsupported operand type(s) for +: 'float' and 'Value'
```

Luckily, Python provides another set of magic methods, the *reflected* operators, whose names are prefixed with the character *r*.

```
def __radd__(self, other):
    return self + other

def __rmul__(self, other):
    return self * other
```

When `__add__()` is called on the *float* 3.0, it can’t resolve its logic for a parameter of type *Value*, so it returns *NotImplemented*. Python then calls the `__radd__()` method on the *Value* operand and passes in the *float* as the new parameter. What that looks like is:

```
a = Value(1.0)
b = 2.0
c = b + a
c = b.__add__(a)  # Fails, so Python falls back to the line below
c = a.__radd__(b)
```

Using this knowledge we can now define some more simple arithmetic for our *Value* class including division, powers (floats and ints only), and subtraction.

```
class Value:
    def __init__(self, value = 0) -> Value:
        self.value = value

    def __add__(self, other) -> Value:
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value + other.value)

    def __mul__(self, other) -> Value:
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value * other.value)

    def __pow__(self, other) -> Value:
        assert isinstance(other, (float, int))
        return Value(self.value ** other)

    def __truediv__(self, other) -> Value:
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value / other.value)

    def __sub__(self, other) -> Value:
        return self + (-other)

    def __neg__(self) -> Value:
        return self * -1

    def __repr__(self) -> str:
        return "Value({})".format(self.value)

    def __radd__(self, other) -> Value:
        return self + other

    def __rmul__(self, other) -> Value:
        return self * other

    def __rsub__(self, other) -> Value:
        return other + (-self)

    def __rtruediv__(self, other) -> Value:
        return Value(other) / self
```
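To see the reflected operators dispatch in action, here is a trimmed-down, self-contained copy of the class (only the methods involved, with `__neg__` inlined rather than routed through `__mul__`):

```python
class Value:
    def __init__(self, value = 0):
        self.value = value

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value + other.value)

    def __neg__(self):
        return Value(self.value * -1)  # inlined; full class uses self * -1

    def __truediv__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value / other.value)

    def __radd__(self, other):
        return self + other

    def __rsub__(self, other):
        return other + (-self)

    def __rtruediv__(self, other):
        return Value(other) / self

    def __repr__(self):
        return "Value({})".format(self.value)

a = Value(4.0)

# float.__sub__(1.0, a) fails, so Python falls back to a.__rsub__(1.0),
# which rewrites the expression as 1.0 + (-a) and lands in __radd__.
print(1.0 - a)  # Value(-3.0)

# float.__truediv__(6.0, a) fails, so a.__rtruediv__(6.0) takes over.
print(6.0 / a)  # Value(1.5)
```

Every reflected method simply rearranges the expression so the already-defined forward operators do the real work.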

## Storing the Children⌗

Now that we have a *Value* object that supports common operators like addition, subtraction, multiplication, and division, we want to make sure we can track the children of these operations. This will come in handy when we start graphing and would like to see the genealogy of each operation.

Let's start by creating a place to store our operands. Each operation has at most two operands, so we pass the children in as a *tuple*, then convert that *tuple* to a *set* so that membership checks during later graph traversals are fast.

```
class Value:
    def __init__(self, value = 0, op = "", children = ()) -> Value:
        self.value = value
        self.op = op
        self.children = set(children)
    ...
```

Let's see how to implement this with the addition operation. Modify the `__add__()` method to include the following:

```
def __add__(self, other) -> Value:
    other = other if isinstance(other, Value) else Value(other)
    out = Value(self.value + other.value, "+", (self, other))
    return out
```

We can now modify the rest of the methods in a similar way, resulting in this semi-final class (it will keep evolving in later parts):

```
class Value:
    pass

class Value:
    def __init__(self, value = 0, op = "", children = ()) -> Value:
        self.value = value
        self.op = op
        self.children = set(children)

    def __add__(self, other) -> Value:
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.value + other.value, "+", (self, other))
        return out

    def __mul__(self, other) -> Value:
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.value * other.value, "*", (self, other))
        return out

    def __pow__(self, other) -> Value:
        assert isinstance(other, (float, int))
        out = Value(self.value ** other, "exp {}".format(other), (self, ))
        return out

    def __truediv__(self, other) -> Value:
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.value / other.value, "/", (self, other))
        return out

    def __sub__(self, other) -> Value:
        return self + (-other)

    def __neg__(self) -> Value:
        return self * -1

    def __repr__(self) -> str:
        return "Value({})".format(self.value)

    def __radd__(self, other) -> Value:
        return self + other

    def __rmul__(self, other) -> Value:
        return self * other

    def __rsub__(self, other) -> Value:
        return other + (-self)

    def __rtruediv__(self, other) -> Value:
        return Value(other) / self
```
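A quick check that the genealogy is recorded as intended, using a condensed, self-contained copy of just the relevant methods:

```python
class Value:
    def __init__(self, value = 0, op = "", children = ()):
        self.value = value
        self.op = op
        self.children = set(children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value + other.value, "+", (self, other))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.value * other.value, "*", (self, other))

    def __repr__(self):
        return "Value({})".format(self.value)

a = Value(2.0)
b = Value(3.0)
c = a + b   # Value(5.0), op "+", children {a, b}
d = c * a   # Value(10.0), op "*", children {c, a}

print(d, d.op)                            # Value(10.0) *
print(a in d.children, c in d.children)   # True True
```

Each operation's result remembers both the operator that produced it and the operands it came from, which is exactly the graph structure we will draw in the next post.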

## Conclusion⌗

So far we have created a *Value* object that can store numbers and perform some arithmetic operations. It also stores its children, which are the operands of each operation. In the next post we will implement a graphing solution so we can better visualize these parent-child relationships.

We will also be modifying our *Value* object with gradient descent and ReLU in the next section, transforming this from a simple *numbers* library into something a neural network might be able to use.