
Thoughts on AI, life, and everything else in between

What's Next?

At the start of the year, there were a few things that I wanted to do. First, complete my course - that I’ve done. Even if I still need to do another semester, I’ll treat it as effectively done, and any future involvement will be much lower touch. Second, I wanted a change in job scope. That, I’ve also done. Oh, did I mention? I am now working in secops in my company. Security operations. I managed to successfully apply for a job switch, and I am grateful that it came through. It’s only been one week in the job, but so far, I am really liking it. Cybersecurity. This field, for now, is deep enough, and the level of craft here is really endless. It’s fascinating and also powerful. Fortunately, despite having only a layperson’s knowledge of the field, it’s one of those roles that lets you pick things up on the job and learn by doing. For that, I count myself truly lucky. ...

April 18, 2026 · 3 min · Lei

A Crazy Few Months

The last few months have been crazy. Driven by the need to get an A for both of my courses in order to graduate, my life assumed a certain relentless discipline. Work, home, school, studying in the evenings, studying over the weekends, hanging out with Shaune, exercising once in a while. It was just so long. A full 13 weeks of this schedule. Towards the end, I did feel extremely tired. I miss hanging out with my friends. I miss the feeling of just switching off my mind, turning off the need to constantly ask, “what’s next?”. I miss going on purposeless meanders, dwelling, and taking luxurious, indulgent breaks. To breathe slowly, and to soak up each moment of the day. ...

April 18, 2026 · 5 min · Lei

Sanity Checking the Results of My CNN Stock Price Predictor

For my assignment, I was training a 1D CNN model to predict stock prices. Whilst most of the models gave around 50% accuracy, there was one version of the model, trained with a learning rate of 1e-3, that was giving me 80% accuracy. I am a little skeptical. A model with 80% accuracy would be extremely profitable. Sounds a little too good to be true? ...
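
One cheap way to pressure-test a result like this is to stack it against a naive baseline. The sketch below is hypothetical (numpy, made-up labels), not the assignment’s actual checks:

```python
# Hypothetical sanity check: compare the model against a naive baseline
# that always predicts the majority class. If the baseline sits near 50%,
# an 80% model warrants a hunt for label leakage or evaluation bugs.
import numpy as np

def majority_baseline_accuracy(y_test: np.ndarray) -> float:
    """Accuracy of always predicting the most common class in y_test."""
    _, counts = np.unique(y_test, return_counts=True)
    return counts.max() / counts.sum()

# Made-up labels for illustration (1 = price up, 0 = price down).
y_test = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 1])
print(f"baseline accuracy: {majority_baseline_accuracy(y_test):.1%}")
```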

February 24, 2026 · 3 min · Lei

Observations Training 4 Models on Fashion-MNIST

I wanted to see the effects of training some simple models on this dataset, namely, a shallow 1-layer neural network, a deep 3-layer neural network, a variant of the deep neural network with wider dimensions, as well as a simple 1-layer CNN. The results of the models are as follows:

Model             Dimensions   Accuracy
Shallow NN        16           78.4%
Deep 3-layer NN   8-8-8        79.2%
Deep 3-layer NN   16-12-8      80.6%
CNN               8            82.6%

These correspond to models #1-4 on this page: https://models.minimumloss.xyz/ ...
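
For concreteness, here is a minimal PyTorch sketch of what models #1-4 might look like. The layer widths follow the Dimensions column; the activations, kernel size, and pooling are my assumptions, not necessarily what the linked models use:

```python
# Minimal PyTorch sketches of the four architectures in the table.
# Fashion-MNIST inputs are 1x28x28 grayscale images with 10 classes.
import torch.nn as nn

shallow_nn = nn.Sequential(            # #1: Shallow NN, width 16
    nn.Flatten(), nn.Linear(28 * 28, 16), nn.ReLU(), nn.Linear(16, 10))

deep_nn_8 = nn.Sequential(             # #2: Deep 3-layer NN, 8-8-8
    nn.Flatten(), nn.Linear(28 * 28, 8), nn.ReLU(),
    nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8), nn.ReLU(),
    nn.Linear(8, 10))

deep_nn_wide = nn.Sequential(          # #3: Deep 3-layer NN, 16-12-8
    nn.Flatten(), nn.Linear(28 * 28, 16), nn.ReLU(),
    nn.Linear(16, 12), nn.ReLU(), nn.Linear(12, 8), nn.ReLU(),
    nn.Linear(8, 10))

cnn = nn.Sequential(                   # #4: 1-layer CNN, 8 channels
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                   # 28x28 -> 14x14 after pooling
    nn.Flatten(), nn.Linear(8 * 14 * 14, 10))
```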

February 22, 2026 · 3 min · Lei

Regularization

Regularization is the process of ‘smoothing out’ your loss function, such that it is less responsive to the peculiarities of a specific training dataset and therefore does better at interpolating to unseen data. It attempts to resolve the problem of fitting well on training data but doing badly on test data. This process involves adding a new term to the loss function. In machine learning, the term regularization is also generally used to refer to any strategy that helps improve generalization. ...
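
As a concrete instance of that added term, the most common choice is the L2 penalty (written in generic notation I’m supplying, not quoting from the post): for parameters $\phi$ and original loss $L[\phi]$, the regularized loss is

$$\tilde{L}[\phi] = L[\phi] + \lambda \sum_j \phi_j^2,$$

where a larger $\lambda$ pulls the parameters toward zero and smooths the fitted function.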

February 9, 2026 · 5 min · Lei

The Constituents of Errors

When we’re training our models, we’re minimizing loss. Yet, not all losses are created equal. Broadly speaking, there are three sources of error:

Noise

Noise is the inherent randomness of the test data. Assuming that you have a model that perfectly fits the true underlying function, the test data you draw will still lie within the standard deviation of the true data. This is error that cannot be gotten rid of. ...
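
For reference, the standard decomposition behind this framing (my reconstruction, in generic symbols): the expected squared test error of a fitted model $\hat{f}$ at a point $x$ splits as

$$\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\sigma^2}_{\text{noise}} + \text{Bias}\big[\hat{f}(x)\big]^2 + \text{Var}\big[\hat{f}(x)\big],$$

and the $\sigma^2$ term is exactly the irreducible noise described above.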

February 8, 2026 · 8 min · Lei

Parameter Initialization

Typically, when initializing parameters ($\beta$, $\Omega$) for a network, we choose values from a normal distribution with mean 0 and variance $\sigma^2$. Now, the most important factor in preserving the stability of the network as we move through different layers (or functional transformations) is the variance of the initialization. This affects the magnitude of your preactivations f and activations h in the forward pass, as well as your gradients in the backward pass. This almost solely determines whether you’ll suffer from vanishing or exploding gradient problems. ...
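
To make the variance story tangible, here is a small numpy experiment of my own (the depth, width, and $\sigma$ values are arbitrary choices) that pushes one input through a stack of ReLU layers and tracks the scale of the hidden activations h:

```python
# Push a random input through `depth` fully connected ReLU layers and
# watch how the scale of the activations depends on the weight variance.
import numpy as np

def forward_scale(sigma: float, depth: int = 50, width: int = 100) -> float:
    """Mean |h| after `depth` ReLU layers with N(0, sigma^2) weights."""
    rng = np.random.default_rng(0)
    h = rng.standard_normal(width)
    for _ in range(depth):
        omega = rng.normal(0.0, sigma, size=(width, width))  # one layer's Omega
        h = np.maximum(0.0, omega @ h)                       # preactivation f -> ReLU -> h
    return float(np.abs(h).mean())

# sqrt(2 / width) is He initialization, which keeps the scale roughly stable;
# smaller sigmas make h vanish with depth, larger ones make it explode.
for sigma in (0.05, np.sqrt(2 / 100), 0.5):
    print(f"sigma={sigma:.3f}  mean |h| after 50 layers: {forward_scale(sigma):.3e}")
```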

February 8, 2026 · 4 min · Lei

On the Math of Back Propagation

It’s incredible how entire models can be represented as a series of equations, and how the entire process of back propagation (the spirit matter of deep learning) can also be represented as a series of equations. I’ll type this out fully when I have the time, but here’s what most neural networks are, abstracted:

[Image: The math of back propagation]

On the top left (1), you see a typical 3-layer neural network represented by a series of preactivation functions f and hidden unit functions h. That’s all there is to it! It’s a very compact representation of a complex series of functional compositions. ...
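
Until the full write-up, here is a compressed reconstruction in the same f/h notation (the subscript conventions are mine): with preactivations $f_k = \beta_k + \Omega_k h_k$ and hidden units $h_k = a[f_{k-1}]$, the backward pass is the chain rule run in reverse:

$$\frac{\partial \ell}{\partial f_{k-1}} = a'[f_{k-1}] \odot \Big( \Omega_k^{\top} \frac{\partial \ell}{\partial f_k} \Big), \qquad \frac{\partial \ell}{\partial \beta_k} = \frac{\partial \ell}{\partial f_k}, \qquad \frac{\partial \ell}{\partial \Omega_k} = \frac{\partial \ell}{\partial f_k}\, h_k^{\top}.$$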

February 8, 2026 · 4 min · Lei

On Stochastic Gradient Descent, Momentum, Adam

This is an overdue recap of the lecture I had 2 weeks ago, where we went through stochastic gradient descent, momentum, and Adam. I’ve learned the benefits of SGD before, so I won’t go that deeply into it. But I think what’s good about the lecture is how it anchors the ideas in the mathematics.

Stochastic gradient descent

To avoid being trapped in a local minimum, at each iteration the SGD algorithm chooses a random subset of training data, known as a minibatch, and computes the gradients from these examples alone. The parameter update will then consider only that batch. ...
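
In symbols (my notation, not necessarily the lecture’s): with learning rate $\alpha$ and minibatch $\mathcal{B}_t$, the SGD update is

$$\phi_{t+1} = \phi_t - \alpha \sum_{i \in \mathcal{B}_t} \frac{\partial \ell_i[\phi_t]}{\partial \phi},$$

and momentum replaces the raw minibatch gradient with a moving average of recent gradients:

$$m_{t+1} = \beta\, m_t + (1-\beta) \sum_{i \in \mathcal{B}_t} \frac{\partial \ell_i[\phi_t]}{\partial \phi}, \qquad \phi_{t+1} = \phi_t - \alpha\, m_{t+1}.$$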

February 6, 2026 · 7 min · Lei

Loss Functions

The second class of FT5011 took us through loss functions. To be honest, my knowledge of loss functions is a little wonky. Like, I know the general idea, but beyond the simplistic model of “loss as sum of squared errors”, I really don’t know that much. After being completely lost in the class, and subsequently reviewing the chapter on loss functions, I am happy to say that my mental model of loss functions has been successfully updated. ...
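
For reference, the simplistic model itself, written out (standard notation, assumed rather than quoted from the post): for training pairs $(x_i, y_i)$ and model output $f[x_i, \phi]$,

$$L[\phi] = \sum_i \big(y_i - f[x_i, \phi]\big)^2,$$

and the usual richer view is that losses like this arise as negative log-likelihoods under an assumed noise model (Gaussian noise gives exactly least squares).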

January 25, 2026 · 14 min · Lei