Deep Learning

Sanity Checking the Results of My CNN Stock Price Predictor

For my assignment, I was training a 1D CNN model to predict stock prices. Whilst most versions of the model gave about 50% accuracy, one version trained with a learning rate of 1e-3 was giving me 80% accuracy. I am a little skeptical: a model with 80% accuracy would be extremely profitable. Sounds a little too good to be true? ...

Observations Training 3 Models on Fashion-MNIST

I wanted to see the effects of training some simple models on this dataset: a shallow 1-layer neural network, a deep 3-layer neural network, a variant of the deep neural network with wider dimensions, and a simple 1-layer CNN. The results of the models are as follows:

Model              Dimensions   Accuracy
Shallow NN         16           78.4%
Deep 3-layer NN    8-8-8        79.2%
Deep 3-layer NN    16-12-8      80.6%
CNN                8            82.6%

These correspond to models #1-4 on this page: https://models.minimumloss.xyz/ ...

Regularization

Regularization is the process of ‘smoothing out’ your loss function so that it is less responsive to the peculiarities of a specific training dataset and therefore does better on unseen data. It attempts to resolve the problem of fitting the training data well but doing badly on test data. In this narrow sense, it involves adding a new term to the loss function. In machine learning, the term regularization is also used more broadly to refer to any strategy that helps improve generalization. ...
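To make "adding a new term to the loss function" concrete, here is a minimal sketch. It assumes an L2 penalty (the sum of squared weights scaled by a strength `lam`), which is one common choice; the excerpt does not say which term the post uses. The function names, the toy data and the value of `lam` are all made up for illustration.

```python
def mse_loss(weights, inputs, targets):
    """Mean squared error of a linear model y = w . x (no bias, for brevity)."""
    total = 0.0
    for x, y in zip(inputs, targets):
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(targets)


def l2_penalty(weights, lam):
    """The added regularization term (assumed L2): lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)


def regularized_loss(weights, inputs, targets, lam=0.1):
    """Data loss plus the penalty term."""
    return mse_loss(weights, inputs, targets) + l2_penalty(weights, lam)


# Toy data (hypothetical): both features are identical and the target equals them,
# so many different weight vectors fit the training data perfectly.
inputs = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
targets = [1.0, 2.0, 3.0]

small_w = [0.5, 0.5]   # 0.5*x + 0.5*x = x
large_w = [3.0, -2.0]  # 3*x - 2*x = x

for name, w in [("small", small_w), ("large", large_w)]:
    print(name, mse_loss(w, inputs, targets), regularized_loss(w, inputs, targets))
```

Both weight vectors have zero training error, so the unregularized loss cannot tell them apart. The penalty term breaks the tie in favour of the smaller weights, which is the sense in which the extra term makes the fit less sensitive to quirks of one training set.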

Core Takeaways From the First FT5011 Lecture

I’ve just begun my second deep learning course, FT5011, with Prof Stanley Kok at NUS. Based on the first lecture alone, I have a feeling that this is going to be a great course. Stanley seems to be a very good explainer of concepts, and although I have learnt deep learning and neural networks before, there were still nuances from the first lecture that I want to write about. ...

Understanding the Transformer Model P1

For my assignment, I have to partially implement the T5 transformer model. However, the only thing I have for implementing it is intuition. I don’t understand many parts of it deeply enough. The full code for the implementation is also sprawling, which makes breaking it down into its components difficult. There is quite a good guide for it here, where the authors take us through each aspect of its implementation with notes. I think this is a good starting point. ...