Momentum, RMSprop and Adam Optimizers
So, I’m having difficulty getting RMSprop and Adam to work.
I’ve correctly implemented Momentum as an optimization algorithm: compared with plain Gradient Descent, the cost goes down much faster with Momentum, and for the same number of epochs the test-set accuracy is also higher.
Here is the code:
```python
# only momentum
elif name == 'momentum':
    # calculate momentum for every layer
    for i in range(self.number_of_layers - 1):
        self.v[f'dW{i}'] = beta1 * self.v[f'dW{i}'] + (1 - beta1) * self.gradients[f'dW{i}']
        self.v[f'db{i}'] = beta1 * self.v[f'db{i}'] + (1 - beta1) * self.gradients[f'db{i}']
    # update parameters
    for i in range(self.number_of_layers - 1):
        self.weights[i] = self.weights[i] - self.learning_rate * self.v[f'dW{i}']
        self.biases[i] = self.biases[i] - self.learning_rate * self.v[f'db{i}']
```
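For reference, the same momentum update can be sketched in isolation on plain NumPy arrays (the array sizes and gradient values here are made up for illustration; only the update rule matches the code above):

```python
import numpy as np

def momentum_step(w, dw, v, learning_rate=0.1, beta1=0.9):
    """One momentum update: v is an exponential moving average of gradients."""
    v = beta1 * v + (1 - beta1) * dw
    w = w - learning_rate * v
    return w, v

# with a constant gradient, the velocity v ramps up toward the gradient value
w = np.zeros(3)
v = np.zeros(3)
dw = np.ones(3)  # constant gradient for illustration
for _ in range(5):
    w, v = momentum_step(w, dw, v)
```

With a constant gradient of 1, the velocity after t steps is 1 - beta1**t, which is why momentum accelerates along directions where gradients consistently agree.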
I’ve tried everything I could come up with to implement both RMSprop and Adam, with no success. The code is below. Any help on why it isn’t working would be much appreciated!
```python
# only rms
elif name == 'rms':
    # calculate rmsprop for every layer
    for i in range(self.number_of_layers - 1):
        self.s[f'dW{i}'] = beta2 * self.s[f'dW{i}'] + (1 - beta2) * self.gradients[f'dW{i}']**2
        self.s[f'db{i}'] = beta2 * self.s[f'db{i}'] + (1 - beta2) * self.gradients[f'db{i}']**2
    # update parameters
    for i in range(self.number_of_layers - 1):
        self.weights[i] = self.weights[i] - self.learning_rate * self.gradients[f'dW{i}'] / (np.sqrt(self.s[f'dW{i}']) + epsilon)
        self.biases[i] = self.biases[i] - self.learning_rate * self.gradients[f'db{i}'] / (np.sqrt(self.s[f'db{i}']) + epsilon)
```
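As a sanity check of the update rule itself, here is a minimal standalone RMSprop step on NumPy arrays (values are illustrative, not from the model above). The point of dividing by the running root-mean-square is that parameters with very different gradient magnitudes end up with similar effective step sizes:

```python
import numpy as np

def rmsprop_step(w, dw, s, lr=0.01, beta2=0.999, eps=1e-8):
    """One RMSprop update: s is a moving average of squared gradients."""
    s = beta2 * s + (1 - beta2) * dw ** 2
    w = w - lr * dw / (np.sqrt(s) + eps)
    return w, s

# two gradients four orders of magnitude apart take nearly identical steps
w = np.zeros(2)
s = np.zeros(2)
dw = np.array([100.0, 0.01])
w, s = rmsprop_step(w, dw, s)
```

On the first step (starting from s = 0), the update is approximately lr * sign(dw) / sqrt(1 - beta2) for both components, regardless of the raw gradient scale.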
```python
# adam optimizer
elif name == 'adam':
    # counter
    # this resets every time an epoch finishes
    self.t += 1
    # loop through layers
    for i in range(self.number_of_layers - 1):
        # calculate v and s
        self.v[f'dW{i}'] = beta1 * self.v[f'dW{i}'] + (1 - beta1) * self.gradients[f'dW{i}']
        self.v[f'db{i}'] = beta1 * self.v[f'db{i}'] + (1 - beta1) * self.gradients[f'db{i}']
        self.s[f'dW{i}'] = beta2 * self.s[f'dW{i}'] + (1 - beta2) * np.square(self.gradients[f'dW{i}'])
        self.s[f'db{i}'] = beta2 * self.s[f'db{i}'] + (1 - beta2) * np.square(self.gradients[f'db{i}'])
        # bias correction
        self.v1[f'dW{i}'] = self.v[f'dW{i}'] / (1 - beta1**self.t)
        self.v1[f'db{i}'] = self.v[f'db{i}'] / (1 - beta1**self.t)
        self.s1[f'dW{i}'] = self.s[f'dW{i}'] / (1 - beta2**self.t)
        self.s1[f'db{i}'] = self.s[f'db{i}'] / (1 - beta2**self.t)
    # update parameters
    for i in range(self.number_of_layers - 1):
        self.weights[i] = self.weights[i] - self.learning_rate * np.divide(self.v1[f'dW{i}'], (np.sqrt(self.s1[f'dW{i}']) + epsilon))
        self.biases[i] = self.biases[i] - self.learning_rate * np.divide(self.v1[f'db{i}'], (np.sqrt(self.s1[f'db{i}']) + epsilon))
```
```python
# additional information
# epsilon = 1e-8
# beta1 = 0.9
# beta2 = 0.999
```
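One detail worth double-checking is the comment that `self.t` resets every epoch: Adam's bias correction assumes t is the total number of updates taken so far, so a counter that resets would repeatedly re-amplify the early updates. Below is a minimal standalone Adam step with an always-increasing counter (all names are illustrative and not taken from the class above):

```python
import numpy as np

def adam_step(w, dw, v, s, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t must be the total step count, never reset per epoch."""
    v = beta1 * v + (1 - beta1) * dw             # first moment (EMA of gradients)
    s = beta2 * s + (1 - beta2) * np.square(dw)  # second moment (EMA of squared gradients)
    v_hat = v / (1 - beta1 ** t)                 # bias correction for zero-initialized v
    s_hat = s / (1 - beta2 ** t)                 # bias correction for zero-initialized s
    w = w - lr * v_hat / (np.sqrt(s_hat) + eps)
    return w, v, s

w = np.zeros(2)
v = np.zeros(2)
s = np.zeros(2)
dw = np.array([1.0, -1.0])
for t in range(1, 6):  # t grows monotonically across all epochs
    w, v, s = adam_step(w, dw, v, s, t)
```

With a constant gradient, the bias-corrected moments satisfy v_hat = dw and s_hat = dw**2 exactly, so each parameter moves by roughly lr per step in the direction opposing its gradient.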