sinabs.reset_states() vs buffer detach

Topic starter

The SINABS documentation makes it clear that during training we need to reset the states of the spiking neural network layers since they are stateful.

In the SINABS docs, this is done via the sinabs.reset_states() method.

In the sinabs-dynapcnn documentation, the call to sinabs.reset_states() is missing, and instead a manual detach() is run on the buffers of the stateful layers.

Are these two methods equivalent?

2 Answers

Hi, Hovren. These two methods are not equivalent; 'reset' will set vmem to 0, but 'detach' won't.
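To make the difference concrete, here is a minimal sketch in plain PyTorch rather than SINABS. The tensor `v_mem` is a hypothetical stand-in for a layer's membrane-potential buffer, and `torch.zeros_like` is only assumed to mirror what a reset to zero leaves behind.

```python
import torch

# Hypothetical stand-in for a spiking layer's membrane-potential buffer.
w = torch.tensor(0.5, requires_grad=True)
v_mem = w * torch.ones(3)  # state that depends on a trainable weight

# detach(): same values, but cut out of the autograd graph.
detached = v_mem.detach()

# Reset to zero: the values themselves are discarded.
reset = torch.zeros_like(v_mem)

print(detached, detached.requires_grad)
print(reset)
```

So detach() keeps the state and only drops its gradient history, while a reset throws the state away.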

Avatar hovren Topic starter 20/10/2023 7:52 pm

@clarence OK, can you please expand on why and how they are different? I understand why I need to reset the neurons' vmem between training samples, but what is detach used for? Do I need both? When should I use one or the other? Thanks!

Avatar hovren Topic starter 06/11/2023 3:28 pm

Hi @clarence, could you please expand on the difference between reset_states() and buffer.detach() and, more importantly, whether both are always required? As it stands, the official documentation and tutorials are not consistent. Thanks!


Hi @hovren,

There are two scenarios that are relevant during training:

1. Gradients must not carry over from one batch to the next after backpropagation. The optimiser's zero_grad() method clears the parameter gradients, but it does not touch the state buffers, so you call detach() on those buffers to cut them from the previous batch's computation graph.

2. Because some of the layers have internal states, you can choose whether the initial state for each sample is always zero or a random initial condition. Often, the most sensible random initial condition is to retain the previous sample's final state as the new initial condition. In that case, use detach(). If you instead want the initial state to be drawn from a normal distribution, or simply to be zero, use the reset_states() method.
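The two scenarios above can be sketched with a toy stateful layer in plain PyTorch. Everything named here (`ToyLeakyIntegrator`, `detach_state`, `reset_state`, `train`) is hypothetical and is not the SINABS API; in SINABS the library's own reset_states() would take the place of `reset_state`. Treat it as an illustration under those assumptions, not as the recommended training loop.

```python
import torch
import torch.nn as nn


class ToyLeakyIntegrator(nn.Module):
    """Hypothetical stateful layer: v[t] = alpha * v[t-1] + w * x[t]."""

    def __init__(self, alpha: float = 0.9):
        super().__init__()
        self.alpha = alpha
        self.w = nn.Parameter(torch.tensor(1.0))
        self.register_buffer("v_mem", torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (time, 1); the state persists across calls.
        v = self.v_mem
        outputs = []
        for x_t in x:
            v = self.alpha * v + self.w * x_t
            outputs.append(v)
        self.v_mem = v  # still attached to this call's autograd graph
        return torch.stack(outputs)

    def detach_state(self) -> None:
        # Keep the state values, drop their gradient history.
        self.v_mem = self.v_mem.detach()

    def reset_state(self) -> None:
        # Discard the state values and start again from zero.
        self.v_mem = torch.zeros_like(self.v_mem)


def train(carry_state: bool, n_batches: int = 3) -> float:
    layer = ToyLeakyIntegrator()
    optimiser = torch.optim.SGD(layer.parameters(), lr=0.01)
    for _ in range(n_batches):
        x = torch.ones(5, 1)
        optimiser.zero_grad()  # clears parameter gradients only
        loss = layer(x).sum()
        loss.backward()
        optimiser.step()
        if carry_state:
            layer.detach_state()  # next batch starts from the old state
        else:
            layer.reset_state()  # next batch starts from zero
    return layer.v_mem.item()


print(train(carry_state=True))
print(train(carry_state=False))
```

In this sketch one of the two calls is needed after every batch: if the state is neither detached nor reset, the next backward pass would try to run through the previous batch's graph, which PyTorch typically rejects with a RuntimeError. Which one to pick depends only on the initial condition you want for the next sample.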

I hope this gives you a better understanding of the difference between these two methods.

Best regards

