Hands-On Neural Networks with Keras

Capturing patterns hierarchically

We previously saw how a specific model configuration with two neurons, each equipped with a sigmoid activation function, manages to capture two different curvatures in our feature space, which are then combined to produce our decision boundary, represented by the aforementioned output. However, this is just one possible configuration, leading to one possible decision boundary.
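As a reference point, here is a minimal sketch of that two-neuron sigmoid configuration in Keras. The dataset is an assumption (scikit-learn's make_moons standing in for a generic two-feature binary problem), not necessarily the one used earlier:

```python
# Sketch only: one hidden layer of two sigmoid neurons on an assumed toy dataset
from sklearn.datasets import make_moons          # assumed stand-in dataset
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

model = Sequential([
    Dense(2, activation='sigmoid', input_shape=(2,)),  # two sigmoid neurons
    Dense(1, activation='sigmoid')                     # output neuron for the decision boundary
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=1000, verbose=0)
```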

The following diagram shows a model with two hidden layers, each using the sigmoid activation function, trained for 1,000 epochs:

The following diagram shows a model with one hidden layer, composed of two neurons, with a rectified linear unit activation function, trained for 1,000 epochs, on the same dataset:

The following diagram shows a model with one hidden layer, composed of three neurons, with a rectified linear unit activation function, again on the same dataset:
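The three configurations shown in these diagrams can be reproduced with a small helper. This is a hedged sketch: the exact layer widths and the dataset (reusing X and y from the earlier snippet) are assumptions rather than the book's exact setup:

```python
# Build and train the three configurations illustrated above (assumed sizes)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def build_model(hidden_sizes, activation):
    """Small binary classifier with the given hidden layer sizes and activation."""
    model = Sequential()
    model.add(Dense(hidden_sizes[0], activation=activation, input_shape=(2,)))
    for size in hidden_sizes[1:]:
        model.add(Dense(size, activation=activation))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

configs = {
    'two sigmoid hidden layers':        ([2, 2], 'sigmoid'),
    'one ReLU hidden layer, 2 neurons': ([2],    'relu'),
    'one ReLU hidden layer, 3 neurons': ([3],    'relu'),
}

models = {}
for name, (sizes, activation) in configs.items():
    models[name] = build_model(sizes, activation)
    models[name].fit(X, y, epochs=1000, verbose=0)  # same dataset, 1,000 epochs
```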

Note that by using different activation functions, and by manipulating the number of hidden layers and their neurons, we can achieve very different decision boundaries. It is up to us to assess which of them is most predictive and best suited to our use case. Mostly, this is done through experimentation, although domain knowledge about the data you are modelling may go a long way.
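One way to carry out that comparison is to plot each model's decision boundary over the feature space. The following is a sketch under the same assumptions as before (reusing X, y, and the models dictionary from the previous snippets); the plotting style is illustrative, not the book's exact figure code:

```python
# Visual comparison of decision boundaries for the trained configurations
import numpy as np
import matplotlib.pyplot as plt

def plot_decision_boundary(model, X, y, title):
    # Evaluate the model over a grid covering the feature space
    xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
                         np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200))
    grid = np.c_[xx.ravel(), yy.ravel()]
    probs = model.predict(grid, verbose=0).reshape(xx.shape)
    plt.contourf(xx, yy, (probs > 0.5).astype(float), alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, s=10)
    plt.title(title)
    plt.show()

for name, trained_model in models.items():
    plot_decision_boundary(trained_model, X, y, name)
```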