@tylerklement:disqus did you manage to get your initial question answered? I am stuck at the same point.

As I understand it, the target outputs form a matrix of height = sample_count and width = 3 (for the 3 possible classifications of the iris dataset).

The result of the feed-forward algorithm, though, is a matrix of height = sample_count and width = 5: as described in step 5, the output matrix we return is formed by horizontally concatenating a vertical bias vector onto the input matrix of width 4 (sepal length/width, etc.). So the returned matrix and the target matrix have different widths (5 vs. 3).
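
To make the mismatch concrete, here is a minimal NumPy sketch of how I am reading it. The variable names and the sample count of 6 are my own placeholders, the data is random, and the bias column is simply filled with ones; this only reproduces the shapes, not the article's actual code.

```python
import numpy as np

sample_count = 6  # hypothetical value; any number of rows shows the same thing

# Input matrix: 4 features per sample (sepal length/width, etc.)
inputs = np.random.rand(sample_count, 4)

# Target outputs: width 3, for the 3 possible iris classifications
targets = np.zeros((sample_count, 3))

# Step 5 as I understand it: concatenate a vertical bias vector
# horizontally onto the input matrix (bias values assumed to be ones)
bias = np.ones((sample_count, 1))
returned = np.hstack((bias, inputs))

print(returned.shape)  # (6, 5)
print(targets.shape)   # (6, 3)
```

With these shapes, an element-wise comparison such as returned - targets cannot work, which is why I suspect I am misreading either step 5 or the term itself.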

Am I misunderstanding what you mean by “target output”?

Thanks in advance for your advice.

