

Up until now we've been dealing with "fully connected neural networks", meaning that every neuron in a given layer is connected to every neuron in the next layer. The order of our features doesn't matter. Consider the simple image and fully connected network below:

Every input node is connected to every node in the next layer - is that really necessary? When you look at this image, how do you know that it's me? You notice the structure in the image (there's a face, shoulders, a smile, etc.). You notice how different structures are positioned and related (the face is on top of the shoulders, etc.). You probably use the shading (colour) to infer things about the image too, but we'll talk more about that later. The point here is that the structure of our data (the pixels) is important. So maybe we should have each hidden node only look at a small area of the image, like this:

We have far fewer parameters now because we're acknowledging that pixels that are far apart are probably not all that related, and so don't need to be connected. We're seeing that structure is important here, so then why should I need to flatten my image at all? Let's be crazy and not flatten the image, but instead make our hidden layer a 2D matrix:

As it stands, each group of 2 x 2 pixels has 4 unique weights associated with it (one for each pixel), which are being summed up into a single value in the hidden layer. But we don't need the weights to be different for each group: we're looking for structure. We don't care if my face is in the top left or the bottom right, we're just looking for a face! Let's summarise the weights into a weight "filter":

We'll display some arbitrary values for our pixels. The filter "convolves" over each group of pixels, multiplies corresponding elements and sums them up to give the values in the output nodes (there's a small code sketch of this operation at the end of the section):

As we'll see, we can add as many of these "filters" as we like to make more complex models that can identify more useful things:

We just made a convolutional neural network (CNN)! Instead of fully-connected hidden nodes, we have 2D filters that we "convolve" over our input data. This buys us two things: we have fewer parameters than a fully connected network, and we preserve the useful structure of our data.

Let's see the difference in code. Here's a fully connected network like the ones we've built so far:

```python
import torch


def linear_block(input_size, output_size):
    return torch.nn.Sequential(
        torch.nn.Linear(input_size, output_size),
        torch.nn.ReLU()
    )


class NN(torch.nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.main = torch.nn.Sequential(
            linear_block(input_size, 16),
            torch.nn.Linear(16, 1)
        )

    def forward(self, x):
        out = self.main(x)
        return out
```

Let's look at a summary of the model:

```
Layer (type:depth-idx)                   Output Shape              Param #
...
```

Oh man! 20,000 parameters in that last layer, geez. Is there a way we can reduce this somehow? Glad you asked! Let's build the convolutional version instead:

```python
def conv_block(input_channels, output_channels):
    return torch.nn.Sequential(
        torch.nn.Conv2d(input_channels, output_channels, (3, 3), padding=1),
        torch.nn.ReLU()
    )


class CNN(torch.nn.Module):
    def __init__(self, input_channels):
        super().__init__()
        self.main = torch.nn.Sequential(
            conv_block(input_channels, 3),
            conv_block(3, 3),
            conv_block(3, 3),
            conv_block(3, 3),
            conv_block(3, 3),
            torch.nn.Flatten(),
            # LazyLinear infers its input size from the flattened conv output
            torch.nn.LazyLinear(1)
        )

    def forward(self, x):
        out = self.main(x)
        return out
```

We reduced that last layer to 1,251 parameters. Nice job! Two quick sketches below if you want to play with these ideas in code, then see you in the next section.
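First, the "multiply corresponding elements and sum them up" operation from earlier. This is a minimal sketch, not how you'd implement convolution in practice, and the 4 x 4 pixel values and 2 x 2 filter are arbitrary made-up numbers, just like in the figures above:

```python
import torch
import torch.nn.functional as F

# A made-up 4 x 4 "image" and a 2 x 2 weight "filter" (arbitrary values)
image = torch.tensor([[1., 2., 0., 1.],
                      [0., 1., 3., 1.],
                      [2., 1., 0., 0.],
                      [1., 0., 1., 2.]])
kernel = torch.tensor([[1., 0.],
                       [0., -1.]])

# Slide the filter over each 2 x 2 group of pixels, multiply
# corresponding elements, and sum them into one output node
out_h = image.shape[0] - kernel.shape[0] + 1
out_w = image.shape[1] - kernel.shape[1] + 1
output = torch.zeros(out_h, out_w)
for i in range(out_h):
    for j in range(out_w):
        window = image[i:i + 2, j:j + 2]
        output[i, j] = (window * kernel).sum()

# PyTorch's built-in convolution gives the same answer
builtin = F.conv2d(image[None, None], kernel[None, None]).squeeze()
print(torch.allclose(output, builtin))  # True
```

Notice that every output node shares the same 4 filter weights, which is exactly why the parameter count drops so dramatically compared to giving each pixel group its own weights.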

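Second, counting parameters. This sketch reuses the `NN` and `CNN` classes defined above; the single-channel 28 x 28 input is an assumption for illustration, so the exact counts will differ from the numbers quoted above, but the fully connected model should come out far heavier:

```python
import torch


def count_params(model):
    """Total number of trainable parameters in a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


# Assumed input: single-channel 28 x 28 images (illustrative only)
nn_model = NN(input_size=28 * 28)
cnn_model = CNN(input_channels=1)

# LazyLinear only materialises its weights after one forward pass
cnn_model(torch.randn(1, 1, 28, 28))

print(f"Fully connected: {count_params(nn_model):,} parameters")
print(f"CNN:             {count_params(cnn_model):,} parameters")
```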