# Naturally Occurring Equivariance in Neural Networks

> Source: <https://distill.pub/2020/circuits/equivariance>
> Published: 2020-12-08 20:00:00+00:00

Convolutional neural networks contain a hidden world of symmetries within themselves. This symmetry is a powerful tool in understanding the features and circuits inside neural networks. It also suggests that efforts to design neural networks with additional symmetries baked in (eg.
To see these symmetries, we need to look at the individual neurons inside convolutional neural networks and the circuits that connect them.
It turns out that many neurons are slightly transformed versions of the same basic feature.
This includes rotated copies of the same feature, scaled copies, flipped copies, features detecting different colors, and much more.
We sometimes call this phenomenon “equivariance,” since it means that switching the neurons is equivalent to transforming the input.
Before we talk about the examples introduced in this article, let’s talk about how this definition maps to the classic example of equivariance in neural networks: translation and convolutional neural nets. Recall the usual definition: a function $f$ is equivariant to a group of transformations if $f(g \cdot x) = g \cdot f(x)$ for every transformation $g$ in the group. In a conv net, translating the input image is equivalent to translating the neurons in the hidden layers (ignoring pooling, striding, etc.). Formally, $g$ is a translation and $f$ maps images to hidden layer activations. Then $g$ acts on the input image $x$ by translating it spatially, and acts on the activations by also spatially translating them.
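The translation case can be checked numerically. The following is a minimal sketch, not code from the article: it assumes a single convolutional layer with wrap-around (circular) padding so that the equality is exact, and the names `circular_correlate`, `f`, and `g` are illustrative only.

```python
import numpy as np

def circular_correlate(image, kernel):
    """Cross-correlate a 2D image with a small kernel using wrap-around padding."""
    out = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    for dy in range(kh):
        for dx in range(kw):
            # Shift the image so that pixel (y + dy, x + dx) lines up with (y, x).
            out += kernel[dy, dx] * np.roll(image, shift=(-dy, -dx), axis=(0, 1))
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
kernel = rng.normal(size=(3, 3))

def f(x):
    # A one-layer "network" (an assumption for this sketch): convolution, then ReLU.
    return np.maximum(circular_correlate(x, kernel), 0.0)

def g(x):
    # Translate by 2 pixels along axis 0 and 3 pixels along axis 1, wrapping around.
    return np.roll(x, shift=(2, 3), axis=(0, 1))

# Equivariance: translating the input equals translating the activations.
translation_equivariant = np.allclose(f(g(image)), g(f(image)))
print(translation_equivariant)
```

With ordinary zero padding, pooling, or striding the equality only holds approximately or away from the borders, which is why the text sets those aside.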
Now let’s consider the case of curve detectors (the first example in the Equivariant Features section), which have ten rotated copies. In this case, $g$ is one of ten rotations and $f$ maps a position in an image to a ten-dimensional vector describing how much each curve detector fires. Then $g$ acts on the input image by rotating it around that position, and acts on the hidden layers by reorganizing the neurons so that their orientations correspond to the appropriate rotations. This satisfies, at least approximately, the original definition of equivariance.
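The "reorganizing the neurons" action can be illustrated with a toy filter bank. This sketch makes two simplifying assumptions that are not from the article: it uses four rotated copies instead of ten (so that rotation by 90 degrees is exact on a pixel grid), and each "neuron" is a single linear filter applied to one patch. The names `base_filter` and `f` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
base_filter = rng.normal(size=(5, 5))

# Four copies of one feature, rotated by 0, 90, 180, and 270 degrees.
filters = [np.rot90(base_filter, k) for k in range(4)]

def f(patch):
    """Response of each rotated unit to a patch centered at one position."""
    return np.array([np.sum(w * patch) for w in filters])

patch = rng.normal(size=(5, 5))

# g acting on the image: rotate the patch by 90 degrees.
rotated_input_response = f(np.rot90(patch))
# g acting on the neurons: cyclically shift which unit holds which response.
permuted_units_response = np.roll(f(patch), 1)

rotation_equivariant = np.allclose(rotated_input_response, permuted_units_response)
print(rotation_equivariant)
```

Rotating the input moves each response to the neighboring orientation rather than changing the set of responses, which is the sense in which switching neurons is equivalent to transforming the input.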
This transformed neuron form of equivariance is a special case of equivariance. There are many ways a neural network could be equivariant without having transformed versions of neurons. Conversely, we’ll also see a number of examples of equivariance that don’t map exactly to the group theory definition of equivariance: some have “holes” where a transformed neuron is missing, while others consist of a set of transformations that have a weaker structure than a group or don’t correspond to a simple action on the image. But this general structure remains.
Equivariance can be seen as a kind of “circuit motif,” an abstract recurring pattern across circuits, analogous to motifs in systems biology.
In this article, we’ll focus on examples of equivariance in InceptionV1.
Rotational Equivariance: One example of equivariance is rotated versions of the same feature. These are especially common in early vision, for example curve detectors, high-low frequency detectors, and line detectors.
One can test that these are genuinely rotated versions of the same feature by taking examples that cause one to fire, rotating them, and checking that the others fire as expected. The article on curve detectors tests their equivariance through several experiments, including rotating stimuli that activate one neuron and seeing how the others respond.
Scale Equivariance: Rotated versions aren’t the only kind of variation we see. It’s also quite common to see the same feature at different scales, although usually the scaled features occur at different layers. For example, we see circle detectors across a huge variety of scales, with the small ones in early layers and the large ones in later layers.
Hue Equivariance: For color-detecting features, we often see variants detecting the same thing in different hues. For example, color center-surround units will detect one hue in the center, and the opposing hue around it. Units can be found doing this up until the seventh or even eighth layer of InceptionV1.
Hue-Rotation Equivariance: In early vision, we very often see color contrast units. These units detect one hue on one side, and the opposite hue on the other. As a result, they have variation in both hue and rotation. These variations are particularly interesting, because there’s an interaction between hue and rotation: cycling hue by 180 degrees flips which hue is on which side, and so is equivalent to rotating by 180 degrees.
In the following diagram, we show orientation rotating the whole 360 degrees, but hue only rotating 180. At the bottom of the chart, it wraps around to the top but shifts by 180 degrees.
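The interaction between hue and rotation can be made concrete with a toy stimulus. This is a sketch under stated assumptions: the stimulus is represented as a grid of hue angles in degrees, split into a left and a right half, and the function names are hypothetical rather than taken from the article.

```python
import numpy as np

def contrast_stimulus(hue_degrees, size=6):
    """Hue map with `hue_degrees` on the left half and the opposite hue on the right."""
    hues = np.empty((size, size))
    hues[:, : size // 2] = hue_degrees % 360
    hues[:, size // 2 :] = (hue_degrees + 180) % 360
    return hues

def cycle_hue(hues, degrees):
    """Shift every hue around the color wheel by `degrees`."""
    return (hues + degrees) % 360

def rotate_180(hues):
    """Rotate the stimulus spatially by 180 degrees."""
    return np.rot90(hues, 2)

stimulus = contrast_stimulus(30)

# Cycling hue by 180 degrees gives the same stimulus as rotating by 180 degrees.
hue_cycle_equals_rotation = np.array_equal(cycle_hue(stimulus, 180), rotate_180(stimulus))
print(hue_cycle_equals_rotation)
```

Because the two transformations coincide, a full 360 degrees of orientation combined with only 180 degrees of hue already covers every distinct unit, which matches the wrap-around described for the diagram.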
Reflection Equivariance: As we move into the mid layers of the network, rotated variations become less prominent, but horizontally flipped pairs become quite prevalent.
Miscellaneous Equivariance: Finally, we see variations of features transformed in other miscellaneous ways. For example, short vs long-snouted versions of the same dog head features, or human vs dog versions of the same feature. We even see units which are equivariant to camera perspective (found in a Places365
The equivariant behavior we observe in neurons is really a reflection of a deeper symmetry that exists in the weights of neural networks and the circuits they form.
We’ll start by focusing on rotationally equivariant features that are formed from rotationally invariant features. This “invariant→equivariant” case is probably the simplest form of equivariant circuit. Next, we’ll look at “equivariant→invariant” circuits, and then finally the more complex “equivariant→equivariant” circuits.
High-Low Circuit: In the following example, we see high-low frequency detectors get built from a high-frequency factor and a low-frequency factor (both factors correspond to a combination of neurons in the previous layer). Each high-low frequency detector responds to a transition in frequency in a given direction, detecting high-frequency patterns on one side, and low frequency patterns on the other. Notice how the same weight pattern rotates, making rotated versions of the feature.
Contrast→Center Circuit: This same pattern can be used in reverse to turn rotationally equivariant features back into rotationally invariant features (an “equivariant→invariant” circuit). In the following example, we see several green-purple color contrast detectors get combined to create green-purple and purple-green center-surround detectors. Compare the weights in this circuit to the ones in the previous one. It’s literally the same weight pattern transposed.
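One way to read "the same weight pattern transposed" is as a swap of the input and output axes of the weight tensor. The sketch below follows that reading, which is an assumption, and uses four 90-degree rotations and random weights rather than the actual InceptionV1 weights. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
base_pattern = rng.normal(size=(3, 3))

# invariant -> equivariant: 4 output units, 1 input unit, 3x3 spatial weights.
# Each output unit uses the same pattern, rotated by a further 90 degrees.
weights_up = np.stack([np.rot90(base_pattern, k) for k in range(4)])[:, None]

# equivariant -> invariant: swap the output and input axes.
weights_down = weights_up.transpose(1, 0, 2, 3)  # shape (1, 4, 3, 3)

def invariant_unit(inputs):
    """Response of the single output unit to a (4, 3, 3) stack of input patches."""
    return float(np.sum(weights_down[0] * inputs))

def act(inputs):
    """Rotate the scene by 90 degrees: rotate each map and cycle the orientations."""
    rotated = np.stack([np.rot90(m) for m in inputs])
    return np.roll(rotated, 1, axis=0)

inputs = rng.normal(size=(4, 3, 3))

# The output does not change when the equivariant inputs are transformed together.
output_is_invariant = np.isclose(invariant_unit(inputs), invariant_unit(act(inputs)))
print(output_is_invariant)
```

The rotation of the weights across input channels exactly cancels the rotation of the inputs, so the combined unit responds the same way at every orientation.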
Sometimes we see one of these immediately follow the other: equivariance is created, and then immediately (and partially) used to create invariant units.
BW-Color Circuit: In the following example, a generic color factor and a black and white factor are used to create black and white vs color features. Later, the black and white vs color features are combined to create units which detect black and white at the center, but color around, or vice versa.
Line→Circle/Divergence Circuit: Another example of equivariant features being combined to create invariant features is very early line-like complex Gabor detectors being combined to create a small circle unit and diverging lines unit.
Curve→Circle/Evolute Circuit: For a more complex example of rotational equivariance being combined to create invariant units, we can look at curve detectors being combined to create circle and evolute detectors. This circuit is also an example of scale equivariance. The same general pattern which turns small curve detectors into a small circle detector turns large curve detectors into a large circle detector. The same pattern which turns medium curve detectors into a medium evolute detector turns large curves into a large evolute detector.
Human-Animal Circuit: So far, all of the examples we’ve seen of circuits have involved rotation. These human-animal and animal-human detectors are an example of horizontal flip equivariance instead:
Invariant Dog Head Circuit: Conversely, this example (part of the broader oriented dog head circuit) shows left- and right-oriented dog heads being combined into a pose-invariant dog head detector. Notice how the weights flip.
The circuits we’ve looked at so far were either “invariant→equivariant” or “equivariant→invariant.” Either they had invariant input units, or invariant output units. Circuits of this form are quite simple: the weights rotate, or flip, or otherwise transform, but only in response to the transformation of a single feature. When we look at “equivariant→equivariant” circuits, things become a bit more complex. Both the input and output features transform, and we need to consider the relative relationship between the two units.
Hue→Hue Circuit: Let’s start with a circuit connecting two sets of hue-equivariant center-surround detectors. Each unit in the second layer is excited by the unit selecting for a similar hue in the previous layer.
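In an "equivariant→equivariant" circuit like this, the weight between two units depends only on their relative hue, which makes the weight matrix circulant. The following sketch assumes six evenly spaced hues and a made-up weight profile (excitation for similar hues, mild inhibition otherwise). Neither the number of hues nor the profile values come from the article.

```python
import numpy as np

n_hues = 6
hue_index = np.arange(n_hues)

# Circular distance between output hue i and input hue j.
offset = (hue_index[:, None] - hue_index[None, :]) % n_hues
distance = np.minimum(offset, n_hues - offset)

# Hypothetical weight profile indexed by hue distance 0, 1, 2, 3.
profile = np.array([1.0, 0.5, -0.2, -0.4])
weights = profile[distance]  # circulant (n_hues, n_hues) matrix

rng = np.random.default_rng(0)
first_layer = rng.normal(size=n_hues)

def cycle(activations, steps=1):
    """Cycle hue: move each activation to the unit for the next hue."""
    return np.roll(activations, steps)

# Cycling the input hues cycles the output hues in the same way.
hue_equivariant = np.allclose(weights @ cycle(first_layer), cycle(weights @ first_layer))
print(hue_equivariant)
```

The largest weights sit on the diagonal of `weights`, matching the observation that each second-layer unit is most excited by the first-layer unit for a similar hue.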