2-minute Video Preview · ArXiv Preprint · ECCV 2020 Oral · Source Code on Github · Demo Colab Notebook

Deep network training is a blind optimization procedure where
programmers define objectives but not the solutions that emerge.
In this paper we ask if deep networks can be created in a different way.
**Can the rules in a network be directly rewritten**?

[Interactive comparison: Original Model vs. rewritten models]

There are two reasons to want to rewrite a deep network directly:

- To gain insight about how a deep network *organizes its knowledge*.
- To enable creative users to quickly make novel models for which there is *no existing data set*.

Model rewriting envisions a way to construct deep networks according
to a user's intentions. Rather than limiting networks to imitating data
that we already have, rewriting allows deep networks to model a world
that follows new rules that a user *wishes to have*.

Model rewriting lets a person edit the internal rules of a deep network directly instead of training against a big data set.

The difference between *training* and *rewriting* is akin
to the difference between natural selection and genetic engineering.
While training allows efficient optimization of a global objective,
it does not allow direct specification of internal mechanisms.
In contrast, rewriting allows a person to directly choose the internal
rules they wish to include, even if these choices do not happen to
match an existing data set or optimize a global objective.

Rewriting a model is challenging: to do it effectively, the user must develop a causally correct understanding of the structure, behavior, and purpose of the network's internal parameters. Until now, it has been unknown whether direct model rewriting is a reasonable proposition.

Our paper shows that model rewriting is feasible.

This 10-minute talk demonstrates the user interface and gives an overview of the mathematical ideas.

The mp4 files of the videos can be downloaded directly.

Our method is based on the hypothesis that the weights of a generator act as an Optimal Linear Associative Memory (OLAM).

OLAM is a classic single-layer neural data structure for memorizing associations, described independently by Teuvo Kohonen and James A. Anderson in the 1970s. In our case, we hypothesize that within a large modern multilayer convolutional network, each individual layer plays the role of an OLAM that stores a set of rules associating keys, which denote meaningful context, with values, which determine output.
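As a toy illustration of the OLAM idea (a sketch with made-up dimensions, not the paper's code), a single weight matrix can memorize a set of key-to-value associations via a least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy associative memory: store n_rules key->value pairs in one matrix W,
# so that W @ k ~= v for every stored pair.
d_key, d_val, n_rules = 8, 4, 5
K = rng.standard_normal((d_key, n_rules))   # keys, one per column
V = rng.standard_normal((d_val, n_rules))   # values, one per column

# OLAM: the least-squares-optimal weights are W = V K^+ (Moore-Penrose
# pseudoinverse of the key matrix).
W = V @ np.linalg.pinv(K)

# With fewer rules than key dimensions, recall is exact.
print(np.allclose(W @ K, V))  # True
```

When there are more rules than key dimensions, the same formula gives the weights that minimize the total squared recall error, which is the "optimal" in OLAM.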

The implication of the optimality hypothesis is that an individual memorized rule is stored in the weights in a slot corresponding to a particular subspace of rank-one updates, determined by the fixed global key statistics and by the context (key) of the rule being changed, but not by its value.

In other words, one rule corresponds to a line of memory that can be rewritten freely.

Therefore, to alter a rule, we first identify the subspace direction
*d* corresponding to one specific rule's memory slot, then we
make a change in that direction to achieve the desired change. Since the
update direction minimizes interference with other rules, one rule can be
changed dramatically to fit any goal the user wishes to specify,
while leaving the rest of the model's behavior essentially unchanged.
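To make this concrete, here is a self-contained numpy sketch (hypothetical dimensions and variable names; see the paper for the exact formulation): the update direction is d = C⁻¹k*, where C is the second-moment matrix of the keys, and the rank-one update is scaled so the edited weights map the target key k* to the desired new value v*.

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy layer-as-memory: W was fit to many observed key/value pairs.
d_key, d_val, n_obs = 8, 4, 20
K = rng.standard_normal((d_key, n_obs))     # observed context keys
V = rng.standard_normal((d_val, n_obs))     # corresponding values
W = V @ np.linalg.pinv(K)                   # least-squares memory

C = K @ K.T                                 # global key statistics (second moment)
k_star = K[:, 0]                            # context of the rule to rewrite
v_star = rng.standard_normal(d_val)         # new value the user wants

# Update direction d = C^{-1} k_star: this rule's memory slot.
d = np.linalg.solve(C, k_star)

# Rank-one update, scaled so the edited memory maps k_star to v_star.
lam = (v_star - W @ k_star) / (d @ k_star)
W_new = W + np.outer(lam, d)

print(np.allclose(W_new @ k_star, v_star))  # True: the edited rule now holds
```

Because d is whitened by C⁻¹, the change has minimal effect, in the least-squares sense, on responses to the other keys drawn from the same statistics.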

The reasoning is discussed further in the talk, and full details are described in the paper.

@inproceedings{bau2020rewriting,
  title={Rewriting a Deep Generative Model},
  author={Bau, David and Liu, Steven and Wang, Tongzhou and Zhu, Jun-Yan and Torralba, Antonio},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2020}
}