Memorization to generalization: The emergence of diffusion models from associative memory
Publication year
2024
In
NeurIPS 2024: Workshop on Scientific Methods for Understanding Deep Learning, pp. 1-13
Annotation
NeurIPS 2024: Workshop on Scientific Methods for Understanding Deep Learning (Vancouver, Canada, 16 December 2024)
Publication type
Article in monograph or in proceedings
Organization
SW OZ DCC AI
Languages used
English (eng)
Book title
NeurIPS 2024: Workshop on Scientific Methods for Understanding Deep Learning
Page start
p. 1
Page end
p. 13
Subject
Cognitive artificial intelligence
Abstract
Hopfield networks are associative memory systems designed to store and retrieve specific patterns as local minima of an energy landscape. In the classical Hopfield model, an interesting phenomenon occurs when the model reaches its critical memory load: spurious states, or unintended stable points, emerge at the end of the retrieval dynamics. These states often appear as mixtures of the stored patterns, leading to incorrect recall. In this work, we propose that spurious states are not necessarily a defect of the retrieval dynamics, but rather mark the onset of generalization. We employ diffusion models, commonly used in generative modelling, to demonstrate that their generalization stems from a phase transition that occurs as the number of training samples increases. In the low-data regime, the model exhibits a strong memorization phase, in which the network creates a distinct basin of attraction for each sample in the training set, akin to the Hopfield model below the critical memory load. In the large-data regime, a different phase appears in which increasing the training set size fosters the creation of new attractor states that correspond to manifolds of the generated samples. Spurious states appear at the boundary of this transition: they are emergent attractor states absent from the training set that nevertheless have distinct basins of attraction around them. In the Hopfield description, these spurious states correspond to mixtures of "fundamental memories" that facilitate generalization through the superposition of underlying features, resulting in the creation of novel samples. Our findings offer a new perspective on the memorization-generalization phenomenon in diffusion models through the lens of Hopfield networks, illuminating the previously underappreciated view of diffusion models as Hopfield networks above the critical memory load.
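To make the Hopfield side of this picture concrete, the following minimal, self-contained NumPy sketch (illustrative only, not the authors' code; the network size, memory loads, and noise level are assumptions chosen for the demo) stores random binary patterns with the Hebbian rule and runs asynchronous sign-update retrieval. Below the critical memory load (roughly 0.138 patterns per neuron), a noisy probe is restored to the stored pattern; above it, retrieval often terminates in a spurious fixed point that overlaps several stored patterns without matching any one of them.

import numpy as np

def store(patterns):
    """Hebbian weight matrix for +/-1 patterns of shape (p, n); zero diagonal."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def retrieve(W, state, sweeps=20, rng=None):
    """Asynchronous sign-update dynamics; converges to a local energy minimum."""
    rng = rng or np.random.default_rng()
    state = state.copy()
    for _ in range(sweeps):
        changed = False
        for i in rng.permutation(len(state)):
            s = 1.0 if W[i] @ state >= 0 else -1.0
            if s != state[i]:
                state[i] = s
                changed = True
        if not changed:  # fixed point reached
            break
    return state

rng = np.random.default_rng(0)
n = 200  # number of neurons (an illustrative choice)
# One load below and one above the classical capacity of ~0.138 patterns/neuron.
for p in (10, 60):
    patterns = rng.choice([-1.0, 1.0], size=(p, n))
    W = store(patterns)
    probe = patterns[0].copy()
    probe[rng.choice(n, size=n // 10, replace=False)] *= -1  # 10% bit flips
    fixed = retrieve(W, probe, rng=rng)
    overlaps = np.abs(patterns @ fixed) / n
    print(f"load p/n = {p / n:.2f}: best overlap with a stored pattern = {overlaps.max():.2f}")

In the framing of the abstract, such spurious fixed points are the Hopfield-side analogue of the emergent attractors described above: states absent from the stored set that still possess their own basins of attraction.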
This item appears in the following Collection(s)
- Academic publications [246860]
- Electronic publications [134292]
- Faculty of Social Sciences [30549]
- Open Access publications [107812]