Deep Learning and Artistic Style – Can art be quantified?

We analyze a recent advance in deep learning that teaches computers to paint in the style of famous painters, from Van Gogh to Picasso. Is it really art?



The work "A Neural Algorithm of Artistic Style" discussed here presents yet another interesting application of deep learning and a variation on the theme of computer vision. A convolutional neural network is employed to separate the style and content of different images and recombine these disparate sources into one cohesive piece.

Separating Style and Content


The novelty of this algorithm, as previously mentioned, is that it can take any image and reproduce it in the style of a given artist. First, the content of a photograph is extracted; next, the style of a given painting; finally, these two heterogeneous representations are combined via a parameterised ratio to create a new image.

Fig. 1 from the paper: Convolutional Neural Network (CNN). A given input image is represented as a set of filtered images at each processing stage in the CNN.

A convolutional neural network (CNN) is the tool used to extract both style and content. CNNs are a type of neural network in which filters with shared weights are passed across overlapping image patches to pick out distinguishing characteristics; the resulting outputs are known as feature maps. Here, the publicly available, pretrained VGG network is used, which was among the top performers in the ImageNet Large Scale Visual Recognition Challenge.
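To make the idea of a shared-weight filter concrete, here is a minimal NumPy sketch of a single filter sliding over overlapping patches of a toy image. The function name `conv2d_valid`, the 5x5 image, and the edge-detecting kernel are all illustrative assumptions; a real network such as VGG learns many such filters per layer rather than using a hand-picked one.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one kernel over every overlapping patch of a 2-D image.

    The same kernel weights are reused at every position, which is what
    "shared weights" means. No padding is applied, so the output is smaller
    than the input. (Like most CNN libraries, this computes a
    cross-correlation: the kernel is not flipped.)
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Hypothetical toy image: dark on the left, bright on the right.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hand-picked 1x2 kernel that responds to a dark-to-bright step.
edge_kernel = np.array([[-1.0, 1.0]])

feature_map = conv2d_valid(image, edge_kernel)
print(feature_map)  # each row is [0, 1, 0, 0]: the filter fires only at the edge
```

The feature map is non-zero only where the edge sits, which is the sense in which a filter picks out a distinguishing characteristic.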

To capture image content, a central tenet of deep learning is employed: as progressively more layers (in this case convolutional, or 'hidden', layers) are added to a network, it learns a more abstract representation of the image and becomes less susceptible to local perturbations. The concept or content depicted in the image is therefore captured without concern for minor details.


The authors' method for determining the style of a painting builds on previous work on texture extraction. To find this texture representation, or style, the correlations between feature maps within a layer are analysed. For each layer, these correlations are given by a Gram matrix, calculated by taking the inner products of the vectorised feature maps.
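The Gram matrix computation can be sketched in a few lines of NumPy. The random array below is only a stand-in for the feature maps of one CNN layer, and the name `gram_matrix` and the (channels, height, width) layout are assumptions made for illustration.

```python
import numpy as np

def gram_matrix(features):
    """Correlations between the feature maps of one layer.

    features: array of shape (channels, height, width).
    Returns a (channels, channels) matrix whose entry (i, j) is the inner
    product of vectorised feature map i with vectorised feature map j.
    """
    channels = features.shape[0]
    flat = features.reshape(channels, -1)  # one row per vectorised feature map
    return flat @ flat.T

# Stand-in for real CNN activations: 3 feature maps of size 4x4.
rng = np.random.default_rng(0)
features = rng.standard_normal((3, 4, 4))

G = gram_matrix(features)
print(G.shape)  # (3, 3)
```

Because each entry sums over all spatial positions, the matrix records which features tend to occur together, not where in the image they occur.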

Painting the Picture

The reconstruction begins as a randomly generated white-noise image. Gradient descent is then performed on the pixels of this image, minimising the loss function outlined in the paper, which is a weighted sum of the content and style losses, each a mean squared error. The result is a new image.
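The optimisation loop itself can be illustrated with a deliberately simplified sketch. The real method measures both loss terms on CNN activations; here, purely as an assumption to keep the example runnable without a trained network, each term is replaced by a pixel-space mean squared error against a fixed target. The weights `alpha` and `beta`, the targets, and the learning rate are all hypothetical values.

```python
import numpy as np

def weighted_loss_and_grad(x, content_target, style_target, alpha, beta):
    """Weighted sum of two mean-squared-error terms and its gradient w.r.t. x.

    Stand-in only: both terms compare raw pixels, not CNN activations.
    """
    content_diff = x - content_target
    style_diff = x - style_target
    loss = alpha * np.mean(content_diff ** 2) + beta * np.mean(style_diff ** 2)
    grad = 2.0 * (alpha * content_diff + beta * style_diff) / x.size
    return loss, grad

rng = np.random.default_rng(0)
content_target = np.zeros((8, 8))  # hypothetical "content" stand-in
style_target = np.ones((8, 8))     # hypothetical "style" stand-in
alpha, beta = 1.0, 3.0             # illustrative weighting of the two terms

x = rng.standard_normal((8, 8))    # white-noise starting image
initial_loss, _ = weighted_loss_and_grad(x, content_target, style_target, alpha, beta)

learning_rate = 1.0
for step in range(200):
    loss, grad = weighted_loss_and_grad(x, content_target, style_target, alpha, beta)
    x -= learning_rate * grad      # only the image changes; the targets stay fixed

final_loss, _ = weighted_loss_and_grad(x, content_target, style_target, alpha, beta)
print(initial_loss, final_loss)
```

With these stand-in terms the image settles at the weighted average of the two targets, (alpha * 0 + beta * 1) / (alpha + beta) = 0.75 for every pixel, which shows how the ratio of the two weights decides which source dominates the result.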

The inputs to the content loss function are the activations found in an upper layer of the CNN for both the input photograph and the generated image, passed through the same filters. The artist’s style is integrated by minimising the distance between the Gram matrix of the painting and the Gram matrix of the image being generated.
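A rough sketch of the two loss terms follows, assuming the layer activations are already available as arrays of shape (channels, height, width). The function names, the random stand-in activations, and the use of a plain mean as normalisation are assumptions; the paper's exact scaling constants are not reproduced here.

```python
import numpy as np

def gram_matrix(features):
    """Inner products between vectorised feature maps: (C, H, W) -> (C, C)."""
    flat = features.reshape(features.shape[0], -1)
    return flat @ flat.T

def content_loss(generated_features, photo_features):
    """Mean squared error between activations, position by position."""
    return np.mean((generated_features - photo_features) ** 2)

def style_loss(generated_features, painting_features):
    """Mean squared error between the Gram matrices of the two activations."""
    diff = gram_matrix(generated_features) - gram_matrix(painting_features)
    return np.mean(diff ** 2)

# Stand-in activations: 4 feature maps of size 6x6 (not from a real CNN).
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 6, 6))

# Shuffle the spatial positions identically in every feature map.
perm = rng.permutation(6 * 6)
shuffled = features.reshape(4, -1)[:, perm].reshape(4, 6, 6)

shuffled_content = content_loss(shuffled, features)
shuffled_style = style_loss(shuffled, features)
print(shuffled_content, shuffled_style)
```

Shuffling the positions changes the content loss substantially but leaves the style loss at zero up to rounding, since the Gram matrix ignores spatial arrangement. That is why the two terms can be matched against different source images without directly contradicting each other.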

Is this Art?

That is subjective, but probably not. It is, however, an interesting question. We could view this progress as a step on the road towards artificial intelligence creating art, or as a piece of software that expands the digital artist’s palette. Usually, a first foray into art involves copying a favourite painter's style or a favourite musician’s sound. This work could be seen as analogous to that, and I for one am excited to see where it takes us.