Studying artificial intelligence (analytical note No. 1)

Task: analyze the capabilities of artificial intelligence focused on creating visual solutions.

We live in an amazing time of paradigm shifts!

Modern artificial intelligence is gradually learning to perform functional tasks that used to be the prerogative of humans. Only yesterday there was confidence that the last "bastion", artistic activity, would not be taken for a long time, and that our generation, and possibly the next, would have time to work within the "classical" paradigm. Apparently, however, the development of artificial intelligence should have been forecast as an exponential progression rather than an arithmetic one.

Today, many different neural networks are able to generate images from a verbal description: Photosonic, DALL-E 2, Stable Diffusion, NightCafe, Jasper Art, etc. The process is simple: a human user writes a set of keywords (tags, prompts), and the neural network generates an image based on the specified semantics. The more accurate the description, the more closely the set of keywords and the resulting visual solution correlate.

A lot has been written about such neural networks, but previously the result of generation was either clearly different from visual solutions created by humans (the neural network, as it were, immediately gave away its artificial nature) or incorrect from a visual point of view (there were distortions that a human would hardly have allowed). Accordingly, the artist could feel relatively calm, knowing that the machine had not yet mastered visual literacy. However, everything changed during the last few months of 2022, after the release of the 4th version of the MidJourney neural network.

Today, the MidJourney neural network is able to create (generate) very interesting visual solutions. This essay intentionally avoids the terms "high-quality" (in place of "interesting") and "work" (in place of "visual solution"), because such terms require more detailed research, including into the content of the visual solutions.

To formulate the problem and make an initial visual assessment of the state of affairs, a small study was conducted. Ten generated visual solutions created by various users were selected. The images were taken from a community on the VKontakte social network ( ).

The images were selected according to the following technical, visual and thematic criteria:
 • without visual distortions, that is, meaningless and unsystematic transformations of the main (dominant) or additional (subdominant) objects of the visual solution;
 • without stylization, for example, a graphic or painterly visual solution made to look like a photograph;
 • without cyberpunk, steampunk, surrealist, or similar visual solutions, so as not to create the illusion that MidJourney can only generate content of this kind. The selection was deliberately biased toward the traditional (classical) and the understandable (from a pictorial point of view);
 • with a greater variety of painting, graphic, and decorative techniques.

After the selection, the visual solutions were searched for in Yandex Images to rule out direct borrowing from artists' works.

The main conclusion: the MidJourney neural network is a phenomenon capable of bringing about a paradigm shift in the field of artistic creativity. The creation of MidJourney and other similar neural networks resembles the invention of photography, after which various "isms" gradually began to form, making it possible to separate a person's artistic genius from the task of realistically recording the surrounding reality.

Yes, 10 images are not enough to represent the phenomenon as widely as possible, but the purpose of the mini-study was solely to pose the problem. Yes, more specific conclusions will require more detailed research, including in the direction of communicative design. However, despite these caveats, the creators of the neural network have cast the die, and the Rubicon has been crossed...