Driven by the rapid development of Artificial Intelligence, new possibilities of image production are emerging at enormous speed. Reactions are equally strong, ranging from creators' fear of being replaced by machines to fascination with the democratization of design processes and the emergence of new genres and self-designations ("AI Art" / "AI Artist"). This wide range of responses demonstrates the enormous power of new technologies such as text-to-image models, which will fundamentally change the way creatives work.
In order to understand and actively shape this change, the Prompt Engineering Project, as an interdisciplinary project, aims to gather different positions and foster exchange between them. We want to connect researchers and practitioners from different disciplines, from tech to art and beyond, in order to find answers and solutions.
In computer science, a prompt is commonly understood as the input that instructs a program to execute a task. Established in natural language processing, prompt engineering means designing this input so that a machine learning model produces an output matching the user's expectations. This also applies to text-to-image models, where prompt engineering is already being discussed as a new profession.
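To make this tangible, here is a minimal sketch of prompt engineering in practice with a text-to-image model, using the open-source diffusers library; the model checkpoint and the prompts are illustrative assumptions, not outputs or recommendations of this project:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a publicly available Stable Diffusion checkpoint
# (one common choice, assumed here for illustration).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# Prompt engineering in practice: the same subject, refined step by
# step with style and quality modifiers until the output matches
# the user's expectations.
prompts = [
    "a lighthouse",
    "a lighthouse at dusk, oil painting",
    "a lighthouse at dusk, oil painting, impressionist style, warm colors",
]

for i, prompt in enumerate(prompts):
    image = pipe(prompt).images[0]
    image.save(f"lighthouse_{i}.png")
```

Iterating over prompt variants like this, evaluating the results, and adapting the wording is exactly the conceptualize-enter-evaluate-adapt loop described below.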
Rapid technical progress makes it possible to generate better and better images. But what does "better" mean here? An accurate reproduction of the training data? An image that is as photorealistic as possible? A precise or literal translation of text prompts into images? From amateurs who can suddenly create works that would otherwise have taken years of learned technique, to professional creators who want to incorporate the algorithms into their workflow, to the tech companies developing these models: What ideas of a "good picture" do these target groups have?
A program that can generate an infinite number of new images - that is what text-to-image models promise. But what does it mean to create something new? Because these models are trained on billions of already existing images, established styles and content are reproduced. How can we escape this aesthetic echo chamber? Can something innovative actually be created, for example through an unexpected combination of concepts never before associated? Is this even a necessary criterion for such technologies?
"The computer painted a picture" or "the machine produced a painting" - such statements give the impression that programs can autonomously produce a work of art. However, the prompt, i.e. the instruction to the program, still has to be conceptualized, entered, evaluated and adapted by a human being. So what might meaningful collaboration between humans and computers look like? How much control do we need over the output, and how many decisions do we leave to the program? What might an interface look like, and what brushes or tools do we need, if any, besides the text prompt? Is our language powerful enough to give us sufficient control?
If we show the same picture to a hundred different people and ask them to describe it, we are likely to get a hundred different answers. Both image and language have numerous socio-cultural dimensions and, as two distinct modalities, differ in structure. Can this complexity of our image and language world be modeled algorithmically at all? Most models use the "alt tags" of images published on websites, resulting in simplified, literal descriptions. What other image-text relations would be conceivable, helpful, and implementable?
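As a sketch of how such literal image-text pairs come about, the following Python snippet collects (image URL, alt text) pairs from HTML using only the standard library; the page content is a made-up example:

```python
from html.parser import HTMLParser

# Minimal sketch: collect (image URL, alt text) pairs the way large
# image-text training datasets are often bootstrapped from web pages.
class AltTextCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            # The alt attribute is usually a short, literal description.
            if attrs.get("src") and attrs.get("alt"):
                self.pairs.append((attrs["src"], attrs["alt"]))

collector = AltTextCollector()
collector.feed('<img src="sunset.jpg" alt="a sunset over the sea">')
print(collector.pairs)  # [('sunset.jpg', 'a sunset over the sea')]
```

The brevity of such alt texts illustrates why the resulting image-text relation stays simplified and literal.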
Escalation of our image canon: Whose images are reproduced, and who is represented?
Escalation of the speed of production: Are we drowning in a flood of images with the same aesthetics, or are we generating new creative potential through the quick and easy production of image variants?
Escalation of the question of originality: What role does an image play if seemingly everyone can now produce a painting in the style of van Gogh, or credibly present alternative facts in formerly reliable media such as photography and video?
We are looking for people from various fields who engage with multimodal AI models, in order to build a database of practitioners, artists, designers, researchers and developers and to map out their answers and thoughts. Want to participate? Write us an e-mail or contact us via social media!
This project is the follow-up to the final thesis Algorithms designing our future – Critical engagement with text-to-image algorithms and call for interdisciplinary research by Katharina Mumme at the Department of Design at Hamburg University of Applied Sciences. Feel free to check out the research on this website, or get in contact to read the thesis.
This project is part of the research project aiXdesign.space at the Department of Design at Hamburg University of Applied Sciences.
The underlined words are used as prompts in Stable Diffusion for the images.