University of Chicago Glaze Overview
Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models https://arxiv.org/pdf/2302.04222.pdf
Since powerful AI image generators swept the internet last year, artists have been warning about the threat to their livelihoods, especially from models trained on their works without prior consent. That dissatisfaction recently escalated into a class action lawsuit. To address the problem, the Glaze academic research project at the University of Chicago has released a free (non-commercial) application for artists, hoping to counter this "artistic intellectual property" theft with a form of advanced camouflage. The team also published a research paper explaining how the beta version of the app works: it adds nearly imperceptible perturbations to each piece of art, interfering with an AI model's ability to read style information and making it harder to imitate the protected work and the creator's artistic style. These perturbations can even steer the model toward common styles far removed from the original art, while barely changing how the painting or picture looks to a human.
AI perceives the world in a fundamentally different way from humans, a difference that has been recognized for some time and that is not easily eliminated or avoided; it is exactly this gap that adversarial examples exploit against machine learning. Despite a decade of effort, progress on the adversarial problem has been limited, and the divide between how humans visually observe the real world and how AI models observe it mathematically appears to be here to stay. So in purely technical terms, the perturbation method Glaze adopts is an attack rather than a defense, although it serves a defensive purpose.
There is also a mismatch between individual human creators (i.e., artists) and the parties behind AI models. Creators often make their living from their art, while the commercial players behind generative AI models are well-funded institutions that have attracted large amounts of venture capital and personal data, with the goal of giving machines the ability to automate (more bluntly, replace) human creativity. In the case of generative AI art models, the technology genuinely threatens artists' livelihoods by automatically mimicking artistic styles. Users of tools like Stable Diffusion and Midjourney need no drawing skill to generate images that look quite plausible (at least plausible enough to fool laypeople): type a few words describing the desired content and the software quickly produces images, even imitating a specific artist's style when the prompt asks for it. The whole process is extremely fast and the visuals are striking, which shows how mature and powerful the technology has become. Yet the developers of generative AI models generally do not obtain permission before scraping the public internet for training data. At the other end of the ecosystem, artists keep displaying their works on open platforms in order to improve their craft and grow their professional reputation, and it is precisely this ordinary practice that allows large numbers of works to be swept up as training data. While generative AI keeps strengthening itself this way, no one seems to be asking the affected artists whether they consent.
In some cases, artists have found that their very names can be used as prompts, instructing AI models to generate images in their style, again without any prior permission (and without any corresponding compensation). This can fairly be called outright creative theft (although such claims are expected to find support in the courts before long).
Glaze was developed by a team of computer scientists at the University of Chicago associated with the school's SAND Lab research group. The inspiration came from Fawkes, an algorithm the lab built in 2020 that can "cloak" personal photos so they cannot be used as data for facial recognition models. The new project began after artists contacted the lab asking for help, prompting the lab to survey over 1,000 artists to gauge the severity of the problem. Glaze builds on that earlier approach: it identifies which features determine the style of a work, then perturbs only those features, just enough to mislead the generative model.
Ben Zhao, a professor of computer science at the University of Chicago and the head of the project, explained in an interview how the tool works. The software first tries to understand how AI models perceive artistic style in their own terms, and then, along that dimension, distorts what the model identifies as a particular style. Rather than hiding or blocking certain features, it learns the language of machine learning models and uses that language to warp the model's view of an artistic image, while trying not to affect how humans see it. It turns out that AI models view paintings from a completely different angle than humans do, so large distortions can be achieved from the machine learning perspective without causing easily detectable changes from the human perspective.
The chosen target style T should be sufficiently different, in the model's feature space, from the style of artist V's protected artworks, so as to maximally disrupt style mimicry. For a new user, Glaze uses the following procedure to randomly select a style T from a set of candidate styles that differ clearly from the artist's own:
First, Glaze consults a public dataset of artists, each with a recognizable style (e.g. Monet, Van Gogh, Picasso). For each candidate target artist/style, it selects a set of images in that style and computes their feature-space centroid with the image feature extractor Φ.
Next, it computes the feature-space centroid of artist V's protected artworks, and keeps as candidates those styles whose centroid distance from V's centroid lies between the 50th and 75th percentile of all candidate styles.
Finally, it randomly selects T from this candidate set, as sketched below.
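Below is a minimal Python sketch of this selection procedure, assuming a generic feature_extractor callable standing in for Φ and a hypothetical candidate_styles dictionary mapping each style name to a few example images (neither name comes from the Glaze code):

```python
import numpy as np

def select_target_style(victim_images, candidate_styles, feature_extractor, rng=None):
    """Pick a target style T whose centroid distance from the victim's centroid
    lies between the 50th and 75th percentile over all candidate styles."""
    rng = rng or np.random.default_rng()

    # Centroid of the protected artist's works in feature space.
    victim_centroid = np.mean([feature_extractor(x) for x in victim_images], axis=0)

    # Distance from each candidate style's centroid to the victim's centroid.
    distances = {}
    for style, images in candidate_styles.items():
        centroid = np.mean([feature_extractor(x) for x in images], axis=0)
        distances[style] = np.linalg.norm(centroid - victim_centroid)

    # Keep styles whose distance falls between the 50th and 75th percentiles.
    vals = np.array(list(distances.values()))
    lo, hi = np.percentile(vals, 50), np.percentile(vals, 75)
    candidates = [s for s, d in distances.items() if lo <= d <= hi]

    # Randomly choose the target style T from the remaining candidates.
    return rng.choice(candidates)
```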
Glaze uses a pre-trained style transfer model Ω to generate artwork in the target style. Given each artwork x ∈ X_V and the target style T, it transfers x into style T to produce the style-transferred image Ω(x, T).
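The paper describes Ω simply as a pre-trained style transfer model; purely as an illustration, an off-the-shelf image-to-image diffusion pipeline could play that role. The model name, prompt wording, and strength value below are assumptions for the sketch, not the authors' setup:

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

# Any pre-trained image-to-image model can stand in for the style transfer
# model Ω; Stable Diffusion's img2img pipeline is used here only as an example.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def style_transfer(x: Image.Image, target_style: str) -> Image.Image:
    """Produce a stand-in for Ω(x, T): artwork x re-rendered in target style T."""
    prompt = f"a painting in the style of {target_style}"
    # Moderate strength keeps the composition of x while shifting its style.
    return pipe(prompt=prompt, image=x, strength=0.5, guidance_scale=7.5).images[0]
```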
Then, Glaze computes the cloak perturbation δ_x to be added to the original artwork so as to disrupt style mimicry. To minimize the visual impact of the perturbation on the artwork, the paper solves the following optimization:

min over δ_x of  Dist( Φ(Ω(x, T)), Φ(x + δ_x) ),  subject to |δ_x| < p

where Φ is an image feature extractor commonly used in text-to-image generation tasks, Dist(·) measures the distance between two feature representations, |δ_x| measures the perceptual perturbation introduced by the cloak, and p is the perceptual perturbation budget.
The paper's implementation uses LPIPS (Learned Perceptual Image Patch Similarity) to bound the perturbation. Unlike the Lp distances used in prior work, LPIPS has become increasingly popular as a measure of user-perceived image distortion, and bounding cloak generation with it ensures the cloaked version stays visually similar to the original image. The paper applies the penalty method to solve the optimization above, giving:

min over δ_x of  Dist( Φ(Ω(x, T)), Φ(x + δ_x) ) + α · max( LPIPS(δ_x) − p, 0 )

where α controls the impact of the perturbation penalty, and the L2 distance is used as the feature-space distance Dist(·).
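A minimal, hedged sketch of this penalty-method optimization in Python follows; the feature extractor phi, the budget p, the weight alpha, the step count, and the learning rate are illustrative placeholders rather than the paper's actual settings:

```python
import torch
import lpips

def compute_cloak(x, x_styled, phi, p=0.05, alpha=30.0, steps=500, lr=0.01):
    """Sketch of the penalty-method cloak optimization.

    x        -- original artwork, tensor of shape (1, 3, H, W) in [-1, 1]
    x_styled -- style-transferred version of x (the target-style image)
    phi      -- differentiable image feature extractor standing in for Φ
    """
    perceptual = lpips.LPIPS(net="vgg")      # user-perceived distortion measure
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)

    target_feat = phi(x_styled).detach()     # features of the style-transferred image
    for _ in range(steps):
        cloaked = (x + delta).clamp(-1, 1)
        # Pull the cloaked image's features toward the target style's features (L2 distance).
        feat_loss = torch.norm(phi(cloaked) - target_feat)
        # Penalty term keeps the perceived change below the budget p.
        penalty = torch.clamp(perceptual(cloaked, x).mean() - p, min=0)
        loss = feat_loss + alpha * penalty
        opt.zero_grad()
        loss.backward()
        opt.step()
    return delta.detach()                    # the cloak perturbation δ_x
```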
Glaze's protection carries over to a range of generative AI models, which largely reflects the machine learning notion of "transferability." It cannot yet cover every AI art model, but it has shown sufficient transferability across the common ones: a perturbation computed against one model also disrupts other, unseen models. The transfer is not perfect, so the final result is not ideal, but perfection is not required; this alone is enough to shift the balance of the whole contest. Styles have no sharp boundaries here and are better viewed as a highly continuous space, so once a cloaked version is built against one model, it can in most cases be carried over to other, un-targeted models with a similar effect.
The specific art style can greatly affect Glaze's effectiveness; some styles are genuinely hard to protect, mainly because they leave little room for the style perturbation to work. Zhao notes that simple, clean, flat-color works do not take the perturbation as well as works rich in visual detail. Some kinds of artwork are hard to protect by their very nature: in an architectural sketch, for example, the lines are crisp and precise and large areas are white background, so the style is hard to disrupt because there is so little room to hide the perturbation. Such images are mostly blank or solid, with few transitional regions, so protecting them is harder and the results are weaker. A painting with rich textures, many colors, and busy backgrounds is much easier: the perturbation can be concealed in it without visibly degrading the image.
Judging by the results, one can tell that a Glazed image has had some processing applied: it differs from the original, and some images look slightly blurry. The research team, however, considers the current difference very small and hard for ordinary viewers to perceive. An artist with a trained eye will certainly notice it, but the small sacrifice made to protect one's own style is worthwhile; at the very least, audiences can keep enjoying the works without the artist worrying about them falling into the hands of AI. Glaze is committed to solving a practical problem for artists, in the hope that they can again publish and promote their works online freely. Some independent artists in particular, under the deterrent pressure of generative AI, no longer dared to take commissions in this form, and their livelihoods suffered badly. Now they feel noticeably safer, and AI struggles to mimic their painting style, which is exactly what the tool promises. For most artists, this tool lets them promote their work more confidently without its being easily replicated by AI. It is not a complete cure, but it has at least greatly eased their situation.
(1) Without using Glaze, the style can be easily stolen.
When the imitator has access to the victim's original (un-Glazed) artwork, the attack is highly successful. Examples of mimicked artworks are shown in the leftmost columns of the figure: the first column displays the victim artist's original works, while the third column shows works mimicked by a style-specific model trained on the victim's originals without Glaze. In the paper's user study, over 95% of respondents judged this attack successful.
(2) Glaze effectively prevents style theft.
Glaze makes the mimicry attack significantly less successful, as shown in the same figure. The 5th and 6th columns (from the left) display the mimicked artworks produced when the style-specific model is trained on Glazed works, and the 4th column shows the style-transferred artworks (derived from the originals in the 1st column) used to optimize the Glaze cloak.
AI models have already absorbed a vast number of works and art styles, infringing on the rights of their creators. The team nevertheless remains optimistic, because most artists keep creating and keep promoting new work. AI models do not stand still either; they continue to be trained over long periods. In the team's view, as more and more cloaked works enter circulation, the AI's conception of a given style will shift, which also erodes what it previously learned about that style.
Once artists start using tools like Glaze, the influence persists for a long time. These tools bring another benefit as well: because artistic style is a continuum, there is no requirement that the bulk of the images be cloaked before the desired effect appears. Even if only a small portion of images have been processed by Glaze, they still influence the AI model's output once they enter the training data, and the larger the share of cloaked content the model absorbs, the stronger that influence becomes. Even when only a small fraction of works are treated, the effect still exists, though it is weaker. In short, Glaze is not a binary tool; its protection scales roughly in proportion to how much cloaked data the model sees.
AI models have, to some extent, already pinned down the styles of certain well-known artists such as Picasso. As more material about Picasso's style keeps pouring in, the model's notion of that style is refined further; but if the material it absorbs points somewhere else, the model drifts onto a different path, and once it departs far enough from its original understanding it can no longer produce work that resembles Picasso. That, perhaps, is how AI "understands" art. This is the idea behind how Glaze picks a misleading style, or rather a deliberately chosen style, to resist AI's imitation of artworks. There is also an ethical dimension here: if this method really works, more methods will follow to steer how AI develops.
Zhao emphasized that Glaze strives to make what the model learns clearly different from the originals, thereby protecting creators' rights. A realist artist's painting, once cloaked, may read to the model as something quite abstract, with little connection to the original. Notably, the interviewed artists broadly agree that the more image quality one is willing to sacrifice, the stronger Glaze's protection becomes. At the same time, the team does not want the target to deviate wildly from any coherent style; the point is not to convert an existing style wholesale into another. What is needed is simply a result that can be told apart from the original: a style the public would still accept, yet different from the artist's own. In the end, the aim is to keep the two artists' styles apart in the model's eyes. When the software runs, it analyzes the works the artist submits, roughly characterizes the artist's current style, then assigns a clearly different style as the decoy and uses it to steer the AI model.
Firstly, Glaze's protection depends on cloaked works making their way into the mimicry model's training data. That is a real limitation for established artists, whose large back catalogs, including works in gallery collections, are already available uncloaked.
Secondly, systems like Glaze face an inherent future risk: the techniques used today to cloak artwork could later be turned against it, stripping the protection from works that were Glazed in the past.
The research team has also explored countermeasures that AI model developers might adopt, including image transformations (preprocessing images before training in an attempt to remove the perturbation) and robust training (mixing a certain number of cloaked images into the training process to make the model more robust). In both cases, the researchers found that these countermeasures could not reduce the protection success, termed "Artist Recognition Protection" (ARP), that Glaze achieves (though the paper does note that robust training weakens the cloaking effect to some degree).
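For concreteness, the image transformation countermeasure amounts to preprocessing each training image in the hope of washing out the cloak. The following trivial sketch (a light blur plus JPEG re-compression, chosen here only for illustration and not taken from the paper) shows the kind of transformation meant:

```python
import io
from PIL import Image, ImageFilter

def transformation_countermeasure(img: Image.Image) -> Image.Image:
    """Illustrative preprocessing a model trainer might try before training,
    hoping to remove the cloak: slight blur plus JPEG re-encoding."""
    blurred = img.filter(ImageFilter.GaussianBlur(radius=1))
    buf = io.BytesIO()
    blurred.save(buf, format="JPEG", quality=75)
    buf.seek(0)
    return Image.open(buf).copy()
```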
In exploring these risks, the team also acknowledges that this confrontation will trigger an arms race between cloaking-based protection and model developers working to strip the protection away and recover the valuable data underneath. Still, they believe Glaze offers real protection, at least temporarily handing artists the initiative and buying time for legislation against exploitative AI models. Tools like this raise the cost of decoding protected information and discourage AI developers from simply taking it for free. In the world of machine learning, offense is always easier than defense, and in some respects the perturbation acts like poisoning the model; sooner or later someone will devise stronger countermeasures that offset the protection Glaze provides. Past experience suggests it takes a research field a year or more to formulate effective countermeasures, and since Glaze is itself essentially an offensive technique, there is certainly room for "anti-Glaze" strategies to emerge.
(1) Increase Countermeasure Costs
Many see this as an endless game of cat and mouse. That view is partly correct, but the team's aim is to stretch out each round of the countermeasure cycle. Moreover, for Glaze to remain widely usable, any countermeasure against it has to be made expensive. For most artists, if the cost of breaking the protection is raised high enough, it is simply not worth an imitator's effort to analyze their works, recover clean images, and mimic their style. The goal is to push that bar as high as possible, so that imitators are deterred and move on to more profitable pursuits.
(2) Strengthen Outreach and Raise Awareness
Even with the cost raised, wealthy AI giants could still afford to pay a high price to copy popular artists' work; these companies have abundant resources and can easily bring them to bear. But Glaze can at least block imitation at the small-workshop level, because, as the team puts it, "they cannot afford the massive computational power required to bypass Glaze." And even for a wealthy AI giant for which the cost is trivial, reputation still matters: if such a company were seen deliberately working to defeat these protections, its claim of merely "using public data" would ring rather hollow.
The research team emphasizes that in a situation like this, most people will reach a clear moral or ethical judgment: if an AI company targets individuals and deliberately mimics their work without permission or authorization, that is a serious ethical failing and a bad decision. From the team's perspective this is a good thing, because it means any countermeasure against Glaze is naturally seen as unethical. That alone can deter large companies from acting recklessly, and might even encourage them to protect such works instead.
The team hopes Glaze will inspire more researchers to work in support of human creativity, and that the effort will continue, ideally producing something even better than Glaze that can play a strong role in future confrontations. A major purpose of the project is precisely to draw attention to this issue, to make these techniques and research results count, and to offer as much help as possible to those affected. At the same time, some people's voices are simply too small to be heard through technical means, so the hope is that this diverse art community can attract wider attention. Only then can this clearly drawn struggle be won.
War of the Painter and AI: https://zhuanlan.zhihu.com/p/611136956
Release That Artist! Glaze Protects Art from AI Intrusion: https://zhuanlan.zhihu.com/p/616819820