OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling’s Harry Potter series::A new research paper laid out ways in which AI developers should try and avoid showing LLMs have been trained on copyrighted material.
It’s a bit pedantic, but I’m not really sure I support this kind of extremist view of copyright and the scale of what’s being interpreted as ‘possessed’ under the idea of copyright. Once an idea is communicated, it becomes a part of the collective consciousness. Different people interpret and build upon that idea in various ways, making it a dynamic entity that evolves beyond the original creator’s intention. It’s like the issues with sampling beats or records in the early days of hip-hop. It’s like the very principle of an idea goes against this vision; it’s more that, once you put something out into the commons, it’s irretrievable. It’s not really yours any more once it’s been communicated. I think if you want to keep an idea truly yours, then you should keep it to yourself. Otherwise you are participating in a shared vision of the idea. You don’t control how the idea is interpreted, so it’s not really yours any more.
Whether that’s ChatGPT or Public Enemy is neither here nor there to me. The idea that a work like Peter Pan is still possessed is a very real but very silly and obvious malady of this weirdly accepted but very extreme view of the ability to possess an idea.
AI isn’t interpreting anything. This isn’t the sci-fi style of AI that people think of; that’s general AI. This is narrow AI, which is really just an advanced algorithm. It can’t create new things with intent and design; it can only regurgitate a mix of pre-existing stuff based on narrow guidelines programmed into it to try and keep it coherent, with no actual thought or interpretation involved in the result. The issue isn’t that it’s derivative; the issue is that it can only ever be inherently derivative, without any intentional interpretation or creativity, and nothing else.
Even collage art has to qualify as fair use to avoid copyright infringement if it’s being done for profit, and fair use requires it to provide commentary, criticism, or parody of the original work used (which requires intent). Even if it’s transformative enough to make the original unrecognizable, if the majority of the work is not your own art, then you need to get permission to use it; otherwise you aren’t automatically safe from getting in trouble over copyright. Even using images for photoshop involves creative commons and commercial use licenses. Fanart and fanfic are also considered a grey area, and the only reason more of a stink isn’t kicked up over them regarding copyright is that they’re generally beneficial to the original creators, and credit is naturally provided by the nature of fan works so long as someone doesn’t try to claim the characters or IP as their own. So most creators turn a blind eye to the copyright aspect of the genre, but if any ever did want to kick up a stink, they could, and have in the past, like with Anne Rice. And as a result most fanfiction sites do not allow writers to profit off of fanfics, or advertise fanfic commissions. And those are cases with actual humans being the ones to produce the works based on something that inspired them or that they are interpreting. So even human-made derivative works have rules and laws applied to them as well. AI isn’t a creative force with thoughts and ideas and intent; it’s just a pattern recognition and replication tool, and it doesn’t benefit creators when it’s used to replace them entirely, like Hollywood is attempting to do (among other corporate entities). Viewing AI at least as critically as actual human beings is the very least we can do, as well as establishing protection for human creators so that they can’t be taken advantage of because of AI.
I’m not inherently against AI as a concept and as a tool for creators to use, but I am against AI works with no human input being used to replace creators entirely, and I am against using works to train it without the permission of the original creators. Even in the artist/writer/etc communities it’s considered to be a common courtesy to credit other people/works that you based a work on or took inspiration from, even if what you made would be safe under copyright law regardless. Sure, humans get some leeway in this because we are imperfect meat creatures with imperfect memories and may not be aware of all our influences, but a coded algorithm doesn’t have that excuse. If the current AIs in circulation can’t function without being fed stolen works without credit or permission, then they’re simply not ready for commercial use yet as far as I’m concerned. If it’s never going to be possible, which I just simply don’t believe, then it should never be used commercially period. And it should be used by creators to assist in their work, not used to replace them entirely. If it takes longer to develop, fine. If it takes more effort and manpower, fine. That’s the price I’m willing to pay for it to be ethical. If it can’t be done ethically, then imo it shouldn’t be done at all.
Your broader point would be stronger if it weren’t framed around what seems like a misunderstanding of modern AI. To be clear, you don’t need to believe that AI is “just” a “coded algorithm” to believe it’s wrong for humans to exploit other humans with it. But to say that modern AI is “just an advanced algorithm” is technically correct in exactly the same way that a blender is “just a deterministic shuffling algorithm.” We understand that the blender chops up food by spinning a blade, and we understand that it turns solid food into liquid. The precise way in which it rearranges the matter of the food is both incomprehensible and irrelevant. In the same way, we understand the basic algorithms of model training and evaluation, and we understand the basic domain task that a model performs. The “rules” governing this behavior at a fine level are incomprehensible and irrelevant, and certainly not dictated by humans. They are an emergent property of a simple algorithm applied to billions-to-trillions of numerical parameters, in which all the interesting behavior is encoded in some incomprehensible way.
Bro I don’t think you have any idea what you’re talking about. These AIs aren’t blenders, they are designed to recognize and replicate specific aspects of art and writing and whatever else, in a way that is coherent and recognizable. Unless there’s a blender that can sculpt Michelangelo’s David out of apple peels, AI isn’t like a blender in any way.
But even if they were comparable, a blender is meant to produce chaos. It is meant to, you know, blend the food we put into it. So yes, the outcome is dictated by humans. We want the individual pieces to be indistinguishable, and deliberate design decisions get made by the humans making them to try and produce a blender that blends things sufficiently, and makes the right amount of chaos with as many ingredients as possible.
And here’s the thing, if we wanted to determine what foods were put into a blender, even assuming we had blindfolds on while tossing random shit in, we could test the resulting mixture to determine what the ingredients were before they got mashed together. We also use blenders for our own personal use the majority of the time, not for profit, and we use our own fruits and vegetables rather than stuff we stole from a neighbor’s yard, which would be, you know, trespassing and theft. And even people who use blenders to make something that they sell or offer publicly almost always list the ingredients, like restaurants.
So even if AI was like a blender, that wouldn’t be an excuse, nor would it contradict anything I’ve said.
Super interesting response, you managed to miss every possible point.
I disagree with your interpretation of how an AI works, but I think the way that AI works is pretty much irrelevant to the discussion in the first place. I think your argument stands completely the same regardless. Even if AI worked much like a human mind and was very intelligent and creative, I would still say that usage of an idea by AI without the consent of the original artist is fundamentally exploitative.
You can easily train an AI (with next to no human labor) to launder an artist’s works, by using the artist’s own works as reference. There’s no human input or hard work involved, which is a factor in what dictates whether a work is transformative. I’d argue that if you can put a work into a machine, type in a prompt, and get a new work out, then you still haven’t really transformed it. No matter how creative or novel the work is, the reality is that no human really put any effort into it, and it was built off the backs of unpaid and uncredited artists.
You could probably make an argument for being able to sell works made by an AI trained only on the public domain, but it still should not be copyrightable IMO, cause it’s not a human creation.
TL;DR - No matter how creative an AI is, its works should not be considered transformative in a copyright sense, as no human did the transformation.
I thought this way too, but after playing with ChatGPT and Midjourney near daily, I have seen many moments of creativity way beyond the source it was trained on. I think a good example that I saw was in a YouTube video (sorry, I cannot recall which to link) where the prompt was animals made of sushi, and wow, was it ever good and creative in how it made them, and it was photorealistic. This is just not something you can find anywhere on the Internet. I just did a search and found some hand-drawn Japanese-style sushi with eyes and such, but nothing like what I saw in that video.
I have also seen it suggest ways to handle coding on my VR Theme Park app that are very unconventional and not something anyone has posted about, as near as I can tell. It seems to be able to put 2 and 2 together and get 8. Likely, as it sees so much of everything at once, it can connect the dots in ways we would struggle to. It is more than regurgitated data, and it surprises me near daily.
Just because you think it seems creative due to your lack of experience with human creativity, that doesn’t mean it is uniquely creative. It’s not, it can’t be by its very nature, it can only regurgitate an amalgamation of stuff fed into it. What you think you see is the equivalent of pareidolia.
Why are you making personal jabs to make a point? How do you know my creative experience?
I’m going to need a source for that. Fair use is flexible and context-specific. It depends on the situation and four things: why, what, how much, and how it affects the work. No one thing is more important than the others, and it is possible to have a fair use defense even if you do not meet all the criteria of fair use.
I’m a bit confused about what point you’re trying to make. There is not a single paragraph or example in the link you provided that doesn’t support what I’ve said, and none of the examples provided in that link are something that qualified as fair use despite not meeting any criteria. In fact one was the opposite, as something that met all the criteria but still didn’t qualify as fair use.
The key aspect of how they define transformative is here:
These (narrow) AIs cannot add new expression or meaning, because they do not have intent. They are just replicating and rearranging learned patterns mindlessly.
These AIs can’t provide new information because they can’t create something new, they can only reconfigure previously provided info. They can’t provide new aesthetics for the same reason, they can only recreate pre-existing aesthetics from the works fed to them, and they definitely can’t provide new insights or understandings because again, there is no intent or interpretation going on, just regurgitation.
The fact that it’s so strict that even stuff that meets all the criteria might still not qualify as fair use only supports what I said about how even derivative works made by humans are subject to a lot of laws and regulations, and if human works are under that much scrutiny then there’s no reason why AI works shouldn’t also be under at least as much scrutiny or more. The fact that so much of fair use defense is dependent on having intent, and providing new meaning, insights, and information, is just another reason why AI can’t hide behind fair use or be given a pass automatically because “humans make derivative works too”. Even derivative human works are subject to scrutiny, criticism, and regulation, and so should AI works.
You said "…fair use requires it to provide commentary, criticism, or parody of the original work used." This isn’t true. If you look at the summaries of fair use cases I provided, you can see there are plenty of cases where there was no purpose stated.
You’re anthropomorphizing a machine here, the intent is that of the person using the tool, not the tool itself. These are tools made by humans for humans to use. It’s up to the artist to make all the content choices when it comes to the input and output and everything in between.
I’m going to need a source on this too. This statement isn’t backed up with anything.
AI works are human works. AI can’t be authors or hold copyright.
Isn’t your last sentence making his point?
Neural networks are based on the same principles as the human brain; they are literally learning in the exact same way humans are. Copyrighting the training of neural nets is essentially the same thing as copyrighting interpretation and learning by humans.
These AIs are not neural networks based on the human brain. They’re literally just algorithms designed to perform a single task.
Well, I’d consider agreeing if the LLMs were considered as a generic knowledge database. However, I had the impression that the whole response from OpenAI & co. to this copyright issue is “they build original content”, both for LLMs and Stable Diffusion models. Now that they’ve started this line of defence, I think that they are stuck with proving that their “original content” is not derived from copyrighted content 🤷
Yeah I suppose that’s on them.
Copyright definitely needs to be stripped back severely. Artists need time to use their own work, but after a certain time everything needs to enter the public space for the sake of creativity.
I think you completely and thoroughly do not understand what I’m saying or why I’m saying it. Nowhere did I suggest that I do not understand modern copyright. I’m saying I’m questioning my belief in this extreme interpretation of copyright, which is represented by exactly what you just parroted. I’m saying that this interpretation is not only functionally and materially unworkable, but also antithetical to a reasonable understanding of how ideas and communication work.
Yeah, this is definitely leaning a little too “People shouldn’t pump their own gas because gas attendants need to eat, feed their kids, pay rent” for me.
A sample is a fundamental part of a song’s output, not just its input. If LLMs are changing the input work to a high enough degree, is the result not protected as a transformative work?
To add to that, Harry Potter is the worst example to use here. There is no extra billion that JK Rowling needs to allow her to spend time writing more books.
Copyright was meant to encourage authors to invest in their work in the same way that patents do. If you were going to argue about the issue of lifting content from books, you should be using books that need the protection of copyright, not ones that don’t.
I just don’t know that I agree that this line of reasoning is useful. Who cares what it was meant for? What is it now, currently and functionally, doing?
I’m a huge proponent of expanding individual copyright to extreme amounts (an individual is entitled to own the rights and usage rights to anything they create and can revoke those rights from anyone), but not in favor of the same thing for corporations.
I hold the exact opposite view as you. As long as it’s a truly creative work (a writing, music, artwork, etc) then you own that specific implementation of the idea. Someone can make something else based on it, but you still own the original content.
LLMs and companies using them need to pay for the content in some way. This is already done through licensing in other parallels, and will likely come to AI quickly.
To be clear, I’m still working through my thinking on this, but it’s been something cooking for quite a while. I may not have all the words to express my meaning. For example, I think there are two routes to take in making my argument, one moral, the other technical. I’m not building on the morality of copyright, but focusing on the technical aspects of the limits of ideas.
I suppose I would ask you then about your views on authoritarianism. Because it seems to me that without an extremely authoritarian state, it would be basically impossible to enforce your view of copyright. Are you okay with that kind of pervasiveness?
Also, from a technical perspective, how do you propose this view of copyright be applied? This is kind of towards the broader point I’m thinking I believe in. It’s not just that copyright laws are ridiculous on their face; they are also technically almost unenforceable in their modern extremist interpretation without an extremely pervasive form of surveillance.
Easy. The same way we already do it. We enforce music licensing, video licensing and other ip licensing. It’s been done. All this would do is extend those rights to the individual and remove them from corporations. Work product can be owned by companies, but not indefinitely. Individuals should always be in control of their creations.
Restrictions would more or less be strictly commercial, to where hobbyists wouldn’t be impacted, but as soon as it’s used to make money the original creators are owed as part of it.
It wouldn’t be any harder than it is now, as long as copyright is proved. (Hey look, this is the first time I’ve found an actual use of NFTs). In general anything being done for monetary gain is already monitored and surveilled, so this wouldn’t be a change there either.
Edit: Also most of us already live in authoritarian states. This won’t really change anything. Big corps already enforce their copyright when it makes monetary sense and are actively trolling for unauthorized uses.
Personally, I think you are describing a dystopian, authoritarian landscape which will be devoid of any real creativity or interesting ideas. I’m a believer that all ideas are free to be stolen, copied, improved upon; that imitation of ideas is a fundamental human right, and fundamental to what it means to be human. Likewise, I think our social and media landscape would be much poorer without this right. I don’t think anyone has the inherent right to profit off of an idea.
I feel the exact opposite. There’s no reason for me to create anything if someone else can come along and steal it. Eliminating copyright will bring your dystopian landscape where nobody shares any sort of art or creative work because someone else will steal it.
What motivation is there for creatives if you’re just telling them their work has no implicit value and anyone else can come along and reappropriate it for whatever they’d like?
This is great because I think you are totally correct in your sentiment that we believe oppositely. I see art created only for the purpose of profit as drivel; true art is an expression of the self. If the only reason you make art is for profit, you aren’t an artist, you are an employee.
That’s a great theory and all, but it’s not even money. I make no money from my photos, but I also refrain from posting any of them because I’d rather they not be used for AI training. Same with any music I create and I’m getting there with my code.
The nobility of art has always been in question, and it’s consistently been proven that artists who aren’t compensated for their work also tend to create less.
This is also not explicitly about profit. If I write a song and then it’s used at a hate rally, I currently have no recourse. They’re not making money from that application (directly), but they are using my creation to promote something I don’t agree with.
I’m curious to know if you’re an artist yourself, as it’s very contrary to the opinions from other creatives I know.
I assume you’re against the communal and collective culture that modders for games enjoy?
I assume you also believe no technological innovations are produced in America anymore since China would simply steal them.
Nowhere did I say derivative works are not ok. If a game maker explicitly forbids using modded versions of their game, I think that should be up to them. Games that have vibrant modding communities are almost always at least partially supported by the developer anyways.
My points are individual copyright anyways, not corporate. With increasing individual protections I also propose decreasing corporate copyright protection.
I believe that China makes 90% of the same product for 80% of the price after ripping off their American counterparts. But that’s also entirely off topic and really has nothing to do with this. Art/Creative Works are entirely different than physical goods.
How is what AI produces not derivative? Like humans, AI takes in a bunch of inputs (think about all the art you’ve seen, read, and watched, and how it affects the art you create), and outputs something that’s derivative from the input.
I hold whatever view makes George Lucas stop digitally remastering the original trilogy.