
Furthermore, if weights are copyrightable, wouldn't this make the issue of training data licenses even more urgent?

IANAL, but if weights are IP, wouldn't they constitute a "derived work" of the training data?



That's also my understanding: either the weights are copyrightable, in which case models become derivative works and need explicit agreements for every work included in their training data, or they are not copyrightable because they are just machine data (the most likely scenario, in my opinion). They can't have it both ways.


There is also a (IMO less likely, but still conceivable) scenario where weights ARE copyrightable, but represent fair use of the training data on grounds of being "sufficiently transformative".


I consider that one super likely, but then using the model to make works that compete with one of the artists in their own style is a non-fair-use derivative work.


Style explicitly isn’t copyrightable. It’ll need to be for some other reason.


Your case wouldn't be about style, it would be about specific elements that you posit were memorized and regurgitated by the model. The fact that you're creating art in the same style/medium as the author is what negates the "sufficiently transformative" fair use defense.

Basically, that world ignores the AI model completely. If your resulting work wouldn't be fair use if you directly were working with something from the training set, it wouldn't be fair use if you fed it through an AI model first.


Also known as "having your cake and eating it".


Sadly, this seems to be the most likely, considering how the US is run.


I think there could be an argument that it's copyrightable but not a derivative work.

If I read a few books about a subject as research, and then I write an article about the subject, it's my own copyright. The fact that I did research doesn't make it derivative of those books (correct me if I'm wrong, IANAL).

Perhaps a model created from copyrighted material could be treated in the same way?


> If I read a few books about a subject as research, and then I write an article about the subject, it's my own copyright.

Yes, because in that case you'd be the "author" doing "creative work".

> Perhaps a model created from copyrighted material could be treated in the same way?

Who would be the author doing creative work in this case? The people who decided what training material to use? Perhaps, but it seems a stretch for the people who selected the training material to be authors but not the people who created the training material.


The difference is that you are a person and have many more rights than a machine.


That's because you are human and have rights that a computer program doesn't.


A fertile subject for sf stories.


"Transformative use".

The inputs could be copyrighted and the weights could be copyrighted if creating the weights from the inputs is (legally) regarded as a transformative use. And I think it could reasonably be considered to be transformative - the weights don't look anything like the input data.

Disclaimer: IANAL. So far as I know, no court has ruled on whether this qualifies as a transformative use. I take no position on how the courts will actually rule. I merely say that they could regard this as transformative use. (But see jerf's "creativity" argument for another hurdle that weights must pass to be copyrightable.)


Transformative use doesn't necessarily mean copyrightable.

Google's thumbnails are a purely mathematical transformation on images (no copyright themselves), and yet are considered a transformative use.

I believe that a trained model is similarly a purely mathematical transformation of {data}, but is transformative in what it can be used for going forward.

"Can" bearing a lot of weight in that sentence.

It's how the human, with agency, uses the model that may be a derivative or copyright-infringing use, not the model itself nor necessarily the output.

The output of a generative AI may be similar enough to an existing work that it is derivative of that work. It is possible to construct a prompt that infringes on an existing work even if that work wasn't part of the training data.

For that case, consider you drew a picture. That picture that you just drew isn't part of any training data. I could presumably look at it and describe it with sufficient detail that something similar enough would be generated... and that may be considered a derivative work. The same test could be applied to me describing it to someone on Fiverr with the same outcome.

If I were to publish that work made by the generative AI or on Fiverr, who would be infringing on copyright? Me? Or the black box, whether AI or Fiverr, that created a picture based on my prompts?


Another way to look at it: if a thing reproduced data subjectively resembling the originals and you then used it anyway, then it's non-transformative use, and the methods used are just extra details.


No, weights reflect not just the data fed in but also the training process itself. I think the whole argument hinges on how much human thought and action is needed in training the model.

On the other end of the spectrum, AI-generated content couldn't be copyrighted if there is no human involvement. If someone asks GPT to write 1000 poems, they couldn't be copyrighted.


In a sane legal system a new copyright law would be passed to clarify all of this. In ours, the poor copyright office needs to make things up on the fly.

Their recent decision, which implies that anything produced with AI is non-copyrightable, is silly, sad, and not sustainable.



