Apparently, stealing other people’s work to create product for money is now “fair use” according to OpenAI, because they are “innovating” (stealing). Yeah. Move fast and break things, huh?

“Because copyright today covers virtually every sort of human expression—including blogposts, photographs, forum posts, scraps of software code, and government documents—it would be impossible to train today’s leading AI models without using copyrighted materials,” wrote OpenAI in the House of Lords submission.

OpenAI claimed that the authors in that lawsuit “misconceive[d] the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence.”

    • sculd@beehaw.orgOP
      8 months ago

      Yup, I saw that too. There is also another thread on this board that is discussing this issue.

      One interesting thing I noticed is how the AI apologists in this thread seem to be quiet on the other.

  • sculd@beehaw.orgOP
    8 months ago

    Some relevant comments from Ars:

    leighno5

    The absolute hubris required for OpenAI here to come right out and say, ‘Yeah, we have no choice but to build our product off the exploitation of the work others have already performed’ is stunning. It’s about as perfect a representation of the tech bro mindset that there can ever be. They didn’t even try to approach content creators in order to do this, they just took what they needed because they wanted to. I really don’t think it’s hyperbolic to compare this to modern day colonization, or worker exploitation. ‘You’ve been working pretty hard for a very long time to create and host content, pay for the development of that content, and build your business off of that, but we need it to make money for this thing we’re building, so we’re just going to fucking take it and do what we need to do.’

    The entitlement is just…it’s incredible.

    4qu4rius

    20 years ago, high school kids were sued for millions and faced years in jail for downloading a single Metallica album (if I remember correctly, minimum damages in the US were something like $500k per song).

    All of a sudden, just because they are the dominant ones doing the infringement, they should be allowed to scrape the entire (digital) human knowledge? Funny (or not) how the law always benefits the rich.

  • noorbeast@lemmy.zip
    8 months ago

    I will repeat what I have proffered before:

    If OpenAI stated that it is impossible to train leading AI models without using copyrighted material, then, unpopular as it may be, the preemptive pragmatic solution should be pretty obvious: enter into commercial arrangements for access to said copyrighted material.

    Claiming a failure to do so in circumstances where the subsequent commercial product directly competes in a market seems disingenuous at best, given what I assume is the purpose of copyrighted material, that being to set the terms under which public facing material can be used. Particularly if regurgitation of copyrighted material seems to exist in products inadequately developed to prevent such a simple and foreseeable situation.

    Yes, I am aware of the USA concept of fair use, but the test of that should be manifestly reciprocal. For example, would Meta allow others to do what it did to MySpace (hack it and allow easy user transfer), or would Google allow scraping of YouTube?

    To me it seems Big Tech wants to have its cake and eat it too, where investor $$$ are used to corrupt open markets, undermine fundamental democratic State social institutions, manipulate legal processes, and undermine basic consumer rights.

    • vexikron@lemmy.zip
      8 months ago

      Yep, completely agree.

      Case in point: Steam has recently clarified their policies on using such AI-generated material that draws on essentially billions of both copyrighted and non-copyrighted texts and images.

      To publish a game on Steam that uses AI gen content, you now have to verify that you as a developer are legally authorized to use all training material for the AI model for commercial purposes.

      This also applies to code and code snippets generated by AI tools that function similarly, such as Copilot.

      So yeah, sorry, either gotta use MIT licensed open source code or write your own, and you gotta do your own art.

      I imagine this would also prevent you from using AI generated voice lines where you trained the model on basically anyone who did not explicitly consent to this as well, but voice gen software that doesn’t use the ‘train the model on human speakers’ approach would probably be fine, assuming you have the relevant legal rights to use such software commercially.

      Not 100% sure this is Steam’s policy on voice gen stuff, they focused mainly on art, dialogue, and code in their latest policy update, but the logic seems to work out to this conclusion.

    • sculd@beehaw.orgOP
      8 months ago

      Agreed.

      There is nothing “fair” about the way OpenAI steals other people’s work. ChatGPT is being monetized all over the world and the large number of people whose work has not been compensated will never see a cent of that money.

      At the same time the LLM will be used to replace (at least some of) the people who created those works in the first place.

      Tech bros are disgusting.

  • Nacktmull@lemm.ee
    8 months ago

    The problem is not the use of copyrighted material. The problem is doing so without permission and without paying for it.

  • unrelatedkeg@lemmy.sdf.org
    8 months ago

    OpenAI says it’s impossible to create useful AI models without copyrighted material

    Good riddance, then just don’t.

  • sub_o@beehaw.org
    8 months ago

    https://petapixel.com/2024/01/03/court-docs-reveal-midjourney-wanted-to-copy-the-style-of-these-photographers/

    What’s stopping AI companies from paying royalties to artists they ripped off?

    Also, lol at accounts created within a few hours just to reply in this thread.

    The moment their own works are the ones stolen by big companies and they are driven out of business, watch their tune change.

    Edit: I remember when Reddit did that shitshow, and all of a sudden a lot of sock / bot accounts appeared. I wasn’t expecting it to happen here, but I guess the election cycle is near.

    • sanzky@beehaw.org
      8 months ago

      What’s stopping AI companies from paying royalties to artists they ripped off?

      profit. AI is not even a profitable business now. They exist because of the huge amount of investment being poured into it. If they have to pay their fair share they would not exist as a business.

      what OpenAI says is actually true. The issue IMHO is the idea that we should give them a pass to do it.

      • sub_o@beehaw.org
        8 months ago

        Uber wasn’t making a profit anyway, despite all the VC money behind it.

        I guess they have reasons not to pay drivers properly. Give Uber a free pass for it too

    • flatbield@beehaw.org
      8 months ago

      Money is not always the issue. Take FOSS software, for example. Who wants their FOSS software gobbled up by a commercial AI regardless? So there are a variety of issues.

      • intensely_human@lemm.ee
        8 months ago

        I don’t care if any of my FOSS software is gobbled up by a commercial AI. Someone reading my code isn’t a problem to me. If it were, I wouldn’t publish it openly.

        • sub_o@beehaw.org
          8 months ago

          I do, especially when someone’s profiting from it, while my license is strictly for non-commercial use.

          • The Doctor@beehaw.org
            8 months ago

            Same. I didn’t write it for them. I wrote it for folks who don’t necessarily have a lot of money but want something useful.

            • intensely_human@lemm.ee
              8 months ago

              Well, for $20/mo I get a super-educated virtual assistant/tutor. It’s pretty awesome.

              I’d say that’s some good value for people without much money. All of my open source libs are published under the MIT license if I recall correctly. I’ve made so much money using open source software, I don’t mind giving back, even to people who are going to make money with my code.

              It makes me feel good to think my code could be involved in money changing hands. It’s evidence to me that I built something valuable.

              • ParsnipWitch@feddit.de
                8 months ago

                $20/mo

                good value for people without much money

                The absolute majority of people cannot afford that. This is especially true for a huge share of the artists whose work was used to train various models.

                AI currently is a tool for rich people by rich people which uses the work of poor people who themselves won’t be able to benefit from it.

                • intensely_human@lemm.ee
                  8 months ago

                  And yet it is orders of magnitude less than it cost a year ago to hire someone to do research, write reports, and tutor me in any subject I want.

                  If an artist can’t afford $20/mo they need a job to support that hobby.

    • Mnglw@beehaw.org
      8 months ago

      I’m not so much in favor of IP law as I am in favor of informed consent in every aspect of the word.

      When posting photos, art and text content years ago, I was not able to imagine it might be used to train an AI. As such, I was not able to make a decision based on informed consent about whether I agreed to that or not.

      Even though quotes such as “once you post it, it’s on the internet forever” were around, I was not aware of the extent to which this reached, or that, had my art been vacuumed up by a generative AI model (it hasn’t, luckily), people could create art that pretends to be created by me. Thus I could not consent.

      I think this goes for a lot of artists actually, especially those who exist far more publicly than I do, who are in those databases and who are a keyword to be used in prompts. There is no possible way they could have given informed consent to that at the time they posted art/at the time they started that social media profile/youtube channel etc.

      To me, this is the real problem. I couldn’t care less about corporations.

    • interdimensionalmeme@lemmy.ml
      8 months ago

      I still think IP needs to eat shit and die. Always has, always will.

      I recently found out we could have had 3D printing 20 years earlier but patents stopped that. Cocks!

    • JokeDeity@lemm.ee
      8 months ago

      I’m the detractor here, I couldn’t give less of a shit about anything to do with intellectual property and think all copyright is bad.

  • Pratai@lemmy.ca
    8 months ago

    I stand by my opinion that AI will be the worst thing humans ever created, and that means it ranks just a bit above religion.

    • Allero@lemmy.today
      8 months ago

      I’d argue the issue is not the AI but capitalism.

      AI is good, AI companies are evil.

  • Haus@kbin.social
    8 months ago

    Try to train a human comedian to make jokes without ever allowing him to hear another comedian’s jokes, never watching a movie, never reading a book or magazine, never watching a TV show. I expect the jokes would be pretty weak.

    • Phanatik@kbin.social
      8 months ago

      A comedian isn’t forming a sentence based on which word is most likely to appear after the previous one. This is such a bullshit argument that reduces human competency to “monkey see thing to draw thing” and completely overlooks the craft and intent behind creative works. Do you know why ChatGPT uses certain words over others? Probability. It decided as a result of its training that one word would appear after the previous in certain contexts. It absolutely doesn’t take into account things like “maybe this word would be better here because the sound and syllables maintain the flow of the sentence”.

      Baffling takes from people who don’t know what they’re talking about.

      • Pup Biru@aussie.zone
        8 months ago

        you know how the neurons in our brain work, right?

        because if not, well, it’s pretty similar… unless you say there’s a soul (in which case we can’t really have a conversation based on fact alone), we’re just big ol’ probability machines with tuned weights based on past experiences too

        • ParsnipWitch@feddit.de
          8 months ago

          “Soul” is the word we use for something we don’t scientifically understand yet. Unless you did discover how human brains work, in that case I congratulate you on your Nobel prize.

          You can abstract a complex concept so much it becomes wrong. And abstracting how the brain works to “it’s a probability machine” definitely is a wrong description. Especially when you want to use it as an argument of similarity to other probability machines.

          • Pup Biru@aussie.zone
            8 months ago

            “Soul” is the word we use for something we don’t scientifically understand yet

            that’s far from definitive. another definition is

            A part of humans regarded as immaterial, immortal, separable from the body at death

            but since we aren’t arguing semantics, it doesn’t really matter exactly, other than the fact that it’s important to remember that just because you have an experience, belief, or view doesn’t make it the only truth

            of course i didn’t discover categorically how the human brain works in its entirety, however most scientists i’m sure would agree that the method by which the brain performs its functions is by neurons firing. if you disagree with that statement, the burden of proof is on you. the part we don’t understand is how it all connects up - the emergent behaviour. we understand the basics; that’s not in question, and you seem to be questioning it

            You can abstract a complex concept so much it becomes wrong

            it’s not abstracted; it’s simplified… if what you’re saying were true, then simplifying complex organisms down to a petri dish for research would be “abstracted” so much it “becomes wrong”, which is categorically untrue… it’s an incomplete picture, but that doesn’t make it either wrong or abstract

            *edit: sorry, it was another comment where i specifically said belief; the comment you replied to didn’t state that, however most of this still applies regardless

            i laid out an a leads to b leads to c and stated that it’s simply a belief, however it’s a belief that’s based in logic and simplified concepts. if you want to disagree that’s fine but don’t act like you have some “evidence” or “proof” to back up your claims… all we’re talking about here is belief, because we simply don’t know - neither you nor i

            and given that all of this is based on belief rather than proof, the only thing that matters is what we as individuals believe about the input and output data (because the bit in the middle has no definitive proof either way)

            if a human consumes media and writes something and it looks different, that’s not a violation

            if a machine consumes media and writes something and it looks different, you’re arguing that is a violation

            the only difference here is your belief that a human brain somehow has something “more” than a probabilistic model going on… but again, that’s far from certain

  • ky56@aussie.zone
    8 months ago

    All the AI race has done is surface the long standing issue of how broken copyright is for the online internet era. Artists should be compensated but trying to do that using the traditional model which was originally designed with physical, non infinitely copyable goods in mind is just asinine.

    One such model could be to make the copyright owner automatically assigned by first upload on any platform that supports the API. An API provided and enforced by the US copyright office. A percentage of the end use case can be paid back as royalties. I haven’t really thought out this model much further than this.

    Machine learning is here to stay and is a useful tool that can be used for good and evil things alike.

    • Kichae@lemmy.ca
      8 months ago

      Nah. Copyright is broken, but it’s broken because it lasts too long, and it can be held by constructs. People should still reserve the right to not have the things they’ve made incorporated into projects or products they don’t want to be associated with.

      The right to refusal is important. Consent is important. The default permission should not be shifted to “yes” in anybody’s mind.

      The fact that a not insignificant number of people seem to think the only issue here is money points to some pretty fucking entitled views among the would-be-billionaires.

  • Pete Hahnloser@beehaw.org
    8 months ago

    Any reasonable person can reach the conclusion that something is wrong here.

    What I’m not seeing a lot of acknowledgement of is who really gets hurt by copyright infringement under the current U.S. scheme. (The quote is obviously directed toward the UK, but I’m reasonably certain a similar situation exists there.)

    Hint: It’s rarely the creators, who usually get paid once while their work continues to make money for others.

    Let’s say the New York Times wins its lawsuit. Do you really think the reporters who wrote the infringed-upon material will be getting royalty checks to be made whole?

    This is not OpenAI vs creatives. OK, on a basic level it is, but expecting no one to scrape blogs and forum posts rather goes against the idea of the open internet in the first place. We’ve all learned by now that what goes on the internet stays there, with attribution totally optional unless you have a legal department. What’s novel here is the scale of scraping, but I see some merit to the “transformational” fair-use defense given that the ingested content is not being reposted verbatim.

    This is corporations vs corporations. Framing it as millions of people missing out on what they’d have otherwise rightfully gotten is disingenuous.

    • lemmyvore@feddit.nl
      8 months ago

      This isn’t about scraping the internet. The internet is full of crap and the LLMs will add even more crap to it. It will shortly become exponentially harder to find the meaningful content on the internet.

      No, this is about dipping into high quality, curated content. OpenAI wants to be able to use all existing human artwork without paying anything for it, and then flood the world with cheap knockoff copies. It’s that simple.

      • towerful@programming.dev
        8 months ago

        Shortly? It’s happening already. I notice it when using Google and DuckDuckGo. There are always a few hits that are AI-written blog spam word soup.

        • lemmyvore@feddit.nl
          8 months ago

          Unfortunately you haven’t seen the full impact of LLMs yet. What you’re seeing now is stuff that’s already been going on for a decade. SEO content generators have been a thing for many years and used by everybody from small business owners to site chains pinching ad pennies.

          When the LLM crap kicks in you won’t see anything except their links. I wouldn’t be surprised if we have to go back to 90s tech and use human-curated webrings and directories.

          • emptiestplace@lemmy.ml
            8 months ago

            It’s especially amusing when you consider that it’s not even fully autonomous yet; we’re actively doing this to ourselves.