Showing posts with label content creators. Show all posts

Tuesday, July 23, 2024

The Data That Powers A.I. Is Disappearing Fast; The New York Times, July 19, 2024

Kevin Roose, The New York Times; The Data That Powers A.I. Is Disappearing Fast

"For years, the people building powerful artificial intelligence systems have used enormous troves of text, images and videos pulled from the internet to train their models.

Now, that data is drying up.

Over the past year, many of the most important web sources used for training A.I. models have restricted the use of their data, according to a study published this week by the Data Provenance Initiative, an M.I.T.-led research group.

The study, which looked at 14,000 web domains that are included in three commonly used A.I. training data sets, discovered an “emerging crisis in consent,” as publishers and online platforms have taken steps to prevent their data from being harvested.

The researchers estimate that in the three data sets — called C4, RefinedWeb and Dolma — 5 percent of all data, and 25 percent of data from the highest-quality sources, has been restricted. Those restrictions are set up through the Robots Exclusion Protocol, a decades-old method for website owners to prevent automated bots from crawling their pages using a file called robots.txt."
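As a rough illustration of the robots.txt mechanism mentioned in the excerpt above, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The crawler name `ExampleAIBot`, the sample rules, and the `is_allowed` helper are all made up for illustration; they are not taken from the study or the article. Keep in mind that robots.txt only states a site's wishes, and it is up to each crawler to honor them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one made-up AI crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    print("ExampleAIBot:", is_allowed("ExampleAIBot", page))
    print("SomeOtherBot:", is_allowed("SomeOtherBot", page))
```

A well-behaved crawler would run a check like this before requesting a page and skip any URL that comes back as not allowed.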

Wednesday, July 17, 2024

How Creators Are Facing Hateful Comments Head-On; The New York Times, July 11, 2024

Melina Delkic, The New York Times; How Creators Are Facing Hateful Comments Head-On

"Experts in online behavior also say that the best approach is usually to ignore nasty comments, as hard as that may be.

“I think it’s helpful for people to keep in mind that hateful comments they see are typically posted by people who are the most extreme users,” said William Brady, an assistant professor at Northwestern University, whose research team studied online outrage by looking at 13 million tweets. He added that the instinct to “punish” someone can backfire.

“Giving a toxic user any engagement (view, like, share, comment) ironically can make their content more visible,” he wrote in an email. “For example, when people retweet toxic content in order to comment on it, they are actually increasing the visibility of the content they intend to criticize. But if it is ignored, algorithms are unlikely to pick them up and artificially spread them further.”"

Sunday, June 30, 2024

Tech companies battle content creators over use of copyrighted material to train AI models; The Canadian Press via CBC, June 30, 2024

Anja Karadeglija, The Canadian Press via CBC; Tech companies battle content creators over use of copyrighted material to train AI models

"Canadian creators and publishers want the government to do something about the unauthorized and usually unreported use of their content to train generative artificial intelligence systems.

But AI companies maintain that using the material to train their systems doesn't violate copyright, and say limiting its use would stymie the development of AI in Canada.

The two sides are making their cases in recently published submissions to a consultation on copyright and AI being undertaken by the federal government as it considers how Canada's copyright laws should address the emergence of generative AI systems like OpenAI's ChatGPT."

Saturday, June 11, 2016

New York Times Says Fair Use Of 300 Words Will Run You About $1800; TechDirt, 6/10/16

Tim Cushing, TechDirt; New York Times Says Fair Use Of 300 Words Will Run You About $1800

"Fair use is apparently the last refuge of a scofflaw. Following on the heels of a Sony rep's assertion that people could avail themselves of fair use for the right price, here comes the New York Times implying fair use not only does not exist, but that it runs more than $6/word.

Obtaining formal permission to use three quotations from New York Times articles in a book ultimately cost two professors $1,884. They’re outraged, and have taken to Kickstarter — in part to recoup the charges, but primarily, they say, to “protest the Times’ and publishers’ lack of respect for Fair Use.”

These professors used quotes from other sources in their book about press coverage of health issues, but only the Gray Lady stood there with her hand out, expecting nearly $2,000 in exchange for three quotes totalling less than 300 words.

The professors paid, but the New York Times "policy" just ensures it will be avoided by others looking to source quotes for their publications. The high rate it charges (which it claims is a "20% discount") for fair use of its work will be viewed by others as proxy censorship. And when censorship of this sort rears its head, most people just route around it. Other sources will be sought and the New York Times won't be padding its bottom line with ridiculous fees for de minimis use of its articles.

The authors' Kickstarter isn't so much to pay off the Times, but more to raise awareness of the publication's unwillingness to respect fair use."