Showing posts with label digital preservation. Show all posts

Monday, June 9, 2025

Newsmaker: Brewster Kahle; American Libraries, June 4, 2025

Anne Ford, American Libraries; Newsmaker: Brewster Kahle

"How has the work of the Internet Archive been affected since Trump took office?

Well, the biggest effect has been getting a lot of attention for what we do. We spend a lot of time on Democracy’s Library, which is a name for collecting all the born-digital and digitized publications of government at the federal, state, and municipal levels. There’s been so much attention about all of the [digital] takedowns that we’ve received lots and lots of volunteer help toward collecting not only web assets but also databases that are being removed from government websites. It’s all hands on deck.

And you just launched a new YouTube channel.

Yes, we unveiled our next-generation microfiche scanning as part of our Democracy’s Library project, because a lot of .gov sites are on microfiche, and people don’t want to use microfiche anymore. Fortunately, the US government in its early era was pro–access to information and made government documents public domain. So we put out a YouTube livestream of the microfiche being digitized.

What would you like to see libraries and librarians do during this challenging time?

We need libraries to have at least as good rights in the digital world as we have in the physical world. There’s an upcoming website [from the Internet Archive and others] called the Four Digital Rights of Libraries, and that is something libraries can sign onto as institutions. [The website will launch during the Association of European Research Libraries’ LIBER 2025 Conference in Lausanne, Switzerland, July 2-4.]

People generally don’t know that libraries, in this digital era, are prevented from buying any ebooks or MP3s. They are not allowed by the publishers to have them. They spend and spend and spend, but they don’t end up owning anything. They’re not building collections. So the publishers can change or delete anything at any time, and they do. In their dream case, libraries will never own anything ever again. This is a structural attack on libraries. You don’t need to be a deep historian to know what happens to libraries. They’re actively destroyed by the powerful.

So let’s spend [our collection budgets] buying ebooks, buying music, buying material from small publishers or anybody [else] that will actually sell to us. Make it so we are building our own collections, not this licensing thing where these books disappear.

That’s a big ask. But the great thing about that will be that our libraries start buying things from small publishers, where most of the money goes back to the authors, not stopping with the big multinational publishers. Let’s build a system that works for more players than just big corporations that make a habit of suing libraries."

Tuesday, March 18, 2025

Pentagon restores webpage for Black Medal of Honor winner but defends DEI purge; The Guardian, March 17, 2025

The Guardian; Pentagon restores webpage for Black Medal of Honor winner but defends DEI purge


[Kip Currier: In my comments on a prior Guardian story yesterday, I noted that although it's good that the webpage for Maj. Gen. Rogers has been restored and the pejorative label "DEImedal" has been removed from the website address, we are left with many troubling concerns and questions. Chief among them: how many other websites have been temporarily or permanently removed and/or altered that relate to marginalized persons?

Now, we have a clearer picture, from this Guardian and Associated Press reporting:

In all, thousands of pages honoring contributions by women and minority groups have been taken down in efforts to delete material promoting diversity, equity and inclusion – an action that Parnell defended at a briefing.

https://www.theguardian.com/us-news/2025/mar/17/defense-department-black-medal-honor-webpage-restored 

These purges more than ever underscore the vital work of people and institutions who are collecting, archiving, and preserving digital records; people like archivists and library and museum staffs. Remember this when someone shortsightedly or misguidedly asks whether libraries, archives, and museums are still needed in the Internet and AI ages.

Other non-profit organizations, too, such as the Internet Archive and its digital preservation-missioned Wayback Machine, are crucial for preserving as much information and as many webpages as possible.

Digital preservation of the information and webpages removed by entities like the current Trump administration could eventually enable that information to be restored. It is imperative that everyone have access to the full breadth of human experience and history, rather than the fragmented shards of history and lived experiences that a particular political administration deems acceptable.

Access to information is a core principle of healthy, well-functioning, responsive democracies. As the late Pulitzer Prize winner Toni Morrison sagely asserted, "Access to knowledge is the superb, the supreme act of truly great civilizations."]


[Excerpt]

A screenshot posted on Bluesky by the writer Brandon Friedman noted that a Google preview continued to show the defense department’s profile page – noting of Rogers that, “as a Black man, he worked for gender and race equality while in the service”. Friedman added that the page no longer worked and the URL had been “changed to include ‘DEI medal’”.

By Monday, however, the site was operational once more – and the URL had returned to its original formulation, with the letters DEI no longer present.

In a statement on Monday that did not elaborate, a defense department spokesperson told the Guardian: “The department has restored the Medal of Honor story about army Maj Gen Charles Calvin Rogers … The story was removed during auto removal process.”

While the defense department also claimed publicly on Monday that internet pages honoring Rogers, as well as Japanese American service members, had been taken down mistakenly, spokesperson Sean Parnell also staunchly defended its overall campaign to strip out content singling out the contributions by women and minority groups, which the Trump administration considers “DEI”.

“I think the president and the secretary have been very clear on this – that anybody that says in the Department of Defense that diversity is our strength is, is frankly, incorrect,” Parnell said.

In all, thousands of pages honoring contributions by women and minority groups have been taken down in efforts to delete material promoting diversity, equity and inclusion – an action that Parnell defended at a briefing.

The defense secretary, Pete Hegseth, and Donald Trump have already removed the only female four-star officer on the joint chiefs of staff, navy Adm Lisa Franchetti, and removed its Black chairperson, Gen CQ Brown Jr.

Wednesday, January 8, 2025

The Internet Archive is in danger; WBUR, January 7, 2025

 

WBUR; The Internet Archive is in danger


"More than 900 billion webpages are preserved on The Wayback Machine, a history of humanity online. Now, copyright lawsuits could wipe it out.

Guests

Brewster Kahle, founder and director of the Internet Archive. Digital librarian and computer engineer.

James Grimmelmann, professor of digital and information law at Cornell Tech and Cornell Law School. Studies how laws regulating software affect freedom, wealth, and power."

Wednesday, June 26, 2024

The MTV News website is gone; The Verge, June 25, 2024

Andrew Liszewski, The Verge; The MTV News website is gone

"The archives of the MTV News website, which had remained accessible online after the unit was shut down last year by parent company Paramount Global, have now been completely taken offline. As Variety reported yesterday, both mtvnews.com and mtv.com/news now redirect visitors to the MTV website’s front page...

Although the MTV News website was no longer publishing new stories, its extensive archive, dating back over two decades to its launch in 1996, remained online. But as former staffers discovered yesterday, that archive is no longer accessible."

Thursday, May 23, 2024

When Online Content Disappears; Pew Research Center, May 17, 2024

Athena Chapekis, Samuel Bestvater, Emma Remy and Gonzalo Rivero, Pew Research Center; When Online Content Disappears

"38% of webpages that existed in 2013 are no longer accessible a decade later...

How we did this

Pew Research Center conducted the analysis to examine how often online content that once existed becomes inaccessible. One part of the study looks at a representative sample of webpages that existed over the past decade to see how many are still accessible today. For this analysis, we collected a sample of pages from the Common Crawl web repository for each year from 2013 to 2023. We then tried to access those pages to see how many still exist.

A second part of the study looks at the links on existing webpages to see how many of those links are still functional. We did this by collecting a large sample of pages from government websites, news websites and the online encyclopedia Wikipedia.

We identified relevant news domains using data from the audience metrics company comScore and relevant government domains (at multiple levels of government) using data from get.gov, the official administrator for the .gov domain. We collected the news and government pages via Common Crawl and the Wikipedia pages from an archive maintained by the Wikimedia Foundation. For each collection, we identified the links on those pages and followed them to their destination to see what share of those links point to sites that are no longer accessible.

A third part of the study looks at how often individual posts on social media sites are deleted or otherwise removed from public view. We did this by collecting a large sample of public tweets on the social media platform X (then known as Twitter) in real time using the Twitter Streaming API. We then tracked the status of those tweets for a period of three months using the Twitter Search API to monitor how many were still publicly available.

Refer to the report methodology for more details."