Mar 17, 2023

Jeff MacKie-Mason: “The Internet Archive Is a Library”

By Dave Hansen, Deborah Jakubs, Chris Bourg, Thomas Leonard, Jeff MacKie-Mason, Joseph A. Salem Jr., MacKenzie Smith, and Winston Tabb

The Internet Archive, a nonprofit library in San Francisco, has grown into one of the most important cultural institutions of the modern age. What began in 1996 as an audacious attempt to archive and preserve the World Wide Web has grown into a vast library of books, musical recordings and television shows, all digitized and available online, with a mission to provide “universal access to all knowledge.”

We are now at a pivotal stage in a pending copyright infringement lawsuit against the Internet Archive, brought by four of the biggest for-profit publishers in the world, who have been trying to shut down core programs of the archive since the start of the pandemic. For the sake of libraries and library users everywhere, let’s hope they don’t succeed.

You’ve probably heard of Internet Archive’s Wayback Machine, which archives billions of webpages from across the globe. Fewer are familiar with its other extraordinary collections, which include 41 million digitized books and texts, with more than three million books available to borrow. To make this possible, Internet Archive uses a practice known as “controlled digital lending,” “whereby a library owns a book, digitizes it, and loans either the physical book or the digital copy to one user at a time.”

Despite the Internet Archive’s incredible library collections, which serve the needs of millions of people, Hachette Book Group, HarperCollins Publishers, John Wiley & Sons Inc., and Penguin Random House assert that it is not a real library.

In their lawsuit against the Internet Archive, which could extract millions of dollars from the nonprofit organization, the publishers claim that the Internet Archive “badly misleads the public and boldly misappropriates the goodwill that libraries enjoy and have legitimately earned.” In their view, the archive’s “efforts to brand itself as a library” are part of a scheme to “fraudulently mislead” people, circumvent copyright law and limit how much profit publishers can extract from the ebook market. They describe the Internet Archive as a “pirate site” and its business model as “parasitic and illegal” and characterize controlled digital lending as “an invented paradigm that is well outside copyright law.”

The Internet Archive, in turn, argues that the practice of controlled digital lending constitutes fair use under copyright law, and asserts that “libraries have been practicing CDL in one form or another for more than a decade, and hundreds of libraries use it to lend books digitally today.”

Why is it so important to the publishers that the Internet Archive not be identified as a library? Primarily because Congress has long recognized the valuable role that libraries play in our copyright system and has created special allowances in the law for their work. In this suit, the publishers seek to redefine the Internet Archive on their own terms and, in so doing, deny it the ability to leverage the same legal tools that thousands of other libraries use to lend and disseminate materials to our users.

The argument that the Internet Archive isn’t a library is wrong. If this argument were accepted, the result would jeopardize the future development of digital libraries nationwide. The Internet Archive is the most significant specialized library to emerge in decades. It is one of the only major memory institutions to be created from the emergence of the internet. It remains a modern-day cultural institution built intentionally in response to the technological revolution through which we’ve lived.

Libraries are defined by collections, services and values. In The Librarian’s Book of Lists (ALA, 2010), George M. Eberhart offers this definition: “A library is a collection of resources in a variety of formats that is (1) organized by information professionals or other experts who (2) provide convenient physical, digital, bibliographic, or intellectual access and (3) offer targeted services and programs (4) with the mission of educating, informing, or entertaining a variety of audiences (5) and the goal of stimulating individual learning and advancing society as a whole.”

The Internet Archive has all these characteristics. It is a one-of-a-kind independent research library, with its holdings fully available in digital form. Its substantial physical and digital collections are unique. It employs librarians and other information professionals. It is open to all interested readers. It cooperates with peer libraries in support of archiving the information and contemporary discourse as manifested in the World Wide Web. It has an active community of researchers who depend on its collections. And it is an engaged, responsive, resource-sharing partner to hundreds of peer libraries. It is also now an integral part of the interlibrary loan system, sharing its holdings with other libraries worldwide. It shares the keystone values of all libraries: preservation, access, privacy, intellectual freedom, diversity, lifelong learning and the public good. And it does all this without commercial motive as a mission-driven not-for-profit organization.

Those of us who have worked with the Internet Archive or drawn on its many offerings have long seen the organization as a peer. The Internet Archive fulfills the mission of a library in ways we could only dream of a few decades ago.

We cannot defend against the publishers’ lawsuit. We can, however, stand with Internet Archive as it fights for the right to buy, preserve and lend books, which is what libraries do.

Originally published as “The Internet Archive Is a Library” by Inside Higher Ed on March 17, 2023. Reprinted with permission.

Dave Hansen is executive director of Authors Alliance.

Deborah Jakubs is university librarian emerita at Duke University.

Chris Bourg is director of libraries at Massachusetts Institute of Technology.

Thomas Leonard is university librarian emeritus and professor of journalism emeritus at University of California, Berkeley.

Jeff MacKie-Mason is university librarian, chief digital scholarship officer and professor in the School of Information and the Department of Economics at UC Berkeley.

Joseph A. Salem Jr. is the Rita DiGiallonardo Holloway University Librarian and vice provost for library affairs at Duke.

MacKenzie Smith is university librarian and vice provost of digital scholarship at University of California, Davis.

Winston Tabb is the Sheridan Dean of University Libraries, Archives and Museums Emeritus at Johns Hopkins University. 

Last updated: March 21, 2023