Friday, April 06, 2012

Science and the Sticky Issue of Intellectual Property

I'm not a huge fan of intellectual property (IP), and certainly not of the status it currently enjoys in the US, which the US is pressuring the rest of the world to adopt. At the same time, I'm not convinced by the abolish-IP crowd, even though they make a strong case; the fashion industry, for instance, is a great example of a thriving artistic business that has virtually no IP protection.

Now we come to openness in the arena of scientific papers. I recently came across the JSTOR controversy:
On July 19, 2011, internet activist Aaron Swartz was charged with data theft in relation to an alleged theft of academic journal articles from JSTOR. According to the indictment against him, Swartz surreptitiously attached a laptop to MIT's computer network, which allowed him to "rapidly download an extraordinary volume of articles from JSTOR". Prosecutors in the case say Swartz acted with the intention of making the papers available on P2P file-sharing sites. Swartz surrendered to authorities, pleaded not guilty to all counts and was released on $100,000 bail. Prosecution of the case is ongoing.

Two days later, on July 21, Greg Maxwell published a torrent file of a 32GB archive of 18,592 academic papers from JSTOR's Royal Society collection, via The Pirate Bay, in protest against Swartz' prosecution.
On September 7, JSTOR announced that they are releasing the public domain content of their archives (about 6% of the total) to the public. According to JSTOR, they have been working on making those archives public for some time, and the recent controversy made them "press ahead" with this initiative.
Now I think a possible ten years in prison and $100,000 bail is pretty damn draconian. But that's really the US authorities and legal machine going off the deep end. Anyway, there is a 32.5 GB torrent over on The Pirate Bay (that I'm not going to link to) put up by Greg Maxwell. Greg has a long, long diatribe (or manifesto) about why he put it up, a lot of which I find rather naive or just downright idiotic, although I admire him for publicly identifying himself on the torrent. Here are some excerpts:
Limited access to the documents here is typically sold for $19 USD per article, though some of the older ones are available as cheaply as $8. Purchasing access to this collection one article at a time would cost hundreds of thousands of dollars. Also included is the basic factual metadata allowing you to locate works by title, author, or publication date, and a checksum file to allow you to check for corruption.

I've had these files for a long time, but I've been afraid that if I published them I would be subject to unjust legal harassment by those who profit from controlling access to these works. I now feel that I've been making the wrong decision.
[...]
Copyright is a legal fiction representing a narrow compromise: we give up some of our natural right to exchange information in exchange for creating an economic incentive to author, so that we may all enjoy more works. When publishers abuse the system to prop up their existence, when they misrepresent the extent of copyright coverage, when they use threats of frivolous litigation to suppress the dissemination of publicly owned works, they are stealing from everyone else. 
Okay, I more or less agree with this. But I don't think publishers abuse the system to prop up their existence; I think publishers created the system, using crony capitalism, to prop up their existence. I suspect, though, that it was the music industry that had the biggest hand in greasing the right palms (thanks, Bill Clinton and a unanimous boot-licking Senate).

Still, it didn't seem to be JSTOR's intent to charge for older papers that were in the public domain. This is from JSTOR's press release:
On a final note, I realize that some people may speculate that making the Early Journal Content free to the public today is a direct response to widely-publicized events over the summer involving an individual who was indicted for downloading a substantial portion of content from JSTOR, allegedly for the purpose of posting it to file sharing sites. While we had been working on releasing the pre-1923/pre-1870 content before the incident took place, it would be inaccurate to say that these events have had no impact on our planning. We considered whether to delay or accelerate this action, largely out of concern that people might draw incorrect conclusions about our motivations. In the end, we decided to press ahead with our plans to make the Early Journal Content available, which we believe is in the best interest of our library and publisher partners, and students, scholars, and researchers everywhere.
The thing is, it actually does cost time and money to digitize old papers and make them publicly available even if they are in the public domain. Project Gutenberg is a great project that takes advantage of hundreds of thousands of hours of volunteer work, but how many people are interested in spending their free time digitizing old scientific papers?

How much do strict IP laws affect the spread of information? Probably more than you'd think. Take a look at this rather shocking graph by Paul Heald from the blog Offsetting Behaviour:


What happened? Well, 1922 is the cutoff: books published in the US before 1923 are in the public domain.

How much would more open access to scientific papers affect the general health of science these days? It's very difficult to say. There are certainly far fewer people clamoring to read dense scientific papers than the latest Stephen King novel. On the other hand, a lot more information is disseminated these days through non-mainstream channels such as this blog.

In my opinion, post-WWII science has suffered from the peer-review system, an entrenched academia and, most of all, the underlying source of post-WWII scientific funding: the government. I'm still a big fan of the open science movement (here and here), but I don't think it gets to the root of the problem.

These are sticky problems: what is wrong with science these days, and what is wrong with IP these days (perhaps nothing). I don't think there are any hard and fast answers, but I think the two are very much related.
