Dean's World

Defending the liberal tradition in history, science, and philosophy.

Is It Copyright Infringement If You're Just An Indexer?

Ron Coleman has some thoughts on the matter.

I must admit to being completely torn. I am of the "information wants to be free" mentality, although when I say that I mean "free" like free speech and not like free beer. So I hate when people just hijack other people's work to make money for themselves and don't compensate the original creator. But I also hate it when information is locked up and hard to get--out of print books that are hard to find, stuff locked away in estates no one can get to, and scientific journals which charge hundreds or thousands of dollars just to look at what they have. (The latter is particularly irksome when it's reports on studies paid for primarily by government grants.)

Entirely coincidentally, in another thread here on Dean's World recently, the topic of one of my favorite minor historical characters came up: Vilhjalmur Stefansson (See Wikipedia entry here). He was an early 20th century scientist in the old romantic tradition, who made major contributions to anthropology as well as medicine by travelling and living among previously unstudied people in the Arctic. He had studies published in prominent anthropological journals as well as the Journal of the American Medical Association. Yet today, all his books are out of print, his papers are in editions of the scientific journals which are not currently online, and anything else is locked up in a storage room somewhere in the Dartmouth College library.

The man died 43 years ago. You would think that at this point most or all of his works should be free to the public for open distribution on the internet, but good luck trying to get that done. The only person who can grant permission is his widow, who long ago remarried and no longer answers mail or telephone inquiries regarding her first husband's work. (At least, the last I checked. By now she may have passed away.)

The practical financial value of his papers is very low. Yet he's a fascinating minor historical figure who was once very nearly as famous as Charles Lindbergh or Marie Curie. There would be value in getting his papers online for free access, but good luck trying to read any of it unless you haunt rare book establishments, or make an appointment with the Dartmouth library staff. Public redistribution? No chance. That to me is crazy. It should be out there. (I've read some of it and it's all deeply fascinating.)

So when I read what Google is doing with "indexing" content, part of me cheers, and part of me cringes, and I'm not sure what exactly I think.

Posted by Dean
Dan the Highway guy (mail) (www):
This points up the issue that 'copyright' is one of the most pressing issues of the digital age, and there's a fairly awful transition coming up on us. There is going to be some sort of end to the current 'content-owner vs user' debate, and it's not going to be pretty for someone. My opinion, based on the interests I see involved, is that it's going to end up with a government-forced system that ends up bad for the users, but that's not at all certain.

I think the reason the majority of work is not available is simply that the content owner doesn't want to bother with it. It's a lot easier to say 'no' than to work out deals, or have a free policy that may present liability exposure. But is there a solution? Not without changing things, and that's going to step on toes. There's no way to limit the current version of copyright inheritance without hurting some narrow interests, and that's the kind of thing that is a losing battle.

I don't know if anyone's read Spider Robinson's Melancholy Elephants, but it deals with the issue of having very strong copyright protection, and how that would be a terrible thing for humans. The short story is online here at Baen books.
10.31.2005 1:12pm
Martin L. Shoemaker (www):
Your Ron Coleman link goes to SplogReporter, so I'm not sure what you're commenting on here. That makes it harder for me to respond; but when has that ever slowed me down?

(I assume you meant to link to this piece: http://www.likelihoodofconfusion.com/?p=290.)

I am, I suspect, a much stronger advocate of strict copyrights than many people on the Web. I see a lot of problems with a lot of positions held by the "knowledge must be free" crowd.

But somehow, this case is different. I just don't see the big deal with Google's indexing. It seems to me like a variation on Fair Use. If Fair Use is intended to encourage commentary and open debate as a way to generate and improve ideas, then a good searchable index is a good way to let people know what ideas are out there and may need to be debated. As long as an index search does not yield up the entire work in readable format all in one chunk, then an index seems to me to be very much like excerpts for commentary under Fair Use.

I respect Mr. Coleman a lot, and I usually turn to his blog for this sort of information. He knows a lot more about this field than I do. But somehow in this case, I can't figure out how he sees it the way he does.
10.31.2005 1:24pm
Bryan Costin (mail) (www):
Looks like a misdirected link. Is this the entry you mean?

I think Google is one of the few companies that live by the motto that it's easier to ask forgiveness than to get permission. They're demonstrating what would be technically possible but for the fact that the copyright laws are so badly broken: "Sure, we can make a universal library that's instantly searchable, machine translated into dozens of languages, and constantly updated. Wouldn't that be great? But those guys over there won't let us. Maybe you should ask them why."

I'm sure it's not entirely altruistic. They're building a huge treasure chest of data against the day when either copyright law changes or when publishers and authors come to their senses. I suspect other companies (Microsoft?) are doing the same thing, though much more quietly.

Your example is perfect. The author died decades ago and the heirs are not interested. Whatever cost was incurred writing those reports was paid back or written off long ago. The data isn't making anyone any money where it is now, and the physical medium on which the data is stored is perishable and easily lost.

Once upon a time that information would've automatically passed into the public domain, where people who care could do something to preserve it. Maybe it already has, in fact, but the laws are so tangled that most entities would apparently prefer to see such works lost forever than to risk the possibility of a lawsuit.
10.31.2005 1:29pm
Robert Speirs (mail) (www):
Stefansson's Eskimo book is great. I remember being impressed that he came back and ate nothing but twice-boiled mutton for a year to prove the low-carb theory. Then when I ran into the Atkins diet, I remembered it again. Also I was impressed that Eskimos actually do share their wives with visitors. Now that's hospitality!
10.31.2005 3:13pm
Dean Esmay:
Link fixed.

I do a lot of my writing just before bed and put it on a timer for release during the day. I need to work harder on double-checking before saving.
10.31.2005 3:13pm
Dean Esmay:
Eskimos actually do share their wives with visitors. Now that's hospitality!

That points to an unpleasant anthropological fact: in hunter/gatherer societies, women are rarely hunters. It's not unknown, but it's not at all common, pregnancy and childrearing tending to militate against it. In such societies, women mostly gather, and men mostly hunt. The percentage of food brought by gathering vs. the amount hunted tends to be proportionate to the status of women in such societies. Among most Inuit tribes, the status of women was virtually zero because 9 months of the year all the food was hunted and for about 3 months only small amounts could be gathered. Women were relegated to little more than slaves: cooks, childrearers, and sex toys.

In hunter/gatherer societies where weather is generally good year-round and ample food can be gathered, and hunted meat winds up only a pleasant addition to the diet, the status of men tends to be lower than women's.
10.31.2005 3:34pm
MaryJ:
Dean, I was wondering when you slept! Husband, father, work, school and blog. That note above was good to know.
10.31.2005 4:33pm
MaryJ:
Oops, and study!
10.31.2005 4:35pm
Dean Esmay:
I sleep during the day.
10.31.2005 7:49pm
Ronald Coleman (mail) (www):
But somehow, this case is different. I just don't see the big deal with Google's indexing. It seems to me like a variation on Fair Use. If Fair Use is intended to encourage commentary and open debate as a way to generate and improve ideas, then a good searchable index is a good way to let people know what ideas are out there and may need to be debated. As long as an index search does not yield up the entire work in readable format all in one chunk, then an index seems to me to be very much like excerpts for commentary under Fair Use.

I guess, Martin, my thinking is that the problem is that here you've gone and uploaded a whole work to your database. That's copying. You promise me no one's going to download it, but it's essentially on a server that's accessible to the whole universe. If downloading a work to a cache file on a PC can be construed as unauthorized copying -- and in many cases it can be -- this is even more worrisome.

Meanwhile, you don't have to read my book anymore to find the parts you want. Don't mention libraries -- most books and articles don't make it into a library. The one thing I have going for me as an author is that you have to buy my book to get the advantage of its contents.

Another point, now that I think about it. Doesn't context matter? The saving grace of what Google's doing is supposed to be that it isn't giving the whole text. But that guarantees that you have no idea whether I'm really saying what you think I'm saying. That's always possible even if you buy my book, but if you have the whole thing in front of you, at least you have no one to blame but yourself.

I respect Mr. Coleman a lot, and I usually turn to his blog for this sort of information.

Wow! I didn't even know my dad had a blog!
11.1.2005 3:11pm