Tuesday, August 28, 2007

Is this Fair Use?

This morning I received an email from someone asking for more information about a person they had found on my site at Ancestry.com. The first thing that popped into my feeble brain was that I don't have a website on Ancestry, and I knew that what they were referring to was not on my freepages at RootsWeb!

So I went to Ancestry and did a search for "Phend", which brought up the following screen. I didn't see anything out of the ordinary (click on any of the images to make them easier to read):



So I elected to view all 229 results:



The "Internet Biographical Collection" jumped out at me. Notice the padlock? I clicked on that link, but this is a "for pay" subscription database, and since I wasn't logged in I couldn't see any detail beyond the listing of pages. All of those pages, except for the last one, are from my website, and they are definitely NOT part of Ancestry.com!!!



After logging in and clicking on "View Record" on one of the listings, what you see is shown below. There is no indication of where this came from, only a small link to "View Cached Web Page". Okay, so it says it is a cached page. . .



Click on "View Cached Web Page" (click on these images to make them bigger) and you'll see a small link at the top of the page to "View Live web page", which will then take you to the page, maybe.



For this particular page the link works because my site is still live. But when I was investigating all this I had gone to some obituary links. The site an obituary was retrieved from is even more "hidden", for lack of a better word: many newspapers only keep obituaries online for a short time, so the page is no longer live. I wonder if Ancestry.com is paying those sites to "store" their obituaries and make them available to Ancestry subscribers?

Is this legal or moral? How is it right for Ancestry.com to take my website pages, which I've made freely available, and CHARGE people to use them? And if they can legally or morally do this, how can they in turn say that it is illegal for their users (me and you) to use their images (census records, draft cards, etc.) on our websites or in our books or other publications?

The more I think about this, the angrier I am getting. At first I thought, okay, they say it is a cached web page, but it's not overly obvious. But they are charging people for access to my stuff!!! I really don't think it would bother me so much if this weren't hidden behind a padlock. The more people who can find my data and possibly connect to me or someone else, the better - but they shouldn't have to pay to see it! Now, Ancestry is probably going to say they are simply providing a service for all of us poor webmasters and making it so that more people will see our stuff - but does that make it right? They are profiting from my work, and not just my work but the work of anyone with a genealogy-related website. Will my blog pages show up next?

This is different than Google or Yahoo or any other search engine storing cached pages or providing links to websites. This is a company using other people's work for their own gain - Ancestry is charging for these 'searches'. That is just not right, and not just because this is my work showing up - if you have genealogy pages out there anywhere they will probably show up as part of this new Ancestry database.

*** Update 4:00 PM Tuesday ***
I spent a while this morning and afternoon putting this post together, and while I was doing so, it appears that "all hel* was breaking loose" on this issue. See these posts for some very good commentary on the subject:

*** Update 4:44 PM Tuesday ***

Ancestry.com has now made the "Internet Biographical Collection" a "free" resource. You have to register to view these free records, which is not the same as signing up for a free trial, but why should you even have to register to view the "Internet Biographical Collection"? Registration is not required to view the Ancestry World Tree entries. To my way of thinking, this step by Ancestry does not entirely resolve the issue.

*** Update 11:30 PM Tuesday ***

Dick Eastman's post yesterday on "The Generations Network Receives Patent for Correlating Genealogy Records" has a lot of comments dealing with the Internet Biographical Collection, which really had nothing to do with his original topic, so you could say the comments thread got hijacked. As can be expected, there is a wide range of opinions on the matter. Some make sense, others don't. Some valid, some not. And Dick is really good at playing the devil's advocate!

9 comments:

  1. Becky:

    Great job Becky! This is a real hornet's nest.

    I have a problem with registration for the free records. Ancestry does get something of value from registration. My email address.

    I will not register with them.

    fM

  2. Rock on, Becky. Excellent deconstruction. One thing I'd love to know, though, is the specific URLs that go with the screenshots you've provided.

    (I'm looking at this from a technical angle. I want to know if there's any way in web server logs to identify ancestry.com's scraper bot, so access can be denied to their bot.)

    Also, I did a little scrap-- er screenshotting myself, and came up with a parody screenshot of Ancestry.com's home page. Please feel free to take it and post it on your own site as well.

    Cheers,

    Susan

  3. Found more tech information on how to exclude Ancestry's bots; info is posted at the conclusion of post linked above.

    Tho I hit on a solution, I have NO idea how to implement it for blogger-hosted sites. :(

  4. Well, they certainly get a benefit from the content by registering new users, but on top of that, I would think that "caching" your page is basically the same as copying it, and copying is plagiarism if permission has not come from the author. Google also caches pages, but Google is Google, so not many entities complain about what Google does as long as Google sends traffic their way. I would press Ancestry.com to either (1) remove the content, (2) pay original authors for each user who registers because of their content, or (3) provide a short summary text of your work, with a link to your site. They will probably just remove your content if you insist on this, but making this issue public may force a policy change.

  5. Becky,
    Your site has a Creative Commons license that seems to support your position that this is not fair use. I think you could have them take it off their site.
    I agree with the previous anonymous post that it is plagiarism. If you put the text in turnitin.com I'm sure that it will show that it is 100% from your site. If a student in one of my classes turned in a paper with the text on their site, they would receive an "F".

    Your cousin, and University Professor and Dean, Bill Conrad

  6. Becky,

    Great article... I've added a link to it from mine about the same topic.

    I agree with the anonymous poster about what Ancestry.com should do now (next).

    Thank you for all the great screen shots too. I took a few myself.

    Janice

  7. Thanks everyone for your comments.

    Bill - the Creative Commons license was added to the blog this morning. I don't want to prohibit people from using what I post, I'd just like to have proper attribution and acknowledgment of my contribution.

  8. Becky,

    I already had a Creative Commons license on my blog (it was posted the first day it went up), and it clearly shows on the cached versions at Ancestry.com.

    That did not stop them.

    Janice

  9. Did you ever have an Ancestry.com account? If you did then in their user agreement you gave them permission to use your stuff.

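Susan's question in comment 2, whether a scraper can be spotted in web server logs, can be sketched roughly as follows. This is only a minimal illustration in Python, assuming the server writes the common "combined" access log format, where the user agent is the last quoted field on each line. The bot name `ExampleGenealogyBot`, the IP addresses, and the sample log lines are all made up for the example; nothing in this post identifies the user agent Ancestry.com's crawler actually sends.

```python
import re
from collections import Counter

# Assumed layout: the "combined" access log format, in which the
# user agent is the final quoted field on each line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_user_agents(lines):
    """Count requests per user-agent string, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group("agent")] += 1
    return counts


def find_agents(counts, needle):
    """Return the user agents containing `needle` (case-insensitive), sorted."""
    needle = needle.lower()
    return sorted(agent for agent in counts if needle in agent.lower())


# Hypothetical sample data; a real run would read the server's access log.
SAMPLE_LOG = [
    '192.0.2.10 - - [28/Aug/2007:09:15:02 -0400] "GET /phend/index.html HTTP/1.1" 200 5120 "-" "ExampleGenealogyBot/1.0"',
    '192.0.2.10 - - [28/Aug/2007:09:15:04 -0400] "GET /phend/page2.html HTTP/1.1" 200 4310 "-" "ExampleGenealogyBot/1.0"',
    '198.51.100.7 - - [28/Aug/2007:09:20:41 -0400] "GET /phend/index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U)"',
]

if __name__ == "__main__":
    agent_counts = count_user_agents(SAMPLE_LOG)
    # List the busiest user agents first, since crawlers tend to stand out.
    for agent, hits in agent_counts.most_common():
        print(hits, agent)
    print(find_agents(agent_counts, "genealogybot"))
```

Once a suspicious user-agent string turns up, it could be blocked in the server configuration or listed in robots.txt, though robots.txt only helps with crawlers that choose to honor it.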

You have to be a "Registered User" to leave a comment. This means you must have a user ID with one of the following: Google, LiveJournal, WordPress, TypePad, AIM, or OpenID. If you don't have one of those IDs you can always send me an email (link in upper right corner of the blog). I apologize for the inconvenience, but the amount of spam comments being left was overwhelming. Comment moderation is turned on for posts more than 3 days old.