The Happy Technologist: Interesting Geekdom


Heritage Health Prize: On Algorithms, Rights, Patents and Patients

So my goal for the weekend was to submit an entry to the Heritage Health Prize. It took me until Monday night (I have to work on this off-hours, since it isn't a work-sponsored event), but our team, the Data Monkeys (with Jeremi and Chris at this point), is now entered and, somewhat amazingly, NOT in last place! Yay!

But I'm getting ahead of myself... the Heritage Health Prize is a data competition run through Kaggle, which hosts these sorts of things. It's a $3 million prize competition for a method of predicting which patients will spend time in a hospital, given their prior years' medical history.

I've been wanting to enter something like this for a while. I don't harbor any real hopes of winning (I have some fake hopes, of course); this sort of money attracts teams with far more depth of experience in data mining algorithms than I have -- our team leans more towards data management than analytics. Still, this is an opportunity to head in that direction, so I'm going to take it.

Naturally, there's already some controversy. The terms of the competition indicate that you have to sign over the rights to the winning algorithm to the Heritage people. For $3 million I don't really have much of a problem with this, but some people disagree. The more interesting question this raises for me is: is this even possible? U.S. patent law seems to exclude the patenting of algorithms, although software can currently be patented -- there's much debate about this, as the two things are arguably identical for all practical purposes. Copyright may be an equally invalid approach, although that seems to be less clear... Software can clearly be protected by copyright, so an algorithm implemented in software would inherently be covered, but it's unclear how much would have to be rewritten to avoid infringing the copyright.

In the microchip industry, as an example, Intel and AMD have had a weird relationship and many expensive lawsuits over just this sort of issue. Cyrix, on the other hand, used some unusual reverse-engineering practices to ensure a high degree of interoperability while (in theory) avoiding such legal issues around the x86 instruction set that PC folk have come to love and adore. Litigation with Intel had a lot to do with Cyrix's eventual downfall, although the core issues of patent and copyright infringement were never fully resolved legally.

The upshot of this is that I sincerely doubt it will be possible for Heritage to enforce any algorithm rights they assert. The contest agreement is effectively a non-disclosure agreement, and Heritage will have to treat the entry as a copyrighted work and the underlying algorithm as a trade secret or something similar. Even then, the process by which the algorithm gets developed will almost certainly be open for discussion, and if something novel is developed then similar, but non-infringing, algorithms will almost certainly pop up without any need to cry foul on the prize agreements. The competition itself will show just how high the bar can be set and, hopefully, as with the Netflix Prize (a similar data competition), collaboration will improve results both inside and outside the Heritage environment.

Hopefully I'm not being idealistic there; I certainly don't have any legal background, and I realize that if Heritage wants to bully people about, they probably can -- for $3 million, again, I can't see how anyone can complain... In fact, for me, Heritage is probably in a position to do much more good with such an algorithm than I could on my own, even trying to evangelize its use for free, so it may just be a win-win for everyone. Still, you can always just avoid joining in the first place if this concerns you.

Wish us luck!

Filed under: Data