Sunday, April 03, 2011

Privacy and Transparency at the University

The Mackinac Center, a think tank, has filed a broad Freedom of Information Act request for the emails of professors of labor studies at three universities run by the state of Michigan. This appears to be nothing more than political harassment, similar to the harassment of climate researcher Michael Mann by Virginia's Attorney General, Ken Cuccinelli, following the "Climategate" brouhaha. It is specifically to avoid this type of political pressure that we have the notion of "academic freedom", and that universities jealously guard their independence from the state.

The heavy use of the Internet by university professors has clearly opened them up to new forms of encroachment by political actors. Universities should dedicate some serious thought to how they manage their data, so as to keep private communications private, and properly document and release any information that should be made public.

I propose that, by default, all internal communications of university staff and students should be considered private, and should be handled in a way that maintains confidentiality. To accomplish this, university IT departments should develop encryption standards for individual email accounts and encourage their universal adoption. This is not a terribly difficult technical issue, as strong encryption systems, such as Pretty Good Privacy, have already been developed and deployed. One of the big hurdles to adopting PGP encryption is establishing a network of users with trusted encryption keys; universities are in a perfect position to accomplish this.

This idea of secure communication within universities probably scares a number of people; I've repeatedly heard mumblings about universities being sinister, oppressive forces in society (for example, see the first comment on this blog post at Bleeding Heart Libertarians). You don't have to be an anti-intellectual conspiracy theorist to insist that universities develop a high level of transparency. Universities provide some important public services where the quality of the final product (e.g. research results, student certification) cannot be easily evaluated without knowledge of the process by which it was produced. To maintain public trust, universities should develop a process that provides relevant information to anyone with a legitimate interest.

Documents relating to student evaluation should be available, both to administrators and to each student or their representative. If encrypted emails are among these documents, the student should keep a copy, and perhaps the university could keep a copy of the student's encrypted emails (at least, any from a professor), which could be recovered if the student provides their decryption key. One nice side-effect of widespread encryption would be widespread signing of electronic documents (using the same key pair), so that if a document is deemed relevant to an accusation, its authenticity can be easily validated.

Finally, we have the raw data that goes into research publications. The issues here are complicated, and many extend beyond individual universities. For instance, data accessibility has been a major source of contention in climate research, but much of the raw data is treated as a commercial asset by non-academic institutions, so there is little that universities can do. Additionally, scientists will always hesitate to release data until they have had a chance to analyze it themselves. Each field of research probably has to develop its own process for making raw data accessible. For instance, biologists have developed a massive database of DNA sequences (GenBank), and all major journals require that any sequence discussed in a publication be submitted to the database. There is currently a push to mandate the publication of the source code for any program used in an analysis. There has even been some frivolous dispute over accessibility to the raw data from traditional microbiological techniques (which are rarely digitized).*

The complete archiving of research data is an unreachable ideal, though it may become more common with the increasing automation of data collection. Every innovation in data collection and data storage will require researchers to develop new systems for archiving data, possibly leading to the loss of older data archived with obsolete systems. University IT departments (and perhaps librarians and archivists) may be able to provide resources that enable researchers to record their data in an accessible form, but ultimately the focus and extent of archiving and distribution will be determined by the value of the data to other researchers, not to curious laymen.

*Addition: There are also experiments relating to transparency in the peer-review process.
