The Reverse Quine: Making Web services transparent

How much can Webserver-hosted code ever be trusted? How can we help people identify trustworthy Web services?

While it seems like everyone running an Internet-based business goes to great lengths to make the workings of its software as obscure and secretive as possible, a few of us are trying to do the opposite.
Luckily for those of us who recognize that without openness and honesty there cannot be trust, it is becoming increasingly obvious to everyone that the unintended consequences of DRM are unacceptable, especially given that it cannot reasonably work anyway.
Achieving 100-percent effective secrecy about the workings of a system that people need to be able to actually use on their own computers is exceedingly difficult--perhaps impossible, even in theory. One might therefore expect things to get easier once the attempt to keep things secret is abandoned.
It actually does get easier, too, at least for a while. There is a point of diminishing returns, however, as one starts trying to enforce greater transparency.
Things are fairly good in this realm when it comes to locally installed software. It is easy enough to provide someone with the source code to an application and let him compile, install, and run it, knowing that the source code you gave him and allowed him to examine with his own eyeballs is the very source used to produce the binary executable file that is actually used whenever he runs the program.
Many software vendors try to gain the benefit of trust that can be had this way without actually being fully open and trustworthy, of course, as in the case of many corporations' policies of making source code available to clients.
They give the client a copy of "the source code" for the software, minus the bits related to technical license enforcement of course--because they still believe obscurity is security. Next, they hand the client a binary executable (with nothing missing) and leave the client with the impression that this binary executable is made from the source code that was just provided.
There is a problem, though: the client has no way to verify that the source code that was just examined bears any relation to what was used to compile that software. At least open source projects offer the source code directly for anyone to compile on his or her own.
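The kind of check the client is missing can be sketched roughly as follows: build the supplied source yourself, then compare a cryptographic digest of your build against the binary you were handed. This is only a minimal sketch under strong assumptions. The function names and file paths are hypothetical, and a match is only possible if the build is deterministic (same compiler, flags, and environment) and the supplied source is complete, which it is not in the "minus the license enforcement bits" case described above.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def binaries_match(vendor_binary: Path, local_build: Path) -> bool:
    """True only if the two files have identical contents (same digest)."""
    return sha256_of_file(vendor_binary) == sha256_of_file(local_build)
```

A mismatch does not prove bad faith, since many toolchains embed timestamps or paths in their output, but a match is strong evidence that the binary really came from the source that was examined.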
A problem I have been contemplating for a long time now is how to achieve maximum transparency for a provided service. There are times, increasingly so as we move into an ever-more networked future, when it is more reasonable for the software someone uses to be hosted somewhere on someone else's server. This is especially the case with many Web technologies, such as social networking sites like Facebook, Web search engines like Google and social news sites like reddit.
How does one get the benefits of any of these three types of Web application with an application installed on the ThinkPad sitting on one's desk? It is, generally speaking, simply better to offer their functionality as a Web application. Sometimes even software that could be installed locally, or on one's own Web server, and still do its job is better used as a service offered by some third-party provider. A Weblog application is one example: for someone who has neither the technical skills to maintain it himself nor the money to hire someone else to do it, the goal of achieving the same performance and effectiveness as the professionals can prove worse than elusive.
If you wish to provide some kind of Web-based service to others, the question of trust may come up. How can the user verify the trustworthiness of your code? There really is no way to do it in a manner equivalent to the way the standard open source projects do with their offer of source code the user can compile and use at home--or, at least, not without risking serious security issues on the server.
If you let just anyone install your software on your server in a way that lets them verify that they are getting exactly what they expect, you must provide them direct, substantial control over the server. Unless the service you are providing is virtual server accounts or dedicated server hosting, this is probably not a good thing. It seems that the closest you can reasonably get is to try the "source available" approach, showing everyone "the source" and hoping they trust it is the exact source used to set up the service.
Perhaps some combination of intentionally failing to protect Webserver directories from unwanted browsing and actually using a distributed version control system from which people can arbitrarily check out your source code for the working system can help.
Of course, this only works if all the code you use is interpreted rather than precompiled to a binary executable. Worse, you then have to contend with the problem that users of the service may not trust your directory browsing and version control system to offer access to anything but a clever ruse.
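For what it is worth, the comparison a skeptical user could make between the two views is easy to sketch: download the browsable directory, check out the repository, and compare the files by digest. The code below is a minimal sketch under assumptions of my own. The function names are hypothetical, both trees are assumed to already sit on local disk, and version control metadata (such as a `.git` directory) is assumed to have been excluded from the checkout.

```python
import hashlib
from pathlib import Path


def tree_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            manifest[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def differing_paths(published: Path, checkout: Path) -> list[str]:
    """List paths that are missing from one tree or differ in content."""
    a = tree_manifest(published)
    b = tree_manifest(checkout)
    # A path missing from one side yields None there, so it never matches.
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty result shows only that the two published views agree with each other. It says nothing about whether either one is the code actually executing on the server, which is exactly the "clever ruse" worry.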
I have taken to calling this problem the Reverse Quine. A quine is a program named after Willard Van Orman Quine, who formulated Quine's Paradox:
"Yields falsehood when preceded by its quotation" yields falsehood when preceded by its quotation.
A quine program, as opposed to Quine's Paradox, is software that takes no input and produces a copy of its own source code as its sole output. Reading in the contents of the program file itself to output those contents is generally considered cheating, so that the programmer must ensure the program produces its output programmatically, entirely from within the executing program in memory.
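To make the definition concrete, here is a minimal sketch in Python, a language chosen purely for illustration. The two-line program stored in `QUINE` is a well-known style of quine that builds its output from a format string rather than reading its own file. The helper names `run_and_capture` and `is_quine` are hypothetical, and exist only to check the defining property: running the program prints exactly its own source.

```python
import contextlib
import io

# The quine itself, stored as text so it can be checked. Written to a file,
# it would be these two lines:
#     s = 's = %r\nprint(s %% s)'
#     print(s % s)
QUINE = "s = 's = %r\\nprint(s %% s)'\nprint(s % s)\n"


def run_and_capture(source: str) -> str:
    """Execute Python source in a fresh namespace and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


def is_quine(source: str) -> bool:
    """True if the program's output is exactly its own source text."""
    return run_and_capture(source) == source
```

The trick is that `%r` inserts the quoted, escaped form of the string into itself, so the program never has to open its own file.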
What I would like to achieve, with verifiably trustworthy Web services, is in some respects the opposite of the problem of developing a quine. Rather than having to figure out how to make a program output its complete source code and only its complete source code without touching the program's file, I need to figure out how to let people view and execute the contents of a file without allowing them to touch the operation of the program itself, or to affect the context in which it operates.
Without dragging this out any more than necessary, the whole problem boils down to this:
How can I possibly make the code that runs on a Webserver to provide a Web-based service verifiably trustworthy--without compromising the security of the system itself by giving visitors to the site a way to actually alter the running software's operation?