PLDI 2016 is continuing the novel experiment that began at PLDI 2014: giving authors the opportunity to submit for evaluation any artifacts that accompany their papers. Similar experiments ran successfully for ESEC/FSE 2011 and 2013, ECOOP 2013 and 2014, OOPSLA 2013 and 2014, and POPL 2015.

Background

A paper consists of a constellation of artifacts that extend beyond the document itself: software, proofs, models, test suites, benchmarks, and so on. In some cases, the quality of these artifacts is as important as that of the document itself, yet most of our conferences offer no formal means to submit and evaluate anything but the paper. PLDI 2016 has created an Artifact Evaluation Committee (AEC) to remedy this situation.

Goals

The AEC’s goal is twofold: to reward and to probe. The primary goal is to reward authors who take the trouble to create useful artifacts beyond the paper. The software tools that accompany a paper sometimes take years to build; authors who go to this trouble should be rewarded for setting high standards and creating systems that others in the community can build on. The secondary goal is to probe: authors sometimes take liberties in describing the status of their artifacts, making claims they would temper if they knew the artifacts were going to be scrutinized. The prospect of evaluation therefore leads to more accurate reporting.

The organizers of the AEC hope that eventually, the assessment of a paper’s accompanying artifacts will guide the decision-making about papers: that is, the AEC will inform and advise the Program Committee (PC). This would, however, represent a radical shift in our conference evaluation processes; the organizers of the PLDI 2016 AEC would rather proceed gradually. Thus, at PLDI 2016, artifact evaluation is optional, and authors choose to undergo evaluation only after their paper has been accepted.

Criteria

The evaluation criteria are simple. A paper sets up certain expectations of its artifacts based on its content. The AEC will read the paper and then judge how well the artifact matches these expectations. Thus the AEC’s decision will be that the artifact does or does not “conform to the expectations set by the paper.” Ultimately, the AEC expects artifacts to be:

  • consistent with the paper,
  • as complete as possible,
  • documented well, and
  • easy to reuse, facilitating further research.

Benefits

The dissemination of artifacts benefits our science and engineering as a whole. Their availability improves reproducibility and enables authors to build on each other’s work. It can also help resolve, with less ambiguity, questions about cases not considered by the original authors.

Beyond helping the community as a whole, the evaluation and dissemination of artifacts confers several direct and indirect benefits on the authors themselves. The most direct benefit is, of course, the recognition that the authors accrue. But the very act of creating a bundle that can be used by the AEC also confers important benefits:

  • The same bundle can be distributed to third parties.
  • A reproducible bundle can be used subsequently for later experiments (e.g., on new parameters).
  • The bundle simplifies subsequent re-executions of the system when, say, having to respond to a journal reviewer’s questions.
  • The bundle is more likely to survive being put in storage between the departure of one student and the arrival of the next.

However, creating a bundle that has all these properties can be onerous. Therefore, the AEC process described below does not require an artifact to have all of them. It offers a route to evaluation that confers fewer benefits for vastly less effort.

Process

To maintain a wall of separation between paper review and the artifacts, authors will only be asked to upload their artifacts after their papers have been accepted. Of course, they can (and should!) prepare their artifacts well in advance, and can provide the artifacts to the PC through unofficial URLs contained in their papers, as many authors already do.

The authors of all accepted papers will be asked whether they intend to have their artifact evaluated and, if so, to upload the artifact. They are welcome to indicate that they do not. Since we anticipate small glitches with installation and use, the AEC reserves the right to send a one-time message to the authors requesting clarification. Authors can submit a one-time response, focusing solely on the questions of the AEC; we do not impose a word limit (since, e.g., a code attachment may be needed), but strongly suggest that the prose be no longer than 1000 words. Based on these inputs, the AEC will complete its evaluation and notify authors of the outcome. Authors are welcome to ignore the feedback or to include it in their paper as they see fit (as a footnote, a section, etc.).

The conference proceedings will include a discussion of the artifact evaluation process. Papers with artifacts that “meet expectations” may indicate that they do with the following badge (courtesy Matthias Hauswirth):

[Image: PLDI 2015 Artifact Evaluated Badge]

Artifact Details

To avoid excluding some papers, the AEC will accept any artifact that authors wish to submit. These can be software, mechanized proofs, test suites, data sets, and so on. Obviously, the better the artifact is packaged, the more likely it is that the AEC can actually work with it.

In all cases, the AEC will accept a video of the artifact in use. Such videos may include screencasts of the software being run on the examples in the paper, traversals of models using modeling tools, stepping through a proof script, etc. The video is, of course, not a substitute for the artifact itself, but accepting it provides an evolutionary path that imposes minimal burden on authors.

Submission of an artifact does not imply tacit permission to make its content public. AEC members will be instructed that they may not publicize any part of your artifact during or after completing evaluation, nor retain any part of it after evaluation. Thus, you are free to include models, data files, proprietary binaries, and similar items in your artifact. The AEC organizers strongly encourage you to anonymize any data files that you submit.

Membership

The AEC consists of about two dozen members. Other than the chairs, all members are senior graduate students, postdocs, or recent graduates, identified with the help of current, active researchers.

Qualified graduate students are often in a much better position than many researchers to handle the diversity of systems expectations that the AEC will encounter. In addition, graduate students represent the future of the community, so involving them in the AEC process early will help push this process forward.

The AEC chairs devote considerable attention to both mentoring and monitoring, helping to educate the students on their responsibilities and privileges.

HOW-TO

This HOWTO document, written by a group that has both reviewed and won Distinguished Artifact Awards, provides useful advice on how best to prepare your artifact for review. We highly recommend that you read it.

[Last modified Sat Jul 18 2015.]

This page provides guidelines on how to package artifacts for submission. The AEC chairs are open to possibilities that are not considered below. However, in cases where artifacts ought to fit these options, authors are strongly encouraged to follow the guidelines.

TL;DR

Create a submission with the same title as the paper. In the submission form fields, be sure to include:

  • the paper’s abstract (to help with bidding)
  • a URL pointing to the artifact web site – include this in the abstract section

Note that there is no “paper” to submit via the submission form.

If the artifact web site is access-protected, then in the appropriate form field, enter the credentials needed to access the URL.

At the artifact web site, give the AEC access to:

  • the artifact – preferably packaged as a virtual machine image
  • the (conditionally) accepted version of the paper
  • instructions

See below for additional details.

Artifact Submission

Irrespective of the nature of the artifact, authors should create a single Web page (whether on the authors’ site or a third-party file upload service) that contains the artifact, the paper, and all necessary instructions.

The artifact itself should be complete and packaged in a way that makes it as easy as possible for the Artifact Evaluation Committee (AEC) members to inspect, understand, and—if applicable—execute the artifact.

The artifact must be accompanied by the (conditionally) accepted version of the paper.

The artifact must also be accompanied by suitable instructions and documentation, to save committee members the burden of reverse-engineering what the authors intended. (For example, a tool without a quick tutorial is generally very difficult to use. Similarly, a dataset is useless without some explanation of how to browse the data.) If it would be helpful, please feel free to include a video that demonstrates the artifact running or explains how it should be run.

Please state concretely what claims you are making about the artifact, if these differ from the expectations set up by the paper. This is a place where you can tell the AEC about difficulties we might encounter in using the artifact, or about its maturity relative to the content of the paper. The AEC will still evaluate the artifact relative to the paper, but this helps set expectations up front, especially in cases that might frustrate the reviewers without prior notice.

Some artifacts may attempt to perform malicious operations by design. These cases should be boldly and explicitly flagged in detail in the instructions so that AEC members can take appropriate precautions before installing and running these artifacts.
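As a rough aid, the three required pieces described in this section (the artifact, the accepted paper, and the instructions) can be checked mechanically before uploading. The sketch below is only an illustration: the file names in REQUIRED are hypothetical placeholders, not names the AEC mandates, and the function missing_components is our own invention.

```python
from pathlib import Path
import tempfile

# Hypothetical file names for illustration only; the AEC does not prescribe them.
REQUIRED = {
    "artifact": ("artifact.ova", "artifact.zip", "artifact.tar.gz"),
    "paper": ("paper.pdf",),
    "instructions": ("README.txt", "README.md", "README.html"),
}


def missing_components(bundle_dir):
    """Return the required components for which no candidate file exists."""
    root = Path(bundle_dir)
    missing = []
    for component, candidates in REQUIRED.items():
        if not any((root / name).is_file() for name in candidates):
            missing.append(component)
    return missing


# Demo on a throwaway directory that contains only the paper.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "paper.pdf").write_bytes(b"")
    print("Missing components:", missing_components(tmp))
```

A check like this only confirms that the files exist; it says nothing about whether the instructions are usable, which still needs a human reader.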

Packaging Artifacts

Authors should strongly consider one of the following methods to package the software components of their artifacts (though the AEC is open to other reasonable formats as well).

A virtual machine image containing the software application, already set up in the intended run-time environment

For example:

  • For a compiled application, the VM would contain the program executable(s) and all of the necessary shared libraries.
  • For a mobile phone application, the VM would have a phone emulator installed.
  • For mechanized proofs, the VM would have the actual version of the theorem prover used.
  • For raw data, the VM would contain the data and the scripts used to analyze it.

We recommend using VirtualBox because it is freely available on several platforms.

Virtual machine images are the preferred option for packaging software artifacts. A VM image avoids installation and compatibility problems, and it provides the best guarantees for reproducibility of the results by the committee. Authors using Linux might find the CDE tool useful for automatically creating a VM image from existing software without needing to manually re-install all dependencies within a VM.

A binary installable package

We invite the authors to use CDE (Linux) or MSI Installer (Windows) to package the binary application.

A live instance running on the web

See, for example, the demo at the Herbie web site (from PLDI 2015).

A detailed screencast of the tool along with the results

Consider this option if one of the following special cases applies:

  • the application needs proprietary/commercial software that is not easily available or cannot be distributed to the committee;
  • the application requires significant computation resources (e.g., more than 24 hours of execution time to produce the results).

An installation or update repository for the tool

For example, an Eclipse update site or a Debian APT repository. We urge the authors to test their repository on standard environments to spare committee members installation problems.

Multiple Formats

If you wish, you may provide your artifacts in more than one form. For example, as an accompaniment to your ready-to-run VM image, you might choose to include a copy of your application’s source code and build scripts. We strongly discourage source-only artifacts because of the extra work they cause for the Artifact Evaluation Committee and the extra risk that they pose to the evaluation process. However, source code is an excellent companion to a ready-to-run artifact, because it can help to answer questions that might arise.

Non-code artifacts should preferably be in open document formats. For documents and reports, we invite the authors to use PDF, ODF, or RTF. We invite the authors to submit experimental data and results in CSV (preferred option), JSON, or XML file formats. In the special case that authors need to submit non-standard data formats, they should also provide suitable readers.
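For experimental data, a minimal sketch of exporting results to CSV (the preferred format) and reading them back is shown here. The column names and the two rows are placeholders rather than real measurements, and the helper functions write_results and read_results are hypothetical.

```python
import csv
import io

# Placeholder rows for illustration; these are not real measurements.
FIELDS = ["benchmark", "runtime_seconds", "trials"]
rows = [
    {"benchmark": "example-a", "runtime_seconds": 1.25, "trials": 10},
    {"benchmark": "example-b", "runtime_seconds": 3.5, "trials": 10},
]


def write_results(rows, stream):
    """Write rows as CSV with a header row so each column is self-describing."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_results(stream):
    """Read the CSV back into a list of dictionaries (all values are strings)."""
    return list(csv.DictReader(stream))


# Round trip through an in-memory buffer; a real artifact would use a file.
buffer = io.StringIO()
write_results(rows, buffer)
buffer.seek(0)
loaded = read_results(buffer)
print(loaded[0]["benchmark"], loaded[0]["runtime_seconds"])
```

Putting units in the column names (as in runtime_seconds) saves reviewers from guessing. A reader supplied for a non-standard format could have the same shape as read_results.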

Confidentiality

In all cases, authors should make a genuine effort not to learn the identity of the reviewers. This may mean turning off “call home” features or analytics, or only using systems with high enough usage that AEC accesses will not stand out.

We ask that, during the evaluation period, you not embed any analytics or other tracking in the web site for the artifact or, if you cannot control this, that you not access this data. This is important for maintaining the confidentiality of reviewers. If for some reason you cannot comply with this, please notify the chairs immediately.
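One way to turn off a “call home” feature in a tool is to guard it behind a switch that defaults to off. The following is a sketch under stated assumptions: the environment variable name ARTIFACT_DISABLE_TELEMETRY and both functions are hypothetical, and your tool may need a different mechanism.

```python
import os

# Hypothetical switch name; choose whatever fits your tool.
DISABLE_VAR = "ARTIFACT_DISABLE_TELEMETRY"


def telemetry_enabled(environ=None):
    """Return True only if the switch is explicitly set to a false value.

    When the variable is unset, telemetry is treated as disabled, so a
    reviewer who does nothing special is never reported.
    """
    env = os.environ if environ is None else environ
    value = env.get(DISABLE_VAR, "1").strip().lower()
    return value in {"0", "false", "no"}


def report_usage(event, send, environ=None):
    """Call send(event) only when telemetry is enabled; return whether it ran."""
    if not telemetry_enabled(environ):
        return False
    send(event)
    return True


# Demo: with an empty environment nothing is sent.
sent = []
report_usage("tool-started", sent.append, environ={})
print("events sent:", sent)
```

Defaulting to disabled means the version submitted to the AEC needs no extra configuration to protect reviewer confidentiality.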

Conflicts

If you have a conflict of interest with anyone on the committee (including the AEC chairs), please indicate this both in the artifact submission system and on your artifact’s web site.

If one of the authors of your paper is an AEC chair, then you may not submit an artifact. You can, however, indicate in your paper that you were unable to submit an artifact due to the conflict of interest.

Final Advice

Regardless of the packaging method, authors are strongly encouraged to beta-test their instructions by getting a friend to follow them before submitting them to the AEC. This is a simple way to find the kinds of inconsistencies and omissions that are inevitably present in any set of instructions that has never been followed.