Each year, the ACH and ALLC send forms out to the reviewers of the abstracts
that are submitted to the annual conference. These forms ask the reviewers
to identify themselves and the abstract, to rate their relevant expertise and
their certainty of judgement, and then to give numeric rankings of different aspects
of the abstract (significance, technical knowledge, rigour, clarity,
references, etc.). Finally, there is space for comments to the author and to
the program committee. Until this year, the forms were sent out and returned
via email, but the information on them was not rigidly structured. Program
committees and local organizers "processed" them by hand.
Since the process was already electronic, this year we decided to automate
part of the tabulation process. To this end, we implemented a database
accessible through the WWW, so that the program committee could
view and sort reviews according to different criteria, as well as see all
the comments. As our conference does not traditionally have a program
committee meeting to decide on papers, this would allow the whole program
committee to see all the reviews. The program committee could collectively
decide which papers were important to discuss and re-read, and would also be
able to contextualize their discussion with respect to all rankings. This
approach dovetailed extremely well with the decision of the local organizers
to put all paper abstracts online. Now, all reviewers could look at all
papers, and papers and reviews could be linked for the program committee.
Our solution
The program committee felt that email was still the best way to communicate
with reviewers, and did not want to require reviewers to have WWW forms capability.
Therefore, we decided to continue with distributing and collecting review
forms by email. We created a structured review form, which contained
substantially the same information as those from previous years, but used an
SGML like "pointy bracket" syntax to identify the different fields. We asked
reviewers to be careful not to delete any of the tags as they filled in
their reviews. As we shall see later, the creativity of the ACH/ALLC
reviewers proved this to be our biggest mistake.
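For illustration, a fragment of such a tagged form might have looked roughly
like the sketch below; the tag names and the rating scale are invented for
this example rather than copied from the actual form:

  <PAPER-NUMBER> 42 </PAPER-NUMBER>
  <REVIEWER-NAME> Jane Scholar </REVIEWER-NAME>
  <EXPERTISE> 3 </EXPERTISE>
  <SIGNIFICANCE> 7 </SIGNIFICANCE>
  <CLARITY> 8 </CLARITY>
  <SUMMARY> 7 </SUMMARY>
  <COMMENTS-TO-AUTHOR>
  A promising proposal, but the description of the encoding needs more detail.
  </COMMENTS-TO-AUTHOR>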
The specifications for the database were that it should create a record for
each review, and be able to display reviews by paper number and by Summary
score (a score assigned by the reviewer). It also had to be able to keep
track of reviewers, so that it was possible to learn any reviewer's average
review scores, but to keep reviewers' names hidden from database users. As
implemented, an overview displayed the paper number, followed by the average
Summary score, as well as each Summary score that the abstract had received.
This provided both ranking and an analytical view, at a glance. From this
view, it was possible to read the abstract itself, see each complete review,
or just see all the comments from all the reviewers for each abstract. We
also wanted a reviewer module that would show the average and analytical
scores for each (anonymous) reviewer. This would let the program committee
decide that a score of 6 from a reviewer who assigned only 3s elsewhere was
stronger support than an 8 from a reviewer who assigned all 8s.
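As a rough illustration of the tallying involved, the sketch below (plain
Perl with hard-coded records, not the SyPerl code we actually ran against the
database) computes the per-paper and per-reviewer averages from a set of
(paper, anonymous reviewer, Summary score) records:

  #!/usr/bin/perl -w
  use strict;

  # Each record: [ paper number, anonymous reviewer id, Summary score ].
  # In the real system these rows come from the database; they are
  # hard-coded here purely for illustration.
  my @reviews = (
      [ 12, 'R1', 6 ],
      [ 12, 'R2', 8 ],
      [ 17, 'R1', 3 ],
      [ 17, 'R3', 8 ],
  );

  my (%paper_sum, %paper_n, %rev_sum, %rev_n);
  for my $r (@reviews) {
      my ($paper, $reviewer, $score) = @$r;
      $paper_sum{$paper} += $score;   $paper_n{$paper}++;
      $rev_sum{$reviewer} += $score;  $rev_n{$reviewer}++;
  }

  # Overview: paper number, average Summary score, individual scores.
  for my $paper (sort { $a <=> $b } keys %paper_sum) {
      my @scores = map { $_->[2] } grep { $_->[0] == $paper } @reviews;
      printf "Paper %d: average %.1f (scores: %s)\n",
          $paper, $paper_sum{$paper} / $paper_n{$paper}, join(', ', @scores);
  }

  # Reviewer module: average per anonymous reviewer, so that a 6 from a
  # habitual 3-giver can be weighed against an 8 from a habitual 8-giver.
  for my $reviewer (sort keys %rev_sum) {
      printf "Reviewer %s: average %.1f over %d reviews\n",
          $reviewer, $rev_sum{$reviewer} / $rev_n{$reviewer}, $rev_n{$reviewer};
  }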
We used Sybase as our database and SyPerl as the CGI scripting language. We
also had to write verification and cleaning routines to process the email
review forms that we received before they could be loaded. The database worked
as described, except for the reviewer overview, which we did not have time to
implement.
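The checking routines were essentially pattern matching over the tagged
fields. A minimal sketch of the idea, assuming hypothetical tag names and a
1-10 score scale, might look like this:

  #!/usr/bin/perl -w
  use strict;

  # Tags that every review form must contain (names are hypothetical).
  my @required = qw(PAPER-NUMBER REVIEWER-NAME SUMMARY);

  local $/;                    # slurp the whole emailed form from STDIN
  my $form = <STDIN>;

  my (%field, @problems);
  for my $tag (@required) {
      if ($form =~ /<$tag>(.*?)<\/$tag>/s) {
          ($field{$tag} = $1) =~ s/^\s+|\s+$//g;    # trim whitespace
          push @problems, "$tag is empty" if $field{$tag} eq '';
      }
      else {
          push @problems, "$tag tag missing or deleted";
      }
  }

  # The Summary score must be a whole number from 1 to 10 (assumed scale).
  if (defined $field{'SUMMARY'} && $field{'SUMMARY'} !~ /^(?:[1-9]|10)$/) {
      push @problems, "SUMMARY score '$field{'SUMMARY'}' is not a number from 1 to 10";
  }

  if (@problems) {
      print map { "PROBLEM: $_\n" } @problems;
  }
  else {
      print "Form OK: paper $field{'PAPER-NUMBER'}, Summary $field{'SUMMARY'}\n";
  }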
Results
When it was finally in place, the whole system worked well, and we had
generally positive feedback from the program committee. Most of their
complaints had to do with the speed of the connection, especially from
Europe (they asked for a mirror site), and with inconsistencies in the data,
which are discussed below. They felt that it was useful to be able to scan
through all the reviews and to see the paper rankings. We were also able to
discuss and re-review easily the crucial group of papers in the middle, those
that are neither clear acceptances nor clear rejections. In the past, these
were usually singled out by the program chair or the local organizers, and
sent out for discussion to the program committee. We all felt that we
wanted more time to work with the database of reviews and to make our
decisions (the last reviews came in just a few days before the notification
deadline).
Because it compiles all the comments that reviewers write for each paper,
this system also makes it easier to mail the comments intended for the
authors out to them once the reviewing process is over.
Problems
Almost all the serious problems we encountered as we loaded and tried to use
this reviewing aid were due to inconsistencies and mistakes in the data. It
became abundantly clear, from the very first reviews that came back, that
email forms were not a good idea. The most prominent mistakes were that
reviewers filled in incorrect or invalid information for the name and number
of the paper and the rankings; they forgot or refused to fill in some
fields; and they deleted tags by mistake as they typed. We were able to
write checking routines to catch some of these mistakes, but we kept finding
new problems as we loaded the database and viewed the information on the
web. The only comfort was that reviewers tended to make the same mistake
consistently, so once a problem was found it was possible to eradicate it
from all the reviews it appeared in.
The next version
The next version, currently being worked on, uses WWW forms for submitting
reviews. The paper name and number and the reviewer's email (used for
identification) can therefore be entered automatically, and data can be
validated on submission. It also obviates the need for loading the database
by hand, so that the program committee can start to think about papers as
soon as the reviews start to arrive, rather than waiting while reviews are
loaded in batches.
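A minimal sketch of what such a submission handler could look like, using the
standard CGI module with illustrative parameter names and an assumed 1-10
score range (the actual scripts differ), is:

  #!/usr/bin/perl -w
  use strict;
  use CGI;

  my $q = CGI->new;

  my $paper    = $q->param('paper_number');    # pre-filled by the form
  my $reviewer = $q->param('reviewer_email');  # used for identification
  my $summary  = $q->param('summary_score');

  my @errors;
  push @errors, "missing or invalid paper number"
      unless defined $paper && $paper =~ /^\d+$/;
  push @errors, "missing reviewer email address"
      unless defined $reviewer && $reviewer =~ /\@/;
  push @errors, "Summary score must be a number from 1 to 10"
      unless defined $summary && $summary =~ /^(?:[1-9]|10)$/;

  print $q->header('text/html');
  if (@errors) {
      # Validation happens at submission time, so the reviewer can fix
      # the form immediately instead of someone cleaning it up later.
      print "<p>Please correct the following:</p><ul>",
            (map { "<li>$_</li>" } @errors), "</ul>";
  }
  else {
      # In the real system the review is inserted into the database here,
      # so the program committee can see it as soon as it arrives.
      print "<p>Thank you; your review of paper $paper has been recorded.</p>";
  }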
Having put a fair amount of work into the system, we felt that it would be
satisfying to have it used for more than one conference. We are therefore
generalizing the database definition and Perl scripts, so that someone who
is moderately expert in large databases can customize it for their own
purposes. We are also moving from Sybase, which is a large commercial
package, to miniSQL, which is free to universities. The reviewing database
will be available from the STG website (http://www.stg.brown.edu/projects) later this summer.
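As a purely illustrative sketch of what a generalized database definition
might reduce to (the table and column names here are invented, and the
connection details are left as placeholders; the actual definition will come
with the distribution), consider:

  #!/usr/bin/perl -w
  use strict;
  use DBI;

  # Placeholder connection details; the appropriate DBD driver would be
  # used for Sybase or miniSQL respectively.
  my $dbh = DBI->connect($ENV{REVIEWDB_DSN}, $ENV{REVIEWDB_USER},
                         $ENV{REVIEWDB_PASS}, { RaiseError => 1 });

  # One row per paper.
  $dbh->do("CREATE TABLE papers (
      paper_no   INT,
      title      CHAR(200)
  )");

  # One row per review; the reviewer is kept as an opaque numeric id so
  # that averages can be computed without exposing names.
  $dbh->do("CREATE TABLE reviews (
      paper_no       INT,
      reviewer_id    INT,
      summary_score  INT,
      comments       CHAR(255)
  )");

  $dbh->disconnect;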
Conclusions
This review database is well tailored to the way that ACH/ALLC does paper
reviewing. We rely a great deal on numerical information, only later looking
at reviewers' comments. This kind of data is easy to tally and sort. It
makes the process of weeding out the definite acceptances and the definite
rejections much faster. Additionally, it allows a far-flung program
committee to work together much more closely, and to put the conference
together in a genuinely collaborative manner.
Samples
The poster will provide samples of the reviewing form and of the various
screens in a WWW browser, as used by the program committee. We will also
show screens from the new version.