By the NIPS 2006 Program Committee
With input from Andrew Ng, Peter Dayan, Daphne Koller, Sebastian Thrun, Bruno Olshausen, Yair Weiss, and Bernhard Schölkopf
In this informal essay, we describe some of the criteria that will be used to evaluate NIPS submissions. This document should not be construed as an official NIPS policy statement, but through it we also hope to give advice on writing a good NIPS paper.
We'll take for granted that your paper will be clearly written, be technically sound and correct, and reference previous work. Thus, we will not further dwell on the issues of clarity and soundness, despite their importance. We will instead focus on how one might shape a paper's content so as to maximize its chance of being published and influencing others.
A few notes:
A significant fraction of NIPS papers either describe or study artificial systems. This includes the majority of papers published under the following keywords: Bioinformatics; Clustering; Control and reinforcement learning; Dimensionality reduction and manifolds; Feature selection; Gaussian processes; Graphical models; Kernels; Learning theory; Machine vision; Margins and boosting; Monte Carlo methods; Neural networks; Other algorithms; Semi-supervised learning; Speech and signal processing; Text and language.
Examples of such papers may include: a paper proposing a new learning algorithm; one that describes a solution to a difficult application; or one that proves bounds on the error of some learning method.
Papers submitted with the keywords listed above are expected to make a significant (i) algorithmic, (ii) application, or (iii) theoretical contribution. NIPS seeks to publish papers that will have a high impact in the world, both within our research community and beyond. Whenever appropriate, papers will therefore be evaluated on the basis of the following five criteria:
Not all papers are expected to address all of these criteria, and a paper that is extremely strong on only one of them may well be acceptable for publication. For example, a learning theory paper that studies an existing algorithm may be reasonably expected to address only the last of these criteria.
However, in cases where the research can reasonably be expected to address more than one of the criteria above, a paper may have a better chance of acceptance if it does indeed address them. For example, a paper that gives an elegant mathematical derivation of a new algorithm (Criterion #1) may fare better if the algorithm is also shown through rigorous empirical evaluation to do well (Criterion #4), or is demonstrated on a real, non-trivial application (Criterion #3). This is because such experiments can build a significantly stronger case for the algorithm's actual utility. Similarly, a paper describing an impressive application of machine learning (Criterion #2 or #3) may fare better if, beyond reporting success, it further elucidates the structure of the problem or algorithm that made the application work, and thereby conveys insight (Criterion #5).
For empirical studies, a good result can lie along many different axes, all of which involve comparison against the best state-of-the-art algorithms. These axes may include: better accuracy, better ROC performance, faster runtime, lower memory usage, more general applicability, easier out-of-the-box usage, or much simpler code. If an algorithm does not excel along any of these axes, a reviewer may wonder why it is worth publishing at NIPS.
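To make this concrete, here is a minimal sketch of such a head-to-head evaluation, assuming Python with scikit-learn; the dataset, the baseline, and the stand-in for the proposed method are all illustrative placeholders, not a prescribed protocol:

    # Compare a proposed method against a strong baseline along two of the
    # axes above (accuracy and ROC performance), using cross-validation.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)  # stand-in for your real data

    models = {
        "baseline (logistic)": LogisticRegression(max_iter=5000),
        "proposed (stand-in)": GradientBoostingClassifier(),
    }
    for name, model in models.items():
        for metric in ("accuracy", "roc_auc"):
            scores = cross_val_score(model, X, y, cv=5, scoring=metric)
            print(f"{name:20s} {metric:8s} "
                  f"{scores.mean():.3f} +/- {scores.std():.3f}")

Reporting means together with variability across folds, as above, is one modest step toward the kind of rigorous empirical evaluation the criteria ask for.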
Although NIPS strongly encourages interdisciplinary work that spans multiple keywords, we now also describe some evaluation criteria that are more specialized and may apply only to individual keywords.
Algorithmic papers (e.g., Clustering, Dimensionality reduction and manifolds, Feature selection, Gaussian processes, Graphical models, Kernels, Margins and boosting, Monte Carlo methods, Neural networks, Other algorithms, Semi-supervised learning): Authors of papers that propose new algorithms for well-established, existing problems are encouraged to provide evidence for the practical applicability of their methods, such as through rigorous empirical evaluation on real data or on real problems. For example, a paper about a new mathematical trick (or about a beautiful new mathematical derivation) would be stronger if it is supported by empirical evidence that the resulting algorithm really helps on a problem. We also encourage submission of papers that describe algorithmic or implementation principles that may have a large impact on applications or on practitioners of machine learning.
Control and reinforcement learning: Authors of papers that propose new algorithms for existing problems (such as solving MDPs) are encouraged to provide rigorous empirical evaluation of their methods on real problems, and to show their relevance to real, difficult decision-making or control tasks. For example, rather than demonstrating your idea only on a grid world or on mountain car, also show whether it works on a more challenging task. The other comments for algorithmic papers above also apply here.
Learning theory (which may also carry appropriate algorithmic keywords): Any learning theory paper should have a theorem about learning and a proof. Leaving out the proof is not an option in a double-blind setting! Several styles of papers exist:
Technical difficulty or novelty is not the goal; impact on the process and practice of learning is the goal. Experimental results are nice but not, in general, necessary.
Applications, such as text, bioinformatics, or others: Application papers should describe your work on a real as opposed to hypothetical application; specifically, they should describe work that has direct relevance to, and addresses the full complexity of, solving a non-trivial problem. Authors are also encouraged to convey insight about the problem, algorithms, and/or application. For example, one might describe the more general lessons learned, or elucidate (through an ablative analysis/lesion analysis, which removes one component of an algorithm at a time; a sketch follows this paragraph) which components of the system were key to getting the application to work. A NIPS application paper should be comparable in quality to a paper in the corresponding application-domain conference: for example, a text paper should be acceptable to SIGIR, EMNLP, or another appropriate conference.
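As one illustration of what such an ablative analysis can look like, here is a minimal sketch, assuming Python with scikit-learn; the three-stage pipeline below is a hypothetical stand-in for the components of a real application system:

    # Lesion analysis: drop one pipeline component at a time and measure
    # how much cross-validated accuracy degrades relative to the full system.
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)  # stand-in for your application data

    # The full system; each named step is a candidate for ablation.
    steps = [("scale", StandardScaler()),
             ("pca", PCA(n_components=32)),
             ("clf", SVC())]

    full = cross_val_score(Pipeline(steps), X, y, cv=5).mean()
    print(f"full system: {full:.3f}")

    for name, _ in steps[:-1]:  # keep the final classifier in every variant
        ablated = Pipeline([(n, s) for (n, s) in steps if n != name])
        score = cross_val_score(ablated, X, y, cv=5).mean()
        print(f"without {name!r}: {score:.3f} (delta {score - full:+.3f})")

A large drop when a component is removed is evidence that the component was key; a negligible drop suggests it could be simplified away.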
Application papers should not only present concrete application results, but should also contain at least one of the following elements:
Machine vision: Authors of vision papers are encouraged to provide rigorous empirical evaluation of their methods, demonstrating value added not just on a few selected images but more broadly. Ideally, a NIPS paper proposes a machine learning algorithm or system that can be used by a computer vision researcher to help solve a difficult computer vision problem. NIPS papers in this area should be comparable in quality to those accepted at the major computer vision conferences, such as ICCV or CVPR.
Speech and signal processing: As with computer vision, a NIPS paper should solve a difficult audio, speech, or other signal processing problem via machine learning, and be useful to a signal processing practitioner. The quality bar for NIPS is higher than that of a typical signal processing conference (such as ICASSP or ICIP): NIPS papers are 30% longer, the reviews are more detailed, and the acceptance rate is about half. Therefore, a NIPS signal processing paper should be more significant than the average ICASSP paper.
Hardware technology: In addition to describing a successful implementation, a NIPS hardware paper should also convey insight into the underlying principles behind your implementation, principles that serve as useful lessons for non-hardware researchers such as computer scientists or neurobiologists.
A significant fraction of NIPS papers, comprising mainly those submitted under the Neuroscience, Biological vision, or Cognitive Science keywords, either describe or study natural systems. Examples include a paper proposing a new model of human decision making, a paper describing evidence for a neural code, and so on.
Papers submitted under the keywords listed above should make significant contributions to the computational, psychological, and/or neural understanding of an important biological and/or behavioral system or function. Such papers will be evaluated on the basis of some or all of the following seven criteria:
Neuroscience: A good neuroscience model should make testable predictions, and the predictions should be interesting, too. An interesting prediction is something you might not have thought about otherwise: a prediction that is non-obvious, or that does not derive directly from the limiting assumptions made in the model. A neuroscience model should give you a new way of looking at the system, one that inspires new experiments. NIPS neuroscience papers should be either neuroscientifically or computationally well-grounded, and ideally both. The paper should make a serious attempt to connect to state-of-the-art neurobiology, and/or provide a rigorous mathematical treatment or a comparison to state-of-the-art engineering methods.
Brain imaging and brain computer interfaces: Papers with these keywords tend to fall between natural and artificial systems. A good brain imaging paper may lead to neurobiological insight, or it may propose an experimental method for obtaining new kinds of measurements. A good brain computer interface paper would either be useful as a computer interface or lead to neurobiological insight.
These criteria were selected with the goals of encouraging good research and of maximizing NIPS's long-term impact. Note that this is not as simple as accepting papers with high expected impact. For example, a paper that makes ambitious but poorly substantiated claims may have high expected impact, largely on the off-chance that the claims turn out to be correct, but is still likely to be rejected. Some of these evaluation criteria directly address this issue of providing evidence for the utility of one's work.
Questions? Comments? Please send email to nips07 AT cs.stanford.edu (non-standard form used to prevent spam).