\def\twelveb{\bf}
\def\lapp{\buildrel < \over \sim}
\def\gapp{\buildrel > \over \sim}
\def\endpage{\par\vfil\eject}
\magnification=\magstep1
\baselineskip 14pt
\pageno=311
\parskip=0pt
\footline={\raise.08truein\line{\tenbf\hfil\folio}}
\voffset=.67truein
\global\topskip .5truein
\global\hsize 4.93truein
\global\vsize 7.71truein
\vbox{\vskip .62truein}
\noindent
{\twelveb Appendices}
\vskip 1.44truein plus.06truein
\noindent
{\bf A.1 Information in Astronomy}
\vskip \baselineskip
\noindent
In this appendix I give a brief and rough discussion of the information
content, and useful information content, of some astronomical data.
The purpose is only to explain why certain kinds of
astronomical data have been and are likely to be useful, and others
less so; a whiff of information theory is mixed with astronomical
experience and cynicism.

If we perform a measurement of a quantity with a signal-to-noise ratio
$S/N$ (or, equivalently, the ratio of the measured quantity to its
uncertainty) we may express the result as a number in binary notation
with $n=\log_{2}(S/N)$ significant digits. In other language, the
result contains $n$ bits of information. $n$ cannot be very large
(Fermi is reputed to have said that all logarithms are equal to 10).
If the measured value is zero (or numerically small), as is frequently
the case, this definition of $n$ underestimates the significance of
the result, and $S$ should be some estimate of the characteristic
magnitude the result of our null experiment could have had. For
example, if we are measuring the charge of the neutron $S$ might be
taken to be the electron charge, even though the experiment has always
produced a null result. Clearly we have introduced a subjective
element into $n$. To make it objective and more quantitative would
require a careful consideration of the various hypotheses we might
consider (and their {\it a priori\/} likelihoods). Such exercises in
probability theory are rarely justified in experimental science.

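The counting of significant bits is simple arithmetic; the following
sketch (the values of $S/N$ are purely illustrative) shows how slowly
$n$ grows:

```python
import math

def bits(snr):
    """Significant binary digits, n = log2(S/N), in a single measurement."""
    return math.log2(snr)

# n grows only logarithmically: even heroic measurements yield few bits
for snr in (10, 100, 1000, 10**6):
    print(snr, round(bits(snr), 2))
```

Even a part-per-million measurement carries only about twenty bits,
which is the quantitative content of Fermi's quip.
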
We frequently have a series of $N$ similar measurements, representing
a time series, its Fourier transform, a spectrogram, an image, or some
other array. This array then contains $Nn$ bits of information.
Note that a minimum of $2^{2n}N$ photons must have been detected.
Values of $N$ are found over an
enormous range, from one or a few (classical astronomical photometry)
to $10^{3}-10^{5}$ (optical spectrometry, typical astronomical time
series), to $10^{9}$ (good photographic images).

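The photon budget implied by the $2^{2n}N$ estimate above can be made
explicit, assuming photon-counting (shot) noise so that $S/N$ per
element is the square root of the counts (the values of $n$ and $N$
are illustrative):

```python
n = 5         # bits per element, i.e. S/N = 2**5 = 32
N = 1000      # elements, e.g. a typical optical spectrum

# shot noise: S/N = sqrt(counts) per element, so counts = 2**(2n) each
photons = 2 ** (2 * n) * N
print(photons)
```
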
These estimates of information content may offer valuable hints to the
opportunities for observing interesting phenomena. For example, most
of the interesting (in my narrow opinion!) phenomena in X-ray
astronomy have been found in studies of time series and in the
spectroscopy of the optical counterparts of X-ray sources; both these
kinds of study produce data with fairly large values of $N$.
Astronomical X-ray spectroscopy, for which usually $N\lapp 100$, has
been much less fruitful.

It is necessary, however, to distinguish between information in the
sense of the communications engineer, which was the basis of these
estimates, and scientifically useful information. If we measure a
physical constant $N$ times we should obtain the same value each time,
and clearly do not have $Nn$ useful bits of information. Rather, we
have $n$ useful bits if the errors are systematic; if they are random
and well-behaved statistically we can average our measurements to
produce a mean with greater accuracy (up to $n+{1\over 2}\log_{2}N$
bits) than that of an individual measurement. The extra bits are
not lost; rather, we already assumed we knew them when we assumed
that we were measuring the same physical constant $N$ times, so that
finding them again in the constancy of the data does not tell us
anything new.

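The gain of ${1\over 2}\log_{2}N$ bits from averaging well-behaved
random errors is easily demonstrated with synthetic data; the
following simulation (seed and sizes are arbitrary) is only a sketch:

```python
import math
import random
import statistics

random.seed(1)
sigma = 1.0      # error of a single measurement
N = 256          # measurements averaged together
trials = 2000    # repetitions used to estimate the scatter of the mean

means = [statistics.fmean(random.gauss(0.0, sigma) for _ in range(N))
         for _ in range(trials)]
std_of_mean = statistics.pstdev(means)

# extra significant bits gained by averaging; expect about 0.5*log2(256) = 4
gain_bits = math.log2(sigma / std_of_mean)
print(round(std_of_mean, 4), round(gain_bits, 2))
```
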
\headline={\ifodd\pageno\rightheadline \else\leftheadline\fi}
\def\leftheadline{\vbox{\vskip 0.125truein\line{\tenbf\folio\qquad
Appendices \hfil}}}
\def\rightheadline{\vbox{\vskip 0.125truein\line{\tenbf\hfil
Information in Astronomy \qquad\folio}}}
\footline={\hfil}

Images have very large values of $N$, but have been particularly
disappointing in astronomy. One reason is that many astronomical
objects (most familiarly, stars) are unresolved in images. Only one
independent picture element of the image collects data from the
object of interest, and the rest are irrelevant. The second reason
is less trivial. It is very hard to formulate quantitative hypotheses
to describe usefully the morphology of extended objects, and hence it
is hard to utilize the high information content of their images.

Spectrograms have related problems. Wavelengths at which no detectable
spectral features exist produce no useful information. Most of the
spectral lines observed come from a relatively few species and
ionization states, and their intensities are related by inflexible
rules of atomic physics. Once we assume the laws of atomic physics
to hold everywhere, the number of independent parameters which can be
measured is much smaller than $N$. In X-ray astronomy most
spectrograms determine at best a density, a temperature, and a few
abundances (in principle the spatial distribution of these quantities
is described by an infinity of parameters, but usually the data are
sufficient to determine only a very few). The greater spectral
resolution and information content of visible spectra permit the
resolution of line profiles, and hence the study of additional physics:
velocity fields and line-broadening mechanisms.

Time series in astronomy have been very useful when they have led to
the discovery of regular periodicity. Time series without such
periodicities have been nearly useless, even though in principle
they contain a great deal of statistical information, because it
has not been possible to formulate scientifically interesting and
sensible hypotheses to predict their statistics. For example, a time
series of $N$ elements which is observed to be constant (within the
statistical errors) contains far fewer than $Nn$ bits of useful
information. Instead, it contains no more than $\log_{2}M$ bits,
where $M$ is the number of scientifically sensible (not {\it a priori\/}
rejected) hypotheses. Since these hypotheses will generally be
expressible as specific functional forms for the signal, with a small
number of free parameters, $M$ will be approximately the number of
distinct functional forms of hypotheses times $2^{pm}$ where $p$ is
the number of free parameters in each and $m\lapp n+{1\over 2}\log_{2}N$
is the number of significant binary bits to which each may be determined.
In realistic scientific theories $p$ is a small number and the final
result is usually $\sim 100$ bits, and often less. For a physical
constant there is one hypothesis and $p=1$.

The most familiar and important use of time series is the study of
regular periodicities, which are predicted by many theories (as a
consequence of orbital, rotational, or vibrational motion), and
which are widely observed. A time series of $N$ evenly spaced
elements permits a frequency resolving power of $N/2$. It is therefore
possible to measure directly a period or frequency stability
(quality factor) $Q \equiv 1 / \vert \dot P \vert$
(where $P$ is the period) of
approximately $N/2$. This $Q$ may be defined very generally as the
width of a Fourier transform, and does not require an assumption of
sinusoidal variation. Such a measurement of $P$ or $Q$ contains
$\log_{2}N$ bits; this is our previous expression with one hypothesis,
$p=1$, and $m=\log_{2}N$ (or $M=N$). It has been tacitly assumed that
$n\gapp{1\over 2}\log_{2}N$; complications ensue if this is not the case.
If exact sinusoidal behavior is assumed the accuracy to which its period
may be determined increases to 1 part in $2^{n}N^{1/2}$, corresponding
to $\log_{2}M=n+{1\over 2}\log_{2}N$ bits (note that this accurate
determination of a sinusoidal period does not imply that the actual
$Q$ of the oscillator is as large as $2^{n}N^{1/2}$; a much longer
time series would be required to establish this). In practice,
systematic errors rarely permit this full extra accuracy.

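The two accuracies just discussed may be illustrated numerically (the
values of $N$ and $n$ are arbitrary, chosen to satisfy
$n\gapp{1\over 2}\log_{2}N$):

```python
import math

N = 4096   # evenly spaced samples
n = 8      # bits per sample (S/N = 256); note n > 0.5*log2(N) = 6

Q_direct = N / 2                           # Q measurable from the transform width
frac_err_sine = 1 / (2**n * math.sqrt(N))  # period error if exactly sinusoidal
bits_sine = n + 0.5 * math.log2(N)         # log2(M) for that hypothesis
print(Q_direct, frac_err_sine, bits_sine)
```

Here assuming an exactly sinusoidal signal improves the fractional
period accuracy from $1/2048$ to $1/16384$.
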
The use of unevenly spaced time series (the only ones available over
extended periods when observations are interrupted by daylight, seasonal
effects, and other practical problems) permits very much higher accuracy
in period determination for a given number of observations than is
possible with evenly spaced time series. This accuracy is obtained at
the price of failing to distinguish strictly periodic behavior
from a large class of aperiodic behavior (periods
modulated in regular but complex ways). These aperiodic hypotheses
are rejected on sound scientific grounds, but they are fully consistent
with the data; these independent scientific arguments add a significant
number of essential bits of information.

It is possible to determine a frequency stability $Q$ and to measure
the period of oscillation (assumed sinusoidal) to an accuracy of $1$
part in $Q$ with $\sim \ln Q$
observations of moderate quality ($n\gapp 5$). This method is widely
used to measure values of $Q$ as high as $10^{15}$ in pulsars and
other very stable oscillators. A small number $j$ of observations
over a period $T$ determine the frequency to an accuracy of about
$\pm 1/T$. An additional $j$ observations at intervals of order
$T/2$ later extend the time baseline to of order $jT/2$, and the
frequency accuracy to about $\pm 2/(jT)$. After $k$ repetitions of this
process the time baseline is of order $(j/2)^{k}T$, the frequency
accuracy is of order $2^{k}/(j^{k}T)$, and the required number of
observations $j(k+1) \approx j\ln Q/\ln(j/2) \gapp 2e\ln Q$ varies only
logarithmically with $Q$. Accurate data permit better period
determinations in each iteration and longer gaps between subsequent
observations; this is required in practice by the timing of
opportunities for observation, which are usually brief and widely
separated. This reduces the required number of observations by a modest
factor.

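The bootstrap just described can be sketched as a short calculation;
the choices $Q = 10^{15}$ and $j = 6$ are illustrative:

```python
import math

Q = 1e15   # target frequency stability (a very stable pulsar)
j = 6      # observations per visit; each visit extends the baseline,
           # and hence the fractional accuracy, by a factor of about j/2

k = math.ceil(math.log(Q) / math.log(j / 2))   # repetitions required
total = j * (k + 1)                            # total observations

# the minimum of j*ln(Q)/ln(j/2) over j, attained near j = 2e
bound = 2 * math.e * math.log(Q)
print(k, total, round(bound, 1))
```

About two hundred observations suffice to measure fifteen decimal
digits of frequency stability.
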
Another problem in which a qualitative application of these ideas is
illuminating is that of deconvolution. Frequently we have data (in the
form of a series or array of numbers) which have been smoothed by the
limited resolution of an instrument or some other process. Familiar
examples are visible images blurred by atmospheric ``seeing'' and
spectrograms smoothed by the finite spectral resolution of the
instrument. If the properties of the smoothing function (the
atmospheric or instrumental response to a point-like signal) are
known then one might hope to remove its influence and to recover the
original signals. (Usually the smoothing function is only
approximately known, but this is generally not critical.)
A variety of linear and nonlinear algorithms exist to accomplish this
process of ``deconvolution,'' but their results are frequently
disappointing. The reason is easy to see. Suppose the original
observed data consist of $U$ numbers, each known with $n$ significant
binary bits, and it is desired to increase the resolution (in one
dimension) by a factor $q$ in order to obtain a deconvolved series of $
N=qU$ independent numbers. Only $nU$ bits of information are available,
so each of the new numbers can only have (on average) $n/q$ significant
bits. Because $n$ is rarely more than $10$, it is clear that attempts
to significantly increase the resolution rapidly destroy the ratio of
signal to noise! The most successful algorithms are those which best
use the available information to answer the scientifically interesting
questions by fitting to an intelligently chosen model, rather
than blindly attempting to increase the resolution; those known
as ``maximum entropy methods'' provide a systematic (and in some
sense optimal) way of imposing constraints.

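The bookkeeping behind the deconvolution argument is elementary; with
illustrative values:

```python
n = 10    # bits per observed point (S/N of about 1000)
U = 512   # observed points
q = 4     # desired gain in resolution

bits_per_new_point = (n * U) / (q * U)   # = n/q bits for each of the qU points
snr_new = 2 ** bits_per_new_point        # average S/N of a deconvolved point
print(bits_per_new_point, round(snr_new, 2))
```

A factor of four in resolution reduces an excellent $S/N$ of $10^{3}$
per point to a barely useful $S/N$ of about $6$.
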
It is my opinion that formal statistical methods and tests are
frequently misleading and of little real use in experimental science,
and that the best way to extract useful scientific information from
imperfect data is to compare them to the predictions of sensibly chosen
models, using the human eye to assess the results.
\goodbreak
\vskip 2\baselineskip plus2\baselineskip
\goodbreak
\noindent
{\bf A.2 Fermi at Alamogordo}
\def\rightheadline{\vbox{\vskip 0.125truein\line{\tenbf\hfil Fermi at
Alamogordo \qquad\folio}}}
\vskip \baselineskip
\noindent
The story is told (I do not vouch for its historical accuracy) that
when Fermi witnessed the first nuclear explosion he wanted a quick
estimate of the energy $Y$ it released. His observing point was far
enough from the explosion (at a distance $R \gg (Y/P_{0})^{1/3}$) that
its shock wave had become very small in amplitude, with a pressure jump
$\Delta P \ll P_{0}$, where $P_{0}$ is the pressure of the ambient air.
This is called
regime III in {\bf 3.4}. Shortly before the expected arrival of the
shock he is said to have dropped a few small scraps of paper into the
air. A number of seconds later the shock passed by. Because the scraps
of paper were very light they moved with the air, while more massive
objects remained fixed. By measuring the displacement $\Delta R$ of the
scraps of paper he was able to estimate the energy of the hydrodynamic
motion produced by the explosion.

If an explosion is close to the surface of the ground (in comparison
to the observer's distance from it) its shock will be nearly
hemispherical. The mechanical work done by the shock against the
pressure of the atmosphere is
$$W = 2 \pi R^{2} \Delta R P_{0} . \eqno(A.2.1)$$
Air is well described (under ordinary conditions) by a perfect gas
equation of state with a ratio of specific heats $\gamma =
c_{P}/c_{V} = 1.40$\break ({\bf 1.9.1}), and $P_{0} = 1.0 \times 10^{6}$
dyne/cm$^{2}$. A fraction $(c_{P} - c_{V})/c_{P}$ of the energy
supplied to the atmosphere does mechanical work against its pressure,
while its internal energy is increased by the remaining fraction $c_{V}/
c_{P}$. If the atmosphere were not permitted to move the two specific
heats would be the same, and all of the energy injected by the explosion
would appear as internal energy of the air. Equating $W$ to the
portion of the explosive energy which does mechanical work gives
$$\eqalign{Y&= \left({c_{P} \over c_{P} - c_{V}} \right) 2 \pi R^{2}
\Delta R P_{0} \cr &= \left({\gamma \over \gamma -1}\right) 2 \pi R^{2}
\Delta R P_{0}.\cr} \eqno(A.2.2)$$
For nuclear explosions this is a conveniently measured displacement.
If $Y = 4.18 \times 10^{20}$ erg (equivalent to 10 kilotons of standard
conventional explosive) and
$R = 10$ km, $\Delta R = 19$ cm. However, $\Delta R \propto R
(\Delta P / P_{0})$, where $\Delta P$ is the overpressure at the point
of observation. For small explosions, where observations are conducted
at small $R$, the displacements may be too small to measure easily.

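The quoted displacement follows directly from (A.2.2); a quick
numerical check:

```python
import math

Y = 4.18e20    # erg: 10 kilotons of conventional explosive
R = 1.0e6      # cm: 10 km from the explosion
P0 = 1.0e6     # dyne/cm^2: ambient atmospheric pressure
gamma = 1.40   # ratio of specific heats of air

# invert equation (A.2.2) for the displacement of the scraps of paper
dR = Y * (gamma - 1) / (gamma * 2 * math.pi * R**2 * P0)
print(round(dR, 1), "cm")
```
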
A number of approximations have been made. We have neglected the
energy coupled into the ground, which is very small because the large
jump in density and sound speed at the surface leads to a large acoustic
mismatch. The kinetic energy of the moving air is also small. Its
characteristic velocity $v \sim \Delta R / t \sim c_{s}(\Delta R / R)$,
where $t$ is the time required for the shock to arrive after the
explosion and $c_{s}$ is the sound speed of air. Since $\rho c_{s}^{2}
\sim \gamma P_{0}$ and (from A.2.2) $\Delta R/R \sim Y/(P_{0}R^{3})$, the
fraction of $W$ which appears as kinetic energy is
$${\rho v^{2} R^{3} \over P_{0} R^{2} \Delta R} \sim {\gamma P_{0}
(\Delta R/R)^{2} R^{3} \over P_{0} R^{2} \Delta R} = \gamma\,{\Delta R
\over R} \sim {\cal O}\!\left({Y \over P_{0}R^{3}}\right) \ll 1.$$
There are more significant errors.
In the inner parts of the shocked air the temperature is very high and
$\gamma$ differs from (is generally less than) its value in cool air.
The calculation also ignores the energy radiated by the hot fireball.
For nuclear explosions in air this leads to an underestimate of $Y$ by
a factor $\sim 2$, and for shocks in the interstellar medium, which
pass through a long snowplow stage (regime IIb in {\bf 3.4})
in which they radiate strongly, this
error may be even larger. Finally, if the result is to be applied to
a shock which expands spherically, as is the case in most astronomical
problems, the factor $2 \pi$ must be replaced by $4 \pi$.

In principle, this method could be applied to interstellar shocks, if
the displacement of suitable light objects could be measured. Because
interstellar shocks move so slowly (on astronomical distance scales) it
would probably be necessary to measure the displacement of a continuous
emitting gas filament where the shock had passed through it, in
comparison to a place where a dense cloud shielded it from the shock.
It is unlikely to be feasible to observe the displacement of the
filament as it occurs.
\endpage
\bye