Beta version. May contain bad ideas.
By a free idea I mean something that I think is probably fun and probably possible but that I don’t have the combination of time, skill, energy, patience, etc. to do myself. I hope someone does this. I hope someone reads this and does just the specific part that they’re interested in. I’m trying to get the idea out there without giving the impression that it’s my project. It’s just an idea.
To do the whole thing as laid out here I think you’d need at least an intermediate understanding of convolutional neural networks for image processing, access to a GPU, some sense of geography and astronomy (to gut-check your intermediate results), and a reasonable internet connection to download the images.
Train a neural network to improve (clarify, denoise, remove coma and chromatic aberration from, and spatially and temporally interpolate) photos taken by astronauts on the ISS, and in particular their time-lapse series.
Astronauts on the space station use a range of off-the-shelf pro-grade SLR gear. They take photos of things that interest them (attractive landscapes, volcanic eruptions, etc.) and also time-lapse series where the camera is set in place and takes a frame on the order of once a second. Their photos are downlinked and distributed at https://eol.jsc.nasa.gov as JPEGs but also, via a slightly annoying but functional API, as NEFs (Nikon raws).
The photos are generally technically sound – well exposed, with good focus and little camera shake. However, they’re shooting through glass, there is often sensor or internal lens dust, the cameras themselves are a little old, the wider lenses show quite a bit of coma and chromatic aberration, and the images are often noisy due to high ISO and (I believe) the high incidence of cosmic rays.
The fact that makes me think there might be room for surprisingly good results here, beyond what you could do by applying an ordinary interpolator/denoiser, is that there’s an unusual source of ground truth that can be constructed to correspond to these images: the starfield.
As a broad assumption, star surveys mean we know what an unimpeded view of the night sky at a given angle is supposed to look like. This is an assumption, not a fact: star surveys are not done in the color space that ordinary cameras use, they don’t all handle nebulosity, and they don’t account for transients like planets, comets, the moon, supernovas, etc. In other words, going from a stellar database to an accurate image of the night sky from above the atmosphere is not actually as simple as projecting it.
However, there are reasonable approximations. Using astrometry (like the beloved http://astrometry.net system), it’s possible to lock down the orientation and field of view of an imperfect image that contains stars. The moon and planets’ positions can be accurately predicted.
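If you have a local astrometry.net install, the plate-solving step is pleasantly mundane. A rough sketch (the file name and field-of-view bounds are placeholders, and it assumes astropy is available for reading back the solution):

```python
# Sketch: plate-solve one frame with a local astrometry.net install,
# then read back the fitted WCS (orientation + field of view).
# Assumes `solve-field` is on PATH; file name and scale bounds are
# placeholders, not real values from the EOL archive.
import subprocess
from astropy.io import fits
from astropy.wcs import WCS

frame = "iss_frame.jpg"          # hypothetical input frame
subprocess.run(
    [
        "solve-field", frame,
        "--scale-units", "degwidth",
        "--scale-low", "20",     # rough FOV guess for a wide lens
        "--scale-high", "120",
        "--overwrite", "--no-plots",
    ],
    check=True,
)

# On success, solve-field writes a <name>.wcs FITS header next to the input.
wcs = WCS(fits.getheader("iss_frame.wcs"))
print(wcs)                       # pixel <-> sky mapping for this frame
```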
Some important transients and tricky effects – aurora, atmospheric refraction, in-station glare and reflections – will remain, but possibly under reasonable control.
As a sketch of a training protocol then:
- Divide photos into sensor/lens pair buckets (body serial number X, lens serial number Y is one bucket)
- Choose photos that contain starfields
- Mask out everything that doesn’t appear to be starfield (Earth, auroras, station solar panels, etc.)
- Use the astrometry.net toolkit to find the orientation of the photo
- Use your stellar/planetary database to paint in an ideal skyview in this orientation
- Use the exposure time to motion-blur the ideal skyview; this is the training y (target). (This step and the previous one are sketched just after this list.)
- Add ancillary bands or inputs encoding the f-stop, the ISO, the focal distance (if not infinity), other metadata that might be useful, and the angular distance of each pixel from the image centerline (to help identify coma and chromatic aberration)
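Here is a rough sketch of the painting and blurring steps, assuming you already have the WCS from astrometry.net and a catalog reduced to (RA, Dec, magnitude) rows. The flux model, the single-channel output, and the constant drift vector are all placeholder assumptions:

```python
# Sketch: paint an "ideal skyview" for one frame from a star catalog,
# given the frame's WCS, then approximate motion blur over the exposure.
import numpy as np
from scipy.ndimage import shift as nd_shift

def paint_skyview(wcs, catalog, shape):
    """catalog: array of (ra_deg, dec_deg, magnitude) rows (assumed layout)."""
    img = np.zeros(shape, dtype=np.float32)
    x, y = wcs.world_to_pixel_values(catalog[:, 0], catalog[:, 1])
    flux = 10.0 ** (-0.4 * catalog[:, 2])            # crude magnitude -> flux
    inside = (x >= 0) & (x < shape[1] - 1) & (y >= 0) & (y < shape[0] - 1)
    xi = np.round(x[inside]).astype(int)
    yi = np.round(y[inside]).astype(int)
    np.add.at(img, (yi, xi), flux[inside])           # one pixel per star
    return img

def motion_blur(img, drift_px_per_s, exposure_s, steps=8):
    """Average copies of the frame shifted along an assumed (row, col) drift."""
    acc = np.zeros_like(img)
    for t in np.linspace(0.0, exposure_s, steps):
        acc += nd_shift(img, (drift_px_per_s[0] * t, drift_px_per_s[1] * t),
                        order=1, mode="constant")
    return acc / steps

# Hypothetical usage, with a made-up frame size and drift:
# target_y = motion_blur(paint_skyview(wcs, catalog, (2832, 4256)),
#                        drift_px_per_s=(0.0, 3.0), exposure_s=1.0)
```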
Now train (randomly drawn unmasked image chip) → (corresponding ideal skyview chip).
Apply on full images.
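For concreteness, a minimal chip-to-chip training loop might look like the sketch below (PyTorch here; the tiny conv net, the L1 loss, and the assumed ChipDataset yielding (raw chip, metadata bands, target chip) tensors are placeholders, not recommendations):

```python
# Sketch: chip -> ideal-skyview-chip training loop in PyTorch.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

class CorrectionNet(nn.Module):
    def __init__(self, in_ch=3 + 4):          # RGB + ancillary bands (assumed count)
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.body(x)

def train(dataset, epochs=10, lr=1e-4, device="cuda"):
    net = CorrectionNet().to(device)
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.L1Loss()
    loader = DataLoader(dataset, batch_size=16, shuffle=True)
    for _ in range(epochs):
        for raw, meta, target in loader:
            x = torch.cat([raw, meta], dim=1).to(device)   # stack image + metadata bands
            loss = loss_fn(net(x), target.to(device))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return net
```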
The surface of Earth, “good” features like auroras, and interesting transients like glinting satellites do not look like artificially generated skyviews. In other words, training a network to do lens/sensor correction for stars may not give you a network that is good at doing the same thing for photos of clouds over the coast of Ireland. I suspect this is a pretty hard problem, and I would want to start tracking it early in the training process.
Relatedly, starfields are very speckly. Stars are bright, effectively point sources. A “correctly” rendered starfield would probably look quite weird compared to what we perceive through the human visual system at the bottom of the atmosphere. Which is to say: a “correct” representation of a star database that contains 10,000 objects in the field of view is going to look like 10,000 isolated single pixels, very much like a bad sensor with a lot of hot pixels. This is arguably not what we actually want to see. So you might, for example, convolve the target image (the ideal skyview) with a small Gaussian to give slight halos to stars that will better represent their colors, relative brightness, and so on.
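That softening step is tiny; something like this, where the sigma is an arbitrary placeholder to tune by eye:

```python
# Sketch: soften the single-pixel stars in the ideal skyview target with
# a small Gaussian so their colors and relative brightness survive.
from scipy.ndimage import gaussian_filter

def soften_target(skyview_rgb, sigma_px=1.0):
    # Blur each channel spatially; sigma of 0 on the last axis leaves
    # the color channels independent. skyview_rgb is assumed (H, W, 3).
    return gaussian_filter(skyview_rgb, sigma=(sigma_px, sigma_px, 0))
```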
(A really fun feature of this whole problem space is that lens imperfections are what smear stars over multiple wells in the Bayer array and let us sense their color in the first place. In theory, with a perfect lens, it would generally be impossible to tell star colors from a single frame and a Bayer sensor, because the images of their discs would virtually never be split between wells.)
You would want to be careful about the network overfitting by memorizing star positions.
A fun project would be to paint ideal skyviews and then run them through the inverse of this correction network to get what they should look like with a given lens/sensor combination. Then you could subtract that from the observed skyviews to spot (a) noise, but much more interestingly (b) objects that aren’t in your database, like glinting satellites and aurora.
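As a sketch, with a degrade() function standing in for that hypothetical inverse network, the anomaly step is just a subtraction and a threshold (both placeholders here, assuming float RGB frames on a common scale):

```python
# Sketch: subtract a degraded ideal skyview from the observed frame to
# flag objects the catalog doesn't know about (glinting satellites, aurora).
import numpy as np

def find_anomalies(observed, ideal_skyview, degrade, threshold=0.05):
    expected = degrade(ideal_skyview)          # what this lens/sensor should see
    residual = observed.astype(np.float32) - expected
    # Flag pixels whose worst-channel residual exceeds the threshold.
    return residual, np.argwhere(np.abs(residual).max(axis=-1) > threshold)
```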