First of all: don’t smoke, kids. I’m genuinely trying to quit. I chose this photo as a good testbed for today’s mission: to compare the perceivable quality, or more concretely the loss in image data, between common photo storage file types.
It’s a contentious issue that’s flared up every few years since the dawn of digital photography, mostly through the undying Raw vs JPEG debate, and most recently with Adobe’s introduction of the Lossy Digital Negative format in 2012.
Here, I’ve compared the most common storage formats: the camera’s own Raw file, 16-bit TIFF, Lossless and Lossy Digital Negative (DNG), JPEG at quality levels 100 and 60 – and for the hell of it, 256-color GIF (dithered).
If you’ll pardon a spoiler, it turned out pretty interesting.
A major stumbling block of the Storage Format Debates, such as they are, is the absence of a concrete measure of quality. Vinyl vs CDs, anyone?
The starting point for this comparison is the Raw file output by the camera – in this case a Fujifilm X100T. This camera’s 16 megapixel sensor has an unusual color filter array, which should tax the archival wherewithal of the candidate file formats quite nicely.
The first job was to select a base photo to work from. Since we’re interested in perceivable loss in image data, I selected an image without a lot of chaotic detail (such as a landscape photo with tiny blades of grass), but which still had smooth gradients and rich skin tones.
The procedure was simple. Load up the original Raw file in Photoshop, then lay one of the other formats on top of it and set the top layer’s blending mode to Difference. This creates a new image, where each pixel’s value is determined by the difference between the Raw file and the copy.
If they’re exactly the same, then the whole image will be black. If a pixel on the copy has less green, then that would show up as a slight magenta tint in the blended image — less amber, and the resulting pixel would be more blue.
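For the code-minded, here is a minimal sketch of what a Difference blend boils down to, assuming it is simply the absolute per-channel difference between two 8-bit RGB pixels. The function name and the tiny sample images are made up for illustration; this is not Photoshop's actual implementation.

```python
# Sketch of a Difference blend on 8-bit RGB pixels (tuples of 0-255 values).
# Assumption: the blend is the absolute difference, taken channel by channel.

def difference_blend(base, top):
    """Blend two equally sized images given as lists of rows of RGB tuples."""
    return [
        [tuple(abs(b - t) for b, t in zip(base_px, top_px))
         for base_px, top_px in zip(base_row, top_row)]
        for base_row, top_row in zip(base, top)
    ]

# Hypothetical two-pixel images standing in for the Raw file and its copies.
original = [[(200, 150, 120), (30, 30, 30)]]
identical = [[(200, 150, 120), (30, 30, 30)]]
slightly_darker = [[(198, 148, 118), (30, 30, 30)]]

same = difference_blend(original, identical)           # every pixel is black
changed = difference_blend(original, slightly_darker)  # first pixel is a very dark gray
```

A perfect copy gives zeros everywhere, which is the all-black image described above. A small change gives values so close to zero that they still look black on screen.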
So what do we get if we perform this test for each of the candidate file types and lay the results side by side?
Looks all black, doesn’t it? And since black means that the original and the copy are identical, that means no loss in quality has occurred. From the look of things, this article’s over before it started – nothing’s changed!
So we need to look closer. For one thing, the original image is a healthy 16 megapixels, but on our screen here, it’s shrunk to a fraction of that size. The shrinking process averages neighboring pixels, so it’s completely possible that one pixel on the copy was much lighter than the original and its neighbor much darker; one greener and its neighbor magenta… -er. The shrinking would average those differences out to zero, despite huge changes to the original image data.
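To make that averaging effect concrete, here is a small sketch on a single row of brightness values. It assumes each image is shrunk by averaging neighboring pairs before the two are compared; the numbers and function names are invented for the example.

```python
def downscale_by_two(row):
    """Halve a row of pixel values by averaging each neighboring pair."""
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]

def difference(a, b):
    """Absolute difference between two rows of pixel values."""
    return [abs(x - y) for x, y in zip(a, b)]

original = [100, 100, 100, 100]
copy = [120, 80, 90, 110]  # one pixel lighter, its neighbor darker

full_size_diff = difference(original, copy)
shrunk_diff = difference(downscale_by_two(original), downscale_by_two(copy))
```

At full size every pixel differs by 10 or 20 levels, yet the shrunken versions come out identical, so their difference is pure black.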
Furthermore, even minuscule changes can affect the perceivable image quality. Even tiny variations in color and luminosity on a smooth gradient can cause banding.
Photoshop, again, to the rescue.
Since we’re looking for very small variations, we need very meticulous math. The photos were loaded into a 16-bit Photoshop file, a format that can store 256 times more levels per color channel than a normal 8-bit screen can display, or roughly 16 million times more distinct colors. So even though we might not be able to see differences, they’ll be recorded – so we can amplify them.
And how do we do that? The humble Levels tool. By setting the White level to its absolute lowest value of 2 (out of 256), we can amplify the darkest 1% of the spectrum to fill the capacity of our screen.
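As a rough sketch of the math behind that Levels move, assume the tool is a plain linear stretch with the black point at 0 and the midtone slider untouched. The function below is my own illustration, not Photoshop's code.

```python
def apply_levels(value, white_point, black_point=0, max_value=255):
    """Linearly stretch black_point..white_point to 0..max_value, clipping the rest."""
    scaled = (value - black_point) / (white_point - black_point) * max_value
    return max(0.0, min(float(max_value), scaled))

# With the white point at 2, every input level is multiplied by 255 / 2.
gain = 255 / 2

barely_there = apply_levels(0.5, white_point=2)  # half a level becomes clearly visible
at_white = apply_levels(2, white_point=2)        # the new white point maps to full white
clipped = apply_levels(10, white_point=2)        # anything brighter is clipped to white
```

The gain works out to 127.5, which is where the "128x" figure comes from. A difference of half a level, only representable because the file is 16-bit, lands at a visible mid-dark gray.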
So what does that reveal?
Digital Negative | Lossless
There may be those among you who take offense that I start the list with the Digital Negative Format instead of TIFF – I beg your patience.
Adobe advertised the Digital Negative format as the ultimate storage safeguard. Imagining an apocalyptic future where your Raw camera files can no longer be read by any software running on your brain-computers, saving your photos as DNGs would ensure that they could always be read. Even if your camera’s manufacturer fell to the hordes of mutant chinchillas that roam the future wastelands, any brain-software that can read DNGs can read your file.
Furthermore, the DNG format allows standardized metadata and preview images to be recorded and, to top it all off, applies some data compression that shrinks the file down without touching any of your image data.
And, looking at the Difference image – which is black – and even the 128x amplified version – also black – one would agree… Except if you push the amplification even further, 2048 times the actual value, tiny artefacts appear.
A variety of explanations spring to mind, mostly concerning mathematical rounding errors in the bit depth conversion or the amplification.
I have an inkling that this might be specific to the Fujifilm X100T’s X-Trans sensor, though. When I tried the same difference-layer procedure on a Raw file from the Canon EOS M and a DNG copy, there was no variation, no matter how much I amplified the image. Zero is zero.
So, at least with the X-Trans sensor, DNG counts as a lossy format. The histogram shows us the enormity of this betrayal, though: a mean variation of 0.07 – and that’s when the image is amplified 128 times, putting the real value of the difference at roughly 0.00055 levels out of 256.
I can live with that.
TIFF | 16-bit
Next up, the old classic. The Tagged Image File Format was created in the ’80s to standardize image formats among scanner manufacturers, growing from a simple binary image file to the robust, archival format we know today.
I had set out to include Adobe’s Photoshop Document (PSD) format as well, but the results were identical to TIFF. Adobe, it turns out, holds the copyright on the TIFF specification, so maybe that shouldn’t come as a surprise.
What does come as a surprise, though, is that TIFF isn’t lossless either. While the DNG required massive additional amplification, even the 128x enhancement that was applied to all the images shows that there are definitely differences between the original and the TIFF, and they seem to be due to demosaicing.
Digital cameras (with the exception of Sigma’s Foveon sensors) record only one color value per pixel. On a conventional Bayer sensor, 16 megapixels actually means 8 million green pixels and 4 million each of red and blue; the X-Trans layout in my beloved X100T skews a little further toward green, at 20 of every 36 pixels. To display or print this image, software looks at each blue pixel’s green and red neighbors, and guesses how much green and red would have been recorded on the blue pixel if it were sensitive to those colors. This guesswork is called de-mosaic-ing.
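Here is a bare-bones sketch of that guesswork. It assumes a conventional RGGB Bayer layout rather than the X100T's X-Trans pattern, and uses the simplest possible method: averaging the nearest green neighbors. Real Raw converters are far more sophisticated; the names and sample values are purely illustrative.

```python
def bayer_color(row, col):
    """Color recorded at a site of an RGGB Bayer mosaic (an assumed layout)."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

def estimate_green(mosaic, row, col):
    """Guess green at a red or blue site from its up/down/left/right neighbors."""
    if bayer_color(row, col) == "G":
        return float(mosaic[row][col])  # green was actually measured here
    # On a Bayer mosaic, all four direct neighbors of a red or blue site are green.
    neighbors = [
        mosaic[r][c]
        for r, c in ((row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1))
        if 0 <= r < len(mosaic) and 0 <= c < len(mosaic[0])
    ]
    return sum(neighbors) / len(neighbors)

# Hypothetical 4x4 patch of raw sensor readings, one value per pixel.
mosaic = [
    [ 90, 120,  92, 124],
    [118,  40, 122,  42],
    [ 94, 126,  96, 128],
    [130,  44, 134,  46],
]

green_at_blue = estimate_green(mosaic, 1, 1)   # interior blue site, four green neighbors
green_at_red = estimate_green(mosaic, 0, 0)    # corner red site, only two green neighbors
```

Every estimated value is an interpolation rather than a measurement, and two converters making slightly different guesses will produce slightly different pixels.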
Considering the very fine grain of the differences between the original and the TIFF, it stands to reason that the guesswork causes the small-scale differences between the files. The histogram shows a mean per-pixel difference of 58.71, which is quite a lot more than we saw on the DNG.
Still, no reason to panic. Demosaicing is required to translate Raw image data to a screen or a print and with 16 bits of color depth (which is, again, about 16 million times more color detail than a screen can reproduce) there’s plenty of latitude for rigorous editing.
Digital Negative | Lossy
Introduced in 2012, the Lossy DNG format became the platform for Lightroom 5’s Smart Previews feature and the syncing of Raw photos to Lightroom Mobile. As the name suggests, this format applies lossy compression, intelligently ditching image data that’s considered non-essential.
However, a Lossy DNG still allows you to make roughly the same adjustments to white balance and exposure that you can with a Raw or Lossless DNG – unlike bitmap formats such as JPEG, GIF and PNG.
And the files are small. The test photo was shrunk from 33.5 MB to 4 MB, a mere 12% of the original’s size.
While hard disks are cheap these days, they’re not free, and more and more of us are eschewing the traditional desktop in favor of high-performance laptops with solid-state drives, which tend to be much smaller and more expensive than their ancestors.
So how’s the quality?
Not bad! Sure, it looks psychedelic and weird, but remember: the darker the pixels in the Difference image, the closer they resemble their originals.
The preponderance of purple and blue in the image suggests that most of the loss in image data occurs in the red-orange-yellow part of the spectrum. This is quite clever and sensible: the human eye is much more sensitive to green light, so we’d notice artefacts in the green tones before anywhere else. It makes sense to preserve the image data there and smooth out the earth- and skin-tones.
Admittedly, there’s no overt green in the base photo, but the color theory that drives digital photography involves mixing all the color channels. Even the skin should contain a fair amount of green data, and it looks like that’s being preserved.
The histogram and the close-up show that the loss in image data is well-handled, smooth and measured. There’s some evidence of grainy banding in the smoke, but the skintones are smooth and the loss of warm-tone detail is consistent between the highlights on the hair and the shadows on the shirt.
Not only do Lossy DNGs save space, they’re also quicker to load and edit in Lightroom since the demosaicing has already been done. Those are worthy benefits for a well-handled loss of image data – even if the histogram now shows a hefty 109.65 mean value in image difference.
JPEG | Quality 100
Ah, good old JPEG. The cornerstone of electronic imagery; the files are tiny and the compression good enough for display on a webpage or a phone. Heck, I even went through a phase recently where I primarily shot JPEG, even on paid assignments, because my Fuji cameras produce such beautiful files that I rarely had a need for Raw.
But JPEG is a notoriously lossy format. Here’s the comparison for the highest-quality JPEG Lightroom can produce:
Here we’re starting to run into image data loss that’s well beyond the theoretical. Light pixels in the Difference image are bad news, signifying substantial difference between the original and the copy, and this file is a lot lighter than the others we’ve seen so far.
The green patches on the chest show that a lot of red color detail is missing from the purple shirt’s shadows. The top left of the original image is very dark, and the light-colored nebula on the Difference image suggests that all that data has been dropped in favor of smooth, clean black.
The grainy pattern in the Difference image suggests that what little noise there was in the source image has been largely smoothed out, though there isn’t much in the way of blocky artefacting. It’s a nice, smooth image on the screen, but there’s nothing below the surface.
The Difference image shows a green sheen on the skin, and on the highlights, suggesting that a lot of red detail is being lost.
So while JPEG is quite fine if you nail your shot in-camera, there isn’t much room for major edits.
JPEG | Quality 60
Let’s make things worse. We’ve seen the best that JPEG can offer, so let’s turn down the quality from 100 to 60 and – yowzers.
The chaotic patterns of brightness and color suggest massive differences between the medium-quality JPEG and its raw source. What’s worse, the colors that are being affected aren’t consistent within each area: in the top left smoke we see sharp, jagged bands where blue and green have been lost, and on the right side of the shirt we see a band of green right next to a band of purple, showing a loss of red and yellow right next to each other.
But moreover, the glasses, eyes and hair are lit up like a Christmas tree, showing huge differences that can only be ascribed to significant loss of detail.
That loss is hard to see in the JPEG copy, though. Lossy though it may be, the JPEG compression algorithms are mighty clever, playing on the eye’s weaknesses to remove detail without appearing to lose much richness and lustre.
And it can always get worse…
GIF | 256 Colors
Before the web was grown up, the Graphics Interchange Format was created by CompuServe and brought us such innovations as really grainy pictures of the starship Enterprise and little cartoon builders hammering on blinking Under Construction text on home-made websites.
A pure bitmap format, GIF doesn’t save file space by cleverly patching over details the way JPEG does; its built-in compression is lossless. No, good old GIF just does away with all those millions upon billions of color tones that our cameras can record and crushes them down to a crayon-palette of, at most, 256 colors.
That the Difference image is a mess is no surprise; the GIF format hardly deserves to be included in a list of archival photo formats.
But just look at the full GIF!
Sure, at 5 MB it may be bigger than the much more flexible Lossy DNG, but the image itself doesn’t look nearly as horrible as the Difference image would suggest. There’s still the appearance of smooth skin tones and detail in the shirt, and while the banding in the smoke can now be seen with the naked eye, it’s not terribly offensive.
Credit has to go to Photoshop here. Careful analysis of the original image led to a palette of 256 well-chosen colors. A massive reduction of the color palette would normally make a photo look like it was drawn in crayon, but by applying dithering, the pointillistic and random patterns of colored pixels give the impression of smooth gradients – as long as you’re viewing it from far enough away.
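To show why dithering works, here is a toy sketch using one-dimensional error diffusion: each pixel's rounding error is pushed onto the next pixel. I don't know which algorithm Photoshop actually used, so treat this as a stand-in, and note that the palette is cut down to just black and white to exaggerate the effect.

```python
def nearest(value, palette):
    """Pick the palette entry closest to the requested value."""
    return min(palette, key=lambda p: abs(p - value))

def quantize(row, palette):
    """Snap every pixel to its nearest palette color, with no dithering."""
    return [nearest(v, palette) for v in row]

def dither(row, palette):
    """1-D error diffusion: carry each pixel's rounding error to the next pixel."""
    out, error = [], 0.0
    for v in row:
        target = v + error
        chosen = nearest(target, palette)
        out.append(chosen)
        error = target - chosen
    return out

palette = [0, 255]       # a brutal two-color palette instead of GIF's 256
flat_gray = [64] * 8     # a smooth area at roughly 25% gray

banded = quantize(flat_gray, palette)
dithered = dither(flat_gray, palette)
dithered_average = sum(dithered) / len(dithered)
```

Without dithering, the whole patch collapses to black. With it, roughly one pixel in four turns white, so the average stays within a level of the original gray even though no single pixel matches it.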
Shrunk down, it actually doesn’t look terrible. But histograms don’t lie: mean variation of 192.31 (out of 256), with a standard deviation of 55. This GIF only gives the impression of resembling the original.
Putting the images side by side, it’s time to take our pick. Lossless DNG is the hands-down winner, which shouldn’t be a surprise to anyone, but whether you prefer TIFF or Lossy DNG is a matter of personal preference.
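If you want to reproduce those histogram figures outside Photoshop, the calculation is roughly the following. This sketch assumes the mean and standard deviation are taken over the absolute per-pixel differences; the helper name and sample values are invented.

```python
import statistics

def difference_stats(original, copy):
    """Mean and standard deviation of the absolute per-pixel differences."""
    diffs = [abs(o - c) for o, c in zip(original, copy)]
    return statistics.mean(diffs), statistics.pstdev(diffs)

# Hypothetical brightness values for four pixels of the original and a copy.
original = [10, 200, 128, 64]
copy = [12, 190, 128, 70]

mean_diff, spread = difference_stats(original, copy)
```

A low mean with a low spread means the copy is uniformly close to the original. A high spread means the damage is concentrated in some areas and absent in others.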
TIFF clearly preserves more detail but Lossy DNG is much more consistent than TIFF in its color variations. Plus, you can store 24 times as many Lossy DNGs as TIFFs on a given hard drive.
JPEG’s no slouch. Quality 75 is generally considered a great balance between file size and image quality, but even 60 doesn’t look terrible to my eyes. Even looking closely it’s hard to find any of the blocky artefacts that JPEG is so renowned for.
And GIF… Well, bless it for still trying. And an impressive showing, considering how little image data it can actually store in its huge files.
Which format you choose for archiving still depends on your needs. If you have clients in the print industry, TIFFs are still the norm, while a wedding and event shooter like myself would benefit from the flexibility and low footprint of lossy DNG.
And if you absolutely can’t bear to give up any editing latitude, but still want to save a little disk space and prepare yourself for the mutant chinchillas of the future, there’s always lossless DNG.