Image file formats
Image file formats are standardized means of organizing and storing digital images. Image files are composed of digital data in one of these formats that can be rasterized for use on a computer display or printer. An image file format may store data in uncompressed, compressed, or vector formats. Once rasterized, an image becomes a grid of pixels, each of which has a number of bits to designate its color equal to the color depth of the device displaying it.
Image file sizes
The size of a raster image file is positively correlated with the resolution and image size (number of pixels) and with the color depth (bits per pixel). Images can be compressed in various ways, however. A compression algorithm stores either an exact representation or an approximation of the original image in a smaller number of bytes, which can be expanded back to uncompressed form with a corresponding decompression algorithm. Images with the same number of pixels and color depth can have very different compressed file sizes. Even with exactly the same compression method, number of pixels, and color depth, two images of different graphical complexity may compress to very different file sizes, because compression algorithms exploit redundancy in the image data. With some compression formats, less complex images result in smaller compressed files. As a consequence, a lossless format sometimes produces a smaller file than a lossy one. For example, graphically simple images (i.e. images with large continuous regions, such as line art or animation sequences) may be losslessly compressed into GIF or PNG format and result in a smaller file than lossy JPEG.
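The effect of graphical complexity on compressed size can be illustrated with a small sketch. It assumes a hypothetical 256 x 256 grayscale image stored as one byte per pixel and uses Python's general-purpose zlib (DEFLATE) compressor as a stand-in for an image codec, so the numbers are indicative only.

```python
import random
import zlib

WIDTH, HEIGHT = 256, 256  # hypothetical 8-bit grayscale image, one byte per pixel

# A "simple" image: one uniform gray level everywhere.
flat = bytes([128]) * (WIDTH * HEIGHT)

# A "complex" image: pseudo-random noise with the same dimensions and depth.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(WIDTH * HEIGHT))

flat_size = len(zlib.compress(flat))
noisy_size = len(zlib.compress(noisy))

print(f"uncompressed: {WIDTH * HEIGHT} bytes each")
print(f"flat image compressed:  {flat_size} bytes")
print(f"noisy image compressed: {noisy_size} bytes")
```

Both inputs have identical pixel counts and color depth; only their content differs, and the uniform image compresses to a tiny fraction of the noisy one.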
Vector images, unlike raster images, can be any dimension independent of file size. File size increases only with the addition of more vectors.
For example, a 640 * 480 pixel image with 24-bit color would occupy almost a megabyte of space:
640 * 480 * 24 = 7,372,800 bits = 921,600 bytes = 900 KiB
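The same arithmetic can be written as a short sketch; the function name `uncompressed_size_bits` is illustrative and not part of any library.

```python
def uncompressed_size_bits(width, height, bits_per_pixel):
    """Raw size of a raster image: one color value per pixel, no compression."""
    return width * height * bits_per_pixel

bits = uncompressed_size_bits(640, 480, 24)
size_bytes = bits // 8          # 8 bits per byte
size_kib = size_bytes / 1024    # 1 KiB = 1024 bytes

print(bits, size_bytes, size_kib)  # 7372800 921600 900.0
```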
Image file compression
Lossless compression algorithms reduce file size while preserving a perfect copy of the original uncompressed image. Lossless compression generally, but not always, results in larger files than lossy compression. Lossless compression should be used to avoid accumulating stages of re-compression when editing images.
Lossy compression algorithms preserve a representation of the original uncompressed image that may appear to be a perfect copy, but it is not a perfect copy. Often lossy compression is able to achieve smaller file sizes than lossless compression. Most lossy compression algorithms allow for variable compression that trades image quality for file size.
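The difference between the two approaches can be sketched with a toy example. The lossless half uses zlib directly. The lossy half is a deliberately crude scheme (quantizing samples before compressing them) chosen only for illustration; real lossy codecs such as JPEG work quite differently. The `quantize` helper and the `STEP` parameter are assumptions of this sketch.

```python
import random
import zlib

rng = random.Random(1)
original = bytes(rng.getrandbits(8) for _ in range(4096))  # stand-in for 8-bit samples

# Lossless: decompression returns exactly the original bytes.
lossless = zlib.compress(original)
assert zlib.decompress(lossless) == original

def quantize(samples, step):
    """Round each sample down to a multiple of `step`, discarding fine detail."""
    return bytes((s // step) * step for s in samples)

# Lossy (toy scheme): quantize first, then compress the coarser data.
STEP = 16  # larger steps give smaller files and larger errors
approx = quantize(original, STEP)
lossy = zlib.compress(approx)
restored = zlib.decompress(lossy)

max_error = max(abs(a - b) for a, b in zip(original, restored))
print(len(lossless), len(lossy), max_error)
```

Raising `STEP` plays the role of a quality setting: it trades a larger reconstruction error for a smaller file.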
Major graphic file formats
Including proprietary types, there are hundreds of image file types. The PNG, JPEG, and GIF formats are most often used to display images on the Internet. These graphic formats are listed and briefly described below, separated into the two main families of graphics: raster and vector.
In addition to straight image formats, Metafile formats are portable formats which can include both raster and vector information. Examples are application-independent formats such as WMF and EMF. The metafile format is an intermediate format. Most applications open metafiles and then save them in their own native format. Page description language refers to formats used to describe the layout of a printed page containing text, objects and images. Examples are PostScript, PDF and PCL.
JPEG (Joint Photographic Experts Group) is a lossy compression method; JPEG-compressed images are usually stored in the JFIF (JPEG File Interchange Format) file format. The JPEG/JFIF filename extension is JPG or JPEG. Nearly every digital camera can save images in the JPEG/JFIF format, which supports eight-bit grayscale images and 24-bit color images (eight bits each for red, green, and blue). JPEG applies lossy compression to images, which can result in a significant reduction of the file size. Applications can determine the degree of compression to apply, and the amount of compression affects the visual quality of the result. When not too great, the compression does not noticeably affect or detract from the image's quality, but JPEG files suffer generational degradation when repeatedly edited and saved. (JPEG also provides lossless image storage, but the lossless version is not widely supported.)
JPEG 2000 is a compression standard enabling both lossless and lossy storage. The compression methods used are different from the ones in standard JFIF/JPEG; they improve quality and compression ratios, but also require more computational power to process. JPEG 2000 also adds features that are missing in JPEG. It is not nearly as common as JPEG, but it is used currently in professional movie editing and distribution (some digital cinemas, for example, use JPEG 2000 for individual movie frames).
The Exif (Exchangeable image file format) format is a file standard similar to the JFIF format with TIFF extensions; it is incorporated in the JPEG-writing software used in most cameras. Its purpose is to record and to standardize the exchange of images with image metadata between digital cameras and editing and viewing software. The metadata are recorded for individual images and include such things as camera settings, time and date, shutter speed, exposure, image size, compression, name of camera, color information. When images are viewed or edited by image editing software, all of this image information can be displayed.
The actual Exif metadata as such may be carried within different host formats, e.g. TIFF, JFIF (JPEG) or PNG. IFF-META is another example.
The TIFF (Tagged Image File Format) format is a flexible format that normally saves eight bits or sixteen bits per color (red, green, blue) for 24-bit and 48-bit totals, respectively, usually using either the TIFF or TIF filename extension. The tagged structure was designed to be easily extendible, and many vendors have introduced proprietary special-purpose tags, with the result that no single reader handles every flavor of TIFF file. TIFFs can be lossy or lossless, depending on the technique chosen for storing the pixel data. Some of these techniques offer relatively good lossless compression for bi-level (black-and-white) images. Some digital cameras can save images in TIFF format, using the LZW compression algorithm for lossless storage. The TIFF image format is not widely supported by web browsers, but it remains widely accepted as a photograph file standard in the printing business. TIFF can handle device-specific color spaces, such as the CMYK defined by a particular set of printing press inks. OCR (Optical Character Recognition) software packages commonly generate some form of TIFF image (often monochromatic) for scanned text pages.
GIF (Graphics Interchange Format) is in normal use limited to an 8-bit palette, or 256 colors (while 24-bit color depth is technically possible). GIF is most suitable for storing graphics with few colors, such as simple diagrams, shapes, logos, and cartoon style images, as it uses LZW lossless compression, which is more effective when large areas have a single color, and less effective for photographic or dithered images. Due to GIF's simplicity and age, it achieved almost universal software support. Due to its animation capabilities, it is still widely used to provide image animation effects, despite its low compression ratio compared to modern video formats.
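The idea of an 8-bit palette can be sketched as follows: each pixel stores a one-byte index into a table of at most 256 colors instead of a full 24-bit value. The function `to_indexed` is a hypothetical helper for illustration, not GIF's actual encoder, and it performs no color reduction.

```python
def to_indexed(pixels):
    """Map a list of (r, g, b) tuples to (palette, indices).

    Raises ValueError if more than 256 distinct colors are present,
    since an 8-bit index cannot address them.
    """
    palette = []
    lookup = {}
    indices = bytearray()
    for color in pixels:
        if color not in lookup:
            if len(palette) == 256:
                raise ValueError("more than 256 colors; quantization needed")
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    return palette, bytes(indices)

# A hypothetical 4-pixel strip that uses only two colors.
pixels = [(255, 0, 0), (255, 255, 255), (255, 0, 0), (255, 0, 0)]
palette, indices = to_indexed(pixels)
print(palette)        # [(255, 0, 0), (255, 255, 255)]
print(list(indices))  # [0, 1, 0, 0]
```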
The BMP file format (Windows bitmap) handles graphic files within the Microsoft Windows OS. Typically, BMP files are uncompressed, and therefore large and lossless; their advantage is their simple structure and wide acceptance in Windows programs.
The PNG (Portable Network Graphics) file format was created as a free, open-source alternative to GIF. The PNG file format supports eight-bit paletted images (with optional transparency for all palette colors) and 24-bit truecolor (16 million colors) or 48-bit truecolor, with or without an alpha channel, whereas GIF supports only 256 colors and a single transparent color.
Compared to JPEG, PNG excels when the image has large, uniformly colored areas. Even for photographs – where JPEG is often the choice for final distribution since its compression technique typically yields smaller file sizes – PNG is still well-suited to storing images during the editing process because of its lossless compression.
PNG provides a patent-free replacement for GIF (though GIF is itself now patent-free), and can also replace many common uses of TIFF. Indexed-color, grayscale, and truecolor images are supported, plus an optional alpha channel. The Adam7 interlacing allows an early preview, even when only a small percentage of the image data has been transmitted. PNG can store gamma and chromaticity data for improved color matching on heterogeneous platforms.
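How little data an early Adam7 preview needs can be sketched by counting the pixels delivered in each of the seven passes. The pass table below (start offsets and step sizes) follows the pattern given in the PNG specification; the function names are illustrative.

```python
# (x_start, y_start, x_step, y_step) for Adam7 passes 1 through 7
ADAM7 = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]

def pixels_in_pass(width, height, pass_index):
    """Number of pixels of a width x height image carried by one pass."""
    x0, y0, dx, dy = ADAM7[pass_index]
    cols = len(range(x0, width, dx))
    rows = len(range(y0, height, dy))
    return cols * rows

def cumulative_fraction(width, height):
    """Fraction of all pixels received after each successive pass."""
    total = width * height
    sent = 0
    fractions = []
    for i in range(7):
        sent += pixels_in_pass(width, height, i)
        fractions.append(sent / total)
    return fractions

counts = [pixels_in_pass(8, 8, i) for i in range(7)]
print(counts)  # [1, 1, 2, 4, 8, 16, 32]
```

For an 8 x 8 tile the first pass carries a single pixel, that is 1/64 of the data, which is enough for a coarse preview of the whole image area.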
PNG is designed to work well in online viewing applications like web browsers and can be fully streamed with a progressive display option. PNG is robust, providing both full file integrity checking and simple detection of common transmission errors.
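The integrity checking can be sketched with the standard library alone. A PNG file starts with a fixed 8-byte signature, followed by chunks that each end in a CRC-32 over the chunk type and data. The helper names below are illustrative, and the 1 x 1 image is a minimal test fixture rather than a full encoder.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(chunk_type, data):
    """Length, type, data, then a CRC-32 computed over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def make_tiny_png():
    """Build a 1x1, 8-bit grayscale PNG (illustrative only)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    scanline = b"\x00\x80"  # filter type 0, then one mid-gray pixel
    return (PNG_SIGNATURE
            + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", zlib.compress(scanline))
            + make_chunk(b"IEND", b""))

def verify_chunks(png_bytes):
    """Return the chunk types in order; raise ValueError on any mismatch."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("bad PNG signature")
    pos, types = 8, []
    while pos < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8]
        data = png_bytes[pos + 8:pos + 8 + length]
        (stored,) = struct.unpack(">I", png_bytes[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != stored:
            raise ValueError(f"CRC mismatch in {chunk_type!r} chunk")
        types.append(chunk_type.decode("ascii"))
        pos += 12 + length
    return types

png = make_tiny_png()
print(verify_chunks(png))  # ['IHDR', 'IDAT', 'IEND']
```

Flipping any byte inside a chunk's data makes `verify_chunks` raise, which is how a reader can detect a damaged file without decoding the image.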
PPM, PGM, PBM, and PNM
Netpbm format is a family including the portable pixmap file format (PPM), the portable graymap file format (PGM) and the portable bitmap file format (PBM). These are either pure ASCII files or raw binary files with an ASCII header that provide very basic functionality and serve as a lowest common denominator for converting pixmap, graymap, or bitmap files between different platforms. Several applications refer to them collectively as PNM (Portable aNy Map).
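The simplicity of the pure ASCII variant can be seen in a short sketch that serializes a tiny image as a plain PPM (magic number "P3"). The function name `to_ascii_ppm` and the sample pixels are made up for illustration.

```python
def to_ascii_ppm(pixels, maxval=255):
    """Serialize rows of (r, g, b) tuples as a plain ("P3") PPM string."""
    height = len(pixels)
    width = len(pixels[0])
    # Header: magic number, then width and height, then the maximum sample value.
    lines = ["P3", f"{width} {height}", str(maxval)]
    for row in pixels:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"

image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
print(to_ascii_ppm(image), end="")
```

The result is a human-readable text file: three header lines followed by one line of decimal sample values per image row.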
WebP is a new open image format that uses both lossless and lossy compression. It was designed by Google to reduce image file size to speed up web page loading: its principal purpose is to supersede JPEG as the primary format for photographs on the web. WebP is based on VP8's intra-frame coding and uses a container based on RIFF.
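Because each of the formats above begins with a characteristic byte sequence, a file's format can often be guessed from its first few bytes. The sketch below checks the well-known signatures, including the RIFF container header that WebP uses. The function name is hypothetical, and a matching signature does not guarantee that the rest of the file is valid.

```python
def sniff_image_format(header):
    """Guess a raster format from the first bytes of a file, or return None."""
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if header.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if header.startswith((b"GIF87a", b"GIF89a")):
        return "GIF"
    if header.startswith((b"II*\x00", b"MM\x00*")):
        return "TIFF"
    # WebP: "RIFF", a 4-byte little-endian size, then the "WEBP" form type.
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "WebP"
    if header.startswith(b"BM"):
        return "BMP"
    return None

print(sniff_image_format(b"GIF89a\x01\x00\x01\x00"))         # GIF
print(sniff_image_format(b"RIFF\x24\x00\x00\x00WEBPVP8 "))   # WebP
```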
HDR raster formats
Most typical raster formats cannot store HDR data (32 bit floating point values per pixel component), which is why some relatively old or complex formats are still predominant here, and worth mentioning separately. Newer alternatives are showing up, though. RGBE is the format for HDR images originating from Radiance and also supported by Adobe Photoshop.
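As an illustration of how RGBE fits a wide dynamic range into four bytes per pixel, the sketch below follows the commonly published shared-exponent scheme: three 8-bit mantissas plus one 8-bit exponent common to all channels. It assumes non-negative inputs, does no range checking on the exponent, and is not taken from the Radiance sources.

```python
import math

def float_to_rgbe(r, g, b):
    """Pack three non-negative floats into (r, g, b, e) bytes with a shared exponent."""
    v = max(r, g, b)
    if v < 1e-32:
        return (0, 0, 0, 0)
    mantissa, exponent = math.frexp(v)  # v == mantissa * 2**exponent, 0.5 <= mantissa < 1
    scale = mantissa * 256.0 / v
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)

def rgbe_to_float(r, g, b, e):
    """Expand an RGBE quadruple back to floating-point components."""
    if e == 0:
        return (0.0, 0.0, 0.0)
    f = math.ldexp(1.0, e - (128 + 8))  # 2**(e - 128) / 256
    return (r * f, g * f, b * f)

packed = float_to_rgbe(1000.0, 500.0, 0.0)
print(packed)                  # (250, 125, 0, 138)
print(rgbe_to_float(*packed))  # (1000.0, 500.0, 0.0)
```

Values well above 255 survive the round trip here because the exponent byte scales all three mantissas, which an ordinary 8-bit-per-channel format cannot do.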
The High Efficiency Image File Format (HEIF) is an image container format that was standardized by MPEG on the basis of the ISO base media file format. While HEIF can be used with any image compression format, the HEIF standard specifies the storage of HEVC intra-coded images and HEVC-coded image sequences taking advantage of inter-picture prediction.
JFIF was released into the public domain by C-Cube Microsystems. The "official" file format for JPEG files is SPIFF (Still Picture Interchange File Format), but by the time it was released, JFIF had already achieved wide acceptance. SPIFF, which has the ISO designation 10918-3, offers more versatile compression, color management, and metadata capacity than JPEG/JFIF, but it has little support. It may be superseded by JPEG 2000/DIG 2000 (see ISO SC29/WG1, "JPEG - Information Links", and Digital Imaging Group, "JPEG 2000 and the DIG: The Picture of Compatibility").
BPG (Better Portable Graphics) is a new image format. Its purpose is to replace the JPEG image format when quality or file size is an issue. Its main advantages are:
- High compression ratio. Files are much smaller than JPEG for similar quality.
- Based on a subset of the HEVC open video compression standard.
- Supports the same chroma formats as JPEG (grayscale, YCbCr 4:2:0, 4:2:2, 4:4:4) to reduce the losses during the conversion. An alpha channel is supported. The RGB, YCgCo and CMYK color spaces are also supported.
- Native support of 8 to 14 bits per channel for a higher dynamic range.
- Lossless compression is supported.
- Various meta data (such as EXIF) can be included.
Other raster formats
- CD5 (Chasys Draw Image)
- DEEP (IFF-style format used by TVPaint)
- ECW (Enhanced Compression Wavelet)
- FITS (Flexible Image Transport System)
- FLIF (Free Lossless Image Format) - a work-in-progress lossless image format which claims to outperform PNG, lossless WebP, lossless BPG and lossless JPEG 2000 in terms of compression ratio. It uses the MANIAC (Meta-Adaptive Near-zero Integer Arithmetic Coding) entropy encoding algorithm, a variant of the CABAC (context-adaptive binary arithmetic coding) entropy encoding algorithm.
- ICO, container for one or more icons (subsets of BMP and/or PNG)
- ILBM (IFF-style format for up to 32 bit in planar representation, plus optional 64 bit extensions)
- IMG (ERDAS IMAGINE Image)
- IMG (Graphics Environment Manager (GEM) image file; planar, run-length encoded)
- JPEG XR (New JPEG standard based on Microsoft HD Photo)
- Layered Image File Format for microscope image processing
- Nrrd (Nearly raw raster data)
- PAM (Portable Arbitrary Map) is a late addition to the Netpbm family
- PCX (Personal Computer eXchange), obsolete
- PGF (Progressive Graphics File)
- PLBM - Planar Bitmap, proprietary Amiga format
- SID (multiresolution seamless image database, MrSID)
- Sun Raster is an obsolete format
- TGA (TARGA), obsolete
- VICAR file format (NASA/JPL image transport format)
Container formats of raster graphics editors
These image formats contain various images, layers and objects, out of which the final image is to be composed.
- CPT (Corel Photo Paint)
- PSD (Adobe PhotoShop Document)
- PSP (Corel Paint Shop Pro)
- XCF (eXperimental Computing Facility format, native GIMP format)
As opposed to the raster image formats above (where the data describes the characteristics of each individual pixel), vector image formats contain a geometric description which can be rendered smoothly at any desired display size.
At some point, all vector graphics must be rasterized in order to be displayed on digital monitors. Vector images may also be displayed with analog CRT technology such as that used in some electronic test equipment, medical monitors, radar displays, laser shows and early video games. Plotters are printers that use vector data rather than pixel data to draw graphics.
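Rasterization of a geometric description can be sketched with a single shape. The example assumes a circle defined in unit-square coordinates and tests each pixel center against it; the function name and the character-based output are illustrative, and real rasterizers add anti-aliasing and far more general shapes.

```python
def rasterize_circle(cx, cy, radius, size):
    """Sample a circle given in unit-square coordinates onto a size x size grid."""
    rows = []
    for j in range(size):
        y = (j + 0.5) / size  # vertical position of the pixel center
        row = ""
        for i in range(size):
            x = (i + 0.5) / size  # horizontal position of the pixel center
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            row += "#" if inside else "."
        rows.append(row)
    return rows

# The same three numbers describe the shape at every output size.
for row in rasterize_circle(0.5, 0.5, 0.5, 4):
    print(row)
for row in rasterize_circle(0.5, 0.5, 0.5, 16):
    print(row)
```

The description of the circle never changes; only the sampling grid does, which is why the stored size of a vector image is independent of the size at which it is displayed.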
CGM (Computer Graphics Metafile) is a file format for 2D vector graphics, raster graphics, and text, and is defined by ISO/IEC 8632. All graphical elements can be specified in a textual source file that can be compiled into a binary file or one of two text representations. CGM provides a means of graphics data interchange for computer representation of 2D graphical information independent from any particular application, system, platform, or device. It has been adopted to some extent in the areas of technical illustration and professional design, but has largely been superseded by formats such as SVG and DXF.
Gerber format (RS-274X)
The Gerber format (aka Extended Gerber, RS-274X) was developed by Gerber Systems Corp., now Ucamco, and is a 2D bi-level image description format. It is the de facto standard format used by printed circuit board or PCB software. It is also widely used in other industries requiring high-precision 2D bi-level images.
SVG (Scalable Vector Graphics) is an open standard created and developed by the World Wide Web Consortium to address the need (and attempts of several corporations) for a versatile, scriptable and all-purpose vector format for the web and otherwise. The SVG format does not have a compression scheme of its own, but due to the textual nature of XML, an SVG graphic can be compressed using a program such as gzip. Because of its scripting potential, SVG is a key component in web applications: interactive web pages that look and act like applications.
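Compressing an SVG document with gzip can be sketched using Python's standard `gzip` module. The generated drawing (a row of identical circles) and the `make_svg` helper are made up for this example; repetitive markup of this kind compresses particularly well.

```python
import gzip

def make_svg(circle_count):
    """Build a small SVG document containing a row of circles."""
    shapes = "".join(
        f'<circle cx="{10 + 20 * i}" cy="10" r="8" fill="none" stroke="black"/>'
        for i in range(circle_count)
    )
    width = 20 * circle_count
    return (f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="20">'
            f"{shapes}</svg>")

svg_text = make_svg(50)
raw = svg_text.encode("utf-8")
packed = gzip.compress(raw)

# gzip is lossless, so the original markup is recovered exactly.
assert gzip.decompress(packed) == raw
print(len(raw), len(packed))
```

Gzip-compressed SVG files conventionally use the .svgz extension.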
Other 2D vector formats
- AI (Adobe Illustrator Artwork)
- CDR (CorelDRAW)
- GEM metafiles (interpreted and written by the Graphics Environment Manager VDI subsystem)
- Graphics Layout Engine
- HPGL, introduced on Hewlett-Packard plotters, but generalized into a printer language
- HVIF (Haiku Vector Icon Format)
- NAPLPS (North American Presentation Layer Protocol Syntax)
- ODG (OpenDocument Graphics)
- !DRAW, a native vector graphic format (in several backward compatible versions) for the RISC-OS computer system begun by Acorn in the mid-1980s and still present on that platform today
- POV-Ray markup language
- PPT (Microsoft PowerPoint)
- Precision Graphics Markup Language, a W3C submission that was not adopted as a recommendation.
- PSTricks and PGF/TikZ are languages for creating graphics in TeX documents.
- ReGIS, used by DEC computer terminals
- Remote imaging protocol
- VML (Vector Markup Language)
- WMF / EMF (Windows Metafile / Enhanced Metafile)
- Xar format used in vector applications from Xara
- XPS (XML Paper Specification)
3D vector formats
- AMF - Additive Manufacturing File Format
- Asymptote - A language that lifts TeX to 3D.
- .blend - Blender
- .flt - OpenFlight
- IMML - Immersive Media Markup Language
- .MA (Maya ASCII format)
- .MB (Maya Binary format)
- .OBJ (Alias|Wavefront file format)
- OpenGEX - Open Game Engine Exchange
- STL - A stereolithography format
- U3D - Universal 3D file format
- VRML - Virtual Reality Modeling Language
- .3ds - Autodesk 3D Studio
- X3D - Extensible 3D, the XML-based successor to VRML
Compound formats (see also Metafile)
These are formats containing both pixel and vector data, and possibly other data, e.g. the interactive features of PDF.
- EPS (Encapsulated PostScript)
- PDF (Portable Document Format)
- PostScript, a page description language with strong graphics capabilities
- PICT (Classic Macintosh QuickDraw file)
- SWF (Shockwave Flash)
- XAML - User interface language using vector graphics for images.
- MPO - The Multi Picture Object (.mpo) format consists of multiple JPEG images (Camera & Imaging Products Association, CIPA).
- PNS The PNG Stereo (.pns) format consists of a side-by-side image based on PNG (Portable Network Graphics).
- JPS The JPEG Stereo (.jps) format consists of a side-by-side image format based on JPEG.