Image Optimization: An Engineer's Guide

From HTML responsive images, to lazy loading, to layout shifts, to image CDNs... this guide covers it all.

Images are the most ubiquitous asset and an essential part of the web. However, they’re often the largest contributor to overall page size. Large, unoptimized images dramatically slow down your website.

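While there are many tools to make image optimization easy, it’s important to understand the underlying concepts and how they work. They can generally be broken down into the following aspects: properly sizing images, serving next-gen formats, generating images in multiple sizes and formats, lazy loading offscreen images, holding image position to prevent layout shifts, and hosting images on a CDN.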

Let’s take a look at each of them in more detail.

Properly Size Images

The <img> element traditionally only supports a src attribute that lets you point the browser to a single source file:

<img src="https://example.com/image.jpg" alt="A cat" />

With responsive images, we can provide multiple sources of the same image for different devices, and the browser will choose the most appropriate one to load. They are useful in three scenarios: a fixed-size image with different pixel densities; different image sizes and pixel densities; and different sizes, densities, and art directions.

Fixed Size Image with Different Pixel Densities

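Pixel density indicates how many physical pixels fit into a given length of screen (usually measured in pixels per inch). The higher the pixel density, the more pixels fit into the same space. On the web, it is expressed as the device pixel ratio: the number of device pixels per CSS pixel along one dimension. The most common ratios are 1x, 2x, and 3x. Retina displays typically have a ratio of 2x. It means that for the same CSS size, a retina display uses 2 device pixels per CSS pixel in each dimension, or 4 times as many device pixels in total as a standard display.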

So in practice, if we want images to look sharp on high pixel density screens, we just need to provide images with higher intrinsic dimensions. For example:

| Device Pixel Ratio | Indicates that: | On this device, an <img> tag with a CSS width of 250px will look best when the source image is… |
| --- | --- | --- |
| 1 | 1 device pixel = 1 CSS pixel | 250 pixels wide |
| 2 | 2 device pixels = 1 CSS pixel | 500 pixels wide |
| 3 | 3 device pixels = 1 CSS pixel | 750 pixels wide |
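The arithmetic behind these figures is a single multiplication. Here is a minimal sketch; the function name is hypothetical, and in a real page the ratio would come from window.devicePixelRatio:

```typescript
// Intrinsic width (in image pixels) a source needs to look sharp
// when displayed at cssWidth CSS pixels on a screen with the given ratio.
function requiredSourceWidth(cssWidth: number, devicePixelRatio: number): number {
  return Math.ceil(cssWidth * devicePixelRatio);
}

for (const dpr of [1, 2, 3]) {
  console.log(`${dpr}x: ${requiredSourceWidth(250, dpr)} pixels wide`);
}
// 1x: 250 pixels wide
// 2x: 500 pixels wide
// 3x: 750 pixels wide
```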

An example of a fixed-size image with different pixel densities is a logo. Say you have a logo that’s 100x100px. If you only serve it at 1x, it will look blurry on retina displays. If you only serve it at 2x, it will waste bandwidth on non-retina displays. To solve this problem, we can use the srcset attribute with the x (density) descriptor in the <img> element:

<img
  srcset="
  logo-1x.jpg 1x,
  logo-2x.jpg 2x,
  logo-3x.jpg 3x
"
  src="logo-1x.jpg"
  alt="Company logo"
  width="100"
  height="100"
/>

In this example, we’re telling the browser that it can choose from three image sources: logo-1x.jpg, logo-2x.jpg, and logo-3x.jpg. We provide a density descriptor (1x, 2x, 3x) following each image source to tell the browser what pixel density the image source is intended for, e.g., logo-1x.jpg is intended for displays with a 1x pixel density and logo-2x.jpg for displays with a 2x pixel density. We also specify a default image source (logo-1x.jpg) in the src attribute. Browsers that don’t support srcset will load the image from this source.

Several things to note here:

Different Image Sizes and Pixel Densities

Images with different sizes are commonly used on responsive websites. For example, you have a landing page with a hero image:

<article class="hero">
  <img src="hero-image.jpg" alt="A cat" />
  ...
</article>

You also have a set of CSS rules to display the hero section responsively and to prevent the image from overflowing its container or squashing/stretching:

.hero {
  width: min(90%, 1000px);
  margin-inline: auto;
}

.hero img {
  max-inline-size: 100%; /* prevent overflow */
  block-size: auto; /* preserve aspect ratio */
  aspect-ratio: 2/1;
  object-fit: cover; /* crop image if necessary */
}

For windows wider than 1111px (1000px / 90%), the hero image will be 1000px wide. So you point the <img> to a src with a 1000px width. However, for smaller screens, the image will be 90% of the viewport width. It’s unnecessary for these screens to load such a large image.

In this case, you’d want to use the srcset attribute with the w (width) descriptor to provide multiple image sources with different sizes:

<img
  srcset="
    cat-600w.jpg 600w,
    cat-1000w.jpg 1000w,
    cat-1500w.jpg 1500w,
    cat-2000w.jpg 2000w
  "
  src="cat-600w.jpg"
  sizes="(min-width: 1111px) 1000px, 90vw"
  alt="A cat"
/>

Here we’re telling the browser that it can choose from 4 image sources: cat-600w.jpg, cat-1000w.jpg, cat-1500w.jpg, cat-2000w.jpg.

In the pixel density example in the previous section, the x descriptor following image sources in srcset is used to tell browsers what pixel density the image source is intended for. In this example, however, the w descriptor is used to tell browsers what width the image asset itself is.

The sizes attribute tells the browser what size you expect the image to be displayed at under different conditions. Each entry pairs a media condition with a width, and the last entry is a default with no condition. The browser uses the first entry whose condition matches. Here we’re saying that if the window is equal to or wider than 1111px, the image should be 1000px wide. Otherwise, it should be 90% of the viewport width.

So finally, via srcset, the browser knows the resources available and their widths. Via sizes, it knows the width of the <img> for a given window width. Combined with the device pixel ratio of the screen, it can now pick the best resource to download.
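To make the selection concrete, here is a rough model of it for the hero image. This is a sketch under a stated assumption: browsers are free to choose differently (for example, reusing a larger cached image), so the "smallest candidate that covers the slot" rule and the function names below are illustrative, not the specified algorithm.

```typescript
// One entry per srcset candidate: the file and the width declared by its w descriptor.
interface Candidate {
  url: string;
  width: number;
}

const heroCandidates: Candidate[] = [
  { url: "cat-600w.jpg", width: 600 },
  { url: "cat-1000w.jpg", width: 1000 },
  { url: "cat-1500w.jpg", width: 1500 },
  { url: "cat-2000w.jpg", width: 2000 },
];

// Mirrors sizes="(min-width: 1111px) 1000px, 90vw": the first matching condition wins.
function heroSlotWidth(viewportWidth: number): number {
  return viewportWidth >= 1111 ? 1000 : viewportWidth * 0.9;
}

// Simplified heuristic (an assumption): choose the smallest candidate that covers
// slot width x device pixel ratio, falling back to the largest one available.
function pickCandidate(candidates: Candidate[], slotWidth: number, dpr: number): Candidate {
  const needed = slotWidth * dpr;
  let best: Candidate | null = null;
  let largest: Candidate | null = null;
  for (const candidate of candidates) {
    if (largest === null || candidate.width > largest.width) largest = candidate;
    if (candidate.width >= needed && (best === null || candidate.width < best.width)) best = candidate;
  }
  const chosen = best !== null ? best : largest;
  if (chosen === null) throw new Error("srcset must list at least one candidate");
  return chosen;
}

console.log(pickCandidate(heroCandidates, heroSlotWidth(1440), 1).url); // cat-1000w.jpg
console.log(pickCandidate(heroCandidates, heroSlotWidth(1440), 2).url); // cat-2000w.jpg
console.log(pickCandidate(heroCandidates, heroSlotWidth(400), 2).url); // cat-1000w.jpg
```

Note how a 400px-wide phone at 2x ends up with the same file as a wide 1x desktop: what matters is the slot width multiplied by the pixel ratio, not the viewport alone.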

Create sizes

In our previous example, the sizes are pretty straightforward, but they can get complicated in practice. For example, let’s take a look at this template I found on Figma community:

Responsive design for mobile and desktop

My interpretation of the design is:

To put them together in the sizes attribute, we have:

<img
  ...
  sizes="
    (min-width: 1250px) calc(1170px / 2),
    (min-width: 769px) calc((100vw - 40px * 2) / 2),
    (min-width: 431px) calc(100vw - 40px * 2),
    100vw
  "
/>
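One way to sanity-check a sizes value like this is to mirror it as a plain function and try a few viewport widths. The sketch below is only a hand translation of the four entries above (the function name is hypothetical); it does not parse the attribute.

```typescript
// Hand translation of the sizes attribute above: conditions are checked
// top to bottom and the first match decides the slot width in CSS pixels.
function figmaSlotWidth(viewportWidth: number): number {
  if (viewportWidth >= 1250) return 1170 / 2;
  if (viewportWidth >= 769) return (viewportWidth - 40 * 2) / 2;
  if (viewportWidth >= 431) return viewportWidth - 40 * 2;
  return viewportWidth;
}

console.log(figmaSlotWidth(1440)); // 585
console.log(figmaSlotWidth(1024)); // 472
console.log(figmaSlotWidth(768)); // 688
console.log(figmaSlotWidth(430)); // 430
```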

Create srcset

Picking which resources to create and include in srcset is a balancing act. You want to serve more image sizes for better page load speed, but it will take up more space on your servers and require writing a bit more HTML. I typically like to go through the breakpoints one by one. In our Figma example, we have:

So roughly speaking, we need 2 image assets for 1x tablet and desktop: 585px and 344px. For 2x devices, we need: 1170px, 688px, and 862px.

<img
  ...
  srcset="
    image-585w.jpg 585w,
    image-1170w.jpg 1170w,
    image-344w.jpg 344w,
    image-688w.jpg 688w,
    image-862w.jpg 862w
  "
/>

To put the srcset and sizes together for the Figma example, we have:

<img
  srcset="
    image-585w.jpg 585w,
    image-1170w.jpg 1170w,
    image-344w.jpg 344w,
    image-688w.jpg 688w,
    image-862w.jpg 862w
  "
  src="image-344w.jpg"
  sizes="
    (min-width: 1250px) calc(1170px / 2),
    (min-width: 769px) calc((100vw - 40px * 2) / 2),
    (min-width: 431px) calc(100vw - 40px * 2),
    100vw
  "
  alt="..."
/>

Different Sizes, Densities, and Art Directions

Art direction change means showing entirely different images on different display sizes. A common use case is to zoom in on the area of interest when the image is displayed at a smaller size. This technique is not as widely used as the previous two, and the benefits of art direction are purely aesthetic. However, the <picture> tag used to enable it is necessary for other use cases, such as different image formats. So it’s worth understanding the syntax.

Art direction uses <picture>, <source>, and <img> tags. The <picture> element is a wrapper for zero or more <source> tags and one <img> tag. The <source> tags are used to specify different image sources for different conditions. The <img> tag is used to specify a fallback image source if none of the conditions provided by <source> are met and for browsers that don’t support <picture>. A very simple example:

<picture>
  <source
    media="(min-width: 769px)"
    srcset="zoom-out-rectangle.jpg"
  />
  <source
    media="(min-width: 431px)"
    srcset="zoom-in-rectangle.jpg"
  />
  <img src="zoom-in-square.jpg" alt="A cat" />
</picture>

Like <img>, <source> can take a srcset attribute with multiple images referenced, as well as a sizes attribute. So, you could offer multiple images via a <picture> element, but then also offer multiple resolutions of each one. Once the <source> or <img> is picked by the browser, the srcset and sizes attributes work as in previous examples.

For example, we wrote our responsive image code for the Figma example using an <img> tag with srcset and the w descriptor. With <picture>, we can refactor it and use the <source> tag’s media to specify the breakpoints and srcset to include image sources for 1x and 2x densities:

<picture>
  <source
    media="(max-width: 430px)"
    srcset="
      image-430-1x.jpg 1x,
      image-430-2x.jpg 2x
    "
  />
  <source
    media="(max-width: 768px)"
    srcset="
      image-344-1x.jpg 1x,
      image-344-2x.jpg 2x
    "
  />
  <img
    src="image-585-1x.jpg"
    srcset="
      image-585-1x.jpg 1x,
      image-585-2x.jpg 2x
    "
    alt="..."
  />
</picture>

Note: While similar, the 2 code snippets don’t work the same. In the first one, the browser first checks the sizes attribute to determine what size the image should be displayed at and picks the best resource among all 5 sources listed in srcset. In the latter, the browser first evaluates all source and img tags and picks the first one that matches the condition specified in the media query. It then only goes through the srcset attribute of that one source or img tag to pick the best resource to load.
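The two-stage behavior of the <picture> version can be modeled in a few lines. This is a simplified sketch: the data mirrors the <picture> snippet above, the names are hypothetical, and the "2x for any ratio above 1" rule is an assumption, since browsers decide how to round in-between ratios.

```typescript
// Each <source> from the snippet: its max-width media condition and its 1x/2x files.
interface PictureSource {
  maxViewportWidth: number;
  oneX: string;
  twoX: string;
}

const pictureSources: PictureSource[] = [
  { maxViewportWidth: 430, oneX: "image-430-1x.jpg", twoX: "image-430-2x.jpg" },
  { maxViewportWidth: 768, oneX: "image-344-1x.jpg", twoX: "image-344-2x.jpg" },
];

// The <img> fallback, used when no <source> media condition matches.
const imgFallback = { oneX: "image-585-1x.jpg", twoX: "image-585-2x.jpg" };

function pickFromPicture(viewportWidth: number, dpr: number): string {
  // Stage 1: take the first <source> whose media condition matches, else the <img>.
  let matched: { oneX: string; twoX: string } = imgFallback;
  for (const source of pictureSources) {
    if (viewportWidth <= source.maxViewportWidth) {
      matched = source;
      break;
    }
  }
  // Stage 2: only the matched element's srcset is considered.
  return dpr > 1 ? matched.twoX : matched.oneX;
}

console.log(pickFromPicture(400, 2)); // image-430-2x.jpg
console.log(pickFromPicture(700, 1)); // image-344-1x.jpg
console.log(pickFromPicture(1440, 2)); // image-585-2x.jpg
```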

Serve Next-Gen Formats

An important use case of <picture> is to serve different image formats. Next-gen formats can reduce the file size of an image by 30% to 50%, which decreases page size and improves performance. The WebP image format is designed to supersede JPEG, PNG and GIF. AVIF and JPEG XL are designed to supersede WebP. WebP and AVIF now have relatively broad browser support, at 97% and 76% respectively at the time of writing.

However, not all browsers support all formats. To load the most efficient format for a given browser, we can use the <picture> element to supply MIME types inside the type attribute so the browser can ignore resources with formats it doesn’t support:

<picture>
  <source type="image/avif" srcset="cat.avif" />
  <source type="image/webp" srcset="cat.webp" />
  <img src="cat.jpg" alt="A cat" />
</picture>

Very similar to the art direction use case, the <picture> tag wraps around zero or more <source> tags and one <img> tag. The <source> tag specifies a media resource with the type attribute indicating the MIME type of the resource. The browser evaluates the list of <source> tags in order and ignores any <source> tags with MIME types it doesn’t support. If none of the <source> tags are supported, the browser will load the image in the <img> tag.
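The format fallback described here boils down to "first supported type wins". A minimal sketch, with hypothetical names and a hand-written list of supported MIME types standing in for what the browser knows about itself:

```typescript
// One entry per <source>: its MIME type and the file it points to.
interface FormatSource {
  type: string;
  url: string;
}

const catSources: FormatSource[] = [
  { type: "image/avif", url: "cat.avif" },
  { type: "image/webp", url: "cat.webp" },
];

// Walk the <source> list in document order, skip unsupported types,
// and fall back to the <img> src when nothing matches.
function pickByFormat(sources: FormatSource[], fallbackUrl: string, supportedTypes: string[]): string {
  for (const source of sources) {
    if (supportedTypes.indexOf(source.type) !== -1) return source.url;
  }
  return fallbackUrl;
}

console.log(pickByFormat(catSources, "cat.jpg", ["image/avif", "image/webp"])); // cat.avif
console.log(pickByFormat(catSources, "cat.jpg", ["image/webp"])); // cat.webp
console.log(pickByFormat(catSources, "cat.jpg", [])); // cat.jpg
```

Because order decides the outcome, the most efficient format should be listed first.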

Like before, we can offer multiple image types via a <picture> element, but then also offer multiple resolutions of each type with the srcset and sizes attributes in <source> and <img>. To build on our previous Figma example, but with a bit of simplification for clarity, we get the following code:

<picture>
  <source
    type="image/avif"
    srcset="
      image-585w.avif 585w,
      ...
      image-862w.avif 862w
    "
    sizes="
      (min-width: 1250px) calc(1170px / 2),
      ...
      100vw
    "
  />
  <source
    type="image/webp"
    srcset="
      image-585w.webp 585w,
      ...
      image-862w.webp 862w
    "
    sizes="
      (min-width: 1250px) calc(1170px / 2),
      ...
      100vw
    "
  />
  <img
    srcset="
      image-585w.jpg 585w,
      ...
      image-862w.jpg 862w
    "
    sizes="
      (min-width: 1250px) calc(1170px / 2),
      ...
      100vw
    "
    src="image-344w.jpg"
    alt="..."
  />
</picture>

When the browser encounters this code, it will iterate through the list of <source> and <img> tags until it finds the MIME type it supports. It will then look at the sizes and srcset attributes to determine the best image to load. If nothing works, it will just load the default src in the <img> tag.

Generate Images In Multiple Sizes and Formats

Now that we know what image sizes and formats we need for our design, let’s create them so we can test how they work. A popular tool for one-off image processing is the ImageMagick command-line tool. We’ll use it to test the final code snippet in the above section.

In that example, we have the following srcset for the image widths:

<img
  srcset="
    image-585w.jpg 585w,
    image-1170w.jpg 1170w,
    image-344w.jpg 344w,
    image-688w.jpg 688w,
    image-862w.jpg 862w
  "
  ...
/>

We’re also serving WebP and AVIF formats, so we need to generate 15 images in total (5 widths × 3 formats). We can use the following commands to get them:

# Resized JPEGs
convert image.jpg -resize 585 image-585w.jpg
convert image.jpg -resize 1170 image-1170w.jpg
...
# Lossy WebP and AVIF, encoded from the original to avoid recompressing a resized JPEG
convert image.jpg -resize 585 -quality 50 image-585w.webp
convert image.jpg -resize 1170 -quality 50 image-1170w.webp
...
convert image.jpg -resize 585 -quality 50 image-585w.avif
convert image.jpg -resize 1170 -quality 50 image-1170w.avif
...
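Typing 15 commands by hand is error-prone, so the list can also be generated from the widths in srcset. The helper below is a hypothetical sketch, not part of ImageMagick: it only builds command strings, applies -quality to the WebP and AVIF outputs, and always reads from the original image.jpg.

```typescript
// Build one ImageMagick command per (format, width) pair.
function buildConvertCommands(source: string, widths: number[], formats: string[], quality: number): string[] {
  const commands: string[] = [];
  for (const format of formats) {
    // Leave JPEG at ImageMagick's default quality; compress the next-gen formats harder.
    const qualityFlag = format === "jpg" ? "" : ` -quality ${quality}`;
    for (const width of widths) {
      commands.push(`convert ${source} -resize ${width}${qualityFlag} image-${width}w.${format}`);
    }
  }
  return commands;
}

const commands = buildConvertCommands("image.jpg", [585, 1170, 344, 688, 862], ["jpg", "webp", "avif"], 50);
console.log(commands.length); // 15
console.log(commands.join("\n"));
```

The printed lines can be pasted into a shell or saved as a script.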

After running the commands, you should have all 15 images in the same directory as the original image. Put them in your site’s public directory and reference them somewhere on your site with the responsive image HTML syntax from the previous section. Now we’re ready to test it in action.

Open the Network tab in your browser’s developer tools and load the page where you referenced your images. You should see the browser loading the image in the appropriate size and format. Disable the cache in the Network tab and resize your window a few times to verify that the browser changes which resource it loads depending on viewport size. For example, my browser only loaded the 1170w image in AVIF because I’m on a wide viewport using the newest Chrome:

Responsive design for mobile and desktop

This workflow is pretty manual. It’s certainly not sustainable for a production website with lots of images to process. Luckily, there are many tools and services to automate the process. We went through it the manual way because it’s important to understand what’s happening under the hood.

If you’d like to integrate the image generation process into your Node-based build pipeline, Sharp is a popular choice. It’s a Node library used under the hood by various SSR and SSG frameworks, including Next.js, Gatsby, Eleventy, and Astro.

For example, if you use @astrojs/image, after you build your site, you’ll see your image assets in different sizes and formats in the dist directory although you originally only had one image file. These assets are generated by Sharp automatically during the build process depending on the widths and sizes you specified in the <Image /> or <Picture> component.

There are also many cloud-based services that have a specific focus on image management like Cloudinary and Imgix. They generally offer rich features (like background removal and face detection) for image transformation on the fly, and allow you to upload images to their servers or integrate with your existing storage (like AWS S3 or Google Cloud Storage). We’ll look at them in a bit more detail in the Host Images on a CDN section.

Lazy Load Offscreen Images

We’ve verified that the browser is loading the right image size for the viewport and the most efficient format supported by the browser. This will help us save bandwidth during page load. But there are still a few more optimizations we can do.

Because images are usually large in size, we should lazy load those that are offscreen. This can be done by setting the loading attribute to lazy on <img> elements. This tells the browser to load the image immediately if it is in the viewport, and to fetch other images when the user scrolls near them. It is a great way to defer loading images that are not critical to the initial page load:

<!-- In img -->
<img
  srcset="..."
  sizes="..."
  src="image-344w.jpg"
  alt="..."
  loading="lazy"
/>

<!-- In picture -->
<picture>
  <source type="image/avif" srcset="..." sizes="..." />
  ...
  <img
    srcset="..."
    sizes="..."
    src="image-344w.jpg"
    alt="..."
    loading="lazy"
  />
</picture>

This approach lets the browser decide when to load the image depending on the user’s network speed and distance to the image. Browser support was at 92% at the time of writing. For older browsers, use the loading-attribute-polyfill.

Now let’s test if lazy loading is working on our image. Add loading="lazy" to the previous code snippet that you used to test responsive images. If this image element is not below the fold, make sure to add an empty element with a height of 200vh above it to push it down. You can add a text element below the image to get a clearer view. Reload the page and scroll down to the image. You should see the image loading only when you scroll near it. If the loading is too fast, you can throttle your network speed to slow 3G in the Network tab or just observe the waterfall.

Hold Image Position to Prevent Layout Shifts

If you’ve done the previous step to test out lazy loading, you may have observed that the text element first occupied the space where the image was supposed to be, and then the image was loaded and pushed the text down. This is called a layout shift. If you didn’t see it, set your network to slow 3G, disable cache, and try it again. Layout shifts will likely be more visible with a slower network.

This is a bad user experience because it makes the page jump around and can cause users to lose their place on the page. Lighthouse has a Cumulative Layout Shift (CLS) metric to measure the layout shifts on a page. A high CLS lowers your Lighthouse performance score.

To prevent layout shifts caused by images, the browser needs to reserve the correct space for an image in the layout before it loads in. And we can let the browser know how much space to reserve by specifying it in our markup. This can be achieved with several approaches.

We can set the width and height attributes explicitly on the <img> and <source> tags that use srcset:

<picture>
  <source
    type="image/avif"
    srcset="..."
    sizes="..."
    width="585"
    height="329"
  />
  ...
  <img
    src="image-585w.jpg"
    srcset="..."
    sizes="..."
    width="585"
    height="329"
    alt="..."
    loading="lazy"
  />
</picture>

Some general patterns to cover most use cases:

Another approach is to use the CSS aspect-ratio property. This property accepts a ratio in the form of width / height, for example 16 / 9. The browser will use it to calculate the height of the image based on the width. This may result in a different height being used than the original image, so you’ll likely want to use the CSS object-fit property to control how the image is scaled to fit the space:

img {
  width: 100%;
  aspect-ratio: 16/9;
  object-fit: cover;
}

Specifying the width / height or width / aspect-ratio will not only decrease the chances of layout shifts, but also help the browser make better decisions for lazy loading. Without these attributes, images are 0x0px at first. Imagine if you have a gallery of such images, the browser may conclude that all of them fit into the first viewport since they take no space and none of them is pushed offscreen. This will result in all images being loaded at once, although we’d probably want to defer some of them in reality.
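The space the browser reserves follows directly from the rendered width and the ratio. A small sketch of that calculation (the function name is hypothetical, and rounding to whole pixels is an assumption made for readability):

```typescript
// Height reserved for an image rendered at renderedWidth CSS pixels
// with an aspect ratio of ratioWidth / ratioHeight.
function reservedHeight(renderedWidth: number, ratioWidth: number, ratioHeight: number): number {
  return Math.round((renderedWidth * ratioHeight) / ratioWidth);
}

console.log(reservedHeight(585, 16, 9)); // 329
console.log(reservedHeight(1000, 2, 1)); // 500
```

The first result matches the width="585" height="329" pair used in the markup earlier: both approaches give the browser the same information.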

Now apply your preferred approach to the code snippet you used to test lazy loading while remaining on a slow 3G connection. When you scroll to the image, you should first see an empty space reserved for the image and the text element below the space. When the image is loaded, there’s no content shift.

Host Images on a CDN

When you have tons of image assets, you’d want to serve them from a CDN. CDNs are optimized for serving static assets because they can cache them and serve them from the cache nearest to the user. This will reduce requests to the origin server and improve speed.

As mentioned before, there are many options when it comes to image CDN services and they usually come with rich features for image transformation on the fly. Here are two common use cases:

Cropping

In the previous section, we mentioned that explicit width and height or the aspect-ratio of an image is needed to prevent layout shifts. But what if the intrinsic dimensions of the image are unknown? You can use CSS properties like object-fit and object-position to scale the image and position the focus, but the transformations you can make with these properties are limited. This is when the rich features of the cloud services come in handy.

For example, with Cloudinary, you can use the g_auto transformation to automatically detect the focus of the image, then use c_fill_pad to first attempt the fill mode, adding some padding if the algorithm determines that more of the original image needs to be included in the final image. There are also c_imagga_crop and c_imagga_scale, which use the Imagga algorithm to crop and scale images. Finally, you can use the f_auto transformation to automatically deliver the image in the most suitable format for the requesting browser. If you’re interested in learning more about the image transformation options, take a look at Cloudinary’s API reference.

Low Quality Image Placeholders

When we observed the behavior of image lazy loading on a slow network, we saw an empty space being reserved for the size of the image before it loaded in. You might also have observed that the unstyled alt text appeared for a bit in the image space, which might not be aesthetically pleasing. Ideally, we’d want to show a placeholder image to indicate to the user that the image is being loaded while providing some visual information about the actual image.

Image CDN services usually offer many different options for low quality image placeholders, such as a blurred or grayscale version of the image or a solid, dominant color of the image. These low quality placeholders are usually significantly smaller in file size than the actual image and can be loaded much quicker. If you look at Pinterest’s image loading behavior, you’ll see that they use a dominant color placeholder for the images in the feed. Again, with Cloudinary, you can use transformations like e_grayscale, e_blur:200 to generate low quality image placeholders.
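With Cloudinary, transformations like the ones mentioned in this section are written into the delivery URL. The sketch below assembles such URLs as plain strings. Several things here are assumptions for illustration: the cloud name "demo", the public ID "cat.jpg", and the w_585 and h_329 dimension parameters, which are not mentioned above. Check Cloudinary’s API reference before relying on any particular combination.

```typescript
// Join transformation components with "/" so they are applied as a chain,
// following the res.cloudinary.com/<cloud>/image/upload/<transformations>/<id> pattern.
function cloudinaryUrl(cloudName: string, publicId: string, transformations: string[]): string {
  const chain = transformations.join("/");
  return `https://res.cloudinary.com/${cloudName}/image/upload/${chain}/${publicId}`;
}

// Crop to the reserved box, letting g_auto find the area of interest.
const crop = "c_fill_pad,g_auto,w_585,h_329";

const fullUrl = cloudinaryUrl("demo", "cat.jpg", [crop, "f_auto"]);
const placeholderUrl = cloudinaryUrl("demo", "cat.jpg", [crop, "e_blur:200", "f_auto"]);

console.log(fullUrl);
console.log(placeholderUrl);
```

The placeholder reuses the same crop so that it occupies exactly the space the final image will fill.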

Here’s a list of some other popular image CDN services:

Finally

In this article, we’ve covered some of the fundamental concepts of image optimization and gone through some common techniques manually. Getting image optimization right is not a trivial task, so I’d recommend automating the process as much as possible!

Nowadays almost all SSR and SSG frameworks have some kind of optimized <Image> component built-in. Hopefully this article has provided enough context for understanding how these components work and how we can use them properly to build a better, faster web.