How does Google's Page Speed lossless image compression work? - compression

When you run Google's PageSpeed plugin for Firebug/Firefox on a website it will suggest cases where an image can be losslessly compressed, and provide a link to download this smaller image.
For example:
Losslessly compressing http://farm3.static.flickr.com/2667/4096993475_80359a672b_s.jpg could save 33.5KiB (85% reduction).
Losslessly compressing http://farm2.static.flickr.com/1149/5137875594_28d0e287fb_s.jpg could save 18.5KiB (77% reduction).
Losslessly compressing http://cdn.uservoice.com/images/widgets/en/feedback_tab_white.png could save 262B (11% reduction).
Losslessly compressing http://ajax.googleapis.com/ajax/libs/jqueryui/1.8.9/themes/base/images/ui-bg_flat_75_ffffff_40x100.png could save 91B (51% reduction).
Losslessly compressing http://www.gravatar.com/avatar/0b1bccebcd4c3c38cb5be805df5e4d42?s=45&d=mm could save 61B (5% reduction).
This applies across both JPG and PNG filetypes (I haven't tested GIF or others.)
Note too the Flickr thumbnails (all of those images are 75x75 pixels); those are some pretty big savings. If this is really so effective, why isn't Yahoo applying it server-side to their entire library and reducing their storage and bandwidth loads?
Even Stackoverflow.com stands to gain some very minor savings:
Losslessly compressing http://sstatic.net/stackoverflow/img/sprites.png?v=3 could save 1.7KiB (10% reduction).
Losslessly compressing http://sstatic.net/stackoverflow/img/tag-chrome.png could save 11B (1% reduction).
I've seen PageSpeed suggest pretty decent savings on PNG files that I created using Photoshop's 'Save for Web' feature.
So my question is, what changes are they making to the images to reduce them by so much? I'm guessing there are different answers for different filetypes. Is this really lossless for JPGs? And how can they beat Photoshop? Should I be a little suspicious of this?

If you're really interested in the technical details, check out the source code:
png_optimizer.cc
jpeg_optimizer.cc
webp_optimizer.cc
For PNG files, they use OptiPNG with a trial-and-error approach:
// we use these four combinations because different images seem to benefit from
// different parameters and this combination of 4 seems to work best for a large
// set of PNGs from the web.
const PngCompressParams kPngCompressionParams[] = {
  PngCompressParams(PNG_ALL_FILTERS, Z_DEFAULT_STRATEGY),
  PngCompressParams(PNG_ALL_FILTERS, Z_FILTERED),
  PngCompressParams(PNG_FILTER_NONE, Z_DEFAULT_STRATEGY),
  PngCompressParams(PNG_FILTER_NONE, Z_FILTERED)
};
When all four combinations are applied, the smallest result is kept. Simple as that.
(N.B.: The optipng command line tool does that too if you provide -o 2 through -o 7)
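If you want to replicate that trial-and-error approach yourself from the command line, a rough stand-in (using the optipng CLI rather than PageSpeed's in-process libpng code, with a placeholder images/ directory) is to restrict optipng to the same four filter/strategy combinations and let it keep the smallest trial:
# Rough equivalent of PageSpeed's four combinations:
#   -f 0,5   try "no filter" and "all filters" (adaptive) row filtering
#   -zs 0,1  try the Z_DEFAULT_STRATEGY and Z_FILTERED zlib strategies
# optipng runs every combination and writes whichever result is smallest.
mkdir -p optimized
for f in images/*.png; do
  optipng -f 0,5 -zs 0,1 -dir optimized "$f"
done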
For JPEG files, they use jpeglib with the following options:
JpegCompressionOptions()
    : progressive(false), retain_color_profile(false),
      retain_exif_data(false), lossy(false) {}
Similarly, WEBP is compressed using libwebp with these options:
WebpConfiguration()
    : lossless(true), quality(100), method(3), target_size(0),
      alpha_compression(0), alpha_filtering(1), alpha_quality(100) {}
There is also image_converter.cc which is used to losslessly convert to the smallest format.
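A quick way to try the same idea by hand (this is not image_converter.cc itself, just an approximation using the cwebp CLI) is to re-encode a PNG as lossless WebP and keep whichever file comes out smaller:
# lossless PNG -> WebP re-encode, then compare sizes and serve the smaller file
cwebp -lossless input.png -o input.webp
ls -l input.png input.webp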

I use jpegoptim to optimize JPG files and optipng to optimize PNG files.
If you're on bash, the command to losslessly optimize all JPGs in a directory (recursively) is:
find /path/to/jpgs/ -type f -name "*.jpg" -exec jpegoptim --strip-all {} \;
You can add -m[%] to jpegoptim to compress JPG images lossily, for example:
find /path/to/jpgs/ -type f -name "*.jpg" -exec jpegoptim -m70 --strip-all {} \;
To optimize all PNGs in a directory:
find /path/to/pngs/ -type f -name "*.png" -exec optipng -o2 {} \;
-o2 is the default optimization level; you can increase this from -o2 up to -o7. Note that a higher optimization level means longer processing time.

Take a look at http://code.google.com/speed/page-speed/docs/payload.html#CompressImages which describes some of the techniques/tools.

It's a matter of trading the encoder's CPU time for compression efficiency. Compression is a search for shorter representations, and if you search harder, you'll find shorter ones.
There is also a matter of using image format capabilities to the fullest, e.g. PNG8+a instead of PNG24+a, optimized Huffman tables in JPEG, etc.
Photoshop doesn't really try hard to do that when saving images for the web, so it's not surprising that any tool beats it.
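For example, assuming you have jpegtran and pngquant installed, a rough sketch of those two ideas from the command line would be:
# rebuild optimized Huffman tables (lossless) and strip metadata from a JPEG
jpegtran -optimize -copy none input.jpg > output.jpg
# quantize a 24-bit PNG with alpha down to an 8-bit palette (lossy, like ImageAlpha)
pngquant --quality 65-90 --output output.png input.png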
See
ImageOptim (lossless) and
ImageAlpha (lossy) for smaller PNG files (there's a high-level description of how it works) and
JPEGmini/MozJPEG (lossy) for a better JPEG compressor.

To Replicate PageSpeed's JPG Compression Results in Windows:
I was able to get exactly the same compression results as PageSpeed using the Windows version of jpegtran which you can get at www.jpegclub.org/jpegtran. I ran the executable using the DOS prompt (use Start > CMD). To get exactly the same file size (down to the byte) as PageSpeed compression, I specified Huffman optimization as follows:
jpegtran -optimize source_filename.jpg output_filename.jpg
For more help on compression options, at the command prompt, just type: jpegtran
Or to Use the Auto-generated Images from the PageSpeed tab in Firebug:
I was able to follow Pumbaa80's advice to get access to PageSpeed's optimized files. Hopefully the screenshot here provides further clarity for the Firefox environment. (But I was not able to get access to a local version of these optimized files in Chrome.)
And to Clean up the Messy PageSpeed Filenames using Adobe Bridge & Regular Expressions:
Although PageSpeed in Firefox was able to generate optimized image files for me, it also changed their names, turning simple names like:
nice_picture.jpg
into
nice_picture_fff5e6456e6338ee09457ead96ccb696.jpg
I discovered that this seems to be a common complaint. Since I didn't want to rename all my pictures by hand, I used Adobe Bridge's Batch Rename tool along with a regular expression. You could use other rename commands/tools that accept regular expressions (a plain shell alternative is sketched after these steps), but I suspect that Adobe Bridge is readily available for most of us working on PageSpeed issues!
Start Adobe Bridge
Select all files (using Control A)
Select Tools > Batch Rename (or Control Shift R)
In the Preset field select "String Substitution". The New Filenames fields should now display “String Substitution”, followed by "Original Filename"
Enable the checkbox called “Use Regular Expression”
In the “Find” field, enter the Regular Expression (which will select all characters starting at the rightmost underscore separator):
_(?!.*_)(.*)\.jpg$
In the “Replace with” field, enter:
.jpg
Optionally, click the Preview button to see the proposed batch renaming results, then close
Click the Rename button
Note that after processing, Bridge deselects files that were not affected. If you want to clean all your .png files, you need to reselect all the images and modify the configuration above (for "png" instead of "jpg"). You can also save the configuration above as a preset, such as "Clean PageSpeed jpg Images", so that you can clean filenames quickly in the future.
Configuration Screenshot / Troubleshooting
If you have troubles, it's possible that some browsers might not show the RegEx expression above properly (blame my escape characters) so for a screenshot of the configuration (along with these instructions), see:
How to Use Adobe Bridge's Batch Rename tool to Clean up Optimized PageSpeed Images that have Messy Filenames
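If you'd rather avoid Bridge altogether, a plain bash loop should do the same thing, assuming the only suffix you want to remove is everything after the last underscore:
# nice_picture_fff5e6456e6338ee09457ead96ccb696.jpg -> nice_picture.jpg
for f in *_*.jpg; do
  mv -n "$f" "${f%_*}.jpg"   # ${f%_*} strips everything from the last "_" onward
done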

In my opinion the best option out there that effectively handles most image formats in one go is trimage.
It effectively utilizes optipng, pngcrush, advpng and jpegoptim based on the image format and delivers near-perfect lossless compression.
Using it from the command line is pretty easy:
sudo apt-get install trimage
trimage -d images/*
and voila! :-)
Additionally you will find a pretty simple interface to do it manually as well.

There's a very handy one-liner that recursively optimizes images beneath a folder using OptiPNG (from this blog):
FOR /F "tokens=*" %G IN ('dir /s /b *.png') DO optipng -nc -nb -o7 -full %G
ONE LINE!
(Note: this form is for the interactive command prompt; inside a .bat batch file you would double the variable to %%G.)

If you are looking for batch processing, keep in mind that trimage complains if you don't have an X server available. In that case, just write a bash or PHP script to do something like:
<?php
echo "Processing jpegs<br />";
exec("find /home/example/public_html/images/ -type f -name '*.jpg' -exec jpegoptim --strip-all {} \;");
echo "Processing pngs<br />";
exec("find /home/example/public_html/images/ -type f -name '*.png' -exec optipng -o7 {} \;");
?>
Change the options to suit your needs.

For Windows there are several drag-and-drop interfaces for easy access.
https://sourceforge.net/projects/nikkhokkho/files/FileOptimizer/
For PNG files I found this one, which apparently wraps three different tools in one GUI. Just drag and drop and it does the work for you.
https://pnggauntlet.com/
It takes time, though; try compressing a 1MB PNG file. I was amazed how much CPU went into the compression comparison, which has to be what is going on here: it seems the image is compressed a hundred ways and the best one wins :D
Regarding the JPG compression, I too feel it's risky to strip off color profiles and all the extra info; however, if everyone is doing it, it's the business standard, so I just did it myself :D
I saved 113MB across 5500 files on a WP install today, so it's definitely worth it!

Related

diff vs rsync for comparing directories

I am trying to find the most efficient way to compare large directories of media files (RAW photos, 4k video, WAV audio, etc.) on different volumes. I am currently using:
diff -x '.*' -rq --report-identical-files --side-by-side /folder1/ /folder2/
I have a few questions:
Do diff and rsync --dry-run both use checksums in the same way to verify file contents?
Is one more efficient than the other in terms of the time required to work through these files and compare them?
Are there any recommendations for other methods of verifying that the directories and the files in them are the same, and for finding any differences still to be copied over?
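For illustration, a checksum-based rsync dry run along the lines of the diff command above might look like this (-c forces checksum comparison instead of the default size-and-mtime check):
# lists files that differ or are missing, without copying anything
rsync -rcn -v --delete /folder1/ /folder2/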

youtube-dl "best" option doesn't do anything

I'm trying to download a 4K video from YouTube. For this, I used the command:
youtube-dl -f best https://youtu.be/VcR5RCzWfeY
However, using this command only downloads the video in 720p. Manually specifying the resolution, however, seems to work:
youtube-dl https://youtu.be/VcR5RCzWfeY -f 313+bestaudio
The documentation states that using nothing should download the best quality possible, but I always get the default quality of 720p. This tends to be an issue when I am downloading playlists with multiple file qualities. So what gives? Is there some other option I should be using?
youtube-dl downloads the best quality by default. (This may not be the highest resolution for all of the supported sites, but it tends to be that one for YouTube.)
-f best is not the default. It advises youtube-dl to download the best single file format. For many supported sites, the best single format will be the best overall, but that does not apply to YouTube.
To get the highest quality, simply run youtube-dl without any -f:
youtube-dl https://youtu.be/VcR5RCzWfeY
For your example video, this will produce a 7680x4320 video file weighing 957MB.
Note that this requires ffmpeg to be installed on your machine and available in your PATH (or specified with --ffmpeg-location). To find out which version of ffmpeg you have, type ffmpeg.
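If you want to spell that default out, it should be equivalent to asking for the best video and audio streams and merging them:
# explicit equivalent of youtube-dl's default format selection (ffmpeg does the merge)
youtube-dl -f 'bestvideo+bestaudio/best' https://youtu.be/VcR5RCzWfeY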

libpng warning: iCCP: known incorrect sRGB profile

I'm trying to load a PNG image using SDL but the program doesn't work and this error appears in the console
libpng warning: iCCP: known incorrect sRGB profile
Why does this warning appear? What should I do to solve this problem?
Libpng-1.6 is more stringent about checking ICC profiles than previous versions; you can ignore the warning. To get rid of it, remove the iCCP chunk from the PNG image.
Some applications treat warnings as errors; if you are using such an application, you do have to remove the chunk. You can do that with any variety of PNG editors, such as ImageMagick:
convert in.png out.png
(At the Windows CMD prompt, you will need to cd (change directory) into the folder with the images you want to work on before you can use the commands listed here.)
To remove the invalid iCCP chunk from all of the PNG files in a folder (directory), you can use mogrify from ImageMagick:
mogrify *.png
This requires that your ImageMagick was built with libpng16. You can easily check it by running:
convert -list format | grep PNG
If you'd like to find out which files need to be fixed instead of blindly processing all of them, you can run
pngcrush -n -q *.png
where the -n means don't rewrite the files and -q means suppress most of the output except for warnings. Sorry, there's no option yet in pngcrush to suppress everything but the warnings.
Note: You must have pngcrush installed.
Binary Releases of ImageMagick are here
For Android projects (Android Studio), navigate into the res folder and run mogrify there. For example:
cd C:\{your_project_folder}\app\src\main\res\drawable-hdpi
mogrify *.png
Use pngcrush to remove the incorrect sRGB profile from the png file:
pngcrush -ow -rem allb -reduce file.png
-ow will overwrite the input file
-rem allb will remove all ancillary chunks except tRNS and gAMA
-reduce does lossless color-type or bit-depth reduction
In the console output you should see Removed the sRGB chunk, and possibly more messages about chunk removals. You will end up with a smaller, optimized PNG file. As the command will overwrite the original file, make sure to create a backup or use version control.
Solution
The incorrect profile could be fixed by:
Opening the image with the incorrect profile using QPixmap::load
Saving the image back to the disk (already with the correct profile) using QPixmap::save
Note: This solution uses the Qt Library.
Example
Here is a minimal example I have written in C++ in order to demonstrate how to implement the proposed solution:
#include <QPixmap>
#include <QFile>

// Note: QPixmap is a GUI class, so this must run inside a Q(Gui)Application.
QPixmap pixmap;
pixmap.load("badProfileImage.png");

QFile file("goodProfileImage.png");
file.open(QIODevice::WriteOnly);
pixmap.save(&file, "PNG");
The complete source code of a GUI application based on this example is available on GitHub.
UPDATE FROM 05.12.2019: The answer was and is still valid, however there was a bug in the GUI application I have shared on GitHub, causing the output image to be empty. I have just fixed it and apologise for the inconvenience!
You can also just fix this in Photoshop...
Open your .png file.
File -> Save As and in the dialog that opens up uncheck "ICC Profile: sRGB IEC61966-2.1"
Uncheck "As a Copy".
Courageously save over your original .png.
Move on with your life knowing that you've removed just that little bit of evil from the world.
To add to Glenn's great answer, here's what I did to find which files were faulty:
find . -name "*.png" -type f -print0 | xargs \
-0 pngcrush_1_8_8_w64.exe -n -q > pngError.txt 2>&1
I used find and xargs because pngcrush could not handle lots of arguments (which were returned by **/*.png). The -print0 and -0 are required to handle file names containing spaces.
Then search in the output for these lines: iCCP: Not recognizing known sRGB profile that has been edited.
./Installer/Images/installer_background.png:
Total length of data found in critical chunks = 11286
pngcrush: iCCP: Not recognizing known sRGB profile that has been edited
And for each of those, run mogrify on it to fix them.
mogrify ./Installer/Images/installer_background.png
Doing this prevents having a commit that changes every single png file in the repository when only a few have actually been modified. Plus, it has the advantage of showing exactly which files were faulty.
I tested this on Windows with a Cygwin console and a zsh shell. Thanks again to Glenn who put most of the above, I'm just adding an answer as it's usually easier to find than comments :)
After trying a couple of the suggestions on this page I ended up using the pngcrush solution. You can use the bash script below to recursively detect and fix bad png profiles. Just pass it the full path to the directory you want to search for png files.
fixpng "/path/to/png/folder"
The script:
#!/bin/bash
FILES=$(find "$1" -type f -iname '*.png')
FIXED=0
for f in $FILES; do
  WARN=$(pngcrush -n -warn "$f" 2>&1)
  if [[ "$WARN" == *"PCS illuminant is not D50"* ]] || [[ "$WARN" == *"known incorrect sRGB profile"* ]]; then
    pngcrush -s -ow -rem allb -reduce "$f"
    FIXED=$((FIXED + 1))
  fi
done
echo "$FIXED errors fixed"
There is an easier way to fix this issue with Mac OS and Homebrew:
Install homebrew if it is not installed yet
brew install libpng
pngfix --strip=color --out=file2.png file.png
or to do it with every file in the current directory:
mkdir tmp; for f in ./*.png; do pngfix --strip=color --out=tmp/"$f" "$f"; done
It will create a fixed copy of each png file in the current directory and put it in the tmp subdirectory. After that, if everything is OK, you just need to overwrite the original files with the fixed copies.
Another tip is to use the Keynote and Preview applications to create the icons. I draw them using Keynote, in the size of about 120x120 pixels, over a slide with a white background (the option to make polygons editable is great!). Before exporting to Preview, I draw a rectangle around the icon (without any fill or shadow, just the outline, with the size of about 135x135) and copy everything to the clipboard. After that, you just need to open it with the Preview tool using "New from Clipboard", select a 128x128 pixels area around the icon, copy, use "New from Clipboard" again, and export it to PNG. You won't need to run the pngfix tool.
Thanks to the fantastic answer from Glenn, I used ImageMagick's "mogrify *.png" functionality. However, I had images buried in sub-folders, so I used this simple Python script to apply it to all images in all sub-folders and thought it might help others:
import os
import subprocess

def system_call(args, cwd="."):
    print("Running '{}' in '{}'".format(str(args), cwd))
    # shell=True so the *.png wildcard is expanded by the shell
    subprocess.call(args, cwd=cwd, shell=True)

def fix_image_files(root=os.curdir):
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for dir in dirs:
            system_call("mogrify *.png", os.path.join(path, dir))

fix_image_files(os.curdir)
Some background info on this:
Some changes in libpng version 1.6+ cause it to issue a warning or even not work correctly with the original HP/MS sRGB profile, leading to the following stderr output:
libpng warning: iCCP: known incorrect sRGB profile
The old profile uses a D50 whitepoint, where D65 is standard. This profile is not uncommon, being used by Adobe Photoshop, although it was not embedded into images by default.
(source: https://wiki.archlinux.org/index.php/Libpng_errors)
Error detection in some chunks has improved; in particular the iCCP
chunk reader now does pretty complete validation of the basic format.
Some bad profiles that were previously accepted are now rejected, in
particular the very old broken Microsoft/HP sRGB profile. The PNG spec
requirement that only grayscale profiles may appear in images with
color type 0 or 4 and that even if the image only contains gray
pixels, only RGB profiles may appear in images with color type 2, 3,
or 6, is now enforced. The sRGB chunk is allowed to appear in images
with any color type.
(source: https://forum.qt.io/topic/58638/solved-libpng-warning-iccp-known-incorrect-srgb-profile-drive-me-nuts/16)
Using IrfanView image viewer in Windows, I simply resaved the PNG image and that corrected the problem.
Some of the proposed answers use pngcrush with the -rem allb option, which the documentation says is like "surgery with a chainsaw." The option removes many chunks. To prevent the "iCCP: known incorrect sRGB profile" warning it is sufficient to remove the iCCP chunk, as follows:
pngcrush -ow -rem iCCP filename.png
Extending friederbluemle's solution: download pngcrush and then use code like this if you are running it on multiple PNG files:
path =r"C:\\project\\project\\images" # path to all .png images
import os
png_files =[]
for dirpath, subdirs, files in os.walk(path):
for x in files:
if x.endswith(".png"):
png_files.append(os.path.join(dirpath, x))
file =r'C:\\Users\\user\\Downloads\\pngcrush_1_8_9_w64.exe' #pngcrush file
for name in png_files:
cmd = r'{} -ow -rem allb -reduce {}'.format(file,name)
os.system(cmd)
Here all the PNG files related to the project live under one folder tree. I ran this from the root of the project and it fixed them.
Basically, redirect the output of the "find" command to a text file to use as your list of files to process. Then you can read that text file into "mogrify" using the "@" flag:
find *.png -mtime -1 > list.txt
mogrify -resize 50% @list.txt
That would use "find" to get all the *.png images newer than 1 day and print them to a file named "list.txt". Then "mogrify" reads that list, processes the images, and overwrites the originals with the resized versions. There may be minor differences in the behavior of "find" from one system to another, so you'll have to check the man page for the exact usage.
When I was training YOLO, the warning libpng warning: iCCP: known incorrect sRGB profile occurred each epoch. I used bash to find the PNG files, then python3 and OpenCV (cv2) to rewrite them, so the warning only occurs once, during the rewrite. Steps as follows:
step 1. Create a python file:
# rewrite.py
import cv2, sys, os

fpath = sys.argv[1]
if os.path.exists(fpath):
    cv2.imwrite(fpath, cv2.imread(fpath))
step 2. In bash, run:
# cd into your image dir, then find and rewrite the png files
# (-n 1 because rewrite.py only reads a single argument)
find . -iname "*.png" | xargs -n 1 python3 rewrite.py
For PHP developers having this issue with the imagecreatefrompng function,
you can try suppressing the warning using the @ error-control operator:
$img = @imagecreatefrompng($file);
Here is a ridiculously brute-force answer:
I modified the gradlew script. Here is my new exec command at the end of the file (the piped grep -v at the end is the addition):
exec "$JAVACMD" "${JVM_OPTS[@]}" -classpath "$CLASSPATH" org.gradle.wrapper.GradleWrapperMain "$@" | grep -v "libpng warning:"

how to rapidly extract intraframe from a video (in c++ or python)

I want to capture some frames from a video, so I used a command like this:
ffmpeg -i MyVideo.mp4 -ss 1:20:12 -vframes 1 test-pic.jpg
but ffmpeg processes frames from the beginning of the video, so this command is too slow. I researched and found some articles about keyframes, so I tried to extract keyframes with a command like this:
ffmpeg -vf select="eq(pict_type\,PICT_TYPE_I)" -i MyVideo.mp4 -vsync 2 -s 160x90 -f image2 thumbnails-%02d.jpeg
but this command is also slow and captures too many frames.
I need a Linux command, or C++ or Python code, to capture a frame without taking a long time.
The ffmpeg wiki states regarding fast seeking:
The -ss parameter needs to be specified before -i:
ffmpeg -ss 00:03:00 -i Underworld.Awakening.avi -frames:v 1 out1.jpg
This example will produce one image frame (out1.jpg) somewhere around
the third minute from the beginning of the movie. The input will be
parsed using keyframes, which is very fast. The drawback is that it
will also finish the seeking at some keyframe, not necessarily located
at specified time (00:03:00), so the seeking will not be as accurate
as expected.
You could also use hybrid mode, combining fast seeking and slow (decode) seeking, which is kind of the middle ground.
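With older ffmpeg versions, a hybrid command looks roughly like this: fast-seek with -ss before -i to just short of the target, then decode-seek the small remainder with a second -ss after -i:
# keyframe-seek to 1:20:00, then decode the last 12 seconds to land on 1:20:12
ffmpeg -ss 1:20:00 -i MyVideo.mp4 -ss 12 -frames:v 1 test-pic.jpg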
If you want to implement this in C/C++, see the docs/examples directory of ffmpeg to get started and av_seek_frame.
I recently hacked together some C code to do thumbnails myself, which uses the hybrid mode effectively. May be helpful to you, or not.
Hello, Mr. Anderson.
I'm not familiar with using C++ or Python to do such a thing. I'm sure it's possible (I could probably get a good idea of how to do this if I researched for an hour), but the time it would take to implement a full solution may outweigh the time cost of finding a better frame capturing program. After a bit of Googling, I came up with:
VirtualDub
Camtasia
Frame-shots

get metadata from jpg, dng and arw raw files

I was wondering if anyone knew how to access the metadata (the date in particular) from JPG, ARW and DNG files.
I've recently lost the folder structure after a merge operation gone bad and would like to rename the recovered files according to the metadata.
I'm planning on creating a little C++ app to dig into each file and get the metadata.
Any input is appreciated.
(Alternatively, if you know of an app that already does this, I'd like to know. :)
Have you looked at the libexif project http://libexif.sourceforge.net/?
OK, so I did a Google search (probably should have started with that) for "batch rename based on exif data arw dng jpg",
and the first page that popped up was ExifTool by Phil Harvey.
It supports recent ARW and DNG files, and with some command line magic I should be able to get it to do pretty much what I want:
exiftool -r -d images/%Y-%m-%d/%Y%m%d_%%.4c.%%e "-filename<filemodifydate" pics
This moves files into folders (images/YYYY-MM-DD/) and renames them to YYYYMMDD_####.ext, for everything in the pics folder (and subfolders).
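Since this renames files in bulk, it may be worth a dry run first; if I recall correctly, ExifTool lets you write TestName instead of FileName to preview the renames without touching anything:
exiftool -r -d images/%Y-%m-%d/%Y%m%d_%%.4c.%%e "-testname<filemodifydate" pics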
hope this helps others
You should also try the Adobe XMP SDK, which is great for its supported formats (JPEG, PNG, TIFF and DNG).