ffmpeg encoding H.264 - decrease size, maintain quality

# re-encode the original with varying CRF values
for i in 14 18 24 28 32 38 46 51; do
    ffmpeg -i original.mp4 -c:a copy -c:v libx264 \
           -preset fast -crf "$i" -qphist -tune stillimage \
           "crf_$i.mp4"
done

Example source file:
source video size: 233 MB

with configuration:
source video:
Duration: 00:25:21.21, start: 0.000000, bitrate: 1288 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 1151 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)

source audio:
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
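
As a rough sanity check, the 233 MB figure follows from the overall bitrate and duration reported above. The sketch below is only an illustration: size_mib is a hypothetical helper name (not an ffmpeg feature), and it assumes 1 kb/s = 1000 bit/s, 1 MiB = 1,048,576 bytes, and a duration rounded down to whole seconds.

```shell
# size_mib KBPS SECONDS
# Approximate file size in whole MiB (rounded down).
# kb/s * 1000 / 8 = bytes per second, i.e. kb/s * 125.
size_mib() {
    echo $(( $1 * 125 * $2 / 1048576 ))
}

# Overall bitrate 1288 kb/s, duration 00:25:21 = 1521 s
size_mib 1288 1521
```

The same arithmetic can be used to guess the output size of a fixed-bitrate encode before running it, by adding the video and audio bitrates together.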

Examples:
| Command | Size | Video output format | Audio output format |
|---|---|---|---|
| Original | 233 MB | 1280x720, 1151 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc | aac, 44100 Hz, stereo, fltp, 128 kb/s |
| ffmpeg -i input.mp4 -c:v libx264 -crf 23 output.mp4 | 336 MB | 1280x720, 1465 kb/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc | same |
| ffmpeg -i input.mp4 -c:v libx264 -crf 23 -qphist -tune stillimage crfimage.mp4 | 344 MB | 1280x720, 1764 kb/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc | same |
| ffmpeg -i input.mp4 -c:v libx264 -crf 23 crf23.mp4 | 290 MB | 1280x720, 1465 kb/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc | same |
| ffmpeg -i input.mp4 -vcodec h264 -acodec mp2 output.mp4 | 336 MB | 1280x720, 1465 kb/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc | same |
| ffmpeg -i input.mp4 -c:v libx264 -preset fast -crf 18 crf18fast.mp4 | 468 MB | 1280x720, 2447 kbps, 30 fps, 16:9 | aac 4 40Lc, 128 Kbps, 2 chan, 44100, 16 bits |
| ffmpeg -i input.mp4 -c:v libx264 -b:v 500k 500knoaudioption.mp4 | 116 MB | 1280x720, 500 Kbps, 30 fps, 16:9 | aac 4 40LC, 128 Kbps, 2 chan, 44100, 16 bits |
| ffmpeg -i input.mp4 -r 25 framerate25fps.mp4 | 282 MB | 1280x720, 1423 kb/s, q=-1--1, 25 fps, 12800 tbn, 25 tbc | aac 4 40LC, 128 Kbps, 2 chan, 44100, 16 bits |
| ffmpeg -i input.mp4 -b:v 64k -bufsize 64k videobitrate64kbits.mp4 | 37 MB | 1280x720, 66 kb/s, q=-1--1, 64 kb/s, 30 fps, 15360 tbn, 30 tbc | aac 4 40LC, 128 Kbps, 2 chan, 44100, 16 bits |
| ffmpeg -i input.mp4 -c:v libx264 -crf 28 crf28.mp4 | 195 MB | 1280x720, 1051.1 kbits/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc | aac 4 40LC, 128 Kbps, 2 chan, 44100, 16 bits |
| ffmpeg -i input.mp4 -c:v libx264 -crf 28 -acodec libmp3lame -b:a 16k -ac 1 -ar 16000 crf2811.mp4 | 178 MB | 1280x720, 913 kbps, q=-1--1, 30 fps, 15360 tbn, 30 tbc | MPEG ver 2 69 Layer 3, 64kbs, 1 chan, 24000 Hz, 16 bits |
| ffmpeg -i input.mp4 -c:v libx264 -crf 26 -acodec libmp3lame -b:a 24k -ac 2 -ar 24000 crf2611.mp4 | | 1280x720, 913 kbps, q=-1--1, 30 fps, 15360 tbn, 30 tbc | MPEG ver 2 69 Layer 3, 64kbs, 1 chan, 24000 Hz, 16 bits |
| ffmpeg -i input.mp4 -acodec libopus crf2811.mp4ts | | | |

 
Lossless H.264

You can use -crf 0 to encode a lossless output. Two useful presets for this are ultrafast or veryslow since either a fast encoding speed or best compression are usually the most important factors.

  Lossless Example (fastest encoding)

ffmpeg -i input -c:v libx264 -preset ultrafast -crf 0 output.mkv

  Lossless Example (best compression)

ffmpeg -i input -c:v libx264 -preset veryslow -crf 0 output.mkv
Note that lossless output files will likely be huge, and most non-FFmpeg based players will not be able to decode lossless H.264, so if compatibility or file size is a concern, you should not use lossless. If you're looking for an output that is roughly "visually lossless" but not technically lossless, use a -crf value of around 17 or 18 (you'll have to experiment to see which value is acceptable for you). It will likely be indistinguishable from the source and will not result in a huge, possibly incompatible file like true lossless mode.

 CRF Guide
CRF stands for Constant Rate Factor, x264’s best single-pass encoding method.


 Quick Summary: What is the Constant Rate Factor?

The Constant Rate Factor (CRF) is the default quality setting for the x264 encoder. You can set the values between 0 and 51, where lower values would result in better quality (at the expense of higher file sizes). Sane values are between 18 and 28. The default for x264 is 23, so you can use this as a starting point.
With ffmpeg, it'd look like this:
ffmpeg -i input.mp4 -c:v libx264 -crf 23 output.mp4
If you're unsure about what CRF to use, begin with 23 and change it according to your subjective impression of the output. Is the quality good enough? No? Then set a lower CRF. Is the file size too high? Choose a higher CRF. A change of ±6 should result in about half/double the file size, although your results might vary.
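
The "±6 halves or doubles the size" rule of thumb can be turned into a quick estimate. This is only a sketch of that heuristic, not something x264 guarantees: estimate_size is a hypothetical helper name, and it assumes the size scales as 2^((from - to) / 6), which real encodes will only follow approximately.

```shell
# estimate_size SIZE FROM_CRF TO_CRF
# Rule of thumb from the text: every +6 CRF roughly halves the size,
# every -6 CRF roughly doubles it:
#   new_size = size * 2^((from - to) / 6)
# The result is in the same unit as SIZE, rounded to one decimal.
estimate_size() {
    LC_ALL=C awk -v s="$1" -v a="$2" -v b="$3" \
        'BEGIN { printf "%.1f\n", s * exp(log(2) * (a - b) / 6) }'
}

# A 100 MB encode at CRF 23, re-encoded at CRF 17 and at CRF 29
estimate_size 100 23 17
estimate_size 100 23 29
```

Treat the result as a starting point for choosing a CRF to test, then check the actual file size and picture quality.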

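| Setting | Size remaining | Size | Artifacts |
|---|---|---|---|
| original (uncompressed) | 100.0% | 19.16 GB | |
| crf=14 | 15.7% | 3.00 GB | |
| crf=18 | 10.1% | 1.98 GB | |
| crf=24 | 5.1% | 0.98 GB | |
| crf=28 | 3.5% | 0.68 GB | |
| crf=32 | 2.4% | 0.47 GB | minimal |
| crf=38 | 1.9% | 0.36 GB | usable |
| crf=46 | 1.6% | 0.30 GB | bad |
| crf=51 | 1.4% | 0.28 GB | unusable |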
NB: the higher the CRF, the smaller the file and the lower the quality.

Video format = Timeline

The following table is a partial history of international video compression standards.
History of video compression standards
| Year | Standard | Publisher | Popular implementations |
|---|---|---|---|
| 1984 | H.120 | ITU-T | |
| 1988 | H.261 | ITU-T | Videoconferencing, videotelephony |
| 1993 | MPEG-1 Part 2 | ISO, IEC | Video-CD |
| 1995 | H.262/MPEG-2 Part 2 | ISO, IEC, ITU-T | DVD Video, Blu-ray, Digital Video Broadcasting, SVCD |
| 1996 | H.263 | ITU-T | Videoconferencing, videotelephony, video on mobile phones (3GP) |
| 1999 | MPEG-4 Part 2 | ISO, IEC | Video on Internet (DivX, Xvid ...) |
| 2003 | H.264/MPEG-4 AVC | Sony, Panasonic, Samsung, ISO, IEC, ITU-T | Blu-ray, HD DVD, Digital Video Broadcasting, iPod Video, Apple TV, videoconferencing |
| 2009 | VC-2 (Dirac) | SMPTE | Video on Internet, HDTV broadcast, UHDTV |
| 2013 | H.265 | ISO, IEC, ITU-T | |

Levels

As the term is used in the standard, a "level" is a specified set of constraints that indicate a degree of required decoder performance for a profile. For example, a level of support within a profile specifies the maximum picture resolution, frame rate, and bit rate that a decoder may use. A decoder that conforms to a given level must be able to decode all bitstreams encoded for that level and all lower levels.
Levels with maximum property values
| Level | Max decoding speed (luma samples/s) | Max decoding speed (macroblocks/s) | Max frame size (luma samples) | Max frame size (macroblocks) | Max VCL bit rate, Baseline, Extended and Main Profiles (kbit/s) | Max VCL bit rate, High Profile (kbit/s) | Max VCL bit rate, High 10 Profile (kbit/s) | Examples for high resolution @ highest frame rate (max stored frames) |
|---|---|---|---|---|---|---|---|---|
| 1 | 380,160 | 1,485 | 25,344 | 99 | 64 | 80 | 192 | 128×96@30.9 (8); 176×144@15.0 (4) |
| 1b | 380,160 | 1,485 | 25,344 | 99 | 128 | 160 | 384 | 128×96@30.9 (8); 176×144@15.0 (4) |
| 1.1 | 768,000 | 3,000 | 101,376 | 396 | 192 | 240 | 576 | 176×144@30.3 (9); 320×240@10.0 (3); 352×288@7.5 (2) |
| 1.2 | 1,536,000 | 6,000 | 101,376 | 396 | 384 | 480 | 1,152 | 320×240@20.0 (7); 352×288@15.2 (6) |
| 1.3 | 3,041,280 | 11,880 | 101,376 | 396 | 768 | 960 | 2,304 | 320×240@36.0 (7); 352×288@30.0 (6) |
| 2 | 3,041,280 | 11,880 | 101,376 | 396 | 2,000 | 2,500 | 6,000 | 320×240@36.0 (7); 352×288@30.0 (6) |
| 2.1 | 5,068,800 | 19,800 | 202,752 | 792 | 4,000 | 5,000 | 12,000 | 352×480@30.0 (7); 352×576@25.0 (6) |
| 2.2 | 5,184,000 | 20,250 | 414,720 | 1,620 | 4,000 | 5,000 | 12,000 | 352×480@30.7 (12); 352×576@25.6 (10); 720×480@15.0 (6); 720×576@12.5 (5) |
| 3 | 10,368,000 | 40,500 | 414,720 | 1,620 | 10,000 | 12,500 | 30,000 | 352×480@61.4 (12); 352×576@51.1 (10); 720×480@30.0 (6); 720×576@25.0 (5) |
| 3.1 | 27,648,000 | 108,000 | 921,600 | 3,600 | 14,000 | 17,500 | 42,000 | 720×480@80.0 (13); 720×576@66.7 (11); 1,280×720@30.0 (5) |
| 3.2 | 55,296,000 | 216,000 | 1,310,720 | 5,120 | 20,000 | 25,000 | 60,000 | 1,280×720@60.0 (5); 1,280×1,024@42.2 (4) |
| 4 | 62,914,560 | 245,760 | 2,097,152 | 8,192 | 20,000 | 25,000 | 60,000 | 1,280×720@68.3 (9); 1,920×1,080@30.1 (4); 2,048×1,024@30.0 (4) |
| 4.1 | 62,914,560 | 245,760 | 2,097,152 | 8,192 | 50,000 | 62,500 | 150,000 | 1,280×720@68.3 (9); 1,920×1,080@30.1 (4); 2,048×1,024@30.0 (4) |
| 4.2 | 133,693,440 | 522,240 | 2,228,224 | 8,704 | 50,000 | 62,500 | 150,000 | 1,280×720@145.1 (9); 1,920×1,080@64.0 (4); 2,048×1,080@60.0 (4) |
| 5 | 150,994,944 | 589,824 | 5,652,480 | 22,080 | 135,000 | 168,750 | 405,000 | 1,920×1,080@72.3 (13); 2,048×1,024@72.0 (13); 2,048×1,080@67.8 (12); 2,560×1,920@30.7 (5); 3,672×1,536@26.7 (5) |
| 5.1 | 251,658,240 | 983,040 | 9,437,184 | 36,864 | 240,000 | 300,000 | 720,000 | 1,920×1,080@120.5 (16); 2,560×1,920@51.2 (9); 3,840×2,160@31.7 (5); 4,096×2,048@30.0 (5); 4,096×2,160@28.5 (5); 4,096×2,304@26.7 (5) |
| 5.2 | 530,841,600 | 2,073,600 | 9,437,184 | 36,864 | 240,000 | 300,000 | 720,000 | 1,920×1,080@172.0 (16); 2,560×1,920@108.0 (9); 3,840×2,160@66.8 (5); 4,096×2,048@63.3 (5); 4,096×2,160@60.0 (5); 4,096×2,304@56.3 (5) |
The maximum bit rate for High Profile is 1.25 times that of the Base/Extended/Main Profiles, 3 times for Hi10P, and 4 times for Hi422P/Hi444PP.
The number of luma samples is 16x16=256 times the number of macroblocks (and the number of luma samples per second is 256 times the number of macroblocks per second).
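
The two scaling rules above are simple enough to check with shell arithmetic. The helper names and the profile labels below (high, high10, high422, high444) are hypothetical shorthand chosen for this sketch, not identifiers from the standard or from ffmpeg.

```shell
# profile_max_kbps BASE_KBPS PROFILE
# Scale the Baseline/Extended/Main bit rate limit to another profile:
# High = 1.25x, High 10 = 3x, High 4:2:2 and High 4:4:4 Predictive = 4x.
profile_max_kbps() {
    case "$2" in
        baseline|extended|main) echo "$1" ;;
        high)                   echo $(( $1 * 5 / 4 )) ;;
        high10)                 echo $(( $1 * 3 )) ;;
        high422|high444)        echo $(( $1 * 4 )) ;;
        *) echo "unknown profile: $2" >&2; return 1 ;;
    esac
}

# mbs_to_luma MACROBLOCKS
# One macroblock covers 16x16 = 256 luma samples.
mbs_to_luma() {
    echo $(( $1 * 256 ))
}

# Level 4: 20,000 kbit/s base limit, 8,192 macroblocks per frame
profile_max_kbps 20000 high
profile_max_kbps 20000 high10
mbs_to_luma 8192
```

For Level 4 this reproduces the 25,000 and 60,000 kbit/s limits and the 2,097,152 luma samples listed in the level table.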

Decoded picture buffering

Previously encoded pictures are used by H.264/AVC encoders to provide predictions of the values of samples in other pictures. This allows the encoder to make efficient decisions on the best way to encode a given picture. At the decoder, such pictures are stored in a virtual decoded picture buffer (DPB). The maximum capacity of the DPB, in units of frames (or pairs of fields), as shown in parentheses in the right column of the table above, can be computed as follows:
capacity = min(floor(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs)), 16)
Where MaxDpbMbs is a constant value provided in the table below as a function of level number, and PicWidthInMbs and FrameHeightInMbs are the picture width and frame height for the coded video data, expressed in units of macroblocks (rounded up to integer values and accounting for cropping and macroblock pairing when applicable). This formula is specified in sections A.3.1.h and A.3.2.f of the 2009 edition of the standard.
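| Level | MaxDpbMbs |
|---|---|
| 1 | 396 |
| 1b | 396 |
| 1.1 | 900 |
| 1.2 | 2,376 |
| 1.3 | 2,376 |
| 2 | 2,376 |
| 2.1 | 4,752 |
| 2.2 | 8,100 |
| 3 | 8,100 |
| 3.1 | 18,000 |
| 3.2 | 20,480 |
| 4 | 32,768 |
| 4.1 | 32,768 |
| 4.2 | 34,816 |
| 5 | 110,400 |
| 5.1 | 184,320 |
| 5.2 | 184,320 |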
For example, for an HDTV picture that is 1920 samples wide (PicWidthInMbs = 120) and 1080 samples high (FrameHeightInMbs = 68), a Level 4 decoder has a maximum DPB storage capacity of Floor(32768/(120*68)) = 4 frames (or 8 fields) when encoded with minimal cropping parameter values. Thus, the value 4 is shown in parentheses in the table above in the right column of the row for Level 4 with the frame size 1920×1080.
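
The capacity formula can be evaluated directly with integer arithmetic. This is a minimal sketch: dpb_capacity is a hypothetical helper name, and it assumes progressive frames with no cropping or macroblock pairing adjustments, so width and height are simply rounded up to whole 16-sample macroblocks.

```shell
# dpb_capacity MAX_DPB_MBS WIDTH HEIGHT
# capacity = min(floor(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs)), 16)
dpb_capacity() {
    w_mbs=$(( ($2 + 15) / 16 ))        # PicWidthInMbs, rounded up
    h_mbs=$(( ($3 + 15) / 16 ))        # FrameHeightInMbs, rounded up
    cap=$(( $1 / (w_mbs * h_mbs) ))    # integer division acts as floor
    if [ "$cap" -gt 16 ]; then
        cap=16                         # the standard caps the DPB at 16 frames
    fi
    echo "$cap"
}

# Level 4 (MaxDpbMbs = 32768) at 1920x1080
dpb_capacity 32768 1920 1080
# Level 5.1 (MaxDpbMbs = 184320) at 1920x1080 hits the cap of 16
dpb_capacity 184320 1920 1080
```

Both results agree with the values in parentheses in the level table for those rows.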

It is important to note that the current picture being decoded is not included in the computation of DPB fullness (unless the encoder has indicated for it to be stored for use as a reference for decoding other pictures or for delayed output timing). Thus, a decoder needs to actually have sufficient memory to handle (at least) one frame more than the maximum capacity of the DPB as calculated above.
Audio format
 You can use the following table to select the target sound format based on characteristics of sound source.
| Characteristics of source sound | Format | Attributes | Megabytes per hour (approximate) |
|---|---|---|---|
| Low-quality voice recording | DSP TrueSpeech | 8.0 kHz, 1 bit, mono | 4 |
| Low-quality Internet music | MP3 | 11.5 kHz, 16 kBit/s, mono | 4.5 |
| High-quality voice recording | Lernout & Hauspie | 8.0 kHz, 16 bit, mono | 9 |
| High-quality voice recording | WMA voice | 20 kBit/s, 22.05 kHz, mono | 9 |
| Middle-quality Internet music | MP3 | 22.05 kHz, 56 kBit/s, stereo | 25 |
| Near high-quality recordings | MP3 | 44.1 kHz, 128 kBit/s, stereo | 56 |
| High-quality recordings | WMA lossless | VBR Quality 100, 44 kHz, 2 channel, 16 bit | 150 |
| High-quality recordings | Flac lossless | 96 kHz, 16 bit, stereo | 240 |
| High-quality (CD quality) recordings | PCM | 44.1 kHz, 16 bit, stereo | 600 |
| DVD Audio, Super Audio CD recordings | PCM | 96 kHz, 24 bit, stereo (available for Professional, VideoPro and Developer Edition users only) | 1978 |
High quality formats
Starting with version 5.2, Total Recorder Professional Edition supports high-quality formats. This includes high-quality PCM formats (up to 192kHz 24bit and float mono and stereo), high-quality FLAC formats (up to 192kHz 24bit mono and stereo), high-quality Windows Media Audio Lossless stereo formats (up to 96kHz 24bit), and the stereo formats of the Windows Media Audio Professional codec.
This is a reference to compare the monophonic (not stereophonic) audio quality and compression bitrates of audio coding formats available for WAV files, including PCM, ADPCM, Microsoft GSM 06.10, CELP, SBC, Truespeech and MPEG Layer-3.
| Format | Bitrate (kbit/s) | 1 minute (KiB) | Sample |
|---|---|---|---|
| 11,025 Hz 16 bit PCM | 176.4 | 1292 | 11k16bitpcm.wav |
| 8,000 Hz 16 bit PCM | 128 | 938 | 8k16bitpcm.wav |
| 11,025 Hz 8 bit PCM | 88.2 | 646 | 11k8bitpcm.wav |
| 11,025 Hz µ-Law | 88.2 | 646 | 11kulaw.wav |
| 8,000 Hz 8 bit PCM | 64 | 469 | 8k8bitpcm.wav |
| 8,000 Hz µ-Law | 64 | 469 | 8kulaw.wav |
| 11,025 Hz 4 bit ADPCM | 44.1 | 323 | 11kadpcm.wav |
| 8,000 Hz 4 bit ADPCM | 32 | 234 | 8kadpcm.wav |
| 11,025 Hz GSM 06.10 | 18 | 132 | 11kgsm.wav |
| 8,000 Hz MP3 16 kbit/s | 16 | 117 | 8kmp316.wav |
| 8,000 Hz GSM 06.10 | 13 | 103 | 8kgsm.wav |
| 8,000 Hz Lernout & Hauspie SBC 12 kbit/s | 12 | 88 | 8ksbc12.wav |
| 8,000 Hz DSP Group Truespeech | 9 | 66 | 8ktruespeech.wav |
| 8,000 Hz MP3 8 kbit/s | 8 | 60 | 8kmp38.wav |
| 8,000 Hz Lernout & Hauspie CELP | 4.8 | 35 | 8kcelp.wav |
The above are WAV files; even those that use MP3 compression have the ".wav" extension.
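
For the uncompressed PCM rows, both columns can be derived from the sample rate and bit depth. The sketch below uses two hypothetical helper names and assumes 1 KiB = 1024 bytes with rounding to the nearest KiB; it ignores the small WAV header, and it does not model the compressed formats, whose file sizes also depend on codec framing.

```shell
# pcm_bps RATE_HZ BITS CHANNELS
# Uncompressed PCM bitrate in bit/s.
pcm_bps() {
    echo $(( $1 * $2 * $3 ))
}

# kib_per_minute BITS_PER_SECOND
# Size of one minute of audio in KiB, rounded to the nearest KiB.
kib_per_minute() {
    echo $(( ($1 * 60 / 8 + 512) / 1024 ))
}

# 11,025 Hz, 16 bit, mono: 176400 bit/s = 176.4 kbit/s
pcm_bps 11025 16 1
kib_per_minute 176400
```

Running the same calculation for 8,000 Hz 16 bit mono gives 128,000 bit/s and 938 KiB per minute, matching the second row.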

Technical information LAME#Recommended_encoder_settings

Recommended settings details

Technical details of the recommended settings
| Switch | Preset | Target Kbps | Typical Kbps | Y switch enabled by default | Lowpass | Resample | Formerly known as |
|---|---|---|---|---|---|---|---|
| -b 320 | --preset insane | 320 | 320 | Y | 20094 Hz – 20627 Hz | | api |
| -V 0 | --preset extreme | ~240 | 220–260 | | none | | ape or apx |
| -V 1 | | ~220 | 190–250 | | 19383 Hz – 19916 Hz | | |
| -V 2 | --preset standard | ~190 | 170–210 | | 18671 Hz – 19205 Hz | | aps |
| -V 3 | | ~170 | 150–195 | Y | 17960 Hz – 18494 Hz | | |
| -V 4 | --preset medium | ~160 | 140–185 | Y | 17249 Hz – 17782 Hz | | apm |
| -V 5 | | ~130 | 120–150 | Y | 16538 Hz – 17071 Hz | | |
| -V 6 | | ~120 | 100–130 | Y | 16538 Hz – 17071 Hz | | |
The default lowpass settings were not chosen at random; for general use, they are as high as they can be without putting quality at risk. Raising the cutoff via command-line options is not recommended. See the high-frequency content in MP3s article for more info.

Recommended upload encoding settings

  • No Edit Lists (or the video might not get processed correctly)
  • moov atom at the front of the file (Fast Start)
  • Channels: Stereo or Stereo + 5.1
  • Sample rate: 96 kHz or 48 kHz
  • Progressive scan (no interlacing)
  • High Profile
  • 2 consecutive B frames
  • Closed GOP. GOP of half the frame rate.
  • CABAC
  • Variable bitrate. No bitrate limit required, though we offer recommended bit rates below for reference
  • Chroma subsampling: 4:2:0
Content should be encoded and uploaded in the same frame rate it was recorded.
Common frame rates include: 24, 25, 30, 48, 50, 60 frames per second (other frame rates are also acceptable).
Interlaced content should be deinterlaced before uploading. For example, 1080i60 content should be deinterlaced to 1080p30, going from 60 interlaced fields per second to 30 progressive frames per second.
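
Several items from the checklist above map onto libx264 options in ffmpeg. The sketch below is one possible mapping, not an official recipe: upload_video_args is a hypothetical helper name, it only handles integer frame rates, and it leaves out the edit list, audio and deinterlacing items.

```shell
# upload_video_args FPS
# Print libx264 options for: High Profile, 4:2:0 chroma, 2 consecutive
# B-frames, closed GOP of half the frame rate, CABAC, moov atom at front.
upload_video_args() {
    gop=$(( $1 / 2 ))    # GOP of half the frame rate (integer FPS assumed)
    echo "-c:v libx264 -profile:v high -pix_fmt yuv420p -bf 2 -g $gop -flags +cgop -coder 1 -movflags +faststart"
}

# Dry run for 30 fps material: prints the option string.
upload_video_args 30
# Possible use (the substitution is left unquoted on purpose so that
# the shell splits it into separate arguments):
#   ffmpeg -i input.mp4 $(upload_video_args 30) -crf 18 -c:a aac -ar 48000 output.mp4
```

Since no bitrate limit is required, the commented command uses CRF mode; the recommended bitrates are then a reference for judging the result rather than a target to pass to the encoder.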
The bitrates below are recommendations for uploads. Audio playback bitrate is not related to video resolution.

Recommended video bitrates for SDR uploads

To view new 4K uploads in 4K, use a browser or device that supports VP9.
| Type | Video Bitrate, Standard Frame Rate (24, 25, 30) | Video Bitrate, High Frame Rate (48, 50, 60) |
|---|---|---|
| 2160p (4k) | 35-45 Mbps | 53-68 Mbps |
| 1440p (2k) | 16 Mbps | 24 Mbps |
| 1080p | 8 Mbps | 12 Mbps |
| 720p | 5 Mbps | 7.5 Mbps |
| 480p | 2.5 Mbps | 4 Mbps |
| 360p | 1 Mbps | 1.5 Mbps |

Recommended video bitrates for HDR uploads

| Type | Video Bitrate, Standard Frame Rate (24, 25, 30) | Video Bitrate, High Frame Rate (48, 50, 60) |
|---|---|---|
| 2160p (4k) | 44-56 Mbps | 66-85 Mbps |
| 1440p (2k) | 20 Mbps | 30 Mbps |
| 1080p | 10 Mbps | 15 Mbps |
| 720p | 6.5 Mbps | 9.5 Mbps |
| 480p | Not supported | Not supported |
| 360p | Not supported | Not supported |

Recommended audio bitrates for uploads

| Type | Audio Bitrate |
|---|---|
| Mono | 128 kbps |
| Stereo | 384 kbps |
| 5.1 | 512 kbps |
YouTube uses 16:9 aspect ratio players. If you're uploading a non-16:9 file, it will be processed and displayed correctly as well, with pillar boxes (black bars on the left and right) or letter boxes (black bars at the top and bottom) provided by the player.
  • 2160p: 3840x2160
  • 1440p: 2560x1440
  • 1080p: 1920x1080
  • 720p: 1280x720
  • 480p: 854x480
  • 360p: 640x360
  • 240p: 426x240
     

Comparison of ffmpeg’s x264 presets

I measured the performance of ffmpeg’s x264 quality presets to find the best option for my purpose: converting MPEG-2 TS recordings to MP4 for Apple TV. The factors to consider are 1) conversion speed, 2) file size and 3) video quality. The most important factor is conversion speed; ideally, a conversion takes less time than the recording itself. As for file size, of course, the smaller the better, as the storage space on my Apple TV is limited.

Test Method

The test was done with ffmpeg version 0.6.1 on a 2.8 GHz Intel Core i7 running Mac OS X 10.6.8, as follows:
for i in ultrafast superfast veryfast faster fast medium slow slower veryslow placebo
do ffmpeg -i sample.ts -threads 4 -vcodec libx264 \
      -vpre $i -vpre main -crf 18 -s 1280x720 \
      -acodec libfaac -ab 160k \
      -y $i.mp4
done
sample.ts is 60 seconds of full HD (1440×1080) MPEG-2 video.

Test Results

| Preset used for conversion | Conversion speed (fps, frames per second) | Size of the output video file (bytes) |
|---|---|---|
| ultrafast | 29.6 | 81,046,858 |
| superfast | 27.9 | 58,180,478 |
| veryfast | 20.6 | 43,968,615 |
| faster | 13.0 | 32,981,783 |
| fast | 9.1 | 35,461,071 |
| medium | 8.0 | 35,119,322 |
| slow | 5.6 | 33,367,247 |
| slower | 2.1 | 33,698,347 |
| veryslow | 1.3 | 29,697,663 |
| placebo | 0.6 | 29,395,288 |

In my environment, ultrafast can convert TS video at almost the recording speed (29.97 fps). But I decided to go with superfast, as its speed is close to that of ultrafast yet the output file is much smaller.
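
The test command is specific to ffmpeg 0.6.1: the -vpre preset files and the libfaac encoder are not available in current builds. A rough present-day equivalent, written as a dry run, might look like the sketch below. preset_cmd is a hypothetical helper name, and the mapping is an assumption (-preset for the speed preset, -profile:v main for the old "main" preset file, the native aac encoder for libfaac), not a re-run of the original benchmark.

```shell
# preset_cmd PRESET
# Print an ffmpeg command that encodes sample.ts with the given x264 preset.
preset_cmd() {
    echo "ffmpeg -i sample.ts -threads 4 -c:v libx264 -preset $1 -profile:v main -crf 18 -s 1280x720 -c:a aac -b:a 160k -y $1.mp4"
}

# Dry run: print one command per preset. Pipe the output to sh to execute.
for p in ultrafast superfast veryfast faster fast medium slow slower veryslow placebo; do
    preset_cmd "$p"
done
```

Printing the commands first makes it easy to inspect them, or to wrap each one in time when repeating the measurement on other hardware.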