ffmpeg encoding H.264 - decrease size, maintain quality

# Re-encode the original with varying CRF values
for i in 14 18 24 28 32 38 46 51; do
  ffmpeg -i original.mp4 -c:a copy -c:v libx264 \
    -preset fast -crf "$i" -qphist -tune stillimage \
    "crf_$i.mp4"
done

Examples:
source video size: 233 MB

with configuration:
source video:
Duration: 00:25:21.21, start: 0.000000, bitrate: 1288 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 1151 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)

source audio:
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)

Examples:

Command	Size	Video output format	Audio output format
Original	233 MB	1280x720, 1151 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc	aac, 44100 Hz, stereo, fltp, 128 kb/s
ffmpeg -i input.mp4 -c:v libx264 -crf 23 output.mp4	336 MB	1280x720, 1465 kb/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc	same
ffmpeg -i input.mp4 -c:v libx264 -crf 23 -qphist -tune stillimage crfimage.mp4	344 MB	1280x720, 1764 kb/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc	same
ffmpeg -i input.mp4 -c:v libx264 -crf 23 crf23.mp4	290 MB	1280x720, 1465kb/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc	same
ffmpeg -i input.mp4 -vcodec h264 -acodec mp2 output.mp4	336 MB	1280x720, 1465 kb/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc	same
ffmpeg -i input.mp4 -c:v libx264 -preset fast -crf 18 crf18fast.mp4	468 MB	1280x720, 2447 kbps, 30fps, 16:9	aac 4 40Lc, 128 Kbps, 2 chan, 44100, 16 bits
ffmpeg -i input.mp4 -c:v libx264 -b:v 500k 500knoaudioption.mp4	116 MB	1280x720, 500 Kbps, 30 fps, 16:9	aac 4 40LC, 128 Kbps, 2 chan, 44100, 16 bits
ffmpeg -i input.mp4 -r 25 framerate25fps.mp4	282 MB	1280x720, 1423 kb/s, q=-1--1, 25 fps, 12800 tbn, 25 tbc	aac 4 40LC, 128 Kbps, 2 chan, 44100, 16 bits
ffmpeg -i input.mp4 -b:v 64k -bufsize 64k videobitrate64kbits.mp4	37 MB	1280x720, 66 kb/s, q=-1--1, 64 kb/s, 30 fps, 15360 tbn, 30 tbc	aac 4 40LC, 128 Kbps, 2 chan, 44100, 16 bits
ffmpeg -i input.mp4 -c:v libx264 -crf 28 crf28.mp4	195 MB	1280x720, 1051.1kbits/s, q=-1--1, 30 fps, 15360 tbn, 30 tbc	aac 4 40LC, 128 Kbps, 2 chan, 44100, 16 bits
ffmpeg -i input.mp4 -c:v libx264 -crf 28 -acodec libmp3lame -b:a 16k -ac 1 -ar 16000 crf2811.mp4	178 MB	1280x720, 913kbps, q=-1--1, 30 fps, 15360 tbn, 30 tbc	MPEG ver 2 69 Layer 3, 64kbs, 1 chan, 24000 Hz, 16 bits
ffmpeg -i input.mp4 -c:v libx264 -crf 26 -acodec libmp3lame -b:a 24k -ac 2 -ar 24000 crf2611.mp4	MB	1280x720, 913kbps, q=-1--1, 30 fps, 15360 tbn, 30 tbc	MPEG ver 2 69 Layer 3, 64kbs, 1 chan, 24000 Hz, 16 bits

ffmpeg -i input.mp4 -acodec libopus crf2811.mp4

Lossless H.264

You can use -crf 0 to encode a lossless output. Two useful presets for this are ultrafast or veryslow since either a fast encoding speed or best compression are usually the most important factors.

Lossless Example (fastest encoding)

ffmpeg -i input -c:v libx264 -preset ultrafast -crf 0 output.mkv

Lossless Example (best compression)

ffmpeg -i input -c:v libx264 -preset veryslow -crf 0 output.mkv

Note that lossless output files will likely be huge, and most non-FFmpeg based players will not be able to decode them, so if compatibility or file size is an issue you should not use lossless. If you're looking for an output that is roughly "visually lossless" but not technically lossless, use a -crf value of around 17 or 18 (you'll have to experiment to see which value is acceptable for you). It will likely be indistinguishable from the source and will not result in a huge, possibly incompatible file like true lossless mode.

CRF Guide

CRF stands for Constant Rate Factor, x264’s best single-pass encoding method.

Quick Summary: What is the Constant Rate Factor?

The Constant Rate Factor (CRF) is the default quality setting for the x264 encoder. You can set the values between 0 and 51, where lower values would result in better quality (at the expense of higher file sizes). Sane values are between 18 and 28. The default for x264 is 23, so you can use this as a starting point.

With ffmpeg, it'd look like this:

ffmpeg -i input.mp4 -c:v libx264 -crf 23 output.mp4

If you're unsure about what CRF to use, begin with 23 and change it according to your subjective impression of the output. Is the quality good enough? No? Then set a lower CRF. Is the file size too high? Choose a higher CRF. A change of ±6 should result in about half/double the file size, although your results might vary.
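The ±6 rule of thumb can be turned into a rough size estimate. The sketch below is only an illustration: estimate_size_mb is a made-up helper name, and it assumes the size halves exactly for every +6 CRF, which real encodes will only approximate.

```shell
#!/bin/sh
# Rough estimate only: assumes file size halves for every +6 CRF
# (and doubles for every -6), per the rule of thumb above.
# estimate_size_mb is a hypothetical helper, not part of ffmpeg.
estimate_size_mb() {
  known_size_mb=$1   # size of an existing encode, in MB
  known_crf=$2       # CRF used for that encode
  target_crf=$3      # CRF you are considering
  awk -v s="$known_size_mb" -v a="$known_crf" -v b="$target_crf" \
    'BEGIN { printf "%.0f\n", s * 2 ^ ((a - b) / 6) }'
}

estimate_size_mb 336 23 29   # 6 steps higher: prints 168
estimate_size_mb 336 23 17   # 6 steps lower: prints 672
```

Treat the result as a starting point for choosing which CRF values to try, not as a prediction.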

Setting	Compressed size (% of original)	Size	Artifacts
original	100.0%	19.16 GB	uncompressed
crf=14	15.7%	3.00 GB
crf=18	10.1%	1.98 GB
crf=24	5.1%	0.98 GB
crf=28	3.5%	0.68 GB
crf=32	2.4%	0.47 GB	minimal
crf=38	1.9%	0.36 GB	usable
crf=46	1.6%	0.30 GB	bad
crf=51	1.4%	0.28 GB	unusable

NB: the higher the CRF, the smaller the file and the lower the quality.

Video format = Timeline

The following table is a partial history of international video compression standards.

History of video compression standards
Year	Standard	Publisher	Popular implementations
1984	H.120	ITU-T
1988	H.261	ITU-T	Videoconferencing, videotelephony
1993	MPEG-1 Part 2	ISO, IEC	Video-CD
1995	H.262/MPEG-2 Part 2	ISO, IEC, ITU-T	DVD Video, Blu-ray, Digital Video Broadcasting, SVCD
1996	H.263	ITU-T	Videoconferencing, videotelephony, video on mobile phones (3GP)
1999	MPEG-4 Part 2	ISO, IEC	Video on Internet (DivX, Xvid ...)
2003	H.264/MPEG-4 AVC	Sony, Panasonic, Samsung, ISO, IEC, ITU-T	Blu-ray, HD DVD, Digital Video Broadcasting, iPod Video, Apple TV, videoconferencing
2009	VC-2 (Dirac)	SMPTE	Video on Internet, HDTV broadcast, UHDTV
2013	H.265	ISO, IEC, ITU-T

Levels

As the term is used in the standard, a "level" is a specified set of constraints that indicate a degree of required decoder performance for a profile. For example, a level of support within a profile specifies the maximum picture resolution, frame rate, and bit rate that a decoder may use. A decoder that conforms to a given level must be able to decode all bitstreams encoded for that level and all lower levels.

Levels with maximum property values
Level	Max decoding speed (luma samples/s)	Max decoding speed (macroblocks/s)	Max frame size (luma samples)	Max frame size (macroblocks)	Max VCL bit rate, Baseline/Extended/Main Profiles (kbit/s)	Max VCL bit rate, High Profile (kbit/s)	Max VCL bit rate, High 10 Profile (kbit/s)	Examples: resolution @ highest frame rate (max stored frames)
1	380,160	1,485	25,344	99	64	80	192	128×96@30.9 (8) 176×144@15.0 (4)
1b	380,160	1,485	25,344	99	128	160	384	128×96@30.9 (8) 176×144@15.0 (4)
1.1	768,000	3,000	101,376	396	192	240	576	176×144@30.3 (9) 320×240@10.0 (3) 352×288@7.5 (2)
1.2	1,536,000	6,000	101,376	396	384	480	1,152	320×240@20.0 (7) 352×288@15.2 (6)
1.3	3,041,280	11,880	101,376	396	768	960	2,304	320×240@36.0 (7) 352×288@30.0 (6)
2	3,041,280	11,880	101,376	396	2,000	2,500	6,000	320×240@36.0 (7) 352×288@30.0 (6)
2.1	5,068,800	19,800	202,752	792	4,000	5,000	12,000	352×480@30.0 (7) 352×576@25.0 (6)
2.2	5,184,000	20,250	414,720	1,620	4,000	5,000	12,000	352×480@30.7 (12) 352×576@25.6 (10) 720×480@15.0 (6) 720×576@12.5 (5)
3	10,368,000	40,500	414,720	1,620	10,000	12,500	30,000	352×480@61.4 (12) 352×576@51.1 (10) 720×480@30.0 (6) 720×576@25.0 (5)
3.1	27,648,000	108,000	921,600	3,600	14,000	17,500	42,000	720×480@80.0 (13) 720×576@66.7 (11) 1,280×720@30.0 (5)
3.2	55,296,000	216,000	1,310,720	5,120	20,000	25,000	60,000	1,280×720@60.0 (5) 1,280×1,024@42.2 (4)
4	62,914,560	245,760	2,097,152	8,192	20,000	25,000	60,000	1,280×720@68.3 (9) 1,920×1,080@30.1 (4) 2,048×1,024@30.0 (4)
4.1	62,914,560	245,760	2,097,152	8,192	50,000	62,500	150,000	1,280×720@68.3 (9) 1,920×1,080@30.1 (4) 2,048×1,024@30.0 (4)
4.2	133,693,440	522,240	2,228,224	8,704	50,000	62,500	150,000	1,280×720@145.1 (9) 1,920×1,080@64.0 (4) 2,048×1,080@60.0 (4)
5	150,994,944	589,824	5,652,480	22,080	135,000	168,750	405,000	1,920×1,080@72.3 (13) 2,048×1,024@72.0 (13) 2,048×1,080@67.8 (12) 2,560×1,920@30.7 (5) 3,672×1,536@26.7 (5)
5.1	251,658,240	983,040	9,437,184	36,864	240,000	300,000	720,000	1,920×1,080@120.5 (16) 2,560×1,920@51.2 (9) 3,840×2,160@31.7 (5) 4,096×2,048@30.0 (5) 4,096×2,160@28.5 (5) 4,096×2,304@26.7 (5)
5.2	530,841,600	2,073,600	9,437,184	36,864	240,000	300,000	720,000	1,920×1,080@172.0 (16) 2,560×1,920@108.0 (9) 3,840×2,160@66.8 (5) 4,096×2,048@63.3 (5) 4,096×2,160@60.0 (5) 4,096×2,304@56.3 (5)

The maximum bit rate for High Profile is 1.25 times that of the Base/Extended/Main Profiles, 3 times for Hi10P, and 4 times for Hi422P/Hi444PP.

The number of luma samples is 16x16=256 times the number of macroblocks (and the number of luma samples per second is 256 times the number of macroblocks per second).
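The frame rates in the examples column follow from the macroblocks/s limit: divide it by the number of 16x16 macroblocks in one frame. The sketch below illustrates that arithmetic; max_fps is a hypothetical helper name, and it does not check the level's separate maximum frame size.

```shell
#!/bin/sh
# max_fps is a hypothetical helper: it divides a level's macroblocks/s limit
# (from the table above) by the number of macroblocks in one frame.
# It does NOT verify that the frame fits within the level's max frame size.
max_fps() {
  max_mbps=$1; width=$2; height=$3
  awk -v m="$max_mbps" -v w="$width" -v h="$height" 'BEGIN {
    mbw = int((w + 15) / 16)   # frame width in macroblocks, rounded up
    mbh = int((h + 15) / 16)   # frame height in macroblocks, rounded up
    printf "%.1f\n", m / (mbw * mbh)
  }'
}

max_fps 245760 1920 1080   # Level 4:   prints 30.1
max_fps 108000 1280 720    # Level 3.1: prints 30.0
```

Both results match the corresponding entries in the table.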

Decoded picture buffering

Previously encoded pictures are used by H.264/AVC encoders to provide predictions of the values of samples in other pictures. This allows the encoder to make efficient decisions on the best way to encode a given picture. At the decoder, such pictures are stored in a virtual decoded picture buffer (DPB). The maximum capacity of the DPB, in units of frames (or pairs of fields), is shown in parentheses in the right column of the table above and can be computed as follows:

capacity = min(floor(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs)), 16)

Where MaxDpbMbs is a constant value provided in the table below as a function of level number, and PicWidthInMbs and FrameHeightInMbs are the picture width and frame height for the coded video data, expressed in units of macroblocks (rounded up to integer values and accounting for cropping and macroblock pairing when applicable). This formula is specified in sections A.3.1.h and A.3.2.f of the 2009 edition of the standard.

Level	1	1b	1.1	1.2	1.3	2	2.1	2.2	3	3.1	3.2	4	4.1	4.2	5	5.1	5.2
MaxDpbMbs	396	396	900	2,376	2,376	2,376	4,752	8,100	8,100	18,000	20,480	32,768	32,768	34,816	110,400	184,320	184,320

For example, for an HDTV picture that is 1920 samples wide (PicWidthInMbs = 120) and 1080 samples high (FrameHeightInMbs = 68), a Level 4 decoder has a maximum DPB storage capacity of Floor(32768/(120*68)) = 4 frames (or 8 fields) when encoded with minimal cropping parameter values. Thus, the value 4 is shown in parentheses in the table above in the right column of the row for Level 4 with the frame size 1920×1080.

It is important to note that the current picture being decoded is not included in the computation of DPB fullness (unless the encoder has indicated for it to be stored for use as a reference for decoding other pictures or for delayed output timing). Thus, a decoder needs to actually have sufficient memory to handle (at least) one frame more than the maximum capacity of the DPB as calculated above.
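The capacity formula above can be checked with a few lines of shell arithmetic. This is a simplified sketch: dpb_capacity is a hypothetical helper name, and it only rounds the dimensions up to whole macroblocks, ignoring cropping and macroblock pairing.

```shell
#!/bin/sh
# dpb_capacity is a hypothetical helper implementing
#   capacity = min(floor(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs)), 16)
# Simplification: dimensions are just rounded up to whole 16x16 macroblocks.
dpb_capacity() {
  max_dpb_mbs=$1; width=$2; height=$3
  pic_width_in_mbs=$(( (width + 15) / 16 ))
  frame_height_in_mbs=$(( (height + 15) / 16 ))
  capacity=$(( max_dpb_mbs / (pic_width_in_mbs * frame_height_in_mbs) ))
  if [ "$capacity" -gt 16 ]; then
    capacity=16
  fi
  echo "$capacity"
}

dpb_capacity 32768 1920 1080    # Level 4:   prints 4
dpb_capacity 18000 1280 720     # Level 3.1: prints 5
dpb_capacity 184320 1920 1080   # Level 5.1: prints 16 (capped)
```

As noted above, a real decoder needs memory for at least one frame more than this value.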

Audio format

You can use the following table to select the target sound format based on characteristics of sound source.

Characteristics of source sound	Format	Attributes	Megabytes per hour (approximate)
Low-quality voice recording	DSP TrueSpeech	8.0kHz,1 bit, mono	4
Low-quality Internet music	MP3	11.5kHz, 16kBit/s mono	4.5
High-quality voice recording	Lernout & Hauspie	8.0kHz,16 bit, mono	9
High-quality voice recording	WMA voice	20kBit/s, 22.05kHz, mono	9
Middle-quality Internet music	MP3	22.05kHz, 56kBit/s, stereo	25
Near high-quality recordings	MP3	44.1kHz, 128kBit/s, stereo	56
High-quality recordings	WMA lossless	VBR Quality 100, 44 kHz, 2 channel 16 bit	150
High-quality recordings	Flac lossless	96kHz, 16 bit, stereo	240
High-quality (CD quality) recordings	PCM	44.1kHz,16 bit, stereo	600
DVD Audio, Super Audio CD recordings	PCM	96kHz, 24 bit, stereo (available for Professional, VideoPro and Developer Edition users only)	1978

High quality formats

Starting with version 5.2, Total Recorder Professional Edition supports high-quality formats. This includes high-quality PCM formats (up to 192kHz 24bit and float mono and stereo), high-quality FLAC formats (up to 192kHz 24bit mono and stereo), high-quality Windows Media Audio Lossless stereo formats (up to 96kHz 24bit), and the stereo formats of the Windows Media Audio Professional codec.

This is a reference to compare the monophonic (not stereophonic) audio quality and compression bitrates of audio coding formats available for WAV files including PCM, ADPCM, Microsoft GSM 06.10, CELP, SBC, Truespeech and MPEG Layer-3.

Format	Bitrate (kbit/s)	1 minute (KiB)	Sample
11,025 Hz 16 bit PCM	176.4	1292	11k16bitpcm.wav
8,000 Hz 16 bit PCM	128	938	8k16bitpcm.wav
11,025 Hz 8 bit PCM	88.2	646	11k8bitpcm.wav
11,025 Hz µ-Law	88.2	646	11kulaw.wav
8,000 Hz 8 bit PCM	64	469	8k8bitpcm.wav
8,000 Hz µ-Law	64	469	8kulaw.wav
11,025 Hz 4 bit ADPCM	44.1	323	11kadpcm.wav
8,000 Hz 4 bit ADPCM	32	234	8kadpcm.wav
11,025 Hz GSM 06.10	18	132	11kgsm.wav
8,000 Hz MP3 16 kbit/s	16	117	8kmp316.wav
8,000 Hz GSM 06.10	13	103	8kgsm.wav
8,000 Hz Lernout & Hauspie SBC 12 kbit/s	12	88	8ksbc12.wav
8,000 Hz DSP Group Truespeech	9	66	8ktruespeech.wav
8,000 Hz MP3 8 kbit/s	8	60	8kmp38.wav
8,000 Hz Lernout & Hauspie CELP	4.8	35	8kcelp.wav

The above are WAV files; even those that use MP3 compression have the ".wav" extension.
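Most of the "1 minute (KiB)" figures follow directly from the bitrate column. The sketch below shows the conversion, assuming 1 kbit = 1000 bits and 1 KiB = 1024 bytes; kib_per_minute is a hypothetical helper name. A few rows in the table differ slightly from this simple calculation.

```shell
#!/bin/sh
# kib_per_minute is a hypothetical helper:
# kbit/s -> bytes/s (x1000 / 8) -> bytes/min (x60) -> KiB (/1024).
kib_per_minute() {
  awk -v kbps="$1" 'BEGIN { printf "%.0f\n", kbps * 1000 / 8 * 60 / 1024 }'
}

kib_per_minute 176.4   # 11,025 Hz 16 bit PCM: prints 1292
kib_per_minute 64      # 8,000 Hz 8 bit PCM:   prints 469
```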

Technical information LAME#Recommended_encoder_settings

Recommended settings details

**Technical details of the recommended settings**
Switch	Preset	Target Kbps	Typical Kbps	Y Switch enabled by default	Lowpass	Formerly Known As
`-b 320`	`--preset insane`	320	320	Y	20094 Hz – 20627 Hz	api
`-V 0`	`--preset extreme`	~240	220–260		none	ape or apx
`-V 1`		~220	190–250		19383 Hz – 19916 Hz
`-V 2`	`--preset standard`	~190	170–210		18671 Hz – 19205 Hz	aps
`-V 3`		~170	150–195	Y	17960 Hz – 18494 Hz
`-V 4`	`--preset medium`	~160	140–185	Y	17249 Hz – 17782 Hz	apm
`-V 5`		~130	120–150	Y	16538 Hz – 17071 Hz
`-V 6`		~120	100–130	Y	16538 Hz – 17071 Hz

The default lowpass settings were not chosen at random; for general use, they are as high as they can be without putting quality at risk. Raising the cutoff via command-line options is not recommended. See the high-frequency content in MP3s article for more info.
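When encoding through ffmpeg rather than the lame command line, the `-V` levels in the table are reached with libmp3lame's `-q:a` option. The sketch below only prints the command instead of running it; mp3_vbr_cmd and the file names are made up for illustration.

```shell
#!/bin/sh
# mp3_vbr_cmd is a hypothetical helper that prints (does not run) an ffmpeg
# command for VBR MP3. With libmp3lame, -q:a N corresponds to LAME's -V N.
mp3_vbr_cmd() {
  input=$1; output=$2
  quality=$3   # LAME -V level: 0 (best) to 9
  echo ffmpeg -i "$input" -vn -c:a libmp3lame -q:a "$quality" "$output"
}

mp3_vbr_cmd input.mp4 audio.mp3 2
# prints: ffmpeg -i input.mp4 -vn -c:a libmp3lame -q:a 2 audio.mp3
```

`-vn` drops the video stream so that only the audio is written.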

Recommended upload encoding settings

Container: MP4

No Edit Lists (or the video might not get processed correctly)
moov atom at the front of the file (Fast Start)

Audio codec: AAC-LC

Channels: Stereo or Stereo + 5.1
Sample rate: 96 kHz or 48 kHz

Video codec: H.264

Progressive scan (no interlacing)
High Profile
2 consecutive B frames
Closed GOP. GOP of half the frame rate.
CABAC
Variable bitrate. No bitrate limit required, though we offer recommended bit rates below for reference
Chroma subsampling: 4:2:0
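One way to map the settings above onto ffmpeg options is sketched below. It is not an official recipe: youtube_cmd and the file names are hypothetical, `-crf 18` is an assumed quality target for the variable-bitrate requirement, the source is assumed to be progressive with an integer frame rate, and the function only prints the command. CABAC needs no flag because x264 enables it by default outside the Baseline profile.

```shell
#!/bin/sh
# youtube_cmd is a hypothetical helper that prints (does not run) an ffmpeg
# command following the upload settings listed above.
youtube_cmd() {
  input=$1; output=$2
  fps=$3               # integer frame rate of the source (assumption)
  gop=$(( fps / 2 ))   # "GOP of half the frame rate"
  echo ffmpeg -i "$input" \
    -c:v libx264 -profile:v high -pix_fmt yuv420p \
    -bf 2 -g "$gop" -flags +cgop \
    -crf 18 \
    -c:a aac -b:a 384k -ar 48000 \
    -movflags +faststart "$output"
}

youtube_cmd input.mp4 upload.mp4 30
```

Here `-bf 2` sets two consecutive B-frames, `-flags +cgop` requests a closed GOP, `-pix_fmt yuv420p` gives 4:2:0 chroma subsampling, and `-movflags +faststart` moves the moov atom to the front of the file.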

Frame rate

Content should be encoded and uploaded in the same frame rate it was recorded.

Common frame rates include: 24, 25, 30, 48, 50, 60 frames per second (other frame rates are also acceptable).

Interlaced content should be deinterlaced before uploading. For example, 1080i60 content should be deinterlaced to 1080p30, going from 60 interlaced fields per second to 30 progressive frames per second.

Bitrate

The bitrates below are recommendations for uploads. Audio playback bitrate is not related to video resolution.

Recommended audio bitrates for uploads

Type	Audio Bitrate
Mono	128 kbps
Stereo	384 kbps
5.1	512 kbps

Resolution and aspect ratio

YouTube uses 16:9 aspect ratio players. If you're uploading a non-16:9 file, it will be processed and displayed correctly as well, with pillar boxes (black bars on the left and right) or letter boxes (black bars at the top and bottom) provided by the player.

2160p: 3840x2160
1440p: 2560x1440
1080p: 1920x1080
720p: 1280x720
480p: 854x480
360p: 640x360
240p: 426x240

Comparison of ffmpeg’s x264 presets

I measured the performance of ffmpeg’s x264 quality presets to find out which one best suits my purpose: converting MPEG-2 TS recordings to MP4 for Apple TV. The factors to consider are 1) conversion speed, 2) file size and 3) video quality. The most important is conversion speed: ideally, converting a recording should take less time than the recording itself. As for file size, smaller is of course better, since the storage on my Apple TV is limited.

Test Method

The test was done with ffmpeg version 0.6.1 on a 2.8GHz Intel Core i7 running Mac OS X 10.6.8, as follows:

for i in ultrafast superfast veryfast faster fast medium slow slower veryslow placebo
do ffmpeg -i sample.ts -threads 4 -vcodec libx264 \
      -vpre $i -vpre main -crf 18 -s 1280x720 \
      -acodec libfaac -ab 160k \
      -y $i.mp4
done

sample.ts is a 60-second full HD (1440×1080) MPEG-2 video.
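The loop above uses option names from ffmpeg 0.6.1. On current builds, x264 presets are selected with `-preset`, the profile with `-profile:v`, and libfaac is no longer included, so a present-day equivalent might look like the sketch below. It substitutes ffmpeg's native aac encoder for libfaac (an assumption, not what was tested here) and only prints the commands; print_preset_commands is a made-up name.

```shell
#!/bin/sh
# print_preset_commands is a hypothetical helper that prints (does not run)
# one ffmpeg command per x264 preset, using current option names.
# Assumption: the native aac encoder stands in for the removed libfaac.
print_preset_commands() {
  for preset in ultrafast superfast veryfast faster fast medium \
                slow slower veryslow placebo; do
    echo ffmpeg -i sample.ts -threads 4 -c:v libx264 \
      -preset "$preset" -profile:v main -crf 18 -s 1280x720 \
      -c:a aac -b:a 160k \
      -y "$preset.mp4"
  done
}

print_preset_commands
```

Results from such a run would not be directly comparable with the table below, since both the encoder versions and the hardware differ.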

Test Results

preset used for conversion	Conversion speed (fps - frames per second)	Size of the output video file (in bytes)
ultrafast	29.6	81,046,858
superfast	27.9	58,180,478
veryfast	20.6	43,968,615
faster	13.0	32,981,783
fast	9.1	35,461,071
medium	8.0	35,119,322
slow	5.6	33,367,247
slower	2.1	33,698,347
veryslow	1.3	29,697,663
placebo	0.6	29,395,288

In my environment, ultrafast can convert TS video at almost the recording speed (29.97 fps). I decided to go with superfast, though, as its speed is close to that of ultrafast yet its output is much smaller.
