Commits · m70-3538 · mirror / aom

This project is mirrored from https://aomedia.googlesource.com/aom.

09 Oct, 2018 2 commits

row_mt worker should check for bit reader overflow · a5078bf8

Wan-Teh Chang authored 6 years ago

This is the row_mt version of bug oss-fuzz:9663. The row_mt worker
should also check if the entropy decoder has read beyond the end of the
data buffer.

Move most of the code inside the first while loop in
row_mt_worker_hook() to the new parse_tile_row_mt() function. The check
for bit reader overflow is performed in parse_tile_row_mt(). One reason
for adding the new parse_tile_row_mt() function is to highlight its
similarity to the decode_tile() function.

Move the set_decode_func_pointers(td, 0x1) call in row_mt_worker_hook()
from inside the first while loop to the outside, so that it is called
only once.

BUG=oss-fuzz:10646,oss-fuzz:9663

Change-Id: I39735bb7dd8b879985e6cf87e8a24b9dbcd01a34
(cherry picked from commit fe9ce8d6)
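
The overflow check described in this commit can be illustrated with a toy bit reader. This is only a sketch under stated assumptions: `ToyBitReader`, `toy_read_bit()` and `toy_reader_has_overflowed()` are hypothetical names, not libaom's entropy decoder, and the reader is assumed to return zero bits once it runs past the end of the data buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy reader; not libaom's entropy decoder. */
typedef struct {
  const uint8_t *buf;
  size_t size;    /* number of bytes available in buf */
  size_t bit_pos; /* bits consumed so far; may exceed 8 * size */
} ToyBitReader;

/* Reads one bit, most significant bit first. Past the end of the buffer it
 * returns 0 but keeps counting, so the overrun can be detected afterwards. */
int toy_read_bit(ToyBitReader *r) {
  assert(r != NULL);
  int bit = 0;
  const size_t byte = r->bit_pos >> 3;
  if (byte < r->size) bit = (r->buf[byte] >> (7 - (r->bit_pos & 7))) & 1;
  ++r->bit_pos;
  return bit;
}

/* Returns 1 if the reader has consumed more bits than the buffer holds. */
int toy_reader_has_overflowed(const ToyBitReader *r) {
  assert(r != NULL);
  return r->bit_pos > 8 * r->size;
}
```

A parsing loop would call the overflow check after each tile (or tile row) and mark the data as corrupted when it returns 1.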

row_mt_worker_hook: return early if corrupted is 1 · c62663e8

Wan-Teh Chang authored 6 years ago

In row_mt_worker_hook(), if the first while loop exits because
td->xd.corrupted is 1, it should not go on to the second while loop.
Instead, it should set frame_row_mt_info->row_mt_exit to 1, call
pthread_cond_broadcast(pbi->row_mt_cond_), and return early.

BUG=b/114646745

Change-Id: Ie43484332350b40176cf1ef87d79f61f78f8f3a8
(cherry picked from commit 82e0ef6d)
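
A minimal sketch of the early-return pattern described in this commit, assuming a shared exit flag guarded by a mutex and condition variable. `RowMtSync` and `worker_after_parse()` are hypothetical names for illustration; only the pthread calls are real APIs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared row-MT state; not libaom's struct. */
typedef struct {
  pthread_mutex_t mutex;
  pthread_cond_t cond;
  int row_mt_exit; /* set to 1 once a worker has seen corrupted data */
} RowMtSync;

/* Called by a worker when its parse loop ends. Returns 1 if the worker may
 * go on to the decode loop, or 0 if it must return early. */
int worker_after_parse(RowMtSync *sync, int corrupted) {
  assert(sync != NULL);
  if (corrupted) {
    pthread_mutex_lock(&sync->mutex);
    sync->row_mt_exit = 1;
    /* Wake every waiting thread so that none of them blocks forever. */
    pthread_cond_broadcast(&sync->cond);
    pthread_mutex_unlock(&sync->mutex);
    return 0;
  }
  return 1;
}
```

Threads waiting on the condition variable would re-check `row_mt_exit` after waking and stop waiting once it is set.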

19 Sep, 2018 5 commits

Fix cmake -DENABLE_EXAMPLES=0 error. · f866f5eb

Wan-Teh Chang authored 6 years ago

This cmake error was introduced in commit
bdf8bf9f:
https://aomedia-review.googlesource.com/c/aom/+/70787

Add aom_encoder_stats to AOM_LIB_TARGETS in only one place, if
ENABLE_EXAMPLES is nonzero.

Change-Id: Ia29f9df73be7d78c0a753c04845a080556895a75

Bitmask patch 7: Optimize traversal steps. · 1e227d41

Cheng Chen authored 6 years ago

Restructure the loop to reduce calls to the filter-mask getter
functions and to avoid unnecessary operations.

Change-Id: Ic4c34a73a26cda7604a447a99b57f820ca037d20

Bitmask patch 6: Minor improvement. · a2969e72

Cheng Chen authored 6 years ago

Update the bitmask at the exact bit; no for loop is needed.

Change-Id: I8536e2b0511277df7944920d9e598edb60cebdef

Bitmask patch 5: Store vertical border and skip using bitmask · 063975b2

Cheng Chen authored 6 years ago

Vertical border and skip can be represented using one bitmask.
This change reduces operations in storing information, which is
later used to build bitmask.

Change-Id: Ib2ecbd6d4cb6433ca53c1051a81f256c20009ed0

Extend blk_skip to support chroma planes · 39248c31

Grant Hsu authored 6 years ago

Change-Id: Ia1d2d10010c2b6bf0d729b3a32ad072dad1b475a

18 Sep, 2018 16 commits

Bitmask patch 4: Store transform size info for vartx · ec3515bb

Cheng Chen authored 6 years ago

To store tx_size information, instead of advancing by 4x4 step
vertically and tx_size step horizontally (for vertical edges,
similar to horizontal edges), follow recursion in vartx parsing.

Benefit:
Suppose the block is 32x32 with transform sizes 16x16, the original
traversal needs 8*2(vertical) + 8*2(horizontal) = 32 steps.
Now, it needs 4 steps.

Change-Id: I0a9154e2943107613b4b2fcda47324aafb3a7c80
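
The step counts in the "Benefit" paragraph can be reproduced with a small sketch. It assumes a square block whose transform blocks are all square and of the same size; the function names are hypothetical and are not taken from libaom.

```c
#include <assert.h>

/* Steps of the original traversal for ONE edge direction: advance in 4x4
 * units along one axis and in tx_size units along the other. */
int steps_per_direction_old(int block_size, int tx_size) {
  assert(tx_size > 0 && block_size % tx_size == 0 && block_size % 4 == 0);
  return (block_size / 4) * (block_size / tx_size);
}

/* Steps when following the vartx recursion: one per transform block. */
int steps_recursive(int block_size, int tx_size) {
  assert(tx_size > 0 && block_size % tx_size == 0);
  const int per_side = block_size / tx_size;
  return per_side * per_side;
}
```

For a 32x32 block with 16x16 transforms, `2 * steps_per_direction_old(32, 16)` gives the 32 steps of the original traversal (vertical plus horizontal edges), while `steps_recursive(32, 16)` gives 4.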

Bitmask patch 3: Add more lookup table entries for bitmask. · 48d2bafc

Cheng Chen authored 6 years ago

For a given partition block, when transform sizes inside it are the
same, we can directly find an entry from the lookup table and build
the bitmask. This saves bitmask operations.

Change-Id: I1394f704cccf3b4570aa77c1047dea19e23a5b41

Fix sending repeated random_seed in film grain · d13a4eb7

Andrey Norkin authored 6 years ago

Enable denoising by default

BUG: aomedia:2146

Change-Id: I7917a5cfe97b324cc00487efa0bb9ed59cfe236b

Only build stats library if examples are enabled. · bdf8bf9f

Thomas Daede authored 6 years ago

Change-Id: Iad4e8b88dbc669d008e6dfb98ac5926de065c44a

Call decoder_decode() even on invalid input data. · e81859e9

Wan-Teh Chang authored 6 years ago

The first thing decoder_decode() does is to release the output frames
from the previous decoder_decode() call. We need that side effect even
when the input data pointer or data size is invalid, so that the
subsequent aom_codec_get_frame() call (which is called by
av1_dec_fuzzer.cc even after a failed aom_codec_decode() call) will not
return an old output frame again.

BUG=oss-fuzz:10151

Change-Id: I4e3f3b3a3d437abc0140074595d83ac285cf8149
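
The ordering issue in this commit can be shown with a toy model. The sketch below is not libaom's decoder: `ToyDecoder`, `toy_decode()` and `toy_get_frame()` are hypothetical, and the only point is that the previous output is released before the input is validated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy decoder state; not libaom's. */
typedef struct {
  int has_output; /* 1 if a decoded frame is waiting to be fetched */
  int frame_id;   /* id of the most recently decoded frame */
} ToyDecoder;

/* Returns 0 on success and -1 on invalid input. */
int toy_decode(ToyDecoder *dec, const unsigned char *data, size_t size) {
  assert(dec != NULL);
  /* Step 1: always release the output of the previous call first. */
  dec->has_output = 0;
  /* Step 2: only then validate the input. */
  if (data == NULL || size == 0) return -1;
  dec->frame_id += 1;
  dec->has_output = 1;
  return 0;
}

/* Returns the pending frame id, or -1 if no frame is available. */
int toy_get_frame(const ToyDecoder *dec) {
  assert(dec != NULL);
  return dec->has_output ? dec->frame_id : -1;
}
```

If the validation ran before the release step, a failed `toy_decode()` call would leave the old frame pending and the next `toy_get_frame()` call would return it again.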

Fix range check in half_btf() function · 6f33c683

David Barker authored 6 years ago

Per the linked bug report, the condition on 'result_64' in half_btf()
is not satisfied by all conformant bitstreams. But it can easily be
modified to give a condition which *is* true for all conformant bitstreams.

Add a comment explaining the updated condition, as well as its implications
for alternative implementations (i.e. hardware and/or vectorized code).

Similarly, there is an unnecessary range check on the output of
av1_iadst4_new(). This is followed immediately by a clamp to the same
range, so we shouldn't check the range here (and indeed the spec doesn't
do so).

BUG=aomedia:2151

Change-Id: Iab30344e2e7b1ce00245541b1a32a9495d5ed717

Optimize set_planes_to_neutral_grey for highbd. · 6efd3b24

Wan-Teh Chang authored 6 years ago

In the high bit depth case, the frame buffer holds uint16_t samples and
we cannot use memset() to set the frame buffer to neutral grey. Since
all the rows will be set to the same bytes, call aom_memset16() to set
the first row to neutral grey, and then call memcpy() to copy the first
row to all subsequent rows.

BUG=oss-fuzz:10242

Change-Id: I0694cc58dee75f938c9d88bc498eabd72b2c0648
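
A minimal sketch of the fill-one-row-then-copy approach described in this commit. `fill_u16()` is a local stand-in for aom_memset16() so that the example is self-contained, and the neutral grey value is assumed to be the mid-level sample `1 << (bit_depth - 1)`; the function name and signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for aom_memset16(): stores `value` into `count` samples. */
static void fill_u16(uint16_t *dst, uint16_t value, size_t count) {
  for (size_t i = 0; i < count; ++i) dst[i] = value;
}

/* Sets a width x height plane of 16-bit samples to neutral grey.
 * `stride` is measured in samples, not bytes. */
void set_plane_to_neutral_grey_u16(uint16_t *plane, size_t width,
                                   size_t height, size_t stride,
                                   int bit_depth) {
  assert(plane != NULL && stride >= width);
  assert(bit_depth >= 8 && bit_depth <= 16);
  if (width == 0 || height == 0) return;
  /* Assumption: neutral grey is the mid-level sample value. */
  const uint16_t grey = (uint16_t)(1u << (bit_depth - 1));
  /* Fill the first row sample by sample... */
  fill_u16(plane, grey, width);
  /* ...then copy it to every later row, since all rows are identical. */
  for (size_t row = 1; row < height; ++row) {
    memcpy(plane + row * stride, plane, width * sizeof(*plane));
  }
}
```

Only the first row needs a per-sample store; the remaining rows are bulk copies, which is where the saving comes from.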

Bitmask patch 2: Applying filtering once bitmask is built · 146b72ce

Cheng Chen authored 6 years ago

(1). Apply loop filtering after the bitmask is built.
This process is decoder only because the bitmask info is stored
at decoding time. For the encoder, the loop filter should go through
the original path.

(2). Apply filtering first vertically and then horizontally for
each superblock. This way is preferred since it goes through the
buffer only once.

(3). Call dual filter functions correctly, since the bitmask enables it.

Change-Id: I0034b633ceef4231a55341e08be819a482f97890

Bitmask patch 1: Reduce u/v horizontal/vertical · 69ac765d

Cheng Chen authored 6 years ago

(1). u/v planes do not need to distinguish vertical and horizontal
direction.

(2). Use memset for decoder to store information.

Change-Id: I2fffa3404cf9c3928b7e62ed0918e24749987fb3

Propagate user private data when film grain enabled. · 881889ad

Tom Finegan authored 6 years ago

BUG=aomedia:2163

Change-Id: Ib8e059347d630fd97c4ed7e345a71d74281eedde

Fix 32bit build for arm target arch 8 · f651c0fb

Umang Saini authored 6 years ago

Changed the macro to check for a 64-bit build, because the
toolchain defines __ARM_ARCH as 8 for 32-bit armv8 targets.

Change-Id: Id7fdb121b0b4c71f8a075a89ca8c3cfb35d94762

Add cmdline switch --skip-film-grain to aomdec · 5e1fa0d3

Yaowu Xu authored 6 years ago

This commit adds a commandline switch "--skip-film-grain" to aomdec,
which allows decoder to output decoded video either with film grain
applied or without film grain applied.

BUG=aomedia:2126

Change-Id: I3f98dbb83e14343de9bd6c7230bfe736e51985f1

Add sse4_1 variant for highbd inv_txfm 32x32 · 4db12666

Remya authored 6 years ago

Coded different variants of idct32x32_sse4_1 based on eobx logic.

Achieved module level gains of 7.7x on an average over all eob values.

Change-Id: I209c6d7e0c44b5c0c8b7ebda890c7698cda61bb8

Optimize highbd 64x64 fwd_txfm · fd5aae41

Satish Kumar Suman authored 6 years ago

Added sse4_1 variant for highbd 64x64 fwd_txfm.

When tested for 20 frames of crowd_run_360p_10 at 750 kbps,
observed ~3.7% reduction in encoder time for speed=1 preset.

Achieved module level gains of 4x w.r.t. C function.

Change-Id: Id9da2231a7a5c0eebe81f5062f8c2d5a7feb3227

Add ML-based rectangular partition pruning · 9b5fb2ce

Alexander Bokov authored 6 years ago

Average speed-up ranges from 5% to 6% on speed 0 depending on QP
(measured on 16 lowres and midres clips).

Coding efficiency impact is minor:

                 |  avg_psnr  |  ovr_psnr  |   ssim
------------------------------------------------------
lowres (speed=0) |   0.024%   |   0.023%   |  0.043%
------------------------------------------------------
midres (speed=0) |   0.030%   |   0.028%   |  0.011%
------------------------------------------------------
hdres (speed=1)  |   0.028%   |   0.028%   |  0.033%
------------------------------------------------------

STATS_CHANGED

Change-Id: I0d5e91cbe056c6c4e904a68ea3d173191dfdd5ca

Add speed feature skip_repeated_newmv · 731b5cff

Grant Hsu authored 6 years ago

If the same motion search result is found in single-motion NEWMV mode,
skip further search based on the cost differences. Enabled at
speed 1 and above. AWCY shows 0.03% PSNR loss.

Tested BasketballDrill_832x480_50 (20 frames, bitrate=800),
Speed 1: 3.4% faster.
Speed 8: 9.6% faster.

STATS_CHANGED

Change-Id: Ic0736c9bf3f409e3af9764543a6d09aba7d367ef

17 Sep, 2018 1 commit

Fix hang in multi-thread decoding · a39eb91b

Deepa K G authored 6 years ago

Threads waiting for parsing to be completed are
unblocked when erroneous stream is encountered.

Change-Id: I85815c134a4d2f5e007e57ffe4815b25fd90f4c3

16 Sep, 2018 1 commit

Add decoder control skip_film_grain · 6fa40060

Yaowu Xu authored 6 years ago

This commit adds a decoder control to skip the application of film
grain in the decoder, so that the decoder outputs the reconstructed image
without applying film grain even if parameters are signalled in the bitstream.

BUG=aomedia:2126

Change-Id: Ia88fde91c53755427ba55958e8fe5e648eff108e

15 Sep, 2018 11 commits

Define the maximum GOP size for the 4-layer structure · cdb083e8

Wei-Ting Lin authored 6 years ago

Allow a GOP of size <= 24 to use the pyramid structure, but currently this
change has no effect since we use a fixed GF size = 16.

We might want to group a few stray frames into the current GOP for
forward keyframe setting or use ALTREF to match the quality of the
next keyframe.

Change-Id: I228d4515a6775d14f0ea22ad2e88b418964bcdc2

Account for partition rd cost in partition none search · 02b5d1bf

Cherma Rajan A authored 6 years ago

Partition rd cost is accounted in the calculation of best rd
so far in partition none search.
For speed=1 preset, 4.8% encode time reduced for 20 frames of
BasketballDrill_832x480_50 content when encoded at 1 mbps and
0.04% average BD-rate drop is seen for AWCY tests.

STATS_CHANGED

Change-Id: I1672ce51d51775f94bf5ec7e58c930ce4fcef2bb

port libvpx sse2 quantize · 44a445cc

Johann authored 6 years ago

libvpx got a lot of quantize improvements after libaom was forked.

For libaom we just need to remove skip_block and always assume the
large tran_low_t values.

Change-Id: Iaba4e5ae44146b77a5ee06f9f8af852457b71cf3

aomdec: respect chroma-sample-position · 27d46e38

elliottk authored 6 years ago

BUG=aomedia:2096

Change-Id: I278c0ddd3acb5a4fb514bce5d6acd2d302c7807c

Clean up some unnecessary variables · 84b3e446

Wei-Ting Lin authored 6 years ago

This patch is a follow up on comments of commit 70785

Change-Id: I625fe4692b5552f58b19bb44a8b72d1c25f15546

Account for partition rd cost in sqr and rect partition search · 7d2577f5

Cherma Rajan A authored 6 years ago

Partition rd cost is accounted in the calculation of best rd
so far in Square, Horizontal and Vertical partition search.
For speed=1 preset, 2.6% encode time reduced for 20 frames of
BasketballDrill_832x480_50 content when encoded at 1 mbps and
0.02% average BD-rate drop is seen for AWCY tests.

STATS_CHANGED

Change-Id: I9add751e3cd2746d9c4e303b0bc637c0fede2292

Print warning when upgrading bitdepth · 43bc4593

elliottk authored 6 years ago

Matches behavior for profile upgrading

BUG=aomedia:2147

Change-Id: I28762c3ba6aea35f91648437ea630fff047ff60a

add quantize functions to tests · 89b4c615

Johann authored 6 years ago

All the smaller (16x16) quantize tests pass. The larger ones do not but
we can get limited coverage by testing the ssse3 against the avx code.

Similar issue to libvpx:
BUG=webm:1448

Change-Id: Ide8a381a3898c12b6f9a033dc3c932c5df629514

Refine mode rd based gating · fa29a1cb

Venkat authored 6 years ago

Refactored the mode rd cost based gating.

When tested for 20 frames of BasketBallDrill_832x480_50 at 1 mbps,
observed ~0.3% reduction in encoder time for speed=1 preset.

Per AWCY runs, the average PSNR change is 0.00% for the speed=1 preset.

STATS_CHANGED

Change-Id: I56c6c7e82552bda0abfe1ed4d07935092314ccc3

Add speed feature to turn off global motion recode · 79853eb1

Debargha Mukherjee authored 6 years ago

This feature is turned on by default. There are big speedups
on certain sequences.
Ex. city_cif.y4m 20 frames goes from: 410ms -> 234ms

Coding efficiency loss is +0.21% on cam_lowres, and
+0.14% on lowres.

The global motion recode loop is quite inefficient, and other ways
to recover the loss by a better implementation of the strategy
without a full recode will be explored next. But as of now the
trade-off seems reasonable.

The reduced reference global motion search feature is also
turned on by default. It gives a small speedup, but negligible
loss in coding efficiency.

STATS_CHANGED expected.

Change-Id: If7c971717af60c1c12cef8bc2ba44b3a651ea96a

Refactor av1_rd_pick_inter_mode_sb · 0e94366d

Peng Bin authored 6 years ago

1. Extract code for sf->drop_ref as functions
2. Refine the initialization of HandleInterModeArgs
3. Use comp_pred instead of checking
"second_ref_frame > INTRA_FRAME"

Change-Id: Ib33dd45fcbd503c7912521c14849a30fab2f329e

14 Sep, 2018 4 commits

Fix initialization in loop restoration mt · 32017745

Ravi Chaudhary authored 6 years ago

When luma loop restoration is disabled, the initialization of
cur_sb_col in lr_sync was not happening correctly.

BUG=b/114647746
BUG=oss-fuzz:10252

Change-Id: I842a4a142680fdc78265c2f037b8bb1641f5e5d3

Enforce new stack size limits to avoid regressions · 8ae39302

Urvang Joshi authored 6 years ago

New limits are enforced only for the core library (in C). Unit tests
(C++) still use larger limits, as some of them use large stacks.

BUG=aomedia:2135

Change-Id: Ie4a4adae279cedbb08e8ee2b71a61ad54ce0acdc

Refine handle_inter_intra_mode · 2013ee9a

Grant Hsu authored 6 years ago

Use the motion search result of simple translation as mv predictor
for inter_intra newmv refine search.

Tested BasketballDrill_832x480_50 (20 frames, bitrate=800, Speed 1),
it is 0.4% faster. AWCY result shows 0.03 improvement.

STATS_CHANGED

Change-Id: Iec5763d65dcb72cca95500d29421218eb077acf8

AV1FrameSizeTests.LargeValidSizes: avoid segfault. · e41de6b5

Urvang Joshi authored 6 years ago

We check if pointers are NULL before freeing the contents.

BUG=aomedia:2144

Change-Id: Id0acf7c187cfa065ff9c72340315fb728f94ad9e
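
The guard described in this commit might look like the following sketch. `Buffer` and `buffer_free()` are hypothetical names, and the failure scenario (an allocation that failed for a very large frame size, leaving the outer pointer NULL) is an assumption for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container; not a libaom type. */
typedef struct {
  unsigned char *data;
  size_t size;
} Buffer;

/* Frees the buffer contents and the buffer itself. Safe to call when
 * *buf_ptr is NULL, e.g. after an allocation that failed. */
void buffer_free(Buffer **buf_ptr) {
  assert(buf_ptr != NULL);
  Buffer *buf = *buf_ptr;
  if (buf == NULL) return; /* nothing was allocated: do not dereference */
  free(buf->data);         /* free(NULL) is a no-op, so no check is needed */
  free(buf);
  *buf_ptr = NULL;
}
```

Without the NULL check on the outer pointer, reading `buf->data` would dereference NULL and crash.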
