The libjpeg in-memory destination manager has always re-allocated the
JPEG destination buffer if the specified buffer pointer is NULL or the
specified buffer size is 0. TurboJPEG's destination manager inherited
that behavior. Because of fe80ec2275,
TurboJPEG's destination manager tries to reuse the most recent
destination buffer if the same buffer pointer is specified. (The
purpose of that is to enable repeated invocations of tjCompress*() or
tjTransform() to automatically grow the destination buffer, as needed,
with no intervention from the calling program.) However, because of the
inherited code, TurboJPEG's destination manager also reallocated the
destination buffer if the specified buffer size was 0. Thus, passing a
previously-used JPEG destination buffer pointer to tjCompress*() or
tjTransform() while specifying a destination buffer size of 0 confused
the destination manager. It reallocated the destination buffer to 4096
bytes but reported the old destination buffer size to the libjpeg API.
This caused a buffer overrun if the old destination buffer size was
larger than 4096 bytes.
The documentation for tjCompress*() is contradictory on this matter.
It states that the JPEG destination buffer size must be specified if the
destination buffer pointer is non-NULL. However, it also states that,
if the destination buffer is reused, the specified destination buffer
size is ignored. The documentation for tjTransform() does not specify
the function's behavior if the destination buffer is reused. Thus, the
behavior of the API is at best undefined if a calling application
attempts to reuse a destination buffer while specifying a destination
buffer size of 0. If that ever worked, it only worked in libjpeg-turbo
1.3.x and prior.
This issue was exposed only through API abuse, and calling applications
that abused the API in that manner would not have worked for the last 11
years. Thus, the issue did not represent a security threat. This
commit merely hardens the API against such abuse, by modifying
TurboJPEG's destination manager so that it refuses to re-allocate the
JPEG destination buffer if the buffer pointer is reused and the
specified buffer size is 0. That is consistent with the most permissive
interpretation of the TurboJPEG API documentation. (The API already
ignored the destination buffer size if the destination buffer pointer
was reused and the specified buffer size was non-zero. It makes sense
for it to do likewise if the specified buffer size is 0.) This commit
also modifies TJUnitTest so that it verifies whether the API is hardened
against the aforementioned abuse.
- Due to an oversight, a113506d17
(libjpeg-turbo 1.4 beta1) effectively made the call to
std_huff_tables() in jpeg_set_defaults() a no-op if the Huffman tables
were previously defined, which made it impossible to disable Huffman
table optimization or progressive mode if they were previously enabled
in the same API instance. std_huff_tables() retains its previous
behavior for decompression instances, but it now force-enables the
standard (baseline) Huffman tables for compression instances.
jcopy_markers_execute() has historically ignored its option argument,
which is OK for jpegtran, but tj*Transform() needs to be able to save a
set of markers from the source image and write a subset of those markers
to each destination image. Without that ability, the function
effectively behaved as if TJ*OPT_COPYNONE was not specified unless all
transforms specified it.
Because the crop spec was parsed using unsigned 32-bit integers,
negative numbers were interpreted as values ~= UINT_MAX (4,294,967,295).
This had the following ramifications:
- If the cropping region width was negative and the adjusted width + the
adjusted left boundary was greater than 0, then the 32-bit unsigned
integer bounds checks in djpeg and jpeg_crop_scanline() overflowed and
failed to detect the out-of-bounds width, jpeg_crop_scanline() set
cinfo->output_width to a value ~= UINT_MAX, and a buffer overrun and
subsequent segfault occurred in the upsampling or color conversion
routine. The segfault occurred in the body of
jpeg_skip_scanlines() --> read_and_discard_scanlines() if the cropping
region upper boundary was greater than 0 and the JPEG image used
chrominance subsampling and in the body of jpeg_read_scanlines()
otherwise.
- If the cropping region width was negative and the adjusted width + the
adjusted left boundary was 0, then a zero-width output image was
generated.
- If the cropping region left boundary was negative, then an output
image with bogus data was generated.
This commit modifies djpeg and jpeg_crop_scanline() so that the
aforementioned bounds checks use 64-bit unsigned integers, thus guarding
against overflow. It similarly modifies jpeg_skip_scanlines(). In the
case of jpeg_skip_scanlines(), the issue was not reproducible with
djpeg, but passing a negative number of lines to jpeg_skip_scanlines()
caused a similar overflow if the number of lines +
cinfo->output_scanline was greater than 0. That caused
jpeg_skip_scanlines() to read past the end of the JPEG image, throw a
warning ("Corrupt JPEG data: premature end of data segment"), and fail
to return unless warnings were treated as fatal. Also, djpeg now parses
the crop spec using signed integers and checks for negative values.
If the calling application invokes jpeg_save_markers() to save a
particular type of marker, then the save_marker() function will be
invoked for every marker of that type that is encountered. Previously,
only the head of the marker linked list was stored (in
jpeg_decompress_struct), so save_marker() had to traverse the entire
linked list before it could add a new marker to the tail of the list.
That caused CPU usage to grow exponentially with the number of markers.
Referring to #764, it is possible to create a JPEG image that contains
an excessive number of markers. The specific reproducer that uncovered
this issue is a specially-crafted 1-megabyte malformed JPEG image with
tens of thousands of APP1 markers, which required approximately 30
seconds of CPU time (on a modern Intel processor) to process. However,
it should also be possible to create a legitimate JPEG image that
reproduces the issue (such as an image with tens of thousands of
duplicate EXIF tags.)
This commit introduces a new pointer (in jpeg_decomp_master, in order to
preserve backward ABI compatibility) that is used to store the tail of
the marker linked list whenever a marker is added to it. Thus, it is no
longer necessary to traverse the list when adding a marker, and CPU
usage will grow linearly rather than exponentially with the number of
markers.
Fixes#764
If an error manager instance is passed to jpeg_std_error(), then its
format_message() method will point to the format_message() function in
jerror.c. The format_message() function passes all eight values from
the jpeg_error_mgr::msg_parm.i[] array as arguments to
snprintf()/_snprintf_s(), even if the format string doesn't use all of
those values. Subsequently invoking one of the ERREXIT[1-6]() macros
will leave the unused values uninitialized, and if the
-fsanitize-memory-param-retval option (introduced in Clang 14) is
enabled (which it is by default in Clang 16 and later), then MSan will
complain when the format_message() function tries to pass the
uninitialized-but-unused values as function arguments.
This commit modifies jpeg_std_error() so that it zeroes out the error
manager instance passed to it, thus working around the warning as well
as simplifying the code.
Closes#761
This is very subtle, but if a user specifies a libjpeg virtual array
memory limit via the JPEGMEM environment variable and one of the
tjCompress*() functions hits that limit, the libjpeg error handler
will be invoked in jpeg_start_compress() (more specifically in
realize_virt_arrays() in jinit_compress_master()) before the libjpeg
global compression state can be incremented. Thus,
jpeg_abort_compress() will not be called before the tjCompress*()
function exits, the unrealized virtual arrays will not be freed, and if
the TurboJPEG compression instance is reused, those unrealized virtual
arrays will count against the specified memory limit. This could cause
subsequent compression operations that require smaller virtual arrays
(or even no virtual arrays at all) to fail when they would otherwise
succeed. In reality, the vast majority of calling programs would abort
and free the TurboJPEG compression instance if one of the tjCompress*()
functions failed, but TJBench is a rare exception. This issue does not
bear documenting because of its subtlety and rarity and because JPEGMEM
is not a documented feature of the TurboJPEG API.
Note that the issue does not exist in the tjEncode*() and tjDecode*()
functions, because realize_virt_arrays() is never called in the body of
those functions. The issue also does not exist in the tjDecompress*()
and tjTransform() functions, because those functions ensure that the
JPEG header is read (and thus the libjpeg global decompression state is
incremented) prior to calling a function that calls
realize_virt_arrays() (i.e. jpeg_start_decompress() or
jpeg_read_coefficients().) If realize_virt_arrays() failed in the body
of jpeg_write_coefficients(), then tjTransform() would abort without
calling jpeg_abort_compress(). However, since jpeg_start_compress() is
never called in the body of tjTransform(), no virtual arrays are ever
requested from the compression object, so failing to call
jpeg_abort_compress() would be innocuous.
If the align parameter was set to an unreasonably large value, such as
0x2000000, strides[0] * ph0 and strides[1] * ph1 could have overflowed
the int datatype and wrapped around when computing (src|dst)Planes[1]
and (src|dst)Planes[2] (respectively.) This would have caused
(src|dst)Planes[1] and (src|dst)Planes[2] to point to lower addresses in
the YUV buffer than expected, so the worst case would have been a
visually incorrect output image, not a buffer overrun or other
exploitable issue.
Because of a569e8cc2c, we can now
reliably determine the correct default value for FLOATTEST when using
Clang or GCC to build for AArch64 or PowerPC platforms. (Testing
confirms that this is the case with GCC 5-13 and Clang 5-14 on
Ubuntu/AArch64, GCC 4 on CentOS 7/PPC, and GCC 8-10 and Clang 6-12 on
Ubuntu/PPCLE.)
Referring to 63bd71885f, it is not
possible to reliably determine the correct default value for FLOATTEST
when using other compilers or building for other architectures until
those compilers and architectures are thoroughly tested. Thus, we no
longer attempt to set a default value except with tested configurations.
Other CPU architectures and compilers can be added on a case-by-case
basis as they are tested.
libjpeg-turbo implements 4:4:0 "fancy" (smooth) upsampling, which is
enabled by default when decompressing JPEG images that use 4:4:0
chrominance subsampling. libjpeg did not and does not implement fancy
4:4:0 upsampling.
The MD5 sums associated with FLOATTEST=fp-contract are appropriate for
GCC (tested v5 through v13) with -ffp-contract=fast, which is the
default when compiling for an architecture that has fused multiply-add
(FMA) instructions. However, different MD5 sums are needed for Clang
(tested v5 through v14) with -ffp-contract=on, which is now the default
in Clang 14 when compiling for an architecture that has FMA
instructions.
Refer to #705, #709, #710
This fixes a buffer overrun, reported by OSS-Fuzz, that occurred when
attempting to transform a specially-crafted malformed arithmetic-coded
JPEG image into a baseline Huffman-coded JPEG destination image with
default Huffman tables. This issue probably had a similar root cause to
the issue fixed in bae3804a1c, but in this
case, the issue only occurred with the SIMD baseline Huffman encoder in
libjpeg-turbo 2.1.x and 2.0.x. It was not reproducible in 3.0.x or when
using the C baseline Huffman encoder.
The intent was for the final transform operation to be the same as the
first transform operation but without TJXOPT_COPYNONE or
TJFLAG_NOREALLOC. Unrolling the transform operations in
655450bbde accidentally changed that.
Referring to
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=60379
there are some specially-crafted malformed JPEG images that, when
transformed to grayscale, will exceed the worst-case transformed
grayscale JPEG image size. This is similar in nature to the issue fixed
by 655450bbde, except that in this case,
the issue occurs regardless of the amount of metadata in the source
image. Also, the tjTransform() function, the
Java_org_libjpegturbo_turbojpeg_TJTransformer_transform() JNI function,
and TJBench were behaving correctly in this case, because the TurboJPEG
API documentation specifies that the source image's subsampling type
should be used when computing the worst-case transformed JPEG image
size. (However, only the Java API documentation specified that. Oops.
The C API documentation now does as well.) The documented usage
mitigates the issue, and only the transform fuzzer did not adhere to
that. Thus, this was an issue with the fuzzer itself rather than an
issue with the library.
Restore two coefficient range checks from libjpeg to the C baseline
Huffman encoder. This fixes an issue
(https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=60253) whereby
the encoder could read from uninitialized memory when attempting to
transform a specially-crafted malformed arithmetic-coded JPEG source
image into a baseline Huffman-coded JPEG destination image with default
Huffman tables. More specifically, the out-of-range coefficients caused
r to equal 256, which overflowed the actbl->ehufsi[] array. Because the
overflow was contained within the huff_entropy_encoder structure, this
issue was not exploitable (nor was it observable at all on x86 or Arm
CPUs unless JSIMD_NOHUFFENC=1 or JSIMD_FORCENONE=1 was set in the
environment or unless libjpeg-turbo was built with WITH_SIMD=0.)
The fix is performance-neutral (+/- 1-2%) for x86-64 code and causes a
0-4% (avg. 1-2%, +/- 1-2%) compression regression for i386 code on Intel
CPUs when the C baseline Huffman encoder is used (JSIMD_NOHUFFENC=1).
The fix is performance-neutral (+/- 1-2%) on Intel CPUs when all of the
libjpeg-turbo SIMD extensions are disabled (JSIMD_FORCENONE=1). The fix
causes a 0-2% (avg. <1%, +/- 1%) compression regression for PowerPC
code.
When used with TJFLAG_NOREALLOC and with TJXOP_TRANSPOSE,
TJXOP_TRANSVERSE, TJXOP_ROT90, or TJXOP_ROT270, tjTransform()
incorrectly based the destination buffer size for a transform on the
source image dimensions rather than the transformed image dimensions.
This was apparently a long-standing bug that had existed in the
tjTransform() function since its inception. As initially implemented in
the evolving libjpeg-turbo v1.2 code base, tjTransform() required
dstSizes[i] to be set regardless of whether TJFLAG_NOREALLOC was set.
ff78e37595, which was introduced later in
the evolving libjpeg-turbo v1.2 code base, removed that requirement and
planted the seed for the bug. However, the bug was not activated until
9b49f0e4c7 was introduced still later in
the evolving libjpeg-turbo v1.2 code base, adding a subsampling type
argument to the (new at the time) tjBufSize() function and thus making
the width and height arguments no longer commutative.
The bug opened up the possibility that a JPEG source image could cause
tjTransform() to overflow the destination buffer for a transform if all
of the following were true:
- The JPEG source image used 4:2:2, 4:4:0, or 4:1:1 subsampling. (These
are the only supported subsampling types for which the width and
height arguments to tjBufSize() are not commutative.)
- The width and height of the JPEG source image were such that
tjBufSize(height, width, subsamplingType) returned a smaller value
than tjBufSize(width, height, subsamplingType).
- The JPEG source image contained enough metadata that the size of the
transformed image was larger than
tjBufSize(height, width, subsamplingType).
- TJFLAG_NOREALLOC was set.
- TJXOP_TRANSPOSE, TJXOP_TRANSVERSE, TJXOP_ROT90, or TJXOP_ROT270 was
used.
- TJXOPT_COPYNONE was not set.
- TJXOPT_CROP was not set.
- The calling program allocated
tjBufSize(height, width, subsamplingType) bytes for the
destination buffer, as the API documentation instructs.
The API documentation cautions that JPEG source images containing a
large amount of extraneous metadata (EXIF, IPTC, ICC, etc.) cannot
reliably be transformed if TJFLAG_NOREALLOC is set and TJXOPT_COPYNONE
is not set. Irrespective of the bug, there are still cases in which a
JPEG source image with a large amount of metadata can, when transformed,
exceed the worst-case transformed JPEG image size. For instance, if you
try to losslessly crop a JPEG image with 3 kB of EXIF data to 16x16
pixels, then you are guaranteed to exceed the worst-case 16x16 JPEG
image size unless you discard the EXIF data.
Even without the bug, tjTransform() will still fail with "Buffer passed
to JPEG library is too small" when attempting to transform JPEG source
images that meet the aforementioned criteria. The bug is that the
function segfaults rather than failing gracefully, but the chances of
that occurring in a real-world application are very slim. Any
real-world application developers who attempted to transform arbitrary
JPEG source images with TJFLAG_NOREALLOC set would very quickly realize
that they cannot reliably do that without also setting TJXOPT_COPYNONE.
Thus, I posit that the actual risk posed by this bug is low.
Applications such as web browsers that are the most exposed to security
risks from arbitrary JPEG source images do not use the TurboJPEG
lossless transform feature. (None of those applications even use the
TurboJPEG API, to the best of my knowledge, and the public libjpeg API
has no equivalent transform function.) Our only command-line interface
to the tjTransform() function, TJBench, was not exposed to the bug
because it had a compatible bug whereby it allocated the JPEG
destination buffer to the same size that tjTransform() erroneously
expected. The TurboJPEG Java API was also not exposed to the bug
because of a similar compatible bug in the
Java_org_libjpegturbo_turbojpeg_TJTransformer_transform() JNI function.
(This commit fixes both compatible bugs.)
In short, best practices for tjTransform() are to use TJFLAG_NOREALLOC
only with JPEG source images that are known to be free of metadata (such
as images generated by tjCompress*()) or to use TJXOPT_COPYNONE along
with TJFLAG_NOREALLOC. Still, however, the function shouldn't segfault
as long as the calling program allocates the suggested amount of space
for the JPEG destination buffer.
Usability notes:
tjTransform() could hypothetically require dstSizes[i] to be set
regardless of whether TJFLAG_NOREALLOC is set, but there are usability
pitfalls either way. The main pitfall I sought to avoid with
ff78e37595 was a calling program failing
to set dstSizes[i] at all, thus leaving its value undefined. It could
be argued that requiring dstSizes[i] to be set in all cases is more
consistent, but it could also be argued that not requiring it to be set
when TJFLAG_NOREALLOC is set is more user-proof. tjTransform() could
also hypothetically set TJXOPT_COPYNONE automatically when
TJFLAG_NOREALLOC is set, but that could lead to user confusion.
Because width, height, and tjPixelSize[] are signed integers, signed
integer overflow will occur if width * height *
tjPixelSize[pixelFormat] > INT_MAX or jpegSize > INT_MAX, which would
cause an incorrect value to be passed to tjAlloc(). This commit
modifies tjexample.c in the following ways:
- Use malloc() rather than tjAlloc() to allocate the uncompressed image
buffer. (tjAlloc() is only necessary for JPEG buffers that will
potentially be reallocated by the TurboJPEG API library.)
- Implicitly promote width, height, and tjPixelSize[pixelFormat] to
size_t before multiplying them.
- If size_t is 32-bit, throw an error if width * height *
tjPixelSize[pixelFormat] would overflow the data type.
- Throw an error if jpegSize would overflow a signed integer (which
could only happen on a 64-bit machine.)
Since tjexample is not installed or packaged, the worst case for this
issue was that a downstream application might interpret tjexample.c
literally and introduce a similar overflow issue into its own code.
However, it's worth noting that such issues could also be introduced
when using malloc().
When computing the downsampled width for a particular component,
jpeg_crop_scanline() needs to take into account the fact that the
libjpeg code uses a combination of IDCT scaling and upsampling to
implement 4x2 and 2x4 upsampling with certain decompression scaling
factors. Failing to account for that led to incomplete upsampling of
4x2- or 2x4-subsampled components, which caused the color converter to
read from uninitialized memory. With 12-bit data precision, this caused
a buffer overrun or underrun and subsequent segfault if the
uninitialized memory contained a value that was outside of the valid
sample range (because the color converter uses the value as an array
index.)
Fixes#669
The 2-pass color quantization algorithm assumes 3-sample pixels. RGB565
is the only 3-component colorspace that doesn't have 3-sample pixels, so
we need to treat it as a special case when determining whether to enable
2-pass color quantization. Otherwise, attempting to initialize 2-pass
color quantization with an RGB565 output buffer could cause
prescan_quantize() to read from uninitialized memory and subsequently
underflow/overflow the histogram array.
djpeg is supposed to fail gracefully if both -rgb565 and -colors are
specified, because none of its destination managers (image writers)
support color quantization with RGB565. However, prescan_quantize() was
called before that could occur. It is possible but very unlikely that
these issues could have been reproduced in applications other than
djpeg. The issues involve the use of two features (12-bit precision and
RGB565) that are incompatible, and they also involve the use of two
rarely-used legacy features (RGB565 and color quantization) that don't
make much sense when combined.
Fixes#668Fixes#671Fixes#680
When -tile is used with a JPEG input image, TJBench generates the tiles
using lossless cropping, which will fail if the cropping region doesn't
align with an MCU boundary. Furthermore, there is no reason to avoid
8x8 tiles when decompressing 4:4:4 or grayscale JPEG images.
It's not that the build system doesn't support multiple values in
CMAKE_OSX_ARCHITECTURES. It's that libjpeg-turbo, because of its SIMD
extensions, *cannot* support multiple values in CMAKE_OSX_ARCHITECTURES.
Even though BUILDING.md and CONTRIBUTING.md explicitly state that the
libjpeg-turbo build system does not and will not support being
integrated into downstream build systems using add_subdirectory() (see
7e2246313e), people continue to file bug
reports, feature requests, and pull requests regarding that (see #265,
#637, and #653 in addition to the issues listed in
7e2246313e325ce52299663e374812acb0ee6cc8.) Responding to those
issues wastes our project's limited resources. Hopefully people will
get the hint if the build system explicitly tells them that it can't be
included using add_subdirectory(), which will prompt them to read the
comments in CMakeLists.txt explaining why. To anyone stumbling upon
this commit message, please refer to the discussions under the issues
listed above, as well as the issues listed in
7e2246313e. Our project's position on
this has been stated, explained, and defended numerous times.
Don't keep trying to decompress the same image if tj3Decompress*() has
already thrown an error. Otherwise, if the image has an excessive
number of scans, then each iteration of the loop will try to decompress
up to the scan limit, which may cause the overall test to time out even
if one iteration doesn't time out.
As long as a libjpeg instance is only used by one thread at a time, a
program is technically within its rights to call jpeg_start_*compress()
in one thread and jpeg_(read|write)_*(), with the same libjpeg instance,
in a second thread. However, because the various jsimd_can*() functions
are called within the body of jpeg_start_*compress() and simd_support is
now thread-local (due to 1e9209e3c5), that
led to a situation in which simd_support was initialized in the first
thread but not the second. The uninitialized value of simd_support is
0xFFFFFFFF, which the second thread interpreted to mean that it could
use any instruction set, and when it attempted to use AVX2 instructions
on a CPU that didn't support them, an illegal instruction error
occurred.
This issue was known to affect libvips.
This commit modifies the i386 and x86-64 SIMD dispatchers so that the
various jsimd_*() functions always call init_simd(), if simd_support is
uninitialized, prior to dispatching based on the value of simd_support.
Note that the other SIMD dispatchers don't need this, because only the
x86 SIMD extensions currently support multiple instruction sets.
This patch has been verified to be performance-neutral to within
+/- 0.4% with 32-bit and 64-bit code running on a 2.8 GHz Intel Xeon
W3530 and a 3.6 GHz Intel Xeon W2123.
Fixes#649
The default install prefix when building under MinGW is chosen based on
the needs of the official build system, which uses MSYS2 to generate
Windows installer packages that install under c:\libjpeg-turbo-gcc[64].
However, attempting to configure the build with that install prefix on
a Un*x machine causes a CMake error.
Fixes#641
Because Java array sizes are ints, the various size methods in the TJ
class have int return values. Thus, we have to guard against signed
int overflow at the JNI level, because the C functions can return sizes
greater than INT_MAX.
This also adds a test for TJ.planeWidth() and TJ.planeHeight(), in order
to validate 2d5a0a35bb in Java.
tjPlaneWidth() and tjPlaneHeight() could overflow a signed int and
return a negative value if passed a width/height argument of INT_MAX and
a subsampling type for which the MCU block size is larger than 8x8.
The documented behavior of the -progressive option is to use progressive
entropy coding in JPEG images generated by compression and transform
operations. However, setting TJFLAG_PROGRESSIVE was insufficient to
accomplish that, because TJBench doesn't enable lossless transformation
if xformOpt == 0.