setxbg

X11 background image
git clone https://git.porkepik.fr/setxbg
Log | Files | Refs | README | LICENSE

stb_image.h (267319B)


      1 /* stb_image - v2.25 - public domain image loader - http://nothings.org/stb
      2                                   no warranty implied; use at your own risk
      3 
      4    Do this:
      5       #define STB_IMAGE_IMPLEMENTATION
      6    before you include this file in *one* C or C++ file to create the implementation.
      7 
      8    // i.e. it should look like this:
      9    #include ...
     10    #include ...
     11    #include ...
     12    #define STB_IMAGE_IMPLEMENTATION
     13    #include "stb_image.h"
     14 
     15    You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
     16    And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
     17 
     18 
     19    QUICK NOTES:
     20       Primarily of interest to game developers and other people who can
     21           avoid problematic images and only need the trivial interface
     22 
     23       JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
     24       PNG 1/2/4/8/16-bit-per-channel
     25 
     26       TGA (not sure what subset, if a subset)
     27       BMP non-1bpp, non-RLE
     28       PSD (composited view only, no extra channels, 8/16 bit-per-channel)
     29 
     30       GIF (*comp always reports as 4-channel)
     31       HDR (radiance rgbE format)
     32       PIC (Softimage PIC)
     33       PNM (PPM and PGM binary only)
     34 
     35       Animated GIF still needs a proper API, but here's one way to do it:
     36           http://gist.github.com/urraka/685d9a6340b26b830d49
     37 
     38       - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
     39       - decode from arbitrary I/O callbacks
     40       - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
     41 
     42    Full documentation under "DOCUMENTATION" below.
     43 
     44 
     45 LICENSE
     46 
     47   See end of file for license information.
     48 
     49 RECENT REVISION HISTORY:
     50 
     51       2.25  (2020-02-02) fix warnings
     52       2.24  (2020-02-02) fix warnings; thread-local failure_reason and flip_vertically
     53       2.23  (2019-08-11) fix clang static analysis warning
     54       2.22  (2019-03-04) gif fixes, fix warnings
     55       2.21  (2019-02-25) fix typo in comment
     56       2.20  (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
     57       2.19  (2018-02-11) fix warning
     58       2.18  (2018-01-30) fix warnings
     59       2.17  (2018-01-29) bugfix, 1-bit BMP, 16-bitness query, fix warnings
     60       2.16  (2017-07-23) all functions have 16-bit variants; optimizations; bugfixes
     61       2.15  (2017-03-18) fix png-1,2,4; all Imagenet JPGs; no runtime SSE detection on GCC
     62       2.14  (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
     63       2.13  (2016-12-04) experimental 16-bit API, only for PNG so far; fixes
     64       2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
     65       2.11  (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64
     66                          RGB-format JPEG; remove white matting in PSD;
     67                          allocate large structures on the stack;
     68                          correct channel count for PNG & BMP
     69       2.10  (2016-01-22) avoid warning introduced in 2.09
     70       2.09  (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
     71 
     72    See end of file for full revision history.
     73 
     74 
     75  ============================    Contributors    =========================
     76 
     77  Image formats                          Extensions, features
     78     Sean Barrett (jpeg, png, bmp)          Jetro Lauha (stbi_info)
     79     Nicolas Schulz (hdr, psd)              Martin "SpartanJ" Golini (stbi_info)
     80     Jonathan Dummer (tga)                  James "moose2000" Brown (iPhone PNG)
     81     Jean-Marc Lienher (gif)                Ben "Disch" Wenger (io callbacks)
     82     Tom Seddon (pic)                       Omar Cornut (1/2/4-bit PNG)
     83     Thatcher Ulrich (psd)                  Nicolas Guillemot (vertical flip)
     84     Ken Miller (pgm, ppm)                  Richard Mitton (16-bit PSD)
     85     github:urraka (animated gif)           Junggon Kim (PNM comments)
     86     Christopher Forseth (animated gif)     Daniel Gibson (16-bit TGA)
     87                                            socks-the-fox (16-bit PNG)
     88                                            Jeremy Sawicki (handle all ImageNet JPGs)
     89  Optimizations & bugfixes                  Mikhail Morozov (1-bit BMP)
     90     Fabian "ryg" Giesen                    Anael Seghezzi (is-16-bit query)
     91     Arseny Kapoulkine
     92     John-Mark Allen
     93     Carmelo J Fdez-Aguera
     94 
     95  Bug & warning fixes
     96     Marc LeBlanc            David Woo          Guillaume George   Martins Mozeiko
     97     Christpher Lloyd        Jerry Jansson      Joseph Thomson     Phil Jordan
     98     Dave Moore              Roy Eltham         Hayaki Saito       Nathan Reed
     99     Won Chun                Luke Graham        Johan Duparc       Nick Verigakis
    100     the Horde3D community   Thomas Ruf         Ronny Chevalier    github:rlyeh
    101     Janez Zemva             John Bartholomew   Michal Cichon      github:romigrou
    102     Jonathan Blow           Ken Hamada         Tero Hanninen      github:svdijk
    103     Laurent Gomila          Cort Stratton      Sergio Gonzalez    github:snagar
    104     Aruelien Pocheville     Thibault Reuille   Cass Everitt       github:Zelex
    105     Ryamond Barbiero        Paul Du Bois       Engin Manap        github:grim210
    106     Aldo Culquicondor       Philipp Wiesemann  Dale Weiler        github:sammyhw
    107     Oriol Ferrer Mesia      Josh Tobin         Matthew Gregan     github:phprus
    108     Julian Raschke          Gregory Mullen     Baldur Karlsson    github:poppolopoppo
    109     Christian Floisand      Kevin Schmidt      JR Smith           github:darealshinji
    110     Brad Weinberger         Matvey Cherevko                       github:Michaelangel007
    111     Blazej Dariusz Roszkowski                  Alexander Veselov
    112 */
    113 
    114 #ifndef STBI_INCLUDE_STB_IMAGE_H
    115 #define STBI_INCLUDE_STB_IMAGE_H
    116 
    117 // DOCUMENTATION
    118 //
    119 // Limitations:
    120 //    - no 12-bit-per-channel JPEG
    121 //    - no JPEGs with arithmetic coding
    122 //    - GIF always returns *comp=4
    123 //
    124 // Basic usage (see HDR discussion below for HDR usage):
    125 //    int x,y,n;
    126 //    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
    127 //    // ... process data if not NULL ...
    128 //    // ... x = width, y = height, n = # 8-bit components per pixel ...
    129 //    // ... replace '0' with '1'..'4' to force that many components per pixel
    130 //    // ... but 'n' will always be the number that it would have been if you said 0
    131 //    stbi_image_free(data)
    132 //
    133 // Standard parameters:
    134 //    int *x                 -- outputs image width in pixels
    135 //    int *y                 -- outputs image height in pixels
    136 //    int *channels_in_file  -- outputs # of image components in image file
    137 //    int desired_channels   -- if non-zero, # of image components requested in result
    138 //
    139 // The return value from an image loader is an 'unsigned char *' which points
    140 // to the pixel data, or NULL on an allocation failure or if the image is
    141 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
    142 // with each pixel consisting of N interleaved 8-bit components; the first
    143 // pixel pointed to is top-left-most in the image. There is no padding between
    144 // image scanlines or between pixels, regardless of format. The number of
    145 // components N is 'desired_channels' if desired_channels is non-zero, or
    146 // *channels_in_file otherwise. If desired_channels is non-zero,
    147 // *channels_in_file has the number of components that _would_ have been
    148 // output otherwise. E.g. if you set desired_channels to 4, you will always
    149 // get RGBA output, but you can check *channels_in_file to see if it's trivially
    150 // opaque because e.g. there were only 3 channels in the source image.
    151 //
    152 // An output image with N components has the following components interleaved
    153 // in this order in each pixel:
    154 //
    155 //     N=#comp     components
    156 //       1           grey
    157 //       2           grey, alpha
    158 //       3           red, green, blue
    159 //       4           red, green, blue, alpha
    160 //
    161 // If image loading fails for any reason, the return value will be NULL,
    162 // and *x, *y, *channels_in_file will be unchanged. The function
    163 // stbi_failure_reason() can be queried for an extremely brief, end-user
    164 // unfriendly explanation of why the load failed. Define STBI_NO_FAILURE_STRINGS
    165 // to avoid compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
    166 // more user-friendly ones.
    167 //
    168 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
    169 //
    170 // ===========================================================================
    171 //
    172 // UNICODE:
    173 //
    174 //   If compiling for Windows and you wish to use Unicode filenames, compile
    175 //   with
    176 //       #define STBI_WINDOWS_UTF8
    177 //   and pass utf8-encoded filenames. Call stbi_convert_wchar_to_utf8 to convert
    178 //   Windows wchar_t filenames to utf8.
    179 //
    180 // ===========================================================================
    181 //
    182 // Philosophy
    183 //
    184 // stb libraries are designed with the following priorities:
    185 //
    186 //    1. easy to use
    187 //    2. easy to maintain
    188 //    3. good performance
    189 //
    190 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
    191 // and for best performance I may provide less-easy-to-use APIs that give higher
    192 // performance, in addition to the easy-to-use ones. Nevertheless, it's important
    193 // to keep in mind that from the standpoint of you, a client of this library,
    194 // all you care about is #1 and #3, and stb libraries DO NOT emphasize #3 above all.
    195 //
    196 // Some secondary priorities arise directly from the first two, some of which
    197 // provide more explicit reasons why performance can't be emphasized.
    198 //
    199 //    - Portable ("ease of use")
    200 //    - Small source code footprint ("easy to maintain")
    201 //    - No dependencies ("ease of use")
    202 //
    203 // ===========================================================================
    204 //
    205 // I/O callbacks
    206 //
    207 // I/O callbacks allow you to read from arbitrary sources, like packaged
    208 // files or some other source. Data read from callbacks are processed
    209 // through a small internal buffer (currently 128 bytes) to try to reduce
    210 // overhead.
    211 //
    212 // The three functions you must define are "read" (reads some bytes of data),
    213 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
    214 //
    215 // ===========================================================================
    216 //
    217 // SIMD support
    218 //
    219 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
    220 // supported by the compiler. For ARM Neon support, you must explicitly
    221 // request it.
    222 //
    223 // (The old do-it-yourself SIMD API is no longer supported in the current
    224 // code.)
    225 //
    226 // On x86, SSE2 will automatically be used when available based on a run-time
    227 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
    228 // the typical path is to have separate builds for NEON and non-NEON devices
    229 // (at least this is true for iOS and Android). Therefore, the NEON support is
    230 // toggled by a build flag: define STBI_NEON to get NEON loops.
    231 //
    232 // If for some reason you do not want to use any of SIMD code, or if
    233 // you have issues compiling it, you can disable it entirely by
    234 // defining STBI_NO_SIMD.
    235 //
    236 // ===========================================================================
    237 //
    238 // HDR image support   (disable by defining STBI_NO_HDR)
    239 //
    240 // stb_image supports loading HDR images in general, and currently the Radiance
    241 // .HDR file format specifically. You can still load any file through the existing
    242 // interface; if you attempt to load an HDR file, it will be automatically remapped
    243 // to LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
    244 // both of these constants can be reconfigured through this interface:
    245 //
    246 //     stbi_hdr_to_ldr_gamma(2.2f);
    247 //     stbi_hdr_to_ldr_scale(1.0f);
    248 //
    249 // (note, do not use _inverse_ constants; stbi_image will invert them
    250 // appropriately).
    251 //
    252 // Additionally, there is a new, parallel interface for loading files as
    253 // (linear) floats to preserve the full dynamic range:
    254 //
    255 //    float *data = stbi_loadf(filename, &x, &y, &n, 0);
    256 //
    257 // If you load LDR images through this interface, those images will
    258 // be promoted to floating point values, run through the inverse of
    259 // constants corresponding to the above:
    260 //
    261 //     stbi_ldr_to_hdr_scale(1.0f);
    262 //     stbi_ldr_to_hdr_gamma(2.2f);
    263 //
    264 // Finally, given a filename (or an open file or memory block--see header
    265 // file for details) containing image data, you can query for the "most
    266 // appropriate" interface to use (that is, whether the image is HDR or
    267 // not), using:
    268 //
    269 //     stbi_is_hdr(char *filename);
    270 //
    271 // ===========================================================================
    272 //
    273 // iPhone PNG support:
    274 //
    275 // By default we convert iphone-formatted PNGs back to RGB, even though
    276 // they are internally encoded differently. You can disable this conversion
    277 // by calling stbi_convert_iphone_png_to_rgb(0), in which case
    278 // you will always just get the native iphone "format" through (which
    279 // is BGR stored in RGB).
    280 //
    281 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
    282 // pixel to remove any premultiplied alpha *only* if the image file explicitly
    283 // says there's premultiplied data (currently only happens in iPhone images,
    284 // and only if iPhone convert-to-rgb processing is on).
    285 //
    286 // ===========================================================================
    287 //
    288 // ADDITIONAL CONFIGURATION
    289 //
    290 //  - You can suppress implementation of any of the decoders to reduce
    291 //    your code footprint by #defining one or more of the following
    292 //    symbols before creating the implementation.
    293 //
    294 //        STBI_NO_JPEG
    295 //        STBI_NO_PNG
    296 //        STBI_NO_BMP
    297 //        STBI_NO_PSD
    298 //        STBI_NO_TGA
    299 //        STBI_NO_GIF
    300 //        STBI_NO_HDR
    301 //        STBI_NO_PIC
    302 //        STBI_NO_PNM   (.ppm and .pgm)
    303 //
    304 //  - You can request *only* certain decoders and suppress all other ones
    305 //    (this will be more forward-compatible, as addition of new decoders
    306 //    doesn't require you to disable them explicitly):
    307 //
    308 //        STBI_ONLY_JPEG
    309 //        STBI_ONLY_PNG
    310 //        STBI_ONLY_BMP
    311 //        STBI_ONLY_PSD
    312 //        STBI_ONLY_TGA
    313 //        STBI_ONLY_GIF
    314 //        STBI_ONLY_HDR
    315 //        STBI_ONLY_PIC
    316 //        STBI_ONLY_PNM   (.ppm and .pgm)
    317 //
    318 //   - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
    319 //     want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
    320 //
    321 
    322 
    323 #ifndef STBI_NO_STDIO
    324 #include <stdio.h>
    325 #endif // STBI_NO_STDIO
    326 
    327 #define STBI_VERSION 1
    328 
    329 enum
    330 {
    331    STBI_default = 0, // only used for desired_channels
    332 
    333    STBI_grey       = 1,
    334    STBI_grey_alpha = 2,
    335    STBI_rgb        = 3,
    336    STBI_rgb_alpha  = 4
    337 };
    338 
    339 #include <stdlib.h>
    340 typedef unsigned char stbi_uc;
    341 typedef unsigned short stbi_us;
    342 
    343 #ifdef __cplusplus
    344 extern "C" {
    345 #endif
    346 
    347 #ifndef STBIDEF
    348 #ifdef STB_IMAGE_STATIC
    349 #define STBIDEF static
    350 #else
    351 #define STBIDEF extern
    352 #endif
    353 #endif
    354 
    355 //////////////////////////////////////////////////////////////////////////////
    356 //
    357 // PRIMARY API - works on images of any type
    358 //
    359 
    360 //
    361 // load image by filename, open file, or memory buffer
    362 //
    363 
    364 typedef struct
    365 {
    366    int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read
    367    void     (*skip)  (void *user,int n);                 // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
    368    int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
    369 } stbi_io_callbacks;
    370 
    371 ////////////////////////////////////
    372 //
    373 // 8-bits-per-channel interface
    374 //
    375 
    376 STBIDEF stbi_uc *stbi_load_from_memory   (stbi_uc           const *buffer, int len   , int *x, int *y, int *channels_in_file, int desired_channels);
    377 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk  , void *user, int *x, int *y, int *channels_in_file, int desired_channels);
    378 
    379 #ifndef STBI_NO_STDIO
    380 STBIDEF stbi_uc *stbi_load            (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
    381 STBIDEF stbi_uc *stbi_load_from_file  (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
    382 // for stbi_load_from_file, file pointer is left pointing immediately after image
    383 #endif
    384 
    385 #ifndef STBI_NO_GIF
    386 STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
    387 #endif
    388 
    389 #ifdef STBI_WINDOWS_UTF8
    390 STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input);
    391 #endif
    392 
    393 ////////////////////////////////////
    394 //
    395 // 16-bits-per-channel interface
    396 //
    397 
    398 STBIDEF stbi_us *stbi_load_16_from_memory   (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
    399 STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels);
    400 
    401 #ifndef STBI_NO_STDIO
    402 STBIDEF stbi_us *stbi_load_16          (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
    403 STBIDEF stbi_us *stbi_load_from_file_16(FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
    404 #endif
    405 
    406 ////////////////////////////////////
    407 //
    408 // float-per-channel interface
    409 //
    410 #ifndef STBI_NO_LINEAR
    411    STBIDEF float *stbi_loadf_from_memory     (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
    412    STBIDEF float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y,  int *channels_in_file, int desired_channels);
    413 
    414    #ifndef STBI_NO_STDIO
    415    STBIDEF float *stbi_loadf            (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
    416    STBIDEF float *stbi_loadf_from_file  (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
    417    #endif
    418 #endif
    419 
    420 #ifndef STBI_NO_HDR
    421    STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma);
    422    STBIDEF void   stbi_hdr_to_ldr_scale(float scale);
    423 #endif // STBI_NO_HDR
    424 
    425 #ifndef STBI_NO_LINEAR
    426    STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma);
    427    STBIDEF void   stbi_ldr_to_hdr_scale(float scale);
    428 #endif // STBI_NO_LINEAR
    429 
    430 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
    431 STBIDEF int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
    432 STBIDEF int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
    433 #ifndef STBI_NO_STDIO
    434 STBIDEF int      stbi_is_hdr          (char const *filename);
    435 STBIDEF int      stbi_is_hdr_from_file(FILE *f);
    436 #endif // STBI_NO_STDIO
    437 
    438 
    439 // get a VERY brief reason for failure
    440 // on most compilers (and ALL modern mainstream compilers) this is threadsafe
    441 STBIDEF const char *stbi_failure_reason  (void);
    442 
    443 // free the loaded image -- this is just free()
    444 STBIDEF void     stbi_image_free      (void *retval_from_stbi_load);
    445 
    446 // get image dimensions & components without fully decoding
    447 STBIDEF int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    448 STBIDEF int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
    449 STBIDEF int      stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len);
    450 STBIDEF int      stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *clbk, void *user);
    451 
    452 #ifndef STBI_NO_STDIO
    453 STBIDEF int      stbi_info               (char const *filename,     int *x, int *y, int *comp);
    454 STBIDEF int      stbi_info_from_file     (FILE *f,                  int *x, int *y, int *comp);
    455 STBIDEF int      stbi_is_16_bit          (char const *filename);
    456 STBIDEF int      stbi_is_16_bit_from_file(FILE *f);
    457 #endif
    458 
    459 
    460 
    461 // for image formats that explicitly notate that they have premultiplied alpha,
    462 // we just return the colors as stored in the file. set this flag to force
    463 // unpremultiplication. results are undefined if the unpremultiply overflow.
    464 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
    465 
    466 // indicate whether we should process iphone images back to canonical format,
    467 // or just pass them through "as-is"
    468 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
    469 
    470 // flip the image vertically, so the first pixel in the output array is the bottom left
    471 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
    472 
    473 // as above, but only applies to images loaded on the thread that calls the function
    474 // this function is only available if your compiler supports thread-local variables;
    475 // calling it will fail to link if your compiler doesn't
    476 STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip);
    477 
    478 // ZLIB client - used by PNG, available for other purposes
    479 
    480 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
    481 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
    482 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
    483 STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
    484 
    485 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
    486 STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
    487 
    488 
    489 #ifdef __cplusplus
    490 }
    491 #endif
    492 
    493 //
    494 //
    495 ////   end header file   /////////////////////////////////////////////////////
    496 #endif // STBI_INCLUDE_STB_IMAGE_H
    497 
    498 #ifdef STB_IMAGE_IMPLEMENTATION
    499 
    500 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
    501   || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
    502   || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
    503   || defined(STBI_ONLY_ZLIB)
    504    #ifndef STBI_ONLY_JPEG
    505    #define STBI_NO_JPEG
    506    #endif
    507    #ifndef STBI_ONLY_PNG
    508    #define STBI_NO_PNG
    509    #endif
    510    #ifndef STBI_ONLY_BMP
    511    #define STBI_NO_BMP
    512    #endif
    513    #ifndef STBI_ONLY_PSD
    514    #define STBI_NO_PSD
    515    #endif
    516    #ifndef STBI_ONLY_TGA
    517    #define STBI_NO_TGA
    518    #endif
    519    #ifndef STBI_ONLY_GIF
    520    #define STBI_NO_GIF
    521    #endif
    522    #ifndef STBI_ONLY_HDR
    523    #define STBI_NO_HDR
    524    #endif
    525    #ifndef STBI_ONLY_PIC
    526    #define STBI_NO_PIC
    527    #endif
    528    #ifndef STBI_ONLY_PNM
    529    #define STBI_NO_PNM
    530    #endif
    531 #endif
    532 
    533 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
    534 #define STBI_NO_ZLIB
    535 #endif
    536 
    537 
    538 #include <stdarg.h>
    539 #include <stddef.h> // ptrdiff_t on osx
    540 #include <stdlib.h>
    541 #include <string.h>
    542 #include <limits.h>
    543 
    544 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
    545 #include <math.h>  // ldexp, pow
    546 #endif
    547 
    548 #ifndef STBI_NO_STDIO
    549 #include <stdio.h>
    550 #endif
    551 
    552 #ifndef STBI_ASSERT
    553 #include <assert.h>
    554 #define STBI_ASSERT(x) assert(x)
    555 #endif
    556 
    557 #ifdef __cplusplus
    558 #define STBI_EXTERN extern "C"
    559 #else
    560 #define STBI_EXTERN extern
    561 #endif
    562 
    563 
    564 #ifndef _MSC_VER
    565    #ifdef __cplusplus
    566    #define stbi_inline inline
    567    #else
    568    #define stbi_inline
    569    #endif
    570 #else
    571    #define stbi_inline __forceinline
    572 #endif
    573 
    574 #ifndef STBI_NO_THREAD_LOCALS
    575    #if defined(__cplusplus) &&  __cplusplus >= 201103L
    576       #define STBI_THREAD_LOCAL       thread_local
    577    #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
    578       #define STBI_THREAD_LOCAL       _Thread_local
    579    #elif defined(__GNUC__)
    580       #define STBI_THREAD_LOCAL       __thread
    581    #elif defined(_MSC_VER)
    582       #define STBI_THREAD_LOCAL       __declspec(thread)
    583 #endif
    584 #endif
    585 
    586 #ifdef _MSC_VER
    587 typedef unsigned short stbi__uint16;
    588 typedef   signed short stbi__int16;
    589 typedef unsigned int   stbi__uint32;
    590 typedef   signed int   stbi__int32;
    591 #else
    592 #include <stdint.h>
    593 typedef uint16_t stbi__uint16;
    594 typedef int16_t  stbi__int16;
    595 typedef uint32_t stbi__uint32;
    596 typedef int32_t  stbi__int32;
    597 #endif
    598 
    599 // should produce compiler error if size is wrong
    600 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
    601 
    602 #ifdef _MSC_VER
    603 #define STBI_NOTUSED(v)  (void)(v)
    604 #else
    605 #define STBI_NOTUSED(v)  (void)sizeof(v)
    606 #endif
    607 
    608 #ifdef _MSC_VER
    609 #define STBI_HAS_LROTL
    610 #endif
    611 
    612 #ifdef STBI_HAS_LROTL
    613    #define stbi_lrot(x,y)  _lrotl(x,y)
    614 #else
    615    #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (32 - (y))))
    616 #endif
    617 
    618 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
    619 // ok
    620 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
    621 // ok
    622 #else
    623 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
    624 #endif
    625 
    626 #ifndef STBI_MALLOC
    627 #define STBI_MALLOC(sz)           malloc(sz)
    628 #define STBI_REALLOC(p,newsz)     realloc(p,newsz)
    629 #define STBI_FREE(p)              free(p)
    630 #endif
    631 
    632 #ifndef STBI_REALLOC_SIZED
    633 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
    634 #endif
    635 
    636 // x86/x64 detection
    637 #if defined(__x86_64__) || defined(_M_X64)
    638 #define STBI__X64_TARGET
    639 #elif defined(__i386) || defined(_M_IX86)
    640 #define STBI__X86_TARGET
    641 #endif
    642 
    643 #if defined(__GNUC__) && defined(STBI__X86_TARGET) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
    644 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
    645 // which in turn means it gets to use SSE2 everywhere. This is unfortunate,
    646 // but previous attempts to provide the SSE2 functions with runtime
    647 // detection caused numerous issues. The way architecture extensions are
    648 // exposed in GCC/Clang is, sadly, not really suited for one-file libs.
    649 // New behavior: if compiled with -msse2, we use SSE2 without any
    650 // detection; if not, we don't use it at all.
    651 #define STBI_NO_SIMD
    652 #endif
    653 
    654 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
    655 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
    656 //
    657 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
    658 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
    659 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
    660 // simultaneously enabling "-mstackrealign".
    661 //
    662 // See https://github.com/nothings/stb/issues/81 for more information.
    663 //
    664 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
    665 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
    666 #define STBI_NO_SIMD
    667 #endif
    668 
    669 #if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
    670 #define STBI_SSE2
    671 #include <emmintrin.h>
    672 
    673 #ifdef _MSC_VER
    674 
    675 #if _MSC_VER >= 1400  // not VC6
    676 #include <intrin.h> // __cpuid
    677 static int stbi__cpuid3(void)
    678 {
    679    int info[4];
    680    __cpuid(info,1);
    681    return info[3];
    682 }
    683 #else
    684 static int stbi__cpuid3(void)
    685 {
    686    int res;
    687    __asm {
    688       mov  eax,1
    689       cpuid
    690       mov  res,edx
    691    }
    692    return res;
    693 }
    694 #endif
    695 
    696 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
    697 
    698 #if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
    699 static int stbi__sse2_available(void)
    700 {
    701    int info3 = stbi__cpuid3();
    702    return ((info3 >> 26) & 1) != 0;
    703 }
    704 #endif
    705 
    706 #else // assume GCC-style if not VC++
    707 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
    708 
    709 #if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
    710 static int stbi__sse2_available(void)
    711 {
    712    // If we're even attempting to compile this on GCC/Clang, that means
    713    // -msse2 is on, which means the compiler is allowed to use SSE2
    714    // instructions at will, and so are we.
    715    return 1;
    716 }
    717 #endif
    718 
    719 #endif
    720 #endif
    721 
    722 // ARM NEON
    723 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
    724 #undef STBI_NEON
    725 #endif
    726 
    727 #ifdef STBI_NEON
    728 #include <arm_neon.h>
    729 // assume GCC or Clang on ARM targets
    730 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
    731 #endif
    732 
    733 #ifndef STBI_SIMD_ALIGN
    734 #define STBI_SIMD_ALIGN(type, name) type name
    735 #endif
    736 
    737 ///////////////////////////////////////////////
    738 //
    739 //  stbi__context struct and start_xxx functions
    740 
    741 // stbi__context structure is our basic context used by all images, so it
    742 // contains all the IO context, plus some basic image information
    743 typedef struct
    744 {
    745    stbi__uint32 img_x, img_y;
    746    int img_n, img_out_n;
    747 
    748    stbi_io_callbacks io;
    749    void *io_user_data;
    750 
    751    int read_from_callbacks;
    752    int buflen;
    753    stbi_uc buffer_start[128];
    754 
    755    stbi_uc *img_buffer, *img_buffer_end;
    756    stbi_uc *img_buffer_original, *img_buffer_original_end;
    757 } stbi__context;
    758 
    759 
    760 static void stbi__refill_buffer(stbi__context *s);
    761 
    762 // initialize a memory-decode context
    763 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
    764 {
    765    s->io.read = NULL;
    766    s->read_from_callbacks = 0;
    767    s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
    768    s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
    769 }
    770 
    771 // initialize a callback-based context
    772 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
    773 {
    774    s->io = *c;
    775    s->io_user_data = user;
    776    s->buflen = sizeof(s->buffer_start);
    777    s->read_from_callbacks = 1;
    778    s->img_buffer_original = s->buffer_start;
    779    stbi__refill_buffer(s);
    780    s->img_buffer_original_end = s->img_buffer_end;
    781 }
    782 
    783 #ifndef STBI_NO_STDIO
    784 
    785 static int stbi__stdio_read(void *user, char *data, int size)
    786 {
    787    return (int) fread(data,1,size,(FILE*) user);
    788 }
    789 
    790 static void stbi__stdio_skip(void *user, int n)
    791 {
    792    fseek((FILE*) user, n, SEEK_CUR);
    793 }
    794 
    795 static int stbi__stdio_eof(void *user)
    796 {
    797    return feof((FILE*) user);
    798 }
    799 
    800 static stbi_io_callbacks stbi__stdio_callbacks =
    801 {
    802    stbi__stdio_read,
    803    stbi__stdio_skip,
    804    stbi__stdio_eof,
    805 };
    806 
    807 static void stbi__start_file(stbi__context *s, FILE *f)
    808 {
    809    stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
    810 }
    811 
    812 //static void stop_file(stbi__context *s) { }
    813 
    814 #endif // !STBI_NO_STDIO
    815 
    816 static void stbi__rewind(stbi__context *s)
    817 {
    818    // conceptually rewind SHOULD rewind to the beginning of the stream,
    819    // but we just rewind to the beginning of the initial buffer, because
    820    // we only use it after doing 'test', which only ever looks at at most 92 bytes
    821    s->img_buffer = s->img_buffer_original;
    822    s->img_buffer_end = s->img_buffer_original_end;
    823 }
    824 
    825 enum
    826 {
    827    STBI_ORDER_RGB,
    828    STBI_ORDER_BGR
    829 };
    830 
    831 typedef struct
    832 {
    833    int bits_per_channel;
    834    int num_channels;
    835    int channel_order;
    836 } stbi__result_info;
    837 
    838 #ifndef STBI_NO_JPEG
    839 static int      stbi__jpeg_test(stbi__context *s);
    840 static void    *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    841 static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
    842 #endif
    843 
    844 #ifndef STBI_NO_PNG
    845 static int      stbi__png_test(stbi__context *s);
    846 static void    *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    847 static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
    848 static int      stbi__png_is16(stbi__context *s);
    849 #endif
    850 
    851 #ifndef STBI_NO_BMP
    852 static int      stbi__bmp_test(stbi__context *s);
    853 static void    *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    854 static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
    855 #endif
    856 
    857 #ifndef STBI_NO_TGA
    858 static int      stbi__tga_test(stbi__context *s);
    859 static void    *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    860 static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
    861 #endif
    862 
    863 #ifndef STBI_NO_PSD
    864 static int      stbi__psd_test(stbi__context *s);
    865 static void    *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc);
    866 static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
    867 static int      stbi__psd_is16(stbi__context *s);
    868 #endif
    869 
    870 #ifndef STBI_NO_HDR
    871 static int      stbi__hdr_test(stbi__context *s);
    872 static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    873 static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
    874 #endif
    875 
    876 #ifndef STBI_NO_PIC
    877 static int      stbi__pic_test(stbi__context *s);
    878 static void    *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    879 static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
    880 #endif
    881 
    882 #ifndef STBI_NO_GIF
    883 static int      stbi__gif_test(stbi__context *s);
    884 static void    *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    885 static void    *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
    886 static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
    887 #endif
    888 
    889 #ifndef STBI_NO_PNM
    890 static int      stbi__pnm_test(stbi__context *s);
    891 static void    *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    892 static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
    893 #endif
    894 
    895 static
    896 #ifdef STBI_THREAD_LOCAL
    897 STBI_THREAD_LOCAL
    898 #endif
    899 const char *stbi__g_failure_reason;
    900 
    901 STBIDEF const char *stbi_failure_reason(void)
    902 {
    903    return stbi__g_failure_reason;
    904 }
    905 
    906 #ifndef STBI_NO_FAILURE_STRINGS
    907 static int stbi__err(const char *str)
    908 {
    909    stbi__g_failure_reason = str;
    910    return 0;
    911 }
    912 #endif
    913 
    914 static void *stbi__malloc(size_t size)
    915 {
    916     return STBI_MALLOC(size);
    917 }
    918 
    919 // stb_image uses ints pervasively, including for offset calculations.
    920 // therefore the largest decoded image size we can support with the
    921 // current code, even on 64-bit targets, is INT_MAX. this is not a
    922 // significant limitation for the intended use case.
    923 //
    924 // we do, however, need to make sure our size calculations don't
    925 // overflow. hence a few helper functions for size calculations that
    926 // multiply integers together, making sure that they're non-negative
    927 // and no overflow occurs.
    928 
    929 // return 1 if the sum is valid, 0 on overflow.
    930 // negative terms are considered invalid.
    931 static int stbi__addsizes_valid(int a, int b)
    932 {
    933    if (b < 0) return 0;
    934    // now 0 <= b <= INT_MAX, hence also
    935    // 0 <= INT_MAX - b <= INTMAX.
    936    // And "a + b <= INT_MAX" (which might overflow) is the
    937    // same as a <= INT_MAX - b (no overflow)
    938    return a <= INT_MAX - b;
    939 }
    940 
    941 // returns 1 if the product is valid, 0 on overflow.
    942 // negative factors are considered invalid.
    943 static int stbi__mul2sizes_valid(int a, int b)
    944 {
    945    if (a < 0 || b < 0) return 0;
    946    if (b == 0) return 1; // mul-by-0 is always safe
    947    // portable way to check for no overflows in a*b
    948    return a <= INT_MAX/b;
    949 }
    950 
    951 #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
    952 // returns 1 if "a*b + add" has no negative terms/factors and doesn't overflow
    953 static int stbi__mad2sizes_valid(int a, int b, int add)
    954 {
    955    return stbi__mul2sizes_valid(a, b) && stbi__addsizes_valid(a*b, add);
    956 }
    957 #endif
    958 
    959 // returns 1 if "a*b*c + add" has no negative terms/factors and doesn't overflow
    960 static int stbi__mad3sizes_valid(int a, int b, int c, int add)
    961 {
    962    return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
    963       stbi__addsizes_valid(a*b*c, add);
    964 }
    965 
    966 // returns 1 if "a*b*c*d + add" has no negative terms/factors and doesn't overflow
    967 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
    968 static int stbi__mad4sizes_valid(int a, int b, int c, int d, int add)
    969 {
    970    return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
    971       stbi__mul2sizes_valid(a*b*c, d) && stbi__addsizes_valid(a*b*c*d, add);
    972 }
    973 #endif
    974 
    975 #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
    976 // mallocs with size overflow checking
    977 static void *stbi__malloc_mad2(int a, int b, int add)
    978 {
    979    if (!stbi__mad2sizes_valid(a, b, add)) return NULL;
    980    return stbi__malloc(a*b + add);
    981 }
    982 #endif
    983 
    984 static void *stbi__malloc_mad3(int a, int b, int c, int add)
    985 {
    986    if (!stbi__mad3sizes_valid(a, b, c, add)) return NULL;
    987    return stbi__malloc(a*b*c + add);
    988 }
    989 
    990 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
    991 static void *stbi__malloc_mad4(int a, int b, int c, int d, int add)
    992 {
    993    if (!stbi__mad4sizes_valid(a, b, c, d, add)) return NULL;
    994    return stbi__malloc(a*b*c*d + add);
    995 }
    996 #endif
    997 
    998 // stbi__err - error
    999 // stbi__errpf - error returning pointer to float
   1000 // stbi__errpuc - error returning pointer to unsigned char
   1001 
   1002 #ifdef STBI_NO_FAILURE_STRINGS
   1003    #define stbi__err(x,y)  0
   1004 #elif defined(STBI_FAILURE_USERMSG)
   1005    #define stbi__err(x,y)  stbi__err(y)
   1006 #else
   1007    #define stbi__err(x,y)  stbi__err(x)
   1008 #endif
   1009 
   1010 #define stbi__errpf(x,y)   ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
   1011 #define stbi__errpuc(x,y)  ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
   1012 
   1013 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
   1014 {
   1015    STBI_FREE(retval_from_stbi_load);
   1016 }
   1017 
   1018 #ifndef STBI_NO_LINEAR
   1019 static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
   1020 #endif
   1021 
   1022 #ifndef STBI_NO_HDR
   1023 static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
   1024 #endif
   1025 
   1026 static int stbi__vertically_flip_on_load_global = 0;
   1027 
   1028 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
   1029 {
   1030    stbi__vertically_flip_on_load_global = flag_true_if_should_flip;
   1031 }
   1032 
   1033 #ifndef STBI_THREAD_LOCAL
   1034 #define stbi__vertically_flip_on_load  stbi__vertically_flip_on_load_global
   1035 #else
   1036 static STBI_THREAD_LOCAL int stbi__vertically_flip_on_load_local, stbi__vertically_flip_on_load_set;
   1037 
   1038 STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip)
   1039 {
   1040    stbi__vertically_flip_on_load_local = flag_true_if_should_flip;
   1041    stbi__vertically_flip_on_load_set = 1;
   1042 }
   1043 
   1044 #define stbi__vertically_flip_on_load  (stbi__vertically_flip_on_load_set       \
   1045                                          ? stbi__vertically_flip_on_load_local  \
   1046                                          : stbi__vertically_flip_on_load_global)
   1047 #endif // STBI_THREAD_LOCAL
   1048 
   1049 static void *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
   1050 {
   1051    memset(ri, 0, sizeof(*ri)); // make sure it's initialized if we add new fields
   1052    ri->bits_per_channel = 8; // default is 8 so most paths don't have to be changed
   1053    ri->channel_order = STBI_ORDER_RGB; // all current input & output are this, but this is here so we can add BGR order
   1054    ri->num_channels = 0;
   1055 
   1056    #ifndef STBI_NO_JPEG
   1057    if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp, ri);
   1058    #endif
   1059    #ifndef STBI_NO_PNG
   1060    if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp, ri);
   1061    #endif
   1062    #ifndef STBI_NO_BMP
   1063    if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp, ri);
   1064    #endif
   1065    #ifndef STBI_NO_GIF
   1066    if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp, ri);
   1067    #endif
   1068    #ifndef STBI_NO_PSD
   1069    if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp, ri, bpc);
   1070    #else
   1071    STBI_NOTUSED(bpc);
   1072    #endif
   1073    #ifndef STBI_NO_PIC
   1074    if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp, ri);
   1075    #endif
   1076    #ifndef STBI_NO_PNM
   1077    if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp, ri);
   1078    #endif
   1079 
   1080    #ifndef STBI_NO_HDR
   1081    if (stbi__hdr_test(s)) {
   1082       float *hdr = stbi__hdr_load(s, x,y,comp,req_comp, ri);
   1083       return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
   1084    }
   1085    #endif
   1086 
   1087    #ifndef STBI_NO_TGA
   1088    // test tga last because it's a crappy test!
   1089    if (stbi__tga_test(s))
   1090       return stbi__tga_load(s,x,y,comp,req_comp, ri);
   1091    #endif
   1092 
   1093    return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
   1094 }
   1095 
   1096 static stbi_uc *stbi__convert_16_to_8(stbi__uint16 *orig, int w, int h, int channels)
   1097 {
   1098    int i;
   1099    int img_len = w * h * channels;
   1100    stbi_uc *reduced;
   1101 
   1102    reduced = (stbi_uc *) stbi__malloc(img_len);
   1103    if (reduced == NULL) return stbi__errpuc("outofmem", "Out of memory");
   1104 
   1105    for (i = 0; i < img_len; ++i)
   1106       reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top half of each byte is sufficient approx of 16->8 bit scaling
   1107 
   1108    STBI_FREE(orig);
   1109    return reduced;
   1110 }
   1111 
   1112 static stbi__uint16 *stbi__convert_8_to_16(stbi_uc *orig, int w, int h, int channels)
   1113 {
   1114    int i;
   1115    int img_len = w * h * channels;
   1116    stbi__uint16 *enlarged;
   1117 
   1118    enlarged = (stbi__uint16 *) stbi__malloc(img_len*2);
   1119    if (enlarged == NULL) return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
   1120 
   1121    for (i = 0; i < img_len; ++i)
   1122       enlarged[i] = (stbi__uint16)((orig[i] << 8) + orig[i]); // replicate to high and low byte, maps 0->0, 255->0xffff
   1123 
   1124    STBI_FREE(orig);
   1125    return enlarged;
   1126 }
   1127 
   1128 static void stbi__vertical_flip(void *image, int w, int h, int bytes_per_pixel)
   1129 {
   1130    int row;
   1131    size_t bytes_per_row = (size_t)w * bytes_per_pixel;
   1132    stbi_uc temp[2048];
   1133    stbi_uc *bytes = (stbi_uc *)image;
   1134 
   1135    for (row = 0; row < (h>>1); row++) {
   1136       stbi_uc *row0 = bytes + row*bytes_per_row;
   1137       stbi_uc *row1 = bytes + (h - row - 1)*bytes_per_row;
   1138       // swap row0 with row1
   1139       size_t bytes_left = bytes_per_row;
   1140       while (bytes_left) {
   1141          size_t bytes_copy = (bytes_left < sizeof(temp)) ? bytes_left : sizeof(temp);
   1142          memcpy(temp, row0, bytes_copy);
   1143          memcpy(row0, row1, bytes_copy);
   1144          memcpy(row1, temp, bytes_copy);
   1145          row0 += bytes_copy;
   1146          row1 += bytes_copy;
   1147          bytes_left -= bytes_copy;
   1148       }
   1149    }
   1150 }
   1151 
   1152 #ifndef STBI_NO_GIF
   1153 static void stbi__vertical_flip_slices(void *image, int w, int h, int z, int bytes_per_pixel)
   1154 {
   1155    int slice;
   1156    int slice_size = w * h * bytes_per_pixel;
   1157 
   1158    stbi_uc *bytes = (stbi_uc *)image;
   1159    for (slice = 0; slice < z; ++slice) {
   1160       stbi__vertical_flip(bytes, w, h, bytes_per_pixel);
   1161       bytes += slice_size;
   1162    }
   1163 }
   1164 #endif
   1165 
   1166 static unsigned char *stbi__load_and_postprocess_8bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   1167 {
   1168    stbi__result_info ri;
   1169    void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 8);
   1170 
   1171    if (result == NULL)
   1172       return NULL;
   1173 
   1174    if (ri.bits_per_channel != 8) {
   1175       STBI_ASSERT(ri.bits_per_channel == 16);
   1176       result = stbi__convert_16_to_8((stbi__uint16 *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
   1177       ri.bits_per_channel = 8;
   1178    }
   1179 
   1180    // @TODO: move stbi__convert_format to here
   1181 
   1182    if (stbi__vertically_flip_on_load) {
   1183       int channels = req_comp ? req_comp : *comp;
   1184       stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi_uc));
   1185    }
   1186 
   1187    return (unsigned char *) result;
   1188 }
   1189 
   1190 static stbi__uint16 *stbi__load_and_postprocess_16bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   1191 {
   1192    stbi__result_info ri;
   1193    void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 16);
   1194 
   1195    if (result == NULL)
   1196       return NULL;
   1197 
   1198    if (ri.bits_per_channel != 16) {
   1199       STBI_ASSERT(ri.bits_per_channel == 8);
   1200       result = stbi__convert_8_to_16((stbi_uc *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
   1201       ri.bits_per_channel = 16;
   1202    }
   1203 
   1204    // @TODO: move stbi__convert_format16 to here
   1205    // @TODO: special case RGB-to-Y (and RGBA-to-YA) for 8-bit-to-16-bit case to keep more precision
   1206 
   1207    if (stbi__vertically_flip_on_load) {
   1208       int channels = req_comp ? req_comp : *comp;
   1209       stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi__uint16));
   1210    }
   1211 
   1212    return (stbi__uint16 *) result;
   1213 }
   1214 
   1215 #if !defined(STBI_NO_HDR) && !defined(STBI_NO_LINEAR)
   1216 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
   1217 {
   1218    if (stbi__vertically_flip_on_load && result != NULL) {
   1219       int channels = req_comp ? req_comp : *comp;
   1220       stbi__vertical_flip(result, *x, *y, channels * sizeof(float));
   1221    }
   1222 }
   1223 #endif
   1224 
   1225 #ifndef STBI_NO_STDIO
   1226 
   1227 #if defined(_MSC_VER) && defined(STBI_WINDOWS_UTF8)
   1228 STBI_EXTERN __declspec(dllimport) int __stdcall MultiByteToWideChar(unsigned int cp, unsigned long flags, const char *str, int cbmb, wchar_t *widestr, int cchwide);
   1229 STBI_EXTERN __declspec(dllimport) int __stdcall WideCharToMultiByte(unsigned int cp, unsigned long flags, const wchar_t *widestr, int cchwide, char *str, int cbmb, const char *defchar, int *used_default);
   1230 #endif
   1231 
   1232 #if defined(_MSC_VER) && defined(STBI_WINDOWS_UTF8)
   1233 STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input)
   1234 {
   1235 	return WideCharToMultiByte(65001 /* UTF8 */, 0, input, -1, buffer, (int) bufferlen, NULL, NULL);
   1236 }
   1237 #endif
   1238 
   1239 static FILE *stbi__fopen(char const *filename, char const *mode)
   1240 {
   1241    FILE *f;
   1242 #if defined(_MSC_VER) && defined(STBI_WINDOWS_UTF8)
   1243    wchar_t wMode[64];
   1244    wchar_t wFilename[1024];
   1245 	if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, filename, -1, wFilename, sizeof(wFilename)))
   1246       return 0;
   1247 
   1248 	if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, mode, -1, wMode, sizeof(wMode)))
   1249       return 0;
   1250 
   1251 #if _MSC_VER >= 1400
   1252 	if (0 != _wfopen_s(&f, wFilename, wMode))
   1253 		f = 0;
   1254 #else
   1255    f = _wfopen(wFilename, wMode);
   1256 #endif
   1257 
   1258 #elif defined(_MSC_VER) && _MSC_VER >= 1400
   1259    if (0 != fopen_s(&f, filename, mode))
   1260       f=0;
   1261 #else
   1262    f = fopen(filename, mode);
   1263 #endif
   1264    return f;
   1265 }
   1266 
   1267 
   1268 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
   1269 {
   1270    FILE *f = stbi__fopen(filename, "rb");
   1271    unsigned char *result;
   1272    if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
   1273    result = stbi_load_from_file(f,x,y,comp,req_comp);
   1274    fclose(f);
   1275    return result;
   1276 }
   1277 
   1278 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   1279 {
   1280    unsigned char *result;
   1281    stbi__context s;
   1282    stbi__start_file(&s,f);
   1283    result = stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
   1284    if (result) {
   1285       // need to 'unget' all the characters in the IO buffer
   1286       fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   1287    }
   1288    return result;
   1289 }
   1290 
   1291 STBIDEF stbi__uint16 *stbi_load_from_file_16(FILE *f, int *x, int *y, int *comp, int req_comp)
   1292 {
   1293    stbi__uint16 *result;
   1294    stbi__context s;
   1295    stbi__start_file(&s,f);
   1296    result = stbi__load_and_postprocess_16bit(&s,x,y,comp,req_comp);
   1297    if (result) {
   1298       // need to 'unget' all the characters in the IO buffer
   1299       fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   1300    }
   1301    return result;
   1302 }
   1303 
   1304 STBIDEF stbi_us *stbi_load_16(char const *filename, int *x, int *y, int *comp, int req_comp)
   1305 {
   1306    FILE *f = stbi__fopen(filename, "rb");
   1307    stbi__uint16 *result;
   1308    if (!f) return (stbi_us *) stbi__errpuc("can't fopen", "Unable to open file");
   1309    result = stbi_load_from_file_16(f,x,y,comp,req_comp);
   1310    fclose(f);
   1311    return result;
   1312 }
   1313 
   1314 
   1315 #endif //!STBI_NO_STDIO
   1316 
   1317 STBIDEF stbi_us *stbi_load_16_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels)
   1318 {
   1319    stbi__context s;
   1320    stbi__start_mem(&s,buffer,len);
   1321    return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
   1322 }
   1323 
   1324 STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels)
   1325 {
   1326    stbi__context s;
   1327    stbi__start_callbacks(&s, (stbi_io_callbacks *)clbk, user);
   1328    return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
   1329 }
   1330 
   1331 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   1332 {
   1333    stbi__context s;
   1334    stbi__start_mem(&s,buffer,len);
   1335    return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
   1336 }
   1337 
   1338 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
   1339 {
   1340    stbi__context s;
   1341    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   1342    return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
   1343 }
   1344 
   1345 #ifndef STBI_NO_GIF
   1346 STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
   1347 {
   1348    unsigned char *result;
   1349    stbi__context s;
   1350    stbi__start_mem(&s,buffer,len);
   1351 
   1352    result = (unsigned char*) stbi__load_gif_main(&s, delays, x, y, z, comp, req_comp);
   1353    if (stbi__vertically_flip_on_load) {
   1354       stbi__vertical_flip_slices( result, *x, *y, *z, *comp );
   1355    }
   1356 
   1357    return result;
   1358 }
   1359 #endif
   1360 
   1361 #ifndef STBI_NO_LINEAR
   1362 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   1363 {
   1364    unsigned char *data;
   1365    #ifndef STBI_NO_HDR
   1366    if (stbi__hdr_test(s)) {
   1367       stbi__result_info ri;
   1368       float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp, &ri);
   1369       if (hdr_data)
   1370          stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
   1371       return hdr_data;
   1372    }
   1373    #endif
   1374    data = stbi__load_and_postprocess_8bit(s, x, y, comp, req_comp);
   1375    if (data)
   1376       return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
   1377    return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
   1378 }
   1379 
   1380 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   1381 {
   1382    stbi__context s;
   1383    stbi__start_mem(&s,buffer,len);
   1384    return stbi__loadf_main(&s,x,y,comp,req_comp);
   1385 }
   1386 
   1387 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
   1388 {
   1389    stbi__context s;
   1390    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   1391    return stbi__loadf_main(&s,x,y,comp,req_comp);
   1392 }
   1393 
   1394 #ifndef STBI_NO_STDIO
   1395 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
   1396 {
   1397    float *result;
   1398    FILE *f = stbi__fopen(filename, "rb");
   1399    if (!f) return stbi__errpf("can't fopen", "Unable to open file");
   1400    result = stbi_loadf_from_file(f,x,y,comp,req_comp);
   1401    fclose(f);
   1402    return result;
   1403 }
   1404 
   1405 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   1406 {
   1407    stbi__context s;
   1408    stbi__start_file(&s,f);
   1409    return stbi__loadf_main(&s,x,y,comp,req_comp);
   1410 }
   1411 #endif // !STBI_NO_STDIO
   1412 
   1413 #endif // !STBI_NO_LINEAR
   1414 
   1415 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
   1416 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
   1417 // reports false!
   1418 
   1419 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
   1420 {
   1421    #ifndef STBI_NO_HDR
   1422    stbi__context s;
   1423    stbi__start_mem(&s,buffer,len);
   1424    return stbi__hdr_test(&s);
   1425    #else
   1426    STBI_NOTUSED(buffer);
   1427    STBI_NOTUSED(len);
   1428    return 0;
   1429    #endif
   1430 }
   1431 
   1432 #ifndef STBI_NO_STDIO
   1433 STBIDEF int      stbi_is_hdr          (char const *filename)
   1434 {
   1435    FILE *f = stbi__fopen(filename, "rb");
   1436    int result=0;
   1437    if (f) {
   1438       result = stbi_is_hdr_from_file(f);
   1439       fclose(f);
   1440    }
   1441    return result;
   1442 }
   1443 
   1444 STBIDEF int stbi_is_hdr_from_file(FILE *f)
   1445 {
   1446    #ifndef STBI_NO_HDR
   1447    long pos = ftell(f);
   1448    int res;
   1449    stbi__context s;
   1450    stbi__start_file(&s,f);
   1451    res = stbi__hdr_test(&s);
   1452    fseek(f, pos, SEEK_SET);
   1453    return res;
   1454    #else
   1455    STBI_NOTUSED(f);
   1456    return 0;
   1457    #endif
   1458 }
   1459 #endif // !STBI_NO_STDIO
   1460 
   1461 STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
   1462 {
   1463    #ifndef STBI_NO_HDR
   1464    stbi__context s;
   1465    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   1466    return stbi__hdr_test(&s);
   1467    #else
   1468    STBI_NOTUSED(clbk);
   1469    STBI_NOTUSED(user);
   1470    return 0;
   1471    #endif
   1472 }
   1473 
   1474 #ifndef STBI_NO_LINEAR
   1475 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
   1476 
   1477 STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
   1478 STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
   1479 #endif
   1480 
   1481 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
   1482 
   1483 STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
   1484 STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
   1485 
   1486 
   1487 //////////////////////////////////////////////////////////////////////////////
   1488 //
   1489 // Common code used by all image loaders
   1490 //
   1491 
   1492 enum
   1493 {
   1494    STBI__SCAN_load=0,
   1495    STBI__SCAN_type,
   1496    STBI__SCAN_header
   1497 };
   1498 
   1499 static void stbi__refill_buffer(stbi__context *s)
   1500 {
   1501    int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
   1502    if (n == 0) {
   1503       // at end of file, treat same as if from memory, but need to handle case
   1504       // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
   1505       s->read_from_callbacks = 0;
   1506       s->img_buffer = s->buffer_start;
   1507       s->img_buffer_end = s->buffer_start+1;
   1508       *s->img_buffer = 0;
   1509    } else {
   1510       s->img_buffer = s->buffer_start;
   1511       s->img_buffer_end = s->buffer_start + n;
   1512    }
   1513 }
   1514 
   1515 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
   1516 {
   1517    if (s->img_buffer < s->img_buffer_end)
   1518       return *s->img_buffer++;
   1519    if (s->read_from_callbacks) {
   1520       stbi__refill_buffer(s);
   1521       return *s->img_buffer++;
   1522    }
   1523    return 0;
   1524 }
   1525 
   1526 #if defined(STBI_NO_JPEG) && defined(STBI_NO_HDR) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
   1527 // nothing
   1528 #else
   1529 stbi_inline static int stbi__at_eof(stbi__context *s)
   1530 {
   1531    if (s->io.read) {
   1532       if (!(s->io.eof)(s->io_user_data)) return 0;
   1533       // if feof() is true, check if buffer = end
   1534       // special case: we've only got the special 0 character at the end
   1535       if (s->read_from_callbacks == 0) return 1;
   1536    }
   1537 
   1538    return s->img_buffer >= s->img_buffer_end;
   1539 }
   1540 #endif
   1541 
   1542 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC)
   1543 // nothing
   1544 #else
   1545 static void stbi__skip(stbi__context *s, int n)
   1546 {
   1547    if (n < 0) {
   1548       s->img_buffer = s->img_buffer_end;
   1549       return;
   1550    }
   1551    if (s->io.read) {
   1552       int blen = (int) (s->img_buffer_end - s->img_buffer);
   1553       if (blen < n) {
   1554          s->img_buffer = s->img_buffer_end;
   1555          (s->io.skip)(s->io_user_data, n - blen);
   1556          return;
   1557       }
   1558    }
   1559    s->img_buffer += n;
   1560 }
   1561 #endif
   1562 
   1563 #if defined(STBI_NO_PNG) && defined(STBI_NO_TGA) && defined(STBI_NO_HDR) && defined(STBI_NO_PNM)
   1564 // nothing
   1565 #else
   1566 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
   1567 {
   1568    if (s->io.read) {
   1569       int blen = (int) (s->img_buffer_end - s->img_buffer);
   1570       if (blen < n) {
   1571          int res, count;
   1572 
   1573          memcpy(buffer, s->img_buffer, blen);
   1574 
   1575          count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
   1576          res = (count == (n-blen));
   1577          s->img_buffer = s->img_buffer_end;
   1578          return res;
   1579       }
   1580    }
   1581 
   1582    if (s->img_buffer+n <= s->img_buffer_end) {
   1583       memcpy(buffer, s->img_buffer, n);
   1584       s->img_buffer += n;
   1585       return 1;
   1586    } else
   1587       return 0;
   1588 }
   1589 #endif
   1590 
   1591 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
   1592 // nothing
   1593 #else
   1594 static int stbi__get16be(stbi__context *s)
   1595 {
   1596    int z = stbi__get8(s);
   1597    return (z << 8) + stbi__get8(s);
   1598 }
   1599 #endif
   1600 
   1601 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
   1602 // nothing
   1603 #else
   1604 static stbi__uint32 stbi__get32be(stbi__context *s)
   1605 {
   1606    stbi__uint32 z = stbi__get16be(s);
   1607    return (z << 16) + stbi__get16be(s);
   1608 }
   1609 #endif
   1610 
   1611 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
   1612 // nothing
   1613 #else
   1614 static int stbi__get16le(stbi__context *s)
   1615 {
   1616    int z = stbi__get8(s);
   1617    return z + (stbi__get8(s) << 8);
   1618 }
   1619 #endif
   1620 
   1621 #ifndef STBI_NO_BMP
   1622 static stbi__uint32 stbi__get32le(stbi__context *s)
   1623 {
   1624    stbi__uint32 z = stbi__get16le(s);
   1625    return z + (stbi__get16le(s) << 16);
   1626 }
   1627 #endif
   1628 
   1629 #define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings
   1630 
   1631 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
   1632 // nothing
   1633 #else
   1634 //////////////////////////////////////////////////////////////////////////////
   1635 //
   1636 //  generic converter from built-in img_n to req_comp
   1637 //    individual types do this automatically as much as possible (e.g. jpeg
   1638 //    does all cases internally since it needs to colorspace convert anyway,
   1639 //    and it never has alpha, so very few cases ). png can automatically
   1640 //    interleave an alpha=255 channel, but falls back to this for other cases
   1641 //
   1642 //  assume data buffer is malloced, so malloc a new one and free that one
   1643 //  only failure mode is malloc failing
   1644 
   1645 static stbi_uc stbi__compute_y(int r, int g, int b)
   1646 {
   1647    return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
   1648 }
   1649 #endif
   1650 
   1651 #if defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
   1652 // nothing
   1653 #else
   1654 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
   1655 {
   1656    int i,j;
   1657    unsigned char *good;
   1658 
   1659    if (req_comp == img_n) return data;
   1660    STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
   1661 
   1662    good = (unsigned char *) stbi__malloc_mad3(req_comp, x, y, 0);
   1663    if (good == NULL) {
   1664       STBI_FREE(data);
   1665       return stbi__errpuc("outofmem", "Out of memory");
   1666    }
   1667 
   1668    for (j=0; j < (int) y; ++j) {
   1669       unsigned char *src  = data + j * x * img_n   ;
   1670       unsigned char *dest = good + j * x * req_comp;
   1671 
   1672       #define STBI__COMBO(a,b)  ((a)*8+(b))
   1673       #define STBI__CASE(a,b)   case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
   1674       // convert source image with img_n components to one with req_comp components;
   1675       // avoid switch per pixel, so use switch per scanline and massive macros
   1676       switch (STBI__COMBO(img_n, req_comp)) {
   1677          STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=255;                                     } break;
   1678          STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0];                                  } break;
   1679          STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=255;                     } break;
   1680          STBI__CASE(2,1) { dest[0]=src[0];                                                  } break;
   1681          STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0];                                  } break;
   1682          STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1];                  } break;
   1683          STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=255;        } break;
   1684          STBI__CASE(3,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]);                   } break;
   1685          STBI__CASE(3,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = 255;    } break;
   1686          STBI__CASE(4,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]);                   } break;
   1687          STBI__CASE(4,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = src[3]; } break;
   1688          STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];                    } break;
   1689          default: STBI_ASSERT(0);
   1690       }
   1691       #undef STBI__CASE
   1692    }
   1693 
   1694    STBI_FREE(data);
   1695    return good;
   1696 }
   1697 #endif
   1698 
   1699 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
   1700 // nothing
   1701 #else
   1702 static stbi__uint16 stbi__compute_y_16(int r, int g, int b)
   1703 {
   1704    return (stbi__uint16) (((r*77) + (g*150) +  (29*b)) >> 8);
   1705 }
   1706 #endif
   1707 
   1708 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
   1709 // nothing
   1710 #else
   1711 static stbi__uint16 *stbi__convert_format16(stbi__uint16 *data, int img_n, int req_comp, unsigned int x, unsigned int y)
   1712 {
   1713    int i,j;
   1714    stbi__uint16 *good;
   1715 
   1716    if (req_comp == img_n) return data;
   1717    STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
   1718 
   1719    good = (stbi__uint16 *) stbi__malloc(req_comp * x * y * 2);
   1720    if (good == NULL) {
   1721       STBI_FREE(data);
   1722       return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
   1723    }
   1724 
   1725    for (j=0; j < (int) y; ++j) {
   1726       stbi__uint16 *src  = data + j * x * img_n   ;
   1727       stbi__uint16 *dest = good + j * x * req_comp;
   1728 
   1729       #define STBI__COMBO(a,b)  ((a)*8+(b))
   1730       #define STBI__CASE(a,b)   case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
   1731       // convert source image with img_n components to one with req_comp components;
   1732       // avoid switch per pixel, so use switch per scanline and massive macros
   1733       switch (STBI__COMBO(img_n, req_comp)) {
   1734          STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=0xffff;                                     } break;
   1735          STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0];                                     } break;
   1736          STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=0xffff;                     } break;
   1737          STBI__CASE(2,1) { dest[0]=src[0];                                                     } break;
   1738          STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0];                                     } break;
   1739          STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1];                     } break;
   1740          STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=0xffff;        } break;
   1741          STBI__CASE(3,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]);                   } break;
   1742          STBI__CASE(3,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = 0xffff; } break;
   1743          STBI__CASE(4,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]);                   } break;
   1744          STBI__CASE(4,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = src[3]; } break;
   1745          STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];                       } break;
   1746          default: STBI_ASSERT(0);
   1747       }
   1748       #undef STBI__CASE
   1749    }
   1750 
   1751    STBI_FREE(data);
   1752    return good;
   1753 }
   1754 #endif
   1755 
   1756 #ifndef STBI_NO_LINEAR
   1757 static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
   1758 {
   1759    int i,k,n;
   1760    float *output;
   1761    if (!data) return NULL;
   1762    output = (float *) stbi__malloc_mad4(x, y, comp, sizeof(float), 0);
   1763    if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
   1764    // compute number of non-alpha components
   1765    if (comp & 1) n = comp; else n = comp-1;
   1766    for (i=0; i < x*y; ++i) {
   1767       for (k=0; k < n; ++k) {
   1768          output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
   1769       }
   1770    }
   1771    if (n < comp) {
   1772       for (i=0; i < x*y; ++i) {
   1773          output[i*comp + n] = data[i*comp + n]/255.0f;
   1774       }
   1775    }
   1776    STBI_FREE(data);
   1777    return output;
   1778 }
   1779 #endif
   1780 
   1781 #ifndef STBI_NO_HDR
   1782 #define stbi__float2int(x)   ((int) (x))
   1783 static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
   1784 {
   1785    int i,k,n;
   1786    stbi_uc *output;
   1787    if (!data) return NULL;
   1788    output = (stbi_uc *) stbi__malloc_mad3(x, y, comp, 0);
   1789    if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
   1790    // compute number of non-alpha components
   1791    if (comp & 1) n = comp; else n = comp-1;
   1792    for (i=0; i < x*y; ++i) {
   1793       for (k=0; k < n; ++k) {
   1794          float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
   1795          if (z < 0) z = 0;
   1796          if (z > 255) z = 255;
   1797          output[i*comp + k] = (stbi_uc) stbi__float2int(z);
   1798       }
   1799       if (k < comp) {
   1800          float z = data[i*comp+k] * 255 + 0.5f;
   1801          if (z < 0) z = 0;
   1802          if (z > 255) z = 255;
   1803          output[i*comp + k] = (stbi_uc) stbi__float2int(z);
   1804       }
   1805    }
   1806    STBI_FREE(data);
   1807    return output;
   1808 }
   1809 #endif
   1810 
   1811 //////////////////////////////////////////////////////////////////////////////
   1812 //
   1813 //  "baseline" JPEG/JFIF decoder
   1814 //
   1815 //    simple implementation
   1816 //      - doesn't support delayed output of y-dimension
   1817 //      - simple interface (only one output format: 8-bit interleaved RGB)
   1818 //      - doesn't try to recover corrupt jpegs
   1819 //      - doesn't allow partial loading, loading multiple at once
   1820 //      - still fast on x86 (copying globals into locals doesn't help x86)
   1821 //      - allocates lots of intermediate memory (full size of all components)
   1822 //        - non-interleaved case requires this anyway
   1823 //        - allows good upsampling (see next)
   1824 //    high-quality
   1825 //      - upsampled channels are bilinearly interpolated, even across blocks
   1826 //      - quality integer IDCT derived from IJG's 'slow'
   1827 //    performance
   1828 //      - fast huffman; reasonable integer IDCT
   1829 //      - some SIMD kernels for common paths on targets with SSE2/NEON
   1830 //      - uses a lot of intermediate memory, could cache poorly
   1831 
   1832 #ifndef STBI_NO_JPEG
   1833 
   1834 // huffman decoding acceleration
   1835 #define FAST_BITS   9  // larger handles more cases; smaller stomps less cache
   1836 
   1837 typedef struct
   1838 {
   1839    stbi_uc  fast[1 << FAST_BITS];
   1840    // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   1841    stbi__uint16 code[256];
   1842    stbi_uc  values[256];
   1843    stbi_uc  size[257];
   1844    unsigned int maxcode[18];
   1845    int    delta[17];   // old 'firstsymbol' - old 'firstcode'
   1846 } stbi__huffman;
   1847 
   1848 typedef struct
   1849 {
   1850    stbi__context *s;
   1851    stbi__huffman huff_dc[4];
   1852    stbi__huffman huff_ac[4];
   1853    stbi__uint16 dequant[4][64];
   1854    stbi__int16 fast_ac[4][1 << FAST_BITS];
   1855 
   1856 // sizes for components, interleaved MCUs
   1857    int img_h_max, img_v_max;
   1858    int img_mcu_x, img_mcu_y;
   1859    int img_mcu_w, img_mcu_h;
   1860 
   1861 // definition of jpeg image component
   1862    struct
   1863    {
   1864       int id;
   1865       int h,v;
   1866       int tq;
   1867       int hd,ha;
   1868       int dc_pred;
   1869 
   1870       int x,y,w2,h2;
   1871       stbi_uc *data;
   1872       void *raw_data, *raw_coeff;
   1873       stbi_uc *linebuf;
   1874       short   *coeff;   // progressive only
   1875       int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
   1876    } img_comp[4];
   1877 
   1878    stbi__uint32   code_buffer; // jpeg entropy-coded buffer
   1879    int            code_bits;   // number of valid bits
   1880    unsigned char  marker;      // marker seen while filling entropy buffer
   1881    int            nomore;      // flag if we saw a marker so must stop
   1882 
   1883    int            progressive;
   1884    int            spec_start;
   1885    int            spec_end;
   1886    int            succ_high;
   1887    int            succ_low;
   1888    int            eob_run;
   1889    int            jfif;
   1890    int            app14_color_transform; // Adobe APP14 tag
   1891    int            rgb;
   1892 
   1893    int scan_n, order[4];
   1894    int restart_interval, todo;
   1895 
   1896 // kernels
   1897    void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
   1898    void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
   1899    stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
   1900 } stbi__jpeg;
   1901 
   1902 static int stbi__build_huffman(stbi__huffman *h, int *count)
   1903 {
   1904    int i,j,k=0;
   1905    unsigned int code;
   1906    // build size list for each symbol (from JPEG spec)
   1907    for (i=0; i < 16; ++i)
   1908       for (j=0; j < count[i]; ++j)
   1909          h->size[k++] = (stbi_uc) (i+1);
   1910    h->size[k] = 0;
   1911 
   1912    // compute actual symbols (from jpeg spec)
   1913    code = 0;
   1914    k = 0;
   1915    for(j=1; j <= 16; ++j) {
   1916       // compute delta to add to code to compute symbol id
   1917       h->delta[j] = k - code;
   1918       if (h->size[k] == j) {
   1919          while (h->size[k] == j)
   1920             h->code[k++] = (stbi__uint16) (code++);
   1921          if (code-1 >= (1u << j)) return stbi__err("bad code lengths","Corrupt JPEG");
   1922       }
   1923       // compute largest code + 1 for this size, preshifted as needed later
   1924       h->maxcode[j] = code << (16-j);
   1925       code <<= 1;
   1926    }
   1927    h->maxcode[j] = 0xffffffff;
   1928 
   1929    // build non-spec acceleration table; 255 is flag for not-accelerated
   1930    memset(h->fast, 255, 1 << FAST_BITS);
   1931    for (i=0; i < k; ++i) {
   1932       int s = h->size[i];
   1933       if (s <= FAST_BITS) {
   1934          int c = h->code[i] << (FAST_BITS-s);
   1935          int m = 1 << (FAST_BITS-s);
   1936          for (j=0; j < m; ++j) {
   1937             h->fast[c+j] = (stbi_uc) i;
   1938          }
   1939       }
   1940    }
   1941    return 1;
   1942 }
   1943 
   1944 // build a table that decodes both magnitude and value of small ACs in
   1945 // one go.
   1946 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
   1947 {
   1948    int i;
   1949    for (i=0; i < (1 << FAST_BITS); ++i) {
   1950       stbi_uc fast = h->fast[i];
   1951       fast_ac[i] = 0;
   1952       if (fast < 255) {
   1953          int rs = h->values[fast];
   1954          int run = (rs >> 4) & 15;
   1955          int magbits = rs & 15;
   1956          int len = h->size[fast];
   1957 
   1958          if (magbits && len + magbits <= FAST_BITS) {
   1959             // magnitude code followed by receive_extend code
   1960             int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
   1961             int m = 1 << (magbits - 1);
   1962             if (k < m) k += (~0U << magbits) + 1;
   1963             // if the result is small enough, we can fit it in fast_ac table
   1964             if (k >= -128 && k <= 127)
   1965                fast_ac[i] = (stbi__int16) ((k * 256) + (run * 16) + (len + magbits));
   1966          }
   1967       }
   1968    }
   1969 }
   1970 
   1971 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
   1972 {
   1973    do {
   1974       unsigned int b = j->nomore ? 0 : stbi__get8(j->s);
   1975       if (b == 0xff) {
   1976          int c = stbi__get8(j->s);
   1977          while (c == 0xff) c = stbi__get8(j->s); // consume fill bytes
   1978          if (c != 0) {
   1979             j->marker = (unsigned char) c;
   1980             j->nomore = 1;
   1981             return;
   1982          }
   1983       }
   1984       j->code_buffer |= b << (24 - j->code_bits);
   1985       j->code_bits += 8;
   1986    } while (j->code_bits <= 24);
   1987 }
   1988 
   1989 // (1 << n) - 1
   1990 static const stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
   1991 
   1992 // decode a jpeg huffman value from the bitstream
   1993 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
   1994 {
   1995    unsigned int temp;
   1996    int c,k;
   1997 
   1998    if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   1999 
   2000    // look at the top FAST_BITS and determine what symbol ID it is,
   2001    // if the code is <= FAST_BITS
   2002    c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   2003    k = h->fast[c];
   2004    if (k < 255) {
   2005       int s = h->size[k];
   2006       if (s > j->code_bits)
   2007          return -1;
   2008       j->code_buffer <<= s;
   2009       j->code_bits -= s;
   2010       return h->values[k];
   2011    }
   2012 
   2013    // naive test is to shift the code_buffer down so k bits are
   2014    // valid, then test against maxcode. To speed this up, we've
   2015    // preshifted maxcode left so that it has (16-k) 0s at the
   2016    // end; in other words, regardless of the number of bits, it
   2017    // wants to be compared against something shifted to have 16;
   2018    // that way we don't need to shift inside the loop.
   2019    temp = j->code_buffer >> 16;
   2020    for (k=FAST_BITS+1 ; ; ++k)
   2021       if (temp < h->maxcode[k])
   2022          break;
   2023    if (k == 17) {
   2024       // error! code not found
   2025       j->code_bits -= 16;
   2026       return -1;
   2027    }
   2028 
   2029    if (k > j->code_bits)
   2030       return -1;
   2031 
   2032    // convert the huffman code to the symbol id
   2033    c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   2034    STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
   2035 
   2036    // convert the id to a symbol
   2037    j->code_bits -= k;
   2038    j->code_buffer <<= k;
   2039    return h->values[c];
   2040 }
   2041 
   2042 // bias[n] = (-1<<n) + 1
   2043 static const int stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
   2044 
   2045 // combined JPEG 'receive' and JPEG 'extend', since baseline
   2046 // always extends everything it receives.
   2047 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
   2048 {
   2049    unsigned int k;
   2050    int sgn;
   2051    if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   2052 
   2053    sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
   2054    k = stbi_lrot(j->code_buffer, n);
   2055    STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
   2056    j->code_buffer = k & ~stbi__bmask[n];
   2057    k &= stbi__bmask[n];
   2058    j->code_bits -= n;
   2059    return k + (stbi__jbias[n] & ~sgn);
   2060 }
   2061 
   2062 // get some unsigned bits
   2063 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
   2064 {
   2065    unsigned int k;
   2066    if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   2067    k = stbi_lrot(j->code_buffer, n);
   2068    j->code_buffer = k & ~stbi__bmask[n];
   2069    k &= stbi__bmask[n];
   2070    j->code_bits -= n;
   2071    return k;
   2072 }
   2073 
   2074 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
   2075 {
   2076    unsigned int k;
   2077    if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
   2078    k = j->code_buffer;
   2079    j->code_buffer <<= 1;
   2080    --j->code_bits;
   2081    return k & 0x80000000;
   2082 }
   2083 
   2084 // given a value that's at position X in the zigzag stream,
   2085 // where does it appear in the 8x8 matrix coded as row-major?
   2086 static const stbi_uc stbi__jpeg_dezigzag[64+15] =
   2087 {
   2088     0,  1,  8, 16,  9,  2,  3, 10,
   2089    17, 24, 32, 25, 18, 11,  4,  5,
   2090    12, 19, 26, 33, 40, 48, 41, 34,
   2091    27, 20, 13,  6,  7, 14, 21, 28,
   2092    35, 42, 49, 56, 57, 50, 43, 36,
   2093    29, 22, 15, 23, 30, 37, 44, 51,
   2094    58, 59, 52, 45, 38, 31, 39, 46,
   2095    53, 60, 61, 54, 47, 55, 62, 63,
   2096    // let corrupt input sample past end
   2097    63, 63, 63, 63, 63, 63, 63, 63,
   2098    63, 63, 63, 63, 63, 63, 63
   2099 };
   2100 
   2101 // decode one 64-entry block--
   2102 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi__uint16 *dequant)
   2103 {
   2104    int diff,dc,k;
   2105    int t;
   2106 
   2107    if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2108    t = stbi__jpeg_huff_decode(j, hdc);
   2109    if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2110 
   2111    // 0 all the ac values now so we can do it 32-bits at a time
   2112    memset(data,0,64*sizeof(data[0]));
   2113 
   2114    diff = t ? stbi__extend_receive(j, t) : 0;
   2115    dc = j->img_comp[b].dc_pred + diff;
   2116    j->img_comp[b].dc_pred = dc;
   2117    data[0] = (short) (dc * dequant[0]);
   2118 
   2119    // decode AC components, see JPEG spec
   2120    k = 1;
   2121    do {
   2122       unsigned int zig;
   2123       int c,r,s;
   2124       if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2125       c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   2126       r = fac[c];
   2127       if (r) { // fast-AC path
   2128          k += (r >> 4) & 15; // run
   2129          s = r & 15; // combined length
   2130          j->code_buffer <<= s;
   2131          j->code_bits -= s;
   2132          // decode into unzigzag'd location
   2133          zig = stbi__jpeg_dezigzag[k++];
   2134          data[zig] = (short) ((r >> 8) * dequant[zig]);
   2135       } else {
   2136          int rs = stbi__jpeg_huff_decode(j, hac);
   2137          if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2138          s = rs & 15;
   2139          r = rs >> 4;
   2140          if (s == 0) {
   2141             if (rs != 0xf0) break; // end block
   2142             k += 16;
   2143          } else {
   2144             k += r;
   2145             // decode into unzigzag'd location
   2146             zig = stbi__jpeg_dezigzag[k++];
   2147             data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
   2148          }
   2149       }
   2150    } while (k < 64);
   2151    return 1;
   2152 }
   2153 
   2154 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
   2155 {
   2156    int diff,dc;
   2157    int t;
   2158    if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2159 
   2160    if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2161 
   2162    if (j->succ_high == 0) {
   2163       // first scan for DC coefficient, must be first
   2164       memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
   2165       t = stbi__jpeg_huff_decode(j, hdc);
   2166       diff = t ? stbi__extend_receive(j, t) : 0;
   2167 
   2168       dc = j->img_comp[b].dc_pred + diff;
   2169       j->img_comp[b].dc_pred = dc;
   2170       data[0] = (short) (dc << j->succ_low);
   2171    } else {
   2172       // refinement scan for DC coefficient
   2173       if (stbi__jpeg_get_bit(j))
   2174          data[0] += (short) (1 << j->succ_low);
   2175    }
   2176    return 1;
   2177 }
   2178 
   2179 // @OPTIMIZE: store non-zigzagged during the decode passes,
   2180 // and only de-zigzag when dequantizing
   2181 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
   2182 {
   2183    int k;
   2184    if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2185 
   2186    if (j->succ_high == 0) {
   2187       int shift = j->succ_low;
   2188 
   2189       if (j->eob_run) {
   2190          --j->eob_run;
   2191          return 1;
   2192       }
   2193 
   2194       k = j->spec_start;
   2195       do {
   2196          unsigned int zig;
   2197          int c,r,s;
   2198          if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2199          c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   2200          r = fac[c];
   2201          if (r) { // fast-AC path
   2202             k += (r >> 4) & 15; // run
   2203             s = r & 15; // combined length
   2204             j->code_buffer <<= s;
   2205             j->code_bits -= s;
   2206             zig = stbi__jpeg_dezigzag[k++];
   2207             data[zig] = (short) ((r >> 8) << shift);
   2208          } else {
   2209             int rs = stbi__jpeg_huff_decode(j, hac);
   2210             if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2211             s = rs & 15;
   2212             r = rs >> 4;
   2213             if (s == 0) {
   2214                if (r < 15) {
   2215                   j->eob_run = (1 << r);
   2216                   if (r)
   2217                      j->eob_run += stbi__jpeg_get_bits(j, r);
   2218                   --j->eob_run;
   2219                   break;
   2220                }
   2221                k += 16;
   2222             } else {
   2223                k += r;
   2224                zig = stbi__jpeg_dezigzag[k++];
   2225                data[zig] = (short) (stbi__extend_receive(j,s) << shift);
   2226             }
   2227          }
   2228       } while (k <= j->spec_end);
   2229    } else {
   2230       // refinement scan for these AC coefficients
   2231 
   2232       short bit = (short) (1 << j->succ_low);
   2233 
   2234       if (j->eob_run) {
   2235          --j->eob_run;
   2236          for (k = j->spec_start; k <= j->spec_end; ++k) {
   2237             short *p = &data[stbi__jpeg_dezigzag[k]];
   2238             if (*p != 0)
   2239                if (stbi__jpeg_get_bit(j))
   2240                   if ((*p & bit)==0) {
   2241                      if (*p > 0)
   2242                         *p += bit;
   2243                      else
   2244                         *p -= bit;
   2245                   }
   2246          }
   2247       } else {
   2248          k = j->spec_start;
   2249          do {
   2250             int r,s;
   2251             int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
   2252             if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2253             s = rs & 15;
   2254             r = rs >> 4;
   2255             if (s == 0) {
   2256                if (r < 15) {
   2257                   j->eob_run = (1 << r) - 1;
   2258                   if (r)
   2259                      j->eob_run += stbi__jpeg_get_bits(j, r);
   2260                   r = 64; // force end of block
   2261                } else {
   2262                   // r=15 s=0 should write 16 0s, so we just do
   2263                   // a run of 15 0s and then write s (which is 0),
   2264                   // so we don't have to do anything special here
   2265                }
   2266             } else {
   2267                if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
   2268                // sign bit
   2269                if (stbi__jpeg_get_bit(j))
   2270                   s = bit;
   2271                else
   2272                   s = -bit;
   2273             }
   2274 
   2275             // advance by r
   2276             while (k <= j->spec_end) {
   2277                short *p = &data[stbi__jpeg_dezigzag[k++]];
   2278                if (*p != 0) {
   2279                   if (stbi__jpeg_get_bit(j))
   2280                      if ((*p & bit)==0) {
   2281                         if (*p > 0)
   2282                            *p += bit;
   2283                         else
   2284                            *p -= bit;
   2285                      }
   2286                } else {
   2287                   if (r == 0) {
   2288                      *p = (short) s;
   2289                      break;
   2290                   }
   2291                   --r;
   2292                }
   2293             }
   2294          } while (k <= j->spec_end);
   2295       }
   2296    }
   2297    return 1;
   2298 }
   2299 
   2300 // take a -128..127 value and stbi__clamp it and convert to 0..255
   2301 stbi_inline static stbi_uc stbi__clamp(int x)
   2302 {
   2303    // trick to use a single test to catch both cases
   2304    if ((unsigned int) x > 255) {
   2305       if (x < 0) return 0;
   2306       if (x > 255) return 255;
   2307    }
   2308    return (stbi_uc) x;
   2309 }
   2310 
   2311 #define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
   2312 #define stbi__fsh(x)  ((x) * 4096)
   2313 
   2314 // derived from jidctint -- DCT_ISLOW
   2315 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
   2316    int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   2317    p2 = s2;                                    \
   2318    p3 = s6;                                    \
   2319    p1 = (p2+p3) * stbi__f2f(0.5411961f);       \
   2320    t2 = p1 + p3*stbi__f2f(-1.847759065f);      \
   2321    t3 = p1 + p2*stbi__f2f( 0.765366865f);      \
   2322    p2 = s0;                                    \
   2323    p3 = s4;                                    \
   2324    t0 = stbi__fsh(p2+p3);                      \
   2325    t1 = stbi__fsh(p2-p3);                      \
   2326    x0 = t0+t3;                                 \
   2327    x3 = t0-t3;                                 \
   2328    x1 = t1+t2;                                 \
   2329    x2 = t1-t2;                                 \
   2330    t0 = s7;                                    \
   2331    t1 = s5;                                    \
   2332    t2 = s3;                                    \
   2333    t3 = s1;                                    \
   2334    p3 = t0+t2;                                 \
   2335    p4 = t1+t3;                                 \
   2336    p1 = t0+t3;                                 \
   2337    p2 = t1+t2;                                 \
   2338    p5 = (p3+p4)*stbi__f2f( 1.175875602f);      \
   2339    t0 = t0*stbi__f2f( 0.298631336f);           \
   2340    t1 = t1*stbi__f2f( 2.053119869f);           \
   2341    t2 = t2*stbi__f2f( 3.072711026f);           \
   2342    t3 = t3*stbi__f2f( 1.501321110f);           \
   2343    p1 = p5 + p1*stbi__f2f(-0.899976223f);      \
   2344    p2 = p5 + p2*stbi__f2f(-2.562915447f);      \
   2345    p3 = p3*stbi__f2f(-1.961570560f);           \
   2346    p4 = p4*stbi__f2f(-0.390180644f);           \
   2347    t3 += p1+p4;                                \
   2348    t2 += p2+p3;                                \
   2349    t1 += p2+p4;                                \
   2350    t0 += p1+p3;
   2351 
   2352 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
   2353 {
   2354    int i,val[64],*v=val;
   2355    stbi_uc *o;
   2356    short *d = data;
   2357 
   2358    // columns
   2359    for (i=0; i < 8; ++i,++d, ++v) {
   2360       // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
   2361       if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
   2362            && d[40]==0 && d[48]==0 && d[56]==0) {
   2363          //    no shortcut                 0     seconds
   2364          //    (1|2|3|4|5|6|7)==0          0     seconds
   2365          //    all separate               -0.047 seconds
   2366          //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
   2367          int dcterm = d[0]*4;
   2368          v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
   2369       } else {
   2370          STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
   2371          // constants scaled things up by 1<<12; let's bring them back
   2372          // down, but keep 2 extra bits of precision
   2373          x0 += 512; x1 += 512; x2 += 512; x3 += 512;
   2374          v[ 0] = (x0+t3) >> 10;
   2375          v[56] = (x0-t3) >> 10;
   2376          v[ 8] = (x1+t2) >> 10;
   2377          v[48] = (x1-t2) >> 10;
   2378          v[16] = (x2+t1) >> 10;
   2379          v[40] = (x2-t1) >> 10;
   2380          v[24] = (x3+t0) >> 10;
   2381          v[32] = (x3-t0) >> 10;
   2382       }
   2383    }
   2384 
   2385    for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
   2386       // no fast case since the first 1D IDCT spread components out
   2387       STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
   2388       // constants scaled things up by 1<<12, plus we had 1<<2 from first
   2389       // loop, plus horizontal and vertical each scale by sqrt(8) so together
   2390       // we've got an extra 1<<3, so 1<<17 total we need to remove.
   2391       // so we want to round that, which means adding 0.5 * 1<<17,
   2392       // aka 65536. Also, we'll end up with -128 to 127 that we want
   2393       // to encode as 0..255 by adding 128, so we'll add that before the shift
   2394       x0 += 65536 + (128<<17);
   2395       x1 += 65536 + (128<<17);
   2396       x2 += 65536 + (128<<17);
   2397       x3 += 65536 + (128<<17);
   2398       // tried computing the shifts into temps, or'ing the temps to see
   2399       // if any were out of range, but that was slower
   2400       o[0] = stbi__clamp((x0+t3) >> 17);
   2401       o[7] = stbi__clamp((x0-t3) >> 17);
   2402       o[1] = stbi__clamp((x1+t2) >> 17);
   2403       o[6] = stbi__clamp((x1-t2) >> 17);
   2404       o[2] = stbi__clamp((x2+t1) >> 17);
   2405       o[5] = stbi__clamp((x2-t1) >> 17);
   2406       o[3] = stbi__clamp((x3+t0) >> 17);
   2407       o[4] = stbi__clamp((x3-t0) >> 17);
   2408    }
   2409 }
   2410 
   2411 #ifdef STBI_SSE2
   2412 // sse2 integer IDCT. not the fastest possible implementation but it
   2413 // produces bit-identical results to the generic C version so it's
   2414 // fully "transparent".
   2415 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
   2416 {
   2417    // This is constructed to match our regular (generic) integer IDCT exactly.
   2418    __m128i row0, row1, row2, row3, row4, row5, row6, row7;
   2419    __m128i tmp;
   2420 
   2421    // dot product constant: even elems=x, odd elems=y
   2422    #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
   2423 
   2424    // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
   2425    // out(1) = c1[even]*x + c1[odd]*y
   2426    #define dct_rot(out0,out1, x,y,c0,c1) \
   2427       __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
   2428       __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
   2429       __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
   2430       __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
   2431       __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
   2432       __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
   2433 
   2434    // out = in << 12  (in 16-bit, out 32-bit)
   2435    #define dct_widen(out, in) \
   2436       __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
   2437       __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
   2438 
   2439    // wide add
   2440    #define dct_wadd(out, a, b) \
   2441       __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
   2442       __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
   2443 
   2444    // wide sub
   2445    #define dct_wsub(out, a, b) \
   2446       __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
   2447       __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
   2448 
   2449    // butterfly a/b, add bias, then shift by "s" and pack
   2450    #define dct_bfly32o(out0, out1, a,b,bias,s) \
   2451       { \
   2452          __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
   2453          __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
   2454          dct_wadd(sum, abiased, b); \
   2455          dct_wsub(dif, abiased, b); \
   2456          out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
   2457          out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
   2458       }
   2459 
   2460    // 8-bit interleave step (for transposes)
   2461    #define dct_interleave8(a, b) \
   2462       tmp = a; \
   2463       a = _mm_unpacklo_epi8(a, b); \
   2464       b = _mm_unpackhi_epi8(tmp, b)
   2465 
   2466    // 16-bit interleave step (for transposes)
   2467    #define dct_interleave16(a, b) \
   2468       tmp = a; \
   2469       a = _mm_unpacklo_epi16(a, b); \
   2470       b = _mm_unpackhi_epi16(tmp, b)
   2471 
   2472    #define dct_pass(bias,shift) \
   2473       { \
   2474          /* even part */ \
   2475          dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
   2476          __m128i sum04 = _mm_add_epi16(row0, row4); \
   2477          __m128i dif04 = _mm_sub_epi16(row0, row4); \
   2478          dct_widen(t0e, sum04); \
   2479          dct_widen(t1e, dif04); \
   2480          dct_wadd(x0, t0e, t3e); \
   2481          dct_wsub(x3, t0e, t3e); \
   2482          dct_wadd(x1, t1e, t2e); \
   2483          dct_wsub(x2, t1e, t2e); \
   2484          /* odd part */ \
   2485          dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
   2486          dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
   2487          __m128i sum17 = _mm_add_epi16(row1, row7); \
   2488          __m128i sum35 = _mm_add_epi16(row3, row5); \
   2489          dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
   2490          dct_wadd(x4, y0o, y4o); \
   2491          dct_wadd(x5, y1o, y5o); \
   2492          dct_wadd(x6, y2o, y5o); \
   2493          dct_wadd(x7, y3o, y4o); \
   2494          dct_bfly32o(row0,row7, x0,x7,bias,shift); \
   2495          dct_bfly32o(row1,row6, x1,x6,bias,shift); \
   2496          dct_bfly32o(row2,row5, x2,x5,bias,shift); \
   2497          dct_bfly32o(row3,row4, x3,x4,bias,shift); \
   2498       }
   2499 
   2500    __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
   2501    __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
   2502    __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
   2503    __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
   2504    __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
   2505    __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
   2506    __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
   2507    __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
   2508 
   2509    // rounding biases in column/row passes, see stbi__idct_block for explanation.
   2510    __m128i bias_0 = _mm_set1_epi32(512);
   2511    __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
   2512 
   2513    // load
   2514    row0 = _mm_load_si128((const __m128i *) (data + 0*8));
   2515    row1 = _mm_load_si128((const __m128i *) (data + 1*8));
   2516    row2 = _mm_load_si128((const __m128i *) (data + 2*8));
   2517    row3 = _mm_load_si128((const __m128i *) (data + 3*8));
   2518    row4 = _mm_load_si128((const __m128i *) (data + 4*8));
   2519    row5 = _mm_load_si128((const __m128i *) (data + 5*8));
   2520    row6 = _mm_load_si128((const __m128i *) (data + 6*8));
   2521    row7 = _mm_load_si128((const __m128i *) (data + 7*8));
   2522 
   2523    // column pass
   2524    dct_pass(bias_0, 10);
   2525 
   2526    {
   2527       // 16bit 8x8 transpose pass 1
   2528       dct_interleave16(row0, row4);
   2529       dct_interleave16(row1, row5);
   2530       dct_interleave16(row2, row6);
   2531       dct_interleave16(row3, row7);
   2532 
   2533       // transpose pass 2
   2534       dct_interleave16(row0, row2);
   2535       dct_interleave16(row1, row3);
   2536       dct_interleave16(row4, row6);
   2537       dct_interleave16(row5, row7);
   2538 
   2539       // transpose pass 3
   2540       dct_interleave16(row0, row1);
   2541       dct_interleave16(row2, row3);
   2542       dct_interleave16(row4, row5);
   2543       dct_interleave16(row6, row7);
   2544    }
   2545 
   2546    // row pass
   2547    dct_pass(bias_1, 17);
   2548 
   2549    {
   2550       // pack
   2551       __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
   2552       __m128i p1 = _mm_packus_epi16(row2, row3);
   2553       __m128i p2 = _mm_packus_epi16(row4, row5);
   2554       __m128i p3 = _mm_packus_epi16(row6, row7);
   2555 
   2556       // 8bit 8x8 transpose pass 1
   2557       dct_interleave8(p0, p2); // a0e0a1e1...
   2558       dct_interleave8(p1, p3); // c0g0c1g1...
   2559 
   2560       // transpose pass 2
   2561       dct_interleave8(p0, p1); // a0c0e0g0...
   2562       dct_interleave8(p2, p3); // b0d0f0h0...
   2563 
   2564       // transpose pass 3
   2565       dct_interleave8(p0, p2); // a0b0c0d0...
   2566       dct_interleave8(p1, p3); // a4b4c4d4...
   2567 
   2568       // store
   2569       _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
   2570       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
   2571       _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
   2572       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
   2573       _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
   2574       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
   2575       _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
   2576       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
   2577    }
   2578 
   2579 #undef dct_const
   2580 #undef dct_rot
   2581 #undef dct_widen
   2582 #undef dct_wadd
   2583 #undef dct_wsub
   2584 #undef dct_bfly32o
   2585 #undef dct_interleave8
   2586 #undef dct_interleave16
   2587 #undef dct_pass
   2588 }
   2589 
   2590 #endif // STBI_SSE2
   2591 
   2592 #ifdef STBI_NEON
   2593 
   2594 // NEON integer IDCT. should produce bit-identical
   2595 // results to the generic C version.
   2596 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
   2597 {
   2598    int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
   2599 
   2600    int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
   2601    int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
   2602    int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
   2603    int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
   2604    int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
   2605    int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
   2606    int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
   2607    int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
   2608    int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
   2609    int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
   2610    int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
   2611    int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
   2612 
   2613 #define dct_long_mul(out, inq, coeff) \
   2614    int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
   2615    int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
   2616 
   2617 #define dct_long_mac(out, acc, inq, coeff) \
   2618    int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
   2619    int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
   2620 
   2621 #define dct_widen(out, inq) \
   2622    int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
   2623    int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
   2624 
   2625 // wide add
   2626 #define dct_wadd(out, a, b) \
   2627    int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
   2628    int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
   2629 
   2630 // wide sub
   2631 #define dct_wsub(out, a, b) \
   2632    int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
   2633    int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
   2634 
   2635 // butterfly a/b, then shift using "shiftop" by "s" and pack
   2636 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
   2637    { \
   2638       dct_wadd(sum, a, b); \
   2639       dct_wsub(dif, a, b); \
   2640       out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
   2641       out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
   2642    }
   2643 
   2644 #define dct_pass(shiftop, shift) \
   2645    { \
   2646       /* even part */ \
   2647       int16x8_t sum26 = vaddq_s16(row2, row6); \
   2648       dct_long_mul(p1e, sum26, rot0_0); \
   2649       dct_long_mac(t2e, p1e, row6, rot0_1); \
   2650       dct_long_mac(t3e, p1e, row2, rot0_2); \
   2651       int16x8_t sum04 = vaddq_s16(row0, row4); \
   2652       int16x8_t dif04 = vsubq_s16(row0, row4); \
   2653       dct_widen(t0e, sum04); \
   2654       dct_widen(t1e, dif04); \
   2655       dct_wadd(x0, t0e, t3e); \
   2656       dct_wsub(x3, t0e, t3e); \
   2657       dct_wadd(x1, t1e, t2e); \
   2658       dct_wsub(x2, t1e, t2e); \
   2659       /* odd part */ \
   2660       int16x8_t sum15 = vaddq_s16(row1, row5); \
   2661       int16x8_t sum17 = vaddq_s16(row1, row7); \
   2662       int16x8_t sum35 = vaddq_s16(row3, row5); \
   2663       int16x8_t sum37 = vaddq_s16(row3, row7); \
   2664       int16x8_t sumodd = vaddq_s16(sum17, sum35); \
   2665       dct_long_mul(p5o, sumodd, rot1_0); \
   2666       dct_long_mac(p1o, p5o, sum17, rot1_1); \
   2667       dct_long_mac(p2o, p5o, sum35, rot1_2); \
   2668       dct_long_mul(p3o, sum37, rot2_0); \
   2669       dct_long_mul(p4o, sum15, rot2_1); \
   2670       dct_wadd(sump13o, p1o, p3o); \
   2671       dct_wadd(sump24o, p2o, p4o); \
   2672       dct_wadd(sump23o, p2o, p3o); \
   2673       dct_wadd(sump14o, p1o, p4o); \
   2674       dct_long_mac(x4, sump13o, row7, rot3_0); \
   2675       dct_long_mac(x5, sump24o, row5, rot3_1); \
   2676       dct_long_mac(x6, sump23o, row3, rot3_2); \
   2677       dct_long_mac(x7, sump14o, row1, rot3_3); \
   2678       dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
   2679       dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
   2680       dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
   2681       dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
   2682    }
   2683 
   2684    // load
   2685    row0 = vld1q_s16(data + 0*8);
   2686    row1 = vld1q_s16(data + 1*8);
   2687    row2 = vld1q_s16(data + 2*8);
   2688    row3 = vld1q_s16(data + 3*8);
   2689    row4 = vld1q_s16(data + 4*8);
   2690    row5 = vld1q_s16(data + 5*8);
   2691    row6 = vld1q_s16(data + 6*8);
   2692    row7 = vld1q_s16(data + 7*8);
   2693 
   2694    // add DC bias
   2695    row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
   2696 
   2697    // column pass
   2698    dct_pass(vrshrn_n_s32, 10);
   2699 
   2700    // 16bit 8x8 transpose
   2701    {
   2702 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
   2703 // whether compilers actually get this is another story, sadly.
   2704 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
   2705 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
   2706 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
   2707 
   2708       // pass 1
   2709       dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
   2710       dct_trn16(row2, row3);
   2711       dct_trn16(row4, row5);
   2712       dct_trn16(row6, row7);
   2713 
   2714       // pass 2
   2715       dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
   2716       dct_trn32(row1, row3);
   2717       dct_trn32(row4, row6);
   2718       dct_trn32(row5, row7);
   2719 
   2720       // pass 3
   2721       dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
   2722       dct_trn64(row1, row5);
   2723       dct_trn64(row2, row6);
   2724       dct_trn64(row3, row7);
   2725 
   2726 #undef dct_trn16
   2727 #undef dct_trn32
   2728 #undef dct_trn64
   2729    }
   2730 
   2731    // row pass
   2732    // vrshrn_n_s32 only supports shifts up to 16, we need
   2733    // 17. so do a non-rounding shift of 16 first then follow
   2734    // up with a rounding shift by 1.
   2735    dct_pass(vshrn_n_s32, 16);
   2736 
   2737    {
   2738       // pack and round
   2739       uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
   2740       uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
   2741       uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
   2742       uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
   2743       uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
   2744       uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
   2745       uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
   2746       uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
   2747 
   2748       // again, these can translate into one instruction, but often don't.
   2749 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
   2750 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
   2751 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
   2752 
   2753       // sadly can't use interleaved stores here since we only write
   2754       // 8 bytes to each scan line!
   2755 
   2756       // 8x8 8-bit transpose pass 1
   2757       dct_trn8_8(p0, p1);
   2758       dct_trn8_8(p2, p3);
   2759       dct_trn8_8(p4, p5);
   2760       dct_trn8_8(p6, p7);
   2761 
   2762       // pass 2
   2763       dct_trn8_16(p0, p2);
   2764       dct_trn8_16(p1, p3);
   2765       dct_trn8_16(p4, p6);
   2766       dct_trn8_16(p5, p7);
   2767 
   2768       // pass 3
   2769       dct_trn8_32(p0, p4);
   2770       dct_trn8_32(p1, p5);
   2771       dct_trn8_32(p2, p6);
   2772       dct_trn8_32(p3, p7);
   2773 
   2774       // store
   2775       vst1_u8(out, p0); out += out_stride;
   2776       vst1_u8(out, p1); out += out_stride;
   2777       vst1_u8(out, p2); out += out_stride;
   2778       vst1_u8(out, p3); out += out_stride;
   2779       vst1_u8(out, p4); out += out_stride;
   2780       vst1_u8(out, p5); out += out_stride;
   2781       vst1_u8(out, p6); out += out_stride;
   2782       vst1_u8(out, p7);
   2783 
   2784 #undef dct_trn8_8
   2785 #undef dct_trn8_16
   2786 #undef dct_trn8_32
   2787    }
   2788 
   2789 #undef dct_long_mul
   2790 #undef dct_long_mac
   2791 #undef dct_widen
   2792 #undef dct_wadd
   2793 #undef dct_wsub
   2794 #undef dct_bfly32o
   2795 #undef dct_pass
   2796 }
   2797 
   2798 #endif // STBI_NEON
   2799 
   2800 #define STBI__MARKER_none  0xff
   2801 // if there's a pending marker from the entropy stream, return that
   2802 // otherwise, fetch from the stream and get a marker. if there's no
   2803 // marker, return 0xff, which is never a valid marker value
   2804 static stbi_uc stbi__get_marker(stbi__jpeg *j)
   2805 {
   2806    stbi_uc x;
   2807    if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   2808    x = stbi__get8(j->s);
   2809    if (x != 0xff) return STBI__MARKER_none;
   2810    while (x == 0xff)
   2811       x = stbi__get8(j->s); // consume repeated 0xff fill bytes
   2812    return x;
   2813 }
   2814 
   2815 // in each scan, we'll have scan_n components, and the order
   2816 // of the components is specified by order[]
   2817 #define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)
   2818 
   2819 // after a restart interval, stbi__jpeg_reset the entropy decoder and
   2820 // the dc prediction
   2821 static void stbi__jpeg_reset(stbi__jpeg *j)
   2822 {
   2823    j->code_bits = 0;
   2824    j->code_buffer = 0;
   2825    j->nomore = 0;
   2826    j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = j->img_comp[3].dc_pred = 0;
   2827    j->marker = STBI__MARKER_none;
   2828    j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   2829    j->eob_run = 0;
   2830    // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
   2831    // since we don't even allow 1<<30 pixels
   2832 }
   2833 
   2834 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
   2835 {
   2836    stbi__jpeg_reset(z);
   2837    if (!z->progressive) {
   2838       if (z->scan_n == 1) {
   2839          int i,j;
   2840          STBI_SIMD_ALIGN(short, data[64]);
   2841          int n = z->order[0];
   2842          // non-interleaved data, we just need to process one block at a time,
   2843          // in trivial scanline order
   2844          // number of blocks to do just depends on how many actual "pixels" this
   2845          // component has, independent of interleaved MCU blocking and such
   2846          int w = (z->img_comp[n].x+7) >> 3;
   2847          int h = (z->img_comp[n].y+7) >> 3;
   2848          for (j=0; j < h; ++j) {
   2849             for (i=0; i < w; ++i) {
   2850                int ha = z->img_comp[n].ha;
   2851                if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
   2852                z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
   2853                // every data block is an MCU, so countdown the restart interval
   2854                if (--z->todo <= 0) {
   2855                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   2856                   // if it's NOT a restart, then just bail, so we get corrupt data
   2857                   // rather than no data
   2858                   if (!STBI__RESTART(z->marker)) return 1;
   2859                   stbi__jpeg_reset(z);
   2860                }
   2861             }
   2862          }
   2863          return 1;
   2864       } else { // interleaved
   2865          int i,j,k,x,y;
   2866          STBI_SIMD_ALIGN(short, data[64]);
   2867          for (j=0; j < z->img_mcu_y; ++j) {
   2868             for (i=0; i < z->img_mcu_x; ++i) {
   2869                // scan an interleaved mcu... process scan_n components in order
   2870                for (k=0; k < z->scan_n; ++k) {
   2871                   int n = z->order[k];
   2872                   // scan out an mcu's worth of this component; that's just determined
   2873                   // by the basic H and V specified for the component
   2874                   for (y=0; y < z->img_comp[n].v; ++y) {
   2875                      for (x=0; x < z->img_comp[n].h; ++x) {
   2876                         int x2 = (i*z->img_comp[n].h + x)*8;
   2877                         int y2 = (j*z->img_comp[n].v + y)*8;
   2878                         int ha = z->img_comp[n].ha;
   2879                         if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
   2880                         z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
   2881                      }
   2882                   }
   2883                }
   2884                // after all interleaved components, that's an interleaved MCU,
   2885                // so now count down the restart interval
   2886                if (--z->todo <= 0) {
   2887                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   2888                   if (!STBI__RESTART(z->marker)) return 1;
   2889                   stbi__jpeg_reset(z);
   2890                }
   2891             }
   2892          }
   2893          return 1;
   2894       }
   2895    } else {
   2896       if (z->scan_n == 1) {
   2897          int i,j;
   2898          int n = z->order[0];
   2899          // non-interleaved data, we just need to process one block at a time,
   2900          // in trivial scanline order
   2901          // number of blocks to do just depends on how many actual "pixels" this
   2902          // component has, independent of interleaved MCU blocking and such
   2903          int w = (z->img_comp[n].x+7) >> 3;
   2904          int h = (z->img_comp[n].y+7) >> 3;
   2905          for (j=0; j < h; ++j) {
   2906             for (i=0; i < w; ++i) {
   2907                short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
   2908                if (z->spec_start == 0) {
   2909                   if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
   2910                      return 0;
   2911                } else {
   2912                   int ha = z->img_comp[n].ha;
   2913                   if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
   2914                      return 0;
   2915                }
   2916                // every data block is an MCU, so countdown the restart interval
   2917                if (--z->todo <= 0) {
   2918                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   2919                   if (!STBI__RESTART(z->marker)) return 1;
   2920                   stbi__jpeg_reset(z);
   2921                }
   2922             }
   2923          }
   2924          return 1;
   2925       } else { // interleaved
   2926          int i,j,k,x,y;
   2927          for (j=0; j < z->img_mcu_y; ++j) {
   2928             for (i=0; i < z->img_mcu_x; ++i) {
   2929                // scan an interleaved mcu... process scan_n components in order
   2930                for (k=0; k < z->scan_n; ++k) {
   2931                   int n = z->order[k];
   2932                   // scan out an mcu's worth of this component; that's just determined
   2933                   // by the basic H and V specified for the component
   2934                   for (y=0; y < z->img_comp[n].v; ++y) {
   2935                      for (x=0; x < z->img_comp[n].h; ++x) {
   2936                         int x2 = (i*z->img_comp[n].h + x);
   2937                         int y2 = (j*z->img_comp[n].v + y);
   2938                         short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
   2939                         if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
   2940                            return 0;
   2941                      }
   2942                   }
   2943                }
   2944                // after all interleaved components, that's an interleaved MCU,
   2945                // so now count down the restart interval
   2946                if (--z->todo <= 0) {
   2947                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   2948                   if (!STBI__RESTART(z->marker)) return 1;
   2949                   stbi__jpeg_reset(z);
   2950                }
   2951             }
   2952          }
   2953          return 1;
   2954       }
   2955    }
   2956 }
   2957 
   2958 static void stbi__jpeg_dequantize(short *data, stbi__uint16 *dequant)
   2959 {
   2960    int i;
   2961    for (i=0; i < 64; ++i)
   2962       data[i] *= dequant[i];
   2963 }
   2964 
   2965 static void stbi__jpeg_finish(stbi__jpeg *z)
   2966 {
   2967    if (z->progressive) {
   2968       // dequantize and idct the data
   2969       int i,j,n;
   2970       for (n=0; n < z->s->img_n; ++n) {
   2971          int w = (z->img_comp[n].x+7) >> 3;
   2972          int h = (z->img_comp[n].y+7) >> 3;
   2973          for (j=0; j < h; ++j) {
   2974             for (i=0; i < w; ++i) {
   2975                short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
   2976                stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
   2977                z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
   2978             }
   2979          }
   2980       }
   2981    }
   2982 }
   2983 
   2984 static int stbi__process_marker(stbi__jpeg *z, int m)
   2985 {
   2986    int L;
   2987    switch (m) {
   2988       case STBI__MARKER_none: // no marker found
   2989          return stbi__err("expected marker","Corrupt JPEG");
   2990 
   2991       case 0xDD: // DRI - specify restart interval
   2992          if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
   2993          z->restart_interval = stbi__get16be(z->s);
   2994          return 1;
   2995 
   2996       case 0xDB: // DQT - define quantization table
   2997          L = stbi__get16be(z->s)-2;
   2998          while (L > 0) {
   2999             int q = stbi__get8(z->s);
   3000             int p = q >> 4, sixteen = (p != 0);
   3001             int t = q & 15,i;
   3002             if (p != 0 && p != 1) return stbi__err("bad DQT type","Corrupt JPEG");
   3003             if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
   3004 
   3005             for (i=0; i < 64; ++i)
   3006                z->dequant[t][stbi__jpeg_dezigzag[i]] = (stbi__uint16)(sixteen ? stbi__get16be(z->s) : stbi__get8(z->s));
   3007             L -= (sixteen ? 129 : 65);
   3008          }
   3009          return L==0;
   3010 
   3011       case 0xC4: // DHT - define huffman table
   3012          L = stbi__get16be(z->s)-2;
   3013          while (L > 0) {
   3014             stbi_uc *v;
   3015             int sizes[16],i,n=0;
   3016             int q = stbi__get8(z->s);
   3017             int tc = q >> 4;
   3018             int th = q & 15;
   3019             if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
   3020             for (i=0; i < 16; ++i) {
   3021                sizes[i] = stbi__get8(z->s);
   3022                n += sizes[i];
   3023             }
   3024             L -= 17;
   3025             if (tc == 0) {
   3026                if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
   3027                v = z->huff_dc[th].values;
   3028             } else {
   3029                if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
   3030                v = z->huff_ac[th].values;
   3031             }
   3032             for (i=0; i < n; ++i)
   3033                v[i] = stbi__get8(z->s);
   3034             if (tc != 0)
   3035                stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
   3036             L -= n;
   3037          }
   3038          return L==0;
   3039    }
   3040 
   3041    // check for comment block or APP blocks
   3042    if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
   3043       L = stbi__get16be(z->s);
   3044       if (L < 2) {
   3045          if (m == 0xFE)
   3046             return stbi__err("bad COM len","Corrupt JPEG");
   3047          else
   3048             return stbi__err("bad APP len","Corrupt JPEG");
   3049       }
   3050       L -= 2;
   3051 
   3052       if (m == 0xE0 && L >= 5) { // JFIF APP0 segment
   3053          static const unsigned char tag[5] = {'J','F','I','F','\0'};
   3054          int ok = 1;
   3055          int i;
   3056          for (i=0; i < 5; ++i)
   3057             if (stbi__get8(z->s) != tag[i])
   3058                ok = 0;
   3059          L -= 5;
   3060          if (ok)
   3061             z->jfif = 1;
   3062       } else if (m == 0xEE && L >= 12) { // Adobe APP14 segment
   3063          static const unsigned char tag[6] = {'A','d','o','b','e','\0'};
   3064          int ok = 1;
   3065          int i;
   3066          for (i=0; i < 6; ++i)
   3067             if (stbi__get8(z->s) != tag[i])
   3068                ok = 0;
   3069          L -= 6;
   3070          if (ok) {
   3071             stbi__get8(z->s); // version
   3072             stbi__get16be(z->s); // flags0
   3073             stbi__get16be(z->s); // flags1
   3074             z->app14_color_transform = stbi__get8(z->s); // color transform
   3075             L -= 6;
   3076          }
   3077       }
   3078 
   3079       stbi__skip(z->s, L);
   3080       return 1;
   3081    }
   3082 
   3083    return stbi__err("unknown marker","Corrupt JPEG");
   3084 }
   3085 
   3086 // after we see SOS
   3087 static int stbi__process_scan_header(stbi__jpeg *z)
   3088 {
   3089    int i;
   3090    int Ls = stbi__get16be(z->s);
   3091    z->scan_n = stbi__get8(z->s);
   3092    if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
   3093    if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
   3094    for (i=0; i < z->scan_n; ++i) {
   3095       int id = stbi__get8(z->s), which;
   3096       int q = stbi__get8(z->s);
   3097       for (which = 0; which < z->s->img_n; ++which)
   3098          if (z->img_comp[which].id == id)
   3099             break;
   3100       if (which == z->s->img_n) return 0; // no match
   3101       z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
   3102       z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
   3103       z->order[i] = which;
   3104    }
   3105 
   3106    {
   3107       int aa;
   3108       z->spec_start = stbi__get8(z->s);
   3109       z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
   3110       aa = stbi__get8(z->s);
   3111       z->succ_high = (aa >> 4);
   3112       z->succ_low  = (aa & 15);
   3113       if (z->progressive) {
   3114          if (z->spec_start > 63 || z->spec_end > 63  || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
   3115             return stbi__err("bad SOS", "Corrupt JPEG");
   3116       } else {
   3117          if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
   3118          if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
   3119          z->spec_end = 63;
   3120       }
   3121    }
   3122 
   3123    return 1;
   3124 }
   3125 
   3126 static int stbi__free_jpeg_components(stbi__jpeg *z, int ncomp, int why)
   3127 {
   3128    int i;
   3129    for (i=0; i < ncomp; ++i) {
   3130       if (z->img_comp[i].raw_data) {
   3131          STBI_FREE(z->img_comp[i].raw_data);
   3132          z->img_comp[i].raw_data = NULL;
   3133          z->img_comp[i].data = NULL;
   3134       }
   3135       if (z->img_comp[i].raw_coeff) {
   3136          STBI_FREE(z->img_comp[i].raw_coeff);
   3137          z->img_comp[i].raw_coeff = 0;
   3138          z->img_comp[i].coeff = 0;
   3139       }
   3140       if (z->img_comp[i].linebuf) {
   3141          STBI_FREE(z->img_comp[i].linebuf);
   3142          z->img_comp[i].linebuf = NULL;
   3143       }
   3144    }
   3145    return why;
   3146 }
   3147 
   3148 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
   3149 {
   3150    stbi__context *s = z->s;
   3151    int Lf,p,i,q, h_max=1,v_max=1,c;
   3152    Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
   3153    p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   3154    s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   3155    s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
   3156    c = stbi__get8(s);
   3157    if (c != 3 && c != 1 && c != 4) return stbi__err("bad component count","Corrupt JPEG");
   3158    s->img_n = c;
   3159    for (i=0; i < c; ++i) {
   3160       z->img_comp[i].data = NULL;
   3161       z->img_comp[i].linebuf = NULL;
   3162    }
   3163 
   3164    if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
   3165 
   3166    z->rgb = 0;
   3167    for (i=0; i < s->img_n; ++i) {
   3168       static const unsigned char rgb[3] = { 'R', 'G', 'B' };
   3169       z->img_comp[i].id = stbi__get8(s);
   3170       if (s->img_n == 3 && z->img_comp[i].id == rgb[i])
   3171          ++z->rgb;
   3172       q = stbi__get8(s);
   3173       z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
   3174       z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
   3175       z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
   3176    }
   3177 
   3178    if (scan != STBI__SCAN_load) return 1;
   3179 
   3180    if (!stbi__mad3sizes_valid(s->img_x, s->img_y, s->img_n, 0)) return stbi__err("too large", "Image too large to decode");
   3181 
   3182    for (i=0; i < s->img_n; ++i) {
   3183       if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
   3184       if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
   3185    }
   3186 
   3187    // compute interleaved mcu info
   3188    z->img_h_max = h_max;
   3189    z->img_v_max = v_max;
   3190    z->img_mcu_w = h_max * 8;
   3191    z->img_mcu_h = v_max * 8;
   3192    // these sizes can't be more than 17 bits
   3193    z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   3194    z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
   3195 
   3196    for (i=0; i < s->img_n; ++i) {
   3197       // number of effective pixels (e.g. for non-interleaved MCU)
   3198       z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
   3199       z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
   3200       // to simplify generation, we'll allocate enough memory to decode
   3201       // the bogus oversized data from using interleaved MCUs and their
   3202       // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
   3203       // discard the extra data until colorspace conversion
   3204       //
   3205       // img_mcu_x, img_mcu_y: <=17 bits; comp[i].h and .v are <=4 (checked earlier)
   3206       // so these muls can't overflow with 32-bit ints (which we require)
   3207       z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
   3208       z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
   3209       z->img_comp[i].coeff = 0;
   3210       z->img_comp[i].raw_coeff = 0;
   3211       z->img_comp[i].linebuf = NULL;
   3212       z->img_comp[i].raw_data = stbi__malloc_mad2(z->img_comp[i].w2, z->img_comp[i].h2, 15);
   3213       if (z->img_comp[i].raw_data == NULL)
   3214          return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
   3215       // align blocks for idct using mmx/sse
   3216       z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
   3217       if (z->progressive) {
   3218          // w2, h2 are multiples of 8 (see above)
   3219          z->img_comp[i].coeff_w = z->img_comp[i].w2 / 8;
   3220          z->img_comp[i].coeff_h = z->img_comp[i].h2 / 8;
   3221          z->img_comp[i].raw_coeff = stbi__malloc_mad3(z->img_comp[i].w2, z->img_comp[i].h2, sizeof(short), 15);
   3222          if (z->img_comp[i].raw_coeff == NULL)
   3223             return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
   3224          z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
   3225       }
   3226    }
   3227 
   3228    return 1;
   3229 }
   3230 
   3231 // use comparisons since in some cases we handle more than one case (e.g. SOF)
   3232 #define stbi__DNL(x)         ((x) == 0xdc)
   3233 #define stbi__SOI(x)         ((x) == 0xd8)
   3234 #define stbi__EOI(x)         ((x) == 0xd9)
   3235 #define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
   3236 #define stbi__SOS(x)         ((x) == 0xda)
   3237 
   3238 #define stbi__SOF_progressive(x)   ((x) == 0xc2)
   3239 
   3240 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
   3241 {
   3242    int m;
   3243    z->jfif = 0;
   3244    z->app14_color_transform = -1; // valid values are 0,1,2
   3245    z->marker = STBI__MARKER_none; // initialize cached marker to empty
   3246    m = stbi__get_marker(z);
   3247    if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
   3248    if (scan == STBI__SCAN_type) return 1;
   3249    m = stbi__get_marker(z);
   3250    while (!stbi__SOF(m)) {
   3251       if (!stbi__process_marker(z,m)) return 0;
   3252       m = stbi__get_marker(z);
   3253       while (m == STBI__MARKER_none) {
   3254          // some files have extra padding after their blocks, so ok, we'll scan
   3255          if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
   3256          m = stbi__get_marker(z);
   3257       }
   3258    }
   3259    z->progressive = stbi__SOF_progressive(m);
   3260    if (!stbi__process_frame_header(z, scan)) return 0;
   3261    return 1;
   3262 }
   3263 
   3264 // decode image to YCbCr format
   3265 static int stbi__decode_jpeg_image(stbi__jpeg *j)
   3266 {
   3267    int m;
   3268    for (m = 0; m < 4; m++) {
   3269       j->img_comp[m].raw_data = NULL;
   3270       j->img_comp[m].raw_coeff = NULL;
   3271    }
   3272    j->restart_interval = 0;
   3273    if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
   3274    m = stbi__get_marker(j);
   3275    while (!stbi__EOI(m)) {
   3276       if (stbi__SOS(m)) {
   3277          if (!stbi__process_scan_header(j)) return 0;
   3278          if (!stbi__parse_entropy_coded_data(j)) return 0;
   3279          if (j->marker == STBI__MARKER_none ) {
   3280             // handle 0s at the end of image data from IP Kamera 9060
   3281             while (!stbi__at_eof(j->s)) {
   3282                int x = stbi__get8(j->s);
   3283                if (x == 255) {
   3284                   j->marker = stbi__get8(j->s);
   3285                   break;
   3286                }
   3287             }
   3288             // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
   3289          }
   3290       } else if (stbi__DNL(m)) {
   3291          int Ld = stbi__get16be(j->s);
   3292          stbi__uint32 NL = stbi__get16be(j->s);
   3293          if (Ld != 4) return stbi__err("bad DNL len", "Corrupt JPEG");
   3294          if (NL != j->s->img_y) return stbi__err("bad DNL height", "Corrupt JPEG");
   3295       } else {
   3296          if (!stbi__process_marker(j, m)) return 0;
   3297       }
   3298       m = stbi__get_marker(j);
   3299    }
   3300    if (j->progressive)
   3301       stbi__jpeg_finish(j);
   3302    return 1;
   3303 }
   3304 
   3305 // static jfif-centered resampling (across block boundaries)
   3306 
   3307 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
   3308                                     int w, int hs);
   3309 
   3310 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
   3311 
   3312 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3313 {
   3314    STBI_NOTUSED(out);
   3315    STBI_NOTUSED(in_far);
   3316    STBI_NOTUSED(w);
   3317    STBI_NOTUSED(hs);
   3318    return in_near;
   3319 }
   3320 
   3321 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3322 {
   3323    // need to generate two samples vertically for every one in input
   3324    int i;
   3325    STBI_NOTUSED(hs);
   3326    for (i=0; i < w; ++i)
   3327       out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
   3328    return out;
   3329 }
   3330 
   3331 static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3332 {
   3333    // need to generate two samples horizontally for every one in input
   3334    int i;
   3335    stbi_uc *input = in_near;
   3336 
   3337    if (w == 1) {
   3338       // if only one sample, can't do any interpolation
   3339       out[0] = out[1] = input[0];
   3340       return out;
   3341    }
   3342 
   3343    out[0] = input[0];
   3344    out[1] = stbi__div4(input[0]*3 + input[1] + 2);
   3345    for (i=1; i < w-1; ++i) {
   3346       int n = 3*input[i]+2;
   3347       out[i*2+0] = stbi__div4(n+input[i-1]);
   3348       out[i*2+1] = stbi__div4(n+input[i+1]);
   3349    }
   3350    out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
   3351    out[i*2+1] = input[w-1];
   3352 
   3353    STBI_NOTUSED(in_far);
   3354    STBI_NOTUSED(hs);
   3355 
   3356    return out;
   3357 }
   3358 
   3359 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
   3360 
   3361 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3362 {
   3363    // need to generate 2x2 samples for every one in input
   3364    int i,t0,t1;
   3365    if (w == 1) {
   3366       out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
   3367       return out;
   3368    }
   3369 
   3370    t1 = 3*in_near[0] + in_far[0];
   3371    out[0] = stbi__div4(t1+2);
   3372    for (i=1; i < w; ++i) {
   3373       t0 = t1;
   3374       t1 = 3*in_near[i]+in_far[i];
   3375       out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
   3376       out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   3377    }
   3378    out[w*2-1] = stbi__div4(t1+2);
   3379 
   3380    STBI_NOTUSED(hs);
   3381 
   3382    return out;
   3383 }
   3384 
   3385 #if defined(STBI_SSE2) || defined(STBI_NEON)
   3386 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3387 {
   3388    // need to generate 2x2 samples for every one in input
   3389    int i=0,t0,t1;
   3390 
   3391    if (w == 1) {
   3392       out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
   3393       return out;
   3394    }
   3395 
   3396    t1 = 3*in_near[0] + in_far[0];
   3397    // process groups of 8 pixels for as long as we can.
   3398    // note we can't handle the last pixel in a row in this loop
   3399    // because we need to handle the filter boundary conditions.
   3400    for (; i < ((w-1) & ~7); i += 8) {
   3401 #if defined(STBI_SSE2)
   3402       // load and perform the vertical filtering pass
   3403       // this uses 3*x + y = 4*x + (y - x)
   3404       __m128i zero  = _mm_setzero_si128();
   3405       __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
   3406       __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
   3407       __m128i farw  = _mm_unpacklo_epi8(farb, zero);
   3408       __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
   3409       __m128i diff  = _mm_sub_epi16(farw, nearw);
   3410       __m128i nears = _mm_slli_epi16(nearw, 2);
   3411       __m128i curr  = _mm_add_epi16(nears, diff); // current row
   3412 
   3413       // horizontal filter works the same based on shifted vers of current
   3414       // row. "prev" is current row shifted right by 1 pixel; we need to
   3415       // insert the previous pixel value (from t1).
   3416       // "next" is current row shifted left by 1 pixel, with first pixel
   3417       // of next block of 8 pixels added in.
   3418       __m128i prv0 = _mm_slli_si128(curr, 2);
   3419       __m128i nxt0 = _mm_srli_si128(curr, 2);
   3420       __m128i prev = _mm_insert_epi16(prv0, t1, 0);
   3421       __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
   3422 
   3423       // horizontal filter, polyphase implementation since it's convenient:
   3424       // even pixels = 3*cur + prev = cur*4 + (prev - cur)
   3425       // odd  pixels = 3*cur + next = cur*4 + (next - cur)
   3426       // note the shared term.
   3427       __m128i bias  = _mm_set1_epi16(8);
   3428       __m128i curs = _mm_slli_epi16(curr, 2);
   3429       __m128i prvd = _mm_sub_epi16(prev, curr);
   3430       __m128i nxtd = _mm_sub_epi16(next, curr);
   3431       __m128i curb = _mm_add_epi16(curs, bias);
   3432       __m128i even = _mm_add_epi16(prvd, curb);
   3433       __m128i odd  = _mm_add_epi16(nxtd, curb);
   3434 
   3435       // interleave even and odd pixels, then undo scaling.
   3436       __m128i int0 = _mm_unpacklo_epi16(even, odd);
   3437       __m128i int1 = _mm_unpackhi_epi16(even, odd);
   3438       __m128i de0  = _mm_srli_epi16(int0, 4);
   3439       __m128i de1  = _mm_srli_epi16(int1, 4);
   3440 
   3441       // pack and write output
   3442       __m128i outv = _mm_packus_epi16(de0, de1);
   3443       _mm_storeu_si128((__m128i *) (out + i*2), outv);
   3444 #elif defined(STBI_NEON)
   3445       // load and perform the vertical filtering pass
   3446       // this uses 3*x + y = 4*x + (y - x)
   3447       uint8x8_t farb  = vld1_u8(in_far + i);
   3448       uint8x8_t nearb = vld1_u8(in_near + i);
   3449       int16x8_t diff  = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
   3450       int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
   3451       int16x8_t curr  = vaddq_s16(nears, diff); // current row
   3452 
   3453       // horizontal filter works the same based on shifted vers of current
   3454       // row. "prev" is current row shifted right by 1 pixel; we need to
   3455       // insert the previous pixel value (from t1).
   3456       // "next" is current row shifted left by 1 pixel, with first pixel
   3457       // of next block of 8 pixels added in.
   3458       int16x8_t prv0 = vextq_s16(curr, curr, 7);
   3459       int16x8_t nxt0 = vextq_s16(curr, curr, 1);
   3460       int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
   3461       int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
   3462 
   3463       // horizontal filter, polyphase implementation since it's convenient:
   3464       // even pixels = 3*cur + prev = cur*4 + (prev - cur)
   3465       // odd  pixels = 3*cur + next = cur*4 + (next - cur)
   3466       // note the shared term.
   3467       int16x8_t curs = vshlq_n_s16(curr, 2);
   3468       int16x8_t prvd = vsubq_s16(prev, curr);
   3469       int16x8_t nxtd = vsubq_s16(next, curr);
   3470       int16x8_t even = vaddq_s16(curs, prvd);
   3471       int16x8_t odd  = vaddq_s16(curs, nxtd);
   3472 
   3473       // undo scaling and round, then store with even/odd phases interleaved
   3474       uint8x8x2_t o;
   3475       o.val[0] = vqrshrun_n_s16(even, 4);
   3476       o.val[1] = vqrshrun_n_s16(odd,  4);
   3477       vst2_u8(out + i*2, o);
   3478 #endif
   3479 
   3480       // "previous" value for next iter
   3481       t1 = 3*in_near[i+7] + in_far[i+7];
   3482    }
   3483 
   3484    t0 = t1;
   3485    t1 = 3*in_near[i] + in_far[i];
   3486    out[i*2] = stbi__div16(3*t1 + t0 + 8);
   3487 
   3488    for (++i; i < w; ++i) {
   3489       t0 = t1;
   3490       t1 = 3*in_near[i]+in_far[i];
   3491       out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
   3492       out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   3493    }
   3494    out[w*2-1] = stbi__div4(t1+2);
   3495 
   3496    STBI_NOTUSED(hs);
   3497 
   3498    return out;
   3499 }
   3500 #endif
   3501 
   3502 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3503 {
   3504    // resample with nearest-neighbor
   3505    int i,j;
   3506    STBI_NOTUSED(in_far);
   3507    for (i=0; i < w; ++i)
   3508       for (j=0; j < hs; ++j)
   3509          out[i*hs+j] = in_near[i];
   3510    return out;
   3511 }
   3512 
   3513 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
   3514 // to make sure the code produces the same results in both SIMD and scalar
   3515 #define stbi__float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
   3516 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
   3517 {
   3518    int i;
   3519    for (i=0; i < count; ++i) {
   3520       int y_fixed = (y[i] << 20) + (1<<19); // rounding
   3521       int r,g,b;
   3522       int cr = pcr[i] - 128;
   3523       int cb = pcb[i] - 128;
   3524       r = y_fixed +  cr* stbi__float2fixed(1.40200f);
   3525       g = y_fixed + (cr*-stbi__float2fixed(0.71414f)) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
   3526       b = y_fixed                                     +   cb* stbi__float2fixed(1.77200f);
   3527       r >>= 20;
   3528       g >>= 20;
   3529       b >>= 20;
   3530       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
   3531       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
   3532       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
   3533       out[0] = (stbi_uc)r;
   3534       out[1] = (stbi_uc)g;
   3535       out[2] = (stbi_uc)b;
   3536       out[3] = 255;
   3537       out += step;
   3538    }
   3539 }
   3540 
   3541 #if defined(STBI_SSE2) || defined(STBI_NEON)
   3542 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
   3543 {
   3544    int i = 0;
   3545 
   3546 #ifdef STBI_SSE2
   3547    // step == 3 is pretty ugly on the final interleave, and i'm not convinced
   3548    // it's useful in practice (you wouldn't use it for textures, for example).
   3549    // so just accelerate step == 4 case.
   3550    if (step == 4) {
   3551       // this is a fairly straightforward implementation and not super-optimized.
   3552       __m128i signflip  = _mm_set1_epi8(-0x80);
   3553       __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
   3554       __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
   3555       __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
   3556       __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
   3557       __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
   3558       __m128i xw = _mm_set1_epi16(255); // alpha channel
   3559 
   3560       for (; i+7 < count; i += 8) {
   3561          // load
   3562          __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
   3563          __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
   3564          __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
   3565          __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
   3566          __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
   3567 
   3568          // unpack to short (and left-shift cr, cb by 8)
   3569          __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
   3570          __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
   3571          __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
   3572 
   3573          // color transform
   3574          __m128i yws = _mm_srli_epi16(yw, 4);
   3575          __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
   3576          __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
   3577          __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
   3578          __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
   3579          __m128i rws = _mm_add_epi16(cr0, yws);
   3580          __m128i gwt = _mm_add_epi16(cb0, yws);
   3581          __m128i bws = _mm_add_epi16(yws, cb1);
   3582          __m128i gws = _mm_add_epi16(gwt, cr1);
   3583 
   3584          // descale
   3585          __m128i rw = _mm_srai_epi16(rws, 4);
   3586          __m128i bw = _mm_srai_epi16(bws, 4);
   3587          __m128i gw = _mm_srai_epi16(gws, 4);
   3588 
   3589          // back to byte, set up for transpose
   3590          __m128i brb = _mm_packus_epi16(rw, bw);
   3591          __m128i gxb = _mm_packus_epi16(gw, xw);
   3592 
   3593          // transpose to interleave channels
   3594          __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
   3595          __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
   3596          __m128i o0 = _mm_unpacklo_epi16(t0, t1);
   3597          __m128i o1 = _mm_unpackhi_epi16(t0, t1);
   3598 
   3599          // store
   3600          _mm_storeu_si128((__m128i *) (out + 0), o0);
   3601          _mm_storeu_si128((__m128i *) (out + 16), o1);
   3602          out += 32;
   3603       }
   3604    }
   3605 #endif
   3606 
   3607 #ifdef STBI_NEON
   3608    // in this version, step=3 support would be easy to add. but is there demand?
   3609    if (step == 4) {
   3610       // this is a fairly straightforward implementation and not super-optimized.
   3611       uint8x8_t signflip = vdup_n_u8(0x80);
   3612       int16x8_t cr_const0 = vdupq_n_s16(   (short) ( 1.40200f*4096.0f+0.5f));
   3613       int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
   3614       int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
   3615       int16x8_t cb_const1 = vdupq_n_s16(   (short) ( 1.77200f*4096.0f+0.5f));
   3616 
   3617       for (; i+7 < count; i += 8) {
   3618          // load
   3619          uint8x8_t y_bytes  = vld1_u8(y + i);
   3620          uint8x8_t cr_bytes = vld1_u8(pcr + i);
   3621          uint8x8_t cb_bytes = vld1_u8(pcb + i);
   3622          int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
   3623          int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
   3624 
   3625          // expand to s16
   3626          int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
   3627          int16x8_t crw = vshll_n_s8(cr_biased, 7);
   3628          int16x8_t cbw = vshll_n_s8(cb_biased, 7);
   3629 
   3630          // color transform
   3631          int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
   3632          int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
   3633          int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
   3634          int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
   3635          int16x8_t rws = vaddq_s16(yws, cr0);
   3636          int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
   3637          int16x8_t bws = vaddq_s16(yws, cb1);
   3638 
   3639          // undo scaling, round, convert to byte
   3640          uint8x8x4_t o;
   3641          o.val[0] = vqrshrun_n_s16(rws, 4);
   3642          o.val[1] = vqrshrun_n_s16(gws, 4);
   3643          o.val[2] = vqrshrun_n_s16(bws, 4);
   3644          o.val[3] = vdup_n_u8(255);
   3645 
   3646          // store, interleaving r/g/b/a
   3647          vst4_u8(out, o);
   3648          out += 8*4;
   3649       }
   3650    }
   3651 #endif
   3652 
   3653    for (; i < count; ++i) {
   3654       int y_fixed = (y[i] << 20) + (1<<19); // rounding
   3655       int r,g,b;
   3656       int cr = pcr[i] - 128;
   3657       int cb = pcb[i] - 128;
   3658       r = y_fixed + cr* stbi__float2fixed(1.40200f);
   3659       g = y_fixed + cr*-stbi__float2fixed(0.71414f) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
   3660       b = y_fixed                                   +   cb* stbi__float2fixed(1.77200f);
   3661       r >>= 20;
   3662       g >>= 20;
   3663       b >>= 20;
   3664       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
   3665       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
   3666       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
   3667       out[0] = (stbi_uc)r;
   3668       out[1] = (stbi_uc)g;
   3669       out[2] = (stbi_uc)b;
   3670       out[3] = 255;
   3671       out += step;
   3672    }
   3673 }
   3674 #endif
   3675 
   3676 // set up the kernels
   3677 static void stbi__setup_jpeg(stbi__jpeg *j)
   3678 {
   3679    j->idct_block_kernel = stbi__idct_block;
   3680    j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
   3681    j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
   3682 
   3683 #ifdef STBI_SSE2
   3684    if (stbi__sse2_available()) {
   3685       j->idct_block_kernel = stbi__idct_simd;
   3686       j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   3687       j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   3688    }
   3689 #endif
   3690 
   3691 #ifdef STBI_NEON
   3692    j->idct_block_kernel = stbi__idct_simd;
   3693    j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   3694    j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   3695 #endif
   3696 }
   3697 
   3698 // clean up the temporary component buffers
   3699 static void stbi__cleanup_jpeg(stbi__jpeg *j)
   3700 {
   3701    stbi__free_jpeg_components(j, j->s->img_n, 0);
   3702 }
   3703 
   3704 typedef struct
   3705 {
   3706    resample_row_func resample;
   3707    stbi_uc *line0,*line1;
   3708    int hs,vs;   // expansion factor in each axis
   3709    int w_lores; // horizontal pixels pre-expansion
   3710    int ystep;   // how far through vertical expansion we are
   3711    int ypos;    // which pre-expansion row we're on
   3712 } stbi__resample;
   3713 
   3714 // fast 0..255 * 0..255 => 0..255 rounded multiplication
   3715 static stbi_uc stbi__blinn_8x8(stbi_uc x, stbi_uc y)
   3716 {
   3717    unsigned int t = x*y + 128;
   3718    return (stbi_uc) ((t + (t >>8)) >> 8);
   3719 }
   3720 
   3721 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
   3722 {
   3723    int n, decode_n, is_rgb;
   3724    z->s->img_n = 0; // make stbi__cleanup_jpeg safe
   3725 
   3726    // validate req_comp
   3727    if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   3728 
   3729    // load a jpeg image from whichever source, but leave in YCbCr format
   3730    if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
   3731 
   3732    // determine actual number of components to generate
   3733    n = req_comp ? req_comp : z->s->img_n >= 3 ? 3 : 1;
   3734 
   3735    is_rgb = z->s->img_n == 3 && (z->rgb == 3 || (z->app14_color_transform == 0 && !z->jfif));
   3736 
   3737    if (z->s->img_n == 3 && n < 3 && !is_rgb)
   3738       decode_n = 1;
   3739    else
   3740       decode_n = z->s->img_n;
   3741 
   3742    // resample and color-convert
   3743    {
   3744       int k;
   3745       unsigned int i,j;
   3746       stbi_uc *output;
   3747       stbi_uc *coutput[4] = { NULL, NULL, NULL, NULL };
   3748 
   3749       stbi__resample res_comp[4];
   3750 
   3751       for (k=0; k < decode_n; ++k) {
   3752          stbi__resample *r = &res_comp[k];
   3753 
   3754          // allocate line buffer big enough for upsampling off the edges
   3755          // with upsample factor of 4
   3756          z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
   3757          if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
   3758 
   3759          r->hs      = z->img_h_max / z->img_comp[k].h;
   3760          r->vs      = z->img_v_max / z->img_comp[k].v;
   3761          r->ystep   = r->vs >> 1;
   3762          r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
   3763          r->ypos    = 0;
   3764          r->line0   = r->line1 = z->img_comp[k].data;
   3765 
   3766          if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
   3767          else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
   3768          else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
   3769          else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
   3770          else                               r->resample = stbi__resample_row_generic;
   3771       }
   3772 
   3773       // can't error after this so, this is safe
   3774       output = (stbi_uc *) stbi__malloc_mad3(n, z->s->img_x, z->s->img_y, 1);
   3775       if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
   3776 
   3777       // now go ahead and resample
   3778       for (j=0; j < z->s->img_y; ++j) {
   3779          stbi_uc *out = output + n * z->s->img_x * j;
   3780          for (k=0; k < decode_n; ++k) {
   3781             stbi__resample *r = &res_comp[k];
   3782             int y_bot = r->ystep >= (r->vs >> 1);
   3783             coutput[k] = r->resample(z->img_comp[k].linebuf,
   3784                                      y_bot ? r->line1 : r->line0,
   3785                                      y_bot ? r->line0 : r->line1,
   3786                                      r->w_lores, r->hs);
   3787             if (++r->ystep >= r->vs) {
   3788                r->ystep = 0;
   3789                r->line0 = r->line1;
   3790                if (++r->ypos < z->img_comp[k].y)
   3791                   r->line1 += z->img_comp[k].w2;
   3792             }
   3793          }
   3794          if (n >= 3) {
   3795             stbi_uc *y = coutput[0];
   3796             if (z->s->img_n == 3) {
   3797                if (is_rgb) {
   3798                   for (i=0; i < z->s->img_x; ++i) {
   3799                      out[0] = y[i];
   3800                      out[1] = coutput[1][i];
   3801                      out[2] = coutput[2][i];
   3802                      out[3] = 255;
   3803                      out += n;
   3804                   }
   3805                } else {
   3806                   z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
   3807                }
   3808             } else if (z->s->img_n == 4) {
   3809                if (z->app14_color_transform == 0) { // CMYK
   3810                   for (i=0; i < z->s->img_x; ++i) {
   3811                      stbi_uc m = coutput[3][i];
   3812                      out[0] = stbi__blinn_8x8(coutput[0][i], m);
   3813                      out[1] = stbi__blinn_8x8(coutput[1][i], m);
   3814                      out[2] = stbi__blinn_8x8(coutput[2][i], m);
   3815                      out[3] = 255;
   3816                      out += n;
   3817                   }
   3818                } else if (z->app14_color_transform == 2) { // YCCK
   3819                   z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
   3820                   for (i=0; i < z->s->img_x; ++i) {
   3821                      stbi_uc m = coutput[3][i];
   3822                      out[0] = stbi__blinn_8x8(255 - out[0], m);
   3823                      out[1] = stbi__blinn_8x8(255 - out[1], m);
   3824                      out[2] = stbi__blinn_8x8(255 - out[2], m);
   3825                      out += n;
   3826                   }
   3827                } else { // YCbCr + alpha?  Ignore the fourth channel for now
   3828                   z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
   3829                }
   3830             } else
   3831                for (i=0; i < z->s->img_x; ++i) {
   3832                   out[0] = out[1] = out[2] = y[i];
   3833                   out[3] = 255; // not used if n==3
   3834                   out += n;
   3835                }
   3836          } else {
   3837             if (is_rgb) {
   3838                if (n == 1)
   3839                   for (i=0; i < z->s->img_x; ++i)
   3840                      *out++ = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
   3841                else {
   3842                   for (i=0; i < z->s->img_x; ++i, out += 2) {
   3843                      out[0] = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
   3844                      out[1] = 255;
   3845                   }
   3846                }
   3847             } else if (z->s->img_n == 4 && z->app14_color_transform == 0) {
   3848                for (i=0; i < z->s->img_x; ++i) {
   3849                   stbi_uc m = coutput[3][i];
   3850                   stbi_uc r = stbi__blinn_8x8(coutput[0][i], m);
   3851                   stbi_uc g = stbi__blinn_8x8(coutput[1][i], m);
   3852                   stbi_uc b = stbi__blinn_8x8(coutput[2][i], m);
   3853                   out[0] = stbi__compute_y(r, g, b);
   3854                   out[1] = 255;
   3855                   out += n;
   3856                }
   3857             } else if (z->s->img_n == 4 && z->app14_color_transform == 2) {
   3858                for (i=0; i < z->s->img_x; ++i) {
   3859                   out[0] = stbi__blinn_8x8(255 - coutput[0][i], coutput[3][i]);
   3860                   out[1] = 255;
   3861                   out += n;
   3862                }
   3863             } else {
   3864                stbi_uc *y = coutput[0];
   3865                if (n == 1)
   3866                   for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
   3867                else
   3868                   for (i=0; i < z->s->img_x; ++i) { *out++ = y[i]; *out++ = 255; }
   3869             }
   3870          }
   3871       }
   3872       stbi__cleanup_jpeg(z);
   3873       *out_x = z->s->img_x;
   3874       *out_y = z->s->img_y;
   3875       if (comp) *comp = z->s->img_n >= 3 ? 3 : 1; // report original components, not output
   3876       return output;
   3877    }
   3878 }
   3879 
   3880 static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   3881 {
   3882    unsigned char* result;
   3883    stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg));
   3884    STBI_NOTUSED(ri);
   3885    j->s = s;
   3886    stbi__setup_jpeg(j);
   3887    result = load_jpeg_image(j, x,y,comp,req_comp);
   3888    STBI_FREE(j);
   3889    return result;
   3890 }
   3891 
   3892 static int stbi__jpeg_test(stbi__context *s)
   3893 {
   3894    int r;
   3895    stbi__jpeg* j = (stbi__jpeg*)stbi__malloc(sizeof(stbi__jpeg));
   3896    j->s = s;
   3897    stbi__setup_jpeg(j);
   3898    r = stbi__decode_jpeg_header(j, STBI__SCAN_type);
   3899    stbi__rewind(s);
   3900    STBI_FREE(j);
   3901    return r;
   3902 }
   3903 
   3904 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
   3905 {
   3906    if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
   3907       stbi__rewind( j->s );
   3908       return 0;
   3909    }
   3910    if (x) *x = j->s->img_x;
   3911    if (y) *y = j->s->img_y;
   3912    if (comp) *comp = j->s->img_n >= 3 ? 3 : 1;
   3913    return 1;
   3914 }
   3915 
   3916 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
   3917 {
   3918    int result;
   3919    stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg)));
   3920    j->s = s;
   3921    result = stbi__jpeg_info_raw(j, x, y, comp);
   3922    STBI_FREE(j);
   3923    return result;
   3924 }
   3925 #endif
   3926 
   3927 // public domain zlib decode    v0.2  Sean Barrett 2006-11-18
   3928 //    simple implementation
   3929 //      - all input must be provided in an upfront buffer
   3930 //      - all output is written to a single output buffer (can malloc/realloc)
   3931 //    performance
   3932 //      - fast huffman
   3933 
   3934 #ifndef STBI_NO_ZLIB
   3935 
   3936 // fast-way is faster to check than jpeg huffman, but slow way is slower
   3937 #define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
   3938 #define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)
   3939 
   3940 // zlib-style huffman encoding
   3941 // (jpegs packs from left, zlib from right, so can't share code)
   3942 typedef struct
   3943 {
   3944    stbi__uint16 fast[1 << STBI__ZFAST_BITS];
   3945    stbi__uint16 firstcode[16];
   3946    int maxcode[17];
   3947    stbi__uint16 firstsymbol[16];
   3948    stbi_uc  size[288];
   3949    stbi__uint16 value[288];
   3950 } stbi__zhuffman;
   3951 
   3952 stbi_inline static int stbi__bitreverse16(int n)
   3953 {
   3954   n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
   3955   n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
   3956   n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
   3957   n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
   3958   return n;
   3959 }
   3960 
   3961 stbi_inline static int stbi__bit_reverse(int v, int bits)
   3962 {
   3963    STBI_ASSERT(bits <= 16);
   3964    // to bit reverse n bits, reverse 16 and shift
   3965    // e.g. 11 bits, bit reverse and shift away 5
   3966    return stbi__bitreverse16(v) >> (16-bits);
   3967 }
   3968 
   3969 static int stbi__zbuild_huffman(stbi__zhuffman *z, const stbi_uc *sizelist, int num)
   3970 {
   3971    int i,k=0;
   3972    int code, next_code[16], sizes[17];
   3973 
   3974    // DEFLATE spec for generating codes
   3975    memset(sizes, 0, sizeof(sizes));
   3976    memset(z->fast, 0, sizeof(z->fast));
   3977    for (i=0; i < num; ++i)
   3978       ++sizes[sizelist[i]];
   3979    sizes[0] = 0;
   3980    for (i=1; i < 16; ++i)
   3981       if (sizes[i] > (1 << i))
   3982          return stbi__err("bad sizes", "Corrupt PNG");
   3983    code = 0;
   3984    for (i=1; i < 16; ++i) {
   3985       next_code[i] = code;
   3986       z->firstcode[i] = (stbi__uint16) code;
   3987       z->firstsymbol[i] = (stbi__uint16) k;
   3988       code = (code + sizes[i]);
   3989       if (sizes[i])
   3990          if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
   3991       z->maxcode[i] = code << (16-i); // preshift for inner loop
   3992       code <<= 1;
   3993       k += sizes[i];
   3994    }
   3995    z->maxcode[16] = 0x10000; // sentinel
   3996    for (i=0; i < num; ++i) {
   3997       int s = sizelist[i];
   3998       if (s) {
   3999          int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
   4000          stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
   4001          z->size [c] = (stbi_uc     ) s;
   4002          z->value[c] = (stbi__uint16) i;
   4003          if (s <= STBI__ZFAST_BITS) {
   4004             int j = stbi__bit_reverse(next_code[s],s);
   4005             while (j < (1 << STBI__ZFAST_BITS)) {
   4006                z->fast[j] = fastv;
   4007                j += (1 << s);
   4008             }
   4009          }
   4010          ++next_code[s];
   4011       }
   4012    }
   4013    return 1;
   4014 }
   4015 
   4016 // zlib-from-memory implementation for PNG reading
   4017 //    because PNG allows splitting the zlib stream arbitrarily,
   4018 //    and it's annoying structurally to have PNG call ZLIB call PNG,
   4019 //    we require PNG read all the IDATs and combine them into a single
   4020 //    memory buffer
   4021 
   4022 typedef struct
   4023 {
   4024    stbi_uc *zbuffer, *zbuffer_end;
   4025    int num_bits;
   4026    stbi__uint32 code_buffer;
   4027 
   4028    char *zout;
   4029    char *zout_start;
   4030    char *zout_end;
   4031    int   z_expandable;
   4032 
   4033    stbi__zhuffman z_length, z_distance;
   4034 } stbi__zbuf;
   4035 
   4036 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
   4037 {
   4038    if (z->zbuffer >= z->zbuffer_end) return 0;
   4039    return *z->zbuffer++;
   4040 }
   4041 
   4042 static void stbi__fill_bits(stbi__zbuf *z)
   4043 {
   4044    do {
   4045       STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
   4046       z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
   4047       z->num_bits += 8;
   4048    } while (z->num_bits <= 24);
   4049 }
   4050 
   4051 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
   4052 {
   4053    unsigned int k;
   4054    if (z->num_bits < n) stbi__fill_bits(z);
   4055    k = z->code_buffer & ((1 << n) - 1);
   4056    z->code_buffer >>= n;
   4057    z->num_bits -= n;
   4058    return k;
   4059 }
   4060 
   4061 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
   4062 {
   4063    int b,s,k;
   4064    // not resolved by fast table, so compute it the slow way
   4065    // use jpeg approach, which requires MSbits at top
   4066    k = stbi__bit_reverse(a->code_buffer, 16);
   4067    for (s=STBI__ZFAST_BITS+1; ; ++s)
   4068       if (k < z->maxcode[s])
   4069          break;
   4070    if (s == 16) return -1; // invalid code!
   4071    // code size is s, so:
   4072    b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
   4073    STBI_ASSERT(z->size[b] == s);
   4074    a->code_buffer >>= s;
   4075    a->num_bits -= s;
   4076    return z->value[b];
   4077 }
   4078 
   4079 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
   4080 {
   4081    int b,s;
   4082    if (a->num_bits < 16) stbi__fill_bits(a);
   4083    b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
   4084    if (b) {
   4085       s = b >> 9;
   4086       a->code_buffer >>= s;
   4087       a->num_bits -= s;
   4088       return b & 511;
   4089    }
   4090    return stbi__zhuffman_decode_slowpath(a, z);
   4091 }
   4092 
   4093 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
   4094 {
   4095    char *q;
   4096    int cur, limit, old_limit;
   4097    z->zout = zout;
   4098    if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
   4099    cur   = (int) (z->zout     - z->zout_start);
   4100    limit = old_limit = (int) (z->zout_end - z->zout_start);
   4101    while (cur + n > limit)
   4102       limit *= 2;
   4103    q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
   4104    STBI_NOTUSED(old_limit);
   4105    if (q == NULL) return stbi__err("outofmem", "Out of memory");
   4106    z->zout_start = q;
   4107    z->zout       = q + cur;
   4108    z->zout_end   = q + limit;
   4109    return 1;
   4110 }
   4111 
   4112 static const int stbi__zlength_base[31] = {
   4113    3,4,5,6,7,8,9,10,11,13,
   4114    15,17,19,23,27,31,35,43,51,59,
   4115    67,83,99,115,131,163,195,227,258,0,0 };
   4116 
   4117 static const int stbi__zlength_extra[31]=
   4118 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
   4119 
   4120 static const int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
   4121 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
   4122 
   4123 static const int stbi__zdist_extra[32] =
   4124 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
   4125 
   4126 static int stbi__parse_huffman_block(stbi__zbuf *a)
   4127 {
   4128    char *zout = a->zout;
   4129    for(;;) {
   4130       int z = stbi__zhuffman_decode(a, &a->z_length);
   4131       if (z < 256) {
   4132          if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
   4133          if (zout >= a->zout_end) {
   4134             if (!stbi__zexpand(a, zout, 1)) return 0;
   4135             zout = a->zout;
   4136          }
   4137          *zout++ = (char) z;
   4138       } else {
   4139          stbi_uc *p;
   4140          int len,dist;
   4141          if (z == 256) {
   4142             a->zout = zout;
   4143             return 1;
   4144          }
   4145          z -= 257;
   4146          len = stbi__zlength_base[z];
   4147          if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
   4148          z = stbi__zhuffman_decode(a, &a->z_distance);
   4149          if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
   4150          dist = stbi__zdist_base[z];
   4151          if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
   4152          if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
   4153          if (zout + len > a->zout_end) {
   4154             if (!stbi__zexpand(a, zout, len)) return 0;
   4155             zout = a->zout;
   4156          }
   4157          p = (stbi_uc *) (zout - dist);
   4158          if (dist == 1) { // run of one byte; common in images.
   4159             stbi_uc v = *p;
   4160             if (len) { do *zout++ = v; while (--len); }
   4161          } else {
   4162             if (len) { do *zout++ = *p++; while (--len); }
   4163          }
   4164       }
   4165    }
   4166 }
   4167 
   4168 static int stbi__compute_huffman_codes(stbi__zbuf *a)
   4169 {
   4170    static const stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
   4171    stbi__zhuffman z_codelength;
   4172    stbi_uc lencodes[286+32+137];//padding for maximum single op
   4173    stbi_uc codelength_sizes[19];
   4174    int i,n;
   4175 
   4176    int hlit  = stbi__zreceive(a,5) + 257;
   4177    int hdist = stbi__zreceive(a,5) + 1;
   4178    int hclen = stbi__zreceive(a,4) + 4;
   4179    int ntot  = hlit + hdist;
   4180 
   4181    memset(codelength_sizes, 0, sizeof(codelength_sizes));
   4182    for (i=0; i < hclen; ++i) {
   4183       int s = stbi__zreceive(a,3);
   4184       codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
   4185    }
   4186    if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
   4187 
   4188    n = 0;
   4189    while (n < ntot) {
   4190       int c = stbi__zhuffman_decode(a, &z_codelength);
   4191       if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
   4192       if (c < 16)
   4193          lencodes[n++] = (stbi_uc) c;
   4194       else {
   4195          stbi_uc fill = 0;
   4196          if (c == 16) {
   4197             c = stbi__zreceive(a,2)+3;
   4198             if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
   4199             fill = lencodes[n-1];
   4200          } else if (c == 17)
   4201             c = stbi__zreceive(a,3)+3;
   4202          else {
   4203             STBI_ASSERT(c == 18);
   4204             c = stbi__zreceive(a,7)+11;
   4205          }
   4206          if (ntot - n < c) return stbi__err("bad codelengths", "Corrupt PNG");
   4207          memset(lencodes+n, fill, c);
   4208          n += c;
   4209       }
   4210    }
   4211    if (n != ntot) return stbi__err("bad codelengths","Corrupt PNG");
   4212    if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
   4213    if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
   4214    return 1;
   4215 }
   4216 
   4217 static int stbi__parse_uncompressed_block(stbi__zbuf *a)
   4218 {
   4219    stbi_uc header[4];
   4220    int len,nlen,k;
   4221    if (a->num_bits & 7)
   4222       stbi__zreceive(a, a->num_bits & 7); // discard
   4223    // drain the bit-packed data into header
   4224    k = 0;
   4225    while (a->num_bits > 0) {
   4226       header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
   4227       a->code_buffer >>= 8;
   4228       a->num_bits -= 8;
   4229    }
   4230    STBI_ASSERT(a->num_bits == 0);
   4231    // now fill header the normal way
   4232    while (k < 4)
   4233       header[k++] = stbi__zget8(a);
   4234    len  = header[1] * 256 + header[0];
   4235    nlen = header[3] * 256 + header[2];
   4236    if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
   4237    if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
   4238    if (a->zout + len > a->zout_end)
   4239       if (!stbi__zexpand(a, a->zout, len)) return 0;
   4240    memcpy(a->zout, a->zbuffer, len);
   4241    a->zbuffer += len;
   4242    a->zout += len;
   4243    return 1;
   4244 }
   4245 
   4246 static int stbi__parse_zlib_header(stbi__zbuf *a)
   4247 {
   4248    int cmf   = stbi__zget8(a);
   4249    int cm    = cmf & 15;
   4250    /* int cinfo = cmf >> 4; */
   4251    int flg   = stbi__zget8(a);
   4252    if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   4253    if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
   4254    if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
   4255    // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
   4256    return 1;
   4257 }
   4258 
   4259 static const stbi_uc stbi__zdefault_length[288] =
   4260 {
   4261    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4262    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4263    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4264    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4265    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4266    9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4267    9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4268    9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4269    7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, 7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8
   4270 };
   4271 static const stbi_uc stbi__zdefault_distance[32] =
   4272 {
   4273    5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5
   4274 };
   4275 /*
   4276 Init algorithm:
   4277 {
   4278    int i;   // use <= to match clearly with spec
   4279    for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
   4280    for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
   4281    for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
   4282    for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;
   4283 
   4284    for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
   4285 }
   4286 */
   4287 
   4288 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
   4289 {
   4290    int final, type;
   4291    if (parse_header)
   4292       if (!stbi__parse_zlib_header(a)) return 0;
   4293    a->num_bits = 0;
   4294    a->code_buffer = 0;
   4295    do {
   4296       final = stbi__zreceive(a,1);
   4297       type = stbi__zreceive(a,2);
   4298       if (type == 0) {
   4299          if (!stbi__parse_uncompressed_block(a)) return 0;
   4300       } else if (type == 3) {
   4301          return 0;
   4302       } else {
   4303          if (type == 1) {
   4304             // use fixed code lengths
   4305             if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , 288)) return 0;
   4306             if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
   4307          } else {
   4308             if (!stbi__compute_huffman_codes(a)) return 0;
   4309          }
   4310          if (!stbi__parse_huffman_block(a)) return 0;
   4311       }
   4312    } while (!final);
   4313    return 1;
   4314 }
   4315 
   4316 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
   4317 {
   4318    a->zout_start = obuf;
   4319    a->zout       = obuf;
   4320    a->zout_end   = obuf + olen;
   4321    a->z_expandable = exp;
   4322 
   4323    return stbi__parse_zlib(a, parse_header);
   4324 }
   4325 
   4326 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
   4327 {
   4328    stbi__zbuf a;
   4329    char *p = (char *) stbi__malloc(initial_size);
   4330    if (p == NULL) return NULL;
   4331    a.zbuffer = (stbi_uc *) buffer;
   4332    a.zbuffer_end = (stbi_uc *) buffer + len;
   4333    if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
   4334       if (outlen) *outlen = (int) (a.zout - a.zout_start);
   4335       return a.zout_start;
   4336    } else {
   4337       STBI_FREE(a.zout_start);
   4338       return NULL;
   4339    }
   4340 }
   4341 
   4342 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
   4343 {
   4344    return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
   4345 }
   4346 
   4347 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
   4348 {
   4349    stbi__zbuf a;
   4350    char *p = (char *) stbi__malloc(initial_size);
   4351    if (p == NULL) return NULL;
   4352    a.zbuffer = (stbi_uc *) buffer;
   4353    a.zbuffer_end = (stbi_uc *) buffer + len;
   4354    if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
   4355       if (outlen) *outlen = (int) (a.zout - a.zout_start);
   4356       return a.zout_start;
   4357    } else {
   4358       STBI_FREE(a.zout_start);
   4359       return NULL;
   4360    }
   4361 }
   4362 
   4363 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
   4364 {
   4365    stbi__zbuf a;
   4366    a.zbuffer = (stbi_uc *) ibuffer;
   4367    a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   4368    if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
   4369       return (int) (a.zout - a.zout_start);
   4370    else
   4371       return -1;
   4372 }
   4373 
   4374 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
   4375 {
   4376    stbi__zbuf a;
   4377    char *p = (char *) stbi__malloc(16384);
   4378    if (p == NULL) return NULL;
   4379    a.zbuffer = (stbi_uc *) buffer;
   4380    a.zbuffer_end = (stbi_uc *) buffer+len;
   4381    if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
   4382       if (outlen) *outlen = (int) (a.zout - a.zout_start);
   4383       return a.zout_start;
   4384    } else {
   4385       STBI_FREE(a.zout_start);
   4386       return NULL;
   4387    }
   4388 }
   4389 
   4390 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
   4391 {
   4392    stbi__zbuf a;
   4393    a.zbuffer = (stbi_uc *) ibuffer;
   4394    a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   4395    if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
   4396       return (int) (a.zout - a.zout_start);
   4397    else
   4398       return -1;
   4399 }
   4400 #endif
   4401 
   4402 // public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
   4403 //    simple implementation
   4404 //      - only 8-bit samples
   4405 //      - no CRC checking
   4406 //      - allocates lots of intermediate memory
   4407 //        - avoids problem of streaming data between subsystems
   4408 //        - avoids explicit window management
   4409 //    performance
   4410 //      - uses stb_zlib, a PD zlib implementation with fast huffman decoding
   4411 
   4412 #ifndef STBI_NO_PNG
   4413 typedef struct
   4414 {
   4415    stbi__uint32 length;
   4416    stbi__uint32 type;
   4417 } stbi__pngchunk;
   4418 
   4419 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
   4420 {
   4421    stbi__pngchunk c;
   4422    c.length = stbi__get32be(s);
   4423    c.type   = stbi__get32be(s);
   4424    return c;
   4425 }
   4426 
   4427 static int stbi__check_png_header(stbi__context *s)
   4428 {
   4429    static const stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
   4430    int i;
   4431    for (i=0; i < 8; ++i)
   4432       if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
   4433    return 1;
   4434 }
   4435 
   4436 typedef struct
   4437 {
   4438    stbi__context *s;
   4439    stbi_uc *idata, *expanded, *out;
   4440    int depth;
   4441 } stbi__png;
   4442 
   4443 
   4444 enum {
   4445    STBI__F_none=0,
   4446    STBI__F_sub=1,
   4447    STBI__F_up=2,
   4448    STBI__F_avg=3,
   4449    STBI__F_paeth=4,
   4450    // synthetic filters used for first scanline to avoid needing a dummy row of 0s
   4451    STBI__F_avg_first,
   4452    STBI__F_paeth_first
   4453 };
   4454 
   4455 static stbi_uc first_row_filter[5] =
   4456 {
   4457    STBI__F_none,
   4458    STBI__F_sub,
   4459    STBI__F_none,
   4460    STBI__F_avg_first,
   4461    STBI__F_paeth_first
   4462 };
   4463 
   4464 static int stbi__paeth(int a, int b, int c)
   4465 {
   4466    int p = a + b - c;
   4467    int pa = abs(p-a);
   4468    int pb = abs(p-b);
   4469    int pc = abs(p-c);
   4470    if (pa <= pb && pa <= pc) return a;
   4471    if (pb <= pc) return b;
   4472    return c;
   4473 }
   4474 
   4475 static const stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
   4476 
   4477 // create the png data from post-deflated data
   4478 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
   4479 {
   4480    int bytes = (depth == 16? 2 : 1);
   4481    stbi__context *s = a->s;
   4482    stbi__uint32 i,j,stride = x*out_n*bytes;
   4483    stbi__uint32 img_len, img_width_bytes;
   4484    int k;
   4485    int img_n = s->img_n; // copy it into a local for later
   4486 
   4487    int output_bytes = out_n*bytes;
   4488    int filter_bytes = img_n*bytes;
   4489    int width = x;
   4490 
   4491    STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
   4492    a->out = (stbi_uc *) stbi__malloc_mad3(x, y, output_bytes, 0); // extra bytes to write off the end into
   4493    if (!a->out) return stbi__err("outofmem", "Out of memory");
   4494 
   4495    if (!stbi__mad3sizes_valid(img_n, x, depth, 7)) return stbi__err("too large", "Corrupt PNG");
   4496    img_width_bytes = (((img_n * x * depth) + 7) >> 3);
   4497    img_len = (img_width_bytes + 1) * y;
   4498 
   4499    // we used to check for exact match between raw_len and img_len on non-interlaced PNGs,
   4500    // but issue #276 reported a PNG in the wild that had extra data at the end (all zeros),
   4501    // so just check for raw_len < img_len always.
   4502    if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
   4503 
   4504    for (j=0; j < y; ++j) {
   4505       stbi_uc *cur = a->out + stride*j;
   4506       stbi_uc *prior;
   4507       int filter = *raw++;
   4508 
   4509       if (filter > 4)
   4510          return stbi__err("invalid filter","Corrupt PNG");
   4511 
   4512       if (depth < 8) {
   4513          STBI_ASSERT(img_width_bytes <= x);
   4514          cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
   4515          filter_bytes = 1;
   4516          width = img_width_bytes;
   4517       }
   4518       prior = cur - stride; // bugfix: need to compute this after 'cur +=' computation above
   4519 
   4520       // if first row, use special filter that doesn't sample previous row
   4521       if (j == 0) filter = first_row_filter[filter];
   4522 
   4523       // handle first byte explicitly
   4524       for (k=0; k < filter_bytes; ++k) {
   4525          switch (filter) {
   4526             case STBI__F_none       : cur[k] = raw[k]; break;
   4527             case STBI__F_sub        : cur[k] = raw[k]; break;
   4528             case STBI__F_up         : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
   4529             case STBI__F_avg        : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
   4530             case STBI__F_paeth      : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
   4531             case STBI__F_avg_first  : cur[k] = raw[k]; break;
   4532             case STBI__F_paeth_first: cur[k] = raw[k]; break;
   4533          }
   4534       }
   4535 
   4536       if (depth == 8) {
   4537          if (img_n != out_n)
   4538             cur[img_n] = 255; // first pixel
   4539          raw += img_n;
   4540          cur += out_n;
   4541          prior += out_n;
   4542       } else if (depth == 16) {
   4543          if (img_n != out_n) {
   4544             cur[filter_bytes]   = 255; // first pixel top byte
   4545             cur[filter_bytes+1] = 255; // first pixel bottom byte
   4546          }
   4547          raw += filter_bytes;
   4548          cur += output_bytes;
   4549          prior += output_bytes;
   4550       } else {
   4551          raw += 1;
   4552          cur += 1;
   4553          prior += 1;
   4554       }
   4555 
   4556       // this is a little gross, so that we don't switch per-pixel or per-component
   4557       if (depth < 8 || img_n == out_n) {
   4558          int nk = (width - 1)*filter_bytes;
   4559          #define STBI__CASE(f) \
   4560              case f:     \
   4561                 for (k=0; k < nk; ++k)
   4562          switch (filter) {
   4563             // "none" filter turns into a memcpy here; make that explicit.
   4564             case STBI__F_none:         memcpy(cur, raw, nk); break;
   4565             STBI__CASE(STBI__F_sub)          { cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); } break;
   4566             STBI__CASE(STBI__F_up)           { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
   4567             STBI__CASE(STBI__F_avg)          { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); } break;
   4568             STBI__CASE(STBI__F_paeth)        { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); } break;
   4569             STBI__CASE(STBI__F_avg_first)    { cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); } break;
   4570             STBI__CASE(STBI__F_paeth_first)  { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); } break;
   4571          }
   4572          #undef STBI__CASE
   4573          raw += nk;
   4574       } else {
   4575          STBI_ASSERT(img_n+1 == out_n);
   4576          #define STBI__CASE(f) \
   4577              case f:     \
   4578                 for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
   4579                    for (k=0; k < filter_bytes; ++k)
   4580          switch (filter) {
   4581             STBI__CASE(STBI__F_none)         { cur[k] = raw[k]; } break;
   4582             STBI__CASE(STBI__F_sub)          { cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); } break;
   4583             STBI__CASE(STBI__F_up)           { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
   4584             STBI__CASE(STBI__F_avg)          { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); } break;
   4585             STBI__CASE(STBI__F_paeth)        { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); } break;
   4586             STBI__CASE(STBI__F_avg_first)    { cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); } break;
   4587             STBI__CASE(STBI__F_paeth_first)  { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); } break;
   4588          }
   4589          #undef STBI__CASE
   4590 
   4591          // the loop above sets the high byte of the pixels' alpha, but for
   4592          // 16 bit png files we also need the low byte set. we'll do that here.
   4593          if (depth == 16) {
   4594             cur = a->out + stride*j; // start at the beginning of the row again
   4595             for (i=0; i < x; ++i,cur+=output_bytes) {
   4596                cur[filter_bytes+1] = 255;
   4597             }
   4598          }
   4599       }
   4600    }
   4601 
   4602    // we make a separate pass to expand bits to pixels; for performance,
   4603    // this could run two scanlines behind the above code, so it won't
   4604    // intefere with filtering but will still be in the cache.
   4605    if (depth < 8) {
   4606       for (j=0; j < y; ++j) {
   4607          stbi_uc *cur = a->out + stride*j;
   4608          stbi_uc *in  = a->out + stride*j + x*out_n - img_width_bytes;
   4609          // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
   4610          // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
   4611          stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
   4612 
   4613          // note that the final byte might overshoot and write more data than desired.
   4614          // we can allocate enough data that this never writes out of memory, but it
   4615          // could also overwrite the next scanline. can it overwrite non-empty data
   4616          // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
   4617          // so we need to explicitly clamp the final ones
   4618 
   4619          if (depth == 4) {
   4620             for (k=x*img_n; k >= 2; k-=2, ++in) {
   4621                *cur++ = scale * ((*in >> 4)       );
   4622                *cur++ = scale * ((*in     ) & 0x0f);
   4623             }
   4624             if (k > 0) *cur++ = scale * ((*in >> 4)       );
   4625          } else if (depth == 2) {
   4626             for (k=x*img_n; k >= 4; k-=4, ++in) {
   4627                *cur++ = scale * ((*in >> 6)       );
   4628                *cur++ = scale * ((*in >> 4) & 0x03);
   4629                *cur++ = scale * ((*in >> 2) & 0x03);
   4630                *cur++ = scale * ((*in     ) & 0x03);
   4631             }
   4632             if (k > 0) *cur++ = scale * ((*in >> 6)       );
   4633             if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
   4634             if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
   4635          } else if (depth == 1) {
   4636             for (k=x*img_n; k >= 8; k-=8, ++in) {
   4637                *cur++ = scale * ((*in >> 7)       );
   4638                *cur++ = scale * ((*in >> 6) & 0x01);
   4639                *cur++ = scale * ((*in >> 5) & 0x01);
   4640                *cur++ = scale * ((*in >> 4) & 0x01);
   4641                *cur++ = scale * ((*in >> 3) & 0x01);
   4642                *cur++ = scale * ((*in >> 2) & 0x01);
   4643                *cur++ = scale * ((*in >> 1) & 0x01);
   4644                *cur++ = scale * ((*in     ) & 0x01);
   4645             }
   4646             if (k > 0) *cur++ = scale * ((*in >> 7)       );
   4647             if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
   4648             if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
   4649             if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
   4650             if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
   4651             if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
   4652             if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
   4653          }
   4654          if (img_n != out_n) {
   4655             int q;
   4656             // insert alpha = 255
   4657             cur = a->out + stride*j;
   4658             if (img_n == 1) {
   4659                for (q=x-1; q >= 0; --q) {
   4660                   cur[q*2+1] = 255;
   4661                   cur[q*2+0] = cur[q];
   4662                }
   4663             } else {
   4664                STBI_ASSERT(img_n == 3);
   4665                for (q=x-1; q >= 0; --q) {
   4666                   cur[q*4+3] = 255;
   4667                   cur[q*4+2] = cur[q*3+2];
   4668                   cur[q*4+1] = cur[q*3+1];
   4669                   cur[q*4+0] = cur[q*3+0];
   4670                }
   4671             }
   4672          }
   4673       }
   4674    } else if (depth == 16) {
   4675       // force the image data from big-endian to platform-native.
   4676       // this is done in a separate pass due to the decoding relying
   4677       // on the data being untouched, but could probably be done
   4678       // per-line during decode if care is taken.
   4679       stbi_uc *cur = a->out;
   4680       stbi__uint16 *cur16 = (stbi__uint16*)cur;
   4681 
   4682       for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) {
   4683          *cur16 = (cur[0] << 8) | cur[1];
   4684       }
   4685    }
   4686 
   4687    return 1;
   4688 }
   4689 
   4690 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
   4691 {
   4692    int bytes = (depth == 16 ? 2 : 1);
   4693    int out_bytes = out_n * bytes;
   4694    stbi_uc *final;
   4695    int p;
   4696    if (!interlaced)
   4697       return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
   4698 
   4699    // de-interlacing
   4700    final = (stbi_uc *) stbi__malloc_mad3(a->s->img_x, a->s->img_y, out_bytes, 0);
   4701    for (p=0; p < 7; ++p) {
   4702       int xorig[] = { 0,4,0,2,0,1,0 };
   4703       int yorig[] = { 0,0,4,0,2,0,1 };
   4704       int xspc[]  = { 8,8,4,4,2,2,1 };
   4705       int yspc[]  = { 8,8,8,4,4,2,2 };
   4706       int i,j,x,y;
   4707       // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
   4708       x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
   4709       y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
   4710       if (x && y) {
   4711          stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
   4712          if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
   4713             STBI_FREE(final);
   4714             return 0;
   4715          }
   4716          for (j=0; j < y; ++j) {
   4717             for (i=0; i < x; ++i) {
   4718                int out_y = j*yspc[p]+yorig[p];
   4719                int out_x = i*xspc[p]+xorig[p];
   4720                memcpy(final + out_y*a->s->img_x*out_bytes + out_x*out_bytes,
   4721                       a->out + (j*x+i)*out_bytes, out_bytes);
   4722             }
   4723          }
   4724          STBI_FREE(a->out);
   4725          image_data += img_len;
   4726          image_data_len -= img_len;
   4727       }
   4728    }
   4729    a->out = final;
   4730 
   4731    return 1;
   4732 }
   4733 
   4734 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
   4735 {
   4736    stbi__context *s = z->s;
   4737    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   4738    stbi_uc *p = z->out;
   4739 
   4740    // compute color-based transparency, assuming we've
   4741    // already got 255 as the alpha value in the output
   4742    STBI_ASSERT(out_n == 2 || out_n == 4);
   4743 
   4744    if (out_n == 2) {
   4745       for (i=0; i < pixel_count; ++i) {
   4746          p[1] = (p[0] == tc[0] ? 0 : 255);
   4747          p += 2;
   4748       }
   4749    } else {
   4750       for (i=0; i < pixel_count; ++i) {
   4751          if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
   4752             p[3] = 0;
   4753          p += 4;
   4754       }
   4755    }
   4756    return 1;
   4757 }
   4758 
   4759 static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
   4760 {
   4761    stbi__context *s = z->s;
   4762    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   4763    stbi__uint16 *p = (stbi__uint16*) z->out;
   4764 
   4765    // compute color-based transparency, assuming we've
   4766    // already got 65535 as the alpha value in the output
   4767    STBI_ASSERT(out_n == 2 || out_n == 4);
   4768 
   4769    if (out_n == 2) {
   4770       for (i = 0; i < pixel_count; ++i) {
   4771          p[1] = (p[0] == tc[0] ? 0 : 65535);
   4772          p += 2;
   4773       }
   4774    } else {
   4775       for (i = 0; i < pixel_count; ++i) {
   4776          if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
   4777             p[3] = 0;
   4778          p += 4;
   4779       }
   4780    }
   4781    return 1;
   4782 }
   4783 
   4784 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
   4785 {
   4786    stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
   4787    stbi_uc *p, *temp_out, *orig = a->out;
   4788 
   4789    p = (stbi_uc *) stbi__malloc_mad2(pixel_count, pal_img_n, 0);
   4790    if (p == NULL) return stbi__err("outofmem", "Out of memory");
   4791 
   4792    // between here and free(out) below, exitting would leak
   4793    temp_out = p;
   4794 
   4795    if (pal_img_n == 3) {
   4796       for (i=0; i < pixel_count; ++i) {
   4797          int n = orig[i]*4;
   4798          p[0] = palette[n  ];
   4799          p[1] = palette[n+1];
   4800          p[2] = palette[n+2];
   4801          p += 3;
   4802       }
   4803    } else {
   4804       for (i=0; i < pixel_count; ++i) {
   4805          int n = orig[i]*4;
   4806          p[0] = palette[n  ];
   4807          p[1] = palette[n+1];
   4808          p[2] = palette[n+2];
   4809          p[3] = palette[n+3];
   4810          p += 4;
   4811       }
   4812    }
   4813    STBI_FREE(a->out);
   4814    a->out = temp_out;
   4815 
   4816    STBI_NOTUSED(len);
   4817 
   4818    return 1;
   4819 }
   4820 
   4821 static int stbi__unpremultiply_on_load = 0;
   4822 static int stbi__de_iphone_flag = 0;
   4823 
   4824 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
   4825 {
   4826    stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
   4827 }
   4828 
   4829 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
   4830 {
   4831    stbi__de_iphone_flag = flag_true_if_should_convert;
   4832 }
   4833 
   4834 static void stbi__de_iphone(stbi__png *z)
   4835 {
   4836    stbi__context *s = z->s;
   4837    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   4838    stbi_uc *p = z->out;
   4839 
   4840    if (s->img_out_n == 3) {  // convert bgr to rgb
   4841       for (i=0; i < pixel_count; ++i) {
   4842          stbi_uc t = p[0];
   4843          p[0] = p[2];
   4844          p[2] = t;
   4845          p += 3;
   4846       }
   4847    } else {
   4848       STBI_ASSERT(s->img_out_n == 4);
   4849       if (stbi__unpremultiply_on_load) {
   4850          // convert bgr to rgb and unpremultiply
   4851          for (i=0; i < pixel_count; ++i) {
   4852             stbi_uc a = p[3];
   4853             stbi_uc t = p[0];
   4854             if (a) {
   4855                stbi_uc half = a / 2;
   4856                p[0] = (p[2] * 255 + half) / a;
   4857                p[1] = (p[1] * 255 + half) / a;
   4858                p[2] = ( t   * 255 + half) / a;
   4859             } else {
   4860                p[0] = p[2];
   4861                p[2] = t;
   4862             }
   4863             p += 4;
   4864          }
   4865       } else {
   4866          // convert bgr to rgb
   4867          for (i=0; i < pixel_count; ++i) {
   4868             stbi_uc t = p[0];
   4869             p[0] = p[2];
   4870             p[2] = t;
   4871             p += 4;
   4872          }
   4873       }
   4874    }
   4875 }
   4876 
   4877 #define STBI__PNG_TYPE(a,b,c,d)  (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d))
   4878 
   4879 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
   4880 {
   4881    stbi_uc palette[1024], pal_img_n=0;
   4882    stbi_uc has_trans=0, tc[3]={0};
   4883    stbi__uint16 tc16[3];
   4884    stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
   4885    int first=1,k,interlace=0, color=0, is_iphone=0;
   4886    stbi__context *s = z->s;
   4887 
   4888    z->expanded = NULL;
   4889    z->idata = NULL;
   4890    z->out = NULL;
   4891 
   4892    if (!stbi__check_png_header(s)) return 0;
   4893 
   4894    if (scan == STBI__SCAN_type) return 1;
   4895 
   4896    for (;;) {
   4897       stbi__pngchunk c = stbi__get_chunk_header(s);
   4898       switch (c.type) {
   4899          case STBI__PNG_TYPE('C','g','B','I'):
   4900             is_iphone = 1;
   4901             stbi__skip(s, c.length);
   4902             break;
   4903          case STBI__PNG_TYPE('I','H','D','R'): {
   4904             int comp,filter;
   4905             if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
   4906             first = 0;
   4907             if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
   4908             s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
   4909             s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
   4910             z->depth = stbi__get8(s);  if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16)  return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
   4911             color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
   4912             if (color == 3 && z->depth == 16)                  return stbi__err("bad ctype","Corrupt PNG");
   4913             if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
   4914             comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
   4915             filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
   4916             interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
   4917             if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
   4918             if (!pal_img_n) {
   4919                s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
   4920                if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
   4921                if (scan == STBI__SCAN_header) return 1;
   4922             } else {
   4923                // if paletted, then pal_n is our final components, and
   4924                // img_n is # components to decompress/filter.
   4925                s->img_n = 1;
   4926                if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
   4927                // if SCAN_header, have to scan to see if we have a tRNS
   4928             }
   4929             break;
   4930          }
   4931 
   4932          case STBI__PNG_TYPE('P','L','T','E'):  {
   4933             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   4934             if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
   4935             pal_len = c.length / 3;
   4936             if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
   4937             for (i=0; i < pal_len; ++i) {
   4938                palette[i*4+0] = stbi__get8(s);
   4939                palette[i*4+1] = stbi__get8(s);
   4940                palette[i*4+2] = stbi__get8(s);
   4941                palette[i*4+3] = 255;
   4942             }
   4943             break;
   4944          }
   4945 
   4946          case STBI__PNG_TYPE('t','R','N','S'): {
   4947             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   4948             if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
   4949             if (pal_img_n) {
   4950                if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
   4951                if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
   4952                if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
   4953                pal_img_n = 4;
   4954                for (i=0; i < c.length; ++i)
   4955                   palette[i*4+3] = stbi__get8(s);
   4956             } else {
   4957                if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
   4958                if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
   4959                has_trans = 1;
   4960                if (z->depth == 16) {
   4961                   for (k = 0; k < s->img_n; ++k) tc16[k] = (stbi__uint16)stbi__get16be(s); // copy the values as-is
   4962                } else {
   4963                   for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
   4964                }
   4965             }
   4966             break;
   4967          }
   4968 
   4969          case STBI__PNG_TYPE('I','D','A','T'): {
   4970             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   4971             if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
   4972             if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
   4973             if ((int)(ioff + c.length) < (int)ioff) return 0;
   4974             if (ioff + c.length > idata_limit) {
   4975                stbi__uint32 idata_limit_old = idata_limit;
   4976                stbi_uc *p;
   4977                if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
   4978                while (ioff + c.length > idata_limit)
   4979                   idata_limit *= 2;
   4980                STBI_NOTUSED(idata_limit_old);
   4981                p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
   4982                z->idata = p;
   4983             }
   4984             if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
   4985             ioff += c.length;
   4986             break;
   4987          }
   4988 
   4989          case STBI__PNG_TYPE('I','E','N','D'): {
   4990             stbi__uint32 raw_len, bpl;
   4991             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   4992             if (scan != STBI__SCAN_load) return 1;
   4993             if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
   4994             // initial guess for decoded data size to avoid unnecessary reallocs
   4995             bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
   4996             raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
   4997             z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
   4998             if (z->expanded == NULL) return 0; // zlib should set error
   4999             STBI_FREE(z->idata); z->idata = NULL;
   5000             if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
   5001                s->img_out_n = s->img_n+1;
   5002             else
   5003                s->img_out_n = s->img_n;
   5004             if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
   5005             if (has_trans) {
   5006                if (z->depth == 16) {
   5007                   if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
   5008                } else {
   5009                   if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
   5010                }
   5011             }
   5012             if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
   5013                stbi__de_iphone(z);
   5014             if (pal_img_n) {
   5015                // pal_img_n == 3 or 4
   5016                s->img_n = pal_img_n; // record the actual colors we had
   5017                s->img_out_n = pal_img_n;
   5018                if (req_comp >= 3) s->img_out_n = req_comp;
   5019                if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
   5020                   return 0;
   5021             } else if (has_trans) {
   5022                // non-paletted image with tRNS -> source image has (constant) alpha
   5023                ++s->img_n;
   5024             }
   5025             STBI_FREE(z->expanded); z->expanded = NULL;
   5026             // end of PNG chunk, read and skip CRC
   5027             stbi__get32be(s);
   5028             return 1;
   5029          }
   5030 
   5031          default:
   5032             // if critical, fail
   5033             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5034             if ((c.type & (1 << 29)) == 0) {
   5035                #ifndef STBI_NO_FAILURE_STRINGS
   5036                // not threadsafe
   5037                static char invalid_chunk[] = "XXXX PNG chunk not known";
   5038                invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
   5039                invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
   5040                invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
   5041                invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
   5042                #endif
   5043                return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
   5044             }
   5045             stbi__skip(s, c.length);
   5046             break;
   5047       }
   5048       // end of PNG chunk, read and skip CRC
   5049       stbi__get32be(s);
   5050    }
   5051 }
   5052 
   5053 static void *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp, stbi__result_info *ri)
   5054 {
   5055    void *result=NULL;
   5056    if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   5057    if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
   5058       if (p->depth < 8)
   5059          ri->bits_per_channel = 8;
   5060       else
   5061          ri->bits_per_channel = p->depth;
   5062       result = p->out;
   5063       p->out = NULL;
   5064       if (req_comp && req_comp != p->s->img_out_n) {
   5065          if (ri->bits_per_channel == 8)
   5066             result = stbi__convert_format((unsigned char *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
   5067          else
   5068             result = stbi__convert_format16((stbi__uint16 *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
   5069          p->s->img_out_n = req_comp;
   5070          if (result == NULL) return result;
   5071       }
   5072       *x = p->s->img_x;
   5073       *y = p->s->img_y;
   5074       if (n) *n = p->s->img_n;
   5075    }
   5076    STBI_FREE(p->out);      p->out      = NULL;
   5077    STBI_FREE(p->expanded); p->expanded = NULL;
   5078    STBI_FREE(p->idata);    p->idata    = NULL;
   5079 
   5080    return result;
   5081 }
   5082 
   5083 static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   5084 {
   5085    stbi__png p;
   5086    p.s = s;
   5087    return stbi__do_png(&p, x,y,comp,req_comp, ri);
   5088 }
   5089 
   5090 static int stbi__png_test(stbi__context *s)
   5091 {
   5092    int r;
   5093    r = stbi__check_png_header(s);
   5094    stbi__rewind(s);
   5095    return r;
   5096 }
   5097 
   5098 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
   5099 {
   5100    if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
   5101       stbi__rewind( p->s );
   5102       return 0;
   5103    }
   5104    if (x) *x = p->s->img_x;
   5105    if (y) *y = p->s->img_y;
   5106    if (comp) *comp = p->s->img_n;
   5107    return 1;
   5108 }
   5109 
   5110 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
   5111 {
   5112    stbi__png p;
   5113    p.s = s;
   5114    return stbi__png_info_raw(&p, x, y, comp);
   5115 }
   5116 
   5117 static int stbi__png_is16(stbi__context *s)
   5118 {
   5119    stbi__png p;
   5120    p.s = s;
   5121    if (!stbi__png_info_raw(&p, NULL, NULL, NULL))
   5122 	   return 0;
   5123    if (p.depth != 16) {
   5124       stbi__rewind(p.s);
   5125       return 0;
   5126    }
   5127    return 1;
   5128 }
   5129 #endif
   5130 
   5131 // Microsoft/Windows BMP image
   5132 
   5133 #ifndef STBI_NO_BMP
   5134 static int stbi__bmp_test_raw(stbi__context *s)
   5135 {
   5136    int r;
   5137    int sz;
   5138    if (stbi__get8(s) != 'B') return 0;
   5139    if (stbi__get8(s) != 'M') return 0;
   5140    stbi__get32le(s); // discard filesize
   5141    stbi__get16le(s); // discard reserved
   5142    stbi__get16le(s); // discard reserved
   5143    stbi__get32le(s); // discard data offset
   5144    sz = stbi__get32le(s);
   5145    r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
   5146    return r;
   5147 }
   5148 
   5149 static int stbi__bmp_test(stbi__context *s)
   5150 {
   5151    int r = stbi__bmp_test_raw(s);
   5152    stbi__rewind(s);
   5153    return r;
   5154 }
   5155 
   5156 
   5157 // returns 0..31 for the highest set bit
   5158 static int stbi__high_bit(unsigned int z)
   5159 {
   5160    int n=0;
   5161    if (z == 0) return -1;
   5162    if (z >= 0x10000) { n += 16; z >>= 16; }
   5163    if (z >= 0x00100) { n +=  8; z >>=  8; }
   5164    if (z >= 0x00010) { n +=  4; z >>=  4; }
   5165    if (z >= 0x00004) { n +=  2; z >>=  2; }
   5166    if (z >= 0x00002) { n +=  1;/* >>=  1;*/ }
   5167    return n;
   5168 }
   5169 
   5170 static int stbi__bitcount(unsigned int a)
   5171 {
   5172    a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   5173    a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   5174    a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   5175    a = (a + (a >> 8)); // max 16 per 8 bits
   5176    a = (a + (a >> 16)); // max 32 per 8 bits
   5177    return a & 0xff;
   5178 }
   5179 
   5180 // extract an arbitrarily-aligned N-bit value (N=bits)
   5181 // from v, and then make it 8-bits long and fractionally
   5182 // extend it to full full range.
   5183 static int stbi__shiftsigned(unsigned int v, int shift, int bits)
   5184 {
   5185    static unsigned int mul_table[9] = {
   5186       0,
   5187       0xff/*0b11111111*/, 0x55/*0b01010101*/, 0x49/*0b01001001*/, 0x11/*0b00010001*/,
   5188       0x21/*0b00100001*/, 0x41/*0b01000001*/, 0x81/*0b10000001*/, 0x01/*0b00000001*/,
   5189    };
   5190    static unsigned int shift_table[9] = {
   5191       0, 0,0,1,0,2,4,6,0,
   5192    };
   5193    if (shift < 0)
   5194       v <<= -shift;
   5195    else
   5196       v >>= shift;
   5197    STBI_ASSERT(v < 256);
   5198    v >>= (8-bits);
   5199    STBI_ASSERT(bits >= 0 && bits <= 8);
   5200    return (int) ((unsigned) v * mul_table[bits]) >> shift_table[bits];
   5201 }
   5202 
   5203 typedef struct
   5204 {
   5205    int bpp, offset, hsz;
   5206    unsigned int mr,mg,mb,ma, all_a;
   5207    int extra_read;
   5208 } stbi__bmp_data;
   5209 
   5210 static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
   5211 {
   5212    int hsz;
   5213    if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
   5214    stbi__get32le(s); // discard filesize
   5215    stbi__get16le(s); // discard reserved
   5216    stbi__get16le(s); // discard reserved
   5217    info->offset = stbi__get32le(s);
   5218    info->hsz = hsz = stbi__get32le(s);
   5219    info->mr = info->mg = info->mb = info->ma = 0;
   5220    info->extra_read = 14;
   5221 
   5222    if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
   5223    if (hsz == 12) {
   5224       s->img_x = stbi__get16le(s);
   5225       s->img_y = stbi__get16le(s);
   5226    } else {
   5227       s->img_x = stbi__get32le(s);
   5228       s->img_y = stbi__get32le(s);
   5229    }
   5230    if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
   5231    info->bpp = stbi__get16le(s);
   5232    if (hsz != 12) {
   5233       int compress = stbi__get32le(s);
   5234       if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
   5235       stbi__get32le(s); // discard sizeof
   5236       stbi__get32le(s); // discard hres
   5237       stbi__get32le(s); // discard vres
   5238       stbi__get32le(s); // discard colorsused
   5239       stbi__get32le(s); // discard max important
   5240       if (hsz == 40 || hsz == 56) {
   5241          if (hsz == 56) {
   5242             stbi__get32le(s);
   5243             stbi__get32le(s);
   5244             stbi__get32le(s);
   5245             stbi__get32le(s);
   5246          }
   5247          if (info->bpp == 16 || info->bpp == 32) {
   5248             if (compress == 0) {
   5249                if (info->bpp == 32) {
   5250                   info->mr = 0xffu << 16;
   5251                   info->mg = 0xffu <<  8;
   5252                   info->mb = 0xffu <<  0;
   5253                   info->ma = 0xffu << 24;
   5254                   info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
   5255                } else {
   5256                   info->mr = 31u << 10;
   5257                   info->mg = 31u <<  5;
   5258                   info->mb = 31u <<  0;
   5259                }
   5260             } else if (compress == 3) {
   5261                info->mr = stbi__get32le(s);
   5262                info->mg = stbi__get32le(s);
   5263                info->mb = stbi__get32le(s);
   5264                info->extra_read += 12;
   5265                // not documented, but generated by photoshop and handled by mspaint
   5266                if (info->mr == info->mg && info->mg == info->mb) {
   5267                   // ?!?!?
   5268                   return stbi__errpuc("bad BMP", "bad BMP");
   5269                }
   5270             } else
   5271                return stbi__errpuc("bad BMP", "bad BMP");
   5272          }
   5273       } else {
   5274          int i;
   5275          if (hsz != 108 && hsz != 124)
   5276             return stbi__errpuc("bad BMP", "bad BMP");
   5277          info->mr = stbi__get32le(s);
   5278          info->mg = stbi__get32le(s);
   5279          info->mb = stbi__get32le(s);
   5280          info->ma = stbi__get32le(s);
   5281          stbi__get32le(s); // discard color space
   5282          for (i=0; i < 12; ++i)
   5283             stbi__get32le(s); // discard color space parameters
   5284          if (hsz == 124) {
   5285             stbi__get32le(s); // discard rendering intent
   5286             stbi__get32le(s); // discard offset of profile data
   5287             stbi__get32le(s); // discard size of profile data
   5288             stbi__get32le(s); // discard reserved
   5289          }
   5290       }
   5291    }
   5292    return (void *) 1;
   5293 }
   5294 
   5295 
   5296 static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   5297 {
   5298    stbi_uc *out;
   5299    unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
   5300    stbi_uc pal[256][4];
   5301    int psize=0,i,j,width;
   5302    int flip_vertically, pad, target;
   5303    stbi__bmp_data info;
   5304    STBI_NOTUSED(ri);
   5305 
   5306    info.all_a = 255;
   5307    if (stbi__bmp_parse_header(s, &info) == NULL)
   5308       return NULL; // error code already set
   5309 
   5310    flip_vertically = ((int) s->img_y) > 0;
   5311    s->img_y = abs((int) s->img_y);
   5312 
   5313    mr = info.mr;
   5314    mg = info.mg;
   5315    mb = info.mb;
   5316    ma = info.ma;
   5317    all_a = info.all_a;
   5318 
   5319    if (info.hsz == 12) {
   5320       if (info.bpp < 24)
   5321          psize = (info.offset - info.extra_read - 24) / 3;
   5322    } else {
   5323       if (info.bpp < 16)
   5324          psize = (info.offset - info.extra_read - info.hsz) >> 2;
   5325    }
   5326    if (psize == 0) {
   5327       STBI_ASSERT(info.offset == (s->img_buffer - s->buffer_start));
   5328    }
   5329 
   5330    if (info.bpp == 24 && ma == 0xff000000)
   5331       s->img_n = 3;
   5332    else
   5333       s->img_n = ma ? 4 : 3;
   5334    if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
   5335       target = req_comp;
   5336    else
   5337       target = s->img_n; // if they want monochrome, we'll post-convert
   5338 
   5339    // sanity-check size
   5340    if (!stbi__mad3sizes_valid(target, s->img_x, s->img_y, 0))
   5341       return stbi__errpuc("too large", "Corrupt BMP");
   5342 
   5343    out = (stbi_uc *) stbi__malloc_mad3(target, s->img_x, s->img_y, 0);
   5344    if (!out) return stbi__errpuc("outofmem", "Out of memory");
   5345    if (info.bpp < 16) {
   5346       int z=0;
   5347       if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
   5348       for (i=0; i < psize; ++i) {
   5349          pal[i][2] = stbi__get8(s);
   5350          pal[i][1] = stbi__get8(s);
   5351          pal[i][0] = stbi__get8(s);
   5352          if (info.hsz != 12) stbi__get8(s);
   5353          pal[i][3] = 255;
   5354       }
   5355       stbi__skip(s, info.offset - info.extra_read - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
   5356       if (info.bpp == 1) width = (s->img_x + 7) >> 3;
   5357       else if (info.bpp == 4) width = (s->img_x + 1) >> 1;
   5358       else if (info.bpp == 8) width = s->img_x;
   5359       else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
   5360       pad = (-width)&3;
   5361       if (info.bpp == 1) {
   5362          for (j=0; j < (int) s->img_y; ++j) {
   5363             int bit_offset = 7, v = stbi__get8(s);
   5364             for (i=0; i < (int) s->img_x; ++i) {
   5365                int color = (v>>bit_offset)&0x1;
   5366                out[z++] = pal[color][0];
   5367                out[z++] = pal[color][1];
   5368                out[z++] = pal[color][2];
   5369                if (target == 4) out[z++] = 255;
   5370                if (i+1 == (int) s->img_x) break;
   5371                if((--bit_offset) < 0) {
   5372                   bit_offset = 7;
   5373                   v = stbi__get8(s);
   5374                }
   5375             }
   5376             stbi__skip(s, pad);
   5377          }
   5378       } else {
   5379          for (j=0; j < (int) s->img_y; ++j) {
   5380             for (i=0; i < (int) s->img_x; i += 2) {
   5381                int v=stbi__get8(s),v2=0;
   5382                if (info.bpp == 4) {
   5383                   v2 = v & 15;
   5384                   v >>= 4;
   5385                }
   5386                out[z++] = pal[v][0];
   5387                out[z++] = pal[v][1];
   5388                out[z++] = pal[v][2];
   5389                if (target == 4) out[z++] = 255;
   5390                if (i+1 == (int) s->img_x) break;
   5391                v = (info.bpp == 8) ? stbi__get8(s) : v2;
   5392                out[z++] = pal[v][0];
   5393                out[z++] = pal[v][1];
   5394                out[z++] = pal[v][2];
   5395                if (target == 4) out[z++] = 255;
   5396             }
   5397             stbi__skip(s, pad);
   5398          }
   5399       }
   5400    } else {
   5401       int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
   5402       int z = 0;
   5403       int easy=0;
   5404       stbi__skip(s, info.offset - info.extra_read - info.hsz);
   5405       if (info.bpp == 24) width = 3 * s->img_x;
   5406       else if (info.bpp == 16) width = 2*s->img_x;
   5407       else /* bpp = 32 and pad = 0 */ width=0;
   5408       pad = (-width) & 3;
   5409       if (info.bpp == 24) {
   5410          easy = 1;
   5411       } else if (info.bpp == 32) {
   5412          if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
   5413             easy = 2;
   5414       }
   5415       if (!easy) {
   5416          if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
   5417          // right shift amt to put high bit in position #7
   5418          rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
   5419          gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
   5420          bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
   5421          ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
   5422       }
   5423       for (j=0; j < (int) s->img_y; ++j) {
   5424          if (easy) {
   5425             for (i=0; i < (int) s->img_x; ++i) {
   5426                unsigned char a;
   5427                out[z+2] = stbi__get8(s);
   5428                out[z+1] = stbi__get8(s);
   5429                out[z+0] = stbi__get8(s);
   5430                z += 3;
   5431                a = (easy == 2 ? stbi__get8(s) : 255);
   5432                all_a |= a;
   5433                if (target == 4) out[z++] = a;
   5434             }
   5435          } else {
   5436             int bpp = info.bpp;
   5437             for (i=0; i < (int) s->img_x; ++i) {
   5438                stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
   5439                unsigned int a;
   5440                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
   5441                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
   5442                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
   5443                a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
   5444                all_a |= a;
   5445                if (target == 4) out[z++] = STBI__BYTECAST(a);
   5446             }
   5447          }
   5448          stbi__skip(s, pad);
   5449       }
   5450    }
   5451 
   5452    // if alpha channel is all 0s, replace with all 255s
   5453    if (target == 4 && all_a == 0)
   5454       for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
   5455          out[i] = 255;
   5456 
   5457    if (flip_vertically) {
   5458       stbi_uc t;
   5459       for (j=0; j < (int) s->img_y>>1; ++j) {
   5460          stbi_uc *p1 = out +      j     *s->img_x*target;
   5461          stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
   5462          for (i=0; i < (int) s->img_x*target; ++i) {
   5463             t = p1[i]; p1[i] = p2[i]; p2[i] = t;
   5464          }
   5465       }
   5466    }
   5467 
   5468    if (req_comp && req_comp != target) {
   5469       out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
   5470       if (out == NULL) return out; // stbi__convert_format frees input on failure
   5471    }
   5472 
   5473    *x = s->img_x;
   5474    *y = s->img_y;
   5475    if (comp) *comp = s->img_n;
   5476    return out;
   5477 }
   5478 #endif
   5479 
   5480 // Targa Truevision - TGA
   5481 // by Jonathan Dummer
   5482 #ifndef STBI_NO_TGA
   5483 // returns STBI_rgb or whatever, 0 on error
   5484 static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
   5485 {
   5486    // only RGB or RGBA (incl. 16bit) or grey allowed
   5487    if (is_rgb16) *is_rgb16 = 0;
   5488    switch(bits_per_pixel) {
   5489       case 8:  return STBI_grey;
   5490       case 16: if(is_grey) return STBI_grey_alpha;
   5491                // fallthrough
   5492       case 15: if(is_rgb16) *is_rgb16 = 1;
   5493                return STBI_rgb;
   5494       case 24: // fallthrough
   5495       case 32: return bits_per_pixel/8;
   5496       default: return 0;
   5497    }
   5498 }
   5499 
   5500 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
   5501 {
   5502     int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
   5503     int sz, tga_colormap_type;
   5504     stbi__get8(s);                   // discard Offset
   5505     tga_colormap_type = stbi__get8(s); // colormap type
   5506     if( tga_colormap_type > 1 ) {
   5507         stbi__rewind(s);
   5508         return 0;      // only RGB or indexed allowed
   5509     }
   5510     tga_image_type = stbi__get8(s); // image type
   5511     if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
   5512         if (tga_image_type != 1 && tga_image_type != 9) {
   5513             stbi__rewind(s);
   5514             return 0;
   5515         }
   5516         stbi__skip(s,4);       // skip index of first colormap entry and number of entries
   5517         sz = stbi__get8(s);    //   check bits per palette color entry
   5518         if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
   5519             stbi__rewind(s);
   5520             return 0;
   5521         }
   5522         stbi__skip(s,4);       // skip image x and y origin
   5523         tga_colormap_bpp = sz;
   5524     } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
   5525         if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
   5526             stbi__rewind(s);
   5527             return 0; // only RGB or grey allowed, +/- RLE
   5528         }
   5529         stbi__skip(s,9); // skip colormap specification and image x/y origin
   5530         tga_colormap_bpp = 0;
   5531     }
   5532     tga_w = stbi__get16le(s);
   5533     if( tga_w < 1 ) {
   5534         stbi__rewind(s);
   5535         return 0;   // test width
   5536     }
   5537     tga_h = stbi__get16le(s);
   5538     if( tga_h < 1 ) {
   5539         stbi__rewind(s);
   5540         return 0;   // test height
   5541     }
   5542     tga_bits_per_pixel = stbi__get8(s); // bits per pixel
   5543     stbi__get8(s); // ignore alpha bits
   5544     if (tga_colormap_bpp != 0) {
   5545         if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
   5546             // when using a colormap, tga_bits_per_pixel is the size of the indexes
   5547             // I don't think anything but 8 or 16bit indexes makes sense
   5548             stbi__rewind(s);
   5549             return 0;
   5550         }
   5551         tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
   5552     } else {
   5553         tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
   5554     }
   5555     if(!tga_comp) {
   5556       stbi__rewind(s);
   5557       return 0;
   5558     }
   5559     if (x) *x = tga_w;
   5560     if (y) *y = tga_h;
   5561     if (comp) *comp = tga_comp;
   5562     return 1;                   // seems to have passed everything
   5563 }
   5564 
   5565 static int stbi__tga_test(stbi__context *s)
   5566 {
   5567    int res = 0;
   5568    int sz, tga_color_type;
   5569    stbi__get8(s);      //   discard Offset
   5570    tga_color_type = stbi__get8(s);   //   color type
   5571    if ( tga_color_type > 1 ) goto errorEnd;   //   only RGB or indexed allowed
   5572    sz = stbi__get8(s);   //   image type
   5573    if ( tga_color_type == 1 ) { // colormapped (paletted) image
   5574       if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
   5575       stbi__skip(s,4);       // skip index of first colormap entry and number of entries
   5576       sz = stbi__get8(s);    //   check bits per palette color entry
   5577       if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
   5578       stbi__skip(s,4);       // skip image x and y origin
   5579    } else { // "normal" image w/o colormap
   5580       if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
   5581       stbi__skip(s,9); // skip colormap specification and image x/y origin
   5582    }
   5583    if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test width
   5584    if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test height
   5585    sz = stbi__get8(s);   //   bits per pixel
   5586    if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
   5587    if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
   5588 
   5589    res = 1; // if we got this far, everything's good and we can return 1 instead of 0
   5590 
   5591 errorEnd:
   5592    stbi__rewind(s);
   5593    return res;
   5594 }
   5595 
   5596 // read 16bit value and convert to 24bit RGB
   5597 static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
   5598 {
   5599    stbi__uint16 px = (stbi__uint16)stbi__get16le(s);
   5600    stbi__uint16 fiveBitMask = 31;
   5601    // we have 3 channels with 5bits each
   5602    int r = (px >> 10) & fiveBitMask;
   5603    int g = (px >> 5) & fiveBitMask;
   5604    int b = px & fiveBitMask;
   5605    // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
   5606    out[0] = (stbi_uc)((r * 255)/31);
   5607    out[1] = (stbi_uc)((g * 255)/31);
   5608    out[2] = (stbi_uc)((b * 255)/31);
   5609 
   5610    // some people claim that the most significant bit might be used for alpha
   5611    // (possibly if an alpha-bit is set in the "image descriptor byte")
   5612    // but that only made 16bit test images completely translucent..
   5613    // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
   5614 }
   5615 
   5616 static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   5617 {
   5618    //   read in the TGA header stuff
   5619    int tga_offset = stbi__get8(s);
   5620    int tga_indexed = stbi__get8(s);
   5621    int tga_image_type = stbi__get8(s);
   5622    int tga_is_RLE = 0;
   5623    int tga_palette_start = stbi__get16le(s);
   5624    int tga_palette_len = stbi__get16le(s);
   5625    int tga_palette_bits = stbi__get8(s);
   5626    int tga_x_origin = stbi__get16le(s);
   5627    int tga_y_origin = stbi__get16le(s);
   5628    int tga_width = stbi__get16le(s);
   5629    int tga_height = stbi__get16le(s);
   5630    int tga_bits_per_pixel = stbi__get8(s);
   5631    int tga_comp, tga_rgb16=0;
   5632    int tga_inverted = stbi__get8(s);
   5633    // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
   5634    //   image data
   5635    unsigned char *tga_data;
   5636    unsigned char *tga_palette = NULL;
   5637    int i, j;
   5638    unsigned char raw_data[4] = {0};
   5639    int RLE_count = 0;
   5640    int RLE_repeating = 0;
   5641    int read_next_pixel = 1;
   5642    STBI_NOTUSED(ri);
   5643    STBI_NOTUSED(tga_x_origin); // @TODO
   5644    STBI_NOTUSED(tga_y_origin); // @TODO
   5645 
   5646    //   do a tiny bit of precessing
   5647    if ( tga_image_type >= 8 )
   5648    {
   5649       tga_image_type -= 8;
   5650       tga_is_RLE = 1;
   5651    }
   5652    tga_inverted = 1 - ((tga_inverted >> 5) & 1);
   5653 
   5654    //   If I'm paletted, then I'll use the number of bits from the palette
   5655    if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
   5656    else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
   5657 
   5658    if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
   5659       return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
   5660 
   5661    //   tga info
   5662    *x = tga_width;
   5663    *y = tga_height;
   5664    if (comp) *comp = tga_comp;
   5665 
   5666    if (!stbi__mad3sizes_valid(tga_width, tga_height, tga_comp, 0))
   5667       return stbi__errpuc("too large", "Corrupt TGA");
   5668 
   5669    tga_data = (unsigned char*)stbi__malloc_mad3(tga_width, tga_height, tga_comp, 0);
   5670    if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
   5671 
   5672    // skip to the data's starting position (offset usually = 0)
   5673    stbi__skip(s, tga_offset );
   5674 
   5675    if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
   5676       for (i=0; i < tga_height; ++i) {
   5677          int row = tga_inverted ? tga_height -i - 1 : i;
   5678          stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
   5679          stbi__getn(s, tga_row, tga_width * tga_comp);
   5680       }
   5681    } else  {
   5682       //   do I need to load a palette?
   5683       if ( tga_indexed)
   5684       {
   5685          //   any data to skip? (offset usually = 0)
   5686          stbi__skip(s, tga_palette_start );
   5687          //   load the palette
   5688          tga_palette = (unsigned char*)stbi__malloc_mad2(tga_palette_len, tga_comp, 0);
   5689          if (!tga_palette) {
   5690             STBI_FREE(tga_data);
   5691             return stbi__errpuc("outofmem", "Out of memory");
   5692          }
   5693          if (tga_rgb16) {
   5694             stbi_uc *pal_entry = tga_palette;
   5695             STBI_ASSERT(tga_comp == STBI_rgb);
   5696             for (i=0; i < tga_palette_len; ++i) {
   5697                stbi__tga_read_rgb16(s, pal_entry);
   5698                pal_entry += tga_comp;
   5699             }
   5700          } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
   5701                STBI_FREE(tga_data);
   5702                STBI_FREE(tga_palette);
   5703                return stbi__errpuc("bad palette", "Corrupt TGA");
   5704          }
   5705       }
   5706       //   load the data
   5707       for (i=0; i < tga_width * tga_height; ++i)
   5708       {
   5709          //   if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
   5710          if ( tga_is_RLE )
   5711          {
   5712             if ( RLE_count == 0 )
   5713             {
   5714                //   yep, get the next byte as a RLE command
   5715                int RLE_cmd = stbi__get8(s);
   5716                RLE_count = 1 + (RLE_cmd & 127);
   5717                RLE_repeating = RLE_cmd >> 7;
   5718                read_next_pixel = 1;
   5719             } else if ( !RLE_repeating )
   5720             {
   5721                read_next_pixel = 1;
   5722             }
   5723          } else
   5724          {
   5725             read_next_pixel = 1;
   5726          }
   5727          //   OK, if I need to read a pixel, do it now
   5728          if ( read_next_pixel )
   5729          {
   5730             //   load however much data we did have
   5731             if ( tga_indexed )
   5732             {
   5733                // read in index, then perform the lookup
   5734                int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
   5735                if ( pal_idx >= tga_palette_len ) {
   5736                   // invalid index
   5737                   pal_idx = 0;
   5738                }
   5739                pal_idx *= tga_comp;
   5740                for (j = 0; j < tga_comp; ++j) {
   5741                   raw_data[j] = tga_palette[pal_idx+j];
   5742                }
   5743             } else if(tga_rgb16) {
   5744                STBI_ASSERT(tga_comp == STBI_rgb);
   5745                stbi__tga_read_rgb16(s, raw_data);
   5746             } else {
   5747                //   read in the data raw
   5748                for (j = 0; j < tga_comp; ++j) {
   5749                   raw_data[j] = stbi__get8(s);
   5750                }
   5751             }
   5752             //   clear the reading flag for the next pixel
   5753             read_next_pixel = 0;
   5754          } // end of reading a pixel
   5755 
   5756          // copy data
   5757          for (j = 0; j < tga_comp; ++j)
   5758            tga_data[i*tga_comp+j] = raw_data[j];
   5759 
   5760          //   in case we're in RLE mode, keep counting down
   5761          --RLE_count;
   5762       }
   5763       //   do I need to invert the image?
   5764       if ( tga_inverted )
   5765       {
   5766          for (j = 0; j*2 < tga_height; ++j)
   5767          {
   5768             int index1 = j * tga_width * tga_comp;
   5769             int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
   5770             for (i = tga_width * tga_comp; i > 0; --i)
   5771             {
   5772                unsigned char temp = tga_data[index1];
   5773                tga_data[index1] = tga_data[index2];
   5774                tga_data[index2] = temp;
   5775                ++index1;
   5776                ++index2;
   5777             }
   5778          }
   5779       }
   5780       //   clear my palette, if I had one
   5781       if ( tga_palette != NULL )
   5782       {
   5783          STBI_FREE( tga_palette );
   5784       }
   5785    }
   5786 
   5787    // swap RGB - if the source data was RGB16, it already is in the right order
   5788    if (tga_comp >= 3 && !tga_rgb16)
   5789    {
   5790       unsigned char* tga_pixel = tga_data;
   5791       for (i=0; i < tga_width * tga_height; ++i)
   5792       {
   5793          unsigned char temp = tga_pixel[0];
   5794          tga_pixel[0] = tga_pixel[2];
   5795          tga_pixel[2] = temp;
   5796          tga_pixel += tga_comp;
   5797       }
   5798    }
   5799 
   5800    // convert to target component count
   5801    if (req_comp && req_comp != tga_comp)
   5802       tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
   5803 
   5804    //   the things I do to get rid of an error message, and yet keep
   5805    //   Microsoft's C compilers happy... [8^(
   5806    tga_palette_start = tga_palette_len = tga_palette_bits =
   5807          tga_x_origin = tga_y_origin = 0;
   5808    STBI_NOTUSED(tga_palette_start);
   5809    //   OK, done
   5810    return tga_data;
   5811 }
   5812 #endif
   5813 
   5814 // *************************************************************************************************
   5815 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
   5816 
   5817 #ifndef STBI_NO_PSD
   5818 static int stbi__psd_test(stbi__context *s)
   5819 {
   5820    int r = (stbi__get32be(s) == 0x38425053);
   5821    stbi__rewind(s);
   5822    return r;
   5823 }
   5824 
   5825 static int stbi__psd_decode_rle(stbi__context *s, stbi_uc *p, int pixelCount)
   5826 {
   5827    int count, nleft, len;
   5828 
   5829    count = 0;
   5830    while ((nleft = pixelCount - count) > 0) {
   5831       len = stbi__get8(s);
   5832       if (len == 128) {
   5833          // No-op.
   5834       } else if (len < 128) {
   5835          // Copy next len+1 bytes literally.
   5836          len++;
   5837          if (len > nleft) return 0; // corrupt data
   5838          count += len;
   5839          while (len) {
   5840             *p = stbi__get8(s);
   5841             p += 4;
   5842             len--;
   5843          }
   5844       } else if (len > 128) {
   5845          stbi_uc   val;
   5846          // Next -len+1 bytes in the dest are replicated from next source byte.
   5847          // (Interpret len as a negative 8-bit int.)
   5848          len = 257 - len;
   5849          if (len > nleft) return 0; // corrupt data
   5850          val = stbi__get8(s);
   5851          count += len;
   5852          while (len) {
   5853             *p = val;
   5854             p += 4;
   5855             len--;
   5856          }
   5857       }
   5858    }
   5859 
   5860    return 1;
   5861 }
   5862 
   5863 static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
   5864 {
   5865    int pixelCount;
   5866    int channelCount, compression;
   5867    int channel, i;
   5868    int bitdepth;
   5869    int w,h;
   5870    stbi_uc *out;
   5871    STBI_NOTUSED(ri);
   5872 
   5873    // Check identifier
   5874    if (stbi__get32be(s) != 0x38425053)   // "8BPS"
   5875       return stbi__errpuc("not PSD", "Corrupt PSD image");
   5876 
   5877    // Check file type version.
   5878    if (stbi__get16be(s) != 1)
   5879       return stbi__errpuc("wrong version", "Unsupported version of PSD image");
   5880 
   5881    // Skip 6 reserved bytes.
   5882    stbi__skip(s, 6 );
   5883 
   5884    // Read the number of channels (R, G, B, A, etc).
   5885    channelCount = stbi__get16be(s);
   5886    if (channelCount < 0 || channelCount > 16)
   5887       return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
   5888 
   5889    // Read the rows and columns of the image.
   5890    h = stbi__get32be(s);
   5891    w = stbi__get32be(s);
   5892 
   5893    // Make sure the depth is 8 bits.
   5894    bitdepth = stbi__get16be(s);
   5895    if (bitdepth != 8 && bitdepth != 16)
   5896       return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
   5897 
   5898    // Make sure the color mode is RGB.
   5899    // Valid options are:
   5900    //   0: Bitmap
   5901    //   1: Grayscale
   5902    //   2: Indexed color
   5903    //   3: RGB color
   5904    //   4: CMYK color
   5905    //   7: Multichannel
   5906    //   8: Duotone
   5907    //   9: Lab color
   5908    if (stbi__get16be(s) != 3)
   5909       return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
   5910 
   5911    // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
   5912    stbi__skip(s,stbi__get32be(s) );
   5913 
   5914    // Skip the image resources.  (resolution, pen tool paths, etc)
   5915    stbi__skip(s, stbi__get32be(s) );
   5916 
   5917    // Skip the reserved data.
   5918    stbi__skip(s, stbi__get32be(s) );
   5919 
   5920    // Find out if the data is compressed.
   5921    // Known values:
   5922    //   0: no compression
   5923    //   1: RLE compressed
   5924    compression = stbi__get16be(s);
   5925    if (compression > 1)
   5926       return stbi__errpuc("bad compression", "PSD has an unknown compression format");
   5927 
   5928    // Check size
   5929    if (!stbi__mad3sizes_valid(4, w, h, 0))
   5930       return stbi__errpuc("too large", "Corrupt PSD");
   5931 
   5932    // Create the destination image.
   5933 
   5934    if (!compression && bitdepth == 16 && bpc == 16) {
   5935       out = (stbi_uc *) stbi__malloc_mad3(8, w, h, 0);
   5936       ri->bits_per_channel = 16;
   5937    } else
   5938       out = (stbi_uc *) stbi__malloc(4 * w*h);
   5939 
   5940    if (!out) return stbi__errpuc("outofmem", "Out of memory");
   5941    pixelCount = w*h;
   5942 
   5943    // Initialize the data to zero.
   5944    //memset( out, 0, pixelCount * 4 );
   5945 
   5946    // Finally, the image data.
   5947    if (compression) {
   5948       // RLE as used by .PSD and .TIFF
   5949       // Loop until you get the number of unpacked bytes you are expecting:
   5950       //     Read the next source byte into n.
   5951       //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
   5952       //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
   5953       //     Else if n is 128, noop.
   5954       // Endloop
   5955 
   5956       // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
   5957       // which we're going to just skip.
   5958       stbi__skip(s, h * channelCount * 2 );
   5959 
   5960       // Read the RLE data by channel.
   5961       for (channel = 0; channel < 4; channel++) {
   5962          stbi_uc *p;
   5963 
   5964          p = out+channel;
   5965          if (channel >= channelCount) {
   5966             // Fill this channel with default data.
   5967             for (i = 0; i < pixelCount; i++, p += 4)
   5968                *p = (channel == 3 ? 255 : 0);
   5969          } else {
   5970             // Read the RLE data.
   5971             if (!stbi__psd_decode_rle(s, p, pixelCount)) {
   5972                STBI_FREE(out);
   5973                return stbi__errpuc("corrupt", "bad RLE data");
   5974             }
   5975          }
   5976       }
   5977 
   5978    } else {
   5979       // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
   5980       // where each channel consists of an 8-bit (or 16-bit) value for each pixel in the image.
   5981 
   5982       // Read the data by channel.
   5983       for (channel = 0; channel < 4; channel++) {
   5984          if (channel >= channelCount) {
   5985             // Fill this channel with default data.
   5986             if (bitdepth == 16 && bpc == 16) {
   5987                stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
   5988                stbi__uint16 val = channel == 3 ? 65535 : 0;
   5989                for (i = 0; i < pixelCount; i++, q += 4)
   5990                   *q = val;
   5991             } else {
   5992                stbi_uc *p = out+channel;
   5993                stbi_uc val = channel == 3 ? 255 : 0;
   5994                for (i = 0; i < pixelCount; i++, p += 4)
   5995                   *p = val;
   5996             }
   5997          } else {
   5998             if (ri->bits_per_channel == 16) {    // output bpc
   5999                stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
   6000                for (i = 0; i < pixelCount; i++, q += 4)
   6001                   *q = (stbi__uint16) stbi__get16be(s);
   6002             } else {
   6003                stbi_uc *p = out+channel;
   6004                if (bitdepth == 16) {  // input bpc
   6005                   for (i = 0; i < pixelCount; i++, p += 4)
   6006                      *p = (stbi_uc) (stbi__get16be(s) >> 8);
   6007                } else {
   6008                   for (i = 0; i < pixelCount; i++, p += 4)
   6009                      *p = stbi__get8(s);
   6010                }
   6011             }
   6012          }
   6013       }
   6014    }
   6015 
   6016    // remove weird white matte from PSD
   6017    if (channelCount >= 4) {
   6018       if (ri->bits_per_channel == 16) {
   6019          for (i=0; i < w*h; ++i) {
   6020             stbi__uint16 *pixel = (stbi__uint16 *) out + 4*i;
   6021             if (pixel[3] != 0 && pixel[3] != 65535) {
   6022                float a = pixel[3] / 65535.0f;
   6023                float ra = 1.0f / a;
   6024                float inv_a = 65535.0f * (1 - ra);
   6025                pixel[0] = (stbi__uint16) (pixel[0]*ra + inv_a);
   6026                pixel[1] = (stbi__uint16) (pixel[1]*ra + inv_a);
   6027                pixel[2] = (stbi__uint16) (pixel[2]*ra + inv_a);
   6028             }
   6029          }
   6030       } else {
   6031          for (i=0; i < w*h; ++i) {
   6032             unsigned char *pixel = out + 4*i;
   6033             if (pixel[3] != 0 && pixel[3] != 255) {
   6034                float a = pixel[3] / 255.0f;
   6035                float ra = 1.0f / a;
   6036                float inv_a = 255.0f * (1 - ra);
   6037                pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
   6038                pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
   6039                pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
   6040             }
   6041          }
   6042       }
   6043    }
   6044 
   6045    // convert to desired output format
   6046    if (req_comp && req_comp != 4) {
   6047       if (ri->bits_per_channel == 16)
   6048          out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, 4, req_comp, w, h);
   6049       else
   6050          out = stbi__convert_format(out, 4, req_comp, w, h);
   6051       if (out == NULL) return out; // stbi__convert_format frees input on failure
   6052    }
   6053 
   6054    if (comp) *comp = 4;
   6055    *y = h;
   6056    *x = w;
   6057 
   6058    return out;
   6059 }
   6060 #endif
   6061 
   6062 // *************************************************************************************************
   6063 // Softimage PIC loader
   6064 // by Tom Seddon
   6065 //
   6066 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
   6067 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
   6068 
   6069 #ifndef STBI_NO_PIC
   6070 static int stbi__pic_is4(stbi__context *s,const char *str)
   6071 {
   6072    int i;
   6073    for (i=0; i<4; ++i)
   6074       if (stbi__get8(s) != (stbi_uc)str[i])
   6075          return 0;
   6076 
   6077    return 1;
   6078 }
   6079 
   6080 static int stbi__pic_test_core(stbi__context *s)
   6081 {
   6082    int i;
   6083 
   6084    if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
   6085       return 0;
   6086 
   6087    for(i=0;i<84;++i)
   6088       stbi__get8(s);
   6089 
   6090    if (!stbi__pic_is4(s,"PICT"))
   6091       return 0;
   6092 
   6093    return 1;
   6094 }
   6095 
   6096 typedef struct
   6097 {
   6098    stbi_uc size,type,channel;
   6099 } stbi__pic_packet;
   6100 
   6101 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
   6102 {
   6103    int mask=0x80, i;
   6104 
   6105    for (i=0; i<4; ++i, mask>>=1) {
   6106       if (channel & mask) {
   6107          if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
   6108          dest[i]=stbi__get8(s);
   6109       }
   6110    }
   6111 
   6112    return dest;
   6113 }
   6114 
   6115 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
   6116 {
   6117    int mask=0x80,i;
   6118 
   6119    for (i=0;i<4; ++i, mask>>=1)
   6120       if (channel&mask)
   6121          dest[i]=src[i];
   6122 }
   6123 
   6124 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
   6125 {
   6126    int act_comp=0,num_packets=0,y,chained;
   6127    stbi__pic_packet packets[10];
   6128 
   6129    // this will (should...) cater for even some bizarre stuff like having data
   6130     // for the same channel in multiple packets.
   6131    do {
   6132       stbi__pic_packet *packet;
   6133 
   6134       if (num_packets==sizeof(packets)/sizeof(packets[0]))
   6135          return stbi__errpuc("bad format","too many packets");
   6136 
   6137       packet = &packets[num_packets++];
   6138 
   6139       chained = stbi__get8(s);
   6140       packet->size    = stbi__get8(s);
   6141       packet->type    = stbi__get8(s);
   6142       packet->channel = stbi__get8(s);
   6143 
   6144       act_comp |= packet->channel;
   6145 
   6146       if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
   6147       if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
   6148    } while (chained);
   6149 
   6150    *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
   6151 
   6152    for(y=0; y<height; ++y) {
   6153       int packet_idx;
   6154 
   6155       for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
   6156          stbi__pic_packet *packet = &packets[packet_idx];
   6157          stbi_uc *dest = result+y*width*4;
   6158 
   6159          switch (packet->type) {
   6160             default:
   6161                return stbi__errpuc("bad format","packet has bad compression type");
   6162 
   6163             case 0: {//uncompressed
   6164                int x;
   6165 
   6166                for(x=0;x<width;++x, dest+=4)
   6167                   if (!stbi__readval(s,packet->channel,dest))
   6168                      return 0;
   6169                break;
   6170             }
   6171 
   6172             case 1://Pure RLE
   6173                {
   6174                   int left=width, i;
   6175 
   6176                   while (left>0) {
   6177                      stbi_uc count,value[4];
   6178 
   6179                      count=stbi__get8(s);
   6180                      if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");
   6181 
   6182                      if (count > left)
   6183                         count = (stbi_uc) left;
   6184 
   6185                      if (!stbi__readval(s,packet->channel,value))  return 0;
   6186 
   6187                      for(i=0; i<count; ++i,dest+=4)
   6188                         stbi__copyval(packet->channel,dest,value);
   6189                      left -= count;
   6190                   }
   6191                }
   6192                break;
   6193 
   6194             case 2: {//Mixed RLE
   6195                int left=width;
   6196                while (left>0) {
   6197                   int count = stbi__get8(s), i;
   6198                   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");
   6199 
   6200                   if (count >= 128) { // Repeated
   6201                      stbi_uc value[4];
   6202 
   6203                      if (count==128)
   6204                         count = stbi__get16be(s);
   6205                      else
   6206                         count -= 127;
   6207                      if (count > left)
   6208                         return stbi__errpuc("bad file","scanline overrun");
   6209 
   6210                      if (!stbi__readval(s,packet->channel,value))
   6211                         return 0;
   6212 
   6213                      for(i=0;i<count;++i, dest += 4)
   6214                         stbi__copyval(packet->channel,dest,value);
   6215                   } else { // Raw
   6216                      ++count;
   6217                      if (count>left) return stbi__errpuc("bad file","scanline overrun");
   6218 
   6219                      for(i=0;i<count;++i, dest+=4)
   6220                         if (!stbi__readval(s,packet->channel,dest))
   6221                            return 0;
   6222                   }
   6223                   left-=count;
   6224                }
   6225                break;
   6226             }
   6227          }
   6228       }
   6229    }
   6230 
   6231    return result;
   6232 }
   6233 
   6234 static void *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp, stbi__result_info *ri)
   6235 {
   6236    stbi_uc *result;
   6237    int i, x,y, internal_comp;
   6238    STBI_NOTUSED(ri);
   6239 
   6240    if (!comp) comp = &internal_comp;
   6241 
   6242    for (i=0; i<92; ++i)
   6243       stbi__get8(s);
   6244 
   6245    x = stbi__get16be(s);
   6246    y = stbi__get16be(s);
   6247    if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
   6248    if (!stbi__mad3sizes_valid(x, y, 4, 0)) return stbi__errpuc("too large", "PIC image too large to decode");
   6249 
   6250    stbi__get32be(s); //skip `ratio'
   6251    stbi__get16be(s); //skip `fields'
   6252    stbi__get16be(s); //skip `pad'
   6253 
   6254    // intermediate buffer is RGBA
   6255    result = (stbi_uc *) stbi__malloc_mad3(x, y, 4, 0);
   6256    memset(result, 0xff, x*y*4);
   6257 
   6258    if (!stbi__pic_load_core(s,x,y,comp, result)) {
   6259       STBI_FREE(result);
   6260       result=0;
   6261    }
   6262    *px = x;
   6263    *py = y;
   6264    if (req_comp == 0) req_comp = *comp;
   6265    result=stbi__convert_format(result,4,req_comp,x,y);
   6266 
   6267    return result;
   6268 }
   6269 
   6270 static int stbi__pic_test(stbi__context *s)
   6271 {
   6272    int r = stbi__pic_test_core(s);
   6273    stbi__rewind(s);
   6274    return r;
   6275 }
   6276 #endif
   6277 
   6278 // *************************************************************************************************
   6279 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
   6280 
   6281 #ifndef STBI_NO_GIF
   6282 typedef struct
   6283 {
   6284    stbi__int16 prefix;
   6285    stbi_uc first;
   6286    stbi_uc suffix;
   6287 } stbi__gif_lzw;
   6288 
   6289 typedef struct
   6290 {
   6291    int w,h;
   6292    stbi_uc *out;                 // output buffer (always 4 components)
   6293    stbi_uc *background;          // The current "background" as far as a gif is concerned
   6294    stbi_uc *history;
   6295    int flags, bgindex, ratio, transparent, eflags;
   6296    stbi_uc  pal[256][4];
   6297    stbi_uc lpal[256][4];
   6298    stbi__gif_lzw codes[8192];
   6299    stbi_uc *color_table;
   6300    int parse, step;
   6301    int lflags;
   6302    int start_x, start_y;
   6303    int max_x, max_y;
   6304    int cur_x, cur_y;
   6305    int line_size;
   6306    int delay;
   6307 } stbi__gif;
   6308 
   6309 static int stbi__gif_test_raw(stbi__context *s)
   6310 {
   6311    int sz;
   6312    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
   6313    sz = stbi__get8(s);
   6314    if (sz != '9' && sz != '7') return 0;
   6315    if (stbi__get8(s) != 'a') return 0;
   6316    return 1;
   6317 }
   6318 
   6319 static int stbi__gif_test(stbi__context *s)
   6320 {
   6321    int r = stbi__gif_test_raw(s);
   6322    stbi__rewind(s);
   6323    return r;
   6324 }
   6325 
   6326 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
   6327 {
   6328    int i;
   6329    for (i=0; i < num_entries; ++i) {
   6330       pal[i][2] = stbi__get8(s);
   6331       pal[i][1] = stbi__get8(s);
   6332       pal[i][0] = stbi__get8(s);
   6333       pal[i][3] = transp == i ? 0 : 255;
   6334    }
   6335 }
   6336 
   6337 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
   6338 {
   6339    stbi_uc version;
   6340    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
   6341       return stbi__err("not GIF", "Corrupt GIF");
   6342 
   6343    version = stbi__get8(s);
   6344    if (version != '7' && version != '9')    return stbi__err("not GIF", "Corrupt GIF");
   6345    if (stbi__get8(s) != 'a')                return stbi__err("not GIF", "Corrupt GIF");
   6346 
   6347    stbi__g_failure_reason = "";
   6348    g->w = stbi__get16le(s);
   6349    g->h = stbi__get16le(s);
   6350    g->flags = stbi__get8(s);
   6351    g->bgindex = stbi__get8(s);
   6352    g->ratio = stbi__get8(s);
   6353    g->transparent = -1;
   6354 
   6355    if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments
   6356 
   6357    if (is_info) return 1;
   6358 
   6359    if (g->flags & 0x80)
   6360       stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
   6361 
   6362    return 1;
   6363 }
   6364 
   6365 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
   6366 {
   6367    stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
   6368    if (!stbi__gif_header(s, g, comp, 1)) {
   6369       STBI_FREE(g);
   6370       stbi__rewind( s );
   6371       return 0;
   6372    }
   6373    if (x) *x = g->w;
   6374    if (y) *y = g->h;
   6375    STBI_FREE(g);
   6376    return 1;
   6377 }
   6378 
   6379 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
   6380 {
   6381    stbi_uc *p, *c;
   6382    int idx;
   6383 
   6384    // recurse to decode the prefixes, since the linked-list is backwards,
   6385    // and working backwards through an interleaved image would be nasty
   6386    if (g->codes[code].prefix >= 0)
   6387       stbi__out_gif_code(g, g->codes[code].prefix);
   6388 
   6389    if (g->cur_y >= g->max_y) return;
   6390 
   6391    idx = g->cur_x + g->cur_y;
   6392    p = &g->out[idx];
   6393    g->history[idx / 4] = 1;
   6394 
   6395    c = &g->color_table[g->codes[code].suffix * 4];
   6396    if (c[3] > 128) { // don't render transparent pixels;
   6397       p[0] = c[2];
   6398       p[1] = c[1];
   6399       p[2] = c[0];
   6400       p[3] = c[3];
   6401    }
   6402    g->cur_x += 4;
   6403 
   6404    if (g->cur_x >= g->max_x) {
   6405       g->cur_x = g->start_x;
   6406       g->cur_y += g->step;
   6407 
   6408       while (g->cur_y >= g->max_y && g->parse > 0) {
   6409          g->step = (1 << g->parse) * g->line_size;
   6410          g->cur_y = g->start_y + (g->step >> 1);
   6411          --g->parse;
   6412       }
   6413    }
   6414 }
   6415 
   6416 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
   6417 {
   6418    stbi_uc lzw_cs;
   6419    stbi__int32 len, init_code;
   6420    stbi__uint32 first;
   6421    stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
   6422    stbi__gif_lzw *p;
   6423 
   6424    lzw_cs = stbi__get8(s);
   6425    if (lzw_cs > 12) return NULL;
   6426    clear = 1 << lzw_cs;
   6427    first = 1;
   6428    codesize = lzw_cs + 1;
   6429    codemask = (1 << codesize) - 1;
   6430    bits = 0;
   6431    valid_bits = 0;
   6432    for (init_code = 0; init_code < clear; init_code++) {
   6433       g->codes[init_code].prefix = -1;
   6434       g->codes[init_code].first = (stbi_uc) init_code;
   6435       g->codes[init_code].suffix = (stbi_uc) init_code;
   6436    }
   6437 
   6438    // support no starting clear code
   6439    avail = clear+2;
   6440    oldcode = -1;
   6441 
   6442    len = 0;
   6443    for(;;) {
   6444       if (valid_bits < codesize) {
   6445          if (len == 0) {
   6446             len = stbi__get8(s); // start new block
   6447             if (len == 0)
   6448                return g->out;
   6449          }
   6450          --len;
   6451          bits |= (stbi__int32) stbi__get8(s) << valid_bits;
   6452          valid_bits += 8;
   6453       } else {
   6454          stbi__int32 code = bits & codemask;
   6455          bits >>= codesize;
   6456          valid_bits -= codesize;
   6457          // @OPTIMIZE: is there some way we can accelerate the non-clear path?
   6458          if (code == clear) {  // clear code
   6459             codesize = lzw_cs + 1;
   6460             codemask = (1 << codesize) - 1;
   6461             avail = clear + 2;
   6462             oldcode = -1;
   6463             first = 0;
   6464          } else if (code == clear + 1) { // end of stream code
   6465             stbi__skip(s, len);
   6466             while ((len = stbi__get8(s)) > 0)
   6467                stbi__skip(s,len);
   6468             return g->out;
   6469          } else if (code <= avail) {
   6470             if (first) {
   6471                return stbi__errpuc("no clear code", "Corrupt GIF");
   6472             }
   6473 
   6474             if (oldcode >= 0) {
   6475                p = &g->codes[avail++];
   6476                if (avail > 8192) {
   6477                   return stbi__errpuc("too many codes", "Corrupt GIF");
   6478                }
   6479 
   6480                p->prefix = (stbi__int16) oldcode;
   6481                p->first = g->codes[oldcode].first;
   6482                p->suffix = (code == avail) ? p->first : g->codes[code].first;
   6483             } else if (code == avail)
   6484                return stbi__errpuc("illegal code in raster", "Corrupt GIF");
   6485 
   6486             stbi__out_gif_code(g, (stbi__uint16) code);
   6487 
   6488             if ((avail & codemask) == 0 && avail <= 0x0FFF) {
   6489                codesize++;
   6490                codemask = (1 << codesize) - 1;
   6491             }
   6492 
   6493             oldcode = code;
   6494          } else {
   6495             return stbi__errpuc("illegal code in raster", "Corrupt GIF");
   6496          }
   6497       }
   6498    }
   6499 }
   6500 
   6501 // this function is designed to support animated gifs, although stb_image doesn't support it
   6502 // two back is the image from two frames ago, used for a very specific disposal format
   6503 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp, stbi_uc *two_back)
   6504 {
   6505    int dispose;
   6506    int first_frame;
   6507    int pi;
   6508    int pcount;
   6509    STBI_NOTUSED(req_comp);
   6510 
   6511    // on first frame, any non-written pixels get the background colour (non-transparent)
   6512    first_frame = 0;
   6513    if (g->out == 0) {
   6514       if (!stbi__gif_header(s, g, comp,0)) return 0; // stbi__g_failure_reason set by stbi__gif_header
   6515       if (!stbi__mad3sizes_valid(4, g->w, g->h, 0))
   6516          return stbi__errpuc("too large", "GIF image is too large");
   6517       pcount = g->w * g->h;
   6518       g->out = (stbi_uc *) stbi__malloc(4 * pcount);
   6519       g->background = (stbi_uc *) stbi__malloc(4 * pcount);
   6520       g->history = (stbi_uc *) stbi__malloc(pcount);
   6521       if (!g->out || !g->background || !g->history)
   6522          return stbi__errpuc("outofmem", "Out of memory");
   6523 
   6524       // image is treated as "transparent" at the start - ie, nothing overwrites the current background;
   6525       // background colour is only used for pixels that are not rendered first frame, after that "background"
   6526       // color refers to the color that was there the previous frame.
   6527       memset(g->out, 0x00, 4 * pcount);
   6528       memset(g->background, 0x00, 4 * pcount); // state of the background (starts transparent)
   6529       memset(g->history, 0x00, pcount);        // pixels that were affected previous frame
   6530       first_frame = 1;
   6531    } else {
   6532       // second frame - how do we dispoase of the previous one?
   6533       dispose = (g->eflags & 0x1C) >> 2;
   6534       pcount = g->w * g->h;
   6535 
   6536       if ((dispose == 3) && (two_back == 0)) {
   6537          dispose = 2; // if I don't have an image to revert back to, default to the old background
   6538       }
   6539 
   6540       if (dispose == 3) { // use previous graphic
   6541          for (pi = 0; pi < pcount; ++pi) {
   6542             if (g->history[pi]) {
   6543                memcpy( &g->out[pi * 4], &two_back[pi * 4], 4 );
   6544             }
   6545          }
   6546       } else if (dispose == 2) {
   6547          // restore what was changed last frame to background before that frame;
   6548          for (pi = 0; pi < pcount; ++pi) {
   6549             if (g->history[pi]) {
   6550                memcpy( &g->out[pi * 4], &g->background[pi * 4], 4 );
   6551             }
   6552          }
   6553       } else {
   6554          // This is a non-disposal case eithe way, so just
   6555          // leave the pixels as is, and they will become the new background
   6556          // 1: do not dispose
   6557          // 0:  not specified.
   6558       }
   6559 
   6560       // background is what out is after the undoing of the previou frame;
   6561       memcpy( g->background, g->out, 4 * g->w * g->h );
   6562    }
   6563 
   6564    // clear my history;
   6565    memset( g->history, 0x00, g->w * g->h );        // pixels that were affected previous frame
   6566 
   6567    for (;;) {
   6568       int tag = stbi__get8(s);
   6569       switch (tag) {
   6570          case 0x2C: /* Image Descriptor */
   6571          {
   6572             stbi__int32 x, y, w, h;
   6573             stbi_uc *o;
   6574 
   6575             x = stbi__get16le(s);
   6576             y = stbi__get16le(s);
   6577             w = stbi__get16le(s);
   6578             h = stbi__get16le(s);
   6579             if (((x + w) > (g->w)) || ((y + h) > (g->h)))
   6580                return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
   6581 
   6582             g->line_size = g->w * 4;
   6583             g->start_x = x * 4;
   6584             g->start_y = y * g->line_size;
   6585             g->max_x   = g->start_x + w * 4;
   6586             g->max_y   = g->start_y + h * g->line_size;
   6587             g->cur_x   = g->start_x;
   6588             g->cur_y   = g->start_y;
   6589 
   6590             // if the width of the specified rectangle is 0, that means
   6591             // we may not see *any* pixels or the image is malformed;
   6592             // to make sure this is caught, move the current y down to
   6593             // max_y (which is what out_gif_code checks).
   6594             if (w == 0)
   6595                g->cur_y = g->max_y;
   6596 
   6597             g->lflags = stbi__get8(s);
   6598 
   6599             if (g->lflags & 0x40) {
   6600                g->step = 8 * g->line_size; // first interlaced spacing
   6601                g->parse = 3;
   6602             } else {
   6603                g->step = g->line_size;
   6604                g->parse = 0;
   6605             }
   6606 
   6607             if (g->lflags & 0x80) {
   6608                stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
   6609                g->color_table = (stbi_uc *) g->lpal;
   6610             } else if (g->flags & 0x80) {
   6611                g->color_table = (stbi_uc *) g->pal;
   6612             } else
   6613                return stbi__errpuc("missing color table", "Corrupt GIF");
   6614 
   6615             o = stbi__process_gif_raster(s, g);
   6616             if (!o) return NULL;
   6617 
   6618             // if this was the first frame,
   6619             pcount = g->w * g->h;
   6620             if (first_frame && (g->bgindex > 0)) {
   6621                // if first frame, any pixel not drawn to gets the background color
   6622                for (pi = 0; pi < pcount; ++pi) {
   6623                   if (g->history[pi] == 0) {
   6624                      g->pal[g->bgindex][3] = 255; // just in case it was made transparent, undo that; It will be reset next frame if need be;
   6625                      memcpy( &g->out[pi * 4], &g->pal[g->bgindex], 4 );
   6626                   }
   6627                }
   6628             }
   6629 
   6630             return o;
   6631          }
   6632 
   6633          case 0x21: // Comment Extension.
   6634          {
   6635             int len;
   6636             int ext = stbi__get8(s);
   6637             if (ext == 0xF9) { // Graphic Control Extension.
   6638                len = stbi__get8(s);
   6639                if (len == 4) {
   6640                   g->eflags = stbi__get8(s);
   6641                   g->delay = 10 * stbi__get16le(s); // delay - 1/100th of a second, saving as 1/1000ths.
   6642 
   6643                   // unset old transparent
   6644                   if (g->transparent >= 0) {
   6645                      g->pal[g->transparent][3] = 255;
   6646                   }
   6647                   if (g->eflags & 0x01) {
   6648                      g->transparent = stbi__get8(s);
   6649                      if (g->transparent >= 0) {
   6650                         g->pal[g->transparent][3] = 0;
   6651                      }
   6652                   } else {
   6653                      // don't need transparent
   6654                      stbi__skip(s, 1);
   6655                      g->transparent = -1;
   6656                   }
   6657                } else {
   6658                   stbi__skip(s, len);
   6659                   break;
   6660                }
   6661             }
   6662             while ((len = stbi__get8(s)) != 0) {
   6663                stbi__skip(s, len);
   6664             }
   6665             break;
   6666          }
   6667 
   6668          case 0x3B: // gif stream termination code
   6669             return (stbi_uc *) s; // using '1' causes warning on some compilers
   6670 
   6671          default:
   6672             return stbi__errpuc("unknown code", "Corrupt GIF");
   6673       }
   6674    }
   6675 }
   6676 
   6677 static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
   6678 {
   6679    if (stbi__gif_test(s)) {
   6680       int layers = 0;
   6681       stbi_uc *u = 0;
   6682       stbi_uc *out = 0;
   6683       stbi_uc *two_back = 0;
   6684       stbi__gif g;
   6685       int stride;
   6686       memset(&g, 0, sizeof(g));
   6687       if (delays) {
   6688          *delays = 0;
   6689       }
   6690 
   6691       do {
   6692          u = stbi__gif_load_next(s, &g, comp, req_comp, two_back);
   6693          if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
   6694 
   6695          if (u) {
   6696             *x = g.w;
   6697             *y = g.h;
   6698             ++layers;
   6699             stride = g.w * g.h * 4;
   6700 
   6701             if (out) {
   6702                void *tmp = (stbi_uc*) STBI_REALLOC( out, layers * stride );
   6703                if (NULL == tmp) {
   6704                   STBI_FREE(g.out);
   6705                   STBI_FREE(g.history);
   6706                   STBI_FREE(g.background);
   6707                   return stbi__errpuc("outofmem", "Out of memory");
   6708                }
   6709                else
   6710                   out = (stbi_uc*) tmp;
   6711                if (delays) {
   6712                   *delays = (int*) STBI_REALLOC( *delays, sizeof(int) * layers );
   6713                }
   6714             } else {
   6715                out = (stbi_uc*)stbi__malloc( layers * stride );
   6716                if (delays) {
   6717                   *delays = (int*) stbi__malloc( layers * sizeof(int) );
   6718                }
   6719             }
   6720             memcpy( out + ((layers - 1) * stride), u, stride );
   6721             if (layers >= 2) {
   6722                two_back = out - 2 * stride;
   6723             }
   6724 
   6725             if (delays) {
   6726                (*delays)[layers - 1U] = g.delay;
   6727             }
   6728          }
   6729       } while (u != 0);
   6730 
   6731       // free temp buffer;
   6732       STBI_FREE(g.out);
   6733       STBI_FREE(g.history);
   6734       STBI_FREE(g.background);
   6735 
   6736       // do the final conversion after loading everything;
   6737       if (req_comp && req_comp != 4)
   6738          out = stbi__convert_format(out, 4, req_comp, layers * g.w, g.h);
   6739 
   6740       *z = layers;
   6741       return out;
   6742    } else {
   6743       return stbi__errpuc("not GIF", "Image was not as a gif type.");
   6744    }
   6745 }
   6746 
   6747 static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   6748 {
   6749    stbi_uc *u = 0;
   6750    stbi__gif g;
   6751    memset(&g, 0, sizeof(g));
   6752    STBI_NOTUSED(ri);
   6753 
   6754    u = stbi__gif_load_next(s, &g, comp, req_comp, 0);
   6755    if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
   6756    if (u) {
   6757       *x = g.w;
   6758       *y = g.h;
   6759 
   6760       // moved conversion to after successful load so that the same
   6761       // can be done for multiple frames.
   6762       if (req_comp && req_comp != 4)
   6763          u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
   6764    } else if (g.out) {
   6765       // if there was an error and we allocated an image buffer, free it!
   6766       STBI_FREE(g.out);
   6767    }
   6768 
   6769    // free buffers needed for multiple frame loading;
   6770    STBI_FREE(g.history);
   6771    STBI_FREE(g.background);
   6772 
   6773    return u;
   6774 }
   6775 
   6776 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
   6777 {
   6778    return stbi__gif_info_raw(s,x,y,comp);
   6779 }
   6780 #endif
   6781 
   6782 // *************************************************************************************************
   6783 // Radiance RGBE HDR loader
   6784 // originally by Nicolas Schulz
   6785 #ifndef STBI_NO_HDR
   6786 static int stbi__hdr_test_core(stbi__context *s, const char *signature)
   6787 {
   6788    int i;
   6789    for (i=0; signature[i]; ++i)
   6790       if (stbi__get8(s) != signature[i])
   6791           return 0;
   6792    stbi__rewind(s);
   6793    return 1;
   6794 }
   6795 
   6796 static int stbi__hdr_test(stbi__context* s)
   6797 {
   6798    int r = stbi__hdr_test_core(s, "#?RADIANCE\n");
   6799    stbi__rewind(s);
   6800    if(!r) {
   6801        r = stbi__hdr_test_core(s, "#?RGBE\n");
   6802        stbi__rewind(s);
   6803    }
   6804    return r;
   6805 }
   6806 
   6807 #define STBI__HDR_BUFLEN  1024
   6808 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
   6809 {
   6810    int len=0;
   6811    char c = '\0';
   6812 
   6813    c = (char) stbi__get8(z);
   6814 
   6815    while (!stbi__at_eof(z) && c != '\n') {
   6816       buffer[len++] = c;
   6817       if (len == STBI__HDR_BUFLEN-1) {
   6818          // flush to end of line
   6819          while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
   6820             ;
   6821          break;
   6822       }
   6823       c = (char) stbi__get8(z);
   6824    }
   6825 
   6826    buffer[len] = 0;
   6827    return buffer;
   6828 }
   6829 
   6830 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
   6831 {
   6832    if ( input[3] != 0 ) {
   6833       float f1;
   6834       // Exponent
   6835       f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
   6836       if (req_comp <= 2)
   6837          output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
   6838       else {
   6839          output[0] = input[0] * f1;
   6840          output[1] = input[1] * f1;
   6841          output[2] = input[2] * f1;
   6842       }
   6843       if (req_comp == 2) output[1] = 1;
   6844       if (req_comp == 4) output[3] = 1;
   6845    } else {
   6846       switch (req_comp) {
   6847          case 4: output[3] = 1; /* fallthrough */
   6848          case 3: output[0] = output[1] = output[2] = 0;
   6849                  break;
   6850          case 2: output[1] = 1; /* fallthrough */
   6851          case 1: output[0] = 0;
   6852                  break;
   6853       }
   6854    }
   6855 }
   6856 
   6857 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   6858 {
   6859    char buffer[STBI__HDR_BUFLEN];
   6860    char *token;
   6861    int valid = 0;
   6862    int width, height;
   6863    stbi_uc *scanline;
   6864    float *hdr_data;
   6865    int len;
   6866    unsigned char count, value;
   6867    int i, j, k, c1,c2, z;
   6868    const char *headerToken;
   6869    STBI_NOTUSED(ri);
   6870 
   6871    // Check identifier
   6872    headerToken = stbi__hdr_gettoken(s,buffer);
   6873    if (strcmp(headerToken, "#?RADIANCE") != 0 && strcmp(headerToken, "#?RGBE") != 0)
   6874       return stbi__errpf("not HDR", "Corrupt HDR image");
   6875 
   6876    // Parse header
   6877    for(;;) {
   6878       token = stbi__hdr_gettoken(s,buffer);
   6879       if (token[0] == 0) break;
   6880       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   6881    }
   6882 
   6883    if (!valid)    return stbi__errpf("unsupported format", "Unsupported HDR format");
   6884 
   6885    // Parse width and height
   6886    // can't use sscanf() if we're not using stdio!
   6887    token = stbi__hdr_gettoken(s,buffer);
   6888    if (strncmp(token, "-Y ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   6889    token += 3;
   6890    height = (int) strtol(token, &token, 10);
   6891    while (*token == ' ') ++token;
   6892    if (strncmp(token, "+X ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   6893    token += 3;
   6894    width = (int) strtol(token, NULL, 10);
   6895 
   6896    *x = width;
   6897    *y = height;
   6898 
   6899    if (comp) *comp = 3;
   6900    if (req_comp == 0) req_comp = 3;
   6901 
   6902    if (!stbi__mad4sizes_valid(width, height, req_comp, sizeof(float), 0))
   6903       return stbi__errpf("too large", "HDR image is too large");
   6904 
   6905    // Read data
   6906    hdr_data = (float *) stbi__malloc_mad4(width, height, req_comp, sizeof(float), 0);
   6907    if (!hdr_data)
   6908       return stbi__errpf("outofmem", "Out of memory");
   6909 
   6910    // Load image data
   6911    // image data is stored as some number of sca
   6912    if ( width < 8 || width >= 32768) {
   6913       // Read flat data
   6914       for (j=0; j < height; ++j) {
   6915          for (i=0; i < width; ++i) {
   6916             stbi_uc rgbe[4];
   6917            main_decode_loop:
   6918             stbi__getn(s, rgbe, 4);
   6919             stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
   6920          }
   6921       }
   6922    } else {
   6923       // Read RLE-encoded data
   6924       scanline = NULL;
   6925 
   6926       for (j = 0; j < height; ++j) {
   6927          c1 = stbi__get8(s);
   6928          c2 = stbi__get8(s);
   6929          len = stbi__get8(s);
   6930          if (c1 != 2 || c2 != 2 || (len & 0x80)) {
   6931             // not run-length encoded, so we have to actually use THIS data as a decoded
   6932             // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
   6933             stbi_uc rgbe[4];
   6934             rgbe[0] = (stbi_uc) c1;
   6935             rgbe[1] = (stbi_uc) c2;
   6936             rgbe[2] = (stbi_uc) len;
   6937             rgbe[3] = (stbi_uc) stbi__get8(s);
   6938             stbi__hdr_convert(hdr_data, rgbe, req_comp);
   6939             i = 1;
   6940             j = 0;
   6941             STBI_FREE(scanline);
   6942             goto main_decode_loop; // yes, this makes no sense
   6943          }
   6944          len <<= 8;
   6945          len |= stbi__get8(s);
   6946          if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
   6947          if (scanline == NULL) {
   6948             scanline = (stbi_uc *) stbi__malloc_mad2(width, 4, 0);
   6949             if (!scanline) {
   6950                STBI_FREE(hdr_data);
   6951                return stbi__errpf("outofmem", "Out of memory");
   6952             }
   6953          }
   6954 
   6955          for (k = 0; k < 4; ++k) {
   6956             int nleft;
   6957             i = 0;
   6958             while ((nleft = width - i) > 0) {
   6959                count = stbi__get8(s);
   6960                if (count > 128) {
   6961                   // Run
   6962                   value = stbi__get8(s);
   6963                   count -= 128;
   6964                   if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
   6965                   for (z = 0; z < count; ++z)
   6966                      scanline[i++ * 4 + k] = value;
   6967                } else {
   6968                   // Dump
   6969                   if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
   6970                   for (z = 0; z < count; ++z)
   6971                      scanline[i++ * 4 + k] = stbi__get8(s);
   6972                }
   6973             }
   6974          }
   6975          for (i=0; i < width; ++i)
   6976             stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
   6977       }
   6978       if (scanline)
   6979          STBI_FREE(scanline);
   6980    }
   6981 
   6982    return hdr_data;
   6983 }
   6984 
   6985 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
   6986 {
   6987    char buffer[STBI__HDR_BUFLEN];
   6988    char *token;
   6989    int valid = 0;
   6990    int dummy;
   6991 
   6992    if (!x) x = &dummy;
   6993    if (!y) y = &dummy;
   6994    if (!comp) comp = &dummy;
   6995 
   6996    if (stbi__hdr_test(s) == 0) {
   6997        stbi__rewind( s );
   6998        return 0;
   6999    }
   7000 
   7001    for(;;) {
   7002       token = stbi__hdr_gettoken(s,buffer);
   7003       if (token[0] == 0) break;
   7004       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   7005    }
   7006 
   7007    if (!valid) {
   7008        stbi__rewind( s );
   7009        return 0;
   7010    }
   7011    token = stbi__hdr_gettoken(s,buffer);
   7012    if (strncmp(token, "-Y ", 3)) {
   7013        stbi__rewind( s );
   7014        return 0;
   7015    }
   7016    token += 3;
   7017    *y = (int) strtol(token, &token, 10);
   7018    while (*token == ' ') ++token;
   7019    if (strncmp(token, "+X ", 3)) {
   7020        stbi__rewind( s );
   7021        return 0;
   7022    }
   7023    token += 3;
   7024    *x = (int) strtol(token, NULL, 10);
   7025    *comp = 3;
   7026    return 1;
   7027 }
   7028 #endif // STBI_NO_HDR
   7029 
   7030 #ifndef STBI_NO_BMP
   7031 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
   7032 {
   7033    void *p;
   7034    stbi__bmp_data info;
   7035 
   7036    info.all_a = 255;
   7037    p = stbi__bmp_parse_header(s, &info);
   7038    stbi__rewind( s );
   7039    if (p == NULL)
   7040       return 0;
   7041    if (x) *x = s->img_x;
   7042    if (y) *y = s->img_y;
   7043    if (comp) {
   7044       if (info.bpp == 24 && info.ma == 0xff000000)
   7045          *comp = 3;
   7046       else
   7047          *comp = info.ma ? 4 : 3;
   7048    }
   7049    return 1;
   7050 }
   7051 #endif
   7052 
   7053 #ifndef STBI_NO_PSD
   7054 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
   7055 {
   7056    int channelCount, dummy, depth;
   7057    if (!x) x = &dummy;
   7058    if (!y) y = &dummy;
   7059    if (!comp) comp = &dummy;
   7060    if (stbi__get32be(s) != 0x38425053) {
   7061        stbi__rewind( s );
   7062        return 0;
   7063    }
   7064    if (stbi__get16be(s) != 1) {
   7065        stbi__rewind( s );
   7066        return 0;
   7067    }
   7068    stbi__skip(s, 6);
   7069    channelCount = stbi__get16be(s);
   7070    if (channelCount < 0 || channelCount > 16) {
   7071        stbi__rewind( s );
   7072        return 0;
   7073    }
   7074    *y = stbi__get32be(s);
   7075    *x = stbi__get32be(s);
   7076    depth = stbi__get16be(s);
   7077    if (depth != 8 && depth != 16) {
   7078        stbi__rewind( s );
   7079        return 0;
   7080    }
   7081    if (stbi__get16be(s) != 3) {
   7082        stbi__rewind( s );
   7083        return 0;
   7084    }
   7085    *comp = 4;
   7086    return 1;
   7087 }
   7088 
   7089 static int stbi__psd_is16(stbi__context *s)
   7090 {
   7091    int channelCount, depth;
   7092    if (stbi__get32be(s) != 0x38425053) {
   7093        stbi__rewind( s );
   7094        return 0;
   7095    }
   7096    if (stbi__get16be(s) != 1) {
   7097        stbi__rewind( s );
   7098        return 0;
   7099    }
   7100    stbi__skip(s, 6);
   7101    channelCount = stbi__get16be(s);
   7102    if (channelCount < 0 || channelCount > 16) {
   7103        stbi__rewind( s );
   7104        return 0;
   7105    }
   7106    (void) stbi__get32be(s);
   7107    (void) stbi__get32be(s);
   7108    depth = stbi__get16be(s);
   7109    if (depth != 16) {
   7110        stbi__rewind( s );
   7111        return 0;
   7112    }
   7113    return 1;
   7114 }
   7115 #endif
   7116 
   7117 #ifndef STBI_NO_PIC
   7118 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
   7119 {
   7120    int act_comp=0,num_packets=0,chained,dummy;
   7121    stbi__pic_packet packets[10];
   7122 
   7123    if (!x) x = &dummy;
   7124    if (!y) y = &dummy;
   7125    if (!comp) comp = &dummy;
   7126 
   7127    if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
   7128       stbi__rewind(s);
   7129       return 0;
   7130    }
   7131 
   7132    stbi__skip(s, 88);
   7133 
   7134    *x = stbi__get16be(s);
   7135    *y = stbi__get16be(s);
   7136    if (stbi__at_eof(s)) {
   7137       stbi__rewind( s);
   7138       return 0;
   7139    }
   7140    if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
   7141       stbi__rewind( s );
   7142       return 0;
   7143    }
   7144 
   7145    stbi__skip(s, 8);
   7146 
   7147    do {
   7148       stbi__pic_packet *packet;
   7149 
   7150       if (num_packets==sizeof(packets)/sizeof(packets[0]))
   7151          return 0;
   7152 
   7153       packet = &packets[num_packets++];
   7154       chained = stbi__get8(s);
   7155       packet->size    = stbi__get8(s);
   7156       packet->type    = stbi__get8(s);
   7157       packet->channel = stbi__get8(s);
   7158       act_comp |= packet->channel;
   7159 
   7160       if (stbi__at_eof(s)) {
   7161           stbi__rewind( s );
   7162           return 0;
   7163       }
   7164       if (packet->size != 8) {
   7165           stbi__rewind( s );
   7166           return 0;
   7167       }
   7168    } while (chained);
   7169 
   7170    *comp = (act_comp & 0x10 ? 4 : 3);
   7171 
   7172    return 1;
   7173 }
   7174 #endif
   7175 
   7176 // *************************************************************************************************
   7177 // Portable Gray Map and Portable Pixel Map loader
   7178 // by Ken Miller
   7179 //
   7180 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
   7181 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
   7182 //
   7183 // Known limitations:
   7184 //    Does not support comments in the header section
   7185 //    Does not support ASCII image data (formats P2 and P3)
   7186 //    Does not support 16-bit-per-channel
   7187 
   7188 #ifndef STBI_NO_PNM
   7189 
   7190 static int      stbi__pnm_test(stbi__context *s)
   7191 {
   7192    char p, t;
   7193    p = (char) stbi__get8(s);
   7194    t = (char) stbi__get8(s);
   7195    if (p != 'P' || (t != '5' && t != '6')) {
   7196        stbi__rewind( s );
   7197        return 0;
   7198    }
   7199    return 1;
   7200 }
   7201 
   7202 static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   7203 {
   7204    stbi_uc *out;
   7205    STBI_NOTUSED(ri);
   7206 
   7207    if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
   7208       return 0;
   7209 
   7210    *x = s->img_x;
   7211    *y = s->img_y;
   7212    if (comp) *comp = s->img_n;
   7213 
   7214    if (!stbi__mad3sizes_valid(s->img_n, s->img_x, s->img_y, 0))
   7215       return stbi__errpuc("too large", "PNM too large");
   7216 
   7217    out = (stbi_uc *) stbi__malloc_mad3(s->img_n, s->img_x, s->img_y, 0);
   7218    if (!out) return stbi__errpuc("outofmem", "Out of memory");
   7219    stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
   7220 
   7221    if (req_comp && req_comp != s->img_n) {
   7222       out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
   7223       if (out == NULL) return out; // stbi__convert_format frees input on failure
   7224    }
   7225    return out;
   7226 }
   7227 
   7228 static int      stbi__pnm_isspace(char c)
   7229 {
   7230    return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
   7231 }
   7232 
   7233 static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
   7234 {
   7235    for (;;) {
   7236       while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
   7237          *c = (char) stbi__get8(s);
   7238 
   7239       if (stbi__at_eof(s) || *c != '#')
   7240          break;
   7241 
   7242       while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
   7243          *c = (char) stbi__get8(s);
   7244    }
   7245 }
   7246 
   7247 static int      stbi__pnm_isdigit(char c)
   7248 {
   7249    return c >= '0' && c <= '9';
   7250 }
   7251 
   7252 static int      stbi__pnm_getinteger(stbi__context *s, char *c)
   7253 {
   7254    int value = 0;
   7255 
   7256    while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
   7257       value = value*10 + (*c - '0');
   7258       *c = (char) stbi__get8(s);
   7259    }
   7260 
   7261    return value;
   7262 }
   7263 
   7264 static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
   7265 {
   7266    int maxv, dummy;
   7267    char c, p, t;
   7268 
   7269    if (!x) x = &dummy;
   7270    if (!y) y = &dummy;
   7271    if (!comp) comp = &dummy;
   7272 
   7273    stbi__rewind(s);
   7274 
   7275    // Get identifier
   7276    p = (char) stbi__get8(s);
   7277    t = (char) stbi__get8(s);
   7278    if (p != 'P' || (t != '5' && t != '6')) {
   7279        stbi__rewind(s);
   7280        return 0;
   7281    }
   7282 
   7283    *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm
   7284 
   7285    c = (char) stbi__get8(s);
   7286    stbi__pnm_skip_whitespace(s, &c);
   7287 
   7288    *x = stbi__pnm_getinteger(s, &c); // read width
   7289    stbi__pnm_skip_whitespace(s, &c);
   7290 
   7291    *y = stbi__pnm_getinteger(s, &c); // read height
   7292    stbi__pnm_skip_whitespace(s, &c);
   7293 
   7294    maxv = stbi__pnm_getinteger(s, &c);  // read max value
   7295 
   7296    if (maxv > 255)
   7297       return stbi__err("max value > 255", "PPM image not 8-bit");
   7298    else
   7299       return 1;
   7300 }
   7301 #endif
   7302 
   7303 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
   7304 {
   7305    #ifndef STBI_NO_JPEG
   7306    if (stbi__jpeg_info(s, x, y, comp)) return 1;
   7307    #endif
   7308 
   7309    #ifndef STBI_NO_PNG
   7310    if (stbi__png_info(s, x, y, comp))  return 1;
   7311    #endif
   7312 
   7313    #ifndef STBI_NO_GIF
   7314    if (stbi__gif_info(s, x, y, comp))  return 1;
   7315    #endif
   7316 
   7317    #ifndef STBI_NO_BMP
   7318    if (stbi__bmp_info(s, x, y, comp))  return 1;
   7319    #endif
   7320 
   7321    #ifndef STBI_NO_PSD
   7322    if (stbi__psd_info(s, x, y, comp))  return 1;
   7323    #endif
   7324 
   7325    #ifndef STBI_NO_PIC
   7326    if (stbi__pic_info(s, x, y, comp))  return 1;
   7327    #endif
   7328 
   7329    #ifndef STBI_NO_PNM
   7330    if (stbi__pnm_info(s, x, y, comp))  return 1;
   7331    #endif
   7332 
   7333    #ifndef STBI_NO_HDR
   7334    if (stbi__hdr_info(s, x, y, comp))  return 1;
   7335    #endif
   7336 
   7337    // test tga last because it's a crappy test!
   7338    #ifndef STBI_NO_TGA
   7339    if (stbi__tga_info(s, x, y, comp))
   7340        return 1;
   7341    #endif
   7342    return stbi__err("unknown image type", "Image not of any known type, or corrupt");
   7343 }
   7344 
   7345 static int stbi__is_16_main(stbi__context *s)
   7346 {
   7347    #ifndef STBI_NO_PNG
   7348    if (stbi__png_is16(s))  return 1;
   7349    #endif
   7350 
   7351    #ifndef STBI_NO_PSD
   7352    if (stbi__psd_is16(s))  return 1;
   7353    #endif
   7354 
   7355    return 0;
   7356 }
   7357 
   7358 #ifndef STBI_NO_STDIO
   7359 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
   7360 {
   7361     FILE *f = stbi__fopen(filename, "rb");
   7362     int result;
   7363     if (!f) return stbi__err("can't fopen", "Unable to open file");
   7364     result = stbi_info_from_file(f, x, y, comp);
   7365     fclose(f);
   7366     return result;
   7367 }
   7368 
   7369 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
   7370 {
   7371    int r;
   7372    stbi__context s;
   7373    long pos = ftell(f);
   7374    stbi__start_file(&s, f);
   7375    r = stbi__info_main(&s,x,y,comp);
   7376    fseek(f,pos,SEEK_SET);
   7377    return r;
   7378 }
   7379 
   7380 STBIDEF int stbi_is_16_bit(char const *filename)
   7381 {
   7382     FILE *f = stbi__fopen(filename, "rb");
   7383     int result;
   7384     if (!f) return stbi__err("can't fopen", "Unable to open file");
   7385     result = stbi_is_16_bit_from_file(f);
   7386     fclose(f);
   7387     return result;
   7388 }
   7389 
   7390 STBIDEF int stbi_is_16_bit_from_file(FILE *f)
   7391 {
   7392    int r;
   7393    stbi__context s;
   7394    long pos = ftell(f);
   7395    stbi__start_file(&s, f);
   7396    r = stbi__is_16_main(&s);
   7397    fseek(f,pos,SEEK_SET);
   7398    return r;
   7399 }
   7400 #endif // !STBI_NO_STDIO
   7401 
   7402 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   7403 {
   7404    stbi__context s;
   7405    stbi__start_mem(&s,buffer,len);
   7406    return stbi__info_main(&s,x,y,comp);
   7407 }
   7408 
   7409 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
   7410 {
   7411    stbi__context s;
   7412    stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   7413    return stbi__info_main(&s,x,y,comp);
   7414 }
   7415 
   7416 STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len)
   7417 {
   7418    stbi__context s;
   7419    stbi__start_mem(&s,buffer,len);
   7420    return stbi__is_16_main(&s);
   7421 }
   7422 
   7423 STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *c, void *user)
   7424 {
   7425    stbi__context s;
   7426    stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   7427    return stbi__is_16_main(&s);
   7428 }
   7429 
   7430 #endif // STB_IMAGE_IMPLEMENTATION
   7431 
   7432 /*
   7433    revision history:
   7434       2.20  (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
   7435       2.19  (2018-02-11) fix warning
   7436       2.18  (2018-01-30) fix warnings
   7437       2.17  (2018-01-29) change sbti__shiftsigned to avoid clang -O2 bug
   7438                          1-bit BMP
   7439                          *_is_16_bit api
   7440                          avoid warnings
   7441       2.16  (2017-07-23) all functions have 16-bit variants;
   7442                          STBI_NO_STDIO works again;
   7443                          compilation fixes;
   7444                          fix rounding in unpremultiply;
   7445                          optimize vertical flip;
   7446                          disable raw_len validation;
   7447                          documentation fixes
   7448       2.15  (2017-03-18) fix png-1,2,4 bug; now all Imagenet JPGs decode;
   7449                          warning fixes; disable run-time SSE detection on gcc;
   7450                          uniform handling of optional "return" values;
   7451                          thread-safe initialization of zlib tables
   7452       2.14  (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
   7453       2.13  (2016-11-29) add 16-bit API, only supported for PNG right now
   7454       2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
   7455       2.11  (2016-04-02) allocate large structures on the stack
   7456                          remove white matting for transparent PSD
   7457                          fix reported channel count for PNG & BMP
   7458                          re-enable SSE2 in non-gcc 64-bit
   7459                          support RGB-formatted JPEG
   7460                          read 16-bit PNGs (only as 8-bit)
   7461       2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
   7462       2.09  (2016-01-16) allow comments in PNM files
   7463                          16-bit-per-pixel TGA (not bit-per-component)
   7464                          info() for TGA could break due to .hdr handling
   7465                          info() for BMP to shares code instead of sloppy parse
   7466                          can use STBI_REALLOC_SIZED if allocator doesn't support realloc
   7467                          code cleanup
   7468       2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
   7469       2.07  (2015-09-13) fix compiler warnings
   7470                          partial animated GIF support
   7471                          limited 16-bpc PSD support
   7472                          #ifdef unused functions
   7473                          bug with < 92 byte PIC,PNM,HDR,TGA
   7474       2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
   7475       2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
   7476       2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
   7477       2.03  (2015-04-12) extra corruption checking (mmozeiko)
   7478                          stbi_set_flip_vertically_on_load (nguillemot)
   7479                          fix NEON support; fix mingw support
   7480       2.02  (2015-01-19) fix incorrect assert, fix warning
   7481       2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
   7482       2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
   7483       2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
   7484                          progressive JPEG (stb)
   7485                          PGM/PPM support (Ken Miller)
   7486                          STBI_MALLOC,STBI_REALLOC,STBI_FREE
   7487                          GIF bugfix -- seemingly never worked
   7488                          STBI_NO_*, STBI_ONLY_*
   7489       1.48  (2014-12-14) fix incorrectly-named assert()
   7490       1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
   7491                          optimize PNG (ryg)
   7492                          fix bug in interlaced PNG with user-specified channel count (stb)
   7493       1.46  (2014-08-26)
   7494               fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
   7495       1.45  (2014-08-16)
   7496               fix MSVC-ARM internal compiler error by wrapping malloc
   7497       1.44  (2014-08-07)
   7498               various warning fixes from Ronny Chevalier
   7499       1.43  (2014-07-15)
   7500               fix MSVC-only compiler problem in code changed in 1.42
   7501       1.42  (2014-07-09)
   7502               don't define _CRT_SECURE_NO_WARNINGS (affects user code)
   7503               fixes to stbi__cleanup_jpeg path
   7504               added STBI_ASSERT to avoid requiring assert.h
   7505       1.41  (2014-06-25)
   7506               fix search&replace from 1.36 that messed up comments/error messages
   7507       1.40  (2014-06-22)
   7508               fix gcc struct-initialization warning
   7509       1.39  (2014-06-15)
   7510               fix to TGA optimization when req_comp != number of components in TGA;
   7511               fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
   7512               add support for BMP version 5 (more ignored fields)
   7513       1.38  (2014-06-06)
   7514               suppress MSVC warnings on integer casts truncating values
   7515               fix accidental rename of 'skip' field of I/O
   7516       1.37  (2014-06-04)
   7517               remove duplicate typedef
   7518       1.36  (2014-06-03)
   7519               convert to header file single-file library
   7520               if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
   7521       1.35  (2014-05-27)
   7522               various warnings
   7523               fix broken STBI_SIMD path
   7524               fix bug where stbi_load_from_file no longer left file pointer in correct place
   7525               fix broken non-easy path for 32-bit BMP (possibly never used)
   7526               TGA optimization by Arseny Kapoulkine
   7527       1.34  (unknown)
   7528               use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
   7529       1.33  (2011-07-14)
   7530               make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
   7531       1.32  (2011-07-13)
   7532               support for "info" function for all supported filetypes (SpartanJ)
   7533       1.31  (2011-06-20)
   7534               a few more leak fixes, bug in PNG handling (SpartanJ)
   7535       1.30  (2011-06-11)
   7536               added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
   7537               removed deprecated format-specific test/load functions
   7538               removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
   7539               error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
   7540               fix inefficiency in decoding 32-bit BMP (David Woo)
   7541       1.29  (2010-08-16)
   7542               various warning fixes from Aurelien Pocheville
   7543       1.28  (2010-08-01)
   7544               fix bug in GIF palette transparency (SpartanJ)
   7545       1.27  (2010-08-01)
   7546               cast-to-stbi_uc to fix warnings
   7547       1.26  (2010-07-24)
   7548               fix bug in file buffering for PNG reported by SpartanJ
   7549       1.25  (2010-07-17)
   7550               refix trans_data warning (Won Chun)
   7551       1.24  (2010-07-12)
   7552               perf improvements reading from files on platforms with lock-heavy fgetc()
   7553               minor perf improvements for jpeg
   7554               deprecated type-specific functions so we'll get feedback if they're needed
   7555               attempt to fix trans_data warning (Won Chun)
   7556       1.23    fixed bug in iPhone support
   7557       1.22  (2010-07-10)
   7558               removed image *writing* support
   7559               stbi_info support from Jetro Lauha
   7560               GIF support from Jean-Marc Lienher
   7561               iPhone PNG-extensions from James Brown
   7562               warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
   7563       1.21    fix use of 'stbi_uc' in header (reported by jon blow)
   7564       1.20    added support for Softimage PIC, by Tom Seddon
   7565       1.19    bug in interlaced PNG corruption check (found by ryg)
   7566       1.18  (2008-08-02)
   7567               fix a threading bug (local mutable static)
   7568       1.17    support interlaced PNG
   7569       1.16    major bugfix - stbi__convert_format converted one too many pixels
   7570       1.15    initialize some fields for thread safety
   7571       1.14    fix threadsafe conversion bug
   7572               header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
   7573       1.13    threadsafe
   7574       1.12    const qualifiers in the API
   7575       1.11    Support installable IDCT, colorspace conversion routines
   7576       1.10    Fixes for 64-bit (don't use "unsigned long")
   7577               optimized upsampling by Fabian "ryg" Giesen
   7578       1.09    Fix format-conversion for PSD code (bad global variables!)
   7579       1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
   7580       1.07    attempt to fix C++ warning/errors again
   7581       1.06    attempt to fix C++ warning/errors again
   7582       1.05    fix TGA loading to return correct *comp and use good luminance calc
   7583       1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
   7584       1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
   7585       1.02    support for (subset of) HDR files, float interface for preferred access to them
   7586       1.01    fix bug: possible bug in handling right-side up bmps... not sure
   7587               fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
   7588       1.00    interface to zlib that skips zlib header
   7589       0.99    correct handling of alpha in palette
   7590       0.98    TGA loader by lonesock; dynamically add loaders (untested)
   7591       0.97    jpeg errors on too large a file; also catch another malloc failure
   7592       0.96    fix detection of invalid v value - particleman@mollyrocket forum
   7593       0.95    during header scan, seek to markers in case of padding
   7594       0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
   7595       0.93    handle jpegtran output; verbose errors
   7596       0.92    read 4,8,16,24,32-bit BMP files of several formats
   7597       0.91    output 24-bit Windows 3.0 BMP files
   7598       0.90    fix a few more warnings; bump version number to approach 1.0
   7599       0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
   7600       0.60    fix compiling as c++
   7601       0.59    fix warnings: merge Dave Moore's -Wall fixes
   7602       0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
   7603       0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
   7604       0.56    fix bug: zlib uncompressed mode len vs. nlen
   7605       0.55    fix bug: restart_interval not initialized to 0
   7606       0.54    allow NULL for 'int *comp'
   7607       0.53    fix bug in png 3->4; speedup png decoding
   7608       0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
   7609       0.51    obey req_comp requests, 1-component jpegs return as 1-component,
   7610               on 'test' only check type, not whether we support this variant
   7611       0.50  (2006-11-19)
   7612               first released version
   7613 */
   7614 
   7615 
   7616 /*
   7617 ------------------------------------------------------------------------------
   7618 This software is available under 2 licenses -- choose whichever you prefer.
   7619 ------------------------------------------------------------------------------
   7620 ALTERNATIVE A - MIT License
   7621 Copyright (c) 2017 Sean Barrett
   7622 Permission is hereby granted, free of charge, to any person obtaining a copy of
   7623 this software and associated documentation files (the "Software"), to deal in
   7624 the Software without restriction, including without limitation the rights to
   7625 use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
   7626 of the Software, and to permit persons to whom the Software is furnished to do
   7627 so, subject to the following conditions:
   7628 The above copyright notice and this permission notice shall be included in all
   7629 copies or substantial portions of the Software.
   7630 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
   7631 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
   7632 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
   7633 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
   7634 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
   7635 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
   7636 SOFTWARE.
   7637 ------------------------------------------------------------------------------
   7638 ALTERNATIVE B - Public Domain (www.unlicense.org)
   7639 This is free and unencumbered software released into the public domain.
   7640 Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
   7641 software, either in source code form or as a compiled binary, for any purpose,
   7642 commercial or non-commercial, and by any means.
   7643 In jurisdictions that recognize copyright laws, the author or authors of this
   7644 software dedicate any and all copyright interest in the software to the public
   7645 domain. We make this dedication for the benefit of the public at large and to
   7646 the detriment of our heirs and successors. We intend this dedication to be an
   7647 overt act of relinquishment in perpetuity of all present and future rights to
   7648 this software under copyright law.
   7649 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
   7650 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
   7651 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
   7652 AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
   7653 ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
   7654 WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
   7655 ------------------------------------------------------------------------------
   7656 */