
Merge branch 'master' of https://github.com/nothings/stb

Ken Miller 11 years ago
parent
commit
2834d9da08

+ 4 - 2
README.md

@@ -7,13 +7,15 @@ library    | lastest version | category | description
 --------------------- | ---- | -------- | --------------------------------
 **stb_vorbis.c** | 1.04 | audio | decode ogg vorbis files from file/memory to float/16-bit signed output
 **stb_image.h** | 1.46 | graphics | image loading/decoding from file/memory: JPG, PNG, TGA, BMP, PSD, GIF, HDR, PIC
-**stb_truetype.h** | 0.9 | graphics | parse, decode, and rasterize characters from truetype fonts
+**stb_truetype.h** | 0.99 | graphics | parse, decode, and rasterize characters from truetype fonts
 **stb_image_write.h** | 0.95 | graphics | image writing to disk: PNG, TGA, BMP
+**stb_image_resize.h** | 0.90 | graphics | resize images larger/smaller with good quality
 **stretchy_buffer.h** | 1.01 | utility | typesafe dynamic array for C (i.e. approximation to vector<>), doesn't compile as C++
 **stb_textedit.h** | 1.5 | UI | guts of a text editor for games etc implementing them from scratch
 **stb_dxt.h** | 1.04 | 3D&nbsp;graphics | Fabian "ryg" Giesen's real-time DXT compressor
-**stb_herringbone_wang_tile.h** | 0.6 | games | herringbone Wang tile map generator
 **stb_perlin.h** | 0.2 | 3D&nbsp;graphics | revised Perlin noise (3D input, 1D output)
+**stb_tilemap_editor.h** | 0.10 | games | embeddable tilemap editor
+**stb_herringbone_wang_tile.h** | 0.6 | games | herringbone Wang tile map generator
 **stb_c_lexer.h** | 0.06 | parsing | simplify writing parsers for C-like languages
 **stb_divide.h** | 0.91 | math | more useful 32-bit modulus e.g. "euclidean divide"
 **stb.h** | 2.23 | misc | helper functions for C, mostly redundant in C++; basically author's personal stuff

+ 0 - 192
docs/stb_resample_ideas.txt

@@ -1,192 +0,0 @@
-1.
-
-Consider just porting this C++ public domain
-library back to C:
-    https://code.google.com/p/imageresampler/source/browse/#svn%2Ftrunk
-(recommended by @castano)
-
-
-2.
-
-Consider three cases just to suggest the spectrum
-of possibilities:
-
-a) linear upsample: each output pixel is a weighted sum
-of 4 input pixels
-
-b) cubic upsample: each output pixel is a weighted sum
-of 16 input pixels
-
-c) downsample by N with box filter: each output pixel
-is a weighted sum of NxN input pixels, N can be very large
-
-Now, suppose you want to handle 8-bit input, 16-bit
-input, and float input, and you want to do sRGB correction
-or not.
-
-Suppose you create a temporary buffer of float pixels, say
-one scanline tall. Actually two temp buffers, one for the
-input and one for the output. You decode a scanline of the
-input into the temp buffer which is always linear floats. This
-isolates the handling of 8/16/float and sRGB to one place
-(and still allows you to make optimized 8-bit-sRGB-to-float
-lookup tables). This also allows you to put wrap logic here,
-explicitly wrapping, reflecting, or replicating-from-edge
-pixels that would come from off-edge.
-
-You then do whatever the appropriate weighted sums are
-into the output buffer, and you move on to the next
-scanline of the input.
-
-The algorithm just described works directly for case (c).
-Suppose you're downsampling by 2.5; then output scanline 0
-sums from input scanlines 0, 1, and 2; output scanline 1
-sums from 2,3,4; output 2 from 5,6,7; output 3 from 7,8,9.
-Note how 2 & 7 get reused, but we don't have to recompute
-them because we can do things in a single linear pass
-through the input and output at the same time.
-
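The downsample-by-2.5 example above can be checked with a small sketch. The helper name `box_input_range` is hypothetical and not part of any stb library; it assumes a box filter whose footprint for output scanline `out_y` is the half-open interval [out_y*ratio, (out_y+1)*ratio).

```c
#include <assert.h>

/* Hypothetical helper (not an stb function): for a box-filter downsample
   by `ratio`, report the first and last input scanline that overlap the
   footprint of output scanline `out_y`. */
static void box_input_range(int out_y, double ratio, int *first, int *last)
{
    double start, end;
    int lo, hi;

    assert(out_y >= 0 && ratio >= 1.0);

    start = out_y * ratio;         /* inclusive start of the footprint */
    end   = (out_y + 1) * ratio;   /* exclusive end of the footprint   */

    lo = (int)start;               /* floor, because start >= 0 */
    hi = (int)end;                 /* floor(end)                */
    if ((double)hi == end)
        hi -= 1;                   /* an integer end does not touch row hi */

    *first = lo;
    *last  = hi;
}
```

With ratio 2.5 this reproduces the ranges listed in the notes (0..2, 2..4, 5..7, 7..9), including the reuse of input scanlines 2 and 7 by two neighbouring output scanlines.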
-Now, consider case (a). When upsampling, the same two input
-scanlines will get sampled-from for multiple output scanlines.
-So, to avoid recomputing the input scanlines, we need either
-multiple input or multiple output temp buffer lines. Since
-the number of output lines a given pair of input scanlines
-might touch scales with the upsample amount, it makes more
-sense to use two input scanline buffers. For cubic, you'll
-need four scanline buffers, and in general the number of
-buffers will be limited by the max filter width, which is
-presumably hardcoded.
-
-It turns out to be slightly different for two reasons:
-
-   1. when using an arbitrary filter and downsampling,
-      you actually need N output buffers and 1 input buffer
-      (vs 1 output buffer and N input buffers upsampling)
-
-   2. this approach will be very inefficient as written.
-      you want to use separable filters and actually do
-      separable computation: first decode an input scanline
-      into a 'decode' buffer, then horizontally resample it
-      into the "input" buffer (kind of a misnomer, but
-      they're the inputs to the vertical resampler)
-
-(The above approach isn't optimal for non-uniform resampling;
-optimal is to do whichever axis is smaller first, but I don't
-think we have to care about doing that right.)
-
-
-Now, you can either:
-
-    1. malloc the temp memory
-    2. alloca it
-    3. allocate a fixed amount on the stack
-    4. let the user pass it in
-
-I forbid #2 in stb libraries for portability.
-
-If you're not allocating the output image, but rather requiring
-the user to pass it in, it's probably worth trying to avoid #1
-because people always want to use stb libs without any memory
-allocations for various reasons. (Note that most stb libs go
-crazy with memory allocations--you shouldn't use stb_image
-in a console game--but I've tried to avoid it more in newer
-libs.)
-
-The way #3 would work is instead of using a scanline-width
-temp buffer, use some fixed-width temp buffer that's W pixels,
-and scale the image in vertical stripes that are that wide.
-Suppose you make the temp buffers 256 wide; then an upsample
-by 8 computes 256-pixel-width strips (from ~32-pixel-wide input
-strips), but a downsample by 8 computes ~32-pixel-width
-strips (from a 256-pixel width strip). Note this limits
-the max down/upsampling to be ballpark 256x along the
-horizontal axis.
-
-In the following, I do #3 and allow #4 for cases where #3 is
-too small, but it's not the only possibility:
-
-
-
-Function prototypes:
-
-the highest-level one could be:
-
-   stb_resample_8bit(uint8_t       *dest, int dest_width, int dest_height,
-                     uint8_t const *src , int  src_width, int  src_height,
-                     int channels,
-                     stbr_filter filter);
-
-the lowest-level one could be:
-
-   stb_resample_arbitrary(void       *dst, stbr_type dst_type, int dst_width, int dst_height, int dst_stride_in_bytes,
-                          void const *src, stbr_type src_type, int src_width, int src_height, int src_stride_in_bytes,
-                          float s0, float t0, float s1, float t1, // range of source to use, 0..1 in GPU texture-coordinate style
-                          int channels,
-                          int nonpremul_alpha_channel_index,
-                          stbr_wrapmode wrap,                     // clamp, wrap, mirror
-                          stbr_filter filter,
-                          void  *tempmem, size_t tempmem_size_in_bytes);
-
-And there would be a bunch of convenience functions in-between those two levels.
-
-
-Some notes:
-
-   s0,t0,s1,t1:
-       this allows fine subpixel-positioning and subpixel-resizing in an explicit way without
-           things having to be exact pixel multiples. it allows people to pseudo-stream
-           images by computing "tiles" of images a bit at a time without forcing those
-           tiles to quantize their source data.
-
-   nonpremul_alpha_channel_index:
-       if this is negative, no channels are processed specially
-       if this is non-negative, then it's the index of the alpha channel,
-           and the image should be treated as non-premultiplied alpha that
-           needs to be resampled accounting for this (weight the sampling
-           by the alpha channel, i.e. premultiply, filter, unpremultiply).
-           this mechanism only allows one alpha channel and ALL channels 
-           are scaled by it; an alternative would be to find some way to
-           pass in which channels serve as alpha channels for which other
-           channels, but eh.
-
-   tempmem, tempmem_size:
-       all functions will need tempmem, but they can allocate a fixed tempmem buffer
-           on the stack. providing an API that allows overriding the amount of tempmem
-           available allows people to process arbitrarily large images. the return
-           value for the function could be 0 on success or non-0 being the size of
-           tempmem needed.
-   
-   src_stride, dest_stride:
-       the stride variables are signed to allow you to describe both traditional
-           top-to-bottom images (pass in a pointer to the top-left pixel and
-           a positive stride) and bottom-to-top images (pass in a pointer to
-           the bottom-left pixel and a negative stride)
-
-   ordering of src & dest:
-       put these in whatever order you like, i just chose one arbitrarily
-
-   width & height
-       these are ints not unsigned ints or size_ts because i personally forbid
-           unsigned variables for almost everything to avoid signed/unsigned comparison
-           issues, but this is a matter of personal taste and you can do differently
-
-   Intermediate-level functions should be provided for each source type & same dest type
-   so that the code is typesafe; only when people fall back to stb_resample_arbitrary should
-   they be at risk for type unsafety. (One way to avoid an explosion of functions of
-   every possible *combination* of types in a type-safe way would be to define one function
-   for each input type, and accept three separate output pointers, one for each type, only
-   one of which can be non-NULL. 9 functions isn't that bad, but if you want to have three
-   or four intermediate-level functions with fewer parameters, 9*4 gets silly. Could also
-   use the same trick for stb_resample_arbitrary, replacing it with three typesafe functions.)
-
-
-
-
-Reference:
-
-Cubic sampling function for separable cubic:
-   f(x) = (a+2)*x^3 - (a+3)*x^2 + 1       for 0 <= x <= 1
-   f(x) = a*x^3 - 5*a*x^2 + 8*a*x - 4*a   for 1 < x <= 2
-   f(x) = 0                               otherwise
-   "a" is configurable, try -1/2 (from http://pixinsight.com/forum/index.php?topic=556.0 )
-
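The cubic in the reference section of the deleted notes translates directly into C. This is only a sketch: the function names are hypothetical, and evaluating the kernel on |x| (so that it is symmetric) is an assumption, since the notes give the formula for non-negative x only.

```c
#include <assert.h>

/* Piecewise cubic from the notes; `a` is the configurable parameter
   (the notes suggest trying -1/2). Assumption: the kernel is symmetric,
   so negative x is folded onto |x|. */
static float cubic_kernel(float x, float a)
{
    if (x < 0.0f)
        x = -x;
    if (x <= 1.0f)
        return (a + 2.0f)*x*x*x - (a + 3.0f)*x*x + 1.0f;
    if (x <= 2.0f)
        return a*x*x*x - 5.0f*a*x*x + 8.0f*a*x - 4.0f*a;
    return 0.0f;
}

/* Sum of the four taps used by a sample at fractional offset t in [0,1).
   For an interpolating kernel this should stay close to 1. */
static float cubic_weight_sum(float t, float a)
{
    assert(t >= 0.0f && t < 1.0f);
    return cubic_kernel(t + 1.0f, a) + cubic_kernel(t, a)
         + cubic_kernel(1.0f - t, a) + cubic_kernel(2.0f - t, a);
}
```

With a = -1/2 the kernel is 1 at x = 0 and 0 at x = 1 and x = 2, so existing samples pass through unchanged, and the four weights at any offset sum to 1.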

+ 2585 - 0
stb_image_resize.h

@@ -0,0 +1,2585 @@
+/* stb_image_resize - v0.90 - public domain image resizing
+   by Jorge L Rodriguez (@VinoBS) - 2014
+   http://github.com/nothings/stb
+
+   Written with emphasis on usability, portability, and efficiency. (No
+   SIMD or threads, so it can be easily outperformed by libs that use those.)
+   Only scaling and translation is supported, no rotations or shears.
+   Easy API downsamples w/Mitchell filter, upsamples w/cubic interpolation.
+
+   COMPILING & LINKING
+      In one C/C++ file that #includes this file, do this:
+         #define STB_IMAGE_RESIZE_IMPLEMENTATION
+      before the #include. That will create the implementation in that file.
+
+   QUICKSTART
+      stbir_resize_uint8(      input_pixels , in_w , in_h , 0,
+                               output_pixels, out_w, out_h, 0, num_channels)
+      stbir_resize_float(...)
+      stbir_resize_uint8_srgb( input_pixels , in_w , in_h , 0,
+                               output_pixels, out_w, out_h, 0,
+                               num_channels , alpha_chan  , 0)
+      stbir_resize_uint8_srgb_edgemode(
+                               input_pixels , in_w , in_h , 0, 
+                               output_pixels, out_w, out_h, 0, 
+                               num_channels , alpha_chan  , 0, STBIR_EDGE_CLAMP)
+                                                            // WRAP/REFLECT/ZERO
+
+   FULL API
+      See the "header file" section of the source for API documentation.
+
+   ADDITIONAL DOCUMENTATION
+
+      SRGB & FLOATING POINT REPRESENTATION
+         The sRGB functions presume IEEE floating point. If you do not have
+         IEEE floating point, define STBIR_NON_IEEE_FLOAT. This will use
+         a slower implementation.
+
+      MEMORY ALLOCATION
+         The resize functions here perform a single memory allocation using
+         malloc. To control the memory allocation, before the #include that
+         triggers the implementation, do:
+
+            #define STBIR_MALLOC(size,context) ...
+            #define STBIR_FREE(ptr,context)   ...
+
+         Each resize function makes exactly one call to malloc/free, so to use
+         temp memory, store the temp memory in the context and return that.
+
+      ASSERT
+         Define STBIR_ASSERT(boolval) to override assert() and not use assert.h
+
+      OPTIMIZATION
+         Define STBIR_SATURATE_INT to compute clamp values in-range using
+         integer operations instead of float operations. This may be faster
+         on some platforms.
+
+      DEFAULT FILTERS
+         For functions which don't provide explicit control over what filters
+         to use, you can change the compile-time defaults with
+
+            #define STBIR_DEFAULT_FILTER_UPSAMPLE     STBIR_FILTER_something
+            #define STBIR_DEFAULT_FILTER_DOWNSAMPLE   STBIR_FILTER_something
+
+         See stbir_filter in the header-file section for the list of filters.
+
+      NEW FILTERS
+         A number of 1D filter kernels are used. For a list of
+         supported filters see the stbir_filter enum. To add a new filter,
+         write a filter function and add it to stbir__filter_info_table.
+
+      PROGRESS
+         For interactive use with slow resize operations, you can install
+         a progress-report callback:
+
+            #define STBIR_PROGRESS_REPORT(val)   some_func(val)
+
+         The parameter val is a float which goes from 0 to 1 as progress is made.
+
+         For example:
+
+            static void my_progress_report(float progress);
+            #define STBIR_PROGRESS_REPORT(val) my_progress_report(val)
+
+            #define STB_IMAGE_RESIZE_IMPLEMENTATION
+            #include "stb_image_resize.h"
+
+            static void my_progress_report(float progress)
+            {
+               printf("Progress: %f%%\n", progress*100);
+            }
+
+      MAX CHANNELS
+         If your image has more than 64 channels, define STBIR_MAX_CHANNELS
+         to the max you'll have.
+
+      ALPHA CHANNEL
+         Most of the resizing functions provide the ability to control how
+         the alpha channel of an image is processed. The important things
+         to know about this:
+
+         1. The best mathematically-behaved version of alpha to use is
+         called "premultiplied alpha", in which the other color channels
+         have had the alpha value multiplied in. If you use premultiplied
+         alpha, linear filtering (such as image resampling done by this
+         library, or performed in texture units on GPUs) does the "right
+         thing". While premultiplied alpha is standard in the movie CGI
+         industry, it is still uncommon in the videogame/real-time world.
+
+         If you linearly filter non-premultiplied alpha, strange effects
+         occur. (For example, the average of 1% opaque bright green
+         and 99% opaque black produces 50% transparent dark green when
+         non-premultiplied, whereas premultiplied it produces 50%
+         transparent near-black. The former introduces green energy
+         that doesn't exist in the source image.)
+
+         2. Artists should not edit premultiplied-alpha images; artists
+         want non-premultiplied alpha images. Thus, art tools generally output
+         non-premultiplied alpha images.
+
+         3. You will get best results in most cases by converting images
+         to premultiplied alpha before processing them mathematically.
+
+         4. If you pass the flag STBIR_FLAG_ALPHA_PREMULTIPLIED, the
+         resizer does not do anything special for the alpha channel;
+         it is resampled identically to other channels. This produces
+         the correct results for premultiplied-alpha images, but produces
+         less-than-ideal results for non-premultiplied-alpha images.
+
+         5. If you do not pass the flag STBIR_FLAG_ALPHA_PREMULTIPLIED,
+         then the resizer weights the contribution of input pixels
+         based on their alpha values, or, equivalently, it multiplies
+         the alpha value into the color channels, resamples, then divides
+         by the resultant alpha value. Input pixels which have alpha=0 do
+         not contribute at all to output pixels unless _all_ of the input
+         pixels affecting that output pixel have alpha=0, in which case
+         the result for that pixel is the same as it would be without
+         STBIR_FLAG_ALPHA_PREMULTIPLIED. However, this is only true for
+         input images in integer formats. For input images in float format,
+         input pixels with alpha=0 have no effect, and output pixels
+         which have alpha=0 will be 0 in all channels. (For float images,
+         you can manually achieve the same result by adding a tiny epsilon
+         value to the alpha channel of every image, and then subtracting
+         or clamping it at the end.)
+
+         6. You can suppress the behavior described in #5 and make
+         all-0-alpha pixels have 0 in all channels by #defining
+         STBIR_NO_ALPHA_EPSILON.
+
+         7. You can separately control whether the alpha channel is
+         interpreted as linear or affected by the colorspace. By default
+         it is linear; you almost never want to apply the colorspace.
+         (For example, graphics hardware does not apply sRGB conversion
+         to the alpha channel.)
+
+   ADDITIONAL CONTRIBUTORS
+      Sean Barrett: API design, optimizations
+         
+   REVISIONS
+      0.90 (2014-09-17) first released version
+
+   LICENSE
+      This software is in the public domain. Where that dedication is not
+      recognized, you are granted a perpetual, irrevocable license to copy
+      and modify this file as you see fit.
+
+   TODO
+      Don't decode all of the image data when only processing a partial tile
+      Don't use full-width decode buffers when only processing a partial tile
+      When processing wide images, break processing into tiles so data fits in L1 cache
+      Installable filters?
+      Resize that respects alpha test coverage
+         (Reference code: FloatImage::alphaTestCoverage and FloatImage::scaleAlphaToCoverage:
+         https://code.google.com/p/nvidia-texture-tools/source/browse/trunk/src/nvimage/FloatImage.cpp )
+*/
+
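The green/black example in the ALPHA CHANNEL notes above can be reproduced with a few lines of C. The `rgba` type and both function names are hypothetical; the sketch averages two pixels with equal weight, which is the simplest possible linear filter, and is not the library's code path.

```c
#include <assert.h>

typedef struct { float r, g, b, a; } rgba;   /* hypothetical pixel type */

/* Average two non-premultiplied pixels channel by channel. */
static rgba average_naive(rgba p, rgba q)
{
    rgba out;
    out.r = 0.5f * (p.r + q.r);
    out.g = 0.5f * (p.g + q.g);
    out.b = 0.5f * (p.b + q.b);
    out.a = 0.5f * (p.a + q.a);
    return out;
}

/* Alpha-weighted average: premultiply, average, then unpremultiply. */
static rgba average_alpha_weighted(rgba p, rgba q)
{
    rgba out;
    float r = 0.5f * (p.r * p.a + q.r * q.a);
    float g = 0.5f * (p.g * p.a + q.g * q.a);
    float b = 0.5f * (p.b * p.a + q.b * q.a);

    assert(p.a >= 0.0f && q.a >= 0.0f);

    out.a = 0.5f * (p.a + q.a);
    if (out.a > 0.0f) {
        out.r = r / out.a;
        out.g = g / out.a;
        out.b = b / out.a;
    } else {
        out.r = out.g = out.b = 0.0f;   /* fully transparent result */
    }
    return out;
}
```

For 1% opaque bright green and 99% opaque black, the naive average has a green value of 0.5 at 50% alpha, while the alpha-weighted average has a green value of 0.01 at 50% alpha, which matches the "dark green" versus "near-black" outcome described in item 1.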
+#ifndef STBIR_INCLUDE_STB_IMAGE_RESIZE_H
+#define STBIR_INCLUDE_STB_IMAGE_RESIZE_H
+
+#ifdef _MSC_VER
+typedef unsigned char  stbir_uint8;
+typedef unsigned short stbir_uint16;
+typedef unsigned int   stbir_uint32;
+#else
+#include <stdint.h>
+typedef uint8_t  stbir_uint8;
+typedef uint16_t stbir_uint16;
+typedef uint32_t stbir_uint32;
+#endif
+
+#ifdef STB_IMAGE_RESIZE_STATIC
+#define STBIRDEF static
+#else
+#ifdef __cplusplus
+#define STBIRDEF extern "C"
+#else
+#define STBIRDEF extern
+#endif
+#endif
+
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// Easy-to-use API:
+//
+//     * "input pixels" points to an array of image data with 'num_channels' channels (e.g. RGB=3, RGBA=4)
+//     * input_w is input image width (x-axis), input_h is input image height (y-axis)
+//     * stride is the offset between successive rows of image data in memory, in bytes. you can
+//       specify 0 to mean packed contiguously in memory
+//     * alpha channel is treated identically to other channels.
+//     * colorspace is linear or sRGB as specified by function name
+//     * returned result is 1 for success or 0 in case of an error.
+//       #define STBIR_ASSERT() to trigger an assert on parameter validation errors.
+//     * Memory required grows approximately linearly with input and output size, but with
+//       discontinuities at input_w == output_w and input_h == output_h.
+//     * These functions use a "default" resampling filter defined at compile time. To change the filter,
+//       you can change the compile-time defaults by #defining STBIR_DEFAULT_FILTER_UPSAMPLE
+//       and STBIR_DEFAULT_FILTER_DOWNSAMPLE, or you can use the medium-complexity API.
+
+STBIRDEF int stbir_resize_uint8(     const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                           unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                     int num_channels);
+
+STBIRDEF int stbir_resize_float(     const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                           float *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                     int num_channels);
+
+
+// The following functions interpret image data as gamma-corrected sRGB. 
+// Specify STBIR_ALPHA_CHANNEL_NONE if you have no alpha channel,
+// or otherwise provide the index of the alpha channel. Flags value
+// of 0 will probably do the right thing if you're not sure what
+// the flags mean.
+
+#define STBIR_ALPHA_CHANNEL_NONE       -1
+
+// Set this flag if your texture has premultiplied alpha. Otherwise, stbir will
+// use alpha-weighted resampling (effectively premultiplying, resampling,
+// then unpremultiplying).
+#define STBIR_FLAG_ALPHA_PREMULTIPLIED    (1 << 0)
+// The specified alpha channel should be handled as gamma-corrected value even
+// when doing sRGB operations.
+#define STBIR_FLAG_ALPHA_USES_COLORSPACE  (1 << 1)
+
+STBIRDEF int stbir_resize_uint8_srgb(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                           unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                     int num_channels, int alpha_channel, int flags);
+
+
+typedef enum
+{
+    STBIR_EDGE_CLAMP   = 1,
+    STBIR_EDGE_REFLECT = 2,
+    STBIR_EDGE_WRAP    = 3,
+    STBIR_EDGE_ZERO    = 4,
+} stbir_edge;
+
+// This function adds the ability to specify how requests to sample off the edge of the image are handled.
+STBIRDEF int stbir_resize_uint8_srgb_edgemode(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                                    unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                              int num_channels, int alpha_channel, int flags,
+                                              stbir_edge edge_wrap_mode);
+
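One way to picture the four edge modes is as a remapping of an out-of-range pixel index. The sketch below is illustrative only: `edge_mode` and `edge_remap` are hypothetical names, the enum merely mirrors the numeric values of stbir_edge, and the reflect convention used here (the edge pixel is repeated, so -1 maps to 0) is an assumption rather than something this header states.

```c
#include <assert.h>

/* Mirrors the values of stbir_edge; hypothetical stand-in type. */
typedef enum {
    EDGE_CLAMP   = 1,
    EDGE_REFLECT = 2,
    EDGE_WRAP    = 3,
    EDGE_ZERO    = 4
} edge_mode;

/* Map index n onto [0, max). Returns -1 when the sample should be
   treated as zero instead of being read from the image. */
static int edge_remap(int n, int max, edge_mode mode)
{
    int period, m;

    assert(max > 0);
    if (n >= 0 && n < max)
        return n;                        /* already inside the image */

    switch (mode) {
    case EDGE_CLAMP:
        return n < 0 ? 0 : max - 1;      /* replicate the edge pixel */
    case EDGE_WRAP:
        m = n % max;                     /* C's % can be negative */
        return m < 0 ? m + max : m;
    case EDGE_REFLECT:
        period = 2 * max;                /* forward then mirrored copy */
        m = n % period;
        if (m < 0)
            m += period;
        return m < max ? m : period - 1 - m;
    case EDGE_ZERO:
    default:
        break;
    }
    return -1;                           /* caller substitutes zero */
}
```

For a 4-pixel row, index -1 maps to 0 under clamp, 3 under wrap, and 0 under this reflect convention; index 5 maps to 3, 1, and 2 respectively.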
+//////////////////////////////////////////////////////////////////////////////
+//
+// Medium-complexity API
+//
+// This extends the easy-to-use API as follows:
+//
+//     * Alpha-channel can be processed separately
+//       * If alpha_channel is not STBIR_ALPHA_CHANNEL_NONE
+//         * Alpha channel will not be gamma corrected (unless flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)
+//         * Filters will be weighted by alpha channel (unless flags&STBIR_FLAG_ALPHA_PREMULTIPLIED)
+//     * Filter can be selected explicitly
+//     * uint16 image type
+//     * sRGB colorspace available for all types
+//     * context parameter for passing to STBIR_MALLOC
+
+typedef enum
+{
+    STBIR_FILTER_DEFAULT      = 0,  // use same filter type that easy-to-use API chooses
+    STBIR_FILTER_BOX          = 1,  // A trapezoid w/1-pixel wide ramps, same result as box for integer scale ratios
+    STBIR_FILTER_TRIANGLE     = 2,  // On upsampling, produces same results as bilinear texture filtering
+    STBIR_FILTER_CUBICBSPLINE = 3,  // The cubic b-spline (aka Mitchell-Netravali with B=1,C=0), gaussian-esque
+    STBIR_FILTER_CATMULLROM   = 4,  // An interpolating cubic spline
+    STBIR_FILTER_MITCHELL     = 5,  // Mitchell-Netravali filter with B=1/3, C=1/3
+} stbir_filter;
+
+typedef enum
+{
+    STBIR_COLORSPACE_LINEAR,
+    STBIR_COLORSPACE_SRGB,
+
+    STBIR_MAX_COLORSPACES,
+} stbir_colorspace;
+
+// The following functions are all identical except for the type of the image data
+
+STBIRDEF int stbir_resize_uint8_generic( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                               unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                         int num_channels, int alpha_channel, int flags,
+                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, 
+                                         void *alloc_context);
+
+STBIRDEF int stbir_resize_uint16_generic(const stbir_uint16 *input_pixels  , int input_w , int input_h , int input_stride_in_bytes,
+                                               stbir_uint16 *output_pixels , int output_w, int output_h, int output_stride_in_bytes,
+                                         int num_channels, int alpha_channel, int flags,
+                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, 
+                                         void *alloc_context);
+
+STBIRDEF int stbir_resize_float_generic( const float *input_pixels         , int input_w , int input_h , int input_stride_in_bytes,
+                                               float *output_pixels        , int output_w, int output_h, int output_stride_in_bytes,
+                                         int num_channels, int alpha_channel, int flags,
+                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, 
+                                         void *alloc_context);
+
+
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// Full-complexity API
+//
+// This extends the medium API as follows:
+//
+//     * uint32 image type
+//     * not typesafe
+//     * separate filter types for each axis
+//     * separate edge modes for each axis
+//     * can specify scale explicitly for subpixel correctness
+//     * can specify image source tile using texture coordinates
+
+typedef enum
+{
+    STBIR_TYPE_UINT8 ,
+    STBIR_TYPE_UINT16,
+    STBIR_TYPE_UINT32,
+    STBIR_TYPE_FLOAT ,
+
+    STBIR_MAX_TYPES
+} stbir_datatype;
+
+STBIRDEF int stbir_resize(         const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                         void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                   stbir_datatype datatype,
+                                   int num_channels, int alpha_channel, int flags,
+                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, 
+                                   stbir_filter filter_horizontal,  stbir_filter filter_vertical,
+                                   stbir_colorspace space, void *alloc_context);
+
+STBIRDEF int stbir_resize_subpixel(const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                         void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                   stbir_datatype datatype,
+                                   int num_channels, int alpha_channel, int flags,
+                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, 
+                                   stbir_filter filter_horizontal,  stbir_filter filter_vertical,
+                                   stbir_colorspace space, void *alloc_context,
+                                   float x_scale, float y_scale,
+                                   float x_offset, float y_offset);
+
+STBIRDEF int stbir_resize_region(  const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                         void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                   stbir_datatype datatype,
+                                   int num_channels, int alpha_channel, int flags,
+                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, 
+                                   stbir_filter filter_horizontal,  stbir_filter filter_vertical,
+                                   stbir_colorspace space, void *alloc_context,
+                                   float s0, float t0, float s1, float t1);
+// (s0, t0) & (s1, t1) are the top-left and bottom-right corners (uv addressing style: [0, 1]x[0, 1]) of the region of the input image to use.
+
+//
+//
+////   end header file   /////////////////////////////////////////////////////
+#endif // STBIR_INCLUDE_STB_IMAGE_RESIZE_H
+
+
+
+
+
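The header comment says each resize call makes exactly one malloc/free pair and that temp memory can be handed over through the context. A minimal sketch of that idea follows. The `temp_block` type and its two functions are hypothetical; only the STBIR_MALLOC and STBIR_FREE macro names and their (size, context) and (ptr, context) shapes come from the header. What the library does when the allocator returns NULL is not visible in this excerpt, so the NULL path here is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical caller-owned scratch block passed as alloc_context.
   `memory` must be suitably aligned for float data. */
typedef struct {
    void  *memory;
    size_t size;
    int    in_use;
} temp_block;

/* Hand out the whole block once; refuse if it is busy or too small. */
static void *temp_block_alloc(size_t size, void *context)
{
    temp_block *block = (temp_block *)context;
    assert(block != NULL);
    if (block->in_use || size > block->size)
        return NULL;
    block->in_use = 1;
    return block->memory;
}

/* Mark the block as free again; the memory itself stays with the caller. */
static void temp_block_free(void *ptr, void *context)
{
    temp_block *block = (temp_block *)context;
    assert(block != NULL);
    if (ptr != NULL && ptr == block->memory)
        block->in_use = 0;
}

/* These would be defined before the #include that creates the implementation. */
#define STBIR_MALLOC(size,context) temp_block_alloc((size), (context))
#define STBIR_FREE(ptr,context)    temp_block_free((ptr), (context))
```

In use, a pointer to the `temp_block` would be passed as the `alloc_context` argument of one of the `_generic` functions or of `stbir_resize`, so that the single allocation is served from caller-owned memory.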
+#ifdef STB_IMAGE_RESIZE_IMPLEMENTATION
+
+#ifndef STBIR_ASSERT
+#include <assert.h>
+#define STBIR_ASSERT(x) assert(x)
+#endif
+
+#ifdef STBIR_DEBUG
+#define STBIR__DEBUG_ASSERT STBIR_ASSERT
+#else
+#define STBIR__DEBUG_ASSERT
+#endif
+
+// If you hit this it means I haven't done it yet.
+#define STBIR__UNIMPLEMENTED(x) STBIR_ASSERT(!(x))
+
+// For memset
+#include <string.h>
+
+#include <math.h>
+
+#ifndef STBIR_MALLOC
+#include <stdlib.h>
+#define STBIR_MALLOC(size,c) malloc(size)
+#define STBIR_FREE(ptr,c)    free(ptr)
+#endif
+
+#ifndef _MSC_VER
+#ifdef __cplusplus
+#define stbir__inline inline
+#else
+#define stbir__inline
+#endif
+#else
+#define stbir__inline __forceinline
+#endif
+
+
+// should produce compiler error if size is wrong
+typedef unsigned char stbir__validate_uint32[sizeof(stbir_uint32) == 4 ? 1 : -1];
+
+#ifdef _MSC_VER
+#define STBIR__NOTUSED(v)  (void)(v)
+#else
+#define STBIR__NOTUSED(v)  (void)sizeof(v)
+#endif
+
+#define STBIR__ARRAY_SIZE(a) (sizeof((a))/sizeof((a)[0]))
+
+#ifndef STBIR_DEFAULT_FILTER_UPSAMPLE
+#define STBIR_DEFAULT_FILTER_UPSAMPLE    STBIR_FILTER_CATMULLROM
+#endif
+
+#ifndef STBIR_DEFAULT_FILTER_DOWNSAMPLE
+#define STBIR_DEFAULT_FILTER_DOWNSAMPLE  STBIR_FILTER_MITCHELL
+#endif
+
+#ifndef STBIR_PROGRESS_REPORT
+#define STBIR_PROGRESS_REPORT(float_0_to_1)
+#endif
+
+#ifndef STBIR_MAX_CHANNELS
+#define STBIR_MAX_CHANNELS 64
+#endif
+
+#if STBIR_MAX_CHANNELS > 65536
+#error "Too many channels; STBIR_MAX_CHANNELS must be no more than 65536."
+// because we store the indices in 16-bit variables
+#endif
+
+// This value is added to alpha just before premultiplication to avoid
+// zeroing out color values. It is equivalent to 2^-80. If you don't want
+// that behavior (it may interfere if you have floating point images with
+// very small alpha values) then you can define STBIR_NO_ALPHA_EPSILON to
+// disable it.
+#ifndef STBIR_ALPHA_EPSILON
+#define STBIR_ALPHA_EPSILON ((float)1 / (1 << 20) / (1 << 20) / (1 << 20) / (1 << 20))
+#endif
+
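The epsilon expression above can be checked in isolation. The sketch below copies the same expression under a different macro name and uses hypothetical helpers; it illustrates why a 2^-80 bias keeps the color recoverable when alpha is 0 without disturbing ordinary alpha values, and it is not the library's actual premultiply code.

```c
#include <assert.h>

/* Same expression as STBIR_ALPHA_EPSILON: four divisions by 2^20. */
#define ALPHA_EPSILON ((float)1 / (1 << 20) / (1 << 20) / (1 << 20) / (1 << 20))

/* 2^-n by repeated halving, used only for comparison. */
static float pow2_neg(int n)
{
    float v = 1.0f;
    assert(n >= 0 && n <= 126);   /* stay within normal float range */
    while (n-- > 0)
        v *= 0.5f;
    return v;
}

/* Premultiply a color by (alpha + epsilon), then divide it back out. */
static float premultiply_roundtrip(float color, float alpha)
{
    float biased = alpha + ALPHA_EPSILON;
    float premultiplied = color * biased;
    return premultiplied / biased;
}

/* Returns 1 when adding the epsilon does not change alpha at all. */
static int alpha_bias_is_invisible(float alpha)
{
    float biased = alpha + ALPHA_EPSILON;
    return biased == alpha;
}
```

Because the epsilon is a power of two, a color multiplied by it and divided by it again comes back exactly, so a pixel with alpha 0 keeps its color through the round trip. For the smallest nonzero 8-bit alpha, 1/255, the bias is far below float precision and leaves the value unchanged.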
+
+
+#ifdef _MSC_VER
+#define STBIR__UNUSED_PARAM(v)  (void)(v)
+#else
+#define STBIR__UNUSED_PARAM(v)  (void)sizeof(v)
+#endif
+
+// must match stbir_datatype
+static unsigned char stbir__type_size[] = {
+    1, // STBIR_TYPE_UINT8
+    2, // STBIR_TYPE_UINT16
+    4, // STBIR_TYPE_UINT32
+    4, // STBIR_TYPE_FLOAT
+};
+
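The size table above pairs naturally with the API rule that a stride of 0 means rows are packed in memory. The helper below is a hypothetical sketch of how such a stride could be resolved; the table is a copy in the same order as stbir_datatype, and the function is not taken from the library.

```c
#include <assert.h>

/* Bytes per sample, in the same order as stbir_datatype:
   UINT8, UINT16, UINT32, FLOAT. */
static const unsigned char type_size[] = { 1, 2, 4, 4 };

/* Hypothetical helper: return the caller's stride, or the packed row
   size in bytes when the caller passed 0. */
static int effective_stride_bytes(int stride_in_bytes, int width,
                                  int channels, int type_index)
{
    assert(width > 0 && channels > 0);
    assert(type_index >= 0 && type_index < 4);

    if (stride_in_bytes != 0)
        return stride_in_bytes;          /* explicit stride wins */
    return width * channels * type_size[type_index];
}
```

A 640-pixel-wide RGB image of 8-bit samples resolves to 1920 bytes per row when the stride is 0, while an explicit stride such as 2048 is returned unchanged.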
+// Kernel function centered at 0
+typedef float (stbir__kernel_fn)(float x, float scale);
+typedef float (stbir__support_fn)(float scale);
+
+typedef struct
+{
+    stbir__kernel_fn* kernel;
+    stbir__support_fn* support;
+} stbir__filter_info;
+
+// When upsampling, the contributors are which source pixels contribute.
+// When downsampling, the contributors are which destination pixels are contributed to.
+typedef struct
+{
+    int n0; // First contributing pixel
+    int n1; // Last contributing pixel
+} stbir__contributors;
+
+typedef struct
+{
+    const void* input_data;
+    int input_w;
+    int input_h;
+    int input_stride_bytes;
+
+    void* output_data;
+    int output_w;
+    int output_h;
+    int output_stride_bytes;
+
+    float s0, t0, s1, t1;
+
+    float horizontal_shift; // Units: output pixels
+    float vertical_shift;   // Units: output pixels
+    float horizontal_scale;
+    float vertical_scale;
+
+    int channels;
+    int alpha_channel;
+    stbir_uint32 flags;
+    stbir_datatype type;
+    stbir_filter horizontal_filter;
+    stbir_filter vertical_filter;
+    stbir_edge edge_horizontal;
+    stbir_edge edge_vertical;
+    stbir_colorspace colorspace;
+
+    stbir__contributors* horizontal_contributors;
+    float* horizontal_coefficients;
+
+    stbir__contributors* vertical_contributors;
+    float* vertical_coefficients;
+
+    int decode_buffer_pixels;
+    float* decode_buffer;
+
+    float* horizontal_buffer;
+
+    // cache these because ceil/floor are inexplicably showing up in profile
+    int horizontal_coefficient_width;
+    int vertical_coefficient_width;
+    int horizontal_filter_pixel_width;
+    int vertical_filter_pixel_width;
+    int horizontal_filter_pixel_margin;
+    int vertical_filter_pixel_margin;
+    int horizontal_num_contributors;
+    int vertical_num_contributors;
+
+    int ring_buffer_length_bytes; // The length of an individual entry in the ring buffer. The number of entries is stbir__get_filter_pixel_width(filter, scale).
+    int ring_buffer_first_scanline;
+    int ring_buffer_last_scanline;
+    int ring_buffer_begin_index;
+    float* ring_buffer;
+
+    float* encode_buffer; // A temporary buffer to store floats so we don't lose precision while we do multiply-adds.
+
+    int horizontal_contributors_size;
+    int horizontal_coefficients_size;
+    int vertical_contributors_size;
+    int vertical_coefficients_size;
+    int decode_buffer_size;
+    int horizontal_buffer_size;
+    int ring_buffer_size;
+    int encode_buffer_size;
+} stbir__info;
+
+static stbir__inline int stbir__min(int a, int b)
+{
+    return a < b ? a : b;
+}
+
+static stbir__inline int stbir__max(int a, int b)
+{
+    return a > b ? a : b;
+}
+
+static stbir__inline float stbir__saturate(float x)
+{
+    if (x < 0)
+        return 0;
+
+    if (x > 1)
+        return 1;
+
+    return x;
+}
+
+#ifdef STBIR_SATURATE_INT
+static stbir__inline stbir_uint8 stbir__saturate8(int x)
+{
+    if ((unsigned int) x <= 255)
+        return x;
+
+    if (x < 0)
+        return 0;
+
+    return 255;
+}
+
+static stbir__inline stbir_uint16 stbir__saturate16(int x)
+{
+    if ((unsigned int) x <= 65535)
+        return x;
+
+    if (x < 0)
+        return 0;
+
+    return 65535;
+}
+#endif
+
+static float stbir__srgb_uchar_to_linear_float[256] = {
+    0.000000f, 0.000304f, 0.000607f, 0.000911f, 0.001214f, 0.001518f, 0.001821f, 0.002125f, 0.002428f, 0.002732f, 0.003035f,
+    0.003347f, 0.003677f, 0.004025f, 0.004391f, 0.004777f, 0.005182f, 0.005605f, 0.006049f, 0.006512f, 0.006995f, 0.007499f,
+    0.008023f, 0.008568f, 0.009134f, 0.009721f, 0.010330f, 0.010960f, 0.011612f, 0.012286f, 0.012983f, 0.013702f, 0.014444f,
+    0.015209f, 0.015996f, 0.016807f, 0.017642f, 0.018500f, 0.019382f, 0.020289f, 0.021219f, 0.022174f, 0.023153f, 0.024158f,
+    0.025187f, 0.026241f, 0.027321f, 0.028426f, 0.029557f, 0.030713f, 0.031896f, 0.033105f, 0.034340f, 0.035601f, 0.036889f,
+    0.038204f, 0.039546f, 0.040915f, 0.042311f, 0.043735f, 0.045186f, 0.046665f, 0.048172f, 0.049707f, 0.051269f, 0.052861f,
+    0.054480f, 0.056128f, 0.057805f, 0.059511f, 0.061246f, 0.063010f, 0.064803f, 0.066626f, 0.068478f, 0.070360f, 0.072272f,
+    0.074214f, 0.076185f, 0.078187f, 0.080220f, 0.082283f, 0.084376f, 0.086500f, 0.088656f, 0.090842f, 0.093059f, 0.095307f,
+    0.097587f, 0.099899f, 0.102242f, 0.104616f, 0.107023f, 0.109462f, 0.111932f, 0.114435f, 0.116971f, 0.119538f, 0.122139f,
+    0.124772f, 0.127438f, 0.130136f, 0.132868f, 0.135633f, 0.138432f, 0.141263f, 0.144128f, 0.147027f, 0.149960f, 0.152926f,
+    0.155926f, 0.158961f, 0.162029f, 0.165132f, 0.168269f, 0.171441f, 0.174647f, 0.177888f, 0.181164f, 0.184475f, 0.187821f,
+    0.191202f, 0.194618f, 0.198069f, 0.201556f, 0.205079f, 0.208637f, 0.212231f, 0.215861f, 0.219526f, 0.223228f, 0.226966f,
+    0.230740f, 0.234551f, 0.238398f, 0.242281f, 0.246201f, 0.250158f, 0.254152f, 0.258183f, 0.262251f, 0.266356f, 0.270498f,
+    0.274677f, 0.278894f, 0.283149f, 0.287441f, 0.291771f, 0.296138f, 0.300544f, 0.304987f, 0.309469f, 0.313989f, 0.318547f,
+    0.323143f, 0.327778f, 0.332452f, 0.337164f, 0.341914f, 0.346704f, 0.351533f, 0.356400f, 0.361307f, 0.366253f, 0.371238f,
+    0.376262f, 0.381326f, 0.386430f, 0.391573f, 0.396755f, 0.401978f, 0.407240f, 0.412543f, 0.417885f, 0.423268f, 0.428691f,
+    0.434154f, 0.439657f, 0.445201f, 0.450786f, 0.456411f, 0.462077f, 0.467784f, 0.473532f, 0.479320f, 0.485150f, 0.491021f,
+    0.496933f, 0.502887f, 0.508881f, 0.514918f, 0.520996f, 0.527115f, 0.533276f, 0.539480f, 0.545725f, 0.552011f, 0.558340f,
+    0.564712f, 0.571125f, 0.577581f, 0.584078f, 0.590619f, 0.597202f, 0.603827f, 0.610496f, 0.617207f, 0.623960f, 0.630757f,
+    0.637597f, 0.644480f, 0.651406f, 0.658375f, 0.665387f, 0.672443f, 0.679543f, 0.686685f, 0.693872f, 0.701102f, 0.708376f,
+    0.715694f, 0.723055f, 0.730461f, 0.737911f, 0.745404f, 0.752942f, 0.760525f, 0.768151f, 0.775822f, 0.783538f, 0.791298f,
+    0.799103f, 0.806952f, 0.814847f, 0.822786f, 0.830770f, 0.838799f, 0.846873f, 0.854993f, 0.863157f, 0.871367f, 0.879622f,
+    0.887923f, 0.896269f, 0.904661f, 0.913099f, 0.921582f, 0.930111f, 0.938686f, 0.947307f, 0.955974f, 0.964686f, 0.973445f,
+    0.982251f, 0.991102f, 1.0f
+};
+
+static float stbir__srgb_to_linear(float f)
+{
+    if (f <= 0.04045f)
+        return f / 12.92f;
+    else
+        return (float)pow((f + 0.055f) / 1.055f, 2.4f);
+}
+
+static float stbir__linear_to_srgb(float f)
+{
+    if (f <= 0.0031308f)
+        return f * 12.92f;
+    else
+        return 1.055f * (float)pow(f, 1 / 2.4f) - 0.055f;
+}
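+
+// Worked example (approximate values): the two thresholds above describe the
+// same point on the curve, since 0.04045 / 12.92 is about 0.0031308. For a
+// mid-gray input, stbir__srgb_to_linear(0.5f) is pow(0.555f / 1.055f, 2.4f),
+// about 0.214, which sits between entries 127 (0.212231) and 128 (0.215861)
+// of stbir__srgb_uchar_to_linear_float, as expected for 0.5 * 255 = 127.5.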
+
+#ifndef STBIR_NON_IEEE_FLOAT
+// From https://gist.github.com/rygorous/2203834
+
+typedef union
+{
+    stbir_uint32 u;
+    float f;
+} stbir__FP32;
+
+static const stbir_uint32 fp32_to_srgb8_tab4[104] = {
+    0x0073000d, 0x007a000d, 0x0080000d, 0x0087000d, 0x008d000d, 0x0094000d, 0x009a000d, 0x00a1000d,
+    0x00a7001a, 0x00b4001a, 0x00c1001a, 0x00ce001a, 0x00da001a, 0x00e7001a, 0x00f4001a, 0x0101001a,
+    0x010e0033, 0x01280033, 0x01410033, 0x015b0033, 0x01750033, 0x018f0033, 0x01a80033, 0x01c20033,
+    0x01dc0067, 0x020f0067, 0x02430067, 0x02760067, 0x02aa0067, 0x02dd0067, 0x03110067, 0x03440067,
+    0x037800ce, 0x03df00ce, 0x044600ce, 0x04ad00ce, 0x051400ce, 0x057b00c5, 0x05dd00bc, 0x063b00b5,
+    0x06970158, 0x07420142, 0x07e30130, 0x087b0120, 0x090b0112, 0x09940106, 0x0a1700fc, 0x0a9500f2,
+    0x0b0f01cb, 0x0bf401ae, 0x0ccb0195, 0x0d950180, 0x0e56016e, 0x0f0d015e, 0x0fbc0150, 0x10630143,
+    0x11070264, 0x1238023e, 0x1357021d, 0x14660201, 0x156601e9, 0x165a01d3, 0x174401c0, 0x182401af,
+    0x18fe0331, 0x1a9602fe, 0x1c1502d2, 0x1d7e02ad, 0x1ed4028d, 0x201a0270, 0x21520256, 0x227d0240,
+    0x239f0443, 0x25c003fe, 0x27bf03c4, 0x29a10392, 0x2b6a0367, 0x2d1d0341, 0x2ebe031f, 0x304d0300,
+    0x31d105b0, 0x34a80555, 0x37520507, 0x39d504c5, 0x3c37048b, 0x3e7c0458, 0x40a8042a, 0x42bd0401,
+    0x44c20798, 0x488e071e, 0x4c1c06b6, 0x4f76065d, 0x52a50610, 0x55ac05cc, 0x5892058f, 0x5b590559,
+    0x5e0c0a23, 0x631c0980, 0x67db08f6, 0x6c55087f, 0x70940818, 0x74a007bd, 0x787d076c, 0x7c330723,
+};
+ 
+static stbir_uint8 stbir__linear_to_srgb_uchar(float in)
+{
+    static const stbir__FP32 almostone = { 0x3f7fffff }; // 1-eps
+    static const stbir__FP32 minval = { (127-13) << 23 };
+    stbir_uint32 tab,bias,scale,t;
+    stbir__FP32 f;
+ 
+    // Clamp to [2^(-13), 1-eps]; these two values map to 0 and 1, respectively.
+    // The tests are carefully written so that NaNs map to 0, same as in the reference
+    // implementation.
+    if (!(in > minval.f)) // written this way to catch NaNs
+        in = minval.f;
+    if (in > almostone.f)
+        in = almostone.f;
+ 
+    // Do the table lookup and unpack bias, scale
+    f.f = in;
+    tab = fp32_to_srgb8_tab4[(f.u - minval.u) >> 20];
+    bias = (tab >> 16) << 9;
+    scale = tab & 0xffff;
+ 
+    // Grab next-highest mantissa bits and perform linear interpolation
+    t = (f.u >> 12) & 0xff;
+    return (unsigned char) ((bias + scale*t) >> 16);
+}
+
+#else
+// sRGB transition values, scaled by 1<<28
+static int stbir__srgb_offset_to_linear_scaled[256] =
+{
+            0,     40738,    122216,    203693,    285170,    366648,    448125,    529603,
+       611080,    692557,    774035,    855852,    942009,   1033024,   1128971,   1229926,
+      1335959,   1447142,   1563542,   1685229,   1812268,   1944725,   2082664,   2226148,
+      2375238,   2529996,   2690481,   2856753,   3028870,   3206888,   3390865,   3580856,
+      3776916,   3979100,   4187460,   4402049,   4622919,   4850123,   5083710,   5323731,
+      5570236,   5823273,   6082892,   6349140,   6622065,   6901714,   7188133,   7481369,
+      7781466,   8088471,   8402427,   8723380,   9051372,   9386448,   9728650,  10078021,
+     10434603,  10798439,  11169569,  11548036,  11933879,  12327139,  12727857,  13136073,
+     13551826,  13975156,  14406100,  14844697,  15290987,  15745007,  16206795,  16676389,
+     17153826,  17639142,  18132374,  18633560,  19142734,  19659934,  20185196,  20718552,
+     21260042,  21809696,  22367554,  22933648,  23508010,  24090680,  24681686,  25281066,
+     25888850,  26505076,  27129772,  27762974,  28404716,  29055026,  29713942,  30381490,
+     31057708,  31742624,  32436272,  33138682,  33849884,  34569912,  35298800,  36036568,
+     36783260,  37538896,  38303512,  39077136,  39859796,  40651528,  41452360,  42262316,
+     43081432,  43909732,  44747252,  45594016,  46450052,  47315392,  48190064,  49074096,
+     49967516,  50870356,  51782636,  52704392,  53635648,  54576432,  55526772,  56486700,
+     57456236,  58435408,  59424248,  60422780,  61431036,  62449032,  63476804,  64514376,
+     65561776,  66619028,  67686160,  68763192,  69850160,  70947088,  72053992,  73170912,
+     74297864,  75434880,  76581976,  77739184,  78906536,  80084040,  81271736,  82469648,
+     83677792,  84896192,  86124888,  87363888,  88613232,  89872928,  91143016,  92423512,
+     93714432,  95015816,  96327688,  97650056,  98982952, 100326408, 101680440, 103045072,
+    104420320, 105806224, 107202800, 108610064, 110028048, 111456776, 112896264, 114346544,
+    115807632, 117279552, 118762328, 120255976, 121760536, 123276016, 124802440, 126339832,
+    127888216, 129447616, 131018048, 132599544, 134192112, 135795792, 137410592, 139036528,
+    140673648, 142321952, 143981456, 145652208, 147334208, 149027488, 150732064, 152447968,
+    154175200, 155913792, 157663776, 159425168, 161197984, 162982240, 164777968, 166585184,
+    168403904, 170234160, 172075968, 173929344, 175794320, 177670896, 179559120, 181458992,
+    183370528, 185293776, 187228736, 189175424, 191133888, 193104112, 195086128, 197079968,
+    199085648, 201103184, 203132592, 205173888, 207227120, 209292272, 211369392, 213458480,
+    215559568, 217672656, 219797792, 221934976, 224084240, 226245600, 228419056, 230604656,
+    232802400, 235012320, 237234432, 239468736, 241715280, 243974080, 246245120, 248528464,
+    250824112, 253132064, 255452368, 257785040, 260130080, 262487520, 264857376, 267239664,
+};
+
+static stbir_uint8 stbir__linear_to_srgb_uchar(float f)
+{
+    int x = (int) (f * (1 << 28)); // has headroom so you don't need to clamp
+    int v = 0;
+    int i;
+
+    // Refine the guess with a short binary search.
+    i = v + 128; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
+    i = v +  64; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
+    i = v +  32; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
+    i = v +  16; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
+    i = v +   8; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
+    i = v +   4; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
+    i = v +   2; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
+    i = v +   1; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
+
+    return (stbir_uint8) v;
+}
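+
+// Worked trace of the search above for f = 0.5f, where x = 0.5 * 2^28 =
+// 134217728 (table values read from stbir__srgb_offset_to_linear_scaled):
+//   i = 128:  57456236 <= x, so v = 128
+//   i = 192: 140673648 >  x, v stays 128
+//   i = 160:  93714432 <= x, so v = 160
+//   i = 176: 115807632 <= x, so v = 176
+//   i = 184: 127888216 <= x, so v = 184
+//   i = 188: 134192112 <= x, so v = 188
+//   i = 190: 137410592 >  x, v stays 188
+//   i = 189: 135795792 >  x, v stays 188
+// The result is 188, in line with 255 * (1.055 * pow(0.5, 1/2.4) - 0.055),
+// which is about 187.5.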
+#endif
+
+static float stbir__filter_trapezoid(float x, float scale)
+{
+    float halfscale = scale / 2;
+    float t = 0.5f + halfscale;
+    STBIR__DEBUG_ASSERT(scale <= 1);
+
+    x = (float)fabs(x);
+
+    if (x >= t)
+        return 0;
+    else
+    {
+        float r = 0.5f - halfscale;
+        if (x <= r)
+            return 1;
+        else
+            return (t - x) / scale;
+    }
+}
+
+static float stbir__support_trapezoid(float scale)
+{
+    STBIR__DEBUG_ASSERT(scale <= 1);
+    return 0.5f + scale / 2;
+}
+
+static float stbir__filter_triangle(float x, float s)
+{
+    STBIR__UNUSED_PARAM(s);
+
+    x = (float)fabs(x);
+
+    if (x <= 1.0f)
+        return 1 - x;
+    else
+        return 0;
+}
+
+static float stbir__filter_cubic(float x, float s)
+{
+    STBIR__UNUSED_PARAM(s);
+
+    x = (float)fabs(x);
+
+    if (x < 1.0f)
+        return (4 + x*x*(3*x - 6))/6;
+    else if (x < 2.0f)
+        return (8 + x*(-12 + x*(6 - x)))/6;
+
+    return (0.0f);
+}
+
+static float stbir__filter_catmullrom(float x, float s)
+{
+    STBIR__UNUSED_PARAM(s);
+
+    x = (float)fabs(x);
+
+    if (x < 1.0f)
+        return 1 - x*x*(2.5f - 1.5f*x);
+    else if (x < 2.0f)
+        return 2 - x*(4 + x*(0.5f*x - 2.5f));
+
+    return (0.0f);
+}
+
+static float stbir__filter_mitchell(float x, float s)
+{
+    STBIR__UNUSED_PARAM(s);
+
+    x = (float)fabs(x);
+
+    if (x < 1.0f)
+        return (16 + x*x*(21 * x - 36))/18;
+    else if (x < 2.0f)
+        return (32 + x*(-60 + x*(36 - 7*x)))/18;
+
+    return (0.0f);
+}
+
+static float stbir__support_zero(float s)
+{
+    STBIR__UNUSED_PARAM(s);
+    return 0;
+}
+
+static float stbir__support_one(float s)
+{
+    STBIR__UNUSED_PARAM(s);
+    return 1;
+}
+
+static float stbir__support_two(float s)
+{
+    STBIR__UNUSED_PARAM(s);
+    return 2;
+}
+
+static stbir__filter_info stbir__filter_info_table[] = {
+        { NULL,                     stbir__support_zero },
+        { stbir__filter_trapezoid,  stbir__support_trapezoid },
+        { stbir__filter_triangle,   stbir__support_one },
+        { stbir__filter_cubic,      stbir__support_two },
+        { stbir__filter_catmullrom, stbir__support_two },
+        { stbir__filter_mitchell,   stbir__support_two },
+};
+
+stbir__inline static int stbir__use_upsampling(float ratio)
+{
+    return ratio > 1;
+}
+
+stbir__inline static int stbir__use_width_upsampling(stbir__info* stbir_info)
+{
+    return stbir__use_upsampling(stbir_info->horizontal_scale);
+}
+
+stbir__inline static int stbir__use_height_upsampling(stbir__info* stbir_info)
+{
+    return stbir__use_upsampling(stbir_info->vertical_scale);
+}
+
+// This is the maximum number of input samples that can affect an output sample
+// with the given filter
+static int stbir__get_filter_pixel_width(stbir_filter filter, float scale)
+{
+    STBIR_ASSERT(filter != 0);
+    STBIR_ASSERT(filter < STBIR__ARRAY_SIZE(stbir__filter_info_table));
+
+    if (stbir__use_upsampling(scale))
+        return (int)ceil(stbir__filter_info_table[filter].support(1/scale) * 2);
+    else
+        return (int)ceil(stbir__filter_info_table[filter].support(scale) * 2 / scale);
+}
+
+// This is how much to expand buffers to account for filters seeking outside
+// the image boundaries.
+static int stbir__get_filter_pixel_margin(stbir_filter filter, float scale)
+{
+    return stbir__get_filter_pixel_width(filter, scale) / 2;
+}
+
+static int stbir__get_coefficient_width(stbir_filter filter, float scale)
+{
+    if (stbir__use_upsampling(scale))
+        return (int)ceil(stbir__filter_info_table[filter].support(1 / scale) * 2);
+    else
+        return (int)ceil(stbir__filter_info_table[filter].support(scale) * 2);
+}
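+
+// Worked examples for the three helpers above (illustrative arithmetic):
+//   support-2 filter (cubic, Catmull-Rom, Mitchell), scale = 2 (upsample):
+//     pixel width = ceil(2 * 2) = 4, margin = 4 / 2 = 2, coefficient width = 4
+//   support-2 filter, scale = 0.25 (downsample):
+//     pixel width = ceil(2 * 2 / 0.25) = 16, margin = 8, coefficient width = 4
+//   trapezoid filter, scale = 0.5 (downsample), support = 0.5 + 0.5 / 2 = 0.75:
+//     pixel width = ceil(0.75 * 2 / 0.5) = 3, margin = 3 / 2 = 1 (integer
+//     division), coefficient width = ceil(0.75 * 2) = 2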
+
+static int stbir__get_contributors(float scale, stbir_filter filter, int input_size, int output_size)
+{
+    if (stbir__use_upsampling(scale))
+        return output_size;
+    else
+        return (input_size + stbir__get_filter_pixel_margin(filter, scale) * 2);
+}
+
+static int stbir__get_total_horizontal_coefficients(stbir__info* info)
+{
+    return info->horizontal_num_contributors
+         * stbir__get_coefficient_width      (info->horizontal_filter, info->horizontal_scale);
+}
+
+static int stbir__get_total_vertical_coefficients(stbir__info* info)
+{
+    return info->vertical_num_contributors
+         * stbir__get_coefficient_width      (info->vertical_filter, info->vertical_scale);
+}
+
+static stbir__contributors* stbir__get_contributor(stbir__contributors* contributors, int n)
+{
+    return &contributors[n];
+}
+
+// For perf reasons this code is duplicated in stbir__resample_horizontal_upsample/downsample;
+// if you change it here, change it there too.
+static float* stbir__get_coefficient(float* coefficients, stbir_filter filter, float scale, int n, int c)
+{
+    int width = stbir__get_coefficient_width(filter, scale);
+    return &coefficients[width*n + c];
+}
+
+static int stbir__edge_wrap_slow(stbir_edge edge, int n, int max)
+{
+    switch (edge)
+    {
+    case STBIR_EDGE_ZERO:
+        return 0; // we'll decode the wrong pixel here, and then overwrite with 0s later
+
+    case STBIR_EDGE_CLAMP:
+        if (n < 0)
+            return 0;
+
+        if (n >= max)
+            return max - 1;
+
+        return n; // NOTREACHED
+
+    case STBIR_EDGE_REFLECT:
+    {
+        if (n < 0)
+        {
+            if (n > -max) // i.e. -n < max, so the reflected index stays in range
+                return -n;
+            else
+                return max - 1;
+        }
+
+        if (n >= max)
+        {
+            int max2 = max * 2;
+            if (n >= max2)
+                return 0;
+            else
+                return max2 - n - 1;
+        }
+
+        return n; // NOTREACHED
+    }
+
+    case STBIR_EDGE_WRAP:
+        if (n >= 0)
+            return (n % max);
+        else
+        {
+            int m = (-n) % max;
+
+            if (m != 0)
+                m = max - m;
+
+            return (m);
+        }
+        return n;  // NOTREACHED
+
+    default:
+        STBIR__UNIMPLEMENTED("Unimplemented edge type");
+        return 0;
+    }
+}
+
+stbir__inline static int stbir__edge_wrap(stbir_edge edge, int n, int max)
+{
+    // avoid per-pixel switch
+    if (n >= 0 && n < max)
+        return n;
+    return stbir__edge_wrap_slow(edge, n, max);
+}
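+
+// Illustrative results of stbir__edge_wrap for max = 4 (valid indices 0..3):
+//   STBIR_EDGE_CLAMP:   n = -2 -> 0,  n = 5 -> 3
+//   STBIR_EDGE_REFLECT: n = -1 -> 1,  n = 4 -> 3,  n = 5 -> 2
+//   STBIR_EDGE_WRAP:    n = -1 -> 3,  n = 5 -> 1
+//   STBIR_EDGE_ZERO:    always 0; per the comment in the slow path, the pixel
+//                       is decoded anyway and overwritten with zeros later
+// Any n in 0..3 is returned unchanged by the fast path.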
+
+// What input pixels contribute to this output pixel?
+static void stbir__calculate_sample_range_upsample(int n, float out_filter_radius, float scale_ratio, float out_shift, int* in_first_pixel, int* in_last_pixel, float* in_center_of_out)
+{
+    float out_pixel_center = (float)n + 0.5f;
+    float out_pixel_influence_lowerbound = out_pixel_center - out_filter_radius;
+    float out_pixel_influence_upperbound = out_pixel_center + out_filter_radius;
+
+    float in_pixel_influence_lowerbound = (out_pixel_influence_lowerbound + out_shift) / scale_ratio;
+    float in_pixel_influence_upperbound = (out_pixel_influence_upperbound + out_shift) / scale_ratio;
+
+    *in_center_of_out = (out_pixel_center + out_shift) / scale_ratio;
+    *in_first_pixel = (int)(floor(in_pixel_influence_lowerbound + 0.5));
+    *in_last_pixel = (int)(floor(in_pixel_influence_upperbound - 0.5));
+}
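+
+// Worked example, assuming a support-2 filter, scale_ratio = 2, out_shift = 0
+// and out_filter_radius = 2 * 2 = 4 (as computed in stbir__calculate_filters).
+// For output pixel n = 0:
+//   out_pixel_center  = 0.5, influence bounds = [-3.5, 4.5]
+//   in-space bounds   = [-1.75, 2.25], *in_center_of_out = 0.25
+//   *in_first_pixel   = floor(-1.75 + 0.5) = -2
+//   *in_last_pixel    = floor( 2.25 - 0.5) =  1
+// Input pixels -2..1 have centers -1.5, -0.5, 0.5 and 1.5, all within 2 input
+// pixels of 0.25; the out-of-range indices are handled by the edge mode when
+// the scanline is decoded.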
+
+// What output pixels does this input pixel contribute to?
+static void stbir__calculate_sample_range_downsample(int n, float in_pixels_radius, float scale_ratio, float out_shift, int* out_first_pixel, int* out_last_pixel, float* out_center_of_in)
+{
+    float in_pixel_center = (float)n + 0.5f;
+    float in_pixel_influence_lowerbound = in_pixel_center - in_pixels_radius;
+    float in_pixel_influence_upperbound = in_pixel_center + in_pixels_radius;
+
+    float out_pixel_influence_lowerbound = in_pixel_influence_lowerbound * scale_ratio - out_shift;
+    float out_pixel_influence_upperbound = in_pixel_influence_upperbound * scale_ratio - out_shift;
+
+    *out_center_of_in = in_pixel_center * scale_ratio - out_shift;
+    *out_first_pixel = (int)(floor(out_pixel_influence_lowerbound + 0.5));
+    *out_last_pixel = (int)(floor(out_pixel_influence_upperbound - 0.5));
+}
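+
+// Worked example, assuming a support-2 filter, scale_ratio = 0.5, out_shift = 0
+// and in_pixels_radius = 2 / 0.5 = 4 (as computed in stbir__calculate_filters).
+// For input pixel n = 3:
+//   in_pixel_center   = 3.5, influence bounds = [-0.5, 7.5]
+//   out-space bounds  = [-0.25, 3.75], *out_center_of_in = 1.75
+//   *out_first_pixel  = floor(-0.25 + 0.5) = 0
+//   *out_last_pixel   = floor( 3.75 - 0.5) = 3
+// Output pixels 0..3 have centers 0.5, 1.5, 2.5 and 3.5, all within 2 output
+// pixels of 1.75.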
+
+static void stbir__calculate_coefficients_upsample(stbir__info* stbir_info, stbir_filter filter, float scale, int in_first_pixel, int in_last_pixel, float in_center_of_out, stbir__contributors* contributor, float* coefficient_group)
+{
+    int i;
+    float total_filter = 0;
+    float filter_scale;
+
+    STBIR__DEBUG_ASSERT(in_last_pixel - in_first_pixel <= (int)ceil(stbir__filter_info_table[filter].support(1/scale) * 2)); // Taken directly from stbir__get_coefficient_width() which we can't call because we don't know if we're horizontal or vertical.
+
+    contributor->n0 = in_first_pixel;
+    contributor->n1 = in_last_pixel;
+
+    STBIR__DEBUG_ASSERT(contributor->n1 >= contributor->n0);
+
+    for (i = 0; i <= in_last_pixel - in_first_pixel; i++)
+    {
+        float in_pixel_center = (float)(i + in_first_pixel) + 0.5f;
+        coefficient_group[i] = stbir__filter_info_table[filter].kernel(in_center_of_out - in_pixel_center, 1 / scale);
+
+        // If the coefficient is zero, skip it. (Don't do the <0 check here, we want the influence of those outside pixels.)
+        if (i == 0 && !coefficient_group[i])
+        {
+            contributor->n0 = ++in_first_pixel;
+            i--;
+            continue;
+        }
+
+        total_filter += coefficient_group[i];
+    }
+
+    STBIR__DEBUG_ASSERT(stbir__filter_info_table[filter].kernel((float)(in_last_pixel + 1) + 0.5f - in_center_of_out, 1/scale) == 0);
+
+    STBIR__DEBUG_ASSERT(total_filter > 0.9f);
+    STBIR__DEBUG_ASSERT(total_filter < 1.1f); // Make sure it's not way off.
+
+    // Make sure the sum of all coefficients is 1.
+    filter_scale = 1 / total_filter;
+
+    for (i = 0; i <= in_last_pixel - in_first_pixel; i++)
+        coefficient_group[i] *= filter_scale;
+
+    for (i = in_last_pixel - in_first_pixel; i >= 0; i--)
+    {
+        if (coefficient_group[i])
+            break;
+
+        // This line has no weight. We can skip it.
+        contributor->n1 = contributor->n0 + i - 1;
+    }
+}
+
+static void stbir__calculate_coefficients_downsample(stbir__info* stbir_info, stbir_filter filter, float scale_ratio, int out_first_pixel, int out_last_pixel, float out_center_of_in, stbir__contributors* contributor, float* coefficient_group)
+{
+    int i;
+
+    STBIR__DEBUG_ASSERT(out_last_pixel - out_first_pixel <= (int)ceil(stbir__filter_info_table[filter].support(scale_ratio) * 2)); // Taken directly from stbir__get_coefficient_width() which we can't call because we don't know if we're horizontal or vertical.
+
+    contributor->n0 = out_first_pixel;
+    contributor->n1 = out_last_pixel;
+
+    STBIR__DEBUG_ASSERT(contributor->n1 >= contributor->n0);
+
+    for (i = 0; i <= out_last_pixel - out_first_pixel; i++)
+    {
+        float out_pixel_center = (float)(i + out_first_pixel) + 0.5f;
+        float x = out_pixel_center - out_center_of_in;
+        coefficient_group[i] = stbir__filter_info_table[filter].kernel(x, scale_ratio) * scale_ratio;
+    }
+
+    STBIR__DEBUG_ASSERT(stbir__filter_info_table[filter].kernel((float)(out_last_pixel + 1) + 0.5f - out_center_of_in, scale_ratio) == 0);
+
+    for (i = out_last_pixel - out_first_pixel; i >= 0; i--)
+    {
+        if (coefficient_group[i])
+            break;
+
+        // This line has no weight. We can skip it.
+        contributor->n1 = contributor->n0 + i - 1;
+    }
+}
+
+static void stbir__normalize_downsample_coefficients(stbir__info* stbir_info, stbir__contributors* contributors, float* coefficients, stbir_filter filter, float scale_ratio, float shift, int input_size, int output_size)
+{
+    int num_contributors = stbir__get_contributors(scale_ratio, filter, input_size, output_size);
+    int num_coefficients = stbir__get_coefficient_width(filter, scale_ratio);
+    int i, j;
+    int skip;
+
+    for (i = 0; i < output_size; i++)
+    {
+        float scale;
+        float total = 0;
+
+        for (j = 0; j < num_contributors; j++)
+        {
+            if (i >= contributors[j].n0 && i <= contributors[j].n1)
+            {
+                float coefficient = *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i - contributors[j].n0);
+                total += coefficient;
+            }
+            else if (i < contributors[j].n0)
+                break;
+        }
+
+        STBIR__DEBUG_ASSERT(total > 0.9f);
+        STBIR__DEBUG_ASSERT(total < 1.1f);
+
+        scale = 1 / total;
+
+        for (j = 0; j < num_contributors; j++)
+        {
+            if (i >= contributors[j].n0 && i <= contributors[j].n1)
+                *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i - contributors[j].n0) *= scale;
+            else if (i < contributors[j].n0)
+                break;
+        }
+    }
+
+    // Optimize: Skip zero coefficients and contributions outside of image bounds.
+    // Do this after normalizing because normalization depends on the n0/n1 values.
+    for (j = 0; j < num_contributors; j++)
+    {
+        int range, max, width;
+
+        skip = 0;
+        while (*stbir__get_coefficient(coefficients, filter, scale_ratio, j, skip) == 0)
+            skip++;
+
+        contributors[j].n0 += skip;
+
+        while (contributors[j].n0 < 0)
+        {
+            contributors[j].n0++;
+            skip++;
+        }
+
+        range = contributors[j].n1 - contributors[j].n0 + 1;
+        max = stbir__min(num_coefficients, range);
+
+        width = stbir__get_coefficient_width(filter, scale_ratio);
+        for (i = 0; i < max; i++)
+        {
+            if (i + skip >= width)
+                break;
+
+            *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i) = *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i + skip);
+        }
+
+        continue;
+    }
+
+    // Using min to avoid writing into invalid pixels.
+    for (i = 0; i < num_contributors; i++)
+        contributors[i].n1 = stbir__min(contributors[i].n1, output_size - 1);
+}
+
+// Each scan line uses the same kernel values so we should calculate the kernel
+// values once and then we can use them for every scan line.
+static void stbir__calculate_filters(stbir__info* stbir_info, stbir__contributors* contributors, float* coefficients, stbir_filter filter, float scale_ratio, float shift, int input_size, int output_size)
+{
+    int n;
+    int total_contributors = stbir__get_contributors(scale_ratio, filter, input_size, output_size);
+
+    if (stbir__use_upsampling(scale_ratio))
+    {
+        float out_pixels_radius = stbir__filter_info_table[filter].support(1 / scale_ratio) * scale_ratio;
+
+        // Looping through out pixels
+        for (n = 0; n < total_contributors; n++)
+        {
+            float in_center_of_out; // Center of the current out pixel in the in pixel space
+            int in_first_pixel, in_last_pixel;
+
+            stbir__calculate_sample_range_upsample(n, out_pixels_radius, scale_ratio, shift, &in_first_pixel, &in_last_pixel, &in_center_of_out);
+
+            stbir__calculate_coefficients_upsample(stbir_info, filter, scale_ratio, in_first_pixel, in_last_pixel, in_center_of_out, stbir__get_contributor(contributors, n), stbir__get_coefficient(coefficients, filter, scale_ratio, n, 0));
+        }
+    }
+    else
+    {
+        float in_pixels_radius = stbir__filter_info_table[filter].support(scale_ratio) / scale_ratio;
+
+        // Looping through in pixels
+        for (n = 0; n < total_contributors; n++)
+        {
+            float out_center_of_in; // Center of the current in pixel in the out pixel space
+            int out_first_pixel, out_last_pixel;
+            int n_adjusted = n - stbir__get_filter_pixel_margin(filter, scale_ratio);
+
+            stbir__calculate_sample_range_downsample(n_adjusted, in_pixels_radius, scale_ratio, shift, &out_first_pixel, &out_last_pixel, &out_center_of_in);
+
+            stbir__calculate_coefficients_downsample(stbir_info, filter, scale_ratio, out_first_pixel, out_last_pixel, out_center_of_in, stbir__get_contributor(contributors, n), stbir__get_coefficient(coefficients, filter, scale_ratio, n, 0));
+        }
+
+        stbir__normalize_downsample_coefficients(stbir_info, contributors, coefficients, filter, scale_ratio, shift, input_size, output_size);
+    }
+}
+
+static float* stbir__get_decode_buffer(stbir__info* stbir_info)
+{
+    // The 0 index of the decode buffer starts after the margin. This makes
+    // it okay to use negative indexes on the decode buffer.
+    return &stbir_info->decode_buffer[stbir_info->horizontal_filter_pixel_margin * stbir_info->channels];
+}
+
+#define STBIR__DECODE(type, colorspace) ((type) * (STBIR_MAX_COLORSPACES) + (colorspace))
+
+static void stbir__decode_scanline(stbir__info* stbir_info, int n)
+{
+    int c;
+    int channels = stbir_info->channels;
+    int alpha_channel = stbir_info->alpha_channel;
+    int type = stbir_info->type;
+    int colorspace = stbir_info->colorspace;
+    int input_w = stbir_info->input_w;
+    int input_stride_bytes = stbir_info->input_stride_bytes;
+    float* decode_buffer = stbir__get_decode_buffer(stbir_info);
+    stbir_edge edge_horizontal = stbir_info->edge_horizontal;
+    stbir_edge edge_vertical = stbir_info->edge_vertical;
+    int in_buffer_row_offset = stbir__edge_wrap(edge_vertical, n, stbir_info->input_h) * input_stride_bytes;
+    const void* input_data = (char *) stbir_info->input_data + in_buffer_row_offset;
+    int max_x = input_w + stbir_info->horizontal_filter_pixel_margin;
+    int decode = STBIR__DECODE(type, colorspace);
+
+    int x = -stbir_info->horizontal_filter_pixel_margin;
+
+    // special handling for STBIR_EDGE_ZERO because it needs to return an item that doesn't appear in the input,
+    // and we want to avoid paying overhead on every pixel if not STBIR_EDGE_ZERO
+    if (edge_vertical == STBIR_EDGE_ZERO && (n < 0 || n >= stbir_info->input_h))
+    {
+        for (; x < max_x; x++)
+            for (c = 0; c < channels; c++)
+                decode_buffer[x*channels + c] = 0;
+        return;
+    }
+
+    switch (decode)
+    {
+    case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_LINEAR):
+        for (; x < max_x; x++)
+        {
+            int decode_pixel_index = x * channels;
+            int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
+            for (c = 0; c < channels; c++)
+                decode_buffer[decode_pixel_index + c] = ((float)((const unsigned char*)input_data)[input_pixel_index + c]) / 255;
+        }
+        break;
+
+    case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_SRGB):
+        for (; x < max_x; x++)
+        {
+            int decode_pixel_index = x * channels;
+            int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
+            for (c = 0; c < channels; c++)
+                decode_buffer[decode_pixel_index + c] = stbir__srgb_uchar_to_linear_float[((const unsigned char*)input_data)[input_pixel_index + c]];
+
+            if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                decode_buffer[decode_pixel_index + alpha_channel] = ((float)((const unsigned char*)input_data)[input_pixel_index + alpha_channel]) / 255;
+        }
+        break;
+
+    case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR):
+        for (; x < max_x; x++)
+        {
+            int decode_pixel_index = x * channels;
+            int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
+            for (c = 0; c < channels; c++)
+                decode_buffer[decode_pixel_index + c] = ((float)((const unsigned short*)input_data)[input_pixel_index + c]) / 65535;
+        }
+        break;
+
+    case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB):
+        for (; x < max_x; x++)
+        {
+            int decode_pixel_index = x * channels;
+            int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
+            for (c = 0; c < channels; c++)
+                decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear(((float)((const unsigned short*)input_data)[input_pixel_index + c]) / 65535);
+
+            if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                decode_buffer[decode_pixel_index + alpha_channel] = ((float)((const unsigned short*)input_data)[input_pixel_index + alpha_channel]) / 65535;
+        }
+        break;
+
+    case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR):
+        for (; x < max_x; x++)
+        {
+            int decode_pixel_index = x * channels;
+            int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
+            for (c = 0; c < channels; c++)
+                decode_buffer[decode_pixel_index + c] = (float)(((double)((const unsigned int*)input_data)[input_pixel_index + c]) / 4294967295);
+        }
+        break;
+
+    case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB):
+        for (; x < max_x; x++)
+        {
+            int decode_pixel_index = x * channels;
+            int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
+            for (c = 0; c < channels; c++)
+                decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear((float)(((double)((const unsigned int*)input_data)[input_pixel_index + c]) / 4294967295));
+
+            if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                decode_buffer[decode_pixel_index + alpha_channel] = (float)(((double)((const unsigned int*)input_data)[input_pixel_index + alpha_channel]) / 4294967295);
+        }
+        break;
+
+    case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR):
+        for (; x < max_x; x++)
+        {
+            int decode_pixel_index = x * channels;
+            int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
+            for (c = 0; c < channels; c++)
+                decode_buffer[decode_pixel_index + c] = ((const float*)input_data)[input_pixel_index + c];
+        }
+        break;
+
+    case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB):
+        for (; x < max_x; x++)
+        {
+            int decode_pixel_index = x * channels;
+            int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
+            for (c = 0; c < channels; c++)
+                decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear(((const float*)input_data)[input_pixel_index + c]);
+
+            if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                decode_buffer[decode_pixel_index + alpha_channel] = ((const float*)input_data)[input_pixel_index + alpha_channel];
+        }
+        break;
+
+    default:
+        STBIR__UNIMPLEMENTED("Unknown type/colorspace/channels combination.");
+        break;
+    }
+
+    if (!(stbir_info->flags & STBIR_FLAG_ALPHA_PREMULTIPLIED))
+    {
+        for (x = -stbir_info->horizontal_filter_pixel_margin; x < max_x; x++)
+        {
+            int decode_pixel_index = x * channels;
+
+            // Premultiplying by an alpha of exactly 0 would zero the color channels and lose them for good,
+            // so for non-float types nudge alpha up by a small epsilon first.
+            float alpha = decode_buffer[decode_pixel_index + alpha_channel];
+#ifndef STBIR_NO_ALPHA_EPSILON
+            if (stbir_info->type != STBIR_TYPE_FLOAT) {
+                alpha += STBIR_ALPHA_EPSILON;
+                decode_buffer[decode_pixel_index + alpha_channel] = alpha;
+            }
+#endif
+            for (c = 0; c < channels; c++)
+            {
+                if (c == alpha_channel)
+                    continue;
+
+                decode_buffer[decode_pixel_index + c] *= alpha;
+            }
+        }
+    }
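+    // Worked example for the premultiply step above (illustrative values, assuming
+    // STBIR_NO_ALPHA_EPSILON is not defined and the input type is an integer type):
+    //   decoded pixel: color = 0.5, alpha = 0
+    //   after the epsilon: alpha = STBIR_ALPHA_EPSILON
+    //   after premultiply: color = 0.5 * STBIR_ALPHA_EPSILON (tiny, but not 0)
+    // If that pixel is the only contributor to an output pixel, stbir__encode_scanline
+    // multiplies by 1 / alpha and recovers a color of approximately 0.5.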
+
+    if (edge_horizontal == STBIR_EDGE_ZERO)
+    {
+        for (x = -stbir_info->horizontal_filter_pixel_margin; x < 0; x++)
+        {
+            for (c = 0; c < channels; c++)
+                decode_buffer[x*channels + c] = 0;
+        }
+        for (x = input_w; x < max_x; x++)
+        {
+            for (c = 0; c < channels; c++)
+                decode_buffer[x*channels + c] = 0;
+        }
+    }
+}
+
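+// Return a pointer to ring buffer entry `index`. ring_buffer_length is the size of one entry in floats.
+static float* stbir__get_ring_buffer_entry(float* ring_buffer, int index, int ring_buffer_length)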
+{
+    return &ring_buffer[index * ring_buffer_length];
+}
+
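+// Claim the next ring buffer slot for scanline n, zero it, and return it.
+// The first call (ring_buffer_begin_index < 0) starts the ring buffer at slot 0.
+static float* stbir__add_empty_ring_buffer_entry(stbir__info* stbir_info, int n)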
+{
+    int ring_buffer_index;
+    float* ring_buffer;
+
+    if (stbir_info->ring_buffer_begin_index < 0)
+    {
+        ring_buffer_index = stbir_info->ring_buffer_begin_index = 0;
+        stbir_info->ring_buffer_first_scanline = n;
+    }
+    else
+    {
+        ring_buffer_index = (stbir_info->ring_buffer_begin_index + (stbir_info->ring_buffer_last_scanline - stbir_info->ring_buffer_first_scanline) + 1) % stbir_info->vertical_filter_pixel_width;
+        STBIR__DEBUG_ASSERT(ring_buffer_index != stbir_info->ring_buffer_begin_index);
+    }
+
+    ring_buffer = stbir__get_ring_buffer_entry(stbir_info->ring_buffer, ring_buffer_index, stbir_info->ring_buffer_length_bytes / sizeof(float));
+    memset(ring_buffer, 0, stbir_info->ring_buffer_length_bytes);
+
+    stbir_info->ring_buffer_last_scanline = n;
+
+    return ring_buffer;
+}
+
+
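+// Gather: for each output pixel x, accumulate the weighted input pixels n0..n1 from the decode buffer.
+// The result is added into output_buffer, so the caller must pass a zeroed buffer.
+static void stbir__resample_horizontal_upsample(stbir__info* stbir_info, int n, float* output_buffer)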
+{
+    int x, k;
+    int output_w = stbir_info->output_w;
+    int kernel_pixel_width = stbir_info->horizontal_filter_pixel_width;
+    int channels = stbir_info->channels;
+    float* decode_buffer = stbir__get_decode_buffer(stbir_info);
+    stbir__contributors* horizontal_contributors = stbir_info->horizontal_contributors;
+    float* horizontal_coefficients = stbir_info->horizontal_coefficients;
+    int coefficient_width = stbir_info->horizontal_coefficient_width;
+
+    for (x = 0; x < output_w; x++)
+    {
+        int n0 = horizontal_contributors[x].n0;
+        int n1 = horizontal_contributors[x].n1;
+
+        int out_pixel_index = x * channels;
+        int coefficient_group = coefficient_width * x;
+        int coefficient_counter = 0;
+
+        STBIR__DEBUG_ASSERT(n1 >= n0);
+        STBIR__DEBUG_ASSERT(n0 >= -stbir_info->horizontal_filter_pixel_margin);
+        STBIR__DEBUG_ASSERT(n1 >= -stbir_info->horizontal_filter_pixel_margin);
+        STBIR__DEBUG_ASSERT(n0 < stbir_info->input_w + stbir_info->horizontal_filter_pixel_margin);
+        STBIR__DEBUG_ASSERT(n1 < stbir_info->input_w + stbir_info->horizontal_filter_pixel_margin);
+
+        switch (channels) {
+            case 1:
+                for (k = n0; k <= n1; k++)
+                {
+                    int in_pixel_index = k * 1;
+                    float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
+                }
+                break;
+            case 2:
+                for (k = n0; k <= n1; k++)
+                {
+                    int in_pixel_index = k * 2;
+                    float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
+                    output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
+                }
+                break;
+            case 3:
+                for (k = n0; k <= n1; k++)
+                {
+                    int in_pixel_index = k * 3;
+                    float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
+                    output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
+                    output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient;
+                }
+                break;
+            case 4:
+                for (k = n0; k <= n1; k++)
+                {
+                    int in_pixel_index = k * 4;
+                    float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
+                    output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
+                    output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient;
+                    output_buffer[out_pixel_index + 3] += decode_buffer[in_pixel_index + 3] * coefficient;
+                }
+                break;
+            default:
+                for (k = n0; k <= n1; k++)
+                {
+                    int in_pixel_index = k * channels;
+                    float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
+                    int c;
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    for (c = 0; c < channels; c++)
+                        output_buffer[out_pixel_index + c] += decode_buffer[in_pixel_index + c] * coefficient;
+                }
+                break;
+        }
+    }
+}
+
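+// Scatter: for each input pixel (including the filter margin), add its weighted contribution
+// to output pixels n0..n1. The caller must pass a zeroed output_buffer.
+static void stbir__resample_horizontal_downsample(stbir__info* stbir_info, int n, float* output_buffer)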
+{
+    int x, k;
+    int input_w = stbir_info->input_w;
+    int output_w = stbir_info->output_w;
+    int kernel_pixel_width = stbir_info->horizontal_filter_pixel_width;
+    int channels = stbir_info->channels;
+    float* decode_buffer = stbir__get_decode_buffer(stbir_info);
+    stbir__contributors* horizontal_contributors = stbir_info->horizontal_contributors;
+    float* horizontal_coefficients = stbir_info->horizontal_coefficients;
+    int coefficient_width = stbir_info->horizontal_coefficient_width;
+    int filter_pixel_margin = stbir_info->horizontal_filter_pixel_margin;
+    int max_x = input_w + filter_pixel_margin * 2;
+
+    STBIR__DEBUG_ASSERT(!stbir__use_width_upsampling(stbir_info));
+
+    switch (channels) {
+        case 1:
+            for (x = 0; x < max_x; x++)
+            {
+                int n0 = horizontal_contributors[x].n0;
+                int n1 = horizontal_contributors[x].n1;
+
+                int in_x = x - filter_pixel_margin;
+                int in_pixel_index = in_x * 1;
+                int max_n = n1;
+                int coefficient_group = coefficient_width * x;
+
+                for (k = n0; k <= max_n; k++)
+                {
+                    int out_pixel_index = k * 1;
+                    float coefficient = horizontal_coefficients[coefficient_group + k - n0];
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
+                }
+            }
+            break;
+
+        case 2:
+            for (x = 0; x < max_x; x++)
+            {
+                int n0 = horizontal_contributors[x].n0;
+                int n1 = horizontal_contributors[x].n1;
+
+                int in_x = x - filter_pixel_margin;
+                int in_pixel_index = in_x * 2;
+                int max_n = n1;
+                int coefficient_group = coefficient_width * x;
+
+                for (k = n0; k <= max_n; k++)
+                {
+                    int out_pixel_index = k * 2;
+                    float coefficient = horizontal_coefficients[coefficient_group + k - n0];
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
+                    output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
+                }
+            }
+            break;
+
+        case 3:
+            for (x = 0; x < max_x; x++)
+            {
+                int n0 = horizontal_contributors[x].n0;
+                int n1 = horizontal_contributors[x].n1;
+
+                int in_x = x - filter_pixel_margin;
+                int in_pixel_index = in_x * 3;
+                int max_n = n1;
+                int coefficient_group = coefficient_width * x;
+
+                for (k = n0; k <= max_n; k++)
+                {
+                    int out_pixel_index = k * 3;
+                    float coefficient = horizontal_coefficients[coefficient_group + k - n0];
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
+                    output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
+                    output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient;
+                }
+            }
+            break;
+
+        case 4:
+            for (x = 0; x < max_x; x++)
+            {
+                int n0 = horizontal_contributors[x].n0;
+                int n1 = horizontal_contributors[x].n1;
+
+                int in_x = x - filter_pixel_margin;
+                int in_pixel_index = in_x * 4;
+                int max_n = n1;
+                int coefficient_group = coefficient_width * x;
+
+                for (k = n0; k <= max_n; k++)
+                {
+                    int out_pixel_index = k * 4;
+                    float coefficient = horizontal_coefficients[coefficient_group + k - n0];
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
+                    output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
+                    output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient;
+                    output_buffer[out_pixel_index + 3] += decode_buffer[in_pixel_index + 3] * coefficient;
+                }
+            }
+            break;
+
+        default:
+            for (x = 0; x < max_x; x++)
+            {
+                int n0 = horizontal_contributors[x].n0;
+                int n1 = horizontal_contributors[x].n1;
+
+                int in_x = x - filter_pixel_margin;
+                int in_pixel_index = in_x * channels;
+                int max_n = n1;
+                int coefficient_group = coefficient_width * x;
+
+                for (k = n0; k <= max_n; k++)
+                {
+                    int c;
+                    int out_pixel_index = k * channels;
+                    float coefficient = horizontal_coefficients[coefficient_group + k - n0];
+                    STBIR__DEBUG_ASSERT(coefficient != 0);
+                    for (c = 0; c < channels; c++)
+                        output_buffer[out_pixel_index + c] += decode_buffer[in_pixel_index + c] * coefficient;
+                }
+            }
+            break;
+    }
+}
+
+static void stbir__decode_and_resample_upsample(stbir__info* stbir_info, int n)
+{
+    // Decode the nth scanline from the source image into the decode buffer.
+    stbir__decode_scanline(stbir_info, n);
+
+    // Now resample it into the ring buffer.
+    if (stbir__use_width_upsampling(stbir_info))
+        stbir__resample_horizontal_upsample(stbir_info, n, stbir__add_empty_ring_buffer_entry(stbir_info, n));
+    else
+        stbir__resample_horizontal_downsample(stbir_info, n, stbir__add_empty_ring_buffer_entry(stbir_info, n));
+
+    // Now it's sitting in the ring buffer ready to be used as source for the vertical sampling.
+}
+
+static void stbir__decode_and_resample_downsample(stbir__info* stbir_info, int n)
+{
+    // Decode the nth scanline from the source image into the decode buffer.
+    stbir__decode_scanline(stbir_info, n);
+
+    memset(stbir_info->horizontal_buffer, 0, stbir_info->output_w * stbir_info->channels * sizeof(float));
+
+    // Now resample it into the horizontal buffer.
+    if (stbir__use_width_upsampling(stbir_info))
+        stbir__resample_horizontal_upsample(stbir_info, n, stbir_info->horizontal_buffer);
+    else
+        stbir__resample_horizontal_downsample(stbir_info, n, stbir_info->horizontal_buffer);
+
+    // Now it's sitting in the horizontal buffer ready to be distributed into the ring buffers.
+}
+
+// Get the specified scan line from the ring buffer.
+static float* stbir__get_ring_buffer_scanline(int get_scanline, float* ring_buffer, int begin_index, int first_scanline, int ring_buffer_size, int ring_buffer_length)
+{
+    int ring_buffer_index = (begin_index + (get_scanline - first_scanline)) % ring_buffer_size;
+    return stbir__get_ring_buffer_entry(ring_buffer, ring_buffer_index, ring_buffer_length);
+}
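+
+// Worked example for the index math above (illustrative values, not taken from a real run):
+//   begin_index = 2, first_scanline = 10, ring_buffer_size = 4, get_scanline = 12
+//   ring_buffer_index = (2 + (12 - 10)) % 4 = 4 % 4 = 0
+// so scanline 12 lives in slot 0, i.e. at &ring_buffer[0 * ring_buffer_length].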
+
+
+static void stbir__encode_scanline(stbir__info* stbir_info, int num_pixels, void *output_buffer, float *encode_buffer, int channels, int alpha_channel, int decode)
+{
+    int x;
+    int n;
+    int num_nonalpha;
+    stbir_uint16 nonalpha[STBIR_MAX_CHANNELS];
+
+    if (!(stbir_info->flags&STBIR_FLAG_ALPHA_PREMULTIPLIED))
+    {
+        for (x=0; x < num_pixels; ++x)
+        {
+            int pixel_index = x*channels;
+
+            float alpha = encode_buffer[pixel_index + alpha_channel];
+            float reciprocal_alpha = alpha ? 1.0f / alpha : 0;
+
+            // unrolling this produced a 1% slowdown upscaling a large RGBA linear-space image on my machine - stb
+            for (n = 0; n < channels; n++)
+                if (n != alpha_channel)
+                    encode_buffer[pixel_index + n] *= reciprocal_alpha;
+
+            // We added in a small epsilon to prevent the color channel from being deleted with zero alpha.
+            // Because we only add it for integer types, it will automatically be discarded on integer
+            // conversion, so we don't need to subtract it back out (which would be problematic for
+            // numeric precision reasons).
+        }
+    }
+
+    // build a table of all channels that need colorspace correction, so
+    // we don't perform colorspace correction on channels that don't need it.
+    for (x=0, num_nonalpha=0; x < channels; ++x)
+        if (x != alpha_channel || (stbir_info->flags & STBIR_FLAG_ALPHA_USES_COLORSPACE))
+            nonalpha[num_nonalpha++] = x;
+
+    #define STBIR__ROUND_INT(f)    ((int)          ((f)+0.5))
+    #define STBIR__ROUND_UINT(f)   ((stbir_uint32) ((f)+0.5))
+
+    #ifdef STBIR__SATURATE_INT
+    #define STBIR__ENCODE_LINEAR8(f)   stbir__saturate8 (STBIR__ROUND_INT((f) * 255  ))
+    #define STBIR__ENCODE_LINEAR16(f)  stbir__saturate16(STBIR__ROUND_INT((f) * 65535))
+    #else
+    #define STBIR__ENCODE_LINEAR8(f)   (unsigned char ) STBIR__ROUND_INT(stbir__saturate(f) * 255  )
+    #define STBIR__ENCODE_LINEAR16(f)  (unsigned short) STBIR__ROUND_INT(stbir__saturate(f) * 65535)
+    #endif
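+
+    // Worked example for the rounding macros above (hand-computed, assuming the
+    // stbir__saturate helpers clamp to the valid range as their names suggest):
+    //   STBIR__ENCODE_LINEAR8(0.5f): 0.5 * 255 = 127.5; 127.5 + 0.5 = 128.0; truncated to 128
+    //   STBIR__ENCODE_LINEAR8(1.5f): clamped, so the result is 255 in either branch
+    // Adding 0.5 before the truncating cast rounds non-negative values to nearest.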
+
+    switch (decode)
+    {
+        case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_LINEAR):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < channels; n++)
+                {
+                    int index = pixel_index + n;
+                    ((unsigned char*)output_buffer)[index] = STBIR__ENCODE_LINEAR8(encode_buffer[index]);
+                }
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_SRGB):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < num_nonalpha; n++)
+                {
+                    int index = pixel_index + nonalpha[n];
+                    ((unsigned char*)output_buffer)[index] = stbir__linear_to_srgb_uchar(encode_buffer[index]);
+                }
+
+                if (!(stbir_info->flags & STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                    ((unsigned char *)output_buffer)[pixel_index + alpha_channel] = STBIR__ENCODE_LINEAR8(encode_buffer[pixel_index+alpha_channel]);
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < channels; n++)
+                {
+                    int index = pixel_index + n;
+                    ((unsigned short*)output_buffer)[index] = STBIR__ENCODE_LINEAR16(encode_buffer[index]);
+                }
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < num_nonalpha; n++)
+                {
+                    int index = pixel_index + nonalpha[n];
+                    ((unsigned short*)output_buffer)[index] = (unsigned short)STBIR__ROUND_INT(stbir__linear_to_srgb(stbir__saturate(encode_buffer[index])) * 65535);
+                }
+
+                if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                    ((unsigned short*)output_buffer)[pixel_index + alpha_channel] = STBIR__ENCODE_LINEAR16(encode_buffer[pixel_index + alpha_channel]);
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < channels; n++)
+                {
+                    int index = pixel_index + n;
+                    ((unsigned int*)output_buffer)[index] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__saturate(encode_buffer[index])) * 4294967295);
+                }
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < num_nonalpha; n++)
+                {
+                    int index = pixel_index + nonalpha[n];
+                    ((unsigned int*)output_buffer)[index] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__linear_to_srgb(stbir__saturate(encode_buffer[index]))) * 4294967295);
+                }
+
+                if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                    ((unsigned int*)output_buffer)[pixel_index + alpha_channel] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__saturate(encode_buffer[pixel_index + alpha_channel])) * 4294967295);
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < channels; n++)
+                {
+                    int index = pixel_index + n;
+                    ((float*)output_buffer)[index] = encode_buffer[index];
+                }
+            }
+            break;
+
+        case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB):
+            for (x=0; x < num_pixels; ++x)
+            {
+                int pixel_index = x*channels;
+
+                for (n = 0; n < num_nonalpha; n++)
+                {
+                    int index = pixel_index + nonalpha[n];
+                    ((float*)output_buffer)[index] = stbir__linear_to_srgb(encode_buffer[index]);
+                }
+
+                if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
+                    ((float*)output_buffer)[pixel_index + alpha_channel] = encode_buffer[pixel_index + alpha_channel];
+            }
+            break;
+
+        default:
+            STBIR__UNIMPLEMENTED("Unknown type/colorspace/channels combination.");
+            break;
+    }
+}
+
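+// Gather: combine ring buffer scanlines n0..n1 with the vertical coefficients into output row n,
+// then encode that row into output_data.
+static void stbir__resample_vertical_upsample(stbir__info* stbir_info, int n, int in_first_scanline, int in_last_scanline, float in_center_of_out)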
+{
+    int x, k;
+    int output_w = stbir_info->output_w;
+    stbir__contributors* vertical_contributors = stbir_info->vertical_contributors;
+    float* vertical_coefficients = stbir_info->vertical_coefficients;
+    int channels = stbir_info->channels;
+    int alpha_channel = stbir_info->alpha_channel;
+    int type = stbir_info->type;
+    int colorspace = stbir_info->colorspace;
+    int kernel_pixel_width = stbir_info->vertical_filter_pixel_width;
+    void* output_data = stbir_info->output_data;
+    float* encode_buffer = stbir_info->encode_buffer;
+    int decode = STBIR__DECODE(type, colorspace);
+    int coefficient_width = stbir_info->vertical_coefficient_width;
+    int coefficient_counter;
+    int contributor = n;
+
+    float* ring_buffer = stbir_info->ring_buffer;
+    int ring_buffer_begin_index = stbir_info->ring_buffer_begin_index;
+    int ring_buffer_first_scanline = stbir_info->ring_buffer_first_scanline;
+    int ring_buffer_last_scanline = stbir_info->ring_buffer_last_scanline;
+    int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float);
+
+    int n0,n1, output_row_start;
+    int coefficient_group = coefficient_width * contributor;
+
+    n0 = vertical_contributors[contributor].n0;
+    n1 = vertical_contributors[contributor].n1;
+
+    output_row_start = n * stbir_info->output_stride_bytes;
+
+    STBIR__DEBUG_ASSERT(stbir__use_height_upsampling(stbir_info));
+
+    memset(encode_buffer, 0, output_w * sizeof(float) * channels);
+
+    // I tried reblocking this for better cache usage of encode_buffer
+    // (using x_outer, k, x_inner), but it lost speed. -- stb
+
+    coefficient_counter = 0;
+    switch (channels) {
+        case 1:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, kernel_pixel_width, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * 1;
+                    encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
+                }
+            }
+            break;
+        case 2:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, kernel_pixel_width, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * 2;
+                    encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
+                    encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient;
+                }
+            }
+            break;
+        case 3:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, kernel_pixel_width, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * 3;
+                    encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
+                    encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient;
+                    encode_buffer[in_pixel_index + 2] += ring_buffer_entry[in_pixel_index + 2] * coefficient;
+                }
+            }
+            break;
+        case 4:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, kernel_pixel_width, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * 4;
+                    encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
+                    encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient;
+                    encode_buffer[in_pixel_index + 2] += ring_buffer_entry[in_pixel_index + 2] * coefficient;
+                    encode_buffer[in_pixel_index + 3] += ring_buffer_entry[in_pixel_index + 3] * coefficient;
+                }
+            }
+            break;
+        default:
+            for (k = n0; k <= n1; k++)
+            {
+                int coefficient_index = coefficient_counter++;
+                float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, kernel_pixel_width, ring_buffer_length);
+                float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+                for (x = 0; x < output_w; ++x)
+                {
+                    int in_pixel_index = x * channels;
+                    int c;
+                    for (c = 0; c < channels; c++)
+                        encode_buffer[in_pixel_index + c] += ring_buffer_entry[in_pixel_index + c] * coefficient;
+                }
+            }
+            break;
+    }
+    stbir__encode_scanline(stbir_info, output_w, (char *) output_data + output_row_start, encode_buffer, channels, alpha_channel, decode);
+}
+
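+// Scatter: add the horizontally resampled scanline in horizontal_buffer (input row n),
+// weighted by the vertical coefficients, into the ring buffer entries for output rows n0..n1.
+static void stbir__resample_vertical_downsample(stbir__info* stbir_info, int n, int in_first_scanline, int in_last_scanline, float in_center_of_out)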
+{
+    int x, k;
+    int output_w = stbir_info->output_w;
+    int output_h = stbir_info->output_h;
+    stbir__contributors* vertical_contributors = stbir_info->vertical_contributors;
+    float* vertical_coefficients = stbir_info->vertical_coefficients;
+    int channels = stbir_info->channels;
+    int kernel_pixel_width = stbir_info->vertical_filter_pixel_width;
+    void* output_data = stbir_info->output_data;
+    float* horizontal_buffer = stbir_info->horizontal_buffer;
+    int coefficient_width = stbir_info->vertical_coefficient_width;
+    int contributor = n + stbir_info->vertical_filter_pixel_margin;
+
+    float* ring_buffer = stbir_info->ring_buffer;
+    int ring_buffer_begin_index = stbir_info->ring_buffer_begin_index;
+    int ring_buffer_first_scanline = stbir_info->ring_buffer_first_scanline;
+    int ring_buffer_last_scanline = stbir_info->ring_buffer_last_scanline;
+    int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float);
+    int n0,n1;
+
+    n0 = vertical_contributors[contributor].n0;
+    n1 = vertical_contributors[contributor].n1;
+
+    STBIR__DEBUG_ASSERT(!stbir__use_height_upsampling(stbir_info));
+
+    for (k = n0; k <= n1; k++)
+    {
+        int coefficient_index = k - n0;
+        int coefficient_group = coefficient_width * contributor;
+        float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
+
+        float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, kernel_pixel_width, ring_buffer_length);
+
+        switch (channels) {
+            case 1:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * 1;
+                    ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
+                }
+                break;
+            case 2:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * 2;
+                    ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient;
+                }
+                break;
+            case 3:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * 3;
+                    ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 2] += horizontal_buffer[in_pixel_index + 2] * coefficient;
+                }
+                break;
+            case 4:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * 4;
+                    ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 2] += horizontal_buffer[in_pixel_index + 2] * coefficient;
+                    ring_buffer_entry[in_pixel_index + 3] += horizontal_buffer[in_pixel_index + 3] * coefficient;
+                }
+                break;
+            default:
+                for (x = 0; x < output_w; x++)
+                {
+                    int in_pixel_index = x * channels;
+
+                    int c;
+                    for (c = 0; c < channels; c++)
+                        ring_buffer_entry[in_pixel_index + c] += horizontal_buffer[in_pixel_index + c] * coefficient;
+                }
+                break;
+        }
+    }
+}
+
+static void stbir__buffer_loop_upsample(stbir__info* stbir_info)
+{
+    int y;
+    float scale_ratio = stbir_info->vertical_scale;
+    float out_scanlines_radius = stbir__filter_info_table[stbir_info->vertical_filter].support(1/scale_ratio) * scale_ratio;
+
+    STBIR__DEBUG_ASSERT(stbir__use_height_upsampling(stbir_info));
+
+    for (y = 0; y < stbir_info->output_h; y++)
+    {
+        float in_center_of_out = 0; // Center of the current out scanline in the in scanline space
+        int in_first_scanline = 0, in_last_scanline = 0;
+
+        stbir__calculate_sample_range_upsample(y, out_scanlines_radius, scale_ratio, stbir_info->vertical_shift, &in_first_scanline, &in_last_scanline, &in_center_of_out);
+
+        STBIR__DEBUG_ASSERT(in_last_scanline - in_first_scanline <= stbir_info->vertical_filter_pixel_width);
+
+        if (stbir_info->ring_buffer_begin_index >= 0)
+        {
+            // Get rid of whatever we don't need anymore.
+            while (in_first_scanline > stbir_info->ring_buffer_first_scanline)
+            {
+                if (stbir_info->ring_buffer_first_scanline == stbir_info->ring_buffer_last_scanline)
+                {
+                    // We just popped the last scanline off the ring buffer.
+                    // Reset it to the empty state.
+                    stbir_info->ring_buffer_begin_index = -1;
+                    stbir_info->ring_buffer_first_scanline = 0;
+                    stbir_info->ring_buffer_last_scanline = 0;
+                    break;
+                }
+                else
+                {
+                    stbir_info->ring_buffer_first_scanline++;
+                    stbir_info->ring_buffer_begin_index = (stbir_info->ring_buffer_begin_index + 1) % stbir_info->vertical_filter_pixel_width;
+                }
+            }
+        }
+
+        // Load in new ones.
+        if (stbir_info->ring_buffer_begin_index < 0)
+            stbir__decode_and_resample_upsample(stbir_info, in_first_scanline);
+
+        while (in_last_scanline > stbir_info->ring_buffer_last_scanline)
+            stbir__decode_and_resample_upsample(stbir_info, stbir_info->ring_buffer_last_scanline + 1);
+
+        // Now all buffers should be ready to write a row of vertical sampling.
+        stbir__resample_vertical_upsample(stbir_info, y, in_first_scanline, in_last_scanline, in_center_of_out);
+
+        STBIR_PROGRESS_REPORT((float)y / stbir_info->output_h);
+    }
+}
+
+static void stbir__empty_ring_buffer(stbir__info* stbir_info, int first_necessary_scanline)
+{
+    int output_stride_bytes = stbir_info->output_stride_bytes;
+    int channels = stbir_info->channels;
+    int alpha_channel = stbir_info->alpha_channel;
+    int type = stbir_info->type;
+    int colorspace = stbir_info->colorspace;
+    int output_w = stbir_info->output_w;
+    void* output_data = stbir_info->output_data;
+    int decode = STBIR__DECODE(type, colorspace);
+
+    float* ring_buffer = stbir_info->ring_buffer;
+    int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float);
+
+    if (stbir_info->ring_buffer_begin_index >= 0)
+    {
+        // Get rid of whatever we don't need anymore.
+        while (first_necessary_scanline > stbir_info->ring_buffer_first_scanline)
+        {
+            if (stbir_info->ring_buffer_first_scanline >= 0 && stbir_info->ring_buffer_first_scanline < stbir_info->output_h)
+            {
+                int output_row_start = stbir_info->ring_buffer_first_scanline * output_stride_bytes;
+                float* ring_buffer_entry = stbir__get_ring_buffer_entry(ring_buffer, stbir_info->ring_buffer_begin_index, ring_buffer_length);
+                stbir__encode_scanline(stbir_info, output_w, (char *) output_data + output_row_start, ring_buffer_entry, channels, alpha_channel, decode);
+                STBIR_PROGRESS_REPORT((float)stbir_info->ring_buffer_first_scanline / stbir_info->output_h);
+            }
+
+            if (stbir_info->ring_buffer_first_scanline == stbir_info->ring_buffer_last_scanline)
+            {
+                // We just popped the last scanline off the ring buffer.
+                // Reset it to the empty state.
+                stbir_info->ring_buffer_begin_index = -1;
+                stbir_info->ring_buffer_first_scanline = 0;
+                stbir_info->ring_buffer_last_scanline = 0;
+                break;
+            }
+            else
+            {
+                stbir_info->ring_buffer_first_scanline++;
+                stbir_info->ring_buffer_begin_index = (stbir_info->ring_buffer_begin_index + 1) % stbir_info->vertical_filter_pixel_width;
+            }
+        }
+    }
+}
+
+static void stbir__buffer_loop_downsample(stbir__info* stbir_info)
+{
+    int y;
+    float scale_ratio = stbir_info->vertical_scale;
+    int output_h = stbir_info->output_h;
+    float in_pixels_radius = stbir__filter_info_table[stbir_info->vertical_filter].support(scale_ratio) / scale_ratio;
+    int pixel_margin = stbir_info->vertical_filter_pixel_margin;
+    int max_y = stbir_info->input_h + pixel_margin;
+
+    STBIR__DEBUG_ASSERT(!stbir__use_height_upsampling(stbir_info));
+
+    for (y = -pixel_margin; y < max_y; y++)
+    {
+        float out_center_of_in; // Center of the current in scanline in the out scanline space
+        int out_first_scanline, out_last_scanline;
+
+        stbir__calculate_sample_range_downsample(y, in_pixels_radius, scale_ratio, stbir_info->vertical_shift, &out_first_scanline, &out_last_scanline, &out_center_of_in);
+
+        STBIR__DEBUG_ASSERT(out_last_scanline - out_first_scanline <= stbir_info->vertical_filter_pixel_width);
+
+        if (out_last_scanline < 0 || out_first_scanline >= output_h)
+            continue;
+
+        stbir__empty_ring_buffer(stbir_info, out_first_scanline);
+
+        stbir__decode_and_resample_downsample(stbir_info, y);
+
+        // Load in new ones.
+        if (stbir_info->ring_buffer_begin_index < 0)
+            stbir__add_empty_ring_buffer_entry(stbir_info, out_first_scanline);
+
+        while (out_last_scanline > stbir_info->ring_buffer_last_scanline)
+            stbir__add_empty_ring_buffer_entry(stbir_info, stbir_info->ring_buffer_last_scanline + 1);
+
+        // Now the horizontal buffer is ready to write to all ring buffer rows.
+        stbir__resample_vertical_downsample(stbir_info, y, out_first_scanline, out_last_scanline, out_center_of_in);
+    }
+
+    stbir__empty_ring_buffer(stbir_info, stbir_info->output_h);
+}
+
+static void stbir__setup(stbir__info *info, int input_w, int input_h, int output_w, int output_h, int channels)
+{
+    info->input_w = input_w;
+    info->input_h = input_h;
+    info->output_w = output_w;
+    info->output_h = output_h;
+    info->channels = channels;
+}
+
+static void stbir__calculate_transform(stbir__info *info, float s0, float t0, float s1, float t1, float *transform)
+{
+    info->s0 = s0;
+    info->t0 = t0;
+    info->s1 = s1;
+    info->t1 = t1;
+
+    if (transform)
+    {
+        info->horizontal_scale = transform[0];
+        info->vertical_scale   = transform[1];
+        info->horizontal_shift = transform[2];
+        info->vertical_shift   = transform[3];
+    }
+    else
+    {
+        info->horizontal_scale = ((float)info->output_w / info->input_w) / (s1 - s0);
+        info->vertical_scale = ((float)info->output_h / info->input_h) / (t1 - t0);
+
+        info->horizontal_shift = s0 * info->input_w / (s1 - s0);
+        info->vertical_shift = t0 * info->input_h / (t1 - t0);
+    }
+}
+
+static void stbir__choose_filter(stbir__info *info, stbir_filter h_filter, stbir_filter v_filter)
+{
+    if (h_filter == 0)
+        h_filter = stbir__use_upsampling(info->horizontal_scale) ? STBIR_DEFAULT_FILTER_UPSAMPLE : STBIR_DEFAULT_FILTER_DOWNSAMPLE;
+    if (v_filter == 0)
+        v_filter = stbir__use_upsampling(info->vertical_scale)   ? STBIR_DEFAULT_FILTER_UPSAMPLE : STBIR_DEFAULT_FILTER_DOWNSAMPLE;
+    info->horizontal_filter = h_filter;
+    info->vertical_filter = v_filter;
+}
+
+static stbir_uint32 stbir__calculate_memory(stbir__info *info)
+{
+    int pixel_margin = stbir__get_filter_pixel_margin(info->horizontal_filter, info->horizontal_scale);
+    int filter_height = stbir__get_filter_pixel_width(info->vertical_filter, info->vertical_scale);
+
+    info->horizontal_num_contributors = stbir__get_contributors(info->horizontal_scale, info->horizontal_filter, info->input_w, info->output_w);
+    info->vertical_num_contributors   = stbir__get_contributors(info->vertical_scale  , info->vertical_filter  , info->input_h, info->output_h);
+
+    info->horizontal_contributors_size = info->horizontal_num_contributors * sizeof(stbir__contributors);
+    info->horizontal_coefficients_size = stbir__get_total_horizontal_coefficients(info) * sizeof(float);
+    info->vertical_contributors_size = info->vertical_num_contributors * sizeof(stbir__contributors);
+    info->vertical_coefficients_size = stbir__get_total_vertical_coefficients(info) * sizeof(float);
+    info->decode_buffer_size = (info->input_w + pixel_margin * 2) * info->channels * sizeof(float);
+    info->horizontal_buffer_size = info->output_w * info->channels * sizeof(float);
+    info->ring_buffer_size = info->output_w * info->channels * filter_height * sizeof(float);
+    info->encode_buffer_size = info->output_w * info->channels * sizeof(float);
+
+    STBIR_ASSERT(info->horizontal_filter != 0);
+    STBIR_ASSERT(info->horizontal_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); // too late: the filter was already used in the size calculations above
+    STBIR_ASSERT(info->vertical_filter != 0);
+    STBIR_ASSERT(info->vertical_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); // too late: the filter was already used in the size calculations above
+
+    if (stbir__use_height_upsampling(info))
+        // The horizontal buffer is for when we're downsampling the height and we
+        // can't output the result of sampling the decode buffer directly into the
+        // ring buffers.
+        info->horizontal_buffer_size = 0;
+    else
+        // The encode buffer is to retain precision in the height upsampling method
+        // and isn't used when height downsampling.
+        info->encode_buffer_size = 0;
+
+    return info->horizontal_contributors_size + info->horizontal_coefficients_size
+        + info->vertical_contributors_size + info->vertical_coefficients_size
+        + info->decode_buffer_size + info->horizontal_buffer_size
+        + info->ring_buffer_size + info->encode_buffer_size;
+}
+
+static int stbir__resize_allocated(stbir__info *info,
+    const void* input_data, int input_stride_in_bytes,
+    void* output_data, int output_stride_in_bytes,
+    int alpha_channel, stbir_uint32 flags, stbir_datatype type,
+    stbir_edge edge_horizontal, stbir_edge edge_vertical, stbir_colorspace colorspace,
+    void* tempmem, size_t tempmem_size_in_bytes)
+{
+    size_t memory_required = stbir__calculate_memory(info);
+
+    int width_stride_input = input_stride_in_bytes ? input_stride_in_bytes : info->channels * info->input_w * stbir__type_size[type];
+    int width_stride_output = output_stride_in_bytes ? output_stride_in_bytes : info->channels * info->output_w * stbir__type_size[type];
+
+#ifdef STBIR_DEBUG_OVERWRITE_TEST
+#define OVERWRITE_ARRAY_SIZE 8
+    unsigned char overwrite_output_before_pre[OVERWRITE_ARRAY_SIZE];
+    unsigned char overwrite_tempmem_before_pre[OVERWRITE_ARRAY_SIZE];
+    unsigned char overwrite_output_after_pre[OVERWRITE_ARRAY_SIZE];
+    unsigned char overwrite_tempmem_after_pre[OVERWRITE_ARRAY_SIZE];
+
+    size_t begin_forbidden = width_stride_output * (info->output_h - 1) + info->output_w * info->channels * stbir__type_size[type];
+    memcpy(overwrite_output_before_pre, &((unsigned char*)output_data)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE);
+    memcpy(overwrite_output_after_pre, &((unsigned char*)output_data)[begin_forbidden], OVERWRITE_ARRAY_SIZE);
+    memcpy(overwrite_tempmem_before_pre, &((unsigned char*)tempmem)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE);
+    memcpy(overwrite_tempmem_after_pre, &((unsigned char*)tempmem)[tempmem_size_in_bytes], OVERWRITE_ARRAY_SIZE);
+#endif
+
+    STBIR_ASSERT(info->channels >= 0);
+    STBIR_ASSERT(info->channels <= STBIR_MAX_CHANNELS);
+
+    if (info->channels < 0 || info->channels > STBIR_MAX_CHANNELS)
+        return 0;
+
+    STBIR_ASSERT(info->horizontal_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table));
+    STBIR_ASSERT(info->vertical_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table));
+
+    if (info->horizontal_filter >= STBIR__ARRAY_SIZE(stbir__filter_info_table))
+        return 0;
+    if (info->vertical_filter >= STBIR__ARRAY_SIZE(stbir__filter_info_table))
+        return 0;
+
+    if (alpha_channel < 0)
+        flags |= STBIR_FLAG_ALPHA_USES_COLORSPACE | STBIR_FLAG_ALPHA_PREMULTIPLIED;
+
+    if (!(flags&STBIR_FLAG_ALPHA_USES_COLORSPACE) || !(flags&STBIR_FLAG_ALPHA_PREMULTIPLIED))
+        STBIR_ASSERT(alpha_channel >= 0 && alpha_channel < info->channels);
+
+    if (alpha_channel >= info->channels)
+        return 0;
+
+    STBIR_ASSERT(tempmem);
+
+    if (!tempmem)
+        return 0;
+
+    STBIR_ASSERT(tempmem_size_in_bytes >= memory_required);
+
+    if (tempmem_size_in_bytes < memory_required)
+        return 0;
+
+    memset(tempmem, 0, tempmem_size_in_bytes);
+
+    info->input_data = input_data;
+    info->input_stride_bytes = width_stride_input;
+
+    info->output_data = output_data;
+    info->output_stride_bytes = width_stride_output;
+
+    info->alpha_channel = alpha_channel;
+    info->flags = flags;
+    info->type = type;
+    info->edge_horizontal = edge_horizontal;
+    info->edge_vertical = edge_vertical;
+    info->colorspace = colorspace;
+
+    info->horizontal_coefficient_width   = stbir__get_coefficient_width  (info->horizontal_filter, info->horizontal_scale);
+    info->vertical_coefficient_width     = stbir__get_coefficient_width  (info->vertical_filter  , info->vertical_scale  );
+    info->horizontal_filter_pixel_width  = stbir__get_filter_pixel_width (info->horizontal_filter, info->horizontal_scale);
+    info->vertical_filter_pixel_width    = stbir__get_filter_pixel_width (info->vertical_filter  , info->vertical_scale  );
+    info->horizontal_filter_pixel_margin = stbir__get_filter_pixel_margin(info->horizontal_filter, info->horizontal_scale);
+    info->vertical_filter_pixel_margin   = stbir__get_filter_pixel_margin(info->vertical_filter  , info->vertical_scale  );
+
+    info->ring_buffer_length_bytes = info->output_w * info->channels * sizeof(float);
+    info->decode_buffer_pixels = info->input_w + info->horizontal_filter_pixel_margin * 2;
+
+#define STBIR__NEXT_MEMPTR(current, newtype) (newtype*)(((unsigned char*)current) + current##_size)
+
+    info->horizontal_contributors = (stbir__contributors *) tempmem;
+    info->horizontal_coefficients = STBIR__NEXT_MEMPTR(info->horizontal_contributors, float);
+    info->vertical_contributors = STBIR__NEXT_MEMPTR(info->horizontal_coefficients, stbir__contributors);
+    info->vertical_coefficients = STBIR__NEXT_MEMPTR(info->vertical_contributors, float);
+    info->decode_buffer = STBIR__NEXT_MEMPTR(info->vertical_coefficients, float);
+
+    if (stbir__use_height_upsampling(info))
+    {
+        info->horizontal_buffer = NULL;
+        info->ring_buffer = STBIR__NEXT_MEMPTR(info->decode_buffer, float);
+        info->encode_buffer = STBIR__NEXT_MEMPTR(info->ring_buffer, float);
+
+        STBIR__DEBUG_ASSERT((size_t)STBIR__NEXT_MEMPTR(info->encode_buffer, unsigned char) == (size_t)tempmem + tempmem_size_in_bytes);
+    }
+    else
+    {
+        info->horizontal_buffer = STBIR__NEXT_MEMPTR(info->decode_buffer, float);
+        info->ring_buffer = STBIR__NEXT_MEMPTR(info->horizontal_buffer, float);
+        info->encode_buffer = NULL;
+
+        STBIR__DEBUG_ASSERT((size_t)STBIR__NEXT_MEMPTR(info->ring_buffer, unsigned char) == (size_t)tempmem + tempmem_size_in_bytes);
+    }
+
+#undef STBIR__NEXT_MEMPTR
+
+    // This signals that the ring buffer is empty
+    info->ring_buffer_begin_index = -1;
+
+    stbir__calculate_filters(info, info->horizontal_contributors, info->horizontal_coefficients, info->horizontal_filter, info->horizontal_scale, info->horizontal_shift, info->input_w, info->output_w);
+    stbir__calculate_filters(info, info->vertical_contributors, info->vertical_coefficients, info->vertical_filter, info->vertical_scale, info->vertical_shift, info->input_h, info->output_h);
+
+    STBIR_PROGRESS_REPORT(0);
+
+    if (stbir__use_height_upsampling(info))
+        stbir__buffer_loop_upsample(info);
+    else
+        stbir__buffer_loop_downsample(info);
+
+    STBIR_PROGRESS_REPORT(1);
+
+#ifdef STBIR_DEBUG_OVERWRITE_TEST
+    STBIR__DEBUG_ASSERT(memcmp(overwrite_output_before_pre, &((unsigned char*)output_data)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE) == 0);
+    STBIR__DEBUG_ASSERT(memcmp(overwrite_output_after_pre, &((unsigned char*)output_data)[begin_forbidden], OVERWRITE_ARRAY_SIZE) == 0);
+    STBIR__DEBUG_ASSERT(memcmp(overwrite_tempmem_before_pre, &((unsigned char*)tempmem)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE) == 0);
+    STBIR__DEBUG_ASSERT(memcmp(overwrite_tempmem_after_pre, &((unsigned char*)tempmem)[tempmem_size_in_bytes], OVERWRITE_ARRAY_SIZE) == 0);
+#endif
+
+    return 1;
+}
+
+
+static int stbir__resize_arbitrary(
+    void *alloc_context,
+    const void* input_data, int input_w, int input_h, int input_stride_in_bytes,
+    void* output_data, int output_w, int output_h, int output_stride_in_bytes,
+    float s0, float t0, float s1, float t1, float *transform,
+    int channels, int alpha_channel, stbir_uint32 flags, stbir_datatype type,
+    stbir_filter h_filter, stbir_filter v_filter,
+    stbir_edge edge_horizontal, stbir_edge edge_vertical, stbir_colorspace colorspace)
+{
+    stbir__info info;
+    int result;
+    size_t memory_required;
+    void* extra_memory;
+
+    stbir__setup(&info, input_w, input_h, output_w, output_h, channels);
+    stbir__calculate_transform(&info, s0,t0,s1,t1,transform);
+    stbir__choose_filter(&info, h_filter, v_filter);
+    memory_required = stbir__calculate_memory(&info);
+    extra_memory = STBIR_MALLOC(memory_required, alloc_context);
+
+    if (!extra_memory)
+        return 0;
+
+    result = stbir__resize_allocated(&info, input_data, input_stride_in_bytes,
+                                            output_data, output_stride_in_bytes, 
+                                            alpha_channel, flags, type,
+                                            edge_horizontal, edge_vertical,
+                                            colorspace, extra_memory, memory_required);
+
+    STBIR_FREE(extra_memory, alloc_context);
+
+    return result;
+}
+
+STBIRDEF int stbir_resize_uint8(     const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                           unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                     int num_channels)
+{
+    return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,-1,0, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
+        STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR);
+}
+
+STBIRDEF int stbir_resize_float(     const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                           float *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                     int num_channels)
+{
+    return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,-1,0, STBIR_TYPE_FLOAT, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
+        STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR);
+}
+
+STBIRDEF int stbir_resize_uint8_srgb(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                           unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                     int num_channels, int alpha_channel, int flags)
+{
+    return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
+        STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB);
+}
+
+STBIRDEF int stbir_resize_uint8_srgb_edgemode(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                                    unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                              int num_channels, int alpha_channel, int flags,
+                                              stbir_edge edge_wrap_mode)
+{
+    return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
+        edge_wrap_mode, edge_wrap_mode, STBIR_COLORSPACE_SRGB);
+}
+
+STBIRDEF int stbir_resize_uint8_generic( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                               unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                         int num_channels, int alpha_channel, int flags,
+                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, 
+                                         void *alloc_context)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, filter, filter,
+        edge_wrap_mode, edge_wrap_mode, space);
+}
+
+STBIRDEF int stbir_resize_uint16_generic(const stbir_uint16 *input_pixels  , int input_w , int input_h , int input_stride_in_bytes,
+                                               stbir_uint16 *output_pixels , int output_w, int output_h, int output_stride_in_bytes,
+                                         int num_channels, int alpha_channel, int flags,
+                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, 
+                                         void *alloc_context)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT16, filter, filter,
+        edge_wrap_mode, edge_wrap_mode, space);
+}
+
+
+STBIRDEF int stbir_resize_float_generic( const float *input_pixels         , int input_w , int input_h , int input_stride_in_bytes,
+                                               float *output_pixels        , int output_w, int output_h, int output_stride_in_bytes,
+                                         int num_channels, int alpha_channel, int flags,
+                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, 
+                                         void *alloc_context)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_FLOAT, filter, filter,
+        edge_wrap_mode, edge_wrap_mode, space);
+}
+
+
+STBIRDEF int stbir_resize(         const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                         void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                   stbir_datatype datatype,
+                                   int num_channels, int alpha_channel, int flags,
+                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, 
+                                   stbir_filter filter_horizontal,  stbir_filter filter_vertical,
+                                   stbir_colorspace space, void *alloc_context)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,NULL,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical,
+        edge_mode_horizontal, edge_mode_vertical, space);
+}
+
+
+STBIRDEF int stbir_resize_subpixel(const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                         void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                   stbir_datatype datatype,
+                                   int num_channels, int alpha_channel, int flags,
+                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, 
+                                   stbir_filter filter_horizontal,  stbir_filter filter_vertical,
+                                   stbir_colorspace space, void *alloc_context,
+                                   float x_scale, float y_scale,
+                                   float x_offset, float y_offset)
+{
+    float transform[4];
+    transform[0] = x_scale;
+    transform[1] = y_scale;
+    transform[2] = x_offset;
+    transform[3] = y_offset;
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        0,0,1,1,transform,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical,
+        edge_mode_horizontal, edge_mode_vertical, space);
+}
+
+STBIRDEF int stbir_resize_region(  const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
+                                         void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
+                                   stbir_datatype datatype,
+                                   int num_channels, int alpha_channel, int flags,
+                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, 
+                                   stbir_filter filter_horizontal,  stbir_filter filter_vertical,
+                                   stbir_colorspace space, void *alloc_context,
+                                   float s0, float t0, float s1, float t1)
+{
+    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
+        output_pixels, output_w, output_h, output_stride_in_bytes,
+        s0,t0,s1,t1,NULL,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical,
+        edge_mode_horizontal, edge_mode_vertical, space);
+}
+
+#endif // STB_IMAGE_RESIZE_IMPLEMENTATION
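
The buffer loops above keep a sliding window of scanlines in a ring buffer: `ring_buffer_begin_index` is -1 when the buffer is empty, old scanlines are dropped by advancing the begin index modulo the window size, and new ones are appended at `last_scanline + 1`. The following is a minimal sketch of that bookkeeping only, not the library's code. The names `ring_state`, `ring_init`, `ring_slot`, `ring_push` and `ring_discard_before` are hypothetical, and the slot formula in `ring_slot` is an assumption about how a scanline number maps to a slot.

```c
#include <assert.h>

/* Hypothetical miniature of the scanline ring buffer bookkeeping. */
typedef struct {
    int begin_index;     /* slot holding first_scanline, or -1 when empty */
    int first_scanline;  /* oldest scanline still in the window */
    int last_scanline;   /* newest scanline in the window */
    int capacity;        /* number of slots (the filter pixel width above) */
} ring_state;

static void ring_init(ring_state *r, int capacity)
{
    r->begin_index = -1;  /* -1 signals that the ring buffer is empty */
    r->first_scanline = 0;
    r->last_scanline = 0;
    r->capacity = capacity;
}

/* Slot holding scanline n; n must lie in [first_scanline, last_scanline]. */
static int ring_slot(const ring_state *r, int n)
{
    return (r->begin_index + (n - r->first_scanline)) % r->capacity;
}

/* Append scanline n (which must directly follow the newest one). */
static int ring_push(ring_state *r, int n)
{
    if (r->begin_index < 0) {
        r->begin_index = 0;
        r->first_scanline = r->last_scanline = n;
        return 0;
    }
    assert(n == r->last_scanline + 1);
    assert(r->last_scanline - r->first_scanline + 1 < r->capacity);
    r->last_scanline = n;
    return ring_slot(r, n);
}

/* Drop every scanline older than first_needed, as the while loops above do. */
static void ring_discard_before(ring_state *r, int first_needed)
{
    if (r->begin_index < 0)
        return;
    while (first_needed > r->first_scanline) {
        if (r->first_scanline == r->last_scanline) {
            /* Popped the last scanline: reset to the empty state. */
            r->begin_index = -1;
            r->first_scanline = 0;
            r->last_scanline = 0;
            break;
        }
        r->first_scanline++;
        r->begin_index = (r->begin_index + 1) % r->capacity;
    }
}
```

Because only indices move, no scanline data is copied when the window slides; a slot is simply reused once its scanline falls out of range.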

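`stbir__calculate_memory` and `stbir__resize_allocated` above follow a two-pass pattern: first record the byte size of every sub-buffer and return the total, then carve a single caller-supplied block into consecutive pointers in the same order. A minimal sketch of that pattern with just two buffers might look like the following. The type `scratch_layout` and the functions `scratch_measure` and `scratch_bind` are hypothetical names for illustration, not part of stb_image_resize.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical two-buffer scratch layout; names are illustrative only. */
typedef struct {
    size_t weights_size;  /* bytes reserved for filter weights */
    size_t row_size;      /* bytes reserved for one float scanline */
    float *weights;
    float *row;
} scratch_layout;

/* Pass 1: record each sub-buffer size and return the total in bytes. */
static size_t scratch_measure(scratch_layout *s, int taps, int width, int channels)
{
    assert(s && taps > 0 && width > 0 && channels > 0);
    s->weights_size = (size_t)taps * sizeof(float);
    s->row_size     = (size_t)width * (size_t)channels * sizeof(float);
    return s->weights_size + s->row_size;
}

/* Pass 2: zero the caller's block and carve it up in the same order. */
static int scratch_bind(scratch_layout *s, void *mem, size_t mem_size)
{
    unsigned char *p = (unsigned char *)mem;
    if (!mem || mem_size < s->weights_size + s->row_size)
        return 0;  /* mirror the "return 0 on bad input" checks above */
    memset(mem, 0, mem_size);
    s->weights = (float *)p;
    s->row     = (float *)(p + s->weights_size);
    return 1;
}
```

A caller would typically pass the result of `scratch_measure` to `malloc`, hand the block to `scratch_bind`, and free it once afterwards. Since every sub-buffer here holds floats, consecutive placement keeps each pointer suitably aligned; mixed element types would need explicit padding.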
+ 2733 - 0
stb_tilemap_editor.h

@@ -0,0 +1,2733 @@
+// stb_tilemap_editor.h - v0.10 - Sean Barrett - http://nothings.org/stb
+// placed in the public domain - not copyrighted - first released 2014-09
+//
+// Embeddable tilemap editor for C/C++
+//
+//
+// COMPILING
+//
+//   This header file contains both the header file and the
+//   implementation file in one. To create the implementation,
+//   in one source file, define a few symbols first and then
+//   include this header:
+//
+//      #define STB_TILEMAP_EDITOR_IMPLEMENTATION
+//      // this triggers the implementation
+//
+//      void STBTE_DRAW_RECT(int x0, int y0, int x1, int y1, unsigned int color);
+//      // this must draw a filled rectangle (exclusive on right/bottom)
+//      // color = (r<<16)|(g<<8)|(b)
+//      
+//      void STBTE_DRAW_TILE(int x0, int y0,
+//                       unsigned short id, int highlight);
+//      // this draws the tile image identified by 'id' in one of several
+//      // highlight modes (see STBTE_drawmode_* in the header section);
+//      // x0,y0,highlight:int; id:unsigned short
+//
+//      #include "stb_tilemap_editor.h"
+//
+//   Optionally you can define the following functions before the include;
+//   note these must be macros (which can just call a single function) so
+//   we can detect if you've defined them:
+//
+//      [[ support for these is not implemented yet ]]
+//
+//      #define STBTE_HITTEST_TILE(x0,y0,id,mx,my)   ...your code here...
+//      // this returns true or false depending on whether the mouse
+//      // pointer at mx,my is over (touching) a tile of type 'id'
+//      // displayed at x0,y0. Normally stb_tilemap_editor just does
+//      // this hittest based on the tile geometry, but if you have
+//      // tiles whose images extend out of the tile, you'll need this.
+//
+//      #define STBTE_DRAW_ICON(x0,y0,id,highlight)  ...your code here...
+//      // this must draw a tile image identified by 'id' but now the
+//      // tile image must be drawn to fit in the tile palette, which
+//      // means it cannot exceed your specified palette spacing.
+//
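The packed color format described for STBTE_DRAW_RECT, `(r<<16)|(g<<8)|(b)`, can be unpacked with shifts and masks inside a draw callback. The following is a minimal sketch; `example_rgb`, `example_pack_color`, and `example_unpack_color` are hypothetical names used for illustration and are not part of stb_tilemap_editor.h.

```c
#include <assert.h>

// Hypothetical helper type: one 8-bit value per channel.
typedef struct { unsigned char r, g, b; } example_rgb;

// Pack 8-bit channels as the documentation describes: (r<<16)|(g<<8)|b.
static unsigned int example_pack_color(unsigned char r, unsigned char g, unsigned char b)
{
   return ((unsigned int) r << 16) | ((unsigned int) g << 8) | (unsigned int) b;
}

// Split a packed color back into channels, e.g. before handing it to a renderer.
static example_rgb example_unpack_color(unsigned int color)
{
   example_rgb c;
   c.r = (unsigned char) ((color >> 16) & 0xff);
   c.g = (unsigned char) ((color >>  8) & 0xff);
   c.b = (unsigned char) ( color        & 0xff);
   return c;
}
```

A STBTE_DRAW_RECT implementation could call something like `example_unpack_color(color)` and pass the three channels to whatever fill routine the host application already has.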
+// ADDITIONAL CONFIGURATION
+//
+//   The following symbols set static limits which determine how much
+//   memory will be allocated for the editor. You can override them
+//   by making similar definitions, but memory usage will increase.
+//
+//      #define STBTE_MAX_TILEMAP_X      200   // max 4096
+//      #define STBTE_MAX_TILEMAP_Y      200   // max 4096
+//      #define STBTE_MAX_LAYERS         8     // max 32
+//      #define STBTE_MAX_CATEGORIES     100
+//      #define STBTE_UNDO_BUFFER_BYTES  (1 << 20) // 1MB
+//      #define STBTE_MAX_COPY           90000  // e.g. 300x300
+//
+// API
+//
+//   Further documentation appears in the header-file section below.
+//
+// EDITING MULTIPLE LEVELS
+//
+//   You can only have one active editor instance. To switch between multiple
+//   levels, you can either store the levels in your own format and copy them
+//   in and out of the editor format, or you can create multiple stbte_tilemap
+//   objects and switch between them. The latter has the advantage that each
+//   stbte_tilemap keeps its own undo state. (The clipboard is global, so
+//   either approach allows cut&pasting between levels.)
+//
+// TODO
+//
+//   Eraser!!!
+//   Separate scroll state for each category
+//   Implement paint bucket
+//   Support STBTE_HITTEST_TILE above
+//   Support STBTE_DRAW_ICON above
+//  ?Cancel drags by clicking other button? - may be fixed
+//   Object properties (per-tile properties) 
+//   Finish support for toolbar at side
+//   Layer name buttons grow to fill box
+//
+// LICENSE
+//
+//   This software has been placed in the public domain by its author.
+//   Where that dedication is not recognized, you are granted a perpetual,
+//   irrevocable license to copy and modify this file as you see fit.
+
+
+
+
+
+///////////////////////////////////////////////////////////////////////
+//
+//   HEADER SECTION
+
+#ifndef STB_TILEMAP_INCLUDE_STB_TILEMAP_EDITOR_H
+#define STB_TILEMAP_INCLUDE_STB_TILEMAP_EDITOR_H
+
+typedef struct stbte_tilemap stbte_tilemap;
+
+// these are the drawmodes used in STBTE_DRAW_TILE
+enum
+{
+   STBTE_drawmode_deemphasize = -1,
+   STBTE_drawmode_normal      =  0,
+   STBTE_drawmode_emphasize   =  1,
+};
+
+////////
+//
+// creation
+//
+
+extern stbte_tilemap *stbte_create_map(int map_x, int map_y, int map_layers, int spacing_x, int spacing_y, int max_tiles);
+// create an editable tilemap
+//   map_x      : dimensions of map horizontally (user can change this in editor), <= STBTE_MAX_TILEMAP_X
+//   map_y      : dimensions of map vertically (user can change this in editor), <= STBTE_MAX_TILEMAP_Y
+//   map_layers : number of layers to use (fixed), <= STBTE_MAX_LAYERS
+//   spacing_x  : initial horizontal distance between left edges of map tiles in stb_tilemap_editor pixels
+//   spacing_y  : initial vertical distance between top edges of map tiles in stb_tilemap_editor pixels
+//   max_tiles  : maximum number of tiles that can be defined
+//
+// If insufficient memory, returns NULL
+
+extern void stbte_define_tile(stbte_tilemap *tm, unsigned short id, unsigned int layermask, const char * category);
+// call this repeatedly for each tile to install the tile definitions into the editable tilemap
+//   tm        : tilemap created by stbte_create_map
+//   id        : unique identifier for each tile, 0 <= id < 32768
+//   layermask : bitmask of which layers tile is allowed on: 1 = layer 0, 255 = layers 0..7
+//               (note that onscreen, the editor numbers the layers from 1 not 0)
+//               layer 0 is the furthest back, layer 1 is just in front of layer 0, etc
+//   category  : which category this tile is grouped in
+
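The `layermask` argument is a plain bitmask in which bit N corresponds to layer N, so 1 means layer 0 only and 255 means layers 0..7. A minimal sketch of building such masks follows; `example_layer_bit` and `example_layer_range` are hypothetical helpers, and the sketch assumes `unsigned int` is at least 32 bits wide.

```c
#include <assert.h>

// Mask with only the given layer allowed; layer must be in 0..31.
static unsigned int example_layer_bit(int layer)
{
   return 1u << layer;
}

// Mask allowing every layer in the inclusive range first..last,
// where 0 <= first <= last <= 31.
static unsigned int example_layer_range(int first, int last)
{
   unsigned int mask = 0;
   int i;
   for (i = first; i <= last; ++i)
      mask |= 1u << i;
   return mask;
}
```

For instance, a tile that should only appear on the two layers just in front of the background could be defined with the mask `example_layer_range(1, 2)`.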
+extern void stbte_set_display(int x0, int y0, int x1, int y1);
+// call this once to set the size; if you resize, call it again
+
+
+/////////
+//
+// every frame
+//
+
+extern void stbte_draw(stbte_tilemap *tm);
+
+extern void stbte_tick(stbte_tilemap *tm, float time_in_seconds_since_last_frame);
+
+////////////
+//
+//  user input
+//
+
+// if you're using SDL, call the next function for SDL_MOUSEMOTION, SDL_MOUSEBUTTONDOWN, SDL_MOUSEBUTTONUP, and SDL_MOUSEWHEEL events;
+// the transformation lets you scale from SDL mouse coords to stb_tilemap_editor coords
+extern void stbte_mouse_sdl(stbte_tilemap *tm, const void *sdl_event, float xscale, float yscale, int xoffset, int yoffset);
+
+// otherwise, hook these up explicitly:
+extern void stbte_mouse_move(stbte_tilemap *tm, int x, int y, int shifted, int scrollkey);
+extern void stbte_mouse_button(stbte_tilemap *tm, int x, int y, int right, int down, int shifted, int scrollkey);
+extern void stbte_mouse_wheel(stbte_tilemap *tm, int x, int y, int vscroll);
+
+
+////////////////
+//
+//  save/load 
+//
+//  There is no editor file format. You have to save and load the data yourself
+//  through the following functions. You can also use these functions to get the
+//  data to generate game-formatted levels directly. (But make sure you save
+//  first! You may also want to autosave to a temp file periodically, etc etc.)
+
+#define STBTE_EMPTY    -1
+
+extern void stbte_get_dimensions(stbte_tilemap *tm, int *max_x, int *max_y);
+// get the dimensions of the level, since the user can change them
+
+extern short* stbte_get_tile(stbte_tilemap *tm, int x, int y);
+// returns an array of shorts that is 'map_layers' in length. each short is
+// either one of the tile_id values from define_tile, or STBTE_EMPTY.
+
+extern void stbte_set_dimensions(stbte_tilemap *tm, int max_x, int max_y);
+// set the dimensions of the level, overrides previous stbte_create_map()
+// values or anything the user has changed
+
+extern void stbte_clear_map(stbte_tilemap *tm);
+// clears the map, including the region outside the defined region, so if the
+// user expands the map, they won't see garbage there
+
+extern void stbte_set_tile(stbte_tilemap *tm, int x, int y, int layer, signed short tile);
+// tile is your tile_id from define_tile, or STBTE_EMPTY
+
+////////
+//
+// optional
+//
+
+extern void stbte_set_background_tile(stbte_tilemap *tm, short id);
+// selects the tile to fill the bottom layer with and used to clear bottom tiles to;
+// should be same ID as 
+
+extern void stbte_set_sidewidths(int left, int right);
+// call this once to set the left & right side widths. don't call
+// it again since the user can change it
+
+extern void stbte_set_spacing(stbte_tilemap *tm, int spacing_x, int spacing_y, int palette_spacing_x, int palette_spacing_y);
+// call this to set the spacing of map tiles and the spacing of palette tiles.
+// if you rescale your display, call it again (e.g. you can implement map zooming yourself)
+
+extern void stbte_set_layername(stbte_tilemap *tm, int layer, const char *layername);
+// sets a string name for your layer that shows in the layer selector. note that this
+// makes the layer selector wider. 'layer' is from 0..(map_layers-1)
+
+#endif
+
+#ifdef STB_TILEMAP_EDITOR_IMPLEMENTATION
+
+#include <stdlib.h> // malloc
+
+#ifndef STBTE_ASSERT
+#define STBTE_ASSERT assert
+#include <assert.h>
+#endif
+
+#ifdef _MSC_VER
+#define STBTE__NOTUSED(v)  (void)(v)
+#else
+#define STBTE__NOTUSED(v)  (void)sizeof(v)
+#endif
+
+#ifndef STBTE_MAX_TILEMAP_X
+#define STBTE_MAX_TILEMAP_X      200
+#endif
+
+#ifndef STBTE_MAX_TILEMAP_Y
+#define STBTE_MAX_TILEMAP_Y      200
+#endif
+
+#ifndef STBTE_MAX_LAYERS
+#define STBTE_MAX_LAYERS         8
+#endif
+
+#ifndef STBTE_MAX_CATEGORIES
+#define STBTE_MAX_CATEGORIES     100
+#endif
+
+#ifndef STBTE_MAX_COPY
+#define STBTE_MAX_COPY           65536
+#endif
+
+#ifndef STBTE_UNDO_BUFFER_BYTES
+#define STBTE_UNDO_BUFFER_BYTES  (1 << 20) // 1MB
+#endif
+
+#define STBTE__UNDO_BUFFER_COUNT  (STBTE_UNDO_BUFFER_BYTES>>1)
+
+#if STBTE_MAX_TILEMAP_X > 4096 || STBTE_MAX_TILEMAP_Y > 4096
+#error "Maximum editable map size is 4096 x 4096"
+#endif
+#if STBTE_MAX_LAYERS > 32
+#error "Maximum layers allowed is 32"
+#endif
+#if STBTE__UNDO_BUFFER_COUNT & (STBTE__UNDO_BUFFER_COUNT-1)
+#error "Undo buffer size must be a power of 2"
+#endif
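The undo buffer size check relies on the standard bit trick that a nonzero value `n` is a power of two exactly when `n & (n-1)` is zero. A minimal sketch of the same test as a runtime function follows; the `EXAMPLE_*` macros and `example_is_power_of_two` are illustrative names, and the count assumes a 2-byte `short`.

```c
#include <assert.h>

#define EXAMPLE_UNDO_BUFFER_BYTES  (1 << 20)                          // 1MB, as in the default
#define EXAMPLE_UNDO_BUFFER_COUNT  (EXAMPLE_UNDO_BUFFER_BYTES >> 1)   // entries, assuming 2-byte shorts

// A power of two has exactly one bit set, so clearing the lowest
// set bit with n & (n-1) must leave zero.
static int example_is_power_of_two(unsigned int n)
{
   return n != 0 && (n & (n - 1)) == 0;
}
```

Note that inside `#if`, an identifier that is not a defined macro evaluates to 0, so a misspelled macro name in such a check silently disables it rather than producing an error.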
+
+#define STBTE_COLOR_TOOLBAR_BACKGROUND    0x606060
+#define STBTE_COLOR_TILEMAP_BACKGROUND    0x000000
+#define STBTE_COLOR_TILEMAP_BORDER        0x203060
+#define STBTE_COLOR_TILEMAP_HIGHLIGHT     0xffffff
+#define STBTE_COLOR_PANEL_BACKGROUND      0x403010
+#define STBTE_COLOR_PANEL_OUTLINE         0xc08040
+#define STBTE_COLOR_PANEL_TEXT            0xffffff
+#define STBTE_COLOR_BUTTON_BACKGROUND     0x703870
+#define STBTE_COLOR_BUTTON_OUTLINE        0xc060c0
+#define STBTE_COLOR_BUTTON_TEXT           0xffffff
+#define STBTE_COLOR_BUTTON_DOWN           0xe080e0
+#define STBTE_COLOR_BUTTON_OVER           0xffc0ff
+#define STBTE_COLOR_BUTTON_TEXT_SELECTED  0x000000
+#define STBTE_COLOR_MICROBUTTON           0x40c040
+#define STBTE_COLOR_MICROBUTTON_DOWN      0xc0ffc0
+#define STBTE_COLOR_MICROBUTTON_FRAME     0x00ff00
+#define STBTE_COLOR_MICROBUTTON_OVER      0x80ff80
+#define STBTE_COLOR_TILEPALETTE_OUTLINE   0xffffff
+#define STBTE_COLOR_TILEPALETTE_BACKGROUND 0x000000
+#define STBTE_COLOR_MINIBUTTON_ICON       0xffffff
+#define STBTE_COLOR_SELECTION_OUTLINE1    0xdfdfdf
+#define STBTE_COLOR_SELECTION_OUTLINE2    0x303030
+#define STBTE_COLOR_GRID                  0x404040
+
+#define STBTE_COLOR_LAYERCONTROL                  0x6f6f6f
+#define STBTE_COLOR_LAYERCONTROL_OVER             0xcfcfcf
+#define STBTE_COLOR_LAYERCONTROL_DOWN             0xffffff
+#define STBTE_COLOR_LAYERCONTROL_TOGGLED          0xbfbfbf
+#define STBTE_COLOR_LAYERCONTROL_DISABLED         0x404040
+#define STBTE_COLOR_LAYERCONTROL_OUTLINE          0xffffff
+#define STBTE_COLOR_LAYERCONTROL_OUTLINE_DISABLED 0x202020
+#define STBTE_COLOR_LAYERCONTROL_TEXT             0xffffff
+#define STBTE_COLOR_LAYERCONTROL_TEXT_DOWN        0x5f5f5f
+#define STBTE_COLOR_LAYERCONTROL_TEXT_TOGGLED     0x000000
+#define STBTE_COLOR_LAYERCONTROL_TEXT_DISABLED    0x606060
+
+#define STBTE_COLOR_LAYERMASK_HIDE       0xffff55
+#define STBTE_COLOR_LAYERMASK_LOCK       0x5f55ff
+#define STBTE_COLOR_LAYERMASK_SOLO       0xff5f55
+
+#define STBTE__FONT_HEIGHT    9
+static short stbte__font_offset[95+16];
+static short stbte__fontdata[762] =
+{
+   4,4,4,9,9,9,9,8,9,8,4,9,7,7,7,7,4,2,6,8,6,6,7,3,4,4,8,6,3,6,2,6,6,6,6,6,6,
+   6,6,6,6,6,2,3,5,4,5,6,6,6,6,6,6,6,6,6,6,6,6,7,6,7,7,7,6,7,6,6,6,6,7,7,6,6,
+   6,4,6,4,7,7,3,6,6,5,6,6,5,6,6,4,5,6,4,7,6,6,6,6,6,6,6,6,6,7,6,6,6,5,2,5,8,
+   0,0,0,0,0,0,0,0,0,0,0,0,146,511,146,146,511,146,146,511,146,511,257,341,297,
+   341,297,341,257,511,16,56,124,16,16,16,124,56,16,96,144,270,261,262,136,80,
+   48,224,192,160,80,40,22,14,15,3,448,496,496,240,232,20,10,5,2,112,232,452,
+   450,225,113,58,28,63,30,60,200,455,257,257,0,0,0,257,257,455,120,204,132,
+   132,159,14,4,4,14,159,132,132,204,120,8,24,56,120,56,24,8,32,48,56,60,56,
+   48,32,0,0,0,0,111,111,7,7,0,0,7,7,34,127,127,34,34,127,127,34,36,46,107,107,
+   58,18,99,51,24,12,102,99,48,122,79,93,55,114,80,4,7,3,62,127,99,65,65,99,
+   127,62,8,42,62,28,28,62,42,8,8,8,62,62,8,8,128,224,96,8,8,8,8,8,8,96,96,96,
+   48,24,12,6,3,62,127,89,77,127,62,64,66,127,127,64,64,98,115,89,77,71,66,33,
+   97,73,93,119,35,24,28,22,127,127,16,39,103,69,69,125,57,62,127,73,73,121,
+   48,1,1,113,121,15,7,54,127,73,73,127,54,6,79,73,105,63,30,54,54,128,246,118,
+   8,28,54,99,65,20,20,20,20,65,99,54,28,8,2,3,105,109,7,2,30,63,33,45,47,46,
+   124,126,19,19,126,124,127,127,73,73,127,54,62,127,65,65,99,34,127,127,65,
+   99,62,28,127,127,73,73,73,65,127,127,9,9,9,1,62,127,65,73,121,121,127,127,
+   8,8,127,127,65,65,127,127,65,65,32,96,64,64,127,63,127,127,8,28,54,99,65,
+   127,127,64,64,64,64,127,127,6,12,6,127,127,127,127,6,12,24,127,127,62,127,
+   65,65,65,127,62,127,127,9,9,15,6,62,127,65,81,49,127,94,127,127,9,25,127,
+   102,70,79,73,73,121,49,1,1,127,127,1,1,63,127,64,64,127,63,15,31,48,96,48,
+   31,15,127,127,48,24,48,127,127,99,119,28,28,119,99,7,15,120,120,15,7,97,113,
+   89,77,71,67,127,127,65,65,3,6,12,24,48,96,65,65,127,127,8,12,6,3,6,12,8,64,
+   64,64,64,64,64,64,3,7,4,32,116,84,84,124,120,127,127,68,68,124,56,56,124,
+   68,68,68,56,124,68,68,127,127,56,124,84,84,92,24,8,124,126,10,10,56,380,324,
+   324,508,252,127,127,4,4,124,120,72,122,122,64,256,256,256,506,250,126,126,
+   16,56,104,64,66,126,126,64,124,124,24,56,28,124,120,124,124,4,4,124,120,56,
+   124,68,68,124,56,508,508,68,68,124,56,56,124,68,68,508,508,124,124,4,4,12,
+   8,72,92,84,84,116,36,4,4,62,126,68,68,60,124,64,64,124,124,28,60,96,96,60,
+   28,28,124,112,56,112,124,28,68,108,56,56,108,68,284,316,352,320,508,252,68,
+   100,116,92,76,68,8,62,119,65,65,127,127,65,65,119,62,8,16,24,12,12,24,24,
+   12,4,
+};
+
+typedef struct
+{
+   short id;
+   unsigned short category_id;
+   char *category;
+   unsigned int layermask;
+} stbte__tileinfo;
+
+#define MAX_LAYERMASK    (1 << (8*sizeof(unsigned int)))
+
+typedef short stbte__tiledata;
+
+#define STBTE__NO_TILE   -1
+
+enum
+{
+   STBTE__panel_toolbar,
+   STBTE__panel_info,
+   STBTE__panel_layers,
+   STBTE__panel_categories,
+   STBTE__panel_tiles,
+
+   STBTE__num_panel,
+};
+
+enum
+{
+   STBTE__side_left,
+   STBTE__side_right,
+   STBTE__side_top,
+   STBTE__side_bottom,
+};
+
+enum
+{
+   STBTE__tool_select,
+   STBTE__tool_brush,
+   STBTE__tool_rect,
+   STBTE__tool_eyedrop,
+   STBTE__tool_fill,
+
+   STBTE__tool_grid,
+   STBTE__tool_undo,
+   STBTE__tool_redo,
+   // copy/cut/paste aren't included here because they're displayed differently
+
+   STBTE__num_tool,
+};
+
+// icons are stored in the 16-31 range of ASCII in the font (the font table starts at character 16)
+static int toolchar[] = { 26,24,20,23,22, 19,29,28, };
+
+enum
+{
+   STBTE__paint,
+
+   // from here down does hittesting
+   STBTE__tick,
+   STBTE__mousemove,
+   STBTE__mousewheel,
+   STBTE__leftdown,
+   STBTE__leftup,
+   STBTE__rightdown,
+   STBTE__rightup,
+};
+
+typedef struct
+{
+   int expanded, mode;
+   int delta_height;     // number of rows they've requested for this
+   int side;
+   int width,height;
+   int x0,y0;
+} stbte__panel;
+
+typedef struct
+{
+   int x0,y0,x1,y1,color;
+} stbte__colorrect;
+
+typedef struct
+{
+   int tool, active_event;
+   int active_id, hot_id, next_hot_id;
+   int event;
+   int mx,my;
+   int ms_time;
+   int shift, scrollkey;
+   int initted;
+   int side_extended[2];
+   stbte__colorrect delayrect[1024];
+   int delaycount;
+   int show_grid;
+   int brush_state; // used to decide which kind of erasing
+   int eyedrop_x, eyedrop_y, eyedrop_last_layer;
+   int pasting, paste_x, paste_y;
+   int scrolling, start_x, start_y;
+   int dragging;
+   int drag_x, drag_y, drag_w, drag_h;
+   int drag_offx, drag_offy, drag_dest_x, drag_dest_y;
+   int undoing;
+   int has_selection, select_x0, select_y0, select_x1, select_y1;
+   int sx,sy;
+   int x0,y0,x1,y1, left_width, right_width; // configurable widths
+   float alert_timer;
+   const char *alert_msg;
+   float dt;
+   stbte__panel panel[STBTE__num_panel];
+   short copybuffer[STBTE_MAX_COPY][STBTE_MAX_LAYERS];
+   int copy_width,copy_height,has_copy;
+} stbte__ui_t;
+
+// there's only one UI system at a time, so we can globalize this
+static stbte__ui_t stbte__ui = { STBTE__tool_brush, 0 };
+
+#define STBTE__INACTIVE()     (stbte__ui.active_id == 0)
+#define STBTE__IS_ACTIVE(id)  (stbte__ui.active_id == (id))
+#define STBTE__IS_HOT(id)     (stbte__ui.hot_id    == (id))
+
+#define STBTE__BUTTON_HEIGHT            (STBTE__FONT_HEIGHT + 2 * STBTE__BUTTON_INTERNAL_SPACING)
+#define STBTE__BUTTON_INTERNAL_SPACING  (2 + (STBTE__FONT_HEIGHT>>4))
+
+typedef struct
+{
+   const char *name;
+   int locked;
+   int hidden;
+} stbte__layer;
+
+enum
+{
+   STBTE__unlocked,
+   STBTE__protected,
+   STBTE__locked,
+};
+
+struct stbte_tilemap
+{
+    stbte__tiledata data[STBTE_MAX_TILEMAP_Y][STBTE_MAX_TILEMAP_X][STBTE_MAX_LAYERS];
+    int max_x, max_y, num_layers;
+    int spacing_x, spacing_y;
+    int palette_spacing_x, palette_spacing_y;
+    int scroll_x,scroll_y;
+    int cur_category, cur_tile, cur_layer;
+    char *categories[STBTE_MAX_CATEGORIES];
+    int num_categories, category_scroll;
+    stbte__tileinfo *tiles;
+    int num_tiles, max_tiles, digits;
+    int cur_palette_count;
+    int palette_scroll;
+    int tileinfo_dirty;
+    stbte__layer layerinfo[STBTE_MAX_LAYERS];
+    int has_layer_names;
+    int layer_scroll;
+    int solo_layer;
+    int undo_pos, undo_len, redo_len;
+    short background_tile;
+    unsigned char id_in_use[32768>>3];
+    short *undo_buffer;
+};
+
+static char *default_category = "[unassigned]";
+
+static void stbte__init_gui(void)
+{
+   int i,n;
+   stbte__ui.initted = 1;
+   // init UI state
+   for (i=0; i < STBTE__num_panel; ++i) {
+      stbte__ui.panel[i].expanded     = 1; // visible if not autohidden
+      stbte__ui.panel[i].delta_height = 0;
+      stbte__ui.panel[i].side         = STBTE__side_left;
+   }
+   stbte__ui.panel[STBTE__panel_toolbar].side = STBTE__side_top;
+
+   if (stbte__ui.left_width == 0)
+      stbte__ui.left_width = 80;
+   if (stbte__ui.right_width == 0)
+      stbte__ui.right_width = 80;
+
+   // init font
+   n=95+16;
+   for (i=0; i < 95+16; ++i) {
+      stbte__font_offset[i] = n;
+      n += stbte__fontdata[i];
+   }
+}
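The font initialization above reads as a prefix sum: the first entries of the table are per-glyph widths, the column bitmasks follow, and each glyph's offset is the table header length plus the widths of all earlier glyphs. A minimal sketch with a toy three-glyph table follows; all `example_*` names and the table contents are made up for illustration.

```c
#include <assert.h>

#define EXAMPLE_NUM_GLYPHS 3

// Toy table: 3 widths, followed by 2+1+3 column bitmasks.
static const short example_fontdata[EXAMPLE_NUM_GLYPHS + 6] = {
   2, 1, 3,          // widths of glyphs 0, 1, 2
   5, 7,             // columns of glyph 0
   1,                // column of glyph 1
   2, 4, 8,          // columns of glyph 2
};
static short example_offset[EXAMPLE_NUM_GLYPHS];

// Running sum of widths; column data starts right after the width table.
static void example_init_offsets(void)
{
   int i, n = EXAMPLE_NUM_GLYPHS;
   for (i = 0; i < EXAMPLE_NUM_GLYPHS; ++i) {
      example_offset[i] = (short) n;
      n += example_fontdata[i];
   }
}

// Pointer to the first column bitmask of a glyph.
static const short *example_glyph_columns(int glyph)
{
   return example_fontdata + example_offset[glyph];
}
```

Storing widths and bitmaps in one array keeps the font to a single initializer, at the cost of one pass over the widths at startup.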
+
+stbte_tilemap *stbte_create_map(int map_x, int map_y, int map_layers, int spacing_x, int spacing_y, int max_tiles)
+{
+   int i;
+   stbte_tilemap *tm;
+   STBTE_ASSERT(map_layers >= 0 && map_layers <= STBTE_MAX_LAYERS);
+   STBTE_ASSERT(map_x >= 0 && map_x <= STBTE_MAX_TILEMAP_X);
+   STBTE_ASSERT(map_y >= 0 && map_y <= STBTE_MAX_TILEMAP_Y);
+   if (map_x < 0 || map_y < 0 || map_layers < 0 ||
+       map_x > STBTE_MAX_TILEMAP_X || map_y > STBTE_MAX_TILEMAP_Y || map_layers > STBTE_MAX_LAYERS)
+      return NULL;
+   
+   if (!stbte__ui.initted)
+      stbte__init_gui();
+
+   tm = (stbte_tilemap *) malloc(sizeof(*tm) + sizeof(*tm->tiles) * max_tiles + STBTE_UNDO_BUFFER_BYTES);
+   if (tm == NULL)
+      return NULL;
+
+   tm->tiles = (stbte__tileinfo *) (tm+1);
+   tm->undo_buffer = (short *) (tm->tiles + max_tiles);
+   tm->num_layers = map_layers;
+   tm->max_x = map_x;
+   tm->max_y = map_y;
+   tm->spacing_x = spacing_x;
+   tm->spacing_y = spacing_y;
+   tm->scroll_x = 0;
+   tm->scroll_y = 0;
+   tm->palette_scroll = 0;
+   tm->palette_spacing_x = spacing_x+1;
+   tm->palette_spacing_y = spacing_y+1;
+   tm->cur_category = -1;
+   tm->cur_tile = 0;
+   tm->solo_layer = -1;
+   tm->undo_len = 0;
+   tm->redo_len = 0;
+   tm->undo_pos = 0;
+   tm->category_scroll = 0;
+   tm->layer_scroll = 0;
+   tm->has_layer_names = 0;
+
+   for (i=0; i < tm->num_layers; ++i) {
+      tm->layerinfo[i].hidden = 0;
+      tm->layerinfo[i].locked = STBTE__unlocked;
+      tm->layerinfo[i].name   = 0;
+   }
+
+   tm->background_tile = STBTE__NO_TILE;
+   stbte_clear_map(tm);
+
+   tm->max_tiles = max_tiles;
+   tm->num_tiles = 0;
+   for (i=0; i < 32768/8; ++i)
+      tm->id_in_use[i] = 0;
+   tm->tileinfo_dirty = 1;
+   return tm;
+}
+
+void stbte_set_background_tile(stbte_tilemap *tm, short id)
+{
+   int i;
+   STBTE_ASSERT(id >= -1 && id < 32768);
+   if (id >= 32768 || id < -1)
+      return;
+   for (i=0; i < STBTE_MAX_TILEMAP_X * STBTE_MAX_TILEMAP_Y; ++i)
+      if (tm->data[0][i][0] == -1)
+         tm->data[0][i][0] = id;
+   tm->background_tile = id;
+}
+
+void stbte_set_spacing(stbte_tilemap *tm, int spacing_x, int spacing_y, int palette_spacing_x, int palette_spacing_y)
+{
+   tm->spacing_x = spacing_x;
+   tm->spacing_y = spacing_y;
+   tm->palette_spacing_x = palette_spacing_x;
+   tm->palette_spacing_y = palette_spacing_y;
+}
+
+void stbte_set_sidewidths(int left, int right)
+{
+   stbte__ui.left_width  = left;
+   stbte__ui.right_width = right;
+}
+
+void stbte_set_display(int x0, int y0, int x1, int y1)
+{
+   stbte__ui.x0 = x0;
+   stbte__ui.y0 = y0;
+   stbte__ui.x1 = x1;
+   stbte__ui.y1 = y1;
+}
+
+void stbte_define_tile(stbte_tilemap *tm, unsigned short id, unsigned int layermask, const char * category_c)
+{
+   char *category = (char *) category_c;
+   STBTE_ASSERT(id < 32768);
+   STBTE_ASSERT(tm->num_tiles < tm->max_tiles);
+   STBTE_ASSERT((tm->id_in_use[id>>3]&(1<<(id&7))) == 0);
+   if (id >= 32768 || tm->num_tiles >= tm->max_tiles || (tm->id_in_use[id>>3]&(1<<(id&7))))
+      return;
+
+   if (category == NULL)
+      category = (char*) default_category;
+   tm->id_in_use[id>>3] |= 1 << (id&7);
+   tm->tiles[tm->num_tiles].category    = category;
+   tm->tiles[tm->num_tiles].id        = id;
+   tm->tiles[tm->num_tiles].layermask = layermask;
+   ++tm->num_tiles;
+   tm->tileinfo_dirty = 1;
+}
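Duplicate tile ids are detected with a bitset: one bit per possible id, indexed by `id>>3` for the byte and `id&7` for the bit within it. A minimal standalone sketch of that pattern follows; `example_in_use`, `example_id_is_used`, and `example_mark_id` are hypothetical names.

```c
#include <assert.h>

#define EXAMPLE_MAX_IDS 32768
static unsigned char example_in_use[EXAMPLE_MAX_IDS >> 3];   // one bit per id, zero-initialized

// Nonzero if the id has already been marked; id must be < EXAMPLE_MAX_IDS.
static int example_id_is_used(unsigned short id)
{
   return (example_in_use[id >> 3] >> (id & 7)) & 1;
}

// Returns 1 if the id was newly marked, 0 if it is out of range or a duplicate.
static int example_mark_id(unsigned short id)
{
   if (id >= EXAMPLE_MAX_IDS || example_id_is_used(id))
      return 0;
   example_in_use[id >> 3] |= (unsigned char) (1 << (id & 7));
   return 1;
}
```

The whole table costs 4096 bytes for 32768 ids, which is why a bitset is a reasonable choice over an array of flags here.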
+
+void stbte_set_layername(stbte_tilemap *tm, int layer, const char *layername)
+{
+   STBTE_ASSERT(layer >= 0 && layer < tm->num_layers);
+   if (layer >= 0 && layer < tm->num_layers) {
+      tm->layerinfo[layer].name = layername;
+      tm->has_layer_names = 1;
+   }
+}
+
+void stbte_get_dimensions(stbte_tilemap *tm, int *max_x, int *max_y)
+{
+   *max_x = tm->max_x;
+   *max_y = tm->max_y;
+}
+
+// returns an array of map_layers shorts. each short is either
+// one of the tile_id values from define_tile, or STBTE_EMPTY
+short* stbte_get_tile(stbte_tilemap *tm, int x, int y)
+{
+   STBTE_ASSERT(x >= 0 && x < tm->max_x && y >= 0 && y < tm->max_y);
+   if (x < 0 || x >= STBTE_MAX_TILEMAP_X || y < 0 || y >= STBTE_MAX_TILEMAP_Y)
+      return NULL;
+   return tm->data[y][x];
+}
+
+void stbte_set_dimensions(stbte_tilemap *tm, int map_x, int map_y)
+{
+   STBTE_ASSERT(map_x >= 0 && map_x <= STBTE_MAX_TILEMAP_X);
+   STBTE_ASSERT(map_y >= 0 && map_y <= STBTE_MAX_TILEMAP_Y);
+   if (map_x < 0 || map_y < 0 || map_x > STBTE_MAX_TILEMAP_X || map_y > STBTE_MAX_TILEMAP_Y)
+      return;
+   tm->max_x = map_x;
+   tm->max_y = map_y;
+}
+
+void stbte_clear_map(stbte_tilemap *tm)
+{
+   int i,j;
+   for (i=0; i < STBTE_MAX_TILEMAP_X * STBTE_MAX_TILEMAP_Y; ++i) {
+      tm->data[0][i][0] = tm->background_tile;
+      for (j=1; j < tm->num_layers; ++j)
+         tm->data[0][i][j] = STBTE__NO_TILE;
+   }
+}
+
+void stbte_set_tile(stbte_tilemap *tm, int x, int y, int layer, signed short tile)
+{
+   STBTE_ASSERT(x >= 0 && x < tm->max_x && y >= 0 && y < tm->max_y);
+   STBTE_ASSERT(layer >= 0 && layer < tm->num_layers);
+   STBTE_ASSERT(tile >= -1 && tile < 32768);
+   if (x < 0 || x >= STBTE_MAX_TILEMAP_X || y < 0 || y >= STBTE_MAX_TILEMAP_Y)
+      return;
+   if (layer < 0 || layer >= tm->num_layers || tile < -1)
+      return;
+   tm->data[y][x][layer] = tile;
+}
+
+static void stbte__choose_category(stbte_tilemap *tm, int category)
+{
+   int i,n=0;
+   tm->cur_category = category;
+   for (i=0; i < tm->num_tiles; ++i)
+      if (tm->tiles[i].category_id == category || category == -1)
+         ++n;
+   tm->cur_palette_count = n;
+   tm->palette_scroll = 0;
+}
+
+static int stbte__strequal(char *p, char *q)
+{
+   while (*p)
+      if (*p++ != *q++) return 0;
+   return *q == 0;
+}
+
+static void stbte__compute_tileinfo(stbte_tilemap *tm)
+{
+   int i,j,n=0;
+
+   tm->num_categories=0;
+
+   for (i=0; i < tm->num_tiles; ++i) {
+      stbte__tileinfo *t = &tm->tiles[i];
+      // find category
+      for (j=0; j < tm->num_categories; ++j)
+         if (stbte__strequal(t->category, tm->categories[j]))
+            goto found;
+      tm->categories[j] = t->category;
+      ++tm->num_categories;
+     found:
+      t->category_id = (unsigned short) j;
+   }
+
+   // currently number of categories can never decrease because you
+   // can't remove tile definitions, but let's get it right anyway
+   if (tm->cur_category >= tm->num_categories) {
+      tm->cur_category = -1;
+   }
+
+   stbte__choose_category(tm, tm->cur_category);
+
+   tm->tileinfo_dirty = 0;
+}
+
+static void stbte__prepare_tileinfo(stbte_tilemap *tm)
+{
+   if (tm->tileinfo_dirty)
+      stbte__compute_tileinfo(tm);
+}
+
+
+/////////////////////// undo system ////////////////////////
+
+// the undo system works by storing "commands" into a buffer, and
+// then playing back those commands. undo and redo have to store
+// the commands in different order. 
+//
+// the commands are:
+//
+// 1)  end_of_undo_record
+//       -2:short   (STBTE__undo_record)
+//
+// 2)  end_of_redo_record
+//       -3:short   (STBTE__redo_record)
+//
+// 3)  tile update
+//       tile_id:short (-1..32767)
+//       x_coord:short
+//       y_coord:short
+//       layer:short (0..31)
+//
+// Since we use a circular buffer, we might overwrite the undo storage.
+// To detect this, before playing back commands we scan back through the
+// buffer; if we find an end_of_undo_record before hitting the relevant
+// boundary, the record is wholly contained.
+//
+// When we read back through, we see them in reverse order, so
+// we'll see the layer number first
+
+// wrap a position into the circular undo buffer; relies on the buffer count being a power of 2
+#define stbte__wrap(pos)            ((pos) & (STBTE__UNDO_BUFFER_COUNT-1))
+
+#define STBTE__undo_record  -2
+#define STBTE__redo_record  -3
+#define STBTE__undo_junk    -4  // this is written underneath the undo pointer, never used
+
+static void stbte__write_undo(stbte_tilemap *tm, short value)
+{
+   int pos = tm->undo_pos;
+   tm->undo_buffer[pos] = value;
+   tm->undo_pos = stbte__wrap(pos+1);
+   tm->undo_len += (tm->undo_len < STBTE__UNDO_BUFFER_COUNT-2);
+   tm->redo_len -= (tm->redo_len > 0);
+}
+
+static void stbte__write_redo(stbte_tilemap *tm, short value)
+{
+   int pos = tm->undo_pos;
+   tm->undo_buffer[pos] = value;
+   tm->undo_pos = stbte__wrap(pos-1);
+   tm->redo_len += (tm->redo_len < STBTE__UNDO_BUFFER_COUNT-2);
+   tm->undo_len -= (tm->undo_len > 0);
+}
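The undo and redo writers share one circular buffer and differ only in direction: undo writes advance the position, redo writes retreat it, and both wrap with a power-of-two mask. A minimal sketch of that wrapping on a tiny buffer follows; the `example_*` names are hypothetical, and wrapping a negative position with `&` assumes two's-complement integers.

```c
#include <assert.h>

#define EXAMPLE_COUNT 8                                   // must be a power of two
#define example_wrap(pos)  ((pos) & (EXAMPLE_COUNT - 1))  // cheap modulo for power-of-two sizes

static short example_buffer[EXAMPLE_COUNT];
static int   example_pos;

// Store a value and move forward, as an undo write does.
static void example_push_forward(short v)
{
   example_buffer[example_pos] = v;
   example_pos = example_wrap(example_pos + 1);
}

// Store a value and move backward, as a redo write does.
static void example_push_backward(short v)
{
   example_buffer[example_pos] = v;
   example_pos = example_wrap(example_pos - 1);
}
```

Because the mask handles both overflow past the end and underflow below zero, neither writer needs a branch for the wraparound case.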
+
+static void stbte__begin_undo(stbte_tilemap *tm)
+{
+   tm->redo_len = 0;
+   stbte__write_undo(tm, STBTE__undo_record);
+   stbte__ui.undoing = 1;
+   stbte__ui.alert_msg = 0; // clear alert if they start doing something
+}
+
+static void stbte__end_undo(stbte_tilemap *tm)
+{
+   if (stbte__ui.undoing) {
+      // check if anything got written
+      int pos = stbte__wrap(tm->undo_pos-1);
+      if (tm->undo_buffer[pos] == STBTE__undo_record) {
+         // empty undo record, move back
+         tm->undo_pos = pos;
+         STBTE_ASSERT(tm->undo_len > 0);
+         tm->undo_len -= 1;
+      }
+      tm->undo_buffer[tm->undo_pos] = STBTE__undo_junk;
+      // otherwise do nothing
+
+      stbte__ui.undoing = 0;
+   }
+}
+
+static void stbte__undo_record(stbte_tilemap *tm, int x, int y, int i, int v)
+{
+   STBTE_ASSERT(stbte__ui.undoing);
+   if (stbte__ui.undoing) {
+      stbte__write_undo(tm, v);
+      stbte__write_undo(tm, x);
+      stbte__write_undo(tm, y);
+      stbte__write_undo(tm, i);
+   }
+}
+
+static void stbte__redo_record(stbte_tilemap *tm, int x, int y, int i, int v)
+{
+   stbte__write_redo(tm, v);
+   stbte__write_redo(tm, x);
+   stbte__write_redo(tm, y);
+   stbte__write_redo(tm, i);
+}
+
+static void stbte__undo(stbte_tilemap *tm)
+{
+   // first scan through for the end record
+   int i, pos = stbte__wrap(tm->undo_pos-1), endpos;
+   for (i=0; i < tm->undo_len; i += 4) {
+      STBTE_ASSERT(tm->undo_buffer[pos] != STBTE__undo_junk);
+      if (tm->undo_buffer[pos] == STBTE__undo_record)
+         break;
+      pos = stbte__wrap(pos-4);
+   }
+   if (i >= tm->undo_len)
+      return;
+   endpos = pos;
+
+   // we found a complete undo record
+   pos = stbte__wrap(tm->undo_pos-1);
+
+   // start a redo record
+   stbte__write_redo(tm, STBTE__redo_record);
+
+   // so now go back through undo and apply in reverse
+   // order, and copy it to redo
+   for (i=0; endpos != pos; i += 4) {
+      int x,y,n,v;
+      // get the undo entry
+      n = tm->undo_buffer[pos];
+      y = tm->undo_buffer[stbte__wrap(pos-1)];
+      x = tm->undo_buffer[stbte__wrap(pos-2)];
+      v = tm->undo_buffer[stbte__wrap(pos-3)];
+      pos = stbte__wrap(pos-4);
+      // write the redo entry
+      stbte__redo_record(tm, x, y, n, tm->data[y][x][n]);
+      // apply the undo entry
+      tm->data[y][x][n] = (short) v;
+   }
+   // overwrite undo record with junk
+   tm->undo_buffer[tm->undo_pos] = STBTE__undo_junk;
+}
+
+static void stbte__redo(stbte_tilemap *tm)
+{
+   // first scan through for the end record
+   int i, pos = stbte__wrap(tm->undo_pos+1), endpos;
+   for (i=0; i < tm->redo_len; i += 4) {
+      STBTE_ASSERT(tm->undo_buffer[pos] != STBTE__undo_junk);
+      if (tm->undo_buffer[pos] == STBTE__redo_record)
+         break;
+      pos = stbte__wrap(pos+4);
+   }
+   if (i >= tm->redo_len)
+      return; // this should only ever happen if redo buffer is empty
+   endpos = pos;
+
+   // we found a complete redo record
+   pos = stbte__wrap(tm->undo_pos+1);
+   
+   // start an undo record
+   stbte__write_undo(tm, STBTE__undo_record);
+
+   for (i=0; pos != endpos; i += 4) {
+      int x,y,n,v;
+      n = tm->undo_buffer[pos];
+      y = tm->undo_buffer[stbte__wrap(pos+1)];
+      x = tm->undo_buffer[stbte__wrap(pos+2)];
+      v = tm->undo_buffer[stbte__wrap(pos+3)];
+      pos = stbte__wrap(pos+4);
+      // don't use stbte__undo_record because it's guarded
+      stbte__write_undo(tm, tm->data[y][x][n]);
+      stbte__write_undo(tm, x);
+      stbte__write_undo(tm, y);
+      stbte__write_undo(tm, n);
+      tm->data[y][x][n] = (short) v;
+   }
+   tm->undo_buffer[tm->undo_pos] = STBTE__undo_junk;
+}
+
+static void stbte__draw_rect(int x0, int y0, int x1, int y1, unsigned int color)
+{
+   STBTE_DRAW_RECT(x0,y0,x1,y1, color);
+}
+
+static void stbte__draw_frame(int x0, int y0, int x1, int y1, unsigned int color)
+{
+   stbte__draw_rect(x0,y0,x1-1,y0+1,color);
+   stbte__draw_rect(x1-1,y0,x1,y1-1,color);
+   stbte__draw_rect(x0+1,y1-1,x1,y1,color);
+   stbte__draw_rect(x0,y0+1,x0+1,y1,color);
+}
+
+static void stbte__draw_halfframe(int x0, int y0, int x1, int y1, unsigned int color)
+{
+   stbte__draw_rect(x0,y0,x1,y0+1,color);
+   stbte__draw_rect(x0,y0+1,x0+1,y1,color);
+}
+
+static int stbte__get_char_width(int ch)
+{
+   return stbte__fontdata[ch-16];
+}
+
+static short *stbte__get_char_bitmap(int ch)
+{
+   return stbte__fontdata + stbte__font_offset[ch-16];
+}
+
+static void stbte__draw_bitmask_as_columns(int x, int y, short bitmask, int color)
+{
+   int start_i = -1, i=0;
+   while (bitmask) {
+      if (bitmask & (1<<i)) {
+         if (start_i < 0)
+            start_i = i;   
+      } else if (start_i >= 0) {
+         stbte__draw_rect(x, y+start_i, x+1, y+i, color);
+         start_i = -1;
+         bitmask &= ~((1<<i)-1); // clear all the old bits; we don't clear them as we go to save code
+      }
+      ++i;
+   }
+}
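The column drawing routine above turns each run of consecutive set bits in a glyph column into one filled rectangle, which keeps the number of draw calls low. A minimal sketch of just the run extraction follows, written independently of the original loop; `example_run` and `example_find_runs` are hypothetical names, and only the low 16 bits of the mask are examined.

```c
#include <assert.h>

typedef struct { int start, end; } example_run;   // bits [start, end) are set

// Writes up to max_runs runs into out and returns the number stored.
static int example_find_runs(unsigned int bitmask, example_run *out, int max_runs)
{
   int count = 0, i = 0, start = -1;
   bitmask &= 0xffffu;   // treat the input as a 16-bit column
   // Keep going while bits remain or a run is still open, so a run
   // that reaches the highest set bit gets closed on the next step.
   while ((bitmask >> i) != 0 || start >= 0) {
      int set = (int) ((bitmask >> i) & 1);
      if (set && start < 0) {
         start = i;                     // a new run begins
      } else if (!set && start >= 0) {
         if (count < max_runs) {        // a run ends just before bit i
            out[count].start = start;
            out[count].end   = i;
            ++count;
         }
         start = -1;
      }
      ++i;
   }
   return count;
}
```

Each resulting run would map to a one-pixel-wide rectangle from `y+start` to `y+end`, matching the exclusive bottom edge that the rectangle callback expects.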
+
+static void stbte__draw_bitmap(int x, int y, int w, short *bitmap, int color)
+{
+   int i;
+   for (i=0; i < w; ++i)
+      stbte__draw_bitmask_as_columns(x+i, y, *bitmap++, color);
+}
+
+static void stbte__draw_text_core(int x, int y, const char *str, int w, int color, int digitspace)
+{
+   int x_end = x+w;
+   while (*str) {
+      int c = *str++;
+      int cw = stbte__get_char_width(c);
+      if (x + cw > x_end)
+         break;
+      stbte__draw_bitmap(x, y, cw, stbte__get_char_bitmap(c), color);
+      if (digitspace && c == ' ')
+         cw = stbte__get_char_width('0');
+      x += cw+1;
+   }
+}
+
+static void stbte__draw_text(int x, int y, const char *str, int w, int color)
+{
+   stbte__draw_text_core(x,y,str,w,color,0);
+}
+
+static int stbte__text_width(const char *str)
+{
+   int x = 0;
+   while (*str) {
+      int c = *str++;
+      int cw = stbte__get_char_width(c);
+      x += cw+1;
+   }
+   return x;
+}
+
+static void stbte__draw_frame_delayed(int x0, int y0, int x1, int y1, int color)
+{
+   if (stbte__ui.delaycount < 1024) {
+      stbte__colorrect r = { x0,y0,x1,y1,color };
+      stbte__ui.delayrect[stbte__ui.delaycount++] = r;
+   }
+}
+
+static void stbte__flush_delay(void)
+{
+   stbte__colorrect *r = stbte__ui.delayrect;
+   int i;
+   for (i=0; i < stbte__ui.delaycount; ++i,++r)
+      stbte__draw_frame(r->x0,r->y0,r->x1,r->y1,r->color);
+   stbte__ui.delaycount = 0;
+}
+
+static void stbte__activate(int id)
+{
+   stbte__ui.active_id = id;
+   stbte__ui.active_event = stbte__ui.event;
+}
+
+static int stbte__hittest(int x0, int y0, int x1, int y1, int id)
+{
+   int over =    stbte__ui.mx >= x0 && stbte__ui.my >= y0
+              && stbte__ui.mx <  x1 && stbte__ui.my <  y1;
+
+   if (over && stbte__ui.event >= STBTE__tick)
+      stbte__ui.next_hot_id = id;
+
+   return over;
+}
+
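+// shared click logic: returns 1 when a left-button release completes a click
+// on 'id', -1 when a right-button release does, and 0 otherwise
+static int stbte__button_core(int id)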
+{
+   switch (stbte__ui.event) {
+      case STBTE__leftdown:
+         if (stbte__ui.hot_id == id && STBTE__INACTIVE())
+            stbte__activate(id);
+         break;
+      case STBTE__leftup:
+         if (stbte__ui.active_id == id && STBTE__IS_HOT(id)) {
+            stbte__activate(0);
+            return 1;
+         }
+         break;
+      case STBTE__rightdown:
+         if (stbte__ui.hot_id == id && STBTE__INACTIVE())
+            stbte__activate(id);
+         break;
+      case STBTE__rightup:
+         if (stbte__ui.active_id == id && STBTE__IS_HOT(id)) {
+            stbte__activate(0);
+            return -1;
+         }
+         break;
+   }
+   return 0;
+}
+
+static int stbte__button(char *label, int x, int y, int textoff, int width, int id, int toggled)
+{
+   int x0=x,y0=y, x1=x+width,y1=y+STBTE__BUTTON_HEIGHT;
+   int s = STBTE__BUTTON_INTERNAL_SPACING;
+
+   int over = stbte__hittest(x0,y0,x1,y1,id);
+      
+   if (stbte__ui.event == STBTE__paint) {
+      stbte__draw_rect (x0, y0, x1, y1, STBTE__IS_ACTIVE(id) || toggled ? STBTE_COLOR_BUTTON_DOWN : STBTE_COLOR_BUTTON_BACKGROUND);
+      stbte__draw_frame(x0, y0, x1, y1, STBTE__IS_HOT(id)    || toggled ? STBTE_COLOR_BUTTON_OVER : STBTE_COLOR_BUTTON_OUTLINE);
+      stbte__draw_text (x0+s+textoff, y0+s, label ,width-s*2, toggled ? STBTE_COLOR_BUTTON_TEXT : STBTE_COLOR_BUTTON_TEXT_SELECTED);
+   }
+   return (stbte__button_core(id) == 1);
+}
+
+static int stbte__button_icon(char ch, int x, int y, int width, int id, int toggled)
+{
+   int x0=x,y0=y, x1=x+width,y1=y+STBTE__BUTTON_HEIGHT;
+   int s = STBTE__BUTTON_INTERNAL_SPACING, pad;
+   char label[2] = { ch, 0 };
+
+   int over = stbte__hittest(x0,y0,x1,y1,id);
+      
+   if (stbte__ui.event == STBTE__paint) {
+      stbte__draw_rect (x0, y0, x1, y1, STBTE__IS_ACTIVE(id) || toggled ? STBTE_COLOR_BUTTON_DOWN : STBTE_COLOR_BUTTON_BACKGROUND);
+      stbte__draw_frame(x0, y0, x1, y1, STBTE__IS_HOT(id)    || toggled ? STBTE_COLOR_BUTTON_OVER : STBTE_COLOR_BUTTON_OUTLINE);
+      pad = (9 - stbte__get_char_width(ch))/2;
+      stbte__draw_text (x0+s+pad, y0+s, label ,9, toggled ? STBTE_COLOR_BUTTON_TEXT : STBTE_COLOR_BUTTON_TEXT_SELECTED);
+   }
+   return (stbte__button_core(id) == 1);
+}
+
+static int stbte__minibutton(int x, int y, int ch, int id)
+{
+   int x0 = x, y0 = y, x1 = x+8, y1 = y+7;
+   int over = stbte__hittest(x0,y0,x1,y1,id);
+   if (stbte__ui.event == STBTE__paint) {
+      char str[2] = { ch,0 };
+      stbte__draw_rect (x0,y0,x1,y1, STBTE__IS_ACTIVE(id) ? STBTE_COLOR_MICROBUTTON_DOWN : STBTE_COLOR_MICROBUTTON);
+      stbte__draw_frame(x0,y0,x1,y1, STBTE__IS_HOT(id)    ? STBTE_COLOR_MICROBUTTON_OVER : STBTE_COLOR_MICROBUTTON_FRAME);
+      stbte__draw_text (x0+1,y0,str,99, STBTE_COLOR_MINIBUTTON_ICON);
+   }
+   return stbte__button_core(id);
+}
+
+static int stbte__layerbutton(int x, int y, int ch, int id, int toggled, int disabled, int color)
+{
+   int x0 = x, y0 = y, x1 = x+10, y1 = y+11;
+   int over = stbte__hittest(x0,y0,x1,y1,id);
+   if (stbte__ui.event == STBTE__paint) {
+      int rc = STBTE_COLOR_LAYERCONTROL;
+      int rf = STBTE_COLOR_LAYERCONTROL_OUTLINE;
+      int rt = STBTE_COLOR_LAYERCONTROL_TEXT;
+      if (toggled) {
+         rc = STBTE_COLOR_LAYERCONTROL_TOGGLED;
+         rt = STBTE_COLOR_LAYERCONTROL_TEXT_TOGGLED;
+      }
+      if (STBTE__IS_HOT(id)) {
+         rc = STBTE_COLOR_LAYERCONTROL_OVER;
+      }
+      if (STBTE__IS_ACTIVE(id)) {
+         rc = STBTE_COLOR_LAYERCONTROL_DOWN;
+         rt = STBTE_COLOR_LAYERCONTROL_TEXT_DOWN;
+      }
+      rc &= color;
+      rf &= color;
+      rt &= color;
+      if (disabled) {
+         rc = STBTE_COLOR_LAYERCONTROL_DISABLED;
+         rf = STBTE_COLOR_LAYERCONTROL_OUTLINE_DISABLED;
+         rt = STBTE_COLOR_LAYERCONTROL_TEXT_DISABLED;
+      }
+
+      stbte__draw_rect (x0,y0,x1,y1, rc);
+      stbte__draw_frame(x0,y0,x1,y1, rf);
+      {
+         char str[2] = { ch,0 };
+         int off = (9-stbte__get_char_width(ch))/2;
+         stbte__draw_text (x0+1+off,y0+2,str,99, rt);
+      }
+   }
+   if (disabled)
+      return 0;
+   return stbte__button_core(id);
+}
+
+
+static int stbte__microbutton(int x, int y, int size, int id, int c1, int c2, int toggled)
+{
+   int x0 = x, y0 = y, x1 = x+size, y1 = y+size;
+   int over = stbte__hittest(x0,y0,x1,y1,id);
+   if (stbte__ui.event == STBTE__paint) {
+      stbte__draw_rect (x0,y0,x1,y1, STBTE__IS_ACTIVE(id) || toggled ? c2                           : c1                           );
+      stbte__draw_frame(x0,y0,x1,y1, STBTE__IS_HOT(id)               ? STBTE_COLOR_MICROBUTTON_OVER : STBTE_COLOR_MICROBUTTON_FRAME);
+   }
+   return stbte__button_core(id);
+}
+
+static int stbte__microbutton_dragger(int x, int y, int size, int id, int c1, int c2, int toggled, int *pos)
+{
+   int x0 = x, y0 = y, x1 = x+size, y1 = y+size;
+   int over = stbte__hittest(x0,y0,x1,y1,id);
+   switch (stbte__ui.event) {
+      case STBTE__paint:
+         stbte__draw_rect (x0,y0,x1,y1, STBTE__IS_ACTIVE(id) || toggled ? c2                           : c1                           );
+         stbte__draw_frame(x0,y0,x1,y1, STBTE__IS_HOT(id)               ? STBTE_COLOR_MICROBUTTON_OVER : STBTE_COLOR_MICROBUTTON_FRAME);
+         break;
+      case STBTE__leftdown:
+         if (STBTE__IS_HOT(id) && STBTE__INACTIVE()) {
+            stbte__activate(id);
+            stbte__ui.sx = stbte__ui.mx - *pos;
+         }
+         break;
+      case STBTE__mousemove:
+         if (STBTE__IS_ACTIVE(id) && stbte__ui.active_event == STBTE__leftdown) {
+            *pos = stbte__ui.mx - stbte__ui.sx;  
+         }
+         break;
+      case STBTE__leftup:
+         if (STBTE__IS_ACTIVE(id))
+            stbte__activate(0);
+         break;
+      default:
+         return stbte__button_core(id);
+   }
+   return 0;
+}
+
+static int stbte__category_button(char *label, int x, int y, int width, int id, int toggled)
+{
+   int x0=x,y0=y, x1=x+width,y1=y+STBTE__BUTTON_HEIGHT;
+   int s = STBTE__BUTTON_INTERNAL_SPACING;
+
+   int over = stbte__hittest(x0,y0,x1,y1,id);
+      
+   if (stbte__ui.event == STBTE__paint) {
+      stbte__draw_rect (x0, y0, x1, y1, toggled ? STBTE_COLOR_BUTTON_DOWN : STBTE_COLOR_BUTTON_BACKGROUND);
+      stbte__draw_text (x0+s, y0+s, label ,width-s*2, STBTE__IS_HOT(id) ? STBTE_COLOR_BUTTON_TEXT : STBTE_COLOR_BUTTON_TEXT_SELECTED);
+   }
+
+   return (stbte__button_core(id) == 1);
+}
+
+#define STBTE_COLOR_SCROLLBAR_TRACK  0x808030
+#define STBTE_COLOR_SCROLLBAR_THUMB  0x909040
+
+static void stbte__scrollbar(int x, int y0, int y1, int *val, int v0, int v1, int num_vis, int id)
+{
+   int over;
+   int thumbpos;
+   if (v1 - v0 <= num_vis)
+      return;
+
+   // generate thumbpos from numvis
+   thumbpos = y0+2 + (y1-y0-4) * *val / (v1 - v0 - num_vis);
+   if (thumbpos < y0) thumbpos = y0;
+   if (thumbpos >= y1) thumbpos = y1;
+   over = stbte__hittest(x-1,y0,x+2,y1,id);
+   switch (stbte__ui.event) {
+      case STBTE__paint:
+         stbte__draw_rect(x,y0,x+1,y1, STBTE_COLOR_SCROLLBAR_TRACK);
+         stbte__draw_rect(x-1,thumbpos-3,x+2,thumbpos+4, STBTE_COLOR_SCROLLBAR_THUMB);
+         break;
+      case STBTE__leftdown:
+         if (STBTE__IS_HOT(id) && STBTE__INACTIVE()) {
+            // activate and jump to the clicked position (no separate check for whether the click is over the thumb)
+            stbte__activate(id);
+            *val = ((stbte__ui.my-y0) * (v1 - v0 - num_vis) + (y1-y0)/2)/ (y1-y0);
+         }
+         break;
+      case STBTE__mousemove:
+         if (STBTE__IS_ACTIVE(id) && stbte__ui.mx >= x-15 && stbte__ui.mx <= x+15)
+            *val = ((stbte__ui.my-y0) * (v1 - v0 - num_vis) + (y1-y0)/2)/ (y1-y0);
+         break;
+      case STBTE__leftup:
+         if (STBTE__IS_ACTIVE(id))
+            stbte__activate(0);
+         break;
+
+   }
+
+   if (*val >= v1-num_vis)
+      *val = v1-num_vis;
+   if (*val <= v0)
+      *val = v0;
+}
+
+
+static void stbte__compute_digits(stbte_tilemap *tm)
+{
+   if (tm->max_x >= 1000 || tm->max_y >= 1000)
+      tm->digits = 4;
+   else if (tm->max_x >= 100 || tm->max_y >= 100)
+      tm->digits = 3;
+   else
+      tm->digits = 2;
+}
+
+typedef struct
+{
+   int width, height;
+   int x,y;
+   int active;
+   float retracted;
+} stbte__region_t;
+
+stbte__region_t stbte__region[4];
+
+#define STBTE__TOOLBAR_ICON_SIZE   (9+2*2)
+#define STBTE__TOOLBAR_PASTE_SIZE  (34+2*2)
+
+// This routine computes where every panel goes onscreen: computes
+// a minimum width for each side based on which panels are on that
+// side, and accounts for width-dependent layout of certain panels.
+static void stbte__compute_panel_locations(stbte_tilemap *tm)
+{
+   int i, limit, w, k;
+   int window_width  = stbte__ui.x1 - stbte__ui.x0;
+   int window_height = stbte__ui.y1 - stbte__ui.y0;
+   int min_width[STBTE__num_panel]={0,0,0,0,0};
+   int height[STBTE__num_panel]={0,0,0,0,0};
+   int panel_active[STBTE__num_panel]={1,1,1,1,1};
+   int vpos[4] = { 0,0,0,0 };
+   stbte__panel *p = stbte__ui.panel;
+   stbte__panel *pt = &p[STBTE__panel_toolbar];
+
+   for (i=0; i < 4; ++i) {
+      stbte__region[i].active = 0;
+      stbte__region[i].width = 0;
+      stbte__region[i].height = 0;
+   }
+
+   // compute number of digits needed for info panel
+   stbte__compute_digits(tm);
+
+   // determine which panels are active
+   panel_active[STBTE__panel_categories] = tm->num_categories != 0;
+   panel_active[STBTE__panel_layers    ] = tm->num_layers     >  1;
+
+   // compute minimum widths for each panel (assuming they're on sides not top)
+   min_width[STBTE__panel_info      ] = 8 + 11 + 7*tm->digits+17+7;               // estimate min width of "w:0000"
+   min_width[STBTE__panel_tiles     ] = 4 + tm->palette_spacing_x + 5;            // 5 for scrollbar
+   min_width[STBTE__panel_categories] = 4 + 42 + 5;                               // 42 is enough to show ~7 chars; 5 for scrollbar
+   min_width[STBTE__panel_layers    ] = 4 + 54 + 30*tm->has_layer_names;          // 2 digits plus 3 buttons plus scrollbar
+   min_width[STBTE__panel_toolbar   ] = 4 + STBTE__TOOLBAR_PASTE_SIZE;            // wide enough for 'Paste' button
+
+   // compute minimum widths for left & right panels based on the above
+   stbte__region[0].width = stbte__ui.left_width;
+   stbte__region[1].width = stbte__ui.right_width;
+
+   for (i=0; i < STBTE__num_panel; ++i) {
+      if (panel_active[i]) {
+         int side = stbte__ui.panel[i].side;
+         if (min_width[i] > stbte__region[side].width)
+            stbte__region[side].width = min_width[i];
+         stbte__region[side].active = 1;
+      }
+   }
+
+   // now compute the heights of each panel
+
+   // if toolbar at top, compute its size & push the left and right start points down
+   if (stbte__region[STBTE__side_top].active) {
+      int height = STBTE__TOOLBAR_ICON_SIZE+2;
+      pt->x0     = stbte__ui.x0;
+      pt->y0     = stbte__ui.y0;
+      pt->width  = window_width;
+      pt->height = height;
+      vpos[STBTE__side_left] = vpos[STBTE__side_right] = height;
+   } else {
+      int num_rows = STBTE__num_tool * ((stbte__region[pt->side].width-4)/STBTE__TOOLBAR_ICON_SIZE);
+      height[STBTE__panel_toolbar] = num_rows*13 + 3*15 + 4; // 3*15 for cut/copy/paste, which are stacked vertically
+   }
+
+   for (i=0; i < 4; ++i)
+      stbte__region[i].y = stbte__ui.y0 + vpos[i];
+
+   for (i=0; i < 2; ++i) {
+      int anim = (int) (stbte__region[i].width * stbte__region[i].retracted);
+      stbte__region[i].x = (i == STBTE__side_left) ? stbte__ui.x0 - anim : stbte__ui.x1 - stbte__region[i].width + anim;
+   }
+
+   // info panel
+   w = stbte__region[p[STBTE__panel_info].side].width;
+   p[STBTE__panel_info].mode = (w >= 8 + (11+7*tm->digits+17)*2 + 4);
+   if (p[STBTE__panel_info].mode)
+      height[STBTE__panel_info] = 5 + 11*2 + 2 + tm->palette_spacing_y;
+   else
+      height[STBTE__panel_info] = 5 + 11*4 + 2 + tm->palette_spacing_y;
+
+   // layers
+   limit = 6 + stbte__ui.panel[STBTE__panel_layers].delta_height;
+   height[STBTE__panel_layers] = (tm->num_layers > limit ? limit : tm->num_layers)*15 + 7 + (tm->has_layer_names ? 0 : 11);
+
+   // categories
+   limit = 6 + stbte__ui.panel[STBTE__panel_categories].delta_height;
+   height[STBTE__panel_categories] = (tm->num_categories+1 > limit ? limit : tm->num_categories+1)*11 + 14;
+   if (stbte__ui.panel[STBTE__panel_categories].side == stbte__ui.panel[STBTE__panel_tiles].side)
+      height[STBTE__panel_categories] -= 4;
+
+   // palette
+   k =  (stbte__region[p[STBTE__panel_tiles].side].width - 8) / tm->palette_spacing_x;
+   if (k == 0) k = 1;
+   height[STBTE__panel_tiles] = ((tm->num_tiles+k-1)/k) * tm->palette_spacing_y + 8;
+
+   // now compute the locations of all the panels
+   for (i=0; i < STBTE__num_panel; ++i) {
+      if (panel_active[i]) {
+         int side = p[i].side;
+         if (side == STBTE__side_left || side == STBTE__side_right) {
+            p[i].width  = stbte__region[side].width;
+            p[i].x0     = stbte__region[side].x;
+            p[i].y0     = stbte__ui.y0 + vpos[side];
+            p[i].height = height[i];
+            vpos[side] += height[i];
+            if (vpos[side] > window_height) {
+               vpos[side] = window_height;
+               p[i].height = stbte__ui.y1 - p[i].y0;
+            }
+         } else {
+            ; // it's at top, it's already been explicitly set up earlier
+         }
+      } else {
+         // inactive panel
+         p[i].height = 0;
+         p[i].width  = 0;
+         p[i].x0     = stbte__ui.x1;
+         p[i].y0     = stbte__ui.y1;
+      }
+   }
+}
+
+// unique identifiers for imgui
+enum
+{
+   STBTE__map=1,
+   STBTE__region,
+   STBTE__panel,                          // panel background to hide map, and misc controls
+   STBTE__info,                           // info data
+   STBTE__toolbarA, STBTE__toolbarB,      // toolbar buttons: param is tool number
+   STBTE__palette,                        // palette selectors: param is tile index
+   STBTE__categories,                     // category selectors: param is category index
+   STBTE__layer,                          //
+   STBTE__solo, STBTE__hide, STBTE__lock, // layer controls: param is layer
+   STBTE__scrollbar,                      // param is panel ID
+   STBTE__panel_mover,                    // p1 is panel ID, p2 is destination side
+   STBTE__panel_sizer,                    // param panel ID
+   STBTE__scrollbar_id,
+};
+
+// id is:      [      24-bit data     : 7-bit identifier ]
+// map id is:  [  12-bit y : 12-bit x : 7-bit identifier ]
+
+#define STBTE__ID(n,p)     ((n) + ((p)<<7))
+#define STBTE__ID2(n,p,q)  STBTE__ID(n, ((p)<<12)+(q) )
+#define STBTE__IDMAP(x,y)  STBTE__ID2(STBTE__map, x,y)
+
+static void stbte__activate_map(int x, int y)
+{
+   stbte__ui.active_id = STBTE__IDMAP(x,y);
+   stbte__ui.active_event = stbte__ui.event;
+   stbte__ui.sx = x;
+   stbte__ui.sy = y;
+}
+
+static void stbte__alert(const char *msg)
+{
+   stbte__ui.alert_msg = msg;
+   stbte__ui.alert_timer = 3;
+}
+
+static void stbte__brush_predict(stbte_tilemap *tm, short result[])
+{
+   int layer_to_paint = tm->cur_layer;
+   stbte__tileinfo *ti;
+   int i;
+
+   if (tm->cur_tile < 0) return;
+
+   ti = &tm->tiles[tm->cur_tile];
+
+   // find lowest legit layer to paint it on, and put it there
+   for (i=0; i < tm->num_layers; ++i) {
+      // check if object is allowed on layer
+      if (!(ti->layermask & (1 << i)))
+         continue;
+
+      if (i != tm->solo_layer) {
+         short bg;
+
+         // if there's a selected layer, can only paint on that
+         if (tm->cur_layer >= 0 && i != tm->cur_layer)
+            continue;
+
+         // if the layer is hidden, we can't see it
+         if (tm->layerinfo[i].hidden)
+            continue;
+
+         // if the layer is locked, we can't write to it
+         if (tm->layerinfo[i].locked == STBTE__locked)
+            continue;
+
+         bg = i == 0 ? tm->background_tile : STBTE__NO_TILE;
+         // if the layer is non-empty and protected, can't write to it
+         if (tm->layerinfo[i].locked == STBTE__protected && result[i] != bg)
+            continue;
+      }
+
+      result[i] = ti->id;
+      return;
+   }
+}
+
+static void stbte__brush(stbte_tilemap *tm, int x, int y)
+{
+   int layer_to_paint = tm->cur_layer;
+   stbte__tileinfo *ti;
+
+   // find lowest legit layer to paint it on, and put it there
+   int i;
+
+   if (tm->cur_tile < 0) return;
+
+   ti = &tm->tiles[tm->cur_tile];
+
+   for (i=0; i < tm->num_layers; ++i) {
+      // check if object is allowed on layer
+      if (!(ti->layermask & (1 << i)))
+         continue;
+
+      if (i != tm->solo_layer) {
+         short bg;
+
+         // if there's a selected layer, can only paint on that
+         if (tm->cur_layer >= 0 && i != tm->cur_layer)
+            continue;
+
+         // if the layer is hidden, we can't see it
+         if (tm->layerinfo[i].hidden)
+            continue;
+
+         // if the layer is locked, we can't write to it
+         if (tm->layerinfo[i].locked == STBTE__locked)
+            continue;
+
+         bg = i == 0 ? tm->background_tile : STBTE__NO_TILE;
+         // if the layer is non-empty and protected, can't write to it
+         if (tm->layerinfo[i].locked == STBTE__protected && tm->data[y][x][i] != bg)
+            continue;
+      }
+
+      stbte__undo_record(tm,x,y,i,tm->data[y][x][i]);
+      tm->data[y][x][i] = ti->id;
+      return;
+   }
+
+   //stbte__alert("Selected tile not valid on active layer(s)");
+}
+
+enum
+{
+   STBTE__erase_none = -1,
+   STBTE__erase_brushonly = 0,
+   STBTE__erase_any = 1,
+};
+
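+// preview counterpart of stbte__erase: applies the erase to the tile stack
+// 'result' instead of the map, and returns which kind of erase happened
+// (STBTE__erase_none if nothing changed)
+static int stbte__erase_predict(stbte_tilemap *tm, short result[], int allow_any)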
+{
+   stbte__tileinfo *ti = tm->cur_tile >= 0 ? &tm->tiles[tm->cur_tile] : NULL;
+   int i;
+
+   if (allow_any == STBTE__erase_none)
+      return allow_any;
+
+   // first check if only one layer is legit
+   i = tm->cur_layer;
+   if (tm->solo_layer >= 0)
+      i = tm->solo_layer;
+
+   // if only one layer is legit, directly process that one for clarity
+   if (i >= 0) {
+      short bg = (i == 0 ? tm->background_tile : -1);
+      if (tm->solo_layer < 0) {
+         // check that we're allowed to write to it
+         if (tm->layerinfo[i].hidden) return STBTE__erase_none;
+         if (tm->layerinfo[i].locked) return STBTE__erase_none;
+      }
+      if (result[i] == bg)
+         return STBTE__erase_none; // didn't erase anything
+      if (ti && result[i] == ti->id && (i != 0 || ti->id != tm->background_tile)) {
+         result[i] = bg;
+         return STBTE__erase_brushonly;
+      }
+      if (allow_any == STBTE__erase_any) {
+         result[i] = bg;
+         return STBTE__erase_any;
+      }
+      return STBTE__erase_none;
+   }
+
+   // if multiple layers are legit, first scan all for brush data
+
+   if (ti) {
+      for (i=tm->num_layers-1; i >= 0; --i) {
+         if (result[i] != ti->id)
+            continue;
+         if (tm->layerinfo[i].locked || tm->layerinfo[i].hidden)
+            continue;
+         if (i == 0 && result[i] == tm->background_tile)
+            return STBTE__erase_none;
+         result[i] = (i == 0 ? tm->background_tile : STBTE__NO_TILE);
+         return STBTE__erase_brushonly;
+      }
+   }
+
+   if (allow_any != STBTE__erase_any)
+      return STBTE__erase_none;
+
+   // apply layer filters, erase from top
+   for (i=tm->num_layers-1; i >= 0; --i) {
+      if (result[i] < 0)
+         continue;
+      if (tm->layerinfo[i].locked || tm->layerinfo[i].hidden)
+         continue;
+      if (i == 0 && result[i] == tm->background_tile)
+         return STBTE__erase_none;
+      result[i] = (i == 0 ? tm->background_tile : STBTE__NO_TILE);
+      return STBTE__erase_any;
+   }
+
+   return STBTE__erase_none;
+}
+
+
+static int stbte__erase(stbte_tilemap *tm, int x, int y, int allow_any)
+{
+   stbte__tileinfo *ti = tm->cur_tile >= 0 ? &tm->tiles[tm->cur_tile] : NULL;
+   int i;
+
+   if (allow_any == STBTE__erase_none)
+      return allow_any;
+
+   // first check if only one layer is legit
+   i = tm->cur_layer;
+   if (tm->solo_layer >= 0)
+      i = tm->solo_layer;
+
+   // if only one layer is legit, directly process that one for clarity
+   if (i >= 0) {
+      short bg = (i == 0 ? tm->background_tile : -1);
+      if (tm->solo_layer < 0) {
+         // check that we're allowed to write to it
+         if (tm->layerinfo[i].hidden) return STBTE__erase_none;
+         if (tm->layerinfo[i].locked) return STBTE__erase_none;
+      }
+      if (tm->data[y][x][i] == bg)
+         return STBTE__erase_none; // didn't erase anything
+      if (ti && tm->data[y][x][i] == ti->id && (i != 0 || ti->id != tm->background_tile)) {
+         stbte__undo_record(tm,x,y,i,tm->data[y][x][i]);
+         tm->data[y][x][i] = bg;
+         return STBTE__erase_brushonly;
+      }
+      if (allow_any == STBTE__erase_any) {
+         stbte__undo_record(tm,x,y,i,tm->data[y][x][i]);
+         tm->data[y][x][i] = bg;
+         return STBTE__erase_any;
+      }
+      return STBTE__erase_none;
+   }
+
+   // if multiple layers are legit, first scan all for brush data
+
+   if (ti) {
+      for (i=tm->num_layers-1; i >= 0; --i) {
+         if (tm->data[y][x][i] != ti->id)
+            continue;
+         if (tm->layerinfo[i].locked || tm->layerinfo[i].hidden)
+            continue;
+         if (i == 0 && tm->data[y][x][i] == tm->background_tile)
+            return STBTE__erase_none;
+         stbte__undo_record(tm,x,y,i,tm->data[y][x][i]);
+         tm->data[y][x][i] = (i == 0 ? tm->background_tile : STBTE__NO_TILE);
+         return STBTE__erase_brushonly;
+      }
+   }
+
+   if (allow_any != STBTE__erase_any)
+      return STBTE__erase_none;
+
+   // apply layer filters, erase from top
+   for (i=tm->num_layers-1; i >= 0; --i) {
+      if (tm->data[y][x][i] < 0)
+         continue;
+      if (tm->layerinfo[i].locked || tm->layerinfo[i].hidden)
+         continue;
+      if (i == 0 && tm->data[y][x][i] == tm->background_tile)
+         return STBTE__erase_none;
+      stbte__undo_record(tm,x,y,i,tm->data[y][x][i]);
+      tm->data[y][x][i] = (i == 0 ? tm->background_tile : STBTE__NO_TILE);
+      return STBTE__erase_any;
+   }
+
+   return STBTE__erase_none;
+}
+
+static int stbte__find_tile(stbte_tilemap *tm, int tile_id)
+{
+   int i;
+   for (i=0; i < tm->num_tiles; ++i)
+      if (tm->tiles[i].id == tile_id)
+         return i;
+   stbte__alert("Eyedropped tile that isn't in tileset");
+   return -1;
+}
+
+static void stbte__eyedrop(stbte_tilemap *tm, int x, int y)
+{
+   int i,j;
+
+   // flush eyedropper state
+   if (stbte__ui.eyedrop_x != x || stbte__ui.eyedrop_y != y) {
+      stbte__ui.eyedrop_x = x;
+      stbte__ui.eyedrop_y = y;
+      stbte__ui.eyedrop_last_layer = tm->num_layers;
+   }
+
+   // if only one layer is active, query that
+   i = tm->cur_layer;
+   if (tm->solo_layer >= 0)
+      i = tm->solo_layer;
+   if (i >= 0) {
+      if (tm->data[y][x][i] == STBTE__NO_TILE)
+         return;
+      tm->cur_tile = stbte__find_tile(tm, tm->data[y][x][i]);
+      return;
+   }
+
+   // if multiple layers, continue from previous
+   i = stbte__ui.eyedrop_last_layer;
+   for (j=0; j < tm->num_layers; ++j) {
+      if (--i < 0)
+         i = tm->num_layers-1;
+      if (tm->layerinfo[i].hidden)
+         continue;
+      if (tm->data[y][x][i] == STBTE__NO_TILE)
+         continue;
+      stbte__ui.eyedrop_last_layer = i;
+      tm->cur_tile = stbte__find_tile(tm, tm->data[y][x][i]);
+      return;
+   }
+}
+
+// compute the result of pasting into a tile non-destructively so we can preview it
+static void stbte__paste_stack(stbte_tilemap *tm, short result[], short dest[], short src[], int dragging)
+{
+   int i;
+
+   // special case single-layer
+   i = tm->cur_layer;
+   if (tm->solo_layer >= 0)
+      i = tm->solo_layer;
+   if (i >= 0) {
+      if (tm->solo_layer < 0) {
+         // check that we're allowed to write to it
+         if (tm->layerinfo[i].hidden) return;
+         if (tm->layerinfo[i].locked == STBTE__locked) return;
+         // if dragging w/o copy, we have to be allowed to erase
+         if (dragging && tm->layerinfo[i].locked == STBTE__protected)
+             return;
+      }
+      result[i] = dest[i];
+      if (src[i] != STBTE__NO_TILE)
+         result[i] = src[i];
+      return;
+   }
+
+   for (i=0; i < tm->num_layers; ++i) {
+      result[i] = dest[i];
+      if (src[i] != STBTE__NO_TILE) {
+         if (!tm->layerinfo[i].hidden && tm->layerinfo[i].locked != STBTE__locked)
+            if (!dragging || tm->layerinfo[i].locked == STBTE__unlocked)
+               result[i] = src[i];
+         }
+   }
+}
+
+// compute the result of dragging away from a tile
+static void stbte__clear_stack(stbte_tilemap *tm, short result[])
+{
+   int i;
+   // special case single-layer
+   i = tm->cur_layer;
+   if (tm->solo_layer >= 0)
+      i = tm->solo_layer;
+   if (i >= 0) {
+      result[i] = (i == 0 ? tm->background_tile : STBTE__NO_TILE);
+   } else {
+      for (i=0; i < tm->num_layers; ++i) {
+         if (!tm->layerinfo[i].hidden && tm->layerinfo[i].locked == STBTE__unlocked)
+            result[i] = (i == 0 ? tm->background_tile : STBTE__NO_TILE);
+      }
+   }
+}
+
+// check if some map square is active
+#define STBTE__IS_MAP_ACTIVE()  ((stbte__ui.active_id & 127) == STBTE__map)
+#define STBTE__IS_MAP_HOT()     ((stbte__ui.hot_id & 127) == STBTE__map)
+
+static void stbte__fillrect(stbte_tilemap *tm, int x0, int y0, int x1, int y1, int fill)
+{
+   int i,j;
+   int x=x0,y=y0;
+
+   stbte__begin_undo(tm);
+   if (x0 > x1) i=x0,x0=x1,x1=i;
+   if (y0 > y1) j=y0,y0=y1,y1=j;
+   for (j=y0; j <= y1; ++j)
+      for (i=x0; i <= x1; ++i)
+         if (fill)
+            stbte__brush(tm, i,j);
+         else
+            stbte__erase(tm, i,j,STBTE__erase_any);
+   stbte__end_undo(tm);
+   // suppress warning from brush
+   stbte__ui.alert_msg = 0;
+}
+
+static void stbte__select_rect(stbte_tilemap *tm, int x0, int y0, int x1, int y1)
+{
+   stbte__ui.has_selection = 1;
+   stbte__ui.select_x0 = (x0 < x1 ? x0 : x1);
+   stbte__ui.select_x1 = (x0 < x1 ? x1 : x0);
+   stbte__ui.select_y0 = (y0 < y1 ? y0 : y1);
+   stbte__ui.select_y1 = (y0 < y1 ? y1 : y0);
+}
+
+static void stbte__copy_cut(stbte_tilemap *tm, int cut)
+{
+   int i,j,n,w,h,p=0;
+   if (!stbte__ui.has_selection)
+      return;
+   w = stbte__ui.select_x1 - stbte__ui.select_x0 + 1;
+   h = stbte__ui.select_y1 - stbte__ui.select_y0 + 1;
+   if (STBTE_MAX_COPY / w < h) {
+      stbte__alert("Selection too large for copy buffer, increase STBTE_MAX_COPY");
+      return;
+   }
+
+   for (i=0; i < w*h; ++i)
+      for (n=0; n < tm->num_layers; ++n)
+         stbte__ui.copybuffer[i][n] = STBTE__NO_TILE;
+
+   if (cut)
+      stbte__begin_undo(tm);
+   for (j=stbte__ui.select_y0; j <= stbte__ui.select_y1; ++j) {
+      for (i=stbte__ui.select_x0; i <= stbte__ui.select_x1; ++i) {
+         for (n=0; n < tm->num_layers; ++n) {
+            if (tm->solo_layer >= 0) {
+               if (tm->solo_layer != n)
+                  continue;
+            } else {
+               if (tm->cur_layer >= 0)
+                  if (tm->cur_layer != n)
+                     continue;
+               if (tm->layerinfo[n].hidden)
+                  continue;
+               if (cut && tm->layerinfo[n].locked)
+                  continue;
+            }
+            stbte__ui.copybuffer[p][n] = tm->data[j][i][n];
+            if (cut) {
+               stbte__undo_record(tm,i,j,n, tm->data[j][i][n]);
+               tm->data[j][i][n] = (n==0 ? tm->background_tile : -1);
+            }
+         }
+         ++p;
+      }
+   }
+   if (cut)
+      stbte__end_undo(tm);
+   stbte__ui.copy_width = w;
+   stbte__ui.copy_height = h;
+   stbte__ui.has_copy = 1;
+   stbte__ui.has_selection = 0;
+}
+
+static void stbte__paste(stbte_tilemap *tm, int mapx, int mapy)
+{
+   int w = stbte__ui.copy_width;
+   int h = stbte__ui.copy_height;
+   int i,j,k,p;
+   int x = mapx - (w>>1);
+   int y = mapy - (h>>1);
+   if (stbte__ui.has_copy == 0)
+      return;
+   stbte__begin_undo(tm);
+   p = 0;
+   for (j=0; j < h; ++j) {
+      for (i=0; i < w; ++i) {
+         if (y+j >= 0 && y+j < tm->max_y && x+i >= 0 && x+i < tm->max_x) {
+            // compute the new stack
+            short tilestack[STBTE_MAX_LAYERS];
+            for (k=0; k < tm->num_layers; ++k)
+               tilestack[k] = tm->data[y+j][x+i][k];
+            stbte__paste_stack(tm, tilestack, tilestack, stbte__ui.copybuffer[p], 0);
+            // update anything that changed
+            for (k=0; k < tm->num_layers; ++k) {
+               if (tilestack[k] != tm->data[y+j][x+i][k]) {
+                  stbte__undo_record(tm, x+i,y+j,k, tm->data[y+j][x+i][k]);
+                  tm->data[y+j][x+i][k] = tilestack[k];
+               }
+            }
+         }
+         ++p;
+      }
+   }
+   stbte__end_undo(tm);
+}
+
+static void stbte__drag_update(stbte_tilemap *tm, int mapx, int mapy)
+{
+   int w = stbte__ui.drag_w, h = stbte__ui.drag_h;
+   int ox,oy,i;
+   short temp[STBTE_MAX_LAYERS];
+   short *data = NULL;
+   if (!stbte__ui.shift) {
+      ox = mapx - stbte__ui.drag_x;
+      oy = mapy - stbte__ui.drag_y;
+      if (ox >= 0 && ox < w && oy >= 0 && oy < h) {
+         for (i=0; i < tm->num_layers; ++i)
+            temp[i] = tm->data[mapy][mapx][i];
+         data = temp;
+         stbte__clear_stack(tm, data);
+      }
+   }
+   ox = mapx - stbte__ui.drag_dest_x;
+   oy = mapy - stbte__ui.drag_dest_y;
+   if (ox >= 0 && ox < w && oy >= 0 && oy < h) {
+      if (data == NULL) {
+         for (i=0; i < tm->num_layers; ++i)
+            temp[i] = tm->data[mapy][mapx][i];
+         data = temp;
+      }
+      stbte__paste_stack(tm, data, data, tm->data[stbte__ui.drag_y+oy][stbte__ui.drag_x+ox], !stbte__ui.shift);
+   }
+   if (data) {
+      for (i=0; i < tm->num_layers; ++i)
+         if (tm->data[mapy][mapx][i] != data[i]) {
+            stbte__undo_record(tm, mapx, mapy, i, tm->data[mapy][mapx][i]);
+            tm->data[mapy][mapx][i] = data[i];
+         }
+   }
+}
+
+static void stbte__drag_place(stbte_tilemap *tm, int mapx, int mapy)
+{
+   int i,j;
+   int move_x = (stbte__ui.drag_dest_x - stbte__ui.drag_x);
+   int move_y = (stbte__ui.drag_dest_y - stbte__ui.drag_y);
+   if (move_x == 0 && move_y == 0)
+      return;
+
+   stbte__begin_undo(tm);
+   // we now need a 2D memmove-style mover that doesn't
+   // overwrite any data as it goes. this requires being
+   // direction sensitive in the same way as memmove
+   if (move_y > 0 || (move_y == 0 && move_x > 0)) {
+      for (j=tm->max_y-1; j >= 0; --j)
+         for (i=tm->max_x-1; i >= 0; --i)
+            stbte__drag_update(tm,i,j);
+   } else {
+      for (j=0; j < tm->max_y; ++j)
+         for (i=0; i < tm->max_x; ++i)
+            stbte__drag_update(tm,i,j);
+   }
+   stbte__end_undo(tm);
+
+   stbte__ui.has_selection = 1;
+   stbte__ui.select_x0 = stbte__ui.drag_dest_x;
+   stbte__ui.select_y0 = stbte__ui.drag_dest_y;
+   stbte__ui.select_x1 = stbte__ui.select_x0 + stbte__ui.drag_w - 1; // selection bounds are inclusive
+   stbte__ui.select_y1 = stbte__ui.select_y0 + stbte__ui.drag_h - 1;
+}
+
+
+static void stbte__tile(stbte_tilemap *tm, int sx, int sy, int mapx, int mapy)
+{
+   int tool = stbte__ui.tool;
+   int i;
+   int x0=sx, y0=sy;
+   int x1=sx+tm->spacing_x, y1=sy+tm->spacing_y;
+   int id = STBTE__IDMAP(mapx,mapy);
+   int over = stbte__hittest(x0,y0,x1,y1, id);
+   switch (stbte__ui.event) {
+      case STBTE__paint: {
+         short *data = tm->data[mapy][mapx];
+         short temp[STBTE_MAX_LAYERS];
+
+         if (STBTE__IS_MAP_HOT()) {
+            if (stbte__ui.pasting) {
+               int ox = mapx - stbte__ui.paste_x;
+               int oy = mapy - stbte__ui.paste_y;
+               if (ox >= 0 && ox < stbte__ui.copy_width && oy >= 0 && oy < stbte__ui.copy_height) {
+                  stbte__paste_stack(tm, temp, tm->data[mapy][mapx], stbte__ui.copybuffer[oy*stbte__ui.copy_width+ox], 0);
+                  data = temp;
+               }
+            } else if (stbte__ui.dragging) {
+               int ox,oy;
+               for (i=0; i < tm->num_layers; ++i)
+                  temp[i] = tm->data[mapy][mapx][i];
+               data = temp;
+
+               // if it's in the source area, remove things unless shift-dragging
+               ox = mapx - stbte__ui.drag_x;
+               oy = mapy - stbte__ui.drag_y;
+               if (!stbte__ui.shift && ox >= 0 && ox < stbte__ui.drag_w && oy >= 0 && oy < stbte__ui.drag_h) {
+                  stbte__clear_stack(tm, temp);
+               }
+
+               ox = mapx - stbte__ui.drag_dest_x;
+               oy = mapy - stbte__ui.drag_dest_y;
+               if (ox >= 0 && ox < stbte__ui.drag_w && oy >= 0 && oy < stbte__ui.drag_h) {
+                  stbte__paste_stack(tm, temp, temp, tm->data[stbte__ui.drag_y+oy][stbte__ui.drag_x+ox], !stbte__ui.shift);
+               }
+            } else if (STBTE__IS_MAP_ACTIVE()) {
+               if (stbte__ui.tool == STBTE__tool_rect) {
+                  if ((stbte__ui.ms_time & 511) < 380) {
+                     int ex = ((stbte__ui.hot_id >> 19) & 4095);
+                     int ey = ((stbte__ui.hot_id >>  7) & 4095);
+                     int sx = stbte__ui.sx;
+                     int sy = stbte__ui.sy;
+
+                     if (   ((mapx >= sx && mapx < ex+1) || (mapx >= ex && mapx < sx+1))
+                         && ((mapy >= sy && mapy < ey+1) || (mapy >= ey && mapy < sy+1))) {
+                        int i;
+                        for (i=0; i < tm->num_layers; ++i)
+                           temp[i] = tm->data[mapy][mapx][i];
+                        data = temp;
+                        if (stbte__ui.active_event == STBTE__leftdown)
+                           stbte__brush_predict(tm, temp);
+                        else
+                           stbte__erase_predict(tm, temp, STBTE__erase_any);
+                     }
+                  }
+               }
+            }
+         }
+
+         if (STBTE__IS_HOT(id) && STBTE__INACTIVE() && !stbte__ui.pasting) {
+            if (stbte__ui.tool == STBTE__tool_brush) {
+               if ((stbte__ui.ms_time & 511) < 300) {
+                  data = temp;
+                  for (i=0; i < tm->num_layers; ++i)
+                     temp[i] = tm->data[mapy][mapx][i];
+                  stbte__brush_predict(tm, temp);
+               }
+            }
+         }
+
+         for (i=0; i < tm->num_layers; ++i) {
+            if (i == tm->solo_layer || (!tm->layerinfo[i].hidden && tm->solo_layer < 0))
+               if (data[i] >= 0)
+                  STBTE_DRAW_TILE(x0,y0, (unsigned short) data[i], 0);
+            if (i == 0 && stbte__ui.show_grid==1)
+               stbte__draw_halfframe(x0,y0,x0+tm->spacing_x, y0+tm->spacing_y, STBTE_COLOR_GRID);
+         }
+         if (stbte__ui.pasting || stbte__ui.dragging || stbte__ui.scrolling)
+            break;
+         if (stbte__ui.scrollkey && !STBTE__IS_MAP_ACTIVE())
+            break;
+         if (STBTE__IS_HOT(id) && STBTE__IS_MAP_ACTIVE() && (tool == STBTE__tool_rect || tool == STBTE__tool_select)) {
+            int rx0,ry0,rx1,ry1,t;
+            // compute the center of each rect
+            rx0 = x0 + tm->spacing_x/2;
+            ry0 = y0 + tm->spacing_y/2;
+            rx1 = rx0 + (stbte__ui.sx - mapx) * tm->spacing_x;
+            ry1 = ry0 + (stbte__ui.sy - mapy) * tm->spacing_y;
+            if (rx0 > rx1) t=rx0,rx0=rx1,rx1=t;
+            if (ry0 > ry1) t=ry0,ry0=ry1,ry1=t;
+            rx0 -= tm->spacing_x/2;
+            ry0 -= tm->spacing_y/2;
+            rx1 += tm->spacing_x/2;
+            ry1 += tm->spacing_y/2;
+            stbte__draw_frame_delayed(rx0-1,ry0-1,rx1+1,ry1+1, STBTE_COLOR_TILEMAP_HIGHLIGHT);
+            break;
+         }
+         if (STBTE__IS_HOT(id) && STBTE__INACTIVE()) {
+            stbte__draw_frame_delayed(x0-1,y0-1,x1+1,y1+1, STBTE_COLOR_TILEMAP_HIGHLIGHT);
+         }
+         break;
+      }
+   }
+
+   if (stbte__ui.pasting) {
+      switch (stbte__ui.event) {
+         case STBTE__leftdown:
+            if (STBTE__IS_HOT(id)) {
+               stbte__ui.pasting = 0;
+               stbte__paste(tm, mapx, mapy);
+               stbte__activate(0);
+            }
+            break;
+         case STBTE__leftup:
+            // just clear it no matter what, since they might click away to clear it
+            stbte__activate(0);
+            break;
+         case STBTE__rightdown:
+            if (STBTE__IS_HOT(id)) {
+               stbte__activate(0);
+               stbte__ui.pasting = 0;
+            }
+            break;
+      }
+      return;
+   }
+
+   if (stbte__ui.scrolling) {
+      if (stbte__ui.event == STBTE__leftup) {
+         stbte__activate(0);
+         stbte__ui.scrolling = 0;
+      }
+      if (stbte__ui.event == STBTE__mousemove) {
+         tm->scroll_x += (stbte__ui.start_x - stbte__ui.mx);
+         tm->scroll_y += (stbte__ui.start_y - stbte__ui.my);
+         stbte__ui.start_x = stbte__ui.mx;
+         stbte__ui.start_y = stbte__ui.my;
+      }
+      return;
+   }
+
+   // regardless of tool, leftdown while the scroll key is held starts a scrolldrag
+   if (STBTE__IS_HOT(id) && stbte__ui.scrollkey && stbte__ui.event == STBTE__leftdown) {
+      stbte__ui.scrolling = 1;
+      stbte__ui.start_x = stbte__ui.mx;
+      stbte__ui.start_y = stbte__ui.my;
+      return;
+   }
+
+   switch (tool) {
+      case STBTE__tool_brush:
+         switch (stbte__ui.event) {
+            case STBTE__mousemove:
+               if (STBTE__IS_MAP_ACTIVE() && over) {
+                  // don't brush/erase same tile multiple times unless they move away and back
+                  // @TODO should just be only once, but that needs another data structure
+                  if (!STBTE__IS_ACTIVE(id)) {
+                     if (stbte__ui.active_event == STBTE__leftdown)
+                        stbte__brush(tm, mapx, mapy);
+                     else
+                        stbte__erase(tm, mapx, mapy, stbte__ui.brush_state);
+                     stbte__ui.active_id = id; // switch to this map square so we don't rebrush IT multiple times
+                  }
+               }
+               break;
+            case STBTE__leftdown:
+               if (STBTE__IS_HOT(id) && STBTE__INACTIVE()) {
+                  stbte__activate(id);
+                  stbte__begin_undo(tm);
+                  stbte__brush(tm, mapx, mapy);
+               }
+               break;
+            case STBTE__rightdown:
+               if (STBTE__IS_HOT(id) && STBTE__INACTIVE()) {
+                  stbte__activate(id);
+                  stbte__begin_undo(tm);
+                  stbte__ui.brush_state = stbte__erase(tm, mapx, mapy, 1);
+               }
+               break;
+            case STBTE__leftup:
+            case STBTE__rightup:
+               if (STBTE__IS_MAP_ACTIVE()) {
+                  stbte__end_undo(tm);
+                  stbte__activate(0);
+               }
+               break;
+         }
+         break;
+
+      case STBTE__tool_select:
+         if (STBTE__IS_HOT(id)) {
+            switch (stbte__ui.event) {
+               case STBTE__leftdown:
+                  if (STBTE__INACTIVE()) {
+                     // if we're clicking in an existing selection...
+                     if (stbte__ui.has_selection) {
+                        if (  mapx >= stbte__ui.select_x0 && mapx <= stbte__ui.select_x1
+                           && mapy >= stbte__ui.select_y0 && mapy <= stbte__ui.select_y1)
+                        {
+                           stbte__ui.dragging = 1;
+                           stbte__ui.drag_x = stbte__ui.select_x0;
+                           stbte__ui.drag_y = stbte__ui.select_y0;
+                           stbte__ui.drag_w = stbte__ui.select_x1 - stbte__ui.select_x0 + 1;
+                           stbte__ui.drag_h = stbte__ui.select_y1 - stbte__ui.select_y0 + 1;
+                           stbte__ui.drag_offx = mapx - stbte__ui.select_x0;
+                           stbte__ui.drag_offy = mapy - stbte__ui.select_y0;
+                        }
+                     }
+                     stbte__ui.has_selection = 0; // no selection until it completes
+                     stbte__activate_map(mapx,mapy);
+                  }
+                  break;
+               case STBTE__leftup:
+                  if (STBTE__IS_MAP_ACTIVE()) {
+                     if (stbte__ui.dragging) {
+                        stbte__drag_place(tm, mapx,mapy);
+                        stbte__ui.dragging = 0;
+                        stbte__activate(0);
+                     } else {
+                        stbte__select_rect(tm, stbte__ui.sx, stbte__ui.sy, mapx, mapy);
+                        stbte__activate(0);
+                     }
+                  }
+                  break;
+               case STBTE__rightdown:
+                  stbte__ui.has_selection = 0;
+                  break;
+            }
+         }
+         break;
+
+      case STBTE__tool_rect:
+         if (STBTE__IS_HOT(id)) {
+            switch (stbte__ui.event) {
+               case STBTE__leftdown:
+                  if (STBTE__INACTIVE())
+                     stbte__activate_map(mapx,mapy);
+                  break;
+               case STBTE__leftup:
+                  if (STBTE__IS_MAP_ACTIVE()) {
+                     stbte__fillrect(tm, stbte__ui.sx, stbte__ui.sy, mapx, mapy, 1);
+                     stbte__activate(0);
+                  }
+                  break;
+               case STBTE__rightdown:
+                  if (STBTE__INACTIVE())
+                     stbte__activate_map(mapx,mapy);
+                  break;
+               case STBTE__rightup:
+                  if (STBTE__IS_MAP_ACTIVE()) {
+                     stbte__fillrect(tm, stbte__ui.sx, stbte__ui.sy, mapx, mapy, 0);
+                     stbte__activate(0);
+                  }
+                  break;
+            }
+         }
+         break;
+
+
+      case STBTE__tool_eyedrop:
+         switch (stbte__ui.event) {
+            case STBTE__leftdown:
+               if (STBTE__IS_HOT(id) && STBTE__INACTIVE())
+                  stbte__eyedrop(tm,mapx,mapy);
+               break;
+         }
+         break;
+   }
+}
+
+static void stbte__toolbar(stbte_tilemap *tm, int x0, int y0, int w, int h)
+{
+   int i;
+   int estimated_width = 13 * STBTE__num_tool + 8+8+ 120+4;
+   int x = x0 + w/2 - estimated_width/2;
+   int y = y0+1;
+
+   for (i=0; i < STBTE__num_tool; ++i) {
+      int highlight=0;
+      highlight = (stbte__ui.tool == i);
+      if (i == STBTE__tool_grid && stbte__ui.show_grid)
+         highlight=1;
+      if (i == STBTE__tool_fill)
+         continue;
+      if (stbte__button_icon(toolchar[i], x, y, 13, STBTE__ID(STBTE__toolbarA, i), highlight)) {
+         switch (i) {
+            case STBTE__tool_eyedrop:
+               stbte__ui.eyedrop_last_layer = tm->num_layers; // flush eyedropper state
+               // fallthrough
+            default:
+               stbte__ui.tool = i;
+               stbte__ui.has_selection = 0;
+               break;
+            case STBTE__tool_grid:
+               stbte__ui.show_grid = (stbte__ui.show_grid+1)%3;
+               break;
+            case STBTE__tool_undo:
+               stbte__undo(tm);
+               break;
+            case STBTE__tool_redo:
+               stbte__redo(tm);
+               break;
+         }
+      }
+      x += 13;
+      if (i+1 == STBTE__tool_undo || i+1 == STBTE__tool_grid)
+          x += 8;
+   }
+
+   x += 8;
+   if (stbte__button("cut"  , x, y,10, 40, STBTE__ID(STBTE__toolbarB,0), 0)) {
+      if (stbte__ui.has_selection)
+         stbte__copy_cut(tm, 1);
+   }
+   x += 42;
+   if (stbte__button("copy" , x, y, 5, 40, STBTE__ID(STBTE__toolbarB,1), 0)) {
+      if (stbte__ui.has_selection)
+         stbte__copy_cut(tm, 0);
+   }
+   x += 42;
+   if (stbte__button("paste", x, y, 0, 40, STBTE__ID(STBTE__toolbarB,2), stbte__ui.pasting)) {
+      if (stbte__ui.has_copy) {
+         stbte__ui.pasting = 1;
+         stbte__activate(STBTE__ID(STBTE__toolbarB,3));
+      }
+   }
+}
+
+static int stbte__info_value(char *label, int x, int y, int val, int digits, int id)
+{
+   if (stbte__ui.event == STBTE__paint) {
+      int off = 9-stbte__get_char_width(label[0]);
+      char text[16];
+      sprintf(text, label, digits, val);
+      stbte__draw_text_core(x+off,y, text, 999, STBTE_COLOR_PANEL_TEXT,1);
+   }
+   if (id) {
+      x += 9+7*digits+4;
+      if (stbte__minibutton(x,y, '+', id + (0<<19)))
+         val += (stbte__ui.shift ? 10 : 1);
+      x += 9;
+      if (stbte__minibutton(x,y, '-', id + (1<<19)))
+         val -= (stbte__ui.shift ? 10 : 1);
+      if (val < 1) val = 1; else if (val > 4096) val = 4096;
+   }
+   return val;
+}
+
+static void stbte__info(stbte_tilemap *tm, int x0, int y0, int w, int h)
+{
+   int mode = stbte__ui.panel[STBTE__panel_info].mode;
+   int s = 11+7*tm->digits+4+15;
+   int x,y;
+   int in_region;
+
+   x = x0+2;
+   y = y0+2;
+   tm->max_x = stbte__info_value("w:%*d",x,y, tm->max_x, tm->digits, STBTE__ID(STBTE__info,0));
+   if (mode)
+      x += s;
+   else
+      y += 11;
+   tm->max_y = stbte__info_value("h:%*d",x,y, tm->max_y, tm->digits, STBTE__ID(STBTE__info,1));
+   x = x0+2;
+   y += 11;
+   in_region = (stbte__ui.hot_id & 127) == STBTE__map;
+   stbte__info_value(in_region ? "x:%*d" : "x:",x,y, (stbte__ui.hot_id>>19)&4095, tm->digits, 0);
+   if (mode)
+      x += s;
+   else
+      y += 11;
+   stbte__info_value(in_region ? "y:%*d" : "y:",x,y, (stbte__ui.hot_id>> 7)&4095, tm->digits, 0);
+   y += 15;
+   x = x0+2;
+   stbte__draw_text(x,y,"brush:",40,STBTE_COLOR_PANEL_TEXT);
+   if (tm->cur_tile >= 0)
+      STBTE_DRAW_TILE(x+43,y-3,tm->tiles[tm->cur_tile].id,1);
+}
+
+static void stbte__layers(stbte_tilemap *tm, int x0, int y0, int w, int h)
+{
+   int i, y;
+   int x1 = x0+w;
+   int y1 = y0+h;
+   int xoff = tm->has_layer_names ? 50 : 20;
+   int num_rows;
+   x0 += 2;
+   y0 += 5;
+   if (!tm->has_layer_names) {
+      if (stbte__ui.event == STBTE__paint) {
+         stbte__draw_text(x0,y0, "Layers", w-4, STBTE_COLOR_PANEL_TEXT);
+      }
+      y0 += 11;
+   }
+   num_rows = (y1-y0)/15;
+   y = y0;
+   for (i=0; i < tm->num_layers; ++i) {
+      char text[3], *str = (char *) tm->layerinfo[i].name;
+      static char lockedchar[3] = { 'U', 'P', 'L' };
+      int locked = tm->layerinfo[i].locked;
+      int disabled = (tm->solo_layer >= 0 && tm->solo_layer != i);
+      if (i-tm->layer_scroll >= 0 && i-tm->layer_scroll < num_rows) {
+         if (str == NULL)
+            sprintf(str=text, "%2d", i+1);
+         if (stbte__button(str, x0,y,(i+1<10)*2,xoff-2, STBTE__ID(STBTE__layer,i), tm->cur_layer==i))
+            tm->cur_layer = (tm->cur_layer == i ? -1 : i);
+         if (stbte__layerbutton(x0+xoff +  0,y+1,'H',STBTE__ID(STBTE__hide,i), tm->layerinfo[i].hidden,disabled,STBTE_COLOR_LAYERMASK_HIDE))
+            tm->layerinfo[i].hidden = !tm->layerinfo[i].hidden;
+         if (stbte__layerbutton(x0+xoff + 12,y+1,lockedchar[locked],STBTE__ID(STBTE__lock,i), locked!=0,disabled,STBTE_COLOR_LAYERMASK_LOCK))
+            tm->layerinfo[i].locked = (locked+1)%3;
+         if (stbte__layerbutton(x0+xoff + 24,y+1,'S',STBTE__ID(STBTE__solo,i), tm->solo_layer==i,0,STBTE_COLOR_LAYERMASK_SOLO))
+            tm->solo_layer = (tm->solo_layer == i ? -1 : i);
+         y += 15;
+      }
+   }
+   stbte__scrollbar(x1-4, y0,y1-2, &tm->layer_scroll, 0, tm->num_layers, num_rows, STBTE__ID(STBTE__scrollbar_id, STBTE__layer));
+}
+
+static void stbte__categories(stbte_tilemap *tm, int x0, int y0, int w, int h)
+{
+   int s=11, x,y, i;
+   int num_rows = h / s;
+
+   w -= 4;
+   x = x0+2;
+   y = y0+4;
+   if (tm->category_scroll == 0) {
+      if (stbte__category_button("*ALL*", x,y, w, STBTE__ID(STBTE__categories, 65535), tm->cur_category == -1)) {
+         stbte__choose_category(tm, -1);
+      }
+      y += s;
+   }
+
+   for (i=0; i < tm->num_categories; ++i) {
+      if (i+1 - tm->category_scroll >= 0 && i+1 - tm->category_scroll < num_rows) {
+         if (y + 10 > y0+h)
+            return;
+         if (stbte__category_button(tm->categories[i], x,y,w, STBTE__ID(STBTE__categories,i), tm->cur_category == i))
+            stbte__choose_category(tm, i);
+         y += s;
+      }
+   }
+   stbte__scrollbar(x0+w, y0+4, y0+h-4, &tm->category_scroll, 0, tm->num_categories+1, num_rows, STBTE__ID(STBTE__scrollbar_id, STBTE__categories));
+}
+
+static void stbte__tile_in_palette(stbte_tilemap *tm, int x, int y, int slot)
+{
+   stbte__tileinfo *t = &tm->tiles[slot];
+   int x0=x, y0=y, x1 = x+tm->palette_spacing_x - 1, y1 = y+tm->palette_spacing_y;
+   int id = STBTE__ID(STBTE__palette, slot);
+   int over = stbte__hittest(x0,y0,x1,y1, id);
+   switch (stbte__ui.event) {
+      case STBTE__paint:
+         stbte__draw_rect(x,y,x+tm->palette_spacing_x-1,y+tm->palette_spacing_y-1, STBTE_COLOR_TILEPALETTE_BACKGROUND);
+         STBTE_DRAW_TILE(x,y,t->id, slot == tm->cur_tile);
+         if (slot == tm->cur_tile)
+            stbte__draw_frame_delayed(x-1,y-1,x+tm->palette_spacing_x,y+tm->palette_spacing_y, STBTE_COLOR_TILEPALETTE_OUTLINE);
+         break;
+      default:
+         if (stbte__button_core(id))
+            tm->cur_tile = slot;
+         break;
+   }
+}
+
+static void stbte__palette_of_tiles(stbte_tilemap *tm, int x0, int y0, int w, int h)
+{
+   int i,x,y;
+   int num_vis_rows = (h-6) / tm->palette_spacing_y;
+   int num_columns = (w-2-6) / tm->palette_spacing_x;
+   int num_total_rows = (tm->cur_palette_count + num_columns-1) / num_columns; // ceil()
+   int column,row;
+   int x1 = x0+w, y1=y0+h;
+   x = x0+2;
+   y = y0+6;
+
+   column = 0;
+   row    = -tm->palette_scroll;
+   for (i=0; i < tm->num_tiles; ++i) {
+      stbte__tileinfo *t = &tm->tiles[i];
+
+      // filter based on category
+      if (tm->cur_category >= 0 && t->category_id != tm->cur_category)
+         continue;
+
+      // display it
+      if (row >= 0 && row < num_vis_rows) {
+         x = x0 + 2 + tm->palette_spacing_x * column;
+         y = y0 + 6 + tm->palette_spacing_y * row;
+         stbte__tile_in_palette(tm,x,y,i);
+      }
+
+      ++column;
+      if (column == num_columns) {
+         column = 0;
+         ++row;
+      }
+   }
+   stbte__flush_delay();
+   stbte__scrollbar(x1-4, y0+6, y1-2, &tm->palette_scroll, 0, num_total_rows, num_vis_rows, STBTE__ID(STBTE__scrollbar_id, STBTE__palette));
+}
+
+
+static void stbte__editor_traverse(stbte_tilemap *tm)
+{
+   int i,j;
+
+   if (tm == NULL)
+      return;
+   if (stbte__ui.x0 == stbte__ui.x1 || stbte__ui.y0 == stbte__ui.y1)
+      return;
+
+   stbte__prepare_tileinfo(tm);
+
+   stbte__compute_panel_locations(tm); // @OPTIMIZE: we don't need to recompute this every time
+
+   if (stbte__ui.event == STBTE__paint) {
+      // fill screen with border
+      stbte__draw_rect(stbte__ui.x0, stbte__ui.y0, stbte__ui.x1, stbte__ui.y1, STBTE_COLOR_TILEMAP_BORDER);
+      // fill tilemap with tilemap background
+      stbte__draw_rect(stbte__ui.x0 - tm->scroll_x, stbte__ui.y0 - tm->scroll_y,
+                       stbte__ui.x0 - tm->scroll_x + tm->spacing_x * tm->max_x,
+                       stbte__ui.y0 - tm->scroll_y + tm->spacing_y * tm->max_y, STBTE_COLOR_TILEMAP_BACKGROUND);
+   }
+
+   // step 1: traverse all the tilemap data...
+   // @OPTIMIZE crop to only region visible between UI widgets -- also necessary to avoid painting under it, etc
+   // @OPTIMIZE crop to visible region by computing correct i,j range
+   for (j=0; j < tm->max_y; ++j) {
+      int y = stbte__ui.y0 + j * tm->spacing_y - tm->scroll_y;
+      if (y + tm->spacing_y < stbte__ui.y0 || y > stbte__ui.y1)
+         continue;
+      for (i=0; i < tm->max_x; ++i) {
+         int x = stbte__ui.x0 + i * tm->spacing_x - tm->scroll_x;
+         if (x + tm->spacing_x >= stbte__ui.x0 && x < stbte__ui.x1)
+            stbte__tile(tm, x, y, i, j);
+      }
+   }
+
+   // draw grid on top of everything
+   if (stbte__ui.show_grid == 2) {
+      int x = stbte__ui.x0 - tm->scroll_x;
+      int y = stbte__ui.y0 - tm->scroll_y;
+      for (j=0; j < tm->max_y; ++j, y += tm->spacing_y)
+         stbte__draw_rect(stbte__ui.x0, y, stbte__ui.x1, y+1, STBTE_COLOR_GRID);
+      for (i=0; i < tm->max_x; ++i, x += tm->spacing_x)
+         stbte__draw_rect(x, stbte__ui.y0, x+1, stbte__ui.y1, STBTE_COLOR_GRID);
+   }
+
+   // draw the selection border
+   if (stbte__ui.event == STBTE__paint) {
+      if (stbte__ui.has_selection) {
+         int x0,y0,x1,y1;
+         x0 = stbte__ui.x0 + (stbte__ui.select_x0    ) * tm->spacing_x - tm->scroll_x;
+         y0 = stbte__ui.y0 + (stbte__ui.select_y0    ) * tm->spacing_y - tm->scroll_y;
+         x1 = stbte__ui.x0 + (stbte__ui.select_x1 + 1) * tm->spacing_x - tm->scroll_x + 1;
+         y1 = stbte__ui.y0 + (stbte__ui.select_y1 + 1) * tm->spacing_y - tm->scroll_y + 1;
+         stbte__draw_frame(x0,y0,x1,y1, (stbte__ui.ms_time & 256 ? STBTE_COLOR_SELECTION_OUTLINE1 : STBTE_COLOR_SELECTION_OUTLINE2));
+      }
+   }
+   stbte__flush_delay();
+
+   // step 2: traverse the panels
+   for (i=0; i < STBTE__num_panel; ++i) {
+      stbte__panel *p = &stbte__ui.panel[i];
+      if (stbte__ui.event == STBTE__paint) {
+         stbte__draw_rect (p->x0,p->y0,p->x0+p->width,p->y0+p->height, STBTE_COLOR_PANEL_BACKGROUND);
+         stbte__draw_frame(p->x0,p->y0,p->x0+p->width,p->y0+p->height, STBTE_COLOR_PANEL_OUTLINE);
+      }
+      // obscure tilemap data underneath panel
+      stbte__hittest(p->x0,p->y0,p->x0+p->width,p->y0+p->height, STBTE__ID2(STBTE__panel, i, 0));
+      switch (i) {
+         case STBTE__panel_toolbar:
+            if (stbte__ui.event == STBTE__paint)
+               stbte__draw_rect(p->x0,p->y0,p->x0+p->width,p->y0+p->height, STBTE_COLOR_TOOLBAR_BACKGROUND);
+            stbte__toolbar(tm,p->x0,p->y0,p->width,p->height);
+            break;
+         case STBTE__panel_info:
+            stbte__info(tm,p->x0,p->y0,p->width,p->height);
+            break;
+         case STBTE__panel_layers:
+            stbte__layers(tm,p->x0,p->y0,p->width,p->height);
+            break;
+         case STBTE__panel_categories:
+            stbte__categories(tm,p->x0,p->y0,p->width,p->height);
+            break;
+         case STBTE__panel_tiles:
+            // erase boundary between categories and tiles if they're on same side
+            if (stbte__ui.event == STBTE__paint && p->side == stbte__ui.panel[STBTE__panel_categories].side)
+               stbte__draw_rect(p->x0+1,p->y0-1,p->x0+p->width-1,p->y0+1, STBTE_COLOR_PANEL_BACKGROUND);
+            stbte__palette_of_tiles(tm,p->x0,p->y0,p->width,p->height);
+            break;
+      }
+      // draw the panel side selectors
+      for (j=0; j < 2; ++j) {
+         int result;
+         if (i == STBTE__panel_toolbar) continue;
+         result = stbte__microbutton(p->x0+p->width - 1 - 2*4 + 4*j,p->y0+2,3, STBTE__ID2(STBTE__panel, i, j+1), 0x808080,0xc0c0c0, 0);
+         if (result) {
+            switch (j) {
+               case 0: p->side = result > 0 ? STBTE__side_left : STBTE__side_right; break;
+               case 1: p->delta_height += result; break;
+            }
+         }
+      }
+   }
+
+   if (stbte__ui.panel[STBTE__panel_categories].delta_height < -5) stbte__ui.panel[STBTE__panel_categories].delta_height = -5;
+   if (stbte__ui.panel[STBTE__panel_layers    ].delta_height < -5) stbte__ui.panel[STBTE__panel_layers    ].delta_height = -5;
+
+
+   // step 3: traverse the regions to place expander controls on them
+   for (i=0; i < 2; ++i) {
+      if (stbte__region[i].active) {
+         int x = stbte__region[i].x;
+         int width;
+         if (i == STBTE__side_left)
+            width =  stbte__ui.left_width , x += stbte__region[i].width + 1;
+         else
+            width = -stbte__ui.right_width, x -= 6;
+         if (stbte__microbutton_dragger(x, stbte__region[i].y+2, 5, STBTE__ID(STBTE__region,i), 0x206020,0xffffff, 0, &width)) {
+            // if 0, the region is fully expanded, so start retracting it;
+            // otherwise it is (being) retracted, so expand it again
+            if (stbte__region[i].retracted == 0.0)
+               stbte__region[i].retracted = 0.01f;
+            else
+               stbte__region[i].retracted = 0.0;
+         }
+         if (i == STBTE__side_left)
+            stbte__ui.left_width  =  width;
+         else
+            stbte__ui.right_width = -width;
+         if (stbte__ui.event == STBTE__tick) {
+            if (stbte__region[i].retracted && stbte__region[i].retracted < 1.0f) {
+               stbte__region[i].retracted += stbte__ui.dt*4;
+               if (stbte__region[i].retracted > 1)
+                  stbte__region[i].retracted = 1;
+            }
+         }
+      }
+   }
+
+   if (stbte__ui.event == STBTE__paint && stbte__ui.alert_msg) {
+      int w = stbte__text_width(stbte__ui.alert_msg);
+      int x = (stbte__ui.x0+stbte__ui.x1)/2;
+      int y = (stbte__ui.y0+stbte__ui.y1)/2;
+      stbte__draw_rect (x-w/2-4,y-8, x+w/2+4,y+8, 0x604020);
+      stbte__draw_frame(x-w/2-4,y-8, x+w/2+4,y+8, 0x906030);
+      stbte__draw_text (x-w/2,y-4, stbte__ui.alert_msg, w+1, 0xff8040);
+   }
+
+   if (stbte__ui.event == STBTE__tick && stbte__ui.alert_msg) {
+      stbte__ui.alert_timer -= stbte__ui.dt;
+      if (stbte__ui.alert_timer < 0) {
+         stbte__ui.alert_timer = 0;
+         stbte__ui.alert_msg = 0;
+      }
+   }
+}
+
+static void stbte__do_event(stbte_tilemap *tm)
+{
+   stbte__ui.next_hot_id = 0;
+   stbte__editor_traverse(tm);
+   stbte__ui.hot_id = stbte__ui.next_hot_id;
+
+   // automatically cancel on mouse-up in case the object that triggered it
+   // doesn't exist anymore
+   if (stbte__ui.active_id) {
+      if (stbte__ui.event == STBTE__leftup || stbte__ui.event == STBTE__rightup) {
+         if (!stbte__ui.pasting) {
+            stbte__activate(0);
+            if (stbte__ui.undoing)
+               stbte__end_undo(tm);
+            stbte__ui.scrolling = 0;
+            stbte__ui.dragging = 0;
+         }
+      }
+   }
+
+   // we could do this stuff in the widgets directly, but it would keep recomputing
+   // the same thing on every tile, which seems dumb.
+
+   if (stbte__ui.pasting) {
+      if (STBTE__IS_MAP_HOT()) {
+         // compute pasting location based on last hot
+         stbte__ui.paste_x = ((stbte__ui.hot_id >> 19) & 4095) - (stbte__ui.copy_width >> 1);
+         stbte__ui.paste_y = ((stbte__ui.hot_id >>  7) & 4095) - (stbte__ui.copy_height >> 1);
+      }
+   }
+   if (stbte__ui.dragging) {
+      if (STBTE__IS_MAP_HOT()) {
+         stbte__ui.drag_dest_x = ((stbte__ui.hot_id >> 19) & 4095) - stbte__ui.drag_offx;
+         stbte__ui.drag_dest_y = ((stbte__ui.hot_id >>  7) & 4095) - stbte__ui.drag_offy;
+      }
+   }
+}
+
+static void stbte__set_event(int event, int x, int y)
+{
+   stbte__ui.event = event;
+   stbte__ui.mx    = x;
+   stbte__ui.my    = y;
+}
+
+void stbte_draw(stbte_tilemap *tm)
+{
+   stbte__ui.event = STBTE__paint;
+   stbte__editor_traverse(tm);
+}
+
+void stbte_mouse_move(stbte_tilemap *tm, int x, int y, int shifted, int scrollkey)
+{
+   stbte__set_event(STBTE__mousemove, x,y);
+   stbte__ui.shift = shifted;
+   stbte__ui.scrollkey = scrollkey;
+   stbte__do_event(tm);
+}
+
+void stbte_mouse_button(stbte_tilemap *tm, int x, int y, int right, int down, int shifted, int scrollkey)
+{
+   static int events[2][2] = { { STBTE__leftup , STBTE__leftdown  },
+                               { STBTE__rightup, STBTE__rightdown } };
+   stbte__set_event(events[right][down], x,y);
+   stbte__ui.shift = shifted;
+   stbte__ui.scrollkey = scrollkey;
+
+   stbte__do_event(tm);
+}
+
+void stbte_mouse_wheel(stbte_tilemap *tm, int x, int y, int vscroll)
+{
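+   // not implemented yet: wheel events are currently ignored
+   STBTE__NOTUSED(tm);
+   STBTE__NOTUSED(x);
+   STBTE__NOTUSED(y);
+   STBTE__NOTUSED(vscroll);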
+}
+
+void stbte_tick(stbte_tilemap *tm, float dt)
+{
+   stbte__ui.event = STBTE__tick;
+   stbte__ui.dt    = dt;
+   stbte__do_event(tm);
+   stbte__ui.ms_time += (int) (dt * 1024) + 1; // make sure if time is superfast it always updates a little
+}
+
+void stbte_mouse_sdl(stbte_tilemap *tm, const void *sdl_event, float xs, float ys, int xo, int yo)
+{
+#ifdef _SDL_H
+   SDL_Event *event = (SDL_Event *) sdl_event;
+   SDL_Keymod km = SDL_GetModState();
+   // the Ctrl keys (not Shift) are what get passed along as the editor's 'shifted' flag
+   int shift = (km & KMOD_LCTRL) || (km & KMOD_RCTRL);
+   int scrollkey = 0 != SDL_GetKeyboardState(NULL)[SDL_SCANCODE_SPACE];
+   switch (event->type) {
+      case SDL_MOUSEMOTION:
+         stbte_mouse_move(tm, (int) (xs*event->motion.x+xo), (int) (ys*event->motion.y+yo), shift, scrollkey);
+         break;
+      case SDL_MOUSEBUTTONUP:
+         stbte_mouse_button(tm, (int) (xs*event->button.x+xo), (int) (ys*event->button.y+yo), event->button.button != SDL_BUTTON_LEFT, 0, shift, scrollkey);
+         break;
+      case SDL_MOUSEBUTTONDOWN:
+         stbte_mouse_button(tm, (int) (xs*event->button.x+xo), (int) (ys*event->button.y+yo), event->button.button != SDL_BUTTON_LEFT, 1, shift, scrollkey);
+         break;
+      case SDL_MOUSEWHEEL:
+         stbte_mouse_wheel(tm, stbte__ui.mx, stbte__ui.my, event->wheel.y);
+         break;
+   }
+#else
+   STBTE__NOTUSED(tm);
+   STBTE__NOTUSED(sdl_event);
+   STBTE__NOTUSED(xs);
+   STBTE__NOTUSED(ys);
+   STBTE__NOTUSED(xo);
+   STBTE__NOTUSED(yo);
+#endif
+}
+
+#endif // STB_TILEMAP_EDITOR_IMPLEMENTATION
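
Note on stbte__drag_place in the file above: its comment asks for a "2D memmove-style mover" whose scan direction depends on the move vector. What follows is a minimal sketch of that idea on a plain int grid, not the editor's actual code. `grid_move_rect`, its parameters, and the `empty` fill value are hypothetical names introduced only for illustration, and the sketch assumes that both the source and the destination rectangle lie inside the grid.

```c
/* Hypothetical helper, NOT part of stb_tilemap_editor.h.
   Moves the w*h block whose top-left cell is (sx,sy) by (dx,dy) inside a
   row-major grid that is grid_w cells wide. Vacated cells are set to `empty`.
   Assumption: source and destination rectangles are both inside the grid. */
static void grid_move_rect(int *grid, int grid_w,
                           int sx, int sy, int w, int h,
                           int dx, int dy, int empty)
{
   /* Same direction test as in stbte__drag_place: when the destination lies
      later in row-major order, walk the block from its last cell to its
      first so no cell is overwritten before it has been read. */
   int backward = (dy > 0 || (dy == 0 && dx > 0));
   int n = w * h;
   int k;

   if (dx == 0 && dy == 0)
      return;

   for (k = 0; k < n; ++k) {
      int idx = backward ? n - 1 - k : k;   /* position inside the block */
      int x = sx + idx % w;
      int y = sy + idx / w;
      int v = grid[y * grid_w + x];

      grid[y * grid_w + x] = empty;               /* clear the source cell   */
      grid[(y + dy) * grid_w + (x + dx)] = v;     /* write the moved value   */
   }
}
```

Moving a block one cell to the right, for instance, has to start from the block's last cell, in the same way that memmove copies backwards when the destination address is higher than the source.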

+ 21 - 12
stb_truetype.h

@@ -1,5 +1,5 @@
-// stb_truetype.h - v0.9 - public domain
-// authored from 2009-2013 by Sean Barrett / RAD Game Tools
+// stb_truetype.h - v0.99 - public domain
+// authored from 2009-2014 by Sean Barrett / RAD Game Tools
 //
 //   This library processes TrueType files:
 //        parse files
@@ -21,7 +21,7 @@
 //   Mikko Mononen: compound shape support, more cmap formats
 //   Tor Andersson: kerning, subpixel rendering
 //
-//   Bug/warning reports:
+//   Bug/warning reports/fixes:
 //       "Zer" on mollyrocket (with fix)
 //       Cass Everitt
 //       stoiko (Haemimont Games)
@@ -33,9 +33,11 @@
 //       Anthony Pesch
 //       Johan Duparc
 //       Hou Qiming
+//       Fabian "ryg" Giesen
 //
 // VERSION HISTORY
 //
+//   0.99 (2014-09-18) fix multiple bugs with subpixel rendering (ryg)
 //   0.9  (2014-08-07) support certain mac/iOS fonts without an MS platformID
 //   0.8b (2014-07-07) fix a warning
 //   0.8  (2014-05-25) fix a few more warnings
@@ -1385,14 +1387,21 @@ void stbtt_FreeShape(const stbtt_fontinfo *info, stbtt_vertex *v)
 void stbtt_GetGlyphBitmapBoxSubpixel(const stbtt_fontinfo *font, int glyph, float scale_x, float scale_y,float shift_x, float shift_y, int *ix0, int *iy0, int *ix1, int *iy1)
 {
    int x0,y0,x1,y1;
-   if (!stbtt_GetGlyphBox(font, glyph, &x0,&y0,&x1,&y1))
-      x0=y0=x1=y1=0; // e.g. space character
-   // now move to integral bboxes (treating pixels as little squares, what pixels get touched)?
-   if (ix0) *ix0 =  STBTT_ifloor(x0 * scale_x + shift_x);
-   if (iy0) *iy0 = -STBTT_iceil (y1 * scale_y + shift_y);
-   if (ix1) *ix1 =  STBTT_iceil (x1 * scale_x + shift_x);
-   if (iy1) *iy1 = -STBTT_ifloor(y0 * scale_y + shift_y);
+   if (!stbtt_GetGlyphBox(font, glyph, &x0,&y0,&x1,&y1)) {
+      // e.g. space character
+      if (ix0) *ix0 = 0;
+      if (iy0) *iy0 = 0;
+      if (ix1) *ix1 = 0;
+      if (iy1) *iy1 = 0;
+   } else {
+      // move to integral bboxes (treating pixels as little squares, what pixels get touched)?
+      if (ix0) *ix0 = STBTT_ifloor( x0 * scale_x + shift_x);
+      if (iy0) *iy0 = STBTT_ifloor(-y1 * scale_y + shift_y);
+      if (ix1) *ix1 = STBTT_iceil ( x1 * scale_x + shift_x);
+      if (iy1) *iy1 = STBTT_iceil (-y0 * scale_y + shift_y);
+   }
 }
+
 void stbtt_GetGlyphBitmapBox(const stbtt_fontinfo *font, int glyph, float scale_x, float scale_y, int *ix0, int *iy0, int *ix1, int *iy1)
 {
    stbtt_GetGlyphBitmapBoxSubpixel(font, glyph, scale_x, scale_y,0.0f,0.0f, ix0, iy0, ix1, iy1);
@@ -1639,9 +1648,9 @@ static void stbtt__rasterize(stbtt__bitmap *result, stbtt__point *pts, int *wcou
             a=j,b=k;
          }
          e[n].x0 = p[a].x * scale_x + shift_x;
-         e[n].y0 = p[a].y * y_scale_inv * vsubsample + shift_y;
+         e[n].y0 = (p[a].y * y_scale_inv + shift_y) * vsubsample;
          e[n].x1 = p[b].x * scale_x + shift_x;
-         e[n].y1 = p[b].y * y_scale_inv * vsubsample + shift_y;
+         e[n].y1 = (p[b].y * y_scale_inv + shift_y) * vsubsample;
          ++n;
       }
    }
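A note on the hunk above: the rasterizer works in vertically subsampled rows, so a shift given in pixels has to be scaled by `vsubsample` along with the coordinate. The sketch below compares the two expressions from the hunk. The function names `old_edge_y` and `new_edge_y` are hypothetical, and the input values are made up for illustration.

```c
#include <assert.h>

/* Before the change: the shift is added after subsampling, so it is
   interpreted as subsample rows rather than pixels. */
float old_edge_y(float y, float y_scale_inv, float shift_y, int vsubsample)
{
   return y * y_scale_inv * vsubsample + shift_y;
}

/* After the change: the shift is added in pixel units, and the sum is
   then converted to subsample rows. */
float new_edge_y(float y, float y_scale_inv, float shift_y, int vsubsample)
{
   return (y * y_scale_inv + shift_y) * vsubsample;
}

/* Self-check with made-up values: y = 2, scale = 0.5, 5 subsample rows. */
void edge_shift_demo(void)
{
   /* Without a shift the two expressions are identical. */
   assert(old_edge_y(2.0f, 0.5f, 0.0f, 5) == new_edge_y(2.0f, 0.5f, 0.0f, 5));
   /* A 0.25 pixel shift should move the edge by 1.25 subsample rows. */
   assert(old_edge_y(2.0f, 0.5f, 0.25f, 5) == 5.25f);
   assert(new_edge_y(2.0f, 0.5f, 0.25f, 5) == 6.25f);
}
```

In this example the old expression moves the edge by only 0.25 subsample rows (0.05 pixel), while the new one moves it by the full quarter pixel.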

+ 983 - 0
tests/resample_test.cpp

@@ -0,0 +1,983 @@
+#include <malloc.h>
+
+#if defined(_WIN32) && _MSC_VER > 1200
+#define STBIR_ASSERT(x) \
+	if (!(x)) {         \
+		__debugbreak();  \
+	} else
+#else
+#include <assert.h>
+#define STBIR_ASSERT(x) assert(x)
+#endif
+
+#define STBIR_MALLOC stbir_malloc
+#define STBIR_FREE stbir_free
+
+class stbir_context {
+public:
+	stbir_context()
+	{
+		size = 1000000;
+		memory = malloc(size);
+	}
+
+	~stbir_context()
+	{
+		free(memory);
+	}
+
+	size_t size;
+	void* memory;
+} g_context;
+
+void* stbir_malloc(size_t size, void* context)
+{
+	if (!context)
+		return malloc(size);
+
+	stbir_context* real_context = (stbir_context*)context;
+	if (size > real_context->size)
+		return 0;
+
+	return real_context->memory;
+}
+
+void stbir_free(void* memory, void* context)
+{
+	if (!context)
+		free(memory);
+}
+
+//#include <stdio.h>
+void stbir_progress(float p)
+{
+	//printf("%f\n", p);
+	STBIR_ASSERT(p >= 0 && p <= 1);
+}
+
+#define STBIR_PROGRESS_REPORT stbir_progress
+
+#define STB_IMAGE_RESIZE_IMPLEMENTATION
+#define STB_IMAGE_RESIZE_STATIC
+#include "stb_image_resize.h"
+
+#define STB_IMAGE_WRITE_IMPLEMENTATION
+#include "stb_image_write.h"
+
+#define STB_IMAGE_IMPLEMENTATION
+#include "stb_image.h"
+
+#ifdef _WIN32
+#include <sys/timeb.h>
+#endif
+
+#include <direct.h>
+
+#define MT_SIZE 624
+static size_t g_aiMT[MT_SIZE];
+static size_t g_iMTI = 0;
+
+// Mersenne Twister implementation from Wikipedia.
+// Avoiding use of the system rand() to be sure that our tests generate the same test data on any system.
+void mtsrand(size_t iSeed)
+{
+	g_aiMT[0] = iSeed;
+	for (size_t i = 1; i < MT_SIZE; i++)
+	{
+		size_t inner1 = g_aiMT[i - 1];
+		size_t inner2 = (g_aiMT[i - 1] >> 30);
+		size_t inner = inner1 ^ inner2;
+		g_aiMT[i] = (0x6c078965 * inner) + i;
+	}
+
+	g_iMTI = 0;
+}
+
+size_t mtrand()
+{
+	if (g_iMTI == 0)
+	{
+		for (size_t i = 0; i < MT_SIZE; i++)
+		{
+			size_t y = (0x80000000 & (g_aiMT[i])) + (0x7fffffff & (g_aiMT[(i + 1) % MT_SIZE]));
+			g_aiMT[i] = g_aiMT[(i + 397) % MT_SIZE] ^ (y >> 1);
+			if ((y % 2) == 1)
+				g_aiMT[i] = g_aiMT[i] ^ 0x9908b0df;
+		}
+	}
+
+	size_t y = g_aiMT[g_iMTI];
+	y = y ^ (y >> 11);
+	y = y ^ ((y << 7) & (0x9d2c5680));
+	y = y ^ ((y << 15) & (0xefc60000));
+	y = y ^ (y >> 18);
+
+	g_iMTI = (g_iMTI + 1) % MT_SIZE;
+
+	return y;
+}
+
+
+inline float mtfrand()
+{
+	const int ninenine = 999999;
+	return (float)(mtrand() % ninenine)/ninenine;
+}
+
+static void resizer(int argc, char **argv)
+{
+	unsigned char* input_pixels;
+	unsigned char* output_pixels;
+	int w, h;
+	int n;
+	int out_w, out_h;
+	input_pixels = stbi_load(argv[1], &w, &h, &n, 0);
+	out_w = w*3;
+	out_h = h*3;
+	output_pixels = (unsigned char*) malloc(out_w*out_h*n);
+	//stbir_resize_uint8_srgb(input_pixels, w, h, 0, output_pixels, out_w, out_h, 0, n, -1,0);
+	stbir_resize_uint8(input_pixels, w, h, 0, output_pixels, out_w, out_h, 0, n);
+	stbi_write_png("output.png", out_w, out_h, n, output_pixels, 0);
+	exit(0);
+}
+
+static void performance(int argc, char **argv)
+{
+	unsigned char* input_pixels;
+	unsigned char* output_pixels;
+	int w, h, count;
+	int n, i;
+	int out_w, out_h, srgb=1;
+	input_pixels = stbi_load(argv[1], &w, &h, &n, 0);
+    #if 0
+    out_w = w/4; out_h = h/4; count=100; // 1
+    #elif 0
+	out_w = w*2; out_h = h/4; count=20; // 2   // note this is structured pessimally, would be much faster to downsample vertically first
+    #elif 0
+    out_w = w/4; out_h = h*2; count=50; // 3
+    #elif 0
+    out_w = w*3; out_h = h*3; count=2; srgb=0; // 4
+    #else
+    out_w = w*3; out_h = h*3; count=2; // 5   // this is dominated by linear->sRGB conversion
+    #endif
+
+	output_pixels = (unsigned char*) malloc(out_w*out_h*n);
+    for (i=0; i < count; ++i)
+        if (srgb)
+	        stbir_resize_uint8_srgb(input_pixels, w, h, 0, output_pixels, out_w, out_h, 0, n,-1,0);
+        else
+	        stbir_resize(input_pixels, w, h, 0, output_pixels, out_w, out_h, 0, STBIR_TYPE_UINT8, n,-1, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT, STBIR_COLORSPACE_LINEAR, NULL);
+	exit(0);
+}
+
+void test_suite(int argc, char **argv);
+
+int main(int argc, char** argv)
+{
+	//resizer(argc, argv);
+    //performance(argc, argv);
+
+	test_suite(argc, argv);
+	return 0;
+}
+
+void resize_image(const char* filename, float width_percent, float height_percent, stbir_filter filter, stbir_edge edge, stbir_colorspace colorspace, const char* output_filename)
+{
+	int w, h, n;
+
+	unsigned char* input_data = stbi_load(filename, &w, &h, &n, 0);
+	if (!input_data)
+	{
+		printf("Input image could not be loaded\n");
+		return;
+	}
+
+	int out_w = (int)(w * width_percent);
+	int out_h = (int)(h * height_percent);
+
+	unsigned char* output_data = (unsigned char*)malloc(out_w * out_h * n);
+
+	stbir_resize(input_data, w, h, 0, output_data, out_w, out_h, 0, STBIR_TYPE_UINT8, n, STBIR_ALPHA_CHANNEL_NONE, 0, edge, edge, filter, filter, colorspace, &g_context);
+
+	stbi_image_free(input_data);
+
+	stbi_write_png(output_filename, out_w, out_h, n, output_data, 0);
+
+	free(output_data);
+}
+
+template <typename F, typename T>
+void convert_image(const F* input, T* output, int length)
+{
+	double f = (pow(2.0, 8.0 * sizeof(T)) - 1) / (pow(2.0, 8.0 * sizeof(F)) - 1);
+	for (int i = 0; i < length; i++)
+		output[i] = (T)(((double)input[i]) * f);
+}
+
+template <typename T>
+void test_format(const char* file, float width_percent, float height_percent, stbir_datatype type, stbir_colorspace colorspace)
+{
+	int w, h, n;
+	unsigned char* input_data = stbi_load(file, &w, &h, &n, 0);
+
+	if (input_data == NULL)
+		return;
+
+
+	int new_w = (int)(w * width_percent);
+	int new_h = (int)(h * height_percent);
+
+	T* T_data = (T*)malloc(w * h * n * sizeof(T));
+    memset(T_data, 0, w*h*n*sizeof(T));
+	convert_image<unsigned char, T>(input_data, T_data, w * h * n);
+
+	T* output_data = (T*)malloc(new_w * new_h * n * sizeof(T));
+
+	stbir_resize(T_data, w, h, 0, output_data, new_w, new_h, 0, type, n, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM, STBIR_FILTER_CATMULLROM, colorspace, &g_context);
+
+	free(T_data);
+	stbi_image_free(input_data);
+
+	unsigned char* char_data = (unsigned char*)malloc(new_w * new_h * n * sizeof(char));
+	convert_image<T, unsigned char>(output_data, char_data, new_w * new_h * n);
+
+	char output[200];
+	sprintf(output, "test-output/type-%d-%d-%d-%d-%s", type, colorspace, new_w, new_h, file);
+	stbi_write_png(output, new_w, new_h, n, char_data, 0);
+
+	free(char_data);
+	free(output_data);
+}
+
+void convert_image_float(const unsigned char* input, float* output, int length)
+{
+	for (int i = 0; i < length; i++)
+		output[i] = ((float)input[i])/255;
+}
+
+void convert_image_float(const float* input, unsigned char* output, int length)
+{
+	for (int i = 0; i < length; i++)
+		output[i] = (unsigned char)(stbir__saturate(input[i]) * 255);
+}
+
+void test_float(const char* file, float width_percent, float height_percent, stbir_datatype type, stbir_colorspace colorspace)
+{
+	int w, h, n;
+	unsigned char* input_data = stbi_load(file, &w, &h, &n, 0);
+
+	if (input_data == NULL)
+		return;
+
+	int new_w = (int)(w * width_percent);
+	int new_h = (int)(h * height_percent);
+
+	float* T_data = (float*)malloc(w * h * n * sizeof(float));
+	convert_image_float(input_data, T_data, w * h * n);
+
+	float* output_data = (float*)malloc(new_w * new_h * n * sizeof(float));
+
+	stbir_resize_float_generic(T_data, w, h, 0, output_data, new_w, new_h, 0, n, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM, colorspace, &g_context);
+
+	free(T_data);
+	stbi_image_free(input_data);
+
+	unsigned char* char_data = (unsigned char*)malloc(new_w * new_h * n * sizeof(char));
+	convert_image_float(output_data, char_data, new_w * new_h * n);
+
+	char output[200];
+	sprintf(output, "test-output/type-%d-%d-%d-%d-%s", type, colorspace, new_w, new_h, file);
+	stbi_write_png(output, new_w, new_h, n, char_data, 0);
+
+	free(char_data);
+	free(output_data);
+}
+
+void test_channels(const char* file, float width_percent, float height_percent, int channels)
+{
+	int w, h, n;
+	unsigned char* input_data = stbi_load(file, &w, &h, &n, 0);
+
+	if (input_data == NULL)
+		return;
+
+	int new_w = (int)(w * width_percent);
+	int new_h = (int)(h * height_percent);
+
+	unsigned char* channels_data = (unsigned char*)malloc(w * h * channels * sizeof(unsigned char));
+
+	for (int i = 0; i < w * h; i++)
+	{
+		int input_position = i * n;
+		int output_position = i * channels;
+
+		for (int c = 0; c < channels; c++)
+			channels_data[output_position + c] = input_data[input_position + stbir__min(c, n - 1)]; // clamp to the last valid input channel
+	}
+
+	unsigned char* output_data = (unsigned char*)malloc(new_w * new_h * channels * sizeof(unsigned char));
+
+	stbir_resize_uint8_srgb(channels_data, w, h, 0, output_data, new_w, new_h, 0, channels, STBIR_ALPHA_CHANNEL_NONE, 0);
+
+	free(channels_data);
+	stbi_image_free(input_data);
+
+	char output[200];
+	sprintf(output, "test-output/channels-%d-%d-%d-%s", channels, new_w, new_h, file);
+	stbi_write_png(output, new_w, new_h, channels, output_data, 0);
+
+	free(output_data);
+}
+
+void test_subpixel(const char* file, float width_percent, float height_percent, float s1, float t1)
+{
+	int w, h, n;
+	unsigned char* input_data = stbi_load(file, &w, &h, &n, 0);
+
+	if (input_data == NULL)
+		return;
+
+	s1 = ((float)w - 1 + s1)/w;
+	t1 = ((float)h - 1 + t1)/h;
+
+	int new_w = (int)(w * width_percent);
+	int new_h = (int)(h * height_percent);
+
+	unsigned char* output_data = (unsigned char*)malloc(new_w * new_h * n * sizeof(unsigned char));
+
+	stbir_resize_region(input_data, w, h, 0, output_data, new_w, new_h, 0, STBIR_TYPE_UINT8, n, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_CATMULLROM, STBIR_COLORSPACE_SRGB, &g_context, 0, 0, s1, t1);
+
+	stbi_image_free(input_data);
+
+	char output[200];
+	sprintf(output, "test-output/subpixel-%d-%d-%f-%f-%s", new_w, new_h, s1, t1, file);
+	stbi_write_png(output, new_w, new_h, n, output_data, 0);
+
+	free(output_data);
+}
+
+unsigned int* pixel(unsigned int* buffer, int x, int y, int c, int w, int n)
+{
+	return &buffer[y*w*n + x*n + c];
+}
+
+void test_premul()
+{
+	unsigned int input[2 * 2 * 4];
+	unsigned int output[1 * 1 * 4];
+	unsigned int output2[2 * 2 * 4];
+
+	memset(input, 0, sizeof(input));
+
+	// First a test to make sure premul is working properly.
+
+	// Top left - solid red
+	*pixel(input, 0, 0, 0, 2, 4) = 255;
+	*pixel(input, 0, 0, 3, 2, 4) = 255;
+
+	// Bottom left - solid red
+	*pixel(input, 0, 1, 0, 2, 4) = 255;
+	*pixel(input, 0, 1, 3, 2, 4) = 255;
+
+	// Top right - transparent green
+	*pixel(input, 1, 0, 1, 2, 4) = 255;
+	*pixel(input, 1, 0, 3, 2, 4) = 25;
+
+	// Bottom right - transparent green
+	*pixel(input, 1, 1, 1, 2, 4) = 255;
+	*pixel(input, 1, 1, 3, 2, 4) = 25;
+
+	stbir_resize(input, 2, 2, 0, output, 1, 1, 0, STBIR_TYPE_UINT32, 4, 3, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_BOX, STBIR_COLORSPACE_LINEAR, &g_context);
+
+	float r = (float)255 / 4294967296;
+	float g = (float)255 / 4294967296;
+	float ra = (float)255 / 4294967296;
+	float ga = (float)25 / 4294967296;
+	float a = (ra + ga) / 2;
+
+	STBIR_ASSERT(output[0] == (unsigned int)(r * ra / 2 / a * 4294967296 + 0.5f)); // 232
+	STBIR_ASSERT(output[1] == (unsigned int)(g * ga / 2 / a * 4294967296 + 0.5f)); // 23
+	STBIR_ASSERT(output[2] == 0);
+	STBIR_ASSERT(output[3] == (unsigned int)(a * 4294967296 + 0.5f)); // 140
+
+	// Now a test to make sure it doesn't clobber existing values.
+
+	// Top right - completely transparent green
+	*pixel(input, 1, 0, 1, 2, 4) = 255;
+	*pixel(input, 1, 0, 3, 2, 4) = 0;
+
+	// Bottom right - completely transparent green
+	*pixel(input, 1, 1, 1, 2, 4) = 255;
+	*pixel(input, 1, 1, 3, 2, 4) = 0;
+
+	stbir_resize(input, 2, 2, 0, output2, 2, 2, 0, STBIR_TYPE_UINT32, 4, 3, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_BOX, STBIR_COLORSPACE_LINEAR, &g_context);
+
+	STBIR_ASSERT(*pixel(output2, 0, 0, 0, 2, 4) == 255);
+	STBIR_ASSERT(*pixel(output2, 0, 0, 1, 2, 4) == 0);
+	STBIR_ASSERT(*pixel(output2, 0, 0, 2, 2, 4) == 0);
+	STBIR_ASSERT(*pixel(output2, 0, 0, 3, 2, 4) == 255);
+
+	STBIR_ASSERT(*pixel(output2, 0, 1, 0, 2, 4) == 255);
+	STBIR_ASSERT(*pixel(output2, 0, 1, 1, 2, 4) == 0);
+	STBIR_ASSERT(*pixel(output2, 0, 1, 2, 2, 4) == 0);
+	STBIR_ASSERT(*pixel(output2, 0, 1, 3, 2, 4) == 255);
+
+	STBIR_ASSERT(*pixel(output2, 1, 0, 0, 2, 4) == 0);
+	STBIR_ASSERT(*pixel(output2, 1, 0, 1, 2, 4) == 255);
+	STBIR_ASSERT(*pixel(output2, 1, 0, 2, 2, 4) == 0);
+	STBIR_ASSERT(*pixel(output2, 1, 0, 3, 2, 4) == 0);
+
+	STBIR_ASSERT(*pixel(output2, 1, 1, 0, 2, 4) == 0);
+	STBIR_ASSERT(*pixel(output2, 1, 1, 1, 2, 4) == 255);
+	STBIR_ASSERT(*pixel(output2, 1, 1, 2, 2, 4) == 0);
+	STBIR_ASSERT(*pixel(output2, 1, 1, 3, 2, 4) == 0);
+}
+
+// test that splitting a pow-2 image into tiles produces identical results
+void test_subpixel_1()
+{
+	unsigned char image[8 * 8];
+
+	mtsrand(0);
+
+	for (int i = 0; i < sizeof(image); i++)
+		image[i] = mtrand() & 255;
+
+	unsigned char output_data[16 * 16];
+
+	stbir_resize_region(image, 8, 8, 0, output_data, 16, 16, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM, STBIR_FILTER_CATMULLROM, STBIR_COLORSPACE_SRGB, &g_context, 0, 0, 1, 1);
+
+	unsigned char output_left[8 * 16];
+	unsigned char output_right[8 * 16];
+
+	stbir_resize_region(image, 8, 8, 0, output_left, 8, 16, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM, STBIR_FILTER_CATMULLROM, STBIR_COLORSPACE_SRGB, &g_context, 0, 0, 0.5f, 1);
+	stbir_resize_region(image, 8, 8, 0, output_right, 8, 16, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM, STBIR_FILTER_CATMULLROM, STBIR_COLORSPACE_SRGB, &g_context, 0.5f, 0, 1, 1);
+
+	for (int x = 0; x < 8; x++)
+	{
+		for (int y = 0; y < 16; y++)
+		{
+			STBIR_ASSERT(output_data[y * 16 + x] == output_left[y * 8 + x]);
+			STBIR_ASSERT(output_data[y * 16 + x + 8] == output_right[y * 8 + x]);
+		}
+	}
+}
+
+// test that replicating an image and using a subtile of it produces same results as wraparound
+void test_subpixel_2()
+{
+	unsigned char image[8 * 8];
+
+	mtsrand(0);
+
+	for (int i = 0; i < sizeof(image); i++)
+		image[i] = mtrand() & 255;
+
+	unsigned char large_image[32 * 32];
+
+	for (int x = 0; x < 8; x++)
+	{
+		for (int y = 0; y < 8; y++)
+		{
+			for (int i = 0; i < 4; i++)
+			{
+				for (int j = 0; j < 4; j++)
+					large_image[j*4*8*8 + i*8 + y*4*8 + x] = image[y*8 + x];
+			}
+		}
+	}
+
+	unsigned char output_data_1[16 * 16];
+	unsigned char output_data_2[16 * 16];
+
+	stbir_resize(image, 8, 8, 0, output_data_1, 16, 16, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_WRAP, STBIR_EDGE_WRAP, STBIR_FILTER_CATMULLROM, STBIR_FILTER_CATMULLROM, STBIR_COLORSPACE_SRGB, &g_context);
+	stbir_resize_region(large_image, 32, 32, 0, output_data_2, 16, 16, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_WRAP, STBIR_EDGE_WRAP, STBIR_FILTER_CATMULLROM, STBIR_FILTER_CATMULLROM, STBIR_COLORSPACE_SRGB, &g_context, 0.25f, 0.25f, 0.5f, 0.5f);
+
+	{for (int x = 0; x < 16; x++)
+	{
+		for (int y = 0; y < 16; y++)
+			STBIR_ASSERT(output_data_1[y * 16 + x] == output_data_2[y * 16 + x]);
+	}}
+}
+
+// test that 0,0,1,1 subpixel produces same result as no-rect
+void test_subpixel_3()
+{
+	unsigned char image[8 * 8];
+
+	mtsrand(0);
+
+	for (int i = 0; i < sizeof(image); i++)
+		image[i] = mtrand() & 255;
+
+	unsigned char output_data_1[32 * 32];
+	unsigned char output_data_2[32 * 32];
+
+	stbir_resize_region(image, 8, 8, 0, output_data_1, 32, 32, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM, STBIR_FILTER_CATMULLROM, STBIR_COLORSPACE_LINEAR, NULL, 0, 0, 1, 1);
+	stbir_resize_uint8(image, 8, 8, 0, output_data_2, 32, 32, 0, 1);
+
+	for (int x = 0; x < 32; x++)
+	{
+		for (int y = 0; y < 32; y++)
+			STBIR_ASSERT(output_data_1[y * 32 + x] == output_data_2[y * 32 + x]);
+	}
+}
+
+// test that 1:1 resample using s,t=0,0,1,1 with bilinear produces original image
+void test_subpixel_4()
+{
+	unsigned char image[8 * 8];
+
+	mtsrand(0);
+
+	for (int i = 0; i < sizeof(image); i++)
+		image[i] = mtrand() & 255;
+
+	unsigned char output[8 * 8];
+
+	stbir_resize_region(image, 8, 8, 0, output, 8, 8, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_TRIANGLE, STBIR_FILTER_TRIANGLE, STBIR_COLORSPACE_LINEAR, &g_context, 0, 0, 1, 1);
+	STBIR_ASSERT(memcmp(image, output, 8 * 8) == 0);
+}
+
+static unsigned int  image88_int[8][8];
+static unsigned char image88 [8][8];
+static unsigned char output88[8][8];
+static unsigned char output44[4][4];
+static unsigned char output22[2][2];
+static unsigned char output11[1][1];
+
+void resample_88(stbir_filter filter)
+{
+	stbir_resize_uint8_generic(image88[0],8,8,0, output88[0],8,8,0, 1,-1,0, STBIR_EDGE_CLAMP, filter, STBIR_COLORSPACE_LINEAR, NULL);
+	stbir_resize_uint8_generic(image88[0],8,8,0, output44[0],4,4,0, 1,-1,0, STBIR_EDGE_CLAMP, filter, STBIR_COLORSPACE_LINEAR, NULL);
+	stbir_resize_uint8_generic(image88[0],8,8,0, output22[0],2,2,0, 1,-1,0, STBIR_EDGE_CLAMP, filter, STBIR_COLORSPACE_LINEAR, NULL);
+	stbir_resize_uint8_generic(image88[0],8,8,0, output11[0],1,1,0, 1,-1,0, STBIR_EDGE_CLAMP, filter, STBIR_COLORSPACE_LINEAR, NULL);
+}
+
+void verify_box(void)
+{
+	int i,j,t;
+
+	resample_88(STBIR_FILTER_BOX);
+
+	for (i=0; i < sizeof(image88); ++i)
+		STBIR_ASSERT(image88[0][i] == output88[0][i]);
+
+	t = 0;
+	for (j=0; j < 4; ++j)
+		for (i=0; i < 4; ++i) {
+			int n = image88[j*2+0][i*2+0]
+			      + image88[j*2+0][i*2+1]
+				  + image88[j*2+1][i*2+0]
+				  + image88[j*2+1][i*2+1];
+			STBIR_ASSERT(output44[j][i] == ((n+2)>>2) || output44[j][i] == ((n+1)>>2)); // can't guarantee exact rounding due to numerical precision
+			t += n;
+		}
+	STBIR_ASSERT(output11[0][0] == ((t+32)>>6) || output11[0][0] == ((t+31)>>6)); // can't guarantee exact rounding due to numerical precision
+}
+
+void verify_filter_normalized(stbir_filter filter, int output_size, unsigned int value)
+{
+	int i, j;
+	unsigned int output[64];
+
+	stbir_resize(image88_int[0], 8, 8, 0, output, output_size, output_size, 0, STBIR_TYPE_UINT32, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, filter, filter, STBIR_COLORSPACE_LINEAR, NULL);
+
+	for (j = 0; j < output_size; ++j)
+		for (i = 0; i < output_size; ++i)
+			STBIR_ASSERT(value == output[j*output_size + i]);
+}
+
+float round2(float f)
+{
+	return (float) floor(f+0.5f); // round() isn't C standard pre-C99
+}
+
+void test_filters(void)
+{
+	int i,j;
+
+	mtsrand(0);
+
+	for (i=0; i < sizeof(image88); ++i)
+		image88[0][i] = mtrand() & 255;
+	verify_box();
+
+	for (i=0; i < sizeof(image88); ++i)
+		image88[0][i] = 0;
+	image88[4][4] = 255;
+	verify_box();
+
+	for (j=0; j < 8; ++j)
+		for (i=0; i < 8; ++i)
+			image88[j][i] = (j^i)&1 ? 255 : 0;
+	verify_box();
+
+	for (j=0; j < 8; ++j)
+		for (i=0; i < 8; ++i)
+			image88[j][i] = i&2 ? 255 : 0;
+	verify_box();
+
+	int value = 64;
+
+	for (j = 0; j < 8; ++j)
+		for (i = 0; i < 8; ++i)
+			image88_int[j][i] = value;
+
+	verify_filter_normalized(STBIR_FILTER_BOX, 8, value);
+	verify_filter_normalized(STBIR_FILTER_TRIANGLE, 8, value);
+	verify_filter_normalized(STBIR_FILTER_CUBICBSPLINE, 8, value);
+	verify_filter_normalized(STBIR_FILTER_CATMULLROM, 8, value);
+	verify_filter_normalized(STBIR_FILTER_MITCHELL, 8, value);
+
+	verify_filter_normalized(STBIR_FILTER_BOX, 4, value);
+	verify_filter_normalized(STBIR_FILTER_TRIANGLE, 4, value);
+	verify_filter_normalized(STBIR_FILTER_CUBICBSPLINE, 4, value);
+	verify_filter_normalized(STBIR_FILTER_CATMULLROM, 4, value);
+	verify_filter_normalized(STBIR_FILTER_MITCHELL, 4, value);
+
+	verify_filter_normalized(STBIR_FILTER_BOX, 2, value);
+	verify_filter_normalized(STBIR_FILTER_TRIANGLE, 2, value);
+	verify_filter_normalized(STBIR_FILTER_CUBICBSPLINE, 2, value);
+	verify_filter_normalized(STBIR_FILTER_CATMULLROM, 2, value);
+	verify_filter_normalized(STBIR_FILTER_MITCHELL, 2, value);
+
+	verify_filter_normalized(STBIR_FILTER_BOX, 1, value);
+	verify_filter_normalized(STBIR_FILTER_TRIANGLE, 1, value);
+	verify_filter_normalized(STBIR_FILTER_CUBICBSPLINE, 1, value);
+	verify_filter_normalized(STBIR_FILTER_CATMULLROM, 1, value);
+	verify_filter_normalized(STBIR_FILTER_MITCHELL, 1, value);
+
+	{
+		// This test is designed to produce coefficients that are very badly denormalized.
+		unsigned int v = 556;
+
+		unsigned int input[100 * 100];
+		unsigned int output[11 * 11];
+
+		for (j = 0; j < 100 * 100; ++j)
+			input[j] = v;
+
+		stbir_resize(input, 100, 100, 0, output, 11, 11, 0, STBIR_TYPE_UINT32, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_TRIANGLE, STBIR_FILTER_TRIANGLE, STBIR_COLORSPACE_LINEAR, NULL);
+
+		for (j = 0; j < 11 * 11; ++j)
+			STBIR_ASSERT(v == output[j]);
+	}
+
+	{
+		// Now test the trapezoid filter for downsampling.
+		unsigned int input[3 * 1];
+		unsigned int output[2 * 1];
+
+		input[0] = 0;
+		input[1] = 255;
+		input[2] = 127;
+
+		stbir_resize(input, 3, 1, 0, output, 2, 1, 0, STBIR_TYPE_UINT32, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_BOX, STBIR_COLORSPACE_LINEAR, NULL);
+
+		STBIR_ASSERT(output[0] == (unsigned int)round2((float)(input[0] * 2 + input[1]) / 3));
+		STBIR_ASSERT(output[1] == (unsigned int)round2((float)(input[2] * 2 + input[1]) / 3));
+
+		stbir_resize(input, 1, 3, 0, output, 1, 2, 0, STBIR_TYPE_UINT32, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_BOX, STBIR_COLORSPACE_LINEAR, NULL);
+
+		STBIR_ASSERT(output[0] == (unsigned int)round2((float)(input[0] * 2 + input[1]) / 3));
+		STBIR_ASSERT(output[1] == (unsigned int)round2((float)(input[2] * 2 + input[1]) / 3));
+	}
+
+	{
+		// Now test the trapezoid filter for upsampling.
+		unsigned int input[2 * 1];
+		unsigned int output[3 * 1];
+
+		input[0] = 0;
+		input[1] = 255;
+
+		stbir_resize(input, 2, 1, 0, output, 3, 1, 0, STBIR_TYPE_UINT32, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_BOX, STBIR_COLORSPACE_LINEAR, NULL);
+
+		STBIR_ASSERT(output[0] == input[0]);
+		STBIR_ASSERT(output[1] == (input[0] + input[1]) / 2);
+		STBIR_ASSERT(output[2] == input[1]);
+
+		stbir_resize(input, 1, 2, 0, output, 1, 3, 0, STBIR_TYPE_UINT32, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_BOX, STBIR_COLORSPACE_LINEAR, NULL);
+
+		STBIR_ASSERT(output[0] == input[0]);
+		STBIR_ASSERT(output[1] == (input[0] + input[1]) / 2);
+		STBIR_ASSERT(output[2] == input[1]);
+	}
+
+	// checkerboard
+	{
+		unsigned char input[64][64];
+		unsigned char output[16][16];
+		int i,j;
+		for (j=0; j < 64; ++j)
+			for (i=0; i < 64; ++i)
+				input[j][i] = (i^j)&1 ? 255 : 0;
+		stbir_resize_uint8_generic(input[0], 64, 64, 0, output[0],16,16,0, 1,-1,0,STBIR_EDGE_WRAP,STBIR_FILTER_DEFAULT,STBIR_COLORSPACE_LINEAR,0);
+		for (j=0; j < 16; ++j)
+			for (i=0; i < 16; ++i)
+				STBIR_ASSERT(output[j][i] == 128);
+		stbir_resize_uint8_srgb_edgemode(input[0], 64, 64, 0, output[0],16,16,0, 1,-1,0,STBIR_EDGE_WRAP);
+		for (j=0; j < 16; ++j)
+			for (i=0; i < 16; ++i)
+				STBIR_ASSERT(output[j][i] == 188);
+
+
+	}
+
+	{
+		// Test trapezoid box filter
+		unsigned char input[2 * 1];
+		unsigned char output[127 * 1];
+
+		input[0] = 0;
+		input[1] = 255;
+
+		stbir_resize(input, 2, 1, 0, output, 127, 1, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_BOX, STBIR_COLORSPACE_LINEAR, NULL);
+		STBIR_ASSERT(output[0] == 0);
+		STBIR_ASSERT(output[127 / 2 - 1] == 0);
+		STBIR_ASSERT(output[127 / 2] == 128);
+		STBIR_ASSERT(output[127 / 2 + 1] == 255);
+		STBIR_ASSERT(output[126] == 255);
+		stbi_write_png("test-output/trapezoid-upsample-horizontal.png", 127, 1, 1, output, 0);
+
+		stbir_resize(input, 1, 2, 0, output, 1, 127, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_BOX, STBIR_COLORSPACE_LINEAR, NULL);
+		STBIR_ASSERT(output[0] == 0);
+		STBIR_ASSERT(output[127 / 2 - 1] == 0);
+		STBIR_ASSERT(output[127 / 2] == 128);
+		STBIR_ASSERT(output[127 / 2 + 1] == 255);
+		STBIR_ASSERT(output[126] == 255);
+		stbi_write_png("test-output/trapezoid-upsample-vertical.png", 1, 127, 1, output, 0);
+	}
+}
+
+#define UMAX32   4294967295U
+
+static void write32(char *filename, stbir_uint32 *output, int w, int h)
+{
+    stbir_uint8 *data = (stbir_uint8*) malloc(w*h*3);
+    for (int i=0; i < w*h*3; ++i)
+        data[i] = output[i]>>24;
+    stbi_write_png(filename, w, h, 3, data, 0);
+    free(data);
+}
+
+static void test_32(void)
+{
+    int w=100,h=120,x,y, out_w,out_h;
+    stbir_uint32 *input  = (stbir_uint32*) malloc(4 * 3 * w * h);
+    stbir_uint32 *output = (stbir_uint32*) malloc(4 * 3 * 3*w * 3*h);
+    for (y=0; y < h; ++y) {
+        for (x=0; x < w; ++x) {
+            input[y*3*w + x*3 + 0] = x * ( UMAX32/w );
+            input[y*3*w + x*3 + 1] = y * ( UMAX32/h );
+            input[y*3*w + x*3 + 2] = UMAX32/2;
+        }
+    }
+    out_w = w*33/16;
+    out_h = h*33/16;
+    stbir_resize(input,w,h,0,output,out_w,out_h,0,STBIR_TYPE_UINT32,3,-1,0,STBIR_EDGE_CLAMP,STBIR_EDGE_CLAMP,STBIR_FILTER_DEFAULT,STBIR_FILTER_DEFAULT,STBIR_COLORSPACE_LINEAR,NULL);
+    write32("test-output/seantest_1.png", output,out_w,out_h);
+
+    out_w = w*16/33;
+    out_h = h*16/33;
+    stbir_resize(input,w,h,0,output,out_w,out_h,0,STBIR_TYPE_UINT32,3,-1,0,STBIR_EDGE_CLAMP,STBIR_EDGE_CLAMP,STBIR_FILTER_DEFAULT,STBIR_FILTER_DEFAULT,STBIR_COLORSPACE_LINEAR,NULL);
+    write32("test-output/seantest_2.png", output,out_w,out_h);
+}
+
+
+void test_suite(int argc, char **argv)
+{
+	int i;
+	char *barbara;
+
+	_mkdir("test-output");
+
+	if (argc > 1)
+		barbara = argv[1];
+	else
+		barbara = "barbara.png";
+
+	// check what cases we need normalization for
+#if 1
+	{
+		float x, y;
+		for (x = -1; x < 1; x += 0.05f) {
+			float sums[5] = { 0 };
+			float o;
+			for (o = -5; o <= 5; ++o) {
+				sums[0] += stbir__filter_mitchell(x + o, 1);
+				sums[1] += stbir__filter_catmullrom(x + o, 1);
+				sums[2] += stbir__filter_cubic(x + o, 1);
+				sums[3] += stbir__filter_triangle(x + o, 1);
+				sums[4] += stbir__filter_trapezoid(x + o, 0.5f);
+			}
+			for (i = 0; i < 5; ++i)
+				STBIR_ASSERT(sums[i] >= 1.0 - 0.001 && sums[i] <= 1.0 + 0.001);
+		}
+
+#if 1	
+		for (y = 0.11f; y < 1; y += 0.01f) {  // Step
+			for (x = -1; x < 1; x += 0.05f) { // Phase
+				float sums[5] = { 0 };
+				float o;
+				for (o = -5; o <= 5; o += y) {
+					sums[0] += y * stbir__filter_mitchell(x + o, 1);
+					sums[1] += y * stbir__filter_catmullrom(x + o, 1);
+					sums[2] += y * stbir__filter_cubic(x + o, 1);
+					sums[4] += y * stbir__filter_trapezoid(x + o, 0.5f);
+					sums[3] += y * stbir__filter_triangle(x + o, 1);
+				}
+				for (i = 0; i < 3; ++i)
+					STBIR_ASSERT(sums[i] >= 1.0 - 0.0170 && sums[i] <= 1.0 + 0.0170);
+			}
+		}
+#endif
+	}
+#endif
+
+#if 0 // linear_to_srgb_uchar table
+	for (i=0; i < 256; ++i) {
+		float f = stbir__srgb_to_linear((i-0.5f)/255.0f);
+		printf("%9d, ", (int) ((f) * (1<<28)));
+		if ((i & 7) == 7)
+			printf("\n");
+	}
+#endif
+
+	// old tests that hacky fix worked on - test that
+	// every uint8 maps to itself
+	for (i = 0; i < 256; i++) {
+		float f = stbir__srgb_to_linear(float(i) / 255);
+		int n = stbir__linear_to_srgb_uchar(f);
+		STBIR_ASSERT(n == i);
+	}
+
+	// new tests that hacky fix failed for - test that
+	// values adjacent to uint8 round to nearest uint8
+	for (i = 0; i < 256; i++) {
+		for (float y = -0.42f; y <= 0.42f; y += 0.01f) {
+			float f = stbir__srgb_to_linear((i+y) / 255.0f);
+			int n = stbir__linear_to_srgb_uchar(f);
+			STBIR_ASSERT(n == i);
+		}
+	}
+
+	test_filters();
+
+	test_subpixel_1();
+	test_subpixel_2();
+	test_subpixel_3();
+	test_subpixel_4();
+
+	test_premul();
+
+	test_32();
+
+	// Some tests to make sure errors don't pop up with strange filter/dimension combinations.
+	stbir_resize(image88, 8, 8, 0, output88, 4, 16, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_CATMULLROM, STBIR_COLORSPACE_SRGB, &g_context);
+	stbir_resize(image88, 8, 8, 0, output88, 4, 16, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM, STBIR_FILTER_BOX, STBIR_COLORSPACE_SRGB, &g_context);
+	stbir_resize(image88, 8, 8, 0, output88, 16, 4, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_BOX, STBIR_FILTER_CATMULLROM, STBIR_COLORSPACE_SRGB, &g_context);
+	stbir_resize(image88, 8, 8, 0, output88, 16, 4, 0, STBIR_TYPE_UINT8, 1, STBIR_ALPHA_CHANNEL_NONE, 0, STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM, STBIR_FILTER_BOX, STBIR_COLORSPACE_SRGB, &g_context);
+
+	for (i = 0; i < 10; i++)
+		test_subpixel(barbara, 0.5f, 0.5f, (float)i / 10, 1);
+
+	for (i = 0; i < 10; i++)
+		test_subpixel(barbara, 0.5f, 0.5f, 1, (float)i / 10);
+
+	for (i = 0; i < 10; i++)
+		test_subpixel(barbara, 2, 2, (float)i / 10, 1);
+
+	for (i = 0; i < 10; i++)
+		test_subpixel(barbara, 2, 2, 1, (float)i / 10);
+
+	// Channels test
+	test_channels(barbara, 0.5f, 0.5f, 1);
+	test_channels(barbara, 0.5f, 0.5f, 2);
+	test_channels(barbara, 0.5f, 0.5f, 3);
+	test_channels(barbara, 0.5f, 0.5f, 4);
+
+	test_channels(barbara, 2, 2, 1);
+	test_channels(barbara, 2, 2, 2);
+	test_channels(barbara, 2, 2, 3);
+	test_channels(barbara, 2, 2, 4);
+
+	// filter tests
+	resize_image(barbara, 2, 2, STBIR_FILTER_BOX         , STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-upsample-nearest.png");
+	resize_image(barbara, 2, 2, STBIR_FILTER_TRIANGLE    , STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-upsample-bilinear.png");
+	resize_image(barbara, 2, 2, STBIR_FILTER_CUBICBSPLINE, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-upsample-bicubic.png");
+	resize_image(barbara, 2, 2, STBIR_FILTER_CATMULLROM  , STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-upsample-catmullrom.png");
+	resize_image(barbara, 2, 2, STBIR_FILTER_MITCHELL    , STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-upsample-mitchell.png");
+
+	resize_image(barbara, 0.5f, 0.5f, STBIR_FILTER_BOX         , STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-downsample-nearest.png");
+	resize_image(barbara, 0.5f, 0.5f, STBIR_FILTER_TRIANGLE    , STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-downsample-bilinear.png");
+	resize_image(barbara, 0.5f, 0.5f, STBIR_FILTER_CUBICBSPLINE, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-downsample-bicubic.png");
+	resize_image(barbara, 0.5f, 0.5f, STBIR_FILTER_CATMULLROM  , STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-downsample-catmullrom.png");
+	resize_image(barbara, 0.5f, 0.5f, STBIR_FILTER_MITCHELL    , STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, "test-output/barbara-downsample-mitchell.png");
+
+	for (i = 10; i < 100; i++)
+	{
+		char outname[200];
+		sprintf(outname, "test-output/barbara-width-%d.jpg", i);
+		resize_image(barbara, (float)i / 100, 1, STBIR_FILTER_CATMULLROM, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, outname);
+	}
+
+	for (i = 110; i < 500; i += 10)
+	{
+		char outname[200];
+		sprintf(outname, "test-output/barbara-width-%d.jpg", i);
+		resize_image(barbara, (float)i / 100, 1, STBIR_FILTER_CATMULLROM, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, outname);
+	}
+
+	for (i = 10; i < 100; i++)
+	{
+		char outname[200];
+		sprintf(outname, "test-output/barbara-height-%d.jpg", i);
+		resize_image(barbara, 1, (float)i / 100, STBIR_FILTER_CATMULLROM, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, outname);
+	}
+
+	for (i = 110; i < 500; i += 10)
+	{
+		char outname[200];
+		sprintf(outname, "test-output/barbara-height-%d.jpg", i);
+		resize_image(barbara, 1, (float)i / 100, STBIR_FILTER_CATMULLROM, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, outname);
+	}
+
+	for (i = 50; i < 200; i += 10)
+	{
+		char outname[200];
+		sprintf(outname, "test-output/barbara-width-height-%d.jpg", i);
+		resize_image(barbara, 100 / (float)i, (float)i / 100, STBIR_FILTER_CATMULLROM, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB, outname);
+	}
+
+	test_format<unsigned short>(barbara, 0.5, 2.0, STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB);
+	test_format<unsigned short>(barbara, 0.5, 2.0, STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR);
+	test_format<unsigned short>(barbara, 2.0, 0.5, STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB);
+	test_format<unsigned short>(barbara, 2.0, 0.5, STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR);
+
+	test_format<unsigned int>(barbara, 0.5, 2.0, STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB);
+	test_format<unsigned int>(barbara, 0.5, 2.0, STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR);
+	test_format<unsigned int>(barbara, 2.0, 0.5, STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB);
+	test_format<unsigned int>(barbara, 2.0, 0.5, STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR);
+
+	test_float(barbara, 0.5, 2.0, STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB);
+	test_float(barbara, 0.5, 2.0, STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR);
+	test_float(barbara, 2.0, 0.5, STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB);
+	test_float(barbara, 2.0, 0.5, STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR);
+
+	// Edge behavior tests
+	resize_image("hgradient.png", 2, 2, STBIR_FILTER_CATMULLROM, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR, "test-output/hgradient-clamp.png");
+	resize_image("hgradient.png", 2, 2, STBIR_FILTER_CATMULLROM, STBIR_EDGE_WRAP, STBIR_COLORSPACE_LINEAR, "test-output/hgradient-wrap.png");
+
+	resize_image("vgradient.png", 2, 2, STBIR_FILTER_CATMULLROM, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR, "test-output/vgradient-clamp.png");
+	resize_image("vgradient.png", 2, 2, STBIR_FILTER_CATMULLROM, STBIR_EDGE_WRAP, STBIR_COLORSPACE_LINEAR, "test-output/vgradient-wrap.png");
+
+	resize_image("1px-border.png", 2, 2, STBIR_FILTER_CATMULLROM, STBIR_EDGE_REFLECT, STBIR_COLORSPACE_LINEAR, "test-output/1px-border-reflect.png");
+	resize_image("1px-border.png", 2, 2, STBIR_FILTER_CATMULLROM, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR, "test-output/1px-border-clamp.png");
+
+	// sRGB tests
+	resize_image("gamma_colors.jpg", .5f, .5f, STBIR_FILTER_CATMULLROM, STBIR_EDGE_REFLECT, STBIR_COLORSPACE_SRGB, "test-output/gamma_colors.jpg");
+	resize_image("gamma_2.2.jpg", .5f, .5f, STBIR_FILTER_CATMULLROM, STBIR_EDGE_REFLECT, STBIR_COLORSPACE_SRGB, "test-output/gamma_2.2.jpg");
+	resize_image("gamma_dalai_lama_gray.jpg", .5f, .5f, STBIR_FILTER_CATMULLROM, STBIR_EDGE_REFLECT, STBIR_COLORSPACE_SRGB, "test-output/gamma_dalai_lama_gray.jpg");
+}

+ 5 - 0
tests/resample_test_c.c

@@ -0,0 +1,5 @@
+#define STB_IMAGE_RESIZE_IMPLEMENTATION
+#define STB_IMAGE_RESIZE_STATIC
+#include "stb_image_resize.h"
+
+// Just to make sure it will build properly with a c compiler

+ 93 - 0
tests/resize.dsp

@@ -0,0 +1,93 @@
+# Microsoft Developer Studio Project File - Name="resize" - Package Owner=<4>
+# Microsoft Developer Studio Generated Build File, Format Version 6.00
+# ** DO NOT EDIT **
+
+# TARGTYPE "Win32 (x86) Console Application" 0x0103
+
+CFG=resize - Win32 Debug
+!MESSAGE This is not a valid makefile. To build this project using NMAKE,
+!MESSAGE use the Export Makefile command and run
+!MESSAGE 
+!MESSAGE NMAKE /f "resize.mak".
+!MESSAGE 
+!MESSAGE You can specify a configuration when running NMAKE
+!MESSAGE by defining the macro CFG on the command line. For example:
+!MESSAGE 
+!MESSAGE NMAKE /f "resize.mak" CFG="resize - Win32 Debug"
+!MESSAGE 
+!MESSAGE Possible choices for configuration are:
+!MESSAGE 
+!MESSAGE "resize - Win32 Release" (based on "Win32 (x86) Console Application")
+!MESSAGE "resize - Win32 Debug" (based on "Win32 (x86) Console Application")
+!MESSAGE 
+
+# Begin Project
+# PROP AllowPerConfigDependencies 0
+# PROP Scc_ProjName ""
+# PROP Scc_LocalPath ""
+CPP=cl.exe
+RSC=rc.exe
+
+!IF  "$(CFG)" == "resize - Win32 Release"
+
+# PROP BASE Use_MFC 0
+# PROP BASE Use_Debug_Libraries 0
+# PROP BASE Output_Dir "Release"
+# PROP BASE Intermediate_Dir "Release"
+# PROP BASE Target_Dir ""
+# PROP Use_MFC 0
+# PROP Use_Debug_Libraries 0
+# PROP Output_Dir "Release"
+# PROP Intermediate_Dir "Release"
+# PROP Ignore_Export_Lib 0
+# PROP Target_Dir ""
+# ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
+# ADD CPP /nologo /G6 /W3 /GX /Z7 /O2 /I ".." /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
+# ADD BASE RSC /l 0x409 /d "NDEBUG"
+# ADD RSC /l 0x409 /d "NDEBUG"
+BSC32=bscmake.exe
+# ADD BASE BSC32 /nologo
+# ADD BSC32 /nologo
+LINK32=link.exe
+# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386
+# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386
+
+!ELSEIF  "$(CFG)" == "resize - Win32 Debug"
+
+# PROP BASE Use_MFC 0
+# PROP BASE Use_Debug_Libraries 1
+# PROP BASE Output_Dir "Debug"
+# PROP BASE Intermediate_Dir "Debug"
+# PROP BASE Target_Dir ""
+# PROP Use_MFC 0
+# PROP Use_Debug_Libraries 1
+# PROP Output_Dir "Debug"
+# PROP Intermediate_Dir "Debug"
+# PROP Target_Dir ""
+# ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c
+# ADD CPP /nologo /W3 /Gm /GX /ZI /Od /I ".." /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c
+# ADD BASE RSC /l 0x409 /d "_DEBUG"
+# ADD RSC /l 0x409 /d "_DEBUG"
+BSC32=bscmake.exe
+# ADD BASE BSC32 /nologo
+# ADD BSC32 /nologo
+LINK32=link.exe
+# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept
+# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept
+
+!ENDIF 
+
+# Begin Target
+
+# Name "resize - Win32 Release"
+# Name "resize - Win32 Debug"
+# Begin Source File
+
+SOURCE=.\resample_test.cpp
+# End Source File
+# Begin Source File
+
+SOURCE=..\stb_image_resize.h
+# End Source File
+# End Target
+# End Project

+ 9 - 1
tests/stb.dsp

@@ -66,7 +66,7 @@ LINK32=link.exe
 # PROP Ignore_Export_Lib 0
 # PROP Target_Dir ""
 # ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c
-# ADD CPP /nologo /MTd /W3 /GX /Zd /Od /I ".." /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /D "MAIN_TEST" /FR /FD /GZ /c
+# ADD CPP /nologo /MTd /W3 /GX /Zi /Od /I ".." /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /D "TT_TEST" /FR /FD /GZ /c
 # SUBTRACT CPP /YX
 # ADD BASE RSC /l 0x409 /d "_DEBUG"
 # ADD RSC /l 0x409 /d "_DEBUG"
@@ -106,10 +106,18 @@ SOURCE=..\stb_dxt.h
 # End Source File
 # Begin Source File
 
+SOURCE=..\stb_herringbone_wang_tile.h
+# End Source File
+# Begin Source File
+
 SOURCE=..\stb_image.h
 # End Source File
 # Begin Source File
 
+SOURCE=..\stb_image_resize.h
+# End Source File
+# Begin Source File
+
 SOURCE=..\stb_image_write.h
 # End Source File
 # Begin Source File

+ 12 - 0
tests/stb.dsw

@@ -63,6 +63,18 @@ Package=<4>
 
 ###############################################################################
 
+Project: "resize"=.\resize\resize.dsp - Package Owner=<4>
+
+Package=<5>
+{{{
+}}}
+
+Package=<4>
+{{{
+}}}
+
+###############################################################################
+
 Project: "stb"=.\stb.dsp - Package Owner=<4>
 
 Package=<5>

+ 8 - 0
tests/test_c_compilation.c

@@ -5,6 +5,7 @@
 #define STB_DIVIDE_IMPLEMENTATION
 #define STB_IMAGE_IMPLEMENTATION
 #define STB_HERRINGBONE_WANG_TILE_IMEPLEMENTATIOn
+#define STB_IMAGE_RESIZE_IMPLEMENTATION
 
 #include "stb_herringbone_wang_tile.h"
 #include "stb_image.h"
@@ -13,3 +14,10 @@
 #include "stb_dxt.h"
 #include "stb_c_lexer.h"
 #include "stb_divide.h"
+#include "stb_image_resize.h"
+
+
+#define STBTE_DRAW_RECT(x0,y0,x1,y1,color) 0
+#define STBTE_DRAW_TILE(x,y,id,highlight)  0
+#define STB_TILEMAP_EDITOR_IMPLEMENTATION
+#include "stb_tilemap_editor.h"

+ 5 - 0
tests/test_cpp_compilation.cpp

@@ -15,3 +15,8 @@
 #include "stb_divide.h"
 #include "stb_image.h"
 #include "stb_herringbone_wang_tile.h"
+
+#define STBTE_DRAW_RECT(x0,y0,x1,y1,color)
+#define STBTE_DRAW_TILE(x,y,id,highlight)
+#define STB_TILEMAP_EDITOR_IMPLEMENTATION
+#include "stb_tilemap_editor.h"

+ 13 - 0
tests/test_truetype.c

@@ -4,14 +4,27 @@
 #include <stdio.h>
 
 char ttf_buffer[1<<25];
+unsigned char output[512*100];
 
 #ifdef TT_TEST
+
+void debug(void)
+{
+   stbtt_fontinfo font;
+   fread(ttf_buffer, 1, 1<<25, fopen("c:/x/lm/LiberationMono-Regular.ttf", "rb"));
+   stbtt_InitFont(&font, ttf_buffer, 0);
+
+   stbtt_MakeGlyphBitmap(&font, output, 6, 9, 512, 5.172414E-03f, 5.172414E-03f, 54);
+}
+
 int main(int argc, char **argv)
 {
    stbtt_fontinfo font;
    unsigned char *bitmap;
    int w,h,i,j,c = (argc > 1 ? atoi(argv[1]) : 34807), s = (argc > 2 ? atoi(argv[2]) : 32);
 
+   debug();
+
    fread(ttf_buffer, 1, 1<<25, fopen(argc > 3 ? argv[3] : "c:/windows/fonts/mingliu.ttc", "rb"));
 
    stbtt_InitFont(&font, ttf_buffer, stbtt_GetFontOffsetForIndex(ttf_buffer,0));

+ 14 - 12
tools/README.list

@@ -1,12 +1,14 @@
-stb_vorbis.c      | audio    | decode ogg vorbis files from file/memory to float/16-bit signed output
-stb_image.h       | graphics | image loading/decoding from file/memory: JPG, PNG, TGA, BMP, PSD, GIF, HDR, PIC
-stb_truetype.h    | graphics | parse, decode, and rasterize characters from truetype fonts
-stb_image_write.h | graphics | image writing to disk: PNG, TGA, BMP
-stretchy_buffer.h | utility  | typesafe dynamic array for C (i.e. approximation to vector<>), doesn't compile as C++
-stb_textedit.h    | UI       | guts of a text editor for games etc implementing them from scratch
-stb_dxt.h         | 3D graphics | Fabian "ryg" Giesen's real-time DXT compressor
-stb_herringbone_wang_tile.h | games | herringbone Wang tile map generator
-stb_perlin.h      | 3D graphics | revised Perlin noise (3D input, 1D output)
-stb_c_lexer.h     | parsing  | simplify writing parsers for C-like languages
-stb_divide.h      | math     | more useful 32-bit modulus e.g. "euclidean divide"
-stb.h             | misc     | helper functions for C, mostly redundant in C++; basically author's personal stuff
+stb_vorbis.c                | audio       | decode ogg vorbis files from file/memory to float/16-bit signed output
+stb_image.h                 | graphics    | image loading/decoding from file/memory: JPG, PNG, TGA, BMP, PSD, GIF, HDR, PIC
+stb_truetype.h              | graphics    | parse, decode, and rasterize characters from truetype fonts
+stb_image_write.h           | graphics    | image writing to disk: PNG, TGA, BMP
+stb_image_resize.h          | graphics    | resize images larger/smaller with good quality
+stretchy_buffer.h           | utility     | typesafe dynamic array for C (i.e. approximation to vector<>), doesn't compile as C++
+stb_textedit.h              | UI          | guts of a text editor for games etc implementing them from scratch
+stb_dxt.h                   | 3D graphics | Fabian "ryg" Giesen's real-time DXT compressor
+stb_perlin.h                | 3D graphics | revised Perlin noise (3D input, 1D output)
+stb_tilemap_editor.h        | games       | embeddable tilemap editor
+stb_herringbone_wang_tile.h | games       | herringbone Wang tile map generator
+stb_c_lexer.h               | parsing     | simplify writing parsers for C-like languages
+stb_divide.h                | math        | more useful 32-bit modulus e.g. "euclidean divide"
+stb.h                       | misc        | helper functions for C, mostly redundant in C++; basically author's personal stuff

+ 4 - 0
tools/make_readme.dsp

@@ -84,5 +84,9 @@ LINK32=link.exe
 
 SOURCE=.\make_readme.c
 # End Source File
+# Begin Source File
+
+SOURCE=.\README.list
+# End Source File
 # End Target
 # End Project