While getting rid of extraneous junk in an application package is easy using Trimmit, the only way to prevent "code bloat" (and the accompanying excess RAM and CPU usage) is through good programming practices. Where most developers fall short is in optimizing borrowed code. Let’s take CTGradient as an example, as it’s well known and used (or more accurately, abused) in dozens of applications.
CTGradient contains an incredible diversity of built-in gradients, gradient styles, and methods for creating fancy rainbows, radial gradients, linear gradients, aqua gradients, and a number of other interesting class and instance methods. It allows you to dynamically alter gradients by adding or removing colors, changing the level of transparency, filling NSRects or NSBezierPaths, rotating the gradient, etc. For demonstration purposes, all these features are excellent. For production, this is a nightmare.
CTGradient.m weighs in at over 1300 lines of code.
Ignoring Mr. Weider’s unique style of formatting, take a look through the code. If you’re putting this class in an application, you can immediately remove over a thousand lines. Just like that. But if you take some time to understand what’s going on, you can optimize this thing till it runs like a Ferrari.
You’ll notice Trimmit uses a gradient background for its window. Let’s cut down CTGradient until it matches the level of optimization of Trimmit’s gradient code.
Clear the junk
Firstly, let’s remove the methods we know for sure we won’t need. Remove the following methods completely from both the interface and the implementation (scroll down for more):
// We don't need the preset gradients
+ (id)dividerGradient;
+ (id)statusBarGradient;
+ (id)aquaSelectedGradient;
+ (id)aquaNormalGradient;
+ (id)aquaPressedGradient;
+ (id)unifiedSelectedGradient;
+ (id)unifiedNormalGradient;
+ (id)unifiedPressedGradient;
+ (id)unifiedDarkGradient;
+ (id)sourceListSelectedGradient;
+ (id)sourceListUnselectedGradient;
+ (id)rainbowGradient;
+ (id)hydrogenSpectrumGradient;
// We won't make any drastic modifications to the gradient once created
- (CTGradient *)gradientWithAlphaComponent:(float)alpha;
- (CTGradient *)addColorStop:(NSColor *)color atPosition:(float)position;
- (CTGradient *)removeColorStopAtIndex:(unsigned)index;
- (CTGradient *)removeColorStopAtPosition:(float)position;
- (CTGradientBlendingMode)blendingMode;
- (NSColor *)colorStopAtIndex:(unsigned)index;
- (NSColor *)colorAtPosition:(float)position;
// Now we don't need to conform to the NSCopying and NSCoding protocols
- (id)copyWithZone:(NSZone *)zone;
- (void)encodeWithCoder:(NSCoder *)coder;
- (id)initWithCoder:(NSCoder *)coder;
// We only need to fill a simple NSRect
- (void)drawSwatchInRect:(NSRect)rect;
- (void)radialFillRect:(NSRect)rect;
- (void)fillBezierPath:(NSBezierPath *)path angle:(float)angle;
- (void)radialFillBezierPath:(NSBezierPath *)path;
// Remove the entire (Private) category;
// but leave -addElement:
- (void)_commonInit;
// move setBlendingMode code into init
- (void)setBlendingMode:(CTGradientBlendingMode)mode;
- (CTGradientElement *)elementAtIndex:(unsigned)index;
- (CTGradientElement)removeElementAtIndex:(unsigned)index;
- (CTGradientElement)removeElementAtPosition:(float)position;
The following C functions are now unused:
static void chromaticEvaluation(void *info, const float *in, float *out);
static void inverseChromaticEvaluation(void *info, const float *in, float *out);
static void transformRGB_HSV(float *components);
static void transformHSV_RGB(float *components);
static void resolveHSV(float *color1, float *color2);
And also remove the following from the header:
typedef enum _CTBlendingMode
{
    CTLinearBlendingMode,
    CTChromaticBlendingMode,
    CTInverseChromaticBlendingMode
} CTGradientBlendingMode;
Remove the <protocols> and unnecessary instance variables so the interface looks like this:
@interface CTGradient : NSObject {
    CTGradientElement* elementList;
    CGFunctionRef gradientFunction;
}
+ (id)gradientWithBeginningColor:(NSColor *)begin endingColor:(NSColor *)end;
- (void)fillRect:(NSRect)rect angle:(float)angle;
- (void)addElement:(CTGradientElement *)newElement;
@end
Before we go further, we’ll need to take a detour and fix the remaining code so that it compiles. That’s easy enough - just remove any references to code we’ve removed.
If you’ve been following along, your CTGradient should now look like this (also made it readable!).
In a few short minutes, we’re down from more than 1300 to a little over 200 lines.
It gets better.
Optimize it
First stop, fillRect:angle:. Since the gradient for Trimmit’s window only runs vertically (angle 90), we can straight away take out the angle argument, and also cut the entire if / else structure with the angle down to the two lines that are under the if(angle == 90). Ah, much better.
Now things start getting a teeny bit more complex - and fun.
The CTGradient code is written so that one can have many different color stops. The CTGradientElement struct has a nextElement which points to the next CTGradientElement and the position of the current element is stored under the float position. However, we need just two - the starting and ending shades.
Let’s take a look at cutting the multiple elements down to just two.
The addElement: method has a lot of looping to insert the element at the correct place. However, since we know that we only ever need two, we can cut it down to this:
- (void)addElement:(CTGradientElement *)newElement {
    if(elementList) {
        // elementList exists, add second element
        elementList->nextElement = malloc(sizeof(CTGradientElement));
        *(elementList->nextElement) = *newElement;
        elementList->nextElement->nextElement = 0;
    } else {
        // no elements - add first element
        elementList = malloc(sizeof(CTGradientElement));
        *elementList = *newElement;
        elementList->nextElement = 0;
    }
}
Similarly with dealloc,
- (void)dealloc {
    CGFunctionRelease(gradientFunction);
    if(elementList) {
        // elementList is still NULL if no element was ever added
        free(elementList->nextElement);
        free(elementList);
    }
    [super dealloc];
}
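To see the two-element bookkeeping in isolation, here is a minimal plain-C sketch of the same idea. The names Element, list_add and list_free are hypothetical stand-ins, not part of CTGradient, and unlike the addElement: above this version refuses a third element instead of overwriting (and leaking) the second one.

```c
#include <stdlib.h>

/* Hypothetical stand-in for CTGradientElement, trimmed to one value. */
typedef struct Element {
    float shade;
    struct Element *next;
} Element;

/* Store a copy of *e in the first free slot (head, then head->next).
   Returns 0 on success, -1 if both slots are taken or malloc fails. */
static int list_add(Element **head, const Element *e) {
    Element **slot;
    if (*head == NULL) {
        slot = head;
    } else if ((*head)->next == NULL) {
        slot = &(*head)->next;
    } else {
        return -1; /* the list already holds two elements */
    }
    *slot = malloc(sizeof **slot);
    if (*slot == NULL) {
        return -1;
    }
    **slot = *e;
    (*slot)->next = NULL;
    return 0;
}

/* Release zero, one or two elements; safe to call on an empty list. */
static void list_free(Element **head) {
    if (*head != NULL) {
        free((*head)->next); /* free(NULL) is a no-op */
        free(*head);
        *head = NULL;
    }
}
```

The NULL check in list_free matters for the same reason it does in dealloc: an object that never received an element has nothing to dereference.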
While we’re at it, let’s clean up init as well:
- (id)init {
    if(self = [super init]) {
        CGFunctionCallbacks evaluationCallbackInfo = {0 , &linearEvaluation, 0};
        static const float input_value_range[2] = { 0, 1 };
        static const float output_value_ranges[8] = { 0, 1, 0, 1, 0, 1, 0, 1 };
        gradientFunction = CGFunctionCreate(&elementList, 1, input_value_range, 4, output_value_ranges, &evaluationCallbackInfo);
    }
    return self;
}
Now we head to the linearEvaluation function. info passed in refers to elementList (see the CGFunctionCreate call). Again, since we know that we’ll only ever have two elements we can reduce it to:
void linearEvaluation (void *info, const float *in, float *out) {
    float position = *in;
    CTGradientElement *color1 = *(CTGradientElement **)info;
    CTGradientElement *color2 = color1->nextElement;
    out[0] = (color2->red - color1->red)*position + color1->red;
    out[1] = (color2->green - color1->green)*position + color1->green;
    out[2] = (color2->blue - color1->blue)*position + color1->blue;
    out[3] = (color2->alpha - color1->alpha)*position + color1->alpha;
}
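If you want to convince yourself that the two-stop version still interpolates correctly, the callback can be exercised without Quartz at all. This is only a sketch: Stop and linear_evaluation are hypothetical names that mirror the shape of the code above, and the colors are made up for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for CTGradientElement without the position field. */
typedef struct Stop {
    float red, green, blue, alpha;
    struct Stop *next;
} Stop;

/* Same shape as the callback above: info is the address of the head pointer. */
static void linear_evaluation(void *info, const float *in, float *out) {
    const Stop *c1 = *(Stop **)info;
    const Stop *c2 = c1->next;
    float t = *in;
    out[0] = (c2->red   - c1->red)   * t + c1->red;
    out[1] = (c2->green - c1->green) * t + c1->green;
    out[2] = (c2->blue  - c1->blue)  * t + c1->blue;
    out[3] = (c2->alpha - c1->alpha) * t + c1->alpha;
}
```

Feeding it 0 should give back the first stop, 1 the second, and 0.5 the midpoint of every channel.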
And now we can finally remove the lines from +gradientWithBeginningColor:endingColor: where the positions are set, and also remove the position float from the CTGradientElement struct.
We’re now down to a little over 100 lines. Your CTGradient.m should now be looking something like this. Going well, but let’s take things up a notch.
For Trimmit’s background, we’re not interested in the red, green, blue and alpha components - we just need a grayscale shading. This is the most fun part.
Instead of float red, green, blue, alpha; in the CTGradientElement struct, we can have just float shade;.
Now, during init we have:
CGFunctionCallbacks evaluationCallbackInfo = {0 , &linearEvaluation, 0};
static const float input_value_range[2] = { 0, 1 };
static const float output_value_ranges[8] = { 0, 1, 0, 1, 0, 1, 0, 1 };
gradientFunction = CGFunctionCreate(&elementList, 1, input_value_range, 4, output_value_ranges, &evaluationCallbackInfo);
The third argument, input_value_range, is the domain. This gets passed into linearEvaluation through *in. This is the independent variable. When we’re drawing the gradient, linearEvaluation is called with *in (the domain) starting at the first value of input_value_range and ending up at the second. So, something like 0.000, 0.001, 0.002 … 0.998, 0.999, 1.000. The function’s job is to set the color components for each given value of the domain.
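A rough way to picture this is a loop that walks the domain and calls the callback once per step. The sketch below is an assumption for illustration only: sample_domain and identity_ramp are hypothetical names, and how many samples Quartz really takes, or in what order, is up to Quartz.

```c
#include <stddef.h>

/* Same signature as the evaluation callbacks in this article. */
typedef void (*EvalCallback)(void *info, const float *in, float *out);

/* Hypothetical driver: evaluate cb at count evenly spaced inputs in [lo, hi],
   writing one output value per sample into results. */
static void sample_domain(EvalCallback cb, void *info,
                          float lo, float hi,
                          float *results, size_t count) {
    for (size_t i = 0; i < count; i++) {
        float t = (count > 1) ? (float)i / (float)(count - 1) : 0.0f;
        float in = lo + (hi - lo) * t;
        cb(info, &in, &results[i]);
    }
}

/* One-channel example callback: the output simply follows the input. */
static void identity_ramp(void *info, const float *in, float *out) {
    (void)info; /* unused in this example */
    out[0] = *in;
}
```

With five samples over {0, 1} the callback sees 0, 0.25, 0.5, 0.75 and 1, which is the same idea as the 0.000 … 1.000 sequence above at a coarser step.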
The first optimization we can make here is to return only one channel instead of four. This changes it to:
- (id)init {
    if (self = [super init]) {
        CGFunctionCallbacks evaluationCallbackInfo = {0, &linearEvaluation, 0};
        static const float range[2] = {0, 1};
        static const float domain[2] = {0, 1};
        gradientFunction = CGFunctionCreate(&elementList, 1, domain, 1, range, &evaluationCallbackInfo);
    }
    return self;
}
This change impacts back on the linearEvaluation function - now it only needs to return one channel:
void linearEvaluation (void *info, const float *in, float *out) {
    CTGradientElement *color1 = *(CTGradientElement **)info;
    out[0] = (color1->nextElement->shade - color1->shade)*(*in) + color1->shade;
}
And don’t forget +gradientWithBeginningColor:endingColor:. We improve performance here, as we only need the shade - not a color. We can rename it to something appropriate.
+ (id)gradientWithBeginningShade:(float)begin endingShade:(float)end {
    id newInstance = [[[self class] alloc] init];
    CTGradientElement color1, color2;
    color1.shade = begin;
    color2.shade = end;
    [newInstance addElement:&color1];
    [newInstance addElement:&color2];
    return [newInstance autorelease];
}
Now we’re only returning one grayscale channel. But hold on, we’re still in the RGB colorspace! That’s easily fixed. In fillRect:, replace:
#if MAC_OS_X_VERSION_MAX_ALLOWED >= MAC_OS_X_VERSION_10_4
CGColorSpaceRef colorspace = CGColorSpaceCreateWithName(kCGColorSpaceGenericRGB);
#else
CGColorSpaceRef colorspace = CGColorSpaceCreateDeviceRGB();
#endif
with:
CGColorSpaceRef colorspace = CGColorSpaceCreateDeviceGray();
CGColorSpaceCreateDeviceGray is device-dependent on Mac OS X 10.3 and below, but from Tiger (10.4) onward it is device-independent, which is good in terms of appearance.
While we’re in fillRect:, we can also make a few more improvements:
- (void)fillRect:(NSRect)rect {
    CGContextRef currentContext = [[NSGraphicsContext currentContext] graphicsPort];
    CGContextSaveGState(currentContext);
    CGColorSpaceRef colorspace = CGColorSpaceCreateDeviceGray();
    CGContextClipToRect(currentContext, *(CGRect *)&rect);
    CGShadingRef myCGShading = CGShadingCreateAxial(colorspace, CGPointMake(0, 0), CGPointMake(0, NSMaxY(rect)), gradientFunction, 0, 0);
    CGContextDrawShading(currentContext, myCGShading);
    CGShadingRelease(myCGShading);
    CGColorSpaceRelease(colorspace);
    CGContextRestoreGState(currentContext);
}
I moved the startPoint and endPoint CGPoints straight into the CGShadingCreateAxial call. Also, we only ever fill the whole view with the gradient, so just put in 0 instead of using NSMinX and NSMinY. We could replace the CGPointMake(0, 0) with (CGPoint){0,0}, but I’m not sure if that’d be noticeably faster.
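For the curious, the two spellings can be compared in plain C. Point2D, point_make and point_literal are hypothetical stand-ins (CGPointMake itself is a small inline function in the CoreGraphics headers), and this sketch only shows that both forms build the same value. Whether a given compiler emits identical machine code for them is something I haven't measured.

```c
/* Hypothetical stand-in for CGPoint so the sketch builds without CoreGraphics. */
typedef struct Point2D {
    float x;
    float y;
} Point2D;

/* Mirrors the shape of CGPointMake: a small static inline constructor. */
static inline Point2D point_make(float x, float y) {
    Point2D p;
    p.x = x;
    p.y = y;
    return p;
}

/* The same value written as a C99 compound literal. */
static inline Point2D point_literal(float x, float y) {
    return (Point2D){x, y};
}
```

Since both are trivially inlinable, I'd expect any difference to disappear with optimization turned on, but treat that as a guess until you profile it.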
Our code is at a lean 80 lines. Now for the biggest change yet.
Heading back to dear old linearEvaluation. Look at this line:
out[0] = (color1->nextElement->shade - color1->shade)*(*in) + color1->shade;
What’s happening here? Think about it like this:
Essentially, we’re just "graphing" a straight line:
y = mx + c
where x is the position (*in), y is the shade we return (out[0]), c is the initial shade, and m is the slope.
Since our x value only goes from 0 to 1 (see domain in CGFunctionCreate), at x = 0, we will have y = c. At x = 1, we’ll have y = m + c.

So we only need the start shade as our *info value, since we can have m as the difference between the final and initial shades. Let’s effect this in our code.
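Here is the same reasoning as a small C sketch, with hypothetical helper names and made-up shades. It checks that the two-stop form and the slope-intercept form agree once m is set to the difference between the final and initial shades.

```c
/* Two-stop form used so far: interpolate from start to end. */
static float shade_two_stop(float start, float end, float x) {
    return (end - start) * x + start;
}

/* Slope-intercept form: c is the start shade, m is the total brightening. */
static float shade_line(float c, float m, float x) {
    return m * x + c;
}
```

For a start shade of 0.5 and an end shade of 0.75, m is 0.25, and both functions return 0.5 at x = 0, 0.625 at x = 0.5 and 0.75 at x = 1.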
We can get rid of the CTGradientElement struct. Remove from the header:
typedef struct _CTGradientElement {
    float shade;
    struct _CTGradientElement *nextElement;
} CTGradientElement;
Instead of the struct, we’ll use a single shade instance variable. We don’t need the +gradientWithBeginningShade:endingShade: anymore, nor do we need -addElement:. So our interface now looks like this:
@interface CTGradient : NSObject {
    float shade;
    CGFunctionRef gradientFunction;
}
- (void)fillRect:(NSRect)rect;
@end
From the implementation, we can remove the free calls in dealloc as well as the whole of +gradientWithBeginningShade:endingShade: and addElement:. We can set the shade in the init method (and replace &elementList with &shade), and change the linearEvaluation function to:
void linearEvaluation (void *info, const float *in, float *out) {
    out[0] = *(float*)info + (*in)*0.1;
}
The 0.1 is what determines how much brighter the top of the gradient is (it’s the m value of our straight line) compared to the bottom.
But since the (*in)*0.1 will go from 0 to 0.1 (as *in goes from 0 to 1), we can actually change the domain we declare for CGFunctionCreate to {0, 0.1} and have our function as:
void linearEvaluation (void *info, const float *in, float *out) {
    out[0] = *(float*)info + *in;
}
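The arithmetic behind that swap can be checked on its own. The sketch below uses hypothetical helper names and assumes, as described above, that the callback's input really sweeps whatever domain was declared. It says nothing about how Quartz treats the domain internally, so check the rendered result in your own app.

```c
/* Callback body before the change: input in [0, 1], scaled inside. */
static float eval_scaled(float start, float in) {
    return start + in * 0.1f;
}

/* Callback body after the change: input assumed to arrive already in [0, 0.1]. */
static float eval_prescaled(float start, float in) {
    return start + in;
}

/* Absolute difference between the two forms at a fraction t of the way
   through the gradient (t in [0, 1]). */
static float eval_difference(float start, float t) {
    float a = eval_scaled(start, t);
    float b = eval_prescaled(start, t * 0.1f);
    return (a > b) ? (a - b) : (b - a);
}
```

Under that assumption the two versions produce the same shades up to floating-point rounding, and the second saves one multiplication per call.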
You can now add accessors for shade if you need to change the shade of the gradient or you could even have an initWithShade: method. In fact, here’s an Xcode project with the gradient.
We now have our gradient code down from 1300 lines to just over 30 lines.
We’ve almost reached a similar level of optimization to Trimmit’s gradient. There will be further optimizations you can make, but it depends on what you’re using the gradient for. For example, if the gradient you’re drawing is always the same, you can optimize further by removing the shade instance variable and all references to it, and hardcode the value into the linearEvaluation function.
Getting to the end of this article, you’re probably thinking that it would be a bit of a waste to use CTGradient if by the time you optimize it for your application, it doesn’t resemble the original at all. And you’re quite right. The documentation already shows you how to draw gradients, yet the number of applications using CTGradient - the whole 1300 lines of it - is astonishing.
Please: When you use other people’s code, don’t put it in without a thought. Go through it, understand it, and optimize it for your specific need. Your users’ computers will thank you with better performance and reduced RAM usage.
124 Comments so far
I’d be surprised if it actually ran noticeably faster, but it certainly improves readability.
announced by Taybin on November 21, 2007 11:04 am | Permalink
Always skeptics. We went from 1300 lines of code, to under 30. There will definitely be huge gains in performance. We’re no longer passing a struct around with four floats and a pointer, no more create / destroy every time, significantly less computation in the function, etc., etc. (I speak from experience. The memory usage for Trimmit went waaay down after some smart optimizations.)
I think the way CTGradient is formatted, any change would improve readability.
stated by Ankur on November 21, 2007 11:20 am | Permalink
“I think the way CTGradient is formatted, any change would improve readability.”
Yeah, that’s like the worst of all worlds formatting.
uttered by Taybin on November 21, 2007 11:46 am | Permalink
While I agree that this change is more readable, show me a benchmark, a profile, some numbers that there are these “huge gains” in performance that you speak of.
Give us the numbers! The skeptics will always be here without hard evidence. We’re doing computing here, not tweaking of cocktail recipes. Back up what you say with data, and the skeptics will all go away.
Because after all,
“Premature optimization is the root of all evil” — CarHoare
http://c2.com/cgi/wiki?PrematureOptimization
determined by Joe Goh on November 21, 2007 11:59 am | Permalink
Agree that you need to post real benchmarks. This was a great exercise in optimization (well, a great exercise in spatial optimization at least - a poor exercise in performance optimization because you didn’t measure!). I’m glad you posted it, but the fact that you’re letting the benefits float in theoretical land makes the exercise a lot less admirable.
If you can do real performance testing that shows a significant benefit, you’ll not only win more respect from readers, but also convince some people to switch to your code.
The reason the original code is “Just Fine” to most of us? We’re more than happy to give up the memory usage for 1300 lines of code in order to have a simple, drop-in solution for gradient rendering. I like your refined and minimalist expression of the code, and will probably switch to it if there are demonstrable benefits. Without demonstrable benefits? The risk of switching code bases isn’t worth it.
reasoned by Daniel Jalkut on November 21, 2007 12:14 pm | Permalink
Following up … in general I would say it’s better to use code without reviewing it or obsessing over the finer details. After all, we take the pros and cons of framework code from Apple in the MEGABYTES. Unless the code you’re importing can be fixed to the measurable benefit of your application, then it’s not worth reviewing it as long as it satisfies the criteria you imported it for.
proclaimed by Daniel Jalkut on November 21, 2007 12:18 pm | Permalink
Be aware that massively editing someone else’s code means that you’ll have to do it all over again when you update to a newer version of it.
published by Peter Hosey on November 21, 2007 12:38 pm | Permalink
I’m not saying that the CTGradient rewrite here is incorrect or buggy, but rewriting correct code that is already known to be widely used means that we’re risking introducing bugs into our products for no known benefit, excepting readability for us.
A relevant excerpt for this discussion from this great article from Joel On Software (which I highly recommend everyone to read): http://www.joelonsoftware.com/articles/fog0000000069.html
“There’s a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:
It’s harder to read code than to write it.”
“The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they’ve been fixed. There’s nothing wrong with it. It doesn’t acquire bugs just by sitting around on your hard drive. Au contraire, baby! Is software supposed to be like an old Dodge Dart, that rusts just sitting in the garage? Is software like a teddy bear that’s kind of gross if it’s not made out of all new material?
Back to that two page function. Yes, I know, it’s just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I’ll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn’t have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95.
Each of these bugs took weeks of real-world usage before they were found. The programmer might have spent a couple of days reproducing the bug in the lab and fixing it. If it’s like a lot of bugs, the fix might be one line of code, or it might even be a couple of characters, but a lot of work and time went into those two characters.
When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work.”
voiced by Joe Goh on November 21, 2007 12:44 pm | Permalink
Ok, I wrote an app with the full CTGradient, and another with just the gradient.
And making the window fullscreen:
The full CTGradient is expensive. And the “optGrad” app could be optimized even further.
That’s not the point - “my code” is hardly 30 lines. You could write the same thing just by looking at the docs. I’m not trying to put down CTGradient and get people to use my code. I’m trying to tell people that such wastage in a shipping app is not on.
You’re right and I’ve had issues along these lines before with PHP, but it doesn’t apply to this case - CTGradient is 1300 lines of examples and demos. The gradient code taken almost straight from the documentation and optimized is tiny. It’s Apple’s code. You won’t have to “do it all again” when there’s a new version of CTGradient.
I’m talking about 1300 lines of EXAMPLES and DEMONSTRATIONS shipping with dozens - probably more - applications. The gradient code is tiny. It works because it’s from the docs. Have you read the article?
stated by Ankur on November 21, 2007 1:03 pm | Permalink
Awh, not a fan of Whitesmiths?
http://en.wikipedia.org/wiki/Indent_style#Whitesmiths_style
uttered by Chad Weider on November 21, 2007 1:11 pm | Permalink
Sends shivers down my spine
The BSD KNF style seems the most readable to me.
professed by Ankur on November 21, 2007 1:17 pm | Permalink
That’s what that is? I remember seeing it in Icon source code and I wondered where the hell they had gotten that bizarre formatting.
published by Taybin on November 21, 2007 2:14 pm | Permalink
If you use the subversion vendor branch pattern, this isn’t a problem. It’s the only sane way to use 3rd party libraries.
written by Taybin on November 21, 2007 2:15 pm | Permalink
Would you care to share how you got those memory usage numbers for optGrad and CTGradient? I’m running both projects and they are always within about a 0.5 mb of each other. Plus, it’s not really a fair estimate since the apps do slightly different things (CTGradient.app has controls and other code).
expressed by Matt on November 21, 2007 3:24 pm | Permalink
It’s a different optGrad.
Here.
Especially when you resize the windows, CTGradient goes ballistic. At least on my computer.
Wha…?
voiced by Ankur on November 21, 2007 4:03 pm | Permalink
If CTGradient really causes an app to use an additional 10MB of memory … yeah … that’s VERY interesting. That’s your punch line. Not the 30 lines of code.
But let’s see some more analysis on that front. 10 MB? What the hell is it doing?
voiced by Daniel Jalkut on November 21, 2007 4:08 pm | Permalink
Assuming you do have the resources to go over, understand, and change code other people wrote, could you also give a concise summary of what should be done licensing-wise? i.e “Giving back to the community” - do I need to post those 30 lines somewhere? As a separate file? As a method in the original file? Don’t I need to document the limitations I’ve decided upon somewhere?
uttered by uv on November 21, 2007 4:08 pm | Permalink
Daniel, it’s not always that dramatic:
Depends what you do with it, and what the user does with the window.
I suppose that depends on the license the code was released under. In this case our 30 lines can be accomplished straight from the docs, so no issues. But “Giving back to the community”, as you say, is always good.
revealed by Ankur on November 21, 2007 4:16 pm | Permalink
I’d rather spend a day on refining my UI than spending the day optimizing a drawing routine that gets called once in a blue moon. I prefer apps by developers who’d do the same.
As computers get faster, our libraries get higher abstractions. This is not a bad thing.
written by Joachim Bengtsson on November 21, 2007 5:00 pm | Permalink