|Posted by firstname.lastname@example.org on October 11, 2014 at 4:05 AM|
I mentioned before how blurring is a pretty expensive operation so it got me thinking. For dynamic content, we need to do the full thing, and what we have in Qt Graphical Effects is decent, but for static images there are techniques to do it a lot faster.
We can load images from disk, do a bit of processing, and then be able to dynamically animate from sharp to heavily blurred with close to the same rendering performance as blending a normal image, and with minimal memory overhead. Inspired by articles like this and this, I implemented something I think can be quite useful in Qt Quick. This is done by using custom mipmaps with mipmap bias. A QML module implementing this feature can be found here.
Below are two screenshots from the test application. The blur ratio is adjusted to keep focus on either the droids in the foreground (topmost) or on the walkers in the background (lowermost).
Mipmapping is normally used to improve performance and reduce aliasing in downscaled textures. In our case, we want to use the mipmap levels to render blurred images instead. Look at the following example:
The original image is on the left. Towards the right are the mipmap levels 1, 2 and 3 for this image, generated using glGenerateMipmaps(). The lower row shows the mipmap levels in their original size and the topmost shows the mipmap levels scaled back up to the size of the original image. Using mipmap bias, an optional argument to the texture sampling function, we can pick a higher mipmap level. When combined with GL_LINEAR_MIPMAP_LINEAR, we can interpolate both in x and y and also between the mipmap levels, seamlessly moving from sharp to coarse. However, the default mipmaps give quite blocky results.
The mipmap levels above are created automatically using glGenerateMipmaps(). It is possible to specify each level manually as well. In the following image, I’ve created mipmap levels by scaling the image down with linear sampling (basically 2x2 box sampling), then applied a 3x3 box blur. The process is repeated for each level based on the previous image. It is not a true gaussian blur, but it comes out quite decent.
The result is that the mipmap levels are now blurred instead of blocky. Using these mipmap levels and the mipmap bias, we can now select an arbitrary blurryness with a single texture2D() call in the fragment shader. The resulting performance is equivalent to that of blending a normal image, which means this approach is fully doable on mobile GPUs. The mipmap levels adds about 1/3 to the memory consumption, which is quite acceptable.
Please note that because of the manual downscaling and box blur, the initial texture upload is significantly slower than a plain upload, so it needs be done in controlled places, like a loading screen or during application startup, to avoid hickups in the animation. For a very controlled setup, one could imagine generating all the mipmap levels offline and just upload them in the application. That would save a bit of time, but makes it a bit less flexible. Note also that mipmapping does not mix with a texture atlas, so each image needs to be a separate texture, which means the blurred images do not batch in the scene graph renderer. That means that while we can have several large images being blurred, we cannot have hundreds or thousands of small ones in a scene and expect to sustain a velvet 60 fps.
So, if you want to incorporate animated blurs of static images into your application or game, have a look at BlurredImage.
|Posted by email@example.com on September 20, 2014 at 4:55 AM|
This is a test that I usually perform as a first step when encountering a performance issue on a new machine. It establishes whether or not velvet animations are possible at all. It is super simple and works pretty much everywhere. My equivalent of glxgears, you might say. The idea is to show a fullscreen rectangle and alternate the color between red and blue (or any combo of your choice) as fast as swapBuffers() allows. If the visual output is a shimmering, slightly uncomfortable, almost solid pink, then swapping works as it should.
People having issues with flashing light should avoid this particular test.
Because of a phenomenon we call persistence of vision, the red and blue color will stay in our retina for a fraction of a second after we switch to the alternate color. As long as the color is changed at fast and regular intervals we will observe a solid, shimmering, pink. If the color is changing at uneven intervals, it becomes really obvious. In theory this should be a matter of setting swap interval to 1, but it doesn’t always work. Perhaps more so on Linux.
If there is one or more horizontal lines across the screen, this is an indication of tearing. If the screen flashes with frames in which you can clearly see the red and blue colors, it is because frames are being dropped or because frames are rendered too fast. Check the driver and/or the compositor for vsync settings. Disabling the compositor is another thing that might help. There are a number of things that can be tweaked. Generally, I've found proprietary NVidia and AMD drivers produce solid results out of the box.
If the “QML (via animation system)” test fails and the others pass, you might be looking at a case of timer driver animations. On Linux, we’re using the basic render loop for mesa drivers. Run it again with QSG_INFO=1 in the environment and see if it prints out “basic render loop”. This was chosen some time ago as a sensible fallback in case we didn’t know what we were dealing with. Our primary worry was spinning at 100% when the GL stack was not throttling us, so safer to tick by timers to draw only every 16 ms. As you probably know, ticking animations based on timers does not lead to smooth results, so an alternative is to specify QSG_RENDER_LOOP=windows (yeah, windows… just accept it) in the environment. This render loop will tick animations in sync with buffer swapping and will look much smoother.
|Posted by firstname.lastname@example.org on September 15, 2014 at 4:10 AM|
I wanted to start this out with a definition of what is good enough and why. I already covered some of this some time ago here, but I wanted to tidy it up a bit and have it here for completeness, in a framework agnostic fashion.
In real life, objects will move, accelerate and decelerate, but very few things can teleport. So for our brain to perceive natural motion, objects on the screen need to follow the same rules. Their velocity, acceleration and deceleration needs to be steady. Say an object moves with 1 pixel per millisecond and our display refreshes at 60 Hz (meaning how fast the display is capable of updating the content on screen). For its motion to be steady, the object should move exactly 16.6 pixels per frame and the frames would have to be presented to the viewer at exactly 16.6 ms apart.
Both of these factors are important. If the object moves between 10-22 pixels per frame and is presented at 16.6 ms apart (with the help of vsync) it will look bad. If the objects are moving with 16.6 pixels per frame, but skips every ten frames, it will also look bad. When the application is producing 60 frames per second (fps), it will line up with the capacity of the display and it will be velvet.
- Ok, so my application runs at 50-something fps That is pretty good, isn’t it?
- Sorry, but no..
Assuming a 60 Hz display, if the application runs at 59 fps, this means that once per second, the velvet motion will stop. Some people see this, some don’t, but we all notice it. Put something that runs at 59 next to something that runs at 60 (assuming a refresh rate of 60) and anyone can tell you that 60 fps feels more right.
For almost a full second the application has been tricking your brain into thinking that it is observing objects moving about naturally. Then suddenly, they teleport. The illusion stops... Running close to the display’s frame rate is almost worse, because it is almost there so the shattering of the illusion is that much greater. If you cannot hit exactly 60, aim for a higher swap interval. It does mean that the observant and trained eye will be able to see discrete frames, but at least the motion will be steady. For a 60 Hz display, animations running at 30 fps will look better than animations running at 40 or 59 fps.
- Ok, but my application runs at 79 FPS, isn’t that awesome?
- Nope, equally bad..
Being able to run with a very high frame rate can serve a purpose. It gives an indication as to how much slack there is in the system before things turn sour and go below the display's frame rate. It also gives you an indication as to how high the CPU load will be and what the power consumption is once you fix it to run at the display's frame rate.
However, it does mean that it is not smooth. Either frames are being presented faster than the frame rate of the display, which leads to tearing. Or the frame is simply discarded, which means that the movement in pixels per visible frame is not steady.
Also, in most cases, this number is an average over time. A number of 79 could mean that most frames rendered in 12 ms, but there could be the odd frame rendering in 30 ms, leading to an overall high frame rate, but very uneven movement. Because animations anyway look crap at this frame rate, you don’t notice, so you are actually damaging the end result.
If you want to measure slack, disable the CPU and GPU scaling governors, run the application at exactly the display's frame rate and measure the CPU load and aim for as low as possible. This will let you catch bad single frames while measuring the live sustained performance of the application.
- So, is 60 fps the ultimate framerate?
- No, but it is good enough and often the best option available.
Though there exists higher frame rate devices, the majority of consumer devices, be it mobile phones, tablets, TV sets or laptops, tend to aim for either 50 Hz or 60 Hz, so these are the numbers to work with. When we, some time in the future, standardize on 100 Hz, 120 Hz or higher, the difference will be the animation and motion equivalent to retina displays. In the meantime, we'll have to make the best of what we have.
So, to sum it up:
Aim to run the application at exactly the frame rate of the target display. Not a single frame higher and not a single frame lower. If that is not possible after redesigning for performance and applying all micro optimizations, then aim for a higher swap interval. For instance, by halving the frame rate.