Easy-to-understand description of SIF.
This article explains the essence of the developed technology in a comprehensible manner for those not willing to investigate the patent.
To provide an illustrative and easy-to-understand description of SIF essence and to explain the differences between SIF and other technologies, we will use the allegory of construction.
Let’s suppose we need to write instructions of how to build a house of a very complicated shape with all walls curved and with windows, rooms and doors, which are all different. Of course, such house will hardly resemble a standard home ;), but within the framework of this task, we are interested in the volume of construction documentation but not in the advantages of the house from the architectural point of view.
- We can, for example, build the whole house from bricks, but we will need to describe the position of each brick the house is being constructed of. This will made our documentation enormous ;).
- Another way to approach this task is to decompose the original complicated shape of the house into multiple universal blocks. These blocks will have different shape, but together they will present a set that enables to construct walls of virtually any shape, as the blocks fit each other very well. Such blocks provide for much more compact instructions, but still we will have to solve a complicated task for each wall section: to make the shortest possible description of the exact blocks and their exact order that will ensure the optimal conveyance of the designed wall shape. Thus, we can obtain very compact documentation, but its preparation will be very labor-intensive and ambiguous task.
- Finally, we can apply totally Russian approach to the problem: take large number of heavy blocks of a complicated shape, which do not fit each other accurately, and make a short description of how to build the house of this blocks adding a note for places where the blocks do not fit tight saying “to comply with the required shape, perform filing down”. This will allow obtaining a small volume of documentation written easily and quickly.
Let’s take the above-described house as the allegory of image and documentation writing as the allegory of compression process:
- The first described method represents standard non-adaptive compression algorithms based on DCT or Wavelets. The method is quite good and quick, but it doesn’t allow for large compression ratios.
- The second method is the object-based method of image presentation and its decomposition into various adaptive bases like Matching Pursuits or curvelets. There is a problem with this approach, as image decomposition into objects requires very large volume of calculations. It is not clear what algorithms should be used for everything to work properly. Large money has been spent on the development of efficient object-based compression methods, but things are not moving, as there has not been developed any practical codec effectively decomposing image into objects.
- The last method conveys the SIF essence. When we do not need to care about tight fitting of blocks to each other, we can make compression of images very quick and with the same speed that non-adaptive algorithms of the first type do. The image is decomposed into the set of standard elements made of the description of basic "rough" shape of the block and the description of the technique to “file down” this block in a particular image. Such elements are called “resampling patterns” in the patent, but I simply call them graphic primitives; however, they are not actually graphic primitives, but a something else ;).
The main tasks of SIF project were creation of practicable, quick and universal algorithms for image decomposition into graphic primitives and reverse synthesis resulting in quality image. Both tasks have been solved successfully. Notably, the synthesis algorithm appeared to be so universal that it allows seamless stitching of graphic primitives obtained from frame-to-frame difference with those obtained from the current frame without additional postfiltering; any other algorithm like Matching Pursuit will never be capable of this. By the way, namely the absence of methods to combine intercoded and intracoded sections in one frame makes new effective algorithms of video image compression noncompetitive; SIF is, of course, an exception ;).
As the task of obtaining quality image in SIF is divided into two parts of compression and synthesis, in order to obtain the same quality result, we can improve either the computational complexity of the analytical part of the algorithm or the complexity of its synthesis part. For example, first variants of SIF compression engine had the reverse symmetry, i.e. unpacking was slower that compression. The current engine’s symmetry is about one, but as it was said before, nothing prevents us from developing special engine for particular tasks.
The class of SIF-like algorithms is very extensive. Depending on the number of graphic primitives used, particular implementation of SIF can approach either Wavelet-like codecs or purely object-based algorithms. Implementation of SIF with three graphic primitives can be considered approximately equal to non-adaptive Wavelet transformation that uses quantizing in the form of enclosed zerotree. Thus, three-element SIF transformation compresses slightly worse than JPEG2000 with its more qualitative bitplane quantizer. Notably, the current compression engine uses only seven graphic primitives, which allows obtaining quite high, though not record-breaking, compression efficiency. The thing is that the main task solved by creation of the current compression engine was procession of universal algorithms of image decomposition and synthesis using graphic primitives, but the increase in the number of graphic primitives leaded to increased code volume and delayed primary task solving.
Thus, current SIF-core6 compression engine is nothing more than a prototype of record-breaking engines of the next generation. According to my estimates, the optimal number of graphic primitives for SIF-like algorithms is within the range from 32 to 128, and the compression efficiency of such engines shall substantially exceed that of the current one.
Another method to increase compression efficiency of SIF-like algorithms is improvement of the principles for graphic primitives adjustment in the process of compression. The developed universal synthesis algorithm narrows the image recovery task to the well worked-out task of effective image interpolation with equally distributed references. To solve this task, many adaptive algorithms of high quality have been developed, which surpass substantially the interpolation algorithm used in the current compression core.
Another very useful feature of SIF-like algorithms is their extreme flexibility and adaptiveness. SIF, like object-based algorithms, decomposes input image into “geometry” and “textures”. The method of obtaining "geometry" is not transferred to output. This allows the usage of extremely flexible and adaptive psychovisual models. When applying it, you can set your compression threshold for any of the graphic primitives the image is decomposed into, as any additional data is transferred to the compressed image. For the current engine, minimum possible adaptiveness range is 2x2 pixels! The psychovisual model is directly built into the engine that decomposes image into graphic primitives; therefore, it is virtually free from the point of view of computational complexity of the algorithm. That is why SIF allows implementation of psychovisual models that cannot be implemented in any other compression algorithms, which is demonstrated by the current video codec.
Thus, the amount of useful features makes SIF-like algorithms most promising for creation of the next generation codecs.
|