showlobi.blogg.se

File compression





As the compression program worked through the sentence, it would quickly discover a better choice for a dictionary entry than a short pattern like "ou": not only is "ou" repeated, but the entire words "your" and "country" are both repeated, and they are actually repeated together, as the phrase "your country." In this case, the program would overwrite the dictionary entry for "ou" with the entry for "your country."

The phrase "can do for" is also repeated, one time followed by "your" and one time followed by "you," giving us a repeated pattern of "can do for you." This lets us write 15 characters (including spaces) with one number value, while "your country" only lets us write 13 characters (with spaces) with one number value, so the program would overwrite the "your country" entry as just "r country," and then write a separate entry for "can do for you." The program proceeds in this way, picking up all repeated bits of information and then calculating which patterns it should write to the dictionary. This ability to rewrite the dictionary is the "adaptive" part of the LZ adaptive dictionary-based algorithm. The way a program actually does this is fairly complicated.

No matter what specific method you use, this in-depth searching system lets you compress the file much more efficiently than you could by just picking out words. Using the patterns we picked out above, and adding "_" for spaces, we come up with this larger dictionary:

1 = ask_
2 = what_
3 = you
4 = r_country
5 = _can_do_for_you

And this smaller sentence: "1not_2345_-_12354"

The sentence now takes up 18 units of memory, and our dictionary takes up 41 units. So we've compressed the total file size from 79 units to 59 units! This is just one way of compressing the phrase, and not necessarily the most efficient one.

So how good is this system? The file-reduction ratio depends on a number of factors, including file type, file size and compression scheme. In most languages of the world, certain letters and words often appear together in the same pattern. Because of this high rate of redundancy, text files compress very well.
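To see how the pattern dictionary and the shortened sentence fit together, here is a minimal Python sketch of the expansion step. The entries in `PATTERNS` are an assumption: they are reconstructed from the encoded sentence and Kennedy's phrase, and the names `PATTERNS`, `ENCODED`, `expand` and `dictionary_cost` are made up for this illustration. It only works because the dictionary has fewer than ten entries and the text contains no digits of its own.

```python
# Hypothetical pattern dictionary, reconstructed from the encoded sentence.
PATTERNS = {
    "1": "ask_",
    "2": "what_",
    "3": "you",
    "4": "r_country",
    "5": "_can_do_for_you",
}

ENCODED = "1not_2345_-_12354"


def expand(encoded, patterns):
    """Swap each single-digit code for its pattern; copy everything else as is."""
    return "".join(patterns.get(ch, ch) for ch in encoded)


def dictionary_cost(patterns):
    """Count one unit per character of every code and every pattern."""
    return sum(len(code) + len(pattern) for code, pattern in patterns.items())


if __name__ == "__main__":
    print(expand(ENCODED, PATTERNS))
    print(dictionary_cost(PATTERNS))
```

Running it rebuilds the phrase with "_" standing in for spaces, and the dictionary cost comes out at the 41 units mentioned in the text. A real LZ-style program would not ship a fixed table like this; it would build and rewrite the dictionary as it scans.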
Our compressed sentence (including spaces) takes up 37 units, and the dictionary (words and numbers) also takes up 37 units. This gives us a file size of 74, so we haven't reduced the file size by very much.

But this is only one sentence! You can imagine that if the compression program worked through the rest of Kennedy's speech, it would find these words and others repeated many more times. It would also be rewriting the dictionary to get the most efficient organization possible.

If the compression program scanned Kennedy's phrase, the first redundancy it would come across would be only a couple of letters long. In "ask not what your," there is a repeated pattern of the letter "t" followed by a space - in "not" and "what." If the compression program wrote this to the dictionary, it could write a "1" every time a "t" were followed by a space. But in this short phrase, this pattern doesn't occur enough to make it a worthwhile entry, so the program would eventually overwrite it.

The next thing the program might notice is "ou," which appears in both "your" and "country." If this were a longer document, writing this pattern to the dictionary could save a lot of space - "ou" is a fairly common combination in the English language.
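The pattern hunt described here can be imitated with a few lines of Python. This is only a rough sketch: it counts every substring of one fixed length, which is far cruder than what a real compression program does, and the names `PHRASE` and `repeated_patterns` are invented for the example.

```python
from collections import Counter

PHRASE = "ask not what your country can do for you - ask what you can do for your country"


def repeated_patterns(text, length):
    """Return each substring of the given length that occurs more than once."""
    windows = (text[i:i + length] for i in range(len(text) - length + 1))
    counts = Counter(windows)
    return {pattern: n for pattern, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Short patterns such as "t " and "ou" turn up first...
    two_letter = repeated_patterns(PHRASE, 2)
    print(two_letter["t "], two_letter["ou"])
    # ...but longer repeats save more characters per dictionary entry.
    print(sorted(repeated_patterns(PHRASE, 12)))
```

Trying different lengths shows the trade-off the program has to weigh: short patterns repeat often but save little each time, while long ones such as "your country" repeat rarely but replace many characters at once.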


If you knew the system, you could easily reconstruct the original phrase using only a dictionary of the repeated words (1 = ask, 2 = what, 3 = your, 4 = country, 5 = can, 6 = do, 7 = for, 8 = you) and the number pattern that replaces them. This is what the expansion program on your computer does when it expands a downloaded file. You might also have encountered compressed files that open themselves up. To create this sort of file, the programmer includes a simple expansion program with the compressed file. It automatically reconstructs the original file once it's downloaded.

But how much space have we actually saved with this system? "1 not 2 3 4 5 6 7 8 - 1 2 8 5 6 7 3 4" is certainly shorter than "Ask not what your country can do for you - ask what you can do for your country," but keep in mind that we need to save the dictionary itself along with the file. In an actual compression scheme, figuring out the various file requirements would be fairly complicated, but for our purposes, let's go back to the idea that every character and every space takes up one unit of memory. We already saw that the full phrase takes up 79 units.
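The word-for-number swap and the unit counting can be tried out in a short Python sketch. It rests on a few assumptions: the phrase is lowercased, words are split on single spaces, only words that appear more than once get a number, and the text itself contains no bare numbers. The function names are invented for this illustration and are not taken from any real compression tool.

```python
PHRASE = "ask not what your country can do for you - ask what you can do for your country"


def build_word_dictionary(text):
    """Number every repeated word, in order of first appearance."""
    words = text.split(" ")
    repeated = [w for w in dict.fromkeys(words) if words.count(w) > 1]
    return {word: str(i) for i, word in enumerate(repeated, start=1)}


def compress(text, dictionary):
    """Replace each dictionary word with its number; leave other words alone."""
    return " ".join(dictionary.get(w, w) for w in text.split(" "))


def expand(encoded, dictionary):
    """Reverse the swap: turn each number back into its word."""
    numbers = {num: word for word, num in dictionary.items()}
    return " ".join(numbers.get(tok, tok) for tok in encoded.split(" "))


def dictionary_cost(dictionary):
    """One unit per character of every word and every number."""
    return sum(len(word) + len(num) for word, num in dictionary.items())


if __name__ == "__main__":
    table = build_word_dictionary(PHRASE)
    encoded = compress(PHRASE, table)
    print(encoded)
    print(len(PHRASE), len(encoded), dictionary_cost(table))
    print(expand(encoded, table) == PHRASE)
```

Under these assumptions the sketch reproduces the number pattern from the text, and the counts match the figures quoted: 79 units for the phrase, 37 for the compressed sentence and 37 for the dictionary.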





