Jan Marco, you ask me what I actually enjoy most. You didn't? OK, but now that we're on the subject: like the writer, behavioural biologist and ex-dress-wearer in that Maartens Moestuin series, I'm fine with anything as long as I can potter about in my little digital garden, now and then wearing a different hat. Today: internet chard, on heavy sea clay?
I do have to get you translated first, though; it looks a bit awkward:
Hi Weatherman,
Did you like this have a seen a vast cemetery for deceased P2P search software initiatives.
Did not know of the existence of this list. It is good that people have tried something. : Slight_smile:Not everything is a success! May also be that something was working depth has become time “corny”. Like the V&D.
And so on:
The cordial greeting Jan Marco
Compared with your original contribution in our own language, this yields an improvement of almost 10 percentage points:
                 html   brotli   ratio
jm 29-6-2016     2421     1138   47.0%
same, English    2263      852   37.6%
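The percentages in the table are simply the brotli size divided by the html size. A minimal sketch to check them, using only the four numbers from the table; the function name `ratio_percent` is my own and not part of any tool:

```python
def ratio_percent(original_bytes, compressed_bytes):
    """Compressed size as a percentage of the original size, one decimal."""
    return round(100.0 * compressed_bytes / original_bytes, 1)

# Sizes taken from the table above.
dutch = ratio_percent(2421, 1138)
english = ratio_percent(2263, 852)

# Gain of the English version, in percentage points.
gain = round(dutch - english, 1)

print(dutch, english, gain)  # 47.0 37.6 9.4
```

So "almost 10" is really 9.4 percentage points.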
Sizes are in bytes. Like zopfli, sweet rolls, brotli is something Swiss made of puff pastry:
Introducing Brotli: a new compression algorithm for the internet
At Google, we think that internet users’ time is valuable, and that they shouldn’t have to wait long for a web page to load. Because fast is better than slow, two years ago we published the Zopfli compression algorithm. This received such positive feedback in the industry that it has been integrated into many compression solutions, ranging from PNG optimizers to preprocessing web content.
Based on its use and other modern compression needs, such as web font compression, today we are excited to announce that we have developed and open sourced a new algorithm, the Brotli compression algorithm.
Google - Open Source Blog - Tuesday, September 22, 2015
There you have it: Google translated you above and can then compress you even better than in Dutch.
Thanks to the rough cross-section of web pages that Google naturally sees passing by:
Static dictionary
Brotli also features a static dictionary. Unlike most general purpose compression algorithms, Brotli uses a pre-defined 120 kilobyte dictionary. The dictionary contains over 13000 common words, phrases and other substrings derived from a large corpus of text and HTML documents.
It contains 13,504 words or syllables of English, Spanish, Chinese, Hindi, Russian, and Arabic, as well as common phrases used in machine readable languages, particularly HTML and JavaScript. The total size of the static dictionary is 122,784 bytes. The static dictionary is extended by a mechanism of transforms that slightly change the words in the dictionary. A total of 1,633,984 sequences, although not all of them unique, can be constructed by using the 121 transforms.
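The figures in the quote fit together with simple multiplication. A small check, using only the numbers quoted above (the variable names are mine):

```python
WORDS = 13_504        # entries in the static dictionary, as quoted
TRANSFORMS = 121      # transforms applied to each entry, as quoted
DICT_BYTES = 122_784  # total dictionary size in bytes, as quoted

# Every entry can be combined with every transform.
sequences = WORDS * TRANSFORMS

# The "120 kilobyte" in the first quote, expressed in units of 1024 bytes.
dict_kib = round(DICT_BYTES / 1024, 1)

print(sequences, dict_kib)  # 1633984 119.9
```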
Static dictionary:
Internet-Draft Brotli May 2015
The hexadecimal form of the DICT array is the following, where the
length is 122,784 bytes and the zlib CRC-32 of the byte sequence is
0x5136cb04.
74696d65646f776e6c6966656c6566746261636b636f64656461746173686f77
6f6e6c7973697465636974796f70656e6a7573746c696b6566726565776f726b
74657874796561726f766572626f64796c6f7665666f726d626f6f6b706c6179
6c6976656c696e6568656c70686f6d65736964656d6f7265776f72646c6f6e67
7468656d7669657766696e64706167656461797366756c6c686561647465726d
656163686172656166726f6d747275656d61726b61626c6575706f6e68696768
646174656c616e646e6577736576656e6e65787463617365626f7468706f7374
757365646d61646568616e6468657265776861746e616d654c696e6b626c6f67
...
In readable form:
timedownlifeleftbackcodedatashow
onlysitecityopenjustlikefreework
textyearoverbodyloveformbookplay
livelinehelphomesidemorewordlong
themviewfindpagedaysfullheadterm
eachareafromtruemarkableuponhigh
datelandnewsevennextcasebothpost
usedmadehandherewhatnameLinkblog
...
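The readable form is nothing more than the hex digits decoded as ASCII. A minimal sketch with the first two lines of the dump; cutting each line into four-letter words is my own assumption, based on how this start of the listing looks:

```python
# First two lines of the hex dump quoted above.
HEX_LINES = [
    "74696d65646f776e6c6966656c6566746261636b636f64656461746173686f77",
    "6f6e6c7973697465636974796f70656e6a7573746c696b6566726565776f726b",
]

def readable(hex_line):
    """Decode one line of hex digits into ASCII text."""
    return bytes.fromhex(hex_line).decode("ascii")

def four_letter_words(text):
    """Cut the text into chunks of four characters (assumed word length here)."""
    return [text[i:i + 4] for i in range(0, len(text), 4)]

decoded = [readable(h) for h in HEX_LINES]
words = four_letter_words(decoded[0])

for line in decoded:
    print(line)
print(words)
```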
Further down the list the individual entries get longer and are also made up of more than one word:
stated is only discussion of
top">< search/ middle of the
racing tuesday an individual
resize loosely difficult to
--> Solomon point of view
pacity sexual homosexuality
sexual - <a hr acceptance of
bureau medium" </span></div>
.jpg" DO NOT manufacturers
10,000 France, origin of the
obtain with a commonly used
titles war and importance of
... ... ...
It is a statistical selection, of course, but as you walk through the list your imagination keeps running wild: not “god” but “goddess”, on which pages then?
A quick recap of the whole idea of using a word list for compression:
Improving compression with a preset dictionary
For example almost all HTML files start with the string "<!doctype html><html ", however in this string only the second HTML will be replaced with a match, and the rest of the string will remain uncompressed. To solve this problem the deflate dictionary effectively acts as an initial back reference for possible matches.
So if we add the aforementioned string "<!doctype html><html " to the dictionary, the algorithm will be able to match it from the start, improving the compression ratio. And there are many more such strings that are used in any HTML page, which we can put in the dictionary to improve compression ratio.
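The preset dictionary idea from the quote can be tried with plain deflate via the `zdict` argument of Python's `zlib` module. This is only a sketch: the dictionary `PRESET` and the sample `page` are made up for the illustration, not Brotli's dictionary and not the one from the quoted article.

```python
import zlib

# Hypothetical preset dictionary: boilerplate that many HTML pages share.
PRESET = (b'<!doctype html><html lang="en"><head><meta charset="utf-8">'
          b'<title></title></head><body>')

def deflate(data, zdict=None):
    """Compress with zlib, optionally primed with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()

def inflate(blob, zdict=None):
    """Decompress; the same dictionary must be supplied if one was used."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()

# Made-up sample page that starts with the shared boilerplate.
page = (b'<!doctype html><html lang="en"><head><meta charset="utf-8">'
        b'<title>Hi</title></head><body>Hello</body></html>')

plain = deflate(page)
primed = deflate(page, PRESET)

print(len(page), len(plain), len(primed))
```

On a page this short, plain deflate has almost nothing to refer back to, while the primed version can match the opening boilerplate right from the first byte. The catch is that both sides must already hold the same dictionary, which is why Brotli ships its own as part of the format.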
How Google's proposal was received - by Google:
The current state of Brotli compression
In late May 2016 Chrome pushed out Chrome 51, unlike many releases of Chrome which are complete non-events, this release has an enormous impact. Google turned on Brotli support – and they promptly backported it into Chrome 50.
Firefox added support for Brotli in September 2015. 8 months later, thanks to Google, Brotli went from a compression format supported in less than 10% of global browsers to nearly 50% global adoption!
And how Google’s Brotli initiative goes down at Slashdot:
Google Launches Brotli, a New Open Source Compression Algorithm For the Web
-
If they want to make webpages load quicker, remove ads.
-
Stop making my browser run 500 trips to DNS in order to run 500 trips to every ad server in the world.
-
And lossless too? I’d prefer if they lost the ads, then the compression wouldn’t be needed.
-
This is not about speed, this is about GOOGLE’s bandwidth. Because they process so many transactions a second, they see cost savings even for small improvements.