Google claims CALM speeds up natural language processing by up to 3X with no loss in accuracy
Google has introduced a new method for speeding up natural-language processing by large language models, which it calls Confident Adaptive Language Modeling (CALM).
This approach differs from earlier ones in that it allocates computational effort according to the model's own estimate of how difficult each output is to predict.
Before CALM, a language model processing a text input would first encode the input word by word, then decode the output word by word, passing each one sequentially through every layer of its transformer stack. CALM speeds this process up by skipping the remaining layers as soon as the model has sufficient statistical confidence in what the output will be.
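The early-exit idea can be illustrated with a toy sketch. Everything here is illustrative: `layer_transform`, `confidence`, and the 0.9 threshold are made-up stand-ins, not Google's actual CALM implementation, which uses real transformer layers and calibrated confidence measures.

```python
# Toy sketch of confidence-based early exit across decoder layers.
# All names, numbers, and thresholds are illustrative assumptions,
# not the actual CALM code.

def layer_transform(state, layer_idx):
    """Stand-in for one decoder layer: nudges the state toward a prediction."""
    return [x + 0.1 * (layer_idx + 1) for x in state]

def confidence(state):
    """Toy confidence score: mean activation, capped at 1.0."""
    return min(1.0, sum(state) / len(state))

def decode_token(state, num_layers=12, threshold=0.9):
    """Run decoder layers, exiting early once confidence clears the threshold."""
    layers_used = 0
    for i in range(num_layers):
        state = layer_transform(state, i)
        layers_used += 1
        if confidence(state) >= threshold:
            break  # skip the remaining layers for this "easy" token
    return state, layers_used

# An "easy" token (state already near-confident) exits after one layer;
# a "hard" token (low initial confidence) needs several.
_, easy_layers = decode_token([0.8, 0.8])
_, hard_layers = decode_token([0.0, 0.0])
print(easy_layers, hard_layers)  # → 1 4
```

The key design point is that the exit decision is made per token, so easy predictions get cheap, shallow passes while hard ones still receive the full stack.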
In tests on three large datasets, Google researchers found that CALM matched the full performance of the underlying language models while using only one-third to one-half of the transformer layers on average, dynamically distributing computation so that full power is spent only on the hardest predictions and minimal power on the simplest.
Researchers conclude that CALM succeeds in speeding up the generation of text output by large language models without reducing output quality.
Note, however, that this capability comes at a cost: maintaining consistent processing time requires provisioning more peak computational power, since the hardest predictions still demand the full stack.
Will Google incorporate CALM into "live" natural-language processing anytime soon? Stay tuned.