Whoa.

Training Day

A new Amazon AI model, according to the researchers who built it, is exhibiting language abilities that it wasn't trained on.

In a not-yet-peer-reviewed academic paper, the team at Amazon AGI (which stands for "artificial general intelligence," or human-level AI) says its large language model (LLM)-style text-to-speech system exhibits "state-of-the-art naturalness" on conversational text. Per the examples shared in the paper, the model does seem sophisticated.

As the paper indicates, the model was able to speak all sorts of tricky sentences in ways that, according to criteria crafted with the help of an "expert linguist," showed it was making the types of language leaps that come naturally to human language learners but have been difficult to obtain in AI.

Named "Big Adaptive Streamable TTS with Emergent abilities" or BASE TTS, the initial model was trained on 100,000 hours of "public domain speech data," 90 percent in English, to teach it how Americans talk. To test out how large models would need to be to show "emergent abilities," or abilities they were not trained on, the Amazon AGI team trained two smaller models, one on 1,000 hours of speech data and another on 10,000, to see which of the three — if any — exhibited the type of language naturalness they were looking for.

Interestingly enough, it was the 10,000-hour model — the Goldilocks of the three, if you will — that scored highest on the Amazon researchers' emergent abilities criteria list, which included things like the ability to understand punctuation, non-English words, and emotions.
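To make that comparison concrete, here's a minimal sketch, using assumed names and a generic text-to-speech interface, of how a checklist like that could be scored across the three model variants; it's an illustration of the setup described above, not the paper's actual evaluation code.

```python
# Hypothetical sketch only: the model interface, category names, and the
# scoring stub below are assumptions for illustration, not code or criteria
# from the Amazon paper (where scoring was done by human evaluators against
# linguist-designed criteria).

CATEGORIES = ("punctuation", "foreign_words", "emotions")

def judge(audio_clip, category):
    """Stand-in for a human judgment of one utterance (0 = failed, 1 = natural)."""
    return 0.0  # replaced by expert evaluation in the actual study

def evaluate(model, test_items):
    """Average the per-category judgments for one model variant.

    `test_items` is a list of (sentence, category) pairs, and `model.synthesize`
    is an assumed text-to-speech call that returns audio for a sentence.
    """
    totals = {c: [0.0, 0] for c in CATEGORIES}
    for sentence, category in test_items:
        clip = model.synthesize(sentence)
        totals[category][0] += judge(clip, category)
        totals[category][1] += 1
    return {c: s / n for c, (s, n) in totals.items() if n}

# The same evaluation would be run for each variant (1,000-, 10,000-, and
# 100,000-hour training sets) and the per-category averages compared.
```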

Speak Now

The middle model delivered sentences that sound remarkably natural to human listeners, exhibiting the ability to voice non-words ("'Shh, Lucy, shhh, we mustn't wake your baby brother,' Tom whispered, as they tiptoed past the nursery.") and even the kind of internetspeak many netizens use in text messages and spoken language alike ("She received an odd text from her brother: 'Emergency @ home; call ASAP! Mom & Dad are worried…#familymatters.'").

In the paper, whose international team of authors includes 18 AI experts, the Amazon AGI consortium pointed out that BASE TTS was never "explicitly" trained to pull off its more surprising feats.

"These sentences are designed to contain challenging tasks — parsing garden-path sentences, placing phrasal stress on long-winded compound nouns, producing emotional or whispered speech, or producing the correct phonemes for foreign words like “qi” or punctuations like “@” — none of which BASE TTS is explicitly trained to perform," the paper reads.

It's not AGI, of course, but these findings could still have implications for the path toward that goal, especially if such abilities don't end up requiring a gigantic set of training data to emerge.

More on AI leaps: AI Used to Resurrect Dead Dictator to Sway Election

