May 5, 2024 · 1 min read · links

Link: Better Siri is coming: what Apple’s research says about its AI plans

In a paper called “LLM in a flash: Efficient Large Language Model Inference with Limited Memory” (all these papers have really boring titles but are really interesting, I promise!), researchers devised a system for storing a model’s data, which is usually stored on your device’s RAM, on the SSD instead. “We have demonstrated the ability to run LLMs up to twice the size of available DRAM [on the SSD],” the researchers wrote, “achieving an acceleration in inference speed by 4-5x compared to traditional loading methods in CPU, and 20-25x in GPU.” By taking advantage of the most inexpensive and available storage on your device, they found, the models can run faster and more efficiently.
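To make the "keep it on the SSD, pull in only what you need" idea a bit more concrete, here is a minimal Python sketch using NumPy's memory-mapped files. To be clear, this is not the paper's method, just a rough illustration of reading part of a weight matrix from disk instead of loading all of it into RAM. The matrix size, the file name, and the `active_rows` list are all made up for the example.

```python
import os
import tempfile

import numpy as np

# Hypothetical sizes, chosen only for illustration.
ROWS, COLS = 1024, 256


def write_weights(path):
    """Write a fake weight matrix to disk, standing in for model data on the SSD."""
    rng = np.random.default_rng(0)
    weights = rng.standard_normal((ROWS, COLS)).astype(np.float32)
    weights.tofile(path)
    return weights


def load_rows(path, row_ids):
    """Read only the requested rows from disk rather than the whole matrix."""
    mapped = np.memmap(path, dtype=np.float32, mode="r", shape=(ROWS, COLS))
    # Indexing with a list copies just these rows into ordinary memory.
    return np.array(mapped[row_ids])


tmp_dir = tempfile.mkdtemp()
weights_path = os.path.join(tmp_dir, "weights.bin")  # hypothetical file name
full = write_weights(weights_path)

active_rows = [3, 17, 512]  # assumed: the rows needed for the current step
subset = load_rows(weights_path, active_rows)
print(subset.shape)
```

Only the three requested rows end up as a regular in-memory array; the rest of the file stays on disk until something asks for it.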

--

Yoooo, this is a quick note on a link that made me go, WTF? Find all past links here.
