Apple Makes AI Run on Your Phone with Smart Flash Memory Hack

Artificial intelligence (AI) has been one of the hottest topics of the year, with applications such as ChatGPT gaining widespread attention. These applications are powered by large language models (LLMs). Although Apple has yet to ship a headline AI feature of its own, recent research reveals the company's substantial efforts in this field.

As reported by TechPowerUp, Apple researchers have developed a novel technique that stores an AI model's parameters in flash memory and loads them into RAM only as needed. This innovation enables devices with limited memory, such as iPhones, to run large language models, which normally demand dedicated accelerators and far more memory than a phone can offer. Apple's new method aims to close that gap.
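The core idea, loosely sketched below, is that weights live on flash and only a bounded working set is kept in fast memory. This is a hypothetical illustration, not Apple's implementation: `FLASH` is a plain dict standing in for slow storage, and `DramCache` is an invented name for a simple LRU-bounded resident set.

```python
from collections import OrderedDict

# Hypothetical: 100 weight rows "on flash" (a dict stands in for slow storage).
FLASH = {i: [float(i)] * 4 for i in range(100)}

class DramCache:
    """Keeps at most `capacity` weight rows in fast memory, LRU-evicted."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = OrderedDict()          # row id -> weights, in LRU order

    def get(self, row_id):
        if row_id in self.rows:
            self.rows.move_to_end(row_id)  # already resident: no flash read
            return self.rows[row_id]
        weights = FLASH[row_id]            # simulated (slow) flash read
        self.rows[row_id] = weights
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)  # evict least recently used row
        return weights

cache = DramCache(capacity=8)
for rid in [1, 2, 3, 1, 2]:                # repeated rows are served from cache
    cache.get(rid)
print(len(cache.rows))                     # 3 distinct rows resident
```

The point of the bound is that the model's total size can exceed fast memory as long as the rows needed at any moment fit within `capacity`.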

Apple's researchers introduce two key techniques: ‘Windowing,’ in which the model reuses data it has already loaded for recent computations, cutting the number of flash reads; and ‘Row-Column Bundling,’ which groups related data into larger contiguous chunks so each flash read retrieves more useful data at once, speeding up processing.
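The two techniques can be sketched in a few lines. This is an illustrative simplification under assumed names (`window_delta`, `bundle` are invented for this example), not Apple's actual code: windowing means only newly needed data is fetched from flash each step, and bundling means two pieces of data that are always used together are stored as one contiguous record.

```python
def window_delta(resident, active_now):
    """Windowing: neurons already resident from recent steps are reused;
    only the newly active ones trigger a flash read."""
    return active_now - resident

def bundle(up_row, down_col):
    """Row-column bundling: store two always-paired chunks contiguously,
    so a single sequential flash read returns both."""
    return up_row + down_col

resident = {3, 5, 8}                           # loaded for recent computations
active = {5, 8, 13}                            # needed for the current step
print(sorted(window_delta(resident, active)))  # only neuron 13 is read: [13]

record = bundle([0.1, 0.2], [0.3, 0.4])
print(len(record))                             # one read yields both halves: 4
```

Fewer, larger reads matter because flash throughput is dominated by access latency: one 8 KB read is far cheaper than many scattered 1 KB reads.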

Apple claims the technique lets a device run AI models up to twice the size of its available memory. On an M1 Max, inference speed for large language models could increase by 4-5 times on the CPU and 20-25 times on the GPU compared with naive loading, significantly expanding where such models can run.