THE LATEST NEWS
EnCharge Picks The PC For Its First Analog AI Chip

Analog AI accelerator startup EnCharge AI announced its first product, the 200-TOPS (INT8) EN100 AI accelerator designed for laptops, workstations and other client devices. The device is based on EnCharge's capacitor-based analog compute-in-memory technology, which the company says can achieve power efficiency above 40 TOPS/W.

In-memory computing is essential for AI because it is one of the few technologies that can combine efficient math acceleration with efficient data movement, EnCharge CEO Naveen Verma told EE Times.

“AI is a two-sided problem,” Verma said. “On the one hand, we have a very large number of operations and so we need high efficiency for the math. On the other hand, these operations involve a lot of data and moving that data around becomes a big limitation. It turns out in-memory computing is one of the few architectures that has the potential to address both of these problems simultaneously.”

Analog computing

Analog computing is not a new idea, but the emergence of math-heavy AI workloads in recent years has prompted several startups to build new architectures based on some of the same concepts. In general, the basic operations of multiply and add are performed within a memory array. A memory cell stores a weight, acting as a variable resistor whose conductance is proportional to the weight value.

Data is encoded as a voltage, which, when applied to the memory cell, effectively multiplies the data value by the weight value. Output wires are joined together so that the currents sum, providing a simple form of addition. This is a very low-energy way to perform multiply and add, the two operations that make up matrix multiplication, which forms the bulk of AI workloads. Performing the computation in the memory—where the weights are already stored—also means less data needs to move, which saves further energy.
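The multiply-and-add scheme described above can be modeled in a few lines. This is a toy numerical sketch, not EnCharge's circuit: each cell passes a current I = V × G (Ohm's law), and tying the cells to a shared output wire sums the currents (Kirchhoff's current law), yielding a dot product.

```python
# Toy model of an analog compute-in-memory dot product.
# Weights are stored as (signed, illustrative) conductances G;
# inputs are encoded as voltages V. Each cell contributes a
# current I = V * G, and the shared output wire sums them.

def analog_mac(voltages, conductances):
    """Sum of per-cell currents on a shared output wire."""
    return sum(v * g for v, g in zip(voltages, conductances))

weights = [0.5, -1.0, 2.0]   # weight values encoded as conductances
inputs  = [1.0, 0.25, 0.5]   # data values encoded as voltages

print(analog_mac(inputs, weights))  # 0.5 - 0.25 + 1.0 = 1.25
```

One output wire per weight column turns this into a full matrix-vector multiply, which is why the memory array itself acts as the matrix engine.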

Other companies’ analog compute schemes have had various levels of success over the years. Mythic uses an array of Flash memory cells as a matrix multiply accelerator, for example, but this requires complex calibration algorithms for process and temperature variations that can reduce precision. Other types of memory can be used; Tetramem uses RRAM in its memory array. D-Matrix uses modified SRAM for analog multiply combined with digital addition in its scheme to get around problems with precision and accuracy in all-analog designs.

Naveen Verma (Source: EnCharge)

“While analog has the potential to give us orders of magnitude energy efficiency [advantages], the problem is we don’t build analog compute chips because it’s noisy, and it doesn’t scale,” Verma said. “The problem is semiconductor devices are variable with things like temperature and manufacturing process, so currents from these devices, which you’re trying to add up for accumulation or data reduction, these things become very noisy.”

Verma’s lab at Princeton came up with a way of getting around the noise problem. Instead of using current through semiconductor devices, EnCharge instead uses charge on capacitors, generating charge and coupling capacitors together for an accumulate result.

“That’s where the really critical piece comes in. These capacitors are basically just metal wires, and we have those in any foundry semiconductor technology,” Verma said, referring to the interconnect layers that typically sit above the transistors in a logic or memory design. An added benefit is that using the metal layers means the compute-in-memory portion does not take up any extra silicon area.

“What’s really important is they don’t vary with temperature, they don’t have any material parameter dependencies, they are perfectly linear, they only depend on geometry,” Verma said. “And it turns out that geometry is the thing that we can control very well in CMOS.”

Capacitors have very low temperature coefficients; that is, their capacitance does not vary much with temperature, Verma said. This is because, in contrast to semiconductor materials—where flowing charge is impeded by the thermal motion of the material’s atoms, an effect that grows at higher temperatures—the permittivity of dielectric materials is essentially independent of temperature.
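The charge-domain accumulation Verma describes can be sketched numerically. This is an illustrative model only, not the actual EnCharge circuit: each cell drives a product value as a voltage onto an identical metal-wire capacitor, and coupling the capacitors shares their charge, so the shared node settles at the average of the per-cell voltages—a scaled sum that depends only on the capacitor geometry.

```python
# Toy model of charge-domain accumulation (illustrative only).
# All capacitors are identical, their value set by wire geometry
# rather than material properties, so the result is temperature-
# insensitive. Sharing charge across N equal caps gives the
# average of the driven voltages, i.e. the sum scaled by 1/N.

def charge_accumulate(products, cap=1e-15):
    """Shared-node voltage after coupling N equal capacitors."""
    n = len(products)
    total_charge = sum(cap * v for v in products)  # Q_i = C * v_i
    return total_charge / (n * cap)                # V = Q_total / C_total

vals = [1.0, 0.5, -0.25, 0.75]
print(charge_accumulate(vals))  # (2.0 / 4) = 0.5
```

Because the 1/N scaling is a known constant, recovering the true sum downstream is a fixed digital multiply.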

EnCharge’s metal-layer capacitors do not need the fine feature resolution that transistors do, but CMOS’s geometric precision helps minimize noise, Verma said. The EN100 is built on a 16-nm process, but EnCharge’s internal test chips have moved beyond that node, he said, showing the technology scales with process technology.

EnCharge’s EN100 on an M.2 card (Source: EnCharge)

Memory cell design

Other analog compute schemes have suffered high energy penalties when converting into and out of the analog domain on either side of the memory array, with large numbers of DACs and ADCs. Verma said a key differentiator of EnCharge’s capacitor-based technology is that the output is a voltage signal, not a current. This removes the need for transimpedance amplifiers to convert current to voltage, so power- and area-efficient successive-approximation-register (SAR) ADCs can be used. Transimpedance amplifiers would consume three or four times the energy of the ADC itself, Verma said.

“These ADCs are in the order of 20% of the area of the memory array, so they’re not big overhead in terms of area and even less in terms of energy, maybe 15 to 18%,” he said. “This has been a critical aspect in preserving the promise of analog, which in many cases gets lost when you start to build full systems.”

EnCharge uses an SRAM cell of its own design, which is slightly modified from a standard foundry SRAM cell to give it the ability to control the capacitor.

“There’s a lot of IP around how you build that cell in a way that remains dense, and how you co-design it with this capacitor structure in a way that preserves that fundamental intrinsic accuracy—also, in full-size, practical arrays,” Verma said. 

When it comes down to it, the capacitor part is the easy part, Verma said—it is making the rest of the architecture as efficient as the analog accelerator, and developing the software stack, that has kept the company busy for seven or eight years.

“Being honest, it’s great to have a spark and a big innovation like we did on the analog computing, but the real work has been in getting the architecture and the software [in place],” he said.

The EN100’s analog accelerator supports 8- and 4-bit precision. For layers or operators that require higher precision, or floating point, there are digital engines on-chip. EnCharge’s compiler maps the workload across the different engines.  
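A compiler that maps a workload across mixed engines can be sketched as a simple partitioning pass. The policy below is hypothetical—EnCharge's actual compiler heuristics are not public—but it illustrates the idea of routing 8- and 4-bit operators to the analog engine and higher-precision or floating-point operators to the digital engines.

```python
# Hypothetical sketch of precision-based operator partitioning,
# in the spirit of a mixed analog/digital compiler pass. The
# engine names and policy here are illustrative assumptions,
# not EnCharge's real mapping rules.

ANALOG_PRECISIONS = {"int8", "int4"}  # supported by the analog engine

def assign_engine(op):
    """Route low-precision ops to analog CIM, the rest to digital."""
    return "analog_cim" if op["precision"] in ANALOG_PRECISIONS else "digital"

graph = [
    {"name": "conv1",   "precision": "int8"},
    {"name": "softmax", "precision": "fp16"},  # needs floating point
    {"name": "matmul2", "precision": "int4"},
]

for op in graph:
    print(op["name"], "->", assign_engine(op))
```

In practice such a pass would also have to account for the cost of moving activations between engines, not just per-operator precision.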

EN100 also comes on a four-chip PCIe card (Source: EnCharge)

Energy efficiency

Many applications could use energy efficiency above 40 TOPS/W, but EnCharge has decided to focus on AI in the PC as a first market.

“We want to be able to keep our eye on a big market opportunity where we have a very concentrated and very critical value proposition,” Verma said. “If that is your North Star, then this market is very aggressively emerging and we believe it needs us, and we want to be there to support it.”

Verma said OEMs want to enable personalized or specialized models to be deployed locally on a user’s PC for compliance and security reasons. For example, 200 TOPS can enable more capable models than today’s Copilot-enabled laptops, which require 40 TOPS of acceleration.  

“Doing that locally means doing it under very severe power and space constraints,” Verma said. “That’s where an energy efficiency value proposition like [the one] EnCharge can bring starts to have real traction, and so that’s the path we’re following.”

Current AI accelerators for the PC sit at around 40 TOPS; EnCharge’s 200 TOPS marks an inflection point for client systems, Verma said.

“If you’re running 1-2 billion parameter models, the models are OK, but when you get to 5, 10 or 15 billion parameters, all of a sudden they dramatically increase in capability,” he said. “Things like multimodality becomes accessible, reasoning models become accessible. The aim is for EnCharge to enable these kinds of models to become accessible on the device.”

EnCharge is partnering with laptop and client-platform OEMs, which in turn drives partnerships with ODMs. Consumer and client OEMs tend to rely on particular ODMs because the volumes they need require specialized design to manage complexity, validation and qualification, Verma said. The company is already engaged with ISVs for testing, but these engagements will move to a new phase going forward, he said.

EnCharge has a range of models up and running on the EN100, including CNNs, language and vision transformers, and encoder/decoder models, largely prioritized by partner requests, Verma said.

The EN100 will be available on a single-chip M.2 card with 32 GB LPDDR, with a power envelope of 8.25 W. A four-chip, half-height, half-length PCIe card offers up to 1 POPS (INT8) in a 40 W power envelope, with 128 GB LPDDR memory.

Strategic customers will receive samples later this year.

From EETimes
