Tenstorrent says its Galaxy Blackhole servers can challenge NVIDIA in AI inference

Tenstorrent is making big claims for its new Galaxy Blackhole AI servers. The company says the system can deliver very high inference performance at a lower total cost than current GPU-based platforms.

The hardware is built around Tenstorrent’s Blackhole chips, which use a RISC-V-based design. Each chip contains Tensix cores with programmable RISC processors, matrix units, vector units, local SRAM, and high-bandwidth networking. The goal is a system in which compute, memory, and networking are designed together for large AI workloads.

Galaxy Blackhole is aimed at faster and cheaper AI inference

During its TT Deploy livestream, Tenstorrent showed several demos meant to prove that its Galaxy systems can handle modern AI workloads at scale.

The strongest claim involved DeepSeek R1 671B. Running in what Tenstorrent calls Blitz Mode, 16 Galaxy servers reportedly reached more than 350 tokens per second per user in decode. The same setup also showed a time to first token of under four seconds with a 100K-token context.

Galaxy Blackhole claim | Reported result
DeepSeek R1 671B decode | More than 350 tokens per second per user
DeepSeek R1 671B prefill | Under four seconds time to first token at 100K context
Video generation | 81-frame 720p video in 2.4 seconds
Entry server price | Starts at $110,000
Four-server cluster | Starts at $440,000

Tenstorrent also demonstrated video generation, saying its Galaxy Supercluster can generate an 81-frame 720p video in 2.4 seconds. In that demo, the system produced a roughly five-second clip in less than half its playback time, which is faster than real time.
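The faster-than-real-time claim comes down to simple arithmetic. The short Python sketch below works it through using only the figures quoted above. The five-second clip length is the article's rounded figure, not a published spec, so the implied frame rate is an estimate.

```python
# Figures quoted for the video generation demo.
frames = 81
generation_seconds = 2.4

# Assumption: the clip is treated as five seconds long, as the article
# describes it. The exact playback frame rate is not stated.
clip_seconds = 5.0

# Playback rate implied by 81 frames spread over five seconds.
implied_fps = frames / clip_seconds

# Values above 1 mean the video is generated faster than it plays back.
realtime_factor = clip_seconds / generation_seconds

print(f"Implied playback rate: {implied_fps:.1f} fps")
print(f"Real-time factor: {realtime_factor:.2f}x")
```

A real-time factor above 1 applies to this single demo only and says nothing about other resolutions or clip lengths.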

The company is also pushing hard on cost. Tenstorrent claims its Galaxy platform can deliver a much lower token cost than NVIDIA GB300 systems, with the report citing around $6 per million tokens versus roughly $30 for the competing GPU setup. That is where the company’s “5x” total cost of ownership claim comes from.
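To make the "5x" figure concrete, here is a minimal Python sketch that derives the ratio from the two quoted token prices and applies it to a monthly volume. The helper `serving_cost` and the 10 billion token volume are hypothetical, chosen only for illustration, and the prices are the approximate figures cited above rather than measured costs.

```python
# Approximate prices cited in the article, in USD per million tokens.
galaxy_usd_per_million = 6.0
gb300_usd_per_million = 30.0

# The ratio behind the "5x" claim.
cost_ratio = gb300_usd_per_million / galaxy_usd_per_million


def serving_cost(tokens: int, usd_per_million: float) -> float:
    """Hypothetical helper: cost in USD to serve a given number of tokens."""
    return tokens / 1_000_000 * usd_per_million


# Hypothetical workload: 10 billion tokens per month (not from the article).
monthly_tokens = 10_000_000_000

galaxy_monthly = serving_cost(monthly_tokens, galaxy_usd_per_million)
gb300_monthly = serving_cost(monthly_tokens, gb300_usd_per_million)

print(f"Cost ratio: {cost_ratio:.1f}x")
print(f"Galaxy: ${galaxy_monthly:,.0f} per month")
print(f"GB300:  ${gb300_monthly:,.0f} per month")
```

The ratio holds only if both prices were measured on comparable workloads, which the quoted figures do not establish on their own.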

The base Galaxy Blackhole server starts at $110,000. It includes 32 Blackhole chips, 23 PFLOPs of FP8 AI compute, 6.2GB of on-chip SRAM, 1TB of DRAM, and 56 ports of 800G Ethernet. Customers can also buy larger supercluster configurations of four to 36 Galaxy servers.
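The server-level numbers can be broken down per chip and scaled up to cluster sizes. The Python sketch below does that under two assumptions that are not in the announcement: that the quoted totals divide evenly across the 32 chips, and that cluster pricing scales linearly with server count. The linear assumption matches the $440,000 four-server figure, but the 36-server total is an extrapolation, not a quoted price. Decimal units (1 TB = 1000 GB) are also assumed. With binary units the DRAM figure would come to 32 GB per chip.

```python
# Server-level figures quoted in the article.
chips_per_server = 32
fp8_pflops = 23.0
sram_gb = 6.2
dram_tb = 1.0
base_price_usd = 110_000

# Assumption: totals are spread evenly across chips, in decimal units.
pflops_per_chip = fp8_pflops / chips_per_server
sram_mb_per_chip = sram_gb * 1000 / chips_per_server
dram_gb_per_chip = dram_tb * 1000 / chips_per_server


def cluster_price(servers: int) -> int:
    """Hypothetical linear pricing: servers multiplied by the base price."""
    return servers * base_price_usd


print(f"FP8 compute per chip: {pflops_per_chip:.2f} PFLOPs")
print(f"SRAM per chip: {sram_mb_per_chip:.0f} MB")
print(f"DRAM per chip: {dram_gb_per_chip:.2f} GB")
for servers in (1, 4, 36):
    print(f"{servers:>2} server(s): from ${cluster_price(servers):,}")
```

Actual supercluster quotes may include networking, support, or volume terms that a linear model ignores.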

The aggressive language from Tenstorrent is expected. The AI hardware market is dominated by NVIDIA, so any challenger has to prove that it can compete on performance, software, networking, and cost. Hardware numbers alone are not enough. Buyers will care about model support, software maturity, deployment tools, reliability, and whether the claimed cost advantage holds in real datacenter use.

The interesting part is that Tenstorrent is not trying to be a conventional GPU supplier. It is building an AI system around RISC-V, open software, integrated networking, and inference-focused workloads. That gives it a clearer identity in a market where many companies are trying to compete with NVIDIA directly.

For now, Galaxy Blackhole looks promising, but it still has to prove itself outside controlled demos. If Tenstorrent can deliver stable software and repeat these results in real deployments, it could become one of the more serious alternatives for companies that want lower AI inference costs.
